
Thread Search: Missed Posts and False Positives

I'm getting false positives and missing posts with the in-thread search at the bottom of threads on mobile. I've been noticing the missing posts for a while, but I had never caught a false positive before.

Example: go to the D&D Mueller thread and search for "FBI".

I get this post as the second hit:

https://forums.penny-arcade.com/discussion/comment/39576352/#Comment_39576352

That post doesn't contain the string at all, and the search misses this one from the previous page:
Jragghen wrote: »
https://www.thedailybeast.com/exclusive-fbi-seizes-control-of-russian-botnet

At this point I cannot find evidence that it was Mueller's team that did so, but the FBI has seized one of the main servers of Russia's botnet which was used in 2016.

That one contains it at least once as a whole-word match.

Re: false positives, it seems to treat "lie" and "lying" as synonyms; it just won't highlight the matched synonym in the result. But I can't imagine what it would be confusing "FBI" with if that's the case.

"Lie" / "lying" is a good example of the weird spottiness, as they each return the same 4 posts in the Mueller thread:

The OP (May 6)
Two on page 3 (May 8)
And that same post of mine noted as a false positive, which uses it in a quote of a previous post inside a spoiler.


Posts

  • Iruka Registered User, Moderator mod
    Search depends on a multi-stage indexing process that is relatively thorough but takes time to build. Edited posts tend to make the lag more pronounced, since they have to be reindexed, and reindexing runs on one of the longer cycles of the process. Generally you can answer most search questions with some variety of "this would be fine in a little bit", with that "little bit" being up to 2-3 weeks for a full refresh.

    It looks like it's pulling the FBI from the URL slugs as a mark. The URL is truncated in the search results, and the post was edited three times, so there might be something that was pulled out of it but is still showing up. Jragghen's post is from two days ago and might need a moment to be indexed if something got in the way of the process at that moment.
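To illustrate the URL-slug explanation above, here is a minimal sketch. It assumes the search splits post bodies on non-alphanumeric characters, which is a guess and not a description of the forum's actual indexer. The function name `tokenize`, the example URL, and the truncated display string are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit.

    Assumption: a tokenizer like this breaks a URL slug into separate words,
    so "fbi" inside a link becomes a searchable term.
    """
    return {tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok}

# Hypothetical post body: the only "FBI" is inside the link's slug.
post_body = "Story here: https://example.com/politics/fbi-probe-update"

# Hypothetical result snippet: the link is cut off before the slug.
displayed = "Story here: https://example.com/politi..."

matches_index = "fbi" in tokenize(post_body)    # True: the slug contains it
visible_in_hit = "fbi" in tokenize(displayed)   # False: truncated away
```

Under that assumption the post is a legitimate hit even though the reader never sees the matching text in the truncated result.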

  • ArbitraryDescriptor Registered User regular
    Iruka wrote: »
    It looks like it's pulling the FBI from the URL slugs as a mark. The URL is truncated in the search results, and the post was edited three times, so there might be something that was pulled out of it but is still showing up.

    Jragghen's post is from two days ago and might need a moment to be indexed if something got in the way of the process at that moment.
    Gotcha. So a post might pop positive if the indexer cached an old version. (In this case I just missed the "FBI" in the WaPo URL.)

    And: Posts are not always (successfully) indexed in chronological order, and that whole thread is still inside the conceivable window it takes to straighten itself out.

    Sounds representative of my recent issues searching for sub-month-old posts. Thanks for the explanation!
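The two symptoms discussed in this thread, a stale hit on an edited post and a recent post that does not show up, can be reproduced with a toy model. This is only a sketch under the assumption that the index is rebuilt periodically and not updated on every save. The class `LaggingIndex` and its methods are hypothetical names, not the forum software's API.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}

class LaggingIndex:
    """Toy inverted index that only reflects posts as of the last rebuild."""

    def __init__(self):
        self.posts = {}                 # live post text, keyed by post id
        self.index = defaultdict(set)   # term -> post ids at last rebuild

    def save(self, post_id, text):
        """Create or edit a post. The index is deliberately not touched."""
        self.posts[post_id] = text

    def rebuild(self):
        """Re-index every post from its current text."""
        self.index = defaultdict(set)
        for post_id, text in self.posts.items():
            for term in tokenize(text):
                self.index[term].add(post_id)

    def search(self, term):
        """Return matching post ids according to the last rebuild."""
        return sorted(self.index.get(term.lower(), set()))

forum = LaggingIndex()
forum.save(1, "The FBI angle is interesting")
forum.rebuild()

forum.save(1, "Edited: never mind")           # edit removes the term
forum.save(2, "The FBI has seized a server")  # new post, not yet indexed

stale = forum.search("fbi")   # [1]: false positive, and post 2 is missing
forum.rebuild()
fresh = forum.search("fbi")   # [2]: correct once the index catches up
```

Before the second rebuild, the search returns the edited post and misses the new one, which matches the behavior reported at the top of the thread.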

