Invalid document ordering (last inserted should be returned first) #183
Comments
Hello there! Results are guaranteed to be returned in reverse insertion order, meaning that the most recent inserts always come first. Sonic prioritizes recent inserts in all queries. Suggest queries, however, are not time-aware and return words in alphabetical order.
Hello, I observed quite the opposite behavior with v1.2.3:
I expected the results to be
Thanks for the report, this does not look normal, indeed. I'll have a look whenever I process pending Sonic issues.
Is there any workaround for this issue?
Not just yet.
I've got news on this one. Basically, the ordering is correct whenever exact-match queries are submitted to Sonic, since it can enumerate the results for the term in natural storage order from the KV store (that is, most recent first). If no exact match is found in the search index for the provided search term, the FST graph store is used to predict alternate search words, which are yielded in alphabetical order. The KV store is then queried for each alternate word that the FST yields, until the results page is full. Unfortunately, given the way the word prediction algorithm works, the latest-comes-first ordering cannot hold in certain cases (i.e. the test cases submitted in this issue). Note that the latest-comes-first ordering is still honored per suggested word. For example, if the corrected word "hash" (from "has") yields enough results to fill the results page, the objects returned come in the expected latest-comes-first order. If the page is not filled and the next suggested word is, e.g., "hazard", then the "hazard" results are appended AFTER the ordered "hash" results, which breaks the time-based ordering.
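To make the behavior described above concrete, here is a minimal Python sketch of the fallback path. It is not Sonic's implementation: the in-memory `KV` dictionary, the `suggest` function (a crude prefix match standing in for the FST word prediction), and the integer object IDs (higher means inserted more recently) are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for the KV store: term -> object IDs, newest first.
# IDs are insertion sequence numbers, so a higher ID means a more recent insert.
KV: Dict[str, List[int]] = {
    "hash": [5, 2],
    "hazard": [6, 1],
}


def suggest(term: str) -> List[str]:
    """Stand-in for the FST lookup: alternate words in alphabetical order.

    A two-letter prefix match is used here purely for illustration.
    """
    return sorted(w for w in KV if w != term and w.startswith(term[:2]))


def query(term: str, limit: int) -> List[int]:
    """Return up to `limit` object IDs for `term`."""
    if limit <= 0:
        return []
    if term in KV:
        # Exact match: natural storage order, most recent first.
        return KV[term][:limit]

    # No exact match: walk the alternate words alphabetically and append
    # each word's results (newest first per word) until the page is full.
    results: List[int] = []
    for word in suggest(term):
        for obj in KV[word]:
            if obj not in results:
                results.append(obj)
            if len(results) >= limit:
                return results
    return results


print(query("hash", 2))  # exact match
print(query("has", 3))   # fallback through "hash", then "hazard"
```

With this toy data, the exact-match query yields `[5, 2]`, while the fallback query yields `[5, 2, 6]`: object 6 is the most recent insert overall, yet it lands after the "hash" results because "hazard" sorts later alphabetically.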
Similarly to #135, I'm considering using this to replace an ElasticSearch cluster. We have more than a couple of fields:
However, it should be reasonably straightforward to implement these using multiple searches and buckets, with frontend filtering. It may or may not be any faster than ES; that's part of what I'd like to find out.
The biggest problem is date, which is a range query. Is there any guaranteed ordering to query returns? If they're in any order other than insertion time, then we'd need to retrieve every match and filter them in the frontend, which is unlikely to be net-positive.
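A rough Python sketch of why the ordering guarantee matters for the date range: if matches arrive newest first, the frontend can stop scanning as soon as it passes the start of the range; otherwise it has to look at every match. The `Hit` shape (object ID plus insertion date) is an assumption for illustration; the dates would have to come from our own records keyed by object ID.

```python
from datetime import date
from typing import Iterable, List, Tuple

# Hypothetical hit shape: (object ID, insertion date).
Hit = Tuple[str, date]


def range_newest_first(hits: Iterable[Hit], start: date, end: date) -> List[Hit]:
    """Filter by date, assuming hits arrive strictly newest first."""
    matched: List[Hit] = []
    for obj_id, day in hits:
        if day > end:
            continue  # too recent, keep scanning
        if day < start:
            break  # everything after this is older, so stop early
        matched.append((obj_id, day))
    return matched


def range_any_order(hits: Iterable[Hit], start: date, end: date) -> List[Hit]:
    """Filter by date with no ordering assumption: every hit is inspected."""
    return [(obj_id, day) for obj_id, day in hits if start <= day <= end]


ordered = [
    ("d", date(2020, 4, 1)),
    ("c", date(2020, 3, 1)),
    ("b", date(2020, 2, 1)),
    ("a", date(2020, 1, 1)),
]
unordered = [ordered[1], ordered[3], ordered[2]]  # c, a, b
window = (date(2020, 2, 1), date(2020, 3, 15))

print(range_newest_first(ordered, *window))    # c and b, stops at a
print(range_newest_first(unordered, *window))  # only c: b is missed
print(range_any_order(unordered, *window))     # c and b, after a full scan
```

The early-exit version is only safe when the newest-first ordering actually holds; on the unordered input it silently drops a match, which is why the guarantee matters here.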