Invalid document ordering (last inserted should be returned first) #183

Closed
Baughn opened this issue Oct 16, 2019 · 6 comments

Comments

@Baughn

Baughn commented Oct 16, 2019

As in #135, I'm considering using this to replace an Elasticsearch cluster. We have more than a couple of fields:

  • Message text
  • User
  • Tags
  • Message date

However, it should be reasonably straightforward to implement these using multiple searches and buckets, with frontend filtering. It may or may not be any faster than ES; that's part of what I'd like to find out.

The biggest problem is the date, which requires a range query. Is there any guaranteed ordering of query results? If they come back in any order other than insertion time, we'd need to retrieve every match and filter them in the frontend, which is unlikely to be a net positive.
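
To illustrate why the ordering matters, here is a minimal sketch of the frontend date filter. It assumes results arrive newest-first, that messages were inserted in date order, and that the application keeps its own ID-to-date lookup. The `filter_by_date` helper and the `dates` mapping are hypothetical and not part of Sonic.

```python
from datetime import date
from typing import Dict, Iterable, List


def filter_by_date(ids: Iterable[str], dates: Dict[str, date],
                   start: date, end: date) -> List[str]:
    """Keep IDs dated within [start, end], assuming `ids` is newest-first."""
    kept: List[str] = []
    for doc_id in ids:
        d = dates[doc_id]
        if d > end:
            continue  # newer than the range, keep scanning
        if d < start:
            break  # everything after this is older, so stop early
        kept.append(doc_id)
    return kept


# Hypothetical data: four messages, one per day.
dates = {
    "m1": date(2019, 10, 1),
    "m2": date(2019, 10, 2),
    "m3": date(2019, 10, 3),
    "m4": date(2019, 10, 4),
}
result = filter_by_date(["m4", "m3", "m2", "m1"], dates,
                        date(2019, 10, 2), date(2019, 10, 3))
print(result)  # ['m3', 'm2']
```

The early `break` is only safe when the ordering is guaranteed. Without that guarantee, the loop would have to scan every match.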

@valeriansaliou
Owner

Hello there!

Results are guaranteed to be returned in reverse insertion order. That means the most recent inserts always come first. Sonic prioritizes recent inserts in all queries.

Suggest queries, however, are not time-aware and return whichever words come first alphabetically.

@valeriansaliou valeriansaliou added the question Further information is requested label Oct 17, 2019
@fpeterschmitt

fpeterschmitt commented Oct 27, 2019

Hello,

I observed quite the opposite behavior with v1.2.3:

CONNECTED <sonic-server v1.2.3>
START control SecretPassword
STARTED control protocol(1) buffer(20000)
TRIGGER consolidate
OK
TRIGGER consolidate
OK
TRIGGER consolidate
OK

─────────────────────
CONNECTED <sonic-server v1.2.3>
START ingest SecretPassword
STARTED ingest protocol(1) buffer(20000)
FLUSHC shipment
RESULT 1


PUSH shipment shipment ID1 "text"
OK
PUSH shipment shipment ID2 "text"
OK

PUSH shipment shipment ID3 "ID3"
OK
PUSH shipment shipment ID4 "ID4"
OK

PUSH shipment shipment ID5 "id:ID5"
OK
PUSH shipment shipment ID6 "id:ID6"
OK
─────────────────
QUERY shipment shipment "text"
PENDING rcKnspVK
EVENT QUERY rcKnspVK ID2 ID1

QUERY shipment shipment "ID"
PENDING ZmCu6XYC
EVENT QUERY ZmCu6XYC ID3 ID4

QUERY shipment shipment "ID"
PENDING RF7TfOKk
EVENT QUERY RF7TfOKk ID3 ID4 ID5 ID6

I expected the results to be ID6 ID5 ID4 ID3 :/
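
A small check like the following makes the mismatch explicit. It is only a sketch: `is_newest_first` is a hypothetical helper, not part of Sonic or any client library, and the insertion order is taken from the PUSH commands above.

```python
from typing import Sequence


def is_newest_first(returned: Sequence[str], inserted: Sequence[str]) -> bool:
    """Return True if `returned` lists IDs in reverse insertion order."""
    position = {doc_id: i for i, doc_id in enumerate(inserted)}
    ranks = [position[doc_id] for doc_id in returned]
    # Each ID must have been inserted later than the one that follows it.
    return all(a > b for a, b in zip(ranks, ranks[1:]))


inserted = ["ID1", "ID2", "ID3", "ID4", "ID5", "ID6"]
print(is_newest_first(["ID2", "ID1"], inserted))                # True
print(is_newest_first(["ID3", "ID4", "ID5", "ID6"], inserted))  # False
```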

@valeriansaliou
Owner

Thanks for the report. This does not look normal, indeed. I'll have a look when I process pending Sonic issues.

@valeriansaliou valeriansaliou added bug Something isn't working and removed question Further information is requested labels Oct 28, 2019
@valeriansaliou valeriansaliou changed the title from "Document ordering?" to "Invalid document ordering (last inserted should be returned first)" Oct 28, 2019
@valeriansaliou valeriansaliou added this to the v1.3.0 milestone Oct 28, 2019
@valeriansaliou valeriansaliou self-assigned this Nov 24, 2019
@valeriansaliou valeriansaliou modified the milestones: v1.3.0, v1.2.4 Nov 25, 2019
@miqe

miqe commented Jun 19, 2020

Is there any workaround for this issue?

@valeriansaliou
Owner

Not just yet.

@valeriansaliou
Owner

I've got news on this one: basically, the ordering is correct whenever an exact-match query is submitted to Sonic, as it can enumerate the results for the term in natural storage order from the KV store (that is, most recent first).

Now, if no exact match is found in the search index for the provided search term, the FST graph store is used to predict alternate search words, which are yielded in alphabetical order. The KV store is then queried for each alternate word the FST yields, until the results page is full.

Unfortunately, given the way the word prediction algorithm works, the latest-comes-first ordering cannot hold in certain cases (i.e. the test cases submitted in this issue). Note that the latest-comes-first ordering is still honored within each suggested word. That is, if the corrected word "hash" (from "has") yields enough results to fill the results page, the objects returned come in the expected latest-comes-first order. If the page is not completely filled and the next suggested word is, e.g., "hazard", then the "hazard" results are appended AFTER the ordered "hash" results, thus breaking the time-based ordering.
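
The behavior described above can be modeled with a short simulation. This is an illustrative sketch, not Sonic's actual implementation: the `query` function and the sample `index` are hypothetical, the "shares the first two letters" rule is a crude stand-in for the FST word prediction, and objects matching several words are not deduplicated.

```python
from typing import Dict, List


def query(index: Dict[str, List[str]], term: str, limit: int) -> List[str]:
    """`index` maps each word to its object IDs, stored newest first."""
    if term in index:
        # Exact match: results come straight from storage, newest first.
        return index[term][:limit]

    # No exact match: walk alternate words in alphabetical order.
    # Assumption: sharing the first two letters stands in for FST prediction.
    alternates = sorted(w for w in index if w[:2] == term[:2])
    results: List[str] = []
    for word in alternates:
        for doc_id in index[word]:
            if len(results) == limit:
                return results
            results.append(doc_id)
    return results


# Higher object numbers were inserted more recently.
index = {
    "hash": ["obj5", "obj2"],
    "hazard": ["obj6", "obj3"],
}
print(query(index, "hash", 10))  # ['obj5', 'obj2']
print(query(index, "has", 2))    # ['obj5', 'obj2']
print(query(index, "has", 4))    # ['obj5', 'obj2', 'obj6', 'obj3']
```

In the last call the page is not filled by "hash" alone, so the "hazard" results follow and `obj6`, the most recent object, lands third.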
