Search through millions of documents in milliseconds ⚡️

Overview

the milli logo

a concurrent indexer combined with fast and relevant search algorithms

Introduction

This repository contains the core engine used in MeiliSearch.

It contains a library that manages one and only one index; MeiliSearch itself handles the multi-index layer. Milli does not store pending updates: queuing them is the job of the layer above, which is why milli only processes one update at a time.

This repository contains crates to quickly debug the engine:

  • There are benchmarks located in the benchmarks crate.
  • The http-ui crate is a simple HTTP dashboard to test the features like for real!
  • The infos crate is used to dump the internal data-structure and ensure correctness.
  • The search crate is a simple command-line tool that helps run flamegraph on top of the engine.
  • The helpers crate is only used to modify the database in place, sometimes.

Compile and run the HTTP debug server

You can specify the number of threads to use for indexing documents, as well as many other settings.

cd http-ui
cargo run --release -- --db my-database.mdb -vvv --indexing-jobs 8

Index your documents

It can index a massive amount of documents in not much time; for example, it has already indexed:

  • 115m songs (song and artist name) in ~48min, taking 81 GiB on disk.
  • 12m cities (name, timezone and country ID) in ~4min, taking 6 GiB on disk.

These measurements were taken on a MacBook Pro with an M1 processor.

You can feed the engine with your CSV (comma-separated, yes) data like this:

printf "id,name,age\n1,hello,32\n2,kiki,24\n" | http POST 127.0.0.1:9700/documents content-type:text/csv

Don't forget to specify the id of the documents. Also note that the engine supports JSON and NDJSON (streaming JSON): you can send them by using the content-type:application/json and content-type:application/x-ndjson headers respectively.
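For instance, the same hypothetical id/name/age documents from the CSV example above could be sent as JSON or NDJSON like this (a sketch assuming the http-ui server from the previous section is running on 127.0.0.1:9700 and httpie is installed):

```shell
# Same hypothetical documents as the CSV example, sent as a JSON array:
printf '[{"id":1,"name":"hello","age":32},{"id":2,"name":"kiki","age":24}]' \
  | http POST 127.0.0.1:9700/documents content-type:application/json

# And as NDJSON: one JSON object per line.
printf '{"id":1,"name":"hello","age":32}\n{"id":2,"name":"kiki","age":24}\n' \
  | http POST 127.0.0.1:9700/documents content-type:application/x-ndjson
```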

Querying the engine via the website

You can query the engine by opening the dashboard web page served by the HTTP server in your browser.

Contributing

You can set up a git hook to stop you from making a commit too fast. It'll stop you if:

  • Any of the workspaces does not build
  • Your code is not well-formatted

These two things are also checked in the CI, so ignoring the hook won't help you merge your code. But if you need to, you can still pass --no-verify when creating your commit to bypass the hook.

To enable the hook, run the following command from the root of the project:

cp script/pre-commit .git/hooks/pre-commit
Comments
  • Fix clippy error to add clippy job on CI

    Related PR

    This PR is for #673

    What does this PR do?

    • ~~add Run Clippy job to CI (rust.yml)~~
    • apply cargo clippy --fix command
    • fix some cargo clippy errors manually (warnings still remain in tests)

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?
    no breaking 
    opened by unvalley 47
  • Geosearch for zero radius

    Pull Request

    Related issue

    Fixes #3167 (https://github.com/meilisearch/meilisearch/issues/3167)

    What does this PR do?

    • allows Geosearch with zero radius to return the specified location when the coordinates match perfectly (instead of returning nothing). See link for more details.

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?

    Thank you so much for contributing to Meilisearch!

    bug no breaking 
    opened by amab8901 23
  • Speed up debug builds

    Note: this draft PR is based on https://github.com/meilisearch/milli/pull/601 , for no particular reason.

    What does this PR do?

    Make a series of changes with the goal of speeding up debug builds:

    1. Add an all_languages feature which compiles charabia with its default features activated. The all_languages feature is activated by default. But running:
    cargo build --no-default-features
    

    on milli is now much faster.

    2. Reduce the debug optimisation level from 3 to 0, except for a few critical dependencies.

    3. Compile the build dependencies quicker as well. Previously, all build dependencies were compiled with opt-level = 3. Now, only the critical build dependencies are compiled with optimisations.

    4. Reduce the amount of code generated by the documents! macro.

    5. Make the "progress update" closure provided to indexing functions a trait object instead of a generic parameter. This avoids monomorphising the indexing code multiple times needlessly.

    Results

    Initial build times on my computer before and after these changes:

    |        | cargo check | cargo check --no-default-features | cargo test | cargo test --lib | cargo test --no-default-features | cargo test --lib --no-default-features |
    |--------|-------------|-----------------------------------|------------|------------------|----------------------------------|----------------------------------------|
    | before | 1m05s       | 1m05s                             | 2m06s      | 1m47s            | 2m06s                            | 1m47s                                  |
    | after  | 28.9s       | 13.1s                             | 40s        | 38s              | 23s                              | 21s                                    |

    no breaking maintenance 
    opened by loiclec 21
  • Stop tracking the Cargo.lock and add cache + windows to the CI

    We reuse the same ~/.cargo and ./target directory between each run on the same OS and rust toolchain. The key to decide if we can use the cache or not is: $OS_NAME-$RUST_TOOLCHAIN-$HASH(Cargo.toml)
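As a rough sketch, the cache key described above could be computed like this (illustrative only: the real CI relies on its cache action, and the exact hash function and toolchain label used here are assumptions):

```shell
# Illustrative reconstruction of the $OS_NAME-$RUST_TOOLCHAIN-$HASH(Cargo.toml) key.
OS_NAME="$(uname -s)"                             # e.g. Linux, Darwin
RUST_TOOLCHAIN="stable"                           # assumed toolchain label
HASH="$(sha256sum Cargo.toml | cut -d' ' -f1)"    # assumed hash function
echo "${OS_NAME}-${RUST_TOOLCHAIN}-${HASH}"
```

Any change to Cargo.toml changes the hash, which invalidates the cache; the same OS, toolchain, and manifest reuse it.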

    We also removed the Cargo.lock from this repository. Indeed, milli is a library and should not track the Cargo.lock

    And finally, we enabled the tests on windows-latest. Since lmdb has been updated, this is now possible.

    opened by irevoire 15
  • chore: added `IN`,`NOT IN` to `invalid_filter` msg

    Pull Request

    Related issue

    Fixes https://github.com/meilisearch/meilisearch/issues/3004

    What does this PR do?

    • Improves correct error msg in response

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?

    Thank you so much for contributing to Meilisearch!

    no breaking 
    opened by Pranav-yadav 13
  • Store detected language per document during indexing

    Pull Request

    Related issue

    Fixes #646

    What does this PR do?

    • [x] create a new database

      • [x] implementation
    • [x] update this database during indexing

      • [x] implementation
      • [x] tests
    • [x] update this database during deletion

      • [x] implementation
      • [x] tests

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?

    Thank you so much for contributing to Meilisearch!

    no breaking 
    opened by f3r10 12
  • fix distinct count bug

    fix https://github.com/meilisearch/meilisearch/issues/2152

    I think the issue was that we didn't remove the excluded candidates from the initial candidates when returning the candidates with the search result.

    no breaking 
    opened by MarinPostma 12
  • Fix phrase search containing stop words

    Pull Request

    This is a WIP draft PR I created to let other potential contributors know that I'm working on this issue. I'll complete it within a few hours of opening it.

    Related issue

    Fixes #661 and towards fixing meilisearch/meilisearch#2905

    What does this PR do?

    • [x] Change Phrase Operation to use a Vec<Option<String>> instead of Vec<String> where None corresponds to a stop word
    • [x] Update all other uses of phrase operation
    • [x] Update resolve_phrase
    • [x] Update create_primitive_query?
    • [x] Add test

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?
    no breaking 
    opened by Samyak2 11
  • Document batch support

    This PR adds support for document batches in milli. It changes the API of the IndexDocuments builder by adding an add_documents method. The API of the updates is changed a little, with the UpdateBuilder being renamed to IndexerConfig and being passed to the update builders. This makes it easier to pass around structs that need to access the indexer config, rather than extracting the fields each time. This change impacts many function signatures and simplifies them.

    The change is not exhaustive and may require another PR to propagate to the whole codebase. I restricted it to what was necessary for this PR.

    opened by MarinPostma 11
  • Fixing piles of clippy errors.

    Related issue

    No issue fixed. Simply cleaning up some code for clippy on the march towards a clean build when #659 is merged.

    What does this PR do?

    Most of these are calling clone when the struct supports Copy.

    Many are using & and &mut on self when the function they are called from already has an immutable or mutable borrow so this isn't needed.

    I tried to stay away from actual changes or places where I'd have to name fresh variables.

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?
    no breaking 
    opened by ehiggs 10
  • Release v0.26.4: returns facets even when there is no value associated to it + bug fix during indexation + bug fix on typo-tolerance

    I cherry-picked the commits from these PRs

    • https://github.com/meilisearch/milli/pull/518
    • https://github.com/meilisearch/milli/pull/522
    • https://github.com/meilisearch/milli/pull/520

    release-v0.26.4 has been started from the tag v0.26.3 and not from main

    The release tag v0.26.4 will be done on the branch release-v0.26.4 once this PR is merged.

    no breaking 
    opened by curquiza 10
  • fix: add POSIX semaphore feature

    Hi! Please take a look at this PR first. It will need to be merged before this PR can be merged.

    In summary, this PR adds support for running milli on iOS and macOS when using App Sandbox, which is required when submitting to the App Store (and required for all iOS apps; macOS apps can get away without needing the App Sandbox but it's recommended).

    Under the hood, this is done by adding a cargo feature to use LMDB's POSIX semaphores instead of SysV semaphores (which is LMDB's default). POSIX semaphores comply with Apple's App Sandbox and actually might bring slight performance improvements on *nix platforms.

    opened by GregoryConrad 1
  • Refactor search benchmarks

    Refactor and extend the search benchmarks.

    1. Use the new search parameter to test the set-based, iterative, and dynamic implementations independently. This allows us to surface performance improvements or regressions that could be hidden in the past. e.g. if there's a regression in the iterative implementation, but the benchmark just happened to use the set-based version all the time, we wouldn't detect the regression.
    2. Avoid reindexing the dataset between groups of searches. Instead, the dataset is indexed only once at the beginning, and then the settings are changed between each group of searches if needed.
    3. Avoid collecting and analysing the time that each individual search query takes. Instead, collect the total time that the whole group of search requests took. This will speed up the benchmarks and make it easier to read their results. Note that we should then try to group searches by similarity.
    4. Add new search requests that are known to have performance problems. Previously, the benchmarks only tested relatively easy search requests.
    5. Limit the number of samples used by the benchmarking tool to 10. This will also speed up the benchmarks, at the cost of adding some noise. However, I don't think noise will be a big problem, since we mostly want to detect large changes in performance. And since the benchmarks are now fast, we can also easily re-run them.
    benchmarks no breaking 
    opened by loiclec 1
  • Remove the "iterative" versions of the search algorithms

    The ranking rules proximity, sort, and attribute have two different implementation strategies. The first one (set-based) queries milli's databases and performs set operations on roaring bitmaps to find buckets of document ids. The second one (iterative) iterates on each candidate document and analyses their contents in order to sort them.

    Currently, we switch between the set-based and iterative implementation strategy based on the number of candidate documents that need to be sorted. In the proximity criterion, this is done with this constant:

    /// Threshold on the number of candidates that will make
    /// the system choose between one algorithm or another.
    const CANDIDATES_THRESHOLD: u64 = 1000;
    

    There are, however, a few problems with this approach:

    1. The CANDIDATES_THRESHOLD will always be arbitrary and suboptimal depending on the kind of data that was indexed. Maybe a value of 1000 is the best choice for small documents containing just a few dozen words, but for people with documents that weigh >500 kB, we may opt into the iterative approach too soon and take a heavy performance penalty.

    2. We have to maintain two different implementations and update them both whenever we make a change to the behaviour of a ranking rule, which is difficult. It is also difficult to ensure that both implementations are equivalent. In fact, some ranking rules already behave differently depending on the implementation strategy that was chosen. For example, in proximity a difference occurs but only in some specific cases (e.g. when we have documents/queries with consecutive identical words), which is okay. But for attribute, it appears that there is a large difference between the two implementations.

    3. It is harder to benchmark search requests correctly. We might make a change in the iterative or set-based version of the algorithm, and then misjudge the impact of the change because the alternative implementation is used instead. (This is partly fixed by https://github.com/meilisearch/milli/pull/742 ).

    4. It is also harder to detect bugs in the implementation of the ranking rules, for the same reason as in (3).

    Ideally, when we refactor the search algorithms, we should aim to make the set-based strategy fast enough such that it is reasonable to use it even when sorting only two candidate documents. It would allow us to reduce the size of the code base and make performance/correctness problems more visible.

    Additionally, if we remove the iterative versions of the proximity and attribute ranking rules, we can also remove the docid_word_positions database, which will reduce the size of the index.

    thoughts querying benchmarks no breaking maintenance performance 
    opened by loiclec 0
  • Simplify primary key inference

    Pull Request

    Related issue

    Related to https://github.com/meilisearch/meilisearch/issues/3233

    What does this PR do?

    User PoV

    • Change primary key inference to only consider a value as a candidate when it ends with "id", rather than when it simply contains "id".
    • Change primary key inference to always fail when there are multiple candidates.
    • Replace UserError::MissingPrimaryKey with UserError::NoPrimaryKeyCandidateFound and UserError::MultiplePrimaryKeyCandidatesFound

    Implementation-wise

    • Remove uses of UserError::MissingPrimaryKey not pertaining to inference. This introduces a possible panicking path.

    PR checklist

    Please check if your PR fulfills the following requirements:

    • [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
    • [x] Have you read the contributing guidelines?
    • [x] Have you made sure that the title is accurate and descriptive of the changes?

    Thank you so much for contributing to Meilisearch!

    indexing DB breaking API breaking 
    opened by dureuill 1
  • Enhance language detection

    related to https://github.com/meilisearch/milli/issues/682

    TODO

    • [x] Extract detected Language during indexing and store it in a dedicated database
    • [ ] at search time: Use this database to create an allow_list and pass it to the TokenizerBuilder of Charabia
    • [ ] ensure that https://github.com/meilisearch/meilisearch/issues/2403 is fixed
    opened by ManyTheFish 0
  • Store fuzzy/bucketed positions in `word_position_docids` database

    Pull Request

    Related issue

    Fixes (when merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3222

    Implementation

    The design is described well in the related issue. For details of how different relative positions are grouped together, see the test bucketed_position.

    Basically, we no longer store the exact position of words that appear far into an attribute, but instead group relative positions together in buckets whose size grows exponentially with the original position. This is done to improve the relevancy and the performance of the attribute ranking rule.


    This is a draft until https://github.com/meilisearch/milli/pull/742 is merged and the results of the benchmarks are available.


    EDIT: I also realised just now that the iterative version of the algorithm needs to be updated as well!

    enhancement indexing querying DB breaking performance 
    opened by loiclec 0
Releases(v0.37.5)
  • v0.37.5(Dec 21, 2022)

  • v0.37.4(Dec 15, 2022)

    For v0.30.4 of Meilisearch, done on the release-v0.37.4 branch, starting itself from v0.37.3 tag

    Changes

    • Fix finite pagination with placeholder search (#743) @ManyTheFish

    Thanks again to @ManyTheFish ! 🎉

  • v0.37.3(Dec 14, 2022)

    For v0.30.3 of Meilisearch, done on the release-v0.37.3 branch, starting itself from v0.37.2 tag

    Changes

    • Fix bug 2945/3021 (missing key in documents database) (#734) @loiclec
    • Fix typo initial candidates computation (#737) @ManyTheFish

    Thanks again to @ManyTheFish, @loiclec! 🎉

  • v0.37.2(Dec 8, 2022)

    For v0.30.2 of Meilisearch, done on the release-v0.37.2 branch, starting itself from v0.37.1 tag

    Changes

    • Fix bug in filter search (#727) @loiclec
    • Fix distincted exhaustive hits (#729) @ManyTheFish
  • v0.37.1(Dec 6, 2022)

    For v0.30.1 of Meilisearch, done on the release-v0.37.1 branch, starting itself from v0.37.0 tag

    Changes

    • Fix bulk facet indexing bug (#712) @loiclec
    • Make soft deletion optional in document addition and deletion + add lots of tests (#720) @loiclec
    • Geosearch for zero radius (#722) @amab8901
    • Fix bug in handling of soft deleted documents when updating settings (#723) @loiclec

    Thanks again to @Kerollmops, @amab8901, and @loiclec! 🎉

  • v0.33.6(Dec 1, 2022)

  • v0.37.0(Nov 17, 2022)

    DB breaking

    • Force vendoring of LMDB even if a system version is available (#699) @dureuill

    Changes

    • ci: Use pre-compiled binaries for faster CI (#685) @azzamsa
    • Fix Facet Indexing bugs (#696) @loiclec
    • Fix bug in prefix DB indexing (#697) @loiclec

    Thanks again to @azzamsa, @dureuill, @loiclec ! 🎉

  • v0.36.0(Nov 9, 2022)

  • v0.35.1(Nov 9, 2022)

    Changes

    • Fix phrase search containing stop words (#664) @Samyak2
    • Deleted empty files (#675) @SKVKPandey
    • chore: added IN,NOT IN to invalid_filter msg (#676) @Pranav-yadav
    • run the tests in all workspaces (#677) @irevoire
    • Fix clippy error to add clippy job on Ci (#659) @unvalley
    • Add clippy job (#673) @unvalley
    • Handle non-finite floats consistently in filters (#689) @dureuill
    • Fix soft deleted bug settings (#690) @Kerollmops

    Thanks again to @Kerollmops, @Pranav-yadav, @SKVKPandey, @Samyak2, @curquiza, @dependabot, @dependabot[bot], @dureuill, @irevoire, and @unvalley! 🎉

  • v0.35.0(Oct 26, 2022)

    API breaking

    • Introduce an indexation abortion function when indexing documents (#616) @Kerollmops
    • Refactor the Facets databases to enable incremental indexing (#619) @loiclec

    DB breaking

    • Reduce the size of the word_pair_proximity database (#639) @loiclec
    • Refactor the Facets databases to enable incremental indexing (#619) @loiclec

    Changes

    • Add method to create a new Index with specific creation dates (#669) @loiclec
    • Fix many Clippy errors part 2 (#668) @ehiggs

    Thanks again to @Kerollmops, @curquiza, @ehiggs, @loiclec ! 🎉

  • v0.34.0(Oct 24, 2022)

    API breaking

    • Add settings to force milli to exhaustively compute the total number of hits (#586) @ManyTheFish

    DB breaking

    • Add proximity calculation for the same word (#658) @msvaljek

    Changes

    • Use an unstable algorithm for grenad::Sorter when possible (#635) @loiclec
    • Remove unused infos, http-ui, and milli/fuzz, crates (#636) @loiclec
    • Remove helpers crate (#641) @loiclec
    • Remove LTO in release profile (#642) @loiclec
    • Re-upload milli's logo (#654) @jeertmans
    • Speed up debug builds (#604) @loiclec
    • Upgrade all dependencies (#655) @loiclec
    • Fixing piles of clippy errors. (#665) @ehiggs
    • Fix #652 - Change Spelling of author in README.md (#653) @anirudhRowjee
    • Add missing logging timer to extractors (#650) @vishalsodani

    Thanks again to @ManyTheFish, @akki1306, @anirudhRowjee, @curquiza, @ehiggs, @jeertmans, @loiclec, @msvaljek and @vishalsodani! 🎉

  • v0.33.5(Nov 14, 2022)

  • v0.33.4(Sep 13, 2022)

    Changes

    • Revert "Remove Bors required test for Windows" (#631) @curquiza
    • Upgrade ubuntu-18.04 to 20.04 (#633) @curquiza
    • Make charabia default feature optional (#632) @vincent-herlemont
    • We avoid skipping errors in the indexing pipeline (#637) @Kerollmops

    Thanks again to @Kerollmops, @curquiza, and @vincent-herlemont! 🎉

  • v0.33.3(Sep 7, 2022)

    Changes

    • Add CI to update the Milli version (#621) @curquiza
    • Add dependabot for GHA (#623) @curquiza
    • Make sure that long words are ignored (#628) @Kerollmops

    Thanks again to @Kerollmops, @curquiza! 🎉

  • v0.33.2(Sep 1, 2022)

  • v0.33.1(Aug 31, 2022)

    Changes

    • Remove the artifacts of the past (#615) @irevoire
    • Accept integers as document ids again (#617) @Kerollmops

    Thanks again to @Kerollmops, and @irevoire! 🎉

  • v0.33.0(Aug 23, 2022)

    API breaking

    • Add EXISTS filter (#556) @loiclec
    • Better threshold (#607) @irevoire
    • Filter operators: NOT + IN[..] (#596) @loiclec
    • Matching query terms policy (#598) @ManyTheFish
    • Upgrade charabia v0.6.0 (#611) @ManyTheFish

    DB breaking

    • Add EXISTS filter (#556) @loiclec

    Changes

    • fix: Remove whitespace trimming during document id validation (#599) @ManyTheFish
    • Stop reindexing already indexed documents (#579) @irevoire
    • Simplify some unit tests (#600) @loiclec
    • Introduce snapshot tests (#601) @loiclec
    • Make binaries faster on release profile through better compile options (#606) @loiclec
    • Use mimalloc as the default allocator (#602) @loiclec
    • Optimise facets indexing (#590) @loiclec
    • Fix(Search): Fix phrase search candidates computation (#594) @ManyTheFish
    • Fix soft deleted documents (#608) @ManyTheFish
    • Word prefix pair proximity docids indexation refactor (#587) @loiclec
    • Remove Bors required test for Windows (#612) @curquiza

    Thanks again to @ManyTheFish, @curquiza, @irevoire and @loiclec! 🎉

  • v0.32.0(Jul 21, 2022)

    DB breaking

    • Fasten documents deletion and update (#557) @irevoire

    Changes

    • Fix not equal filter when field contains both number and strings (#568) @GraDKh
    • Bump uuid to 1.1.2 (#578) @Kerollmops
    • Use BufReader to read datasets in benchmarks (#583) @loiclec
    • Fix name of "release_date" facet in movies benchmarks (#588) @loiclec
    • Enriched documents batch reader (#561) @Kerollmops

    Thanks again to @GraDKh, @Kerollmops, @ManyTheFish, @bigstream-dmytro, @irevoire and @loiclec! 🎉

  • v0.31.2(Jul 21, 2022)

    Release done on release-v0.31.2 branch, starting from v0.31.1 tag, and not from main. Will be used for Meilisearch v0.28.1 (hotfix)

    Fixes

    • Ensure that matches are all highlighted (#592) @ManyTheFish
  • v0.31.1(Jun 23, 2022)

  • v0.31.0(Jun 22, 2022)

    DB breaking

    • Rename the limitedTo parameter into maxTotalHits (#564) @Kerollmops

    Changes

    • Improve the estimatedNbHits when a distinctAttribute is specified (#563) @Kerollmops

    Thanks again to @Kerollmops ! 🎉

  • v0.30.0(Jun 20, 2022)

    Fixes

    • Avoid having an ending separator before crop marker (#559) @ManyTheFish

    API breaking

    • Ensure that Index methods are not bypassed by Meilisearch (#554) @ManyTheFish

    Changes

    • Deletion benchmarks (#558) @ManyTheFish

    Thanks again to @ManyTheFish! 🎉

  • v0.29.3(Jun 9, 2022)

  • v0.29.2(Jun 9, 2022)

    Changes

    • Setup the new limits on the number of facet values to return (#548) @Kerollmops
    • Add the two new pagination and faceting settings (#550) @Kerollmops

    Thanks again to @Kerollmops ! 🎉

  • v0.29.1(Jun 8, 2022)

  • v0.26.6(Jun 7, 2022)

    ⚠️ This release was done on release-v0.26.6, a branch starting from v0.26.5 and not from main

    What's Changed

    • bump version number by @irevoire in https://github.com/meilisearch/milli/pull/545
    • Fix wrong internal ids assignments (for details: see full changelog link below)

    Full Changelog: https://github.com/meilisearch/milli/compare/v0.26.5...v0.26.6

  • v0.29.0(Jun 6, 2022)

    API breaking

    • Integrate charabia (#540) @ManyTheFish
    • Refactor matching word (#542) @ManyTheFish

    DB breaking

    • Integrate charabia (#540) @ManyTheFish

    Changes

    • Improves ranking rules error message (#536) @matthias-wright
    • Speedup exact words (#538) @MarinPostma
    • Reintroduce the max values by facet limit (#535) @Kerollmops

    Thanks again to @Kerollmops, @ManyTheFish, @MarinPostma, @curquiza and @matthias-wright! 🎉

  • v0.28.0(May 18, 2022)

    API breaking

    • Improve geosearch error messages and accept string as coord (#523) @irevoire
    • fix the searchable fields bug when a field is nested (#530) @irevoire

    DB breaking

    • fix the searchable fields bug when a field is nested (#530) @irevoire

    Changes

    • Add a test to make sure that long words are handled (#504) @Kerollmops
    • normalize exact words (#505) @MarinPostma
    • First version of new CONTRIBUTING.md (#508) @curquiza
    • Remove pr_status from bors settings (#509) @curquiza
    • Improve the README (#515) @Kerollmops
    • Stop flattening every field (#514) @irevoire
    • fix cargo warnings (#506) @MarinPostma
    • Make nightly CI run every week (#517) @curquiza
    • Return facets even when there is no value associated to it (#518) @Kerollmops
    • deny warnings in CI (#507) @MarinPostma
    • Do not generate keys that are too long for LMDB (#522) @Kerollmops
    • fix mistake in Settings initialization (#520) @MarinPostma
    • fix the mixed dataset geosearch indexing bug (#531) @irevoire

    Thanks again to @Kerollmops, @ManyTheFish, @MarinPostma, @curquiza and @irevoire! 🎉

  • v0.26.5(May 16, 2022)

    Release done on release-v0.26.5 branch, starting from the v0.26.4 tag, and not from main

    • Fix mixed dataset geosearch bug #529 @irevoire
    • Fix searchable fields bug when a field is nested #528 @irevoire
  • v0.26.4(May 4, 2022)

    Changes

    • Return facets even when there is no value associated to it (#518) @Kerollmops
    • Do not generate keys that are too long for LMDB (#522) @Kerollmops
    • fix mistake in Settings initialization (#520) @MarinPostma

    Thanks again to @Kerollmops, and @MarinPostma! 🎉


txtai: AI-powered search engine for Rust txtai executes machine-learning workflows to transform data and build AI-powered text indices to perform simi

NeuML 69 Jan 2, 2023
A full-text search engine in rust

Toshi A Full-Text Search Engine in Rust Please note that this is far from production ready, also Toshi is still under active development, I'm just slo

Toshi Search 3.8k Jan 7, 2023
🔍TinySearch is a lightweight, fast, full-text search engine. It is designed for static websites.

tinysearch TinySearch is a lightweight, fast, full-text search engine. It is designed for static websites. TinySearch is written in Rust, and then com

null 2.2k Dec 31, 2022
🔎 Impossibly fast web search, made for static sites.

Stork Impossibly fast web search, made for static sites. Stork is two things. First, it's an indexer: it indexes your loosely-structured content and c

James Little 2.5k Dec 27, 2022
🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.

?? Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.

Valerian Saliou 17.4k Jan 2, 2023
⚡ Insanely fast, 🌟 Feature-rich searching. lnx is the adaptable deployment of the tantivy search engine you never knew you wanted. Standing on the shoulders of giants.

✨ Feature Rich | ⚡ Insanely Fast An ultra-fast, adaptable deployment of the tantivy search engine via REST. ?? Standing On The Shoulders of Giants lnx

lnx 679 Jan 1, 2023
⚡ Insanely fast, 🌟 Feature-rich searching. lnx is the adaptable deployment of the tantivy search engine you never knew you wanted. Standing on the shoulders of giants.

✨ Feature Rich | ⚡ Insanely Fast An ultra-fast, adaptable deployment of the tantivy search engine via REST. ?? Standing On The Shoulders of Giants lnx

lnx 0 Apr 25, 2022
Cross-platform, cross-browser, cross-search-engine duckduckgo-like bangs

localbang Cross-platform, cross-browser, cross-search-engine duckduckgo-like bangs What are "bangs"?? Bangs are a way to define where to search inside

Jakob Kruse 7 Nov 23, 2022