Toshi

A Full-Text Search Engine in Rust

Please note that Toshi is far from production ready; it is still under active development. I'm just slow.

Description

Toshi is meant to be a full-text search engine similar to Elasticsearch. Toshi strives to be to Elasticsearch what Tantivy is to Lucene.

Motivations

Toshi will always target stable Rust and will do its best to never use unsafe Rust. While underlying libraries may make some use of unsafe, Toshi will make a concerted effort to vet those libraries so that it remains completely free of unsafe Rust usage. I chose this because, for Toshi to become an attractive option for people to consider, it has to be safe, stable, and consistent. Stable Rust was chosen for the guarantees and safety it provides; I did not want to go down the rabbit hole of using nightly features only to run into stability issues with them later on. Since Toshi is not meant to be a library, I'm perfectly fine with this requirement: people who want to use it will more than likely take it off the shelf and not modify it, and my motivation was to cater to that use case when building Toshi.

Build Requirements

At this time Toshi should build and work fine on Windows, macOS, and Linux. To build it, you will need Rust 1.39.0 and Cargo installed. You can get Rust easily from rustup.

Configuration

There is a default configuration file in config/config.toml:

host = "127.0.0.1"
port = 8080
path = "data2/"
writer_memory = 200000000
log_level = "info"
json_parsing_threads = 4
bulk_buffer_size = 10000
auto_commit_duration = 10
experimental = false

[experimental_features]
master = true
nodes = [
    "127.0.0.1:8081"
]

[merge_policy]
kind = "log"
min_merge_size = 8
min_layer_size = 10_000
level_log_size = 0.75
Host

host = "localhost"

The hostname Toshi will bind to upon start.

Port

port = 8080

The port Toshi will bind to upon start.

Path

path = "data/"

The data path where Toshi will store its data and indices.

Writer Memory

writer_memory = 200000000

The amount of memory (in bytes) Toshi should allocate to commits for new documents.

Log Level

log_level = "info"

The detail level to use for Toshi's logging.

JSON Parsing

json_parsing_threads = 4

When Toshi does a bulk ingest of documents, it will spin up a number of threads to parse the documents' JSON as it is received. This setting controls the number of threads spawned to handle that job.

Bulk Buffer

bulk_buffer_size = 10000

This controls the buffer size for parsing documents into an index. It bounds the amount of memory a bulk ingest will take up by blocking when the message buffer is full. If you want to go totally off the rails, you can set this to 0 to make the buffer unbounded.

Auto Commit Duration

auto_commit_duration = 10

This controls how often an index will automatically commit documents, if there are documents to be committed. Set this to 0 to disable the feature, but you will then have to commit yourself when you submit documents.
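For example, disabling automatic commits entirely looks like this in config/config.toml:

```toml
# Commits will only happen when you request them explicitly.
auto_commit_duration = 0
```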

Merge Policy
[merge_policy]
kind = "log"

Tantivy will merge index segments according to the configuration outlined here. There are two options. "log" is the default segment merge behavior and takes three additional values; any of them can be omitted to use Tantivy's default value. The default values are listed below.

min_merge_size = 8
min_layer_size = 10_000
level_log_size = 0.75

In addition there is the "nomerge" option, with which Tantivy will do no merging of segments.
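Disabling merging would then look like this in config/config.toml:

```toml
[merge_policy]
kind = "nomerge"
```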

Experimental Settings
experimental = false

[experimental_features]
master = true
nodes = [
    "127.0.0.1:8081"
]

In general these settings aren't ready for use yet, as they are very unstable or flat-out broken. Right now the distribution of Toshi is behind this flag, so if experimental is set to false all of these settings are ignored.

Building and Running

Toshi can be built using cargo build --release. Once Toshi is built, you can run ./target/release/toshi from the top-level directory to start Toshi according to the configuration in config/config.toml.

You should get a startup message like this.

  ______         __   _   ____                 __
 /_  __/__  ___ / /  (_) / __/__ ___ _________/ /
  / / / _ \(_-</ _ \/ / _\ \/ -_) _ `/ __/ __/ _ \
 /_/  \___/___/_//_/_/ /___/\__/\_,_/_/  \__/_//_/
 Such Relevance, Much Index, Many Search, Wow
 
 INFO  toshi::index > Indexes: []

You can verify Toshi is running with:

curl -X GET http://localhost:8080/

which should return:

{
  "name": "Toshi Search",
  "version": "0.1.1"
}

Once Toshi is running, it's best to check the requests.http file in the root of this project for more examples of usage.

Example Queries

Term Query
{ "query": {"term": {"test_text": "document" } }, "limit": 10 }
Fuzzy Term Query
{ "query": {"fuzzy": {"test_text": {"value": "document", "distance": 0, "transposition": false } } }, "limit": 10 }
Phrase Query
{ "query": {"phrase": {"test_text": {"terms": ["test","document"] } } }, "limit": 10 }
Range Query
{ "query": {"range": { "test_i64": { "gte": 2012, "lte": 2015 } } }, "limit": 10 }
Regex Query
{ "query": {"regex": { "test_text": "d[ou]{1}c[k]?ument" } }, "limit": 10 }
Boolean Query
{ "query": {"bool": {"must": [ { "term": { "test_text": "document" } } ], "must_not": [ {"range": {"test_i64": { "gt": 2017 } } } ] } }, "limit": 10 }
Usage

To try any of the above queries, you can POST it to an index like so:

curl -X POST http://localhost:8080/test_index -H 'Content-Type: application/json' -d '{ "query": {"term": {"test_text": "document" } }, "limit": 10 }'

Also note that limit is optional; 10 is the default value. It's only included here for completeness.
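The query bodies are plain JSON strings, so they can be assembled in any language before being sent. As a purely illustrative sketch (this helper is not part of Toshi's API), building the term-query body from the examples above in Rust might look like:

```rust
// Hypothetical helper, not part of Toshi: builds the JSON body for a term
// query using only the standard library.
fn term_query(field: &str, value: &str, limit: u32) -> String {
    format!(
        r#"{{ "query": {{ "term": {{ "{}": "{}" }} }}, "limit": {} }}"#,
        field, value, limit
    )
}

fn main() {
    // Reproduces the Term Query example from this README.
    println!("{}", term_query("test_text", "document", 10));
}
```

The resulting string can then be sent with curl exactly as shown above.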

Running Tests

cargo test

What is a Toshi?

Toshi is a three year old Shiba Inu. He is a very good boy and is the official mascot of this project. Toshi personally reviews all code before it is committed to this repository and is dedicated to only accepting the highest quality contributions from his human. He will, though, accept treats for easier code reviews.

Comments
  • toshi-proto fails to build on macOS Catalina

    toshi-proto fails to build on macOS Catalina

    Describe the bug When compiling toshi, the compilation process halts at toshi-proto with the following error message:

     /Toshi/toshi-proto $ cargo build --release                                                                                                       
       Compiling toshi-proto v0.1.0 (/Users/bucko/Toshi/toshi-proto)
    error: failed to run custom build command for `toshi-proto v0.1.0 (/Users/bucko/Toshi/toshi-proto)`
    
    Caused by:
      process didn't exit successfully: `/Users/bucko/Toshi/target/release/build/toshi-proto-879e8b6972e136dd/build-script-build` (exit code: 101)
    --- stderr
    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/libcore/result.rs:1165:5
    stack backtrace:
       0:        0x107bcbf25 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hb58f5347bb1ea7b9
       1:        0x107bed9f1 - core::fmt::write::ha9abb3322774bd4f
       2:        0x107bdbe39 - std::io::Write::write_fmt::h72ef3cad46ef5da7
       3:        0x107bcdb43 - std::panicking::default_hook::{{closure}}::hb0f024d11ad4eda7
       4:        0x107bcd86f - std::panicking::default_hook::h8f36dfbe4fa71efd
       5:        0x107bce27c - std::panicking::rust_panic_with_hook::h19f1d0cf9bc48900
       6:        0x107bcdd39 - std::panicking::continue_panic_fmt::ha03dadffb484b681
       7:        0x107bcdc89 - rust_begin_unwind
       8:        0x107bf5b9c - core::panicking::panic_fmt::hf2ccf250d7b43fe7
       9:        0x107bf59a9 - core::result::unwrap_failed::h4d74b9130c1d473e
      10:        0x107ae7971 - tonic_build::fmt::hcc47c796d443ae31
      11:        0x107aba0fd - build_script_build::main::hc8f0475db060ef8d
      12:        0x107a9f11b - std::rt::lang_start::{{closure}}::h27ced1c1e74dca7f
      13:        0x107bcdc68 - std::panicking::try::do_call::he4301933c114d277
      14:        0x107bdde1f - __rust_maybe_catch_panic
      15:        0x107bcac63 - std::rt::lang_start_internal::h93184985107106ec
      16:        0x107abbd79 - main
    

    To Reproduce Steps to reproduce the behavior:

    1. On macOS Catalina 10.15.1, run git clone https://github.com/toshi-search/Toshi.git
    2. cd Toshi
    3. export RUST_BACKTRACE=full
    4. cargo build --release
    5. See error (copied & pasted above)

    Expected behavior Toshi compiles successfully

    Desktop (please complete the following information):

    • OS: macOS Catalina 10.15.1
    • Rust Version: 1.41.0
    Does not build 
    opened by mediumrarez 12
  • Cluster Tracking Issue

    Cluster Tracking Issue

    This issue tracks implementation of clustering into Toshi.

    • [x] Decide on clustering method (master/client, masterless). Relates to #15
    • [x] Add CLI flag for Consul cluster address. Relates to #18
    • [x] Add CLI flag for cluster name. Relates to #19
    • [x] Decide on shard replication scheme. Relates to #16.
    • [ ] Enumerate failure modes
    • [ ] Write tests for failure modes (network splits, etc)
    • [ ] Write module to register with Consul. Relates to #24.

    * Hijacking Fletcher's issue for my own tracking * - Steve

    I've laid out some of the groundwork for this with tower-grpc, so I should be able to get rolling with this today.

    Things I'd like to do by end of the year

    • [ ] Create master election
      • [ ] A way to just tell Toshi the masters and nodes without consul doing it
      • [x] Have nodes report the indexes they have to consul for Placement Driver
    • [x] Extend IndexHandle to understand it might have to do an RPC call to fulfill a query
    • [x] Move current IndexHandle into LocalIndexHandle
    • [x] Move Index Searching into IndexHandle and out of IndexCatalog
    • [x] Extend IndexCatalog to include both LocalIndexHandle and RemoteIndexHandle
    • [ ] Attempt to re-factor / re-use as much of the searching / indexing code available to fulfill RPC needs
    • [x] Get a working, naive base case working for distributed, index creation, document addition and searching.
    • [ ] Finish the 3000 commendation score Shield in The Division
    magic unicorn clustering 
    opened by fhaynes 9
  • How to sort search?

    How to sort search?

    If re-indexing is not supported (#490), I wonder how to properly sort a search result?

    In fact, I understand I can create multiple indexes for each sort order I need (e.g: my_index_asc, my_index_desc), however that means I should be adding new documents in the correct order which is not realistic.

    That being said, I don't get the limit option on queries if it does not come with a sorting instruction. Maybe I misunderstood something; can someone shed some light on this?

    Thanks :)

    enhancement good first issue 
    opened by Jonarod 8
  • Unexpected character in root json response

    Unexpected character in root json response

    After issuing the command curl -X GET http://localhost:8080 --output - the following response is returned by toshi:

    ){"name":"Toshi Search","version":"0.1.1"}
    

    As you can see, this is invalid JSON; there's an additional character at the start of the response, and it varies between -, +, ,, ), etc. I don't know why it's returning this; seems like garbage to me. Removing the Deflate middleware somehow solves the problem. If you try to change the name in ToshiInfo the initial byte changes too, pretty strange.

    bug invalid 
    opened by mauri870 7
  • Bump remove_dir_all from 0.5.2 to 0.5.3

    Bump remove_dir_all from 0.5.2 to 0.5.3

    Bumps remove_dir_all from 0.5.2 to 0.5.3.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language
    • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

    Additionally, you can set the following in your Dependabot dashboard:

    • Update frequency (including time of day and day of week)
    • Pull request limits (per update run and/or open at any time)
    • Automerge options (never/patch/minor, and dev/runtime dependencies)
    • Out-of-range updates (receive only lockfile updates, if desired)
    • Security updates (receive only security updates, if desired)
    dependencies 
    opened by dependabot-preview[bot] 6
  • Bump hashbrown from 0.4.0 to 0.5.0

    Bump hashbrown from 0.4.0 to 0.5.0

    Bumps hashbrown from 0.4.0 to 0.5.0.

    Changelog

    Sourced from hashbrown's changelog.

    [v0.5.0] - 2019-06-12

    Fixed

    • Resize with a more conservative amount of space after deletions. (#86)

    Changed

    • Exposed the Layout of the failed allocation in CollectionAllocErr::AllocErr. (#89)
    Commits
    • e7cd4a5 Version 0.5.0
    • 2c16341 Auto merge of #89 - SimonSapin:layout, r=Amanieu
    • c2e26e7 Expose the Layout of the failed allocation in CollectionAllocErr::AllocErr
    • 14bb115 Auto merge of #87 - josephrocca:patch-1, r=Amanieu
    • 80dc544 Add simple usage code
    • b240ee0 Auto merge of #86 - edre:rehash, r=Amanieu
    • afc20d7 Resize with a more conservative amount of space when inserting after deletions.
    • 119f429 Auto merge of #84 - edre:bench, r=Amanieu
    • 8783490 Update insert_erase benchmark to keep the table size constant.
    • 143609d Auto merge of #83 - edre:bench, r=Amanieu
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot-preview[bot] 6
  • Failed registering node: connection refused

    Failed registering node: connection refused

    After a successful build of the release target (Ubuntu), while running

    $ RUST_BACKTRACE=1 ./target/release/toshi
    

    I keep on getting:

     INFO  toshi > Settings { host: "127.0.0.1", port: 8080, path: "data/", place_addr: "0.0.0.0:8082", log_level: "info", writer_memory: 200000000, json_parsing_threads: 4, auto_commit_duration: 10, bulk_buffer_size: 10000, merge_policy: ConfigMergePolicy { kind: "log", min_merge_size: Some(8), min_layer_size: Some(10000), level_log_size: Some(0.75) }, consul_addr: "127.0.0.1:8500", cluster_name: "kitsune", enable_clustering: true, master: true, nodes: ["127.0.0.1:8081", "127.0.0.1:8082"] }
    
      ______         __   _   ____                 __
     /_  __/__  ___ / /  (_) / __/__ ___ _________/ /
      / / / _ \(_-</ _ \/ / _\ \/ -_) _ `/ __/ __/ _ \
     /_/  \___/___/_//_/_/ /___/\__/\_,_/_/  \__/_//_/
     Such Relevance, Much Index, Many Search, Wow
     
     ERROR toshi > Error: Failed registering Node: Inner(Inner(Error { kind: Connect, cause: Os { code: 111, kind: ConnectionRefused, message: "Connection refused" } }))
    thread 'main' panicked at 'internal error: entered unreachable code: Shutdown signal channel should not error, This is a bug.', src/bin/toshi.rs:68:22
    stack backtrace:
       0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
                 at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
       1: std::panicking::default_hook::{{closure}}
                 at src/libstd/sys_common/backtrace.rs:71
                 at src/libstd/sys_common/backtrace.rs:59
                 at src/libstd/panicking.rs:211
       2: std::panicking::rust_panic_with_hook
                 at src/libstd/panicking.rs:227
                 at src/libstd/panicking.rs:491
       3: std::panicking::continue_panic_fmt
                 at src/libstd/panicking.rs:398
       4: std::panicking::begin_panic_fmt
                 at src/libstd/panicking.rs:353
       5: toshi::main::{{closure}}
       6: <futures::task_impl::Spawn<T>>::enter::{{closure}}
       7: toshi::main
       8: std::rt::lang_start::{{closure}}
       9: main
      10: __libc_start_main
      11: _start
    

    netstat shows me that 8080 isn't in use by another process, and running the command with sudo doesn't change anything. The message clearly states that this is a bug. So... is there a solution or not?

    bad error handling 
    opened by vicrac 6
  • toshi panic at run time due to tokio

    toshi panic at run time due to tokio

    The following exception was thrown at run time when running toshi on the latest update:

    thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /home/my-user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.2.0/src/time/driver/mod.rs:260:9

    opened by abdelmonem17 5
  • Build issues

    Build issues

    Hi. I am trying to run this on Ubuntu. The build command gives the following error:

    error: failed to run custom build command for toshi-proto v0.1.0 (/home/ubuntu/Toshi/toshi-proto)

    Caused by: process didn't exit successfully: /home/ubuntu/Toshi/target/release/build/toshi-proto-d68a507dc1a704df/build-script-build (exit code: 1) --- stderr error running rustfmt: Os { code: 2, kind: NotFound, message: "No such file or directory" }

    How can it be rectified? Thank you.

    opened by mjl4043 5
  • Replace futures mpsc and oneshot by tokio-sync

    Replace futures mpsc and oneshot by tokio-sync

    The tokio team released the tokio-sync crate which offers a mpsc and oneshot channel analogous to the ones in the futures crate.

    In some benchmarks, the new mpsc shows up to a 7x improvement over the futures version. The oneshot implementation only provides a slight performance improvement, but when updated to the std version of Future much better performance can be expected.

    toshi-search#115

    opened by mauri870 5
  • Improve error reporting when loading index_catalog fails

    Improve error reporting when loading index_catalog fails

    Hello!

    I tried building and running Toshi on https://gitpod.io, and I got this mysterious error:

    $ cargo build --release
    [...]
    $ target/release/toshi
    Error Encountered - IO Error: No such file or directory (os error 2)
    

    This pull request helped me better understand the problem:

    $ target/release/toshi
    Error loading index_catalog from path "data/" - IO Error: No such file or directory
    

    And indeed:

    $ mkdir data
    $ target/release/toshi
     INFO  toshi::index > Indexes: []
    
      ______         __   _   ____                 __
     /_  __/__  ___ / /  (_) / __/__ ___ _________/ /
      / / / _ \(_-</ _ \/ / _\ \/ -_) _ `/ __/ __/ _ \
     /_/  \___/___/_//_/_/ /___/\__/\_,_/_/  \__/_//_/
     Such Relevance, Much Index, Many Search, Wow
    
    opened by jankeromnes 5
  • Reindexing an existing index

    Reindexing an existing index

    Is your feature request related to a problem? Please describe. I am not able to find a way to re-index or even delete an existing index. Once the index is created, I have to assume that the schema will never change.

    Does another search engine have this functionality? Can you describe its function? N/A

    Do you have a specific use case you are trying to solve? Create an index for a given schema. Add some documents to your created index. Update the schema by introducing two additional fields. You should be able to add documents or edit existing ones with the two newly introduced fields.

    Additional context N/A

    opened by seekeramento 4
  • How close to Elasticsearch do you want to be?

    How close to Elasticsearch do you want to be?

    We're looking for a lightweight alternative to Elasticsearch that we could use in automated tests. It's not 100% clear from the README whether the goal of this project is to create something that's API-compatible with Elasticsearch.

    For it to work for our purpose we need the REST API to be the same and for Toshi to accept queries in the same format. Does this align with how you see the goals of this project?

    Thank you for working on it. It's a great idea.

    opened by taras 1
  • POSTd data over 8kb in size fails

    POSTd data over 8kb in size fails

    Describe the bug curl -v -H "Content-Type: application/json" --data-binary @bulkbody.json http://localhost:8000/emails/_bulk

    If bulkbody.json is over 8kb in size, toshi returns 400 bad request {"message":"Error in Index: 'The provided string is not valid JSON'"}

    To Reproduce Steps to reproduce the behavior: POST a request over 8kb in size

    Expected behavior Should be able to accept POST data over 8kb, or the limit should be configurable


    opened by prawnsalad 3
  • More guidances in need

    More guidances in need

    Hi folks,

    I want to say, you're doing a great job migrating/creating a full-text search engine in Rust; maybe it will take a piece of the cake from ES.

    I cloned the project, compiled it, and did some tests based on the requests in requests.http and others. Toshi works well and its operations are quite similar to ES, which means a lot to me.

    env: Mac rustc: 1.55.0 branch: master

    But, unfortunately, I got some blocks on my further research:

    1. How do I configure & deploy multiple instances on a single cloud instance? I can't find guidance on the config & deployment, though. Do I need to copy the executable toshi along with the config folder to another path? Do I need to keep the hierarchical structure of the folders?
    2. What are the differences among these config files: config.toml, config-bench.toml, config-rpc.toml? Each of them contains similar content on the master branch; which config file do I need in which scenario?
    3. Is there any other way to do operations in Toshi besides the RESTful API, e.g. a Rust (or other language) SDK?
    opened by godlockin 3
  • Is there any way to use a custom tokenizer?

    Is there any way to use a custom tokenizer?

    Is your feature request related to a problem? Please describe. https://github.com/tantivy-search/tantivy#features One of the features tantivy provides is to support custom tokenizers. For example tantivy-jieba. Is it possible for Toshi to support this feature?

    Does another search engine have this functionality? Can you describe its function?

    Do you have a specific use case you are trying to solve?

    Additional context

    opened by dzcpy 5
  • Search for all fields instead of a single field

    Search for all fields instead of a single field

    Currently when searching we need to provide the fields we explicitly need to search for. e.g.

    { "query": { "term": { "lyrics": "cool" } }, "limit": 10 }

    Is it possible to specify a query that can search on all the fields? Am I thinking about it in the right way? I have been able to do so in Elasticsearch. Also, looking at the tantivy Wikipedia example, it seems we should be able to do it.

    opened by prabhatsharma 1