Toshi

A Full-Text Search Engine in Rust

Please note that Toshi is far from production ready; it is still under active development. I'm just slow.

Description

Toshi is meant to be a full-text search engine similar to Elasticsearch. Toshi strives to be to Elasticsearch what Tantivy is to Lucene.

Motivations

Toshi will always target stable Rust and will do its best to never use unsafe Rust. While underlying libraries may make some use of unsafe, Toshi will make a concerted effort to vet those libraries so that it remains completely free of unsafe Rust usage. I chose this because, for Toshi to become an attractive option for people to consider, it has to be safe, stable, and consistent. Stable Rust was chosen for the guarantees and safety it provides; I did not want to go down the rabbit hole of using nightly features only to run into stability issues with them later on. Since Toshi is not meant to be a library, I'm perfectly fine with this requirement: people who want to use it will more than likely take it off the shelf and not modify it, and my motivation was to cater to that use case when building Toshi.

Build Requirements

At this time Toshi should build and work fine on Windows, macOS, and Linux. To build it, you will need Rust 1.39.0 and Cargo installed. You can get Rust easily from rustup.

Configuration

There is a default configuration file in config/config.toml:

host = "127.0.0.1"
port = 8080
path = "data2/"
writer_memory = 200000000
log_level = "info"
json_parsing_threads = 4
bulk_buffer_size = 10000
auto_commit_duration = 10
experimental = false

[experimental_features]
master = true
nodes = [
    "127.0.0.1:8081"
]

[merge_policy]
kind = "log"
min_merge_size = 8
min_layer_size = 10_000
level_log_size = 0.75
Host

host = "localhost"

The hostname Toshi will bind to upon start.

Port

port = 8080

The port Toshi will bind to upon start.

Path

path = "data/"

The data path where Toshi will store its data and indices.

Writer Memory

writer_memory = 200000000

The amount of memory (in bytes) Toshi should allocate to commits for new documents.

Log Level

log_level = "info"

The detail level to use for Toshi's logging.

JSON Parsing

json_parsing_threads = 4

When Toshi does a bulk ingest of documents, it will spin up a number of threads to parse the documents' JSON as it is received. This setting controls the number of threads spawned to handle that job.

Bulk Buffer

bulk_buffer_size = 10000

This controls the buffer size for parsing documents into an index. It bounds the amount of memory a bulk ingest will take up by blocking when the message buffer is full. If you want to go totally off the rails, you can set this to 0 to make the buffer unbounded.

Auto Commit Duration

auto_commit_duration = 10

This controls how often an index will automatically commit documents, if there are documents to be committed. Set this to 0 to disable the feature, but you will then have to commit yourself when you submit documents.
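For example, disabling automatic commits entirely looks like this in config/config.toml:

```toml
# Commits will only happen when you request them explicitly.
auto_commit_duration = 0
```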

Merge Policy
[merge_policy]
kind = "log"

Tantivy will merge index segments according to the configuration outlined here. There are two options. "log" is the default segment merge behavior and takes three additional values; any of them can be omitted to use Tantivy's default value. The default values are listed below.

min_merge_size = 8
min_layer_size = 10_000
level_log_size = 0.75

In addition there is the "nomerge" option, with which Tantivy will do no merging of segments.
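Disabling merging would then look like this in config/config.toml:

```toml
[merge_policy]
kind = "nomerge"
```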

Experimental Settings
experimental = false

[experimental_features]
master = true
nodes = [
    "127.0.0.1:8081"
]

In general these settings aren't ready for use yet, as they are very unstable or flat-out broken. Right now the distribution of Toshi is behind this flag, so if experimental is set to false all of these settings are ignored.

Building and Running

Toshi can be built using cargo build --release. Once Toshi is built, you can run ./target/release/toshi from the top-level directory to start Toshi according to the configuration in config/config.toml.

You should get a startup message like this.

  ______         __   _   ____                 __
 /_  __/__  ___ / /  (_) / __/__ ___ _________/ /
  / / / _ \(_-</ _ \/ / _\ \/ -_) _ `/ __/ __/ _ \
 /_/  \___/___/_//_/_/ /___/\__/\_,_/_/  \__/_//_/
 Such Relevance, Much Index, Many Search, Wow
 
 INFO  toshi::index > Indexes: []

You can verify Toshi is running with:

curl -X GET http://localhost:8080/

which should return:

{
  "name": "Toshi Search",
  "version": "0.1.1"
}

Once Toshi is running, it's best to check the requests.http file in the root of this project for more examples of usage.

Example Queries

Term Query
{ "query": {"term": {"test_text": "document" } }, "limit": 10 }
Fuzzy Term Query
{ "query": {"fuzzy": {"test_text": {"value": "document", "distance": 0, "transposition": false } } }, "limit": 10 }
Phrase Query
{ "query": {"phrase": {"test_text": {"terms": ["test","document"] } } }, "limit": 10 }
Range Query
{ "query": {"range": { "test_i64": { "gte": 2012, "lte": 2015 } } }, "limit": 10 }
Regex Query
{ "query": {"regex": { "test_text": "d[ou]{1}c[k]?ument" } }, "limit": 10 }
Boolean Query
{ "query": {"bool": {"must": [ { "term": { "test_text": "document" } } ], "must_not": [ {"range": {"test_i64": { "gt": 2017 } } } ] } }, "limit": 10 }
Usage

To try any of the above queries, you can POST it to an index like so:

curl -X POST http://localhost:8080/test_index -H 'Content-Type: application/json' -d '{ "query": {"term": {"test_text": "document" } }, "limit": 10 }'

Also note that limit is optional; 10 is the default value. It's only included here for completeness.
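The query bodies are plain JSON strings, so they can be assembled in any language before being sent. As a purely illustrative sketch (this helper is not part of Toshi's API), building the term-query body from the examples above in Rust might look like:

```rust
// Hypothetical helper, not part of Toshi: builds the JSON body for a term
// query using only the standard library.
fn term_query(field: &str, value: &str, limit: u32) -> String {
    format!(
        r#"{{ "query": {{ "term": {{ "{}": "{}" }} }}, "limit": {} }}"#,
        field, value, limit
    )
}

fn main() {
    // Reproduces the Term Query example from this README.
    println!("{}", term_query("test_text", "document", 10));
}
```

The resulting string can then be sent with curl exactly as shown above.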

Running Tests

cargo test

What is a Toshi?

Toshi is a three year old Shiba Inu. He is a very good boy and is the official mascot of this project. Toshi personally reviews all code before it is committed to this repository and is dedicated to only accepting the highest quality contributions from his human. He will, though, accept treats for easier code reviews.

Comments
  • toshi-proto fails to build on macOS Catalina

    toshi-proto fails to build on macOS Catalina

    Describe the bug When compiling toshi, the compilation process halts at toshi-proto with the following error message:

     /Toshi/toshi-proto $ cargo build --release                                                                                                       
       Compiling toshi-proto v0.1.0 (/Users/bucko/Toshi/toshi-proto)
    error: failed to run custom build command for `toshi-proto v0.1.0 (/Users/bucko/Toshi/toshi-proto)`
    
    Caused by:
      process didn't exit successfully: `/Users/bucko/Toshi/target/release/build/toshi-proto-879e8b6972e136dd/build-script-build` (exit code: 101)
    --- stderr
    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/libcore/result.rs:1165:5
    stack backtrace:
       0:        0x107bcbf25 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hb58f5347bb1ea7b9
       1:        0x107bed9f1 - core::fmt::write::ha9abb3322774bd4f
       2:        0x107bdbe39 - std::io::Write::write_fmt::h72ef3cad46ef5da7
       3:        0x107bcdb43 - std::panicking::default_hook::{{closure}}::hb0f024d11ad4eda7
       4:        0x107bcd86f - std::panicking::default_hook::h8f36dfbe4fa71efd
       5:        0x107bce27c - std::panicking::rust_panic_with_hook::h19f1d0cf9bc48900
       6:        0x107bcdd39 - std::panicking::continue_panic_fmt::ha03dadffb484b681
       7:        0x107bcdc89 - rust_begin_unwind
       8:        0x107bf5b9c - core::panicking::panic_fmt::hf2ccf250d7b43fe7
       9:        0x107bf59a9 - core::result::unwrap_failed::h4d74b9130c1d473e
      10:        0x107ae7971 - tonic_build::fmt::hcc47c796d443ae31
      11:        0x107aba0fd - build_script_build::main::hc8f0475db060ef8d
      12:        0x107a9f11b - std::rt::lang_start::{{closure}}::h27ced1c1e74dca7f
      13:        0x107bcdc68 - std::panicking::try::do_call::he4301933c114d277
      14:        0x107bdde1f - __rust_maybe_catch_panic
      15:        0x107bcac63 - std::rt::lang_start_internal::h93184985107106ec
      16:        0x107abbd79 - main
    

    To Reproduce Steps to reproduce the behavior:

    1. On macOS Catalina 10.15.1, run git clone https://github.com/toshi-search/Toshi.git
    2. cd Toshi
    3. export RUST_BACKTRACE=full
    4. cargo build --release
    5. See error (copied & pasted above)

    Expected behavior Toshi compiles successfully

    Desktop (please complete the following information):

    • OS: macOS Catalina 10.15.1
    • Rust Version: 1.41.0
    Does not build 
    opened by mediumrarez 12
  • Cluster Tracking Issue

    Cluster Tracking Issue

    This issue tracks implementation of clustering into Toshi.

    • [x] Decide on clustering method (master/client, masterless). Relates to #15
    • [x] Add CLI flag for Consul cluster address. Relates to #18
    • [x] Add CLI flag for cluster name. Relates to #19
    • [x] Decide on shard replication scheme. Relates to #16.
    • [ ] Enumerate failure modes
    • [ ] Write tests for failure modes (network splits, etc)
    • [ ] Write module to register with Consul. Relates to #24.

    * Hijacking Fletcher's issue for my own tracking * - Steve

    I've laid out some of the groundwork for this with tower-grpc, so I should be able to get rolling with this today.

    Things I'd like to do by end of the year

    • [ ] Create master election
      • [ ] A way to just tell Toshi the masters and nodes without consul doing it
      • [x] Have nodes report the indexes they have to consul for Placement Driver
    • [x] Extend IndexHandle to understand it might have to do an RPC call to fulfill a query
    • [x] Move current IndexHandle into LocalIndexHandle
    • [x] Move Index Searching into IndexHandle and out of IndexCatalog
    • [x] Extend IndexCatalog to include both LocalIndexHandle and RemoteIndexHandle
    • [ ] Attempt to re-factor / re-use as much of the searching / indexing code available to fulfill RPC needs
    • [x] Get a working, naive base case working for distributed, index creation, document addition and searching.
    • [ ] Finish the 3000 commendation score Shield in The Division
    magic unicorn clustering 
    opened by fhaynes 9
  • How to sort search?

    How to sort search?

    If re-indexing is not supported (#490), I wonder how to properly sort a search result?

    In fact, I understand I can create multiple indexes for each sort order I need (e.g: my_index_asc, my_index_desc), however that means I should be adding new documents in the correct order which is not realistic.

    That being said, I don't get the limit option on queries if it does not come with a sorting instruction. Maybe I misunderstood something; can someone shed some light on this?

    Thanks :)

    enhancement good first issue 
    opened by Jonarod 8
  • Unexpected character in root json response

    Unexpected character in root json response

    After issuing the command curl -X GET http://localhost:8080 --output - the following response is returned by toshi:

    ){"name":"Toshi Search","version":"0.1.1"}
    

    As you can see, this is invalid JSON; there's an additional character at the start of the response, and it varies between -, +, ,, ), etc. I don't know why it's returning this; seems like garbage to me. Removing the Deflate middleware somehow solves the problem. If you try to change the name in ToshiInfo the initial byte changes too, pretty strange.

    bug invalid 
    opened by mauri870 7
  • Bump remove_dir_all from 0.5.2 to 0.5.3

    Bump remove_dir_all from 0.5.2 to 0.5.3

    Bumps remove_dir_all from 0.5.2 to 0.5.3.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language
    • @dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

    Additionally, you can set the following in your Dependabot dashboard:

    • Update frequency (including time of day and day of week)
    • Pull request limits (per update run and/or open at any time)
    • Automerge options (never/patch/minor, and dev/runtime dependencies)
    • Out-of-range updates (receive only lockfile updates, if desired)
    • Security updates (receive only security updates, if desired)
    dependencies 
    opened by dependabot-preview[bot] 6
  • Bump hashbrown from 0.4.0 to 0.5.0

    Bump hashbrown from 0.4.0 to 0.5.0

    Bumps hashbrown from 0.4.0 to 0.5.0.

    Changelog

    Sourced from hashbrown's changelog.

    [v0.5.0] - 2019-06-12

    Fixed

    • Resize with a more conservative amount of space after deletions. (#86)

    Changed

    • Exposed the Layout of the failed allocation in CollectionAllocErr::AllocErr. (#89)
    Commits
    • e7cd4a5 Version 0.5.0
    • 2c16341 Auto merge of #89 - SimonSapin:layout, r=Amanieu
    • c2e26e7 Expose the Layout of the failed allocation in CollectionAllocErr::AllocErr
    • 14bb115 Auto merge of #87 - josephrocca:patch-1, r=Amanieu
    • 80dc544 Add simple usage code
    • b240ee0 Auto merge of #86 - edre:rehash, r=Amanieu
    • afc20d7 Resize with a more conservative amount of space when inserting after deletions.
    • 119f429 Auto merge of #84 - edre:bench, r=Amanieu
    • 8783490 Update insert_erase benchmark to keep the table size constant.
    • 143609d Auto merge of #83 - edre:bench, r=Amanieu
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot-preview[bot] 6
  • Failed registering node: connection refused

    Failed registering node: connection refused

    After a successful build of the release target (Ubuntu), while running

    $ RUST_BACKTRACE=1 ./target/release/toshi
    

    I keep on getting:

     INFO  toshi > Settings { host: "127.0.0.1", port: 8080, path: "data/", place_addr: "0.0.0.0:8082", log_level: "info", writer_memory: 200000000, json_parsing_threads: 4, auto_commit_duration: 10, bulk_buffer_size: 10000, merge_policy: ConfigMergePolicy { kind: "log", min_merge_size: Some(8), min_layer_size: Some(10000), level_log_size: Some(0.75) }, consul_addr: "127.0.0.1:8500", cluster_name: "kitsune", enable_clustering: true, master: true, nodes: ["127.0.0.1:8081", "127.0.0.1:8082"] }
    
      ______         __   _   ____                 __
     /_  __/__  ___ / /  (_) / __/__ ___ _________/ /
      / / / _ \(_-</ _ \/ / _\ \/ -_) _ `/ __/ __/ _ \
     /_/  \___/___/_//_/_/ /___/\__/\_,_/_/  \__/_//_/
     Such Relevance, Much Index, Many Search, Wow
     
     ERROR toshi > Error: Failed registering Node: Inner(Inner(Error { kind: Connect, cause: Os { code: 111, kind: ConnectionRefused, message: "Connection refused" } }))
    thread 'main' panicked at 'internal error: entered unreachable code: Shutdown signal channel should not error, This is a bug.', src/bin/toshi.rs:68:22
    stack backtrace:
       0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
                 at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
       1: std::panicking::default_hook::{{closure}}
                 at src/libstd/sys_common/backtrace.rs:71
                 at src/libstd/sys_common/backtrace.rs:59
                 at src/libstd/panicking.rs:211
       2: std::panicking::rust_panic_with_hook
                 at src/libstd/panicking.rs:227
                 at src/libstd/panicking.rs:491
       3: std::panicking::continue_panic_fmt
                 at src/libstd/panicking.rs:398
       4: std::panicking::begin_panic_fmt
                 at src/libstd/panicking.rs:353
       5: toshi::main::{{closure}}
       6: <futures::task_impl::Spawn<T>>::enter::{{closure}}
       7: toshi::main
       8: std::rt::lang_start::{{closure}}
       9: main
      10: __libc_start_main
      11: _start
    

    netstat shows me that 8080 isn't in use by another process, and running the command with sudo doesn't change anything. The message clearly states that this is a bug. So... is there a solution or not?

    bad error handling 
    opened by vicrac 6
  • toshi panic at run time due to tokio

    toshi panic at run time due to tokio

    The following exception was thrown at run time when running toshi on the latest update:

    thread 'tokio-runtime-worker' panicked at 'assertion failed: now >= lock.elapsed', /home/my-user/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.2.0/src/time/driver/mod.rs:260:9

    opened by abdelmonem17 5
  • Build issues

    Build issues

    Hi. I am trying to run this on Ubuntu. The build command gives the following error:

    error: failed to run custom build command for toshi-proto v0.1.0 (/home/ubuntu/Toshi/toshi-proto)

    Caused by: process didn't exit successfully: /home/ubuntu/Toshi/target/release/build/toshi-proto-d68a507dc1a704df/build-script-build (exit code: 1) --- stderr error running rustfmt: Os { code: 2, kind: NotFound, message: "No such file or directory" }

    How can it be rectified? Thank you.

    opened by mjl4043 5
  • Replace futures mpsc and oneshot by tokio-sync

    Replace futures mpsc and oneshot by tokio-sync

    The tokio team released the tokio-sync crate which offers a mpsc and oneshot channel analogous to the ones in the futures crate.

    In some benchmarks, the new mpsc shows up to a 7x improvement over the futures version. The oneshot implementation only provides a slight performance improvement, but when updated to the std version of Future much better performance can be expected.

    toshi-search#115

    opened by mauri870 5
  • Improve error reporting when loading index_catalog fails

    Improve error reporting when loading index_catalog fails

    Hello!

    I tried building and running Toshi on https://gitpod.io, and I got this mysterious error:

    $ cargo build --release
    [...]
    $ target/release/toshi
    Error Encountered - IO Error: No such file or directory (os error 2)
    

    This pull request helped me better understand the problem:

    $ target/release/toshi
    Error loading index_catalog from path "data/" - IO Error: No such file or directory
    

    And indeed:

    $ mkdir data
    $ target/release/toshi
     INFO  toshi::index > Indexes: []
    
      ______         __   _   ____                 __
     /_  __/__  ___ / /  (_) / __/__ ___ _________/ /
      / / / _ \(_-</ _ \/ / _\ \/ -_) _ `/ __/ __/ _ \
     /_/  \___/___/_//_/_/ /___/\__/\_,_/_/  \__/_//_/
     Such Relevance, Much Index, Many Search, Wow
    
    opened by jankeromnes 5
  • Reindexing an existing index

    Reindexing an existing index

    Is your feature request related to a problem? Please describe. I am not able to find a way to re-index or even delete an existing index. Once the index is created, I have to assume that the schema will never change.

    Does another search engine have this functionality? Can you describe its function? N/A

    Do you have a specific use case you are trying to solve? Create an index for a given schema. Add some documents to your created index. Update the schema by introducing two additional fields. You should be able to add documents or edit existing ones with the two newly introduced fields.

    Additional context N/A

    opened by seekeramento 4
  • How close to Elasticsearch do you want to be?

    How close to Elasticsearch do you want to be?

    We're looking for a lightweight alternative to Elasticsearch that we could use in automated tests. It's not 100% clear from the README whether the goal of this project is to create something that's API-compatible with Elasticsearch.

    For it to work for our purpose we need the REST API to be the same and for Toshi to accept queries in the same format. Does this align with how you see the goals of this project?

    Thank you for working on it. It's a great idea.

    opened by taras 1
  • POSTd data over 8kb in size fails

    POSTd data over 8kb in size fails

    Describe the bug curl -v -H "Content-Type: application/json" --data-binary @bulkbody.json http://localhost:8000/emails/_bulk

    If bulkbody.json is over 8kb in size, toshi returns 400 bad request {"message":"Error in Index: 'The provided string is not valid JSON'"}

    To Reproduce Steps to reproduce the behavior: POST a request over 8kb in size

    Expected behavior Should be able to accept POST data over 8kb, or the limit should be configurable


    opened by prawnsalad 3
  • More guidances in need

    More guidances in need

    Hi folks,

    I want to say, you're doing a great job migrating/creating a full-text search engine in Rust; maybe it will take a piece of the cake from ES.

    I cloned the project, compiled it, and did some tests based on the requests in requests.http and others. Toshi works well and its operations are quite similar to ES, which means a lot to me.

    env: Mac rustc: 1.55.0 branch: master

    But, unfortunately, I got some blocks on my further research:

    1. How do I configure & deploy multiple instances on a single cloud instance? I can't find guidance on the config & deployment, though. Do I need to copy the executable toshi along with the config folder to another path? Do I need to keep the hierarchical structure of the folders?
    2. What are the differences among these config files: config.toml, config-bench.toml, config-rpc.toml? Each of them contains similar content on the master branch; which config file do I need in which scenario?
    3. Is there any other way to do operations in Toshi besides the RESTful API, e.g. a Rust (or other language) SDK?
    opened by godlockin 3
  • Is there any way to use a custom tokenizer?

    Is there any way to use a custom tokenizer?

    Is your feature request related to a problem? Please describe. https://github.com/tantivy-search/tantivy#features One of the features tantivy provides is to support custom tokenizers. For example tantivy-jieba. Is it possible for Toshi to support this feature?

    Does another search engine have this functionality? Can you describe its function?

    Do you have a specific use case you are trying to solve?

    Additional context

    opened by dzcpy 5
  • Search for all fields instead of a single field

    Search for all fields instead of a single field

    Currently when searching we need to provide the fields we explicitly need to search for. e.g.

    { "query": { "term": { "lyrics": "cool" } }, "limit": 10 }

    Is it possible to specify a query that can search on all the fields? Am I thinking about it in the right way? I have been able to do so in Elasticsearch. Also, looking at the tantivy Wikipedia example, it seems we should be able to do it.

    opened by prabhatsharma 1