🔎 A simple in-memory search for collections and key-value stores.

Overview

Indicium Search

🔎 A simple in-memory search for collections (Vec, HashMap, BTreeMap, etc) and key-value stores. Features autocompletion.

There are many incredible search engines available for Rust. Many seem to require compiling a separate server binary. I wanted something simple and light-weight - an easy-to-use crate that could conveniently search structs and collections within my own binary. So, I made indicium.

While indicium was made with web apps in mind, it is an in-memory search and does not scale indefinitely, nor to Facebook or Google size. Even in such an environment, it would still be a convenient way of searching large static lists (such as currencies, languages, countries, etc.). It's also great for applications where there is an anticipated scale limit (e.g. searching a list of company assets, a list of users in a corporate intranet, etc.).

Indicium can easily handle tens of thousands of records without breaking a sweat. This crate is primarily limited by available memory. However, depending on the nature of your data set, and if there are keywords that are repeated many times, performance may begin to degrade at some point.

What's New?

  • 0.3.0: Added new search type SearchType::Live which is for "search as you type" interfaces. It is sort of a hybrid between autocomplete and SearchType::And. It will search using an (incomplete) string and return keys as the search results. Each resulting key can then be used to retrieve the full record from its collection to be rendered & displayed to the user.

Quick Start Guide

For our Quick Start Guide example, we will be searching inside of the following struct:

struct MyStruct {
    title: String,
    year: u16,
    body: String,
}

1. Implementing Indexable

To begin, we must make our record indexable. We'll do this by implementing the Indexable trait for our struct. The idea is to return a String for every field that we would like to be indexed. Example:

use indicium::simple::Indexable;

impl Indexable for MyStruct {
    fn strings(&self) -> Vec<String> {
        vec![
            self.title.clone(),
            self.year.to_string(),
            self.body.clone(),
        ]
    }
}

Don't forget that you may make numbers, numeric identifiers, enums, and other types indexable by converting them to a String and including them in the returned Vec.
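
As a sketch of that point, a record with a numeric identifier and an enum field could be made indexable roughly as follows. The `Asset` struct, the `Status` enum, and their fields are hypothetical. A local stand-in `Indexable` trait with the same `strings` signature is declared only so that the snippet compiles on its own; in a real project you would `use indicium::simple::Indexable;` instead.

```rust
use std::fmt;

// Local stand-in for `indicium::simple::Indexable` (assumption: same
// `strings` signature as shown above) so this sketch is self-contained.
trait Indexable {
    fn strings(&self) -> Vec<String>;
}

// Hypothetical enum field.
#[derive(Debug, Clone, Copy)]
enum Status {
    Active,
    Retired,
}

// Implementing `Display` gives the enum a `to_string` method.
impl fmt::Display for Status {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let text = match self {
            Status::Active => "active",
            Status::Retired => "retired",
        };
        write!(f, "{}", text)
    }
}

// Hypothetical record with a numeric identifier and an enum.
struct Asset {
    id: u32,
    name: String,
    status: Status,
}

impl Indexable for Asset {
    fn strings(&self) -> Vec<String> {
        vec![
            self.id.to_string(),     // number -> String
            self.name.clone(),
            self.status.to_string(), // enum -> String via `Display`
        ]
    }
}

fn main() {
    let assets = [
        Asset { id: 42, name: "Forklift".to_string(), status: Status::Active },
        Asset { id: 43, name: "Crane".to_string(), status: Status::Retired },
    ];
    for asset in &assets {
        println!("{:?}", asset.strings());
    }
}
```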

2. Indexing a Collection

To index an existing collection, we can iterate over the collection. For each record, we will insert it into the search index. This should look something like these two examples:

Vec

use indicium::simple::SearchIndex;

let my_vec: Vec<MyStruct> = Vec::new();

// In the case of a `Vec` collection, we use the index as our key. A
// `Vec` index is a `usize` type. Therefore we will instantiate
// `SearchIndex` as `SearchIndex<usize>`.

let mut search_index: SearchIndex<usize> = SearchIndex::default();

my_vec
    .iter()
    .enumerate()
    .for_each(|(index, element)|
        search_index.insert(&index, element)
    );

HashMap

use std::collections::HashMap;
use indicium::simple::SearchIndex;

let my_hash_map: HashMap<String, MyStruct> = HashMap::new();

// In the case of a `HashMap` collection, we use the hash map's key as
// the `SearchIndex` key. In our hypothetical example, we will use
// MyStruct's `title` as the key, which is a `String` type. Therefore
// we will instantiate `HashMap<K, V>` as `HashMap<String, MyStruct>` and
// `SearchIndex<K>` as `SearchIndex<String>`.

let mut search_index: SearchIndex<String> = SearchIndex::default();

my_hash_map
    .iter()
    .for_each(|(key, value)|
        search_index.insert(key, value)
    );

As long as the Indexable trait was implemented for your value type, the above examples will index a previously populated Vec or HashMap. However, the preferred method for large collections is to insert into the SearchIndex as you insert into your collection (Vec, HashMap, etc.)
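
One way to follow that advice is to wrap the collection and its index together so that every insertion updates both. The sketch below is only an illustration: `KeywordIndex` is a toy stand-in (a keyword-to-keys map) and not indicium's `SearchIndex`, and `Library` and `push` are hypothetical names. With the real crate, `push` would call `search_index.insert(&key, &record)` as in the examples above.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Toy stand-in for a search index: maps each lowercase keyword to the
// set of keys whose records contain it. This is NOT indicium's
// implementation, only enough to make the sketch runnable.
#[derive(Default)]
struct KeywordIndex {
    keywords: BTreeMap<String, BTreeSet<usize>>,
}

impl KeywordIndex {
    fn insert(&mut self, key: usize, text: &str) {
        for word in text.split_whitespace() {
            self.keywords
                .entry(word.to_lowercase())
                .or_default()
                .insert(key);
        }
    }

    fn search(&self, keyword: &str) -> Vec<usize> {
        self.keywords
            .get(&keyword.to_lowercase())
            .map(|keys| keys.iter().copied().collect())
            .unwrap_or_default()
    }
}

// Hypothetical wrapper that keeps the collection and its index in step.
#[derive(Default)]
struct Library {
    records: Vec<String>,
    index: KeywordIndex,
}

impl Library {
    // Push the record and index it in the same call, so the index never
    // has to be rebuilt from the whole collection later.
    fn push(&mut self, record: &str) -> usize {
        let key = self.records.len();
        self.index.insert(key, record);
        self.records.push(record.to_string());
        key
    }
}

fn main() {
    let mut library = Library::default();
    library.push("William the Conqueror");
    library.push("William Rufus");
    library.push("Henry Beauclerc");
    println!("{:?}", library.index.search("william"));
}
```

Keeping both insertions behind one method makes it harder for the collection and the index to drift apart.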

Once the index has been populated, you can use the search and autocomplete methods.

3. Searching

The search method will return keys as the search results. Each resulting key can then be used to retrieve the full record from its collection.

Basic usage:

let mut search_index: SearchIndex<usize> = SearchIndex::default();

search_index.insert(&0, &MyType::from("Harold Godwinson"));
search_index.insert(&1, &MyType::from("Edgar Ætheling"));
search_index.insert(&2, &MyType::from("William the Conqueror"));
search_index.insert(&3, &MyType::from("William Rufus"));
search_index.insert(&4, &MyType::from("Henry Beauclerc"));

let resulting_keys: Vec<&usize> = search_index.search("William");

assert_eq!(resulting_keys, vec![&2, &3]);
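
To turn the returned keys back into records, each key can be looked up in the original collection. The following is a minimal sketch for a `Vec`-backed collection; the helper name `records_for_keys` is hypothetical, and the keys are hard-coded to the values the search above returned so that the snippet runs without the crate.

```rust
// Look up each key returned by a search in the original collection.
// `get` returns `None` for a key that is out of range (for instance if
// the collection shrank after it was indexed), so such keys are skipped.
fn records_for_keys<'a, T>(collection: &'a [T], keys: &[&usize]) -> Vec<&'a T> {
    keys.iter()
        .filter_map(|key| collection.get(**key))
        .collect()
}

fn main() {
    let kings = vec![
        "Harold Godwinson",
        "Edgar Ætheling",
        "William the Conqueror",
        "William Rufus",
        "Henry Beauclerc",
    ];

    // Keys as returned by the search for "William" in the example above.
    let resulting_keys: Vec<&usize> = vec![&2, &3];

    for record in records_for_keys(&kings, &resulting_keys) {
        println!("{}", record);
    }
}
```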

Search only supports exact keyword matches and does not use fuzzy matching. Consider providing the autocomplete feature to your users as an ergonomic alternative to fuzzy matching.

4. Autocompletion

The autocomplete method will provide several autocompletion options for the last keyword in the supplied string.

Basic usage:

let mut search_index: SearchIndex<usize> =
    SearchIndexBuilder::default()
        .autocomplete_type(&AutocompleteType::Global)
        .build();

search_index.insert(&0, &MyType::from("apple"));
search_index.insert(&1, &MyType::from("ball"));
search_index.insert(&2, &MyType::from("bath"));
search_index.insert(&3, &MyType::from("bird"));
search_index.insert(&4, &MyType::from("birthday"));
search_index.insert(&5, &MyType::from("red"));
search_index.insert(&6, &MyType::from("truck"));

let autocomplete_options: Vec<String> =
    search_index.autocomplete("a very big bi");

assert_eq!(
    autocomplete_options,
    vec!["a very big bird", "a very big birthday"]
);
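
To make the idea of completing only the last keyword more concrete, here is a toy sketch built on a sorted set of keywords. It is not indicium's implementation; the free function `autocomplete` and the use of a `BTreeSet` are assumptions made purely for illustration.

```rust
use std::collections::BTreeSet;

// Toy illustration of last-keyword autocompletion. NOT indicium's
// implementation: it only shows the idea of completing the final,
// partially typed keyword against a sorted set of known keywords.
fn autocomplete(keywords: &BTreeSet<String>, input: &str) -> Vec<String> {
    // Split the input into the words already typed and the last,
    // incomplete keyword.
    let mut words: Vec<&str> = input.split_whitespace().collect();
    let prefix = match words.pop() {
        Some(last) => last.to_lowercase(),
        None => return Vec::new(),
    };

    keywords
        // Jump to the first keyword that sorts at or after the prefix...
        .range(prefix.clone()..)
        // ...and stop at the first keyword that no longer starts with it.
        .take_while(|keyword| keyword.starts_with(&prefix))
        // Re-attach the words that came before the completed keyword.
        .map(|keyword| {
            let mut completed = words.clone();
            completed.push(keyword.as_str());
            completed.join(" ")
        })
        .collect()
}

fn main() {
    let keywords: BTreeSet<String> =
        ["apple", "ball", "bath", "bird", "birthday", "red", "truck"]
            .iter()
            .map(|keyword| keyword.to_string())
            .collect();

    println!("{:?}", autocomplete(&keywords, "a very big bi"));
}
```

Because the set is sorted, all keywords sharing a prefix sit next to each other, so the lookup can stop as soon as the prefix no longer matches.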

Comments
  • AutocompleteType has unexpected behavior

    • Code showing the unexpected behavior:
    use indicium::simple::Indexable;
    
    struct MyType<'a>(&'a str);
    
    impl<'a> Indexable for MyType<'a> {
        fn strings(&self) -> Vec<String> {
            vec![self.0.to_string()]
        }
    }
    
    impl From<&'static str> for MyType<'static> {
        fn from(input: &'static str) -> Self {
            MyType(input)
        }
    }
    
    use indicium::simple::AutocompleteType;
    use indicium::simple::SearchIndex;
    use indicium::simple::SearchIndexBuilder;
    
    fn main() {
        let mut search_index: SearchIndex<usize> = SearchIndexBuilder::default()
            .autocomplete_type(AutocompleteType::Global)
            .build();
    
        search_index.insert(&0, &MyType::from("apple"));
        search_index.insert(&1, &MyType::from("ball"));
        search_index.insert(&3, &MyType::from("bird"));
        search_index.insert(&4, &MyType::from("birthday"));
        search_index.insert(&5, &MyType::from("red"));
    
        let autocomplete_options: Vec<String> = search_index.autocomplete("a very big bi");
    
        assert_eq!(
            autocomplete_options,
            vec!["a very big bird", "a very big birthday"]
        );
    }
    

    Behavior: the assertion fails because left does not equal right; the leading 'a' is missing from the autocompletion results:

    thread 'main' panicked at 'assertion failed: (left == right)
      left: ["very big bird", "very big birthday"],
     right: ["a very big bird", "a very big birthday"]', src/main.rs:34:5

    • rustc version: 1.62.0
    • rustup show:

      Default host: aarch64-apple-darwin
      rustup home: /Users/cs/.rustup

      installed toolchains

      stable-aarch64-apple-darwin (default)
      1.61.0-aarch64-apple-darwin

      active toolchain

      stable-aarch64-apple-darwin (default)
      rustc 1.62.0 (a8314ef7d 2022-06-27)

    opened by csh0101 2