Fast TLSH-compatible Fuzzy Hashing Library in pure Rust

Overview

fast-tlsh: Fast TLSH-compatible Fuzzy Hashing Library in pure Rust

TLSH stands for Trend Micro Locality Sensitive Hash. It can be used to detect similar files.

You can generate / parse / compare (TLSH-compatible) LSHs with this crate.

Thanks to SIMD-friendly optimizations and its memory layout, comparing two LSHs is significantly faster than in the original implementation. Even if you turn off real SIMD (to forbid any unsafe code), it employs pseudo-SIMD operations and additional tables to speed up the comparison.

Also, it speeds up generating fuzzy hashes (~50% faster) using the "double update" table optimization.
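
As a rough illustration of how an additional table can speed up comparison, the following sketch compares two byte slices that each pack four 2-bit bucket codes per byte. It uses the scoring rule from the TLSH design (a code difference of 3 counts as 6, otherwise the difference itself). All function names here are hypothetical and this is not this crate's actual code or API; it only shows the general idea of replacing per-code arithmetic with one lookup per byte pair.

```rust
/// Distance between two 2-bit codes: a difference of 3 is scored as 6,
/// any other difference is scored as itself.
fn pair_distance(a: u8, b: u8) -> u32 {
    let d = (a as i32 - b as i32).unsigned_abs();
    if d == 3 { 6 } else { d }
}

/// Distance between two bytes, each holding four 2-bit codes.
fn byte_distance(a: u8, b: u8) -> u32 {
    (0..4)
        .map(|i| pair_distance((a >> (2 * i)) & 3, (b >> (2 * i)) & 3))
        .sum()
}

/// Naive comparison: unpack and score every 2-bit code individually.
fn body_distance_naive(x: &[u8], y: &[u8]) -> u32 {
    x.iter().zip(y).map(|(&a, &b)| byte_distance(a, b)).sum()
}

/// Precompute the distance for every pair of byte values (256 * 256 entries).
/// Each entry is at most 4 * 6 = 24, so it fits in a u8.
fn build_table() -> Vec<u8> {
    let mut table = vec![0u8; 256 * 256];
    for a in 0..256usize {
        for b in 0..256usize {
            table[a * 256 + b] = byte_distance(a as u8, b as u8) as u8;
        }
    }
    table
}

/// Table-based comparison: one lookup per byte pair instead of four
/// shift/mask/compare steps.
fn body_distance_table(table: &[u8], x: &[u8], y: &[u8]) -> u32 {
    x.iter()
        .zip(y)
        .map(|(&a, &b)| table[a as usize * 256 + b as usize] as u32)
        .sum()
}

fn main() {
    let table = build_table();
    let x = [0b11_10_01_00u8, 0xFF, 0x00];
    let y = [0b00_01_10_11u8, 0x00, 0x00];
    let naive = body_distance_naive(&x, &y);
    let fast = body_distance_table(&table, &x, &y);
    assert_eq!(naive, fast);
    println!("distance = {fast}");
}
```

The table in this sketch costs 64 KiB of memory, which is the kind of trade-off between speed and static footprint that a cache-rich target can afford and a small embedded target may not.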

Crate Features (Major)

  • alloc and std (default)
    This crate supports no_std (by disabling both of them); alloc and std are built on top of the minimal no_std implementation. These features enable implementations that depend on alloc and std, respectively.
  • easy-functions (default)
    It provides easy-to-use high-level functions.
  • simd (default; fast but unsafe)
    This crate uses unsafe code by default (due to the use of SIMD instructions), but you can still benefit from the other optimizations even if you disable this feature.
  • detect-features (default; marginally slow but convenient)
    This feature depends on std.
    If the simd feature is enabled and switching between SIMD and non-SIMD implementations is feasible, it turns on runtime checks to switch the implementation dynamically.
  • opt-default (default; Recommended if no default features are enabled)
    This crate implements a number of optimizations that may be tuned separately. If you turn off all default features, all such optimizations are turned off. You may enable this feature for the recommended set of optimizations except real SIMD-based ones (which are generally unsafe).
  • opt-embedded-default (Turn off the default features if you use this)
    By default, this crate is optimized for cache-rich environments. For embedded devices with a smaller cache memory, you may use this feature to turn off generating large tables. It makes the code slightly bigger but currently reduces the static memory footprint by 128.25KiB.
  • strict-parser
    It enables the strict parser, which enforces additional validity checks. This is disabled by default (because the official implementation does not perform these checks), but enabling it will make the parser more robust.
  • unsafe (marginally fast but unsafe)
    Other unsafe features not related to SIMD are masked behind the default-disabled feature: unsafe. Note that enabling this feature will not (normally) speed up the program significantly.
  • unstable
    This feature enables some features specific to Nightly Rust, except portable SIMD. Note that this feature heavily depends on the version of rustc and should not be considered stable (don't expect SemVer-compatible semantics).
  • serde
    It enables integration with Serde to serialize / deserialize fuzzy hashes.
  • tests-slow
    It enables "slow" tests (including fuzzing tests).

For all features (including minor tuning-related ones), see the documentation.
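
The runtime switching described under detect-features can be pictured with the sketch below. It uses the standard library's `is_x86_feature_detected!` macro to choose between a function compiled with AVX2 enabled and a portable fallback. The function names and the choice of AVX2 are assumptions made for illustration only; the crate's real dispatch logic and the instruction sets it checks are not described in this document.

```rust
/// Portable fallback: count positions where the two slices differ.
fn count_diff_scalar(x: &[u8], y: &[u8]) -> usize {
    x.iter().zip(y).filter(|(a, b)| a != b).count()
}

/// Same logic, but compiled with AVX2 enabled so the compiler is allowed
/// to use those instructions. Calling it on a CPU without AVX2 is
/// undefined behavior, hence `unsafe`.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn count_diff_avx2(x: &[u8], y: &[u8]) -> usize {
    x.iter().zip(y).filter(|(a, b)| a != b).count()
}

/// Dispatcher: check the CPU at run time and pick an implementation.
fn count_diff(x: &[u8], y: &[u8]) -> usize {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified by the check above.
            return unsafe { count_diff_avx2(x, y) };
        }
    }
    count_diff_scalar(x, y)
}

fn main() {
    let a = b"hello";
    let b = b"help!";
    // Both paths must agree regardless of which one the dispatcher picks.
    assert_eq!(count_diff(a, b), count_diff_scalar(a, b));
    println!("differing bytes: {}", count_diff(a, b));
}
```

The check runs on each call in this sketch, which is one plausible source of the "marginally slow" cost noted for this feature; it also requires std, consistent with the dependency stated above.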

Owner
Tsukasa OI