Blockoli is a high-performance tool for code indexing, embedding generation, and semantic search for use with LLMs.

Overview

blockoli 🥦🔎

Blockoli is a high-performance tool for code indexing, embedding generation, and semantic search for use with LLMs. It is built in Rust and uses the ASTerisk crate for semantic code parsing. Blockoli lets you efficiently index, store, and search code blocks and their embeddings using vector similarity.

Features

  • Index code blocks from a codebase 📂🔍
  • Generate vector embeddings for code blocks using a pre-trained model 🤖🧠
  • Store code blocks and their embeddings in a SQLite database (support for Qdrant soon!) 💾🗄️
  • Perform efficient similarity search on code blocks using vector embeddings (k-d tree algorithm) 🔎⚡
  • REST API for easy integration with other tools and platforms 🌐🔗
  • Fast and memory-efficient implementation using Rust ⚡💻
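
To make the k-d tree similarity search mentioned above more concrete, here is a minimal sketch of nearest-neighbour lookup over embedding vectors. It is illustrative only and is not taken from blockoli's source: the names `Node`, `build`, `nearest`, and `find_nearest` are hypothetical, the distance is squared Euclidean, and every vector is assumed to be non-empty and of the same dimension. For unit-normalized embeddings, ranking by Euclidean distance gives the same order as ranking by cosine similarity.

```rust
// Minimal k-d tree sketch for nearest-neighbour search over embeddings.
// Hypothetical names; not blockoli's actual implementation.

/// One tree node: an embedding, the id of its code block, the axis it
/// splits on, and the two subtrees.
struct Node {
    point: Vec<f32>,
    id: usize,
    axis: usize,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

/// Build a tree by splitting on the median along an axis that cycles with depth.
/// Assumes all vectors are non-empty and share one dimension.
fn build(mut items: Vec<(usize, Vec<f32>)>, depth: usize) -> Option<Box<Node>> {
    if items.is_empty() {
        return None;
    }
    let axis = depth % items[0].1.len();
    items.sort_by(|a, b| a.1[axis].total_cmp(&b.1[axis]));
    let mid = items.len() / 2;
    let right_items = items.split_off(mid + 1); // everything after the median
    let (id, point) = items.pop().unwrap(); // the median itself
    Some(Box::new(Node {
        point,
        id,
        axis,
        left: build(items, depth + 1),
        right: build(right_items, depth + 1),
    }))
}

/// Squared Euclidean distance between two vectors of equal length.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Recursive search that keeps the best (id, squared distance) seen so far.
fn nearest(node: &Option<Box<Node>>, query: &[f32], best: &mut Option<(usize, f32)>) {
    let Some(n) = node else { return };
    let d = dist2(&n.point, query);
    if best.map_or(true, |(_, bd)| d < bd) {
        *best = Some((n.id, d));
    }
    let diff = query[n.axis] - n.point[n.axis];
    let (near, far) = if diff < 0.0 {
        (&n.left, &n.right)
    } else {
        (&n.right, &n.left)
    };
    nearest(near, query, best);
    // Cross the splitting plane only if a closer point could lie beyond it.
    if best.map_or(true, |(_, bd)| diff * diff < bd) {
        nearest(far, query, best);
    }
}

/// Convenience wrapper: id and squared distance of the closest stored vector.
fn find_nearest(tree: &Option<Box<Node>>, query: &[f32]) -> Option<(usize, f32)> {
    let mut best = None;
    nearest(tree, query, &mut best);
    best
}
```

Calling `build(items, 0)` on a list of `(block_id, embedding)` pairs and then `find_nearest(&tree, &query_embedding)` returns the id of the closest block. The pruning step is what lets the search skip whole subtrees, although its benefit shrinks as the embedding dimension grows.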

Installation (from source)

  1. Ensure you have Rust installed on your system. You can install it from the official Rust website: https://www.rust-lang.org/tools/install

  2. Clone the blockoli repository:

    git clone https://github.com/stitionai/blockoli.git
    cd blockoli

  3. Download the tree-sitter grammar files:

    mkdir grammars
    chmod +x get-grammar.sh
    ./get-grammar.sh

  4. Build the project:

    cargo build --release

  5. Run the server:

    ./target/release/blockoli <port>

Replace <port> with the desired port number for the server.

Usage

Blockoli provides a REST API for indexing and searching code blocks. Here are some example API endpoints:

  • POST /project: Create a new project
  • GET /project/{project_name}: Get information about a project
  • DELETE /project/{project_name}: Delete a project
  • POST /project/generate: Generate embeddings for code blocks in a project
  • POST /search/{code_block}: Search for similar code blocks in a project
  • POST /get_blocks/{project_name}: Get all function blocks in a project
  • POST /search_blocks/{function_block}: Search for function blocks in a project
  • POST /search_by_function/{function_name}: Search for blocks by function name in a project

Refer to the routes.rs file for detailed information about each API endpoint and its parameters.
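
As a rough illustration of calling one of the endpoints listed above, the sketch below builds and sends a raw HTTP/1.1 request for `GET /project/{project_name}` using only the Rust standard library. The function names are hypothetical, the address `127.0.0.1:8080` is an assumed example (use whatever port you passed to the server), and the response format is not documented here, so the raw response is returned unparsed. The POST endpoints need request bodies that are defined in routes.rs and are deliberately not guessed at here.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Percent-encode a path segment, leaving only RFC 3986 unreserved
/// characters (letters, digits, '-', '.', '_', '~') as they are.
fn encode_segment(segment: &str) -> String {
    let mut out = String::new();
    for byte in segment.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the raw request text for `GET /project/{project_name}`.
/// `host` is an assumed "address:port" string such as "127.0.0.1:8080".
fn get_project_request(host: &str, project_name: &str) -> String {
    format!(
        "GET /project/{} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        encode_segment(project_name),
        host
    )
}

/// Send the request and return the raw response (status line, headers, body).
/// Requires a blockoli server to be listening on `host`.
fn get_project(host: &str, project_name: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(host)?;
    stream.write_all(get_project_request(host, project_name).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

With a server started as `./target/release/blockoli 8080`, a call such as `get_project("127.0.0.1:8080", "demo")` would return the server's raw reply. In a real client, an HTTP library would normally replace the hand-written request.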

Configuration

ASTerisk uses a configuration file named asterisk.toml for specifying indexing options. Modify this file to customize the behavior of the indexer according to your needs.

Contribution Guidelines

Contributions to Blockoli are welcome! If you find a bug, have a feature request, or want to contribute code improvements, please open an issue or submit a pull request on the GitHub repository.

When contributing code, please ensure that your changes are well-tested and follow the Rust coding conventions and style guidelines.

Contribution

Ways to contribute:

  • Suggest a feature
  • Report a bug
  • Fix something and open a pull request
  • Help document the code
  • Spread the word

License

Licensed under the MIT License. See LICENSE for more information.

Liked the project?

Support the project by starring the repository. ⭐


Owner
Asterisk
AI-augmented security audits for modern development teams.