50 Repositories
Rust llm-inference Libraries
The fastest CLI tool for prompting LLMs, including support for prompting several LLMs at once!
cai - The fastest CLI tool for prompting LLMs Features Build with Rust for supreme performance and speed! Support for models by Groq, OpenAI, A
Infer a JSON schema from example data, produce nonsense synthetic data (drivel) according to the schema
drivel drivel is a command-line tool written in Rust for inferring a schema from an example JSON (or JSON lines) file, and generating synthetic data (
Evaluate LLM-generated COBOL
COBOLEval: LLM Evaluation for COBOL COBOLEval is a dataset to evaluate the code generation abilities of Large Language Models on the COBOL programming
A Rust LLaMA project to load, serve and extend LLM models
OpenLLaMA Overview A Rust LLaMA project to load, serve and extend LLM models. Key Objectives Support both GGML and HF(HuggingFace) models Support a st
Cloud Native Buildpack that builds an OCI image with Ollama and a large language model.
Ollama Cloud Native Buildpack This buildpack builds an OCI image with Ollama and a large language model. Configure your model by an Ollama Modelfile o
nl-sh: Natural Language Shell
The Natural Language Shell integrates GPT4 or local GGUF-formatted models directly into the terminal experience, allowing operators to describe their tasks in either POSIX commands or fluent human language
An egui app for prompting a local offline LLM.
An egui app for prompting a local offline LLM. Description coze is a small egui application for prompting a local offline LLM using the Huggingface ca
Rust library for integrating local LLMs (with llama.cpp) and external LLM APIs.
A Rust interface for the OpenAI API and Llama.cpp ./server AP
Putting a brain behind `cat`. Integrating language models in the Unix commands ecosystem through text streams.
smartcat (sc) Puts a brain behind cat! CLI interface to bring language models in the Unix ecosystem and allow power users to make the most out of llms
Terminal UI to chat with large language models (LLM) using different model backends, and integrations with your favourite editors!
Oatmeal Terminal UI to chat with large language models (LLM) using different model backends, and integrations with your favourite editors! Overview In
Efficient platform for inference and serving local LLMs including an OpenAI compatible API server.
candle-vllm Efficient platform for inference and serving local LLMs including an OpenAI compatible API server. Features OpenAI compatible API server p
A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust
Poly Poly is a versatile LLM serving back-end. What it offers: High-performance, efficient and reliable serving of multiple local LLM models Optional
Slack chat bot written in Rust that allows the user to interact with a large language model.
A Slack chat bot written in Rust that allows the user to interact with a large language model. Creating an App on Slack, first steps Go to https://api
An LLM-powered (CodeLlama or OpenAI) local diff code review tool.
augre An LLM-powered (CodeLlama or OpenAI) local diff code review tool. Binary Usage Install Windows: $ iwr https://github.com/twitchax/augre/releases
The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.
Fast Vector Similarity Library Introduction The Fast Vector Similarity Library is designed to provide efficient computation of various similarity meas
LLaMA2 port for Rust inspired by llama2.c
llama2-rs LLaMA2 port for Rust inspired by llama2.c. TODOs: Implement loading of the model Implement forward pass Implement generation Implement token
Attempt to summarize text from `stdin`, using a large language model (locally and offline), to `stdout`
summarize-cli Attempt to summarize text from stdin, using a large language model (locally and offline), to stdout. cargo build --release target/releas
A localized open-source AI server that is better than ChatGPT.
AI00 RWKV Server AI00 RWKV Server is an inference API server based on the RWKV model. It supports VULKAN inference acceleration a
Solving context limits when working with AI LLM models by implementing a "chunkable" attribute on your prompt structs.
Promptize Promptize attempts to solve the issues with context limits when working with AI systems. It allows a user to add an attribute to their struc
The simplest implementation of LLM-backed vector search on Postgres.
pg_vectorize under development The simplest implementation of LLM-backed vector search on Postgres. -- initialize an existing table select vectorize.i
Super-simple, fully Rust powered "memory" (doc store + semantic search) for LLM projects, semantic search, etc.
memex Super simple "memory" for LLM projects, semantic search, etc. Running the service Note that if you're running on Apple silicon (M1/M2/etc.), it'
Bring the power of pre-signed URLs to your apps. Signway is a gateway for redirecting authentic signed URLs to the requested API
A gateway that proxies signed requests to other APIs. Check the docs for more info. If you are looking for the managed version checkout this link http
`dfx new --type=rust` + burn-rs MNIST web inference example
ic-mnist The frontend provides a canvas where users can draw a digit. The drawn digit is then sent to the backend canister running burn-rs for inferen
Multi-platform desktop app to download and run Large Language Models (LLMs) locally on your computer.
auto-rust is an experimental project that aims to automatically generate Rust code with LLM (Large Language Models) during compilation, utilizing procedural macros.
Auto Rust auto-rust is an experimental project that aims to automatically generate Rust code with LLM (Large Language Models) during compilation, util
Unofficial Python bindings for the Rust llm library.
llm-rs-python: Python Bindings for Rust's llm Library Welcome to llm-rs, an unofficial Python interface for the Rust-based llm library, made possible
LLM-chain Rust Template Repository
Jumpstart your llm-chain projects with the llm-chain-template repository! This template provides a foundation for using the llm-chain library, complete with example code and instructions to get you started effortlessly.
LLaMa 7b with CUDA acceleration implemented in Rust. Minimal GPU memory needed!
LLaMa 7b in rust This repo contains the popular LLaMa 7b language model, fully implemented in the rust programming language! Uses dfdx tensors and CUD
WIP: Vector database plugin for Postgres, written in Rust, specifically designed for LLMs.
pgvecto.rs pgvecto.rs is a Postgres extension that provides vector similarity search functions. It is written in Rust and based on pgrx. Features cosi
TUI interface for LLMs written in Rust
Tenere TUI interface for LLMs written in Rust Demo Supported LLMs Only ChatGPT is supported for the moment. But I'm planning to support more mod
A program that provides LLMs with the ability to complete complex tasks using plugins.
SmartGPT SmartGPT is an experimental program meant to provide LLMs (particularly GPT-3.5 and GPT-4) with the ability to complete complex tasks without
Run LLaMA inference on CPU, with Rust
LLaMA-rs Do the LLaMA thing, but now in Rust Image by @darthdeus, using Stable Diffusion LLaMA-rs is a Rust port of the llama.cpp project. Th
Rust+OpenCL+AVX2 implementation of LLaMA inference code
RLLaMA RLLaMA is a pure Rust implementation of LLaMA large language model inference.. Supported features Uses either f16 and f32 weights. LLaMA-7B, LL
Believe in AI democratization. LLaMA for Node.js backed by llama-rs; works locally on your laptop CPU. Supports LLaMA/Alpaca models.
llama-node Large Language Model LLaMA on node.js This project is in an early stage, the API for nodejs may change in the future, use it with caution.
A small, basic, and unoptimized version of RWKV in Rust, written by someone with no math or ML knowledge.
Smol Rust RWKV What is it? A simple example of the RWKV approach to language models written in Rust by someone that knows basically nothing about math
`llm-chain` is a powerful Rust crate for building chains in large language models, allowing you to summarise text and complete complex tasks.
llm-chain llm-chain is a collection of Rust crates designed to help you work with Large Language Models (LLMs) more effectively. Our primary focus
A natural language shell interface for *nix systems
Orphic A natural language shell interface for *nix systems. Overview Orphic is a CLI tool that uses GPT to translate complex tasks into shell commands
pyke Diffusers is a modular Rust library for optimized Stable Diffusion inference
pyke Diffusers is a modular Rust library for pretrained diffusion model inference to generate images, videos, or audio, using ONNX Runtime as a backen
Using OpenAI Codex's "davinci-edit" Model for Gradual Type Inference
OpenTau: Using OpenAI Codex for Gradual Type Inference Current implementation is focused on TypeScript Python implementation comes next Requirements r
A statically-typed, interpreted programming language, with generics and type inference
Glide A programming language. Currently, this includes: Static typing Generics, with monomorphization Type inference on function calls func identityT
Wonnx - a GPU-accelerated ONNX inference run-time written 100% in Rust, ready for the web
Wonnx is a GPU-accelerated ONNX inference run-time written 100% in Rust, ready for the web. Supported Platforms (enabled by wgpu) API Windows Linux &
An implementation of a predicative polymorphic language with bidirectional type inference and algebraic data types
Vanilla Lang Vanilla is a pure functional programming language based on System F, a classic but powerful type system. Merits Simple as it is, Vanilla
A fusion of OTP lib/dialyzer + lib/compiler for regular Erlang with type inference
Typed ERLC The Problem I have a dream, that one day there will be an Erlang compiler, which will generate high quality type-correct code from deduced
Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference
Sonos' Neural Network inference engine. This project used to be called tfdeploy, or Tensorflow-deploy-rust. What ? tract is a Neural Network inference
Orkhon: ML Inference Framework and Server Runtime
Orkhon: ML Inference Framework and Server Runtime What is it? Orkhon is Rust framework for Machin
Snips NLU rust implementation
Snips NLU Rust Installation Add it to your Cargo.toml: [dependencies] snips-nlu-lib = { git = "https://github.com/snipsco/snips-nlu-rs", branch = "mas
A static, type inferred and embeddable language written in Rust.
gluon Gluon is a small, statically-typed, functional programming language designed for application embedding. Features Statically-typed - Static typin