Rust library for integrating local LLMs (with llama.cpp) and external LLM APIs.

Overview


Table of Contents
  1. About The Project
  2. Getting Started
  3. Roadmap
  4. Contributing
  5. License
  6. Contact

A Rust interface for the OpenAI API and the llama.cpp ./server API

  • A unified API for testing and integrating OpenAI and Hugging Face LLM models.
  • Load models from Hugging Face with just a URL.
  • Uses the llama.cpp server API rather than bindings, so this project remains usable as long as the llama.cpp server API stays stable.
  • Prebuilt agents (not chatbots) to unlock the true power of LLMs.

Easily switch between models and APIs

// Use an OpenAI model
let llm_definition = LlmDefinition::OpenAiLlm(OpenAiLlmModels::Gpt35Turbo);

// Or use a model from Hugging Face
let zephyr_7b_chat = LlamaLlmModel::new(
    url, // URL of a GGUF model on the Hugging Face Hub
    LlamaPromptFormat::Mistral7BChat,
    Some(2000), // Max tokens for the model, AKA context size
);

let response = basic_text_gen::generate(
        &LlmDefinition::LlamaLlm(zephyr_7b_chat),
        Some("Howdy!"),
    )
    .await?;
eprintln!("{response}");

Get deterministic responses from LLMs

if !boolean_classifier::classify(
        llm_definition,
        Some(hopefully_a_list),
        Some("Is the attached feature a list of content split into discrete entries?"),
    )
    .await?
{
    panic!("{} was not properly split into a list!", hopefully_a_list)
}

Dependencies

async-openai is used to interact with the OpenAI API. A modified version of the async-openai crate is used for the llama.cpp server. If you just need an OpenAI API interface, I suggest using the async-openai crate.

Hugging Face's Rust client is used for model downloads from the Hugging Face Hub.
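For reference, fetching a model file from the Hub with the hf-hub crate looks roughly like this. This is a sketch of the dependency's usage, not this crate's internal code; the repo and file names are just the ones reused from the Getting Started example below.

use hf_hub::api::sync::Api;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Resolve a repo on the Hugging Face Hub and download one file from it.
    // Files are cached locally, so repeated calls are cheap.
    let api = Api::new()?;
    let repo = api.model("TheBloke/Mistral-7B-Instruct-v0.2-GGUF".to_string());
    let model_path = repo.get("mistral-7b-instruct-v0.2.Q8_0.gguf")?;
    println!("Model downloaded to {}", model_path.display());
    Ok(())
}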

(back to top)

Getting Started

Step-by-step guide

  1. Clone the repo:
git clone https://github.com/ShelbyJenkins/llm_client.git
cd llm_client
  2. Optional: Build the devcontainer from llm_client/.devcontainer/devcontainer.json. This will build a dev container with the Nvidia dependencies installed.

  3. Add llama.cpp:

git submodule init 
git submodule update
  4. Build llama.cpp (this depends on your hardware; please see the full instructions here):
# Example build for Nvidia GPUs
cd llm_client/src/providers/llama_cpp/llama_cpp
make LLAMA_CUBLAS=1
  5. Test the llama.cpp ./server:
cargo run -p llm_client --bin server_runner start --model_url "https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/blob/main/mistral-7b-instruct-v0.2.Q8_0.gguf"

This will download and load the given model, and then start the server.

When you see llama server listening at http://localhost:8080, you can load the llama.cpp UI in your browser.
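You can also hit the server's completion endpoint directly to confirm it is generating (this assumes the default llama.cpp /completion route on port 8080; the prompt is just an example):

curl --request POST \
  --url http://localhost:8080/completion \
  --header "Content-Type: application/json" \
  --data '{"prompt": "Building a website can be done in 10 simple steps:", "n_predict": 64}'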

Stop the server with cargo run -p llm_client --bin server_runner stop.

  6. Using OpenAI: Add a .env file in the llm_client dir with the var OPENAI_API_KEY=<key>

Examples
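A minimal end-to-end sketch tying the pieces above together. The import paths and main scaffolding are illustrative guesses from the snippets above, not verified against the crate; it assumes OPENAI_API_KEY is set as described in Getting Started.

use llm_client::{basic_text_gen, LlmDefinition, OpenAiLlmModels};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Point at an OpenAI model (reads OPENAI_API_KEY from the .env file).
    let llm_definition = LlmDefinition::OpenAiLlm(OpenAiLlmModels::Gpt35Turbo);

    // Generate a response from a simple prompt.
    let response = basic_text_gen::generate(&llm_definition, Some("Howdy!")).await?;
    println!("{response}");
    Ok(())
}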

Roadmap

  • Automate starting the llama.cpp server with a specified model
  • Handle the various prompt formats of LLM models more gracefully
  • Unit tests
  • Add additional classifier agents:
    • many from many
    • one from many
  • Implement all OpenAI functionality with llama.cpp
  • More external APIs (Claude, etc.)

(back to top)

Contributing

This is my first Rust crate. All contributions and feedback are more than welcome!

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Shelby Jenkins - Here or LinkedIn

(back to top)
