A mimimal Rust implementation of Llama.c

Overview

llama2.rs

Llama with a crab

Rust meets llama.

A mimimal Rust implementation of karpathy's llama.c.

Currently the code uses the 15M parameter model provided by Karpathy (included in the resources folder). But you should be able to replace that with any llama model. You can read the section here to download larger models.

Performance:

Right now I'm getting similiar performance on my M1 Macbook for llama.c and llama.rs (~120 tok/s). Though I think we can unlock a lot of performance benifits by parallelising some parts of the code. Left some comments in main.rs on where we can make these gains. I'm no expert on Rust, so PRs are always welcome.

Quick start

# Development
> cargo run

# Prod
> cargo build --release && ./target/release/llama2rs

TODO:

  • Support for quantized versions, 16 bit / 4 bit.
  • More parallelization.
  • Other improvements like taking in the temp / starting completion string / model path as command line args.
You might also like...
An implementation of Joker Calculus in Rust

Joker Calculus An implementation of Joker Calculus in Rust Based on paper Joker Calculus, by Daniel Fischer, William Alexander Morris and Sven Nilsen

dustls, a pure-rust DTLS implementation

dustls, a pure-rust DTLS implementation A DTLSv1.2 implementation in Rust, reusing rustls for cryptographic primitives and most message payload format

Mild RSA implementation written in Rust for a class.

rust_rsa About this repo This is my Rust implementation of the RSA encryption standard, based on this book. This is for my CS 3000 - Advanced Algorith

A clean implementation of Reso using Rust.

A clean implementation of Reso using Rust. The principle of Reso Rust is almost identical to Reso, only missing some functionality

An experimental implementation of gitbom in Rust

gitbom-rs "An experimental implementation of GitBOM in Rust" NOTICE: This project is still a work in progress and is not ready for any use beyond expe

Rust implementation for parsing StarCraft .chk files.

bwmap Rust implementation for parsing StarCraft .chk files. bounding.net uses this library to parse StarCraft and StarCraft: Brood War maps and store

rlox-interpreter is an AST-walking implementation of Bob Nystrom's Lox language in Rust.

rlox-interpreter rlox-interpreter is an AST-walking implementation of Bob Nystrom's Lox language in Rust. Disclaimer: This is my first Rust project, d

Rust implementation for Wlroots (Sway, Wayfire, Hikari, River, etc.) of Gnome Screenshot and Idle DBUS Server, which Upwork uses to capture the screen as proof of work.

🚀 upwork-wlroots-bridge 🚀 Rust Implementation for Wlroots (Sway, Wayfire, Hikari, River, etc.) of Gnome Screenshot and Idle DBUS Server (with extra

Rust implementation of ESP32 NVS partition generator.

Simple ESP32 NVS writer library for Rust Overview A library for Rust to generate NVS partitions for ESP32. How to use See examples/write_simple.rs. Ma

Owner
null
A Discord bot, written in Rust, that generates responses using the LLaMA language model.

llamacord A Discord bot, written in Rust, that generates responses using the LLaMA language model. Built on top of llama-rs. Setup Model Obtain the LL

Rustformers 18 Apr 9, 2023
Run LLaMA inference on CPU, with Rust 🦀🚀🦙

LLaMA-rs Do the LLaMA thing, but now in Rust ?? ?? ?? Image by @darthdeus, using Stable Diffusion LLaMA-rs is a Rust port of the llama.cpp project. Th

Rustformers 2.7k Apr 17, 2023
Run LLaMA inference on CPU, with Rust 🦀🚀🦙

LLaMA-rs Do the LLaMA thing, but now in Rust ?? ?? ?? Image by @darthdeus, using Stable Diffusion LLaMA-rs is a Rust port of the llama.cpp project. Th

Rustformers 2.7k Apr 17, 2023
A rusty interface to llama.cpp for rust

llama-cpp-rs Higher level API for the llama-cpp-sys library here: https://github.com/shadowmint/llama-cpp-sys/ A full end-to-end example can be found

Doug 3 Apr 16, 2023
Rust bindings to llama.cpp, using metal on macOS

llama-rs Rust bindings to llama.cpp, for macOS, with metal support, for testing and evaluating whether it would be worthwhile to run an Llama model lo

Max Brunsfeld 7 Aug 31, 2023
High-level, optionally asynchronous Rust bindings to llama.cpp

llama_cpp-rs Safe, high-level Rust bindings to the C++ project of the same name, meant to be as user-friendly as possible. Run GGUF-based large langua

Binedge.ai 4 Nov 21, 2023
OpenAI compatible API for serving LLAMA-2 model

Cria - Local llama OpenAI-compatible API The objective is to serve a local llama-2 model by mimicking an OpenAI API service. The llama2 model runs on

AmineDiro 66 Aug 8, 2023
The official rust implementation of the SpamProtectionBot API

SpamProtection-rs Table of contents About Supported Rust version Features How to use Credits License About SpamProtection-Rust is a Rust wrapper for I

Intellivoid 0 Feb 26, 2022
RusTiny -- A Rust implementation of Tiny+ language

RusTiny -- A Rust implementation of Tiny+ language 编译器实践 基本要求: 参考《编译原理及实践》的TINY语言编译器(已上传到群中)完成TINY+ 语言(见附录 A)的解释器:即给定满足 TINY+语言的源代码输入,你的解 释器可以给出对其的解释执

M4tsuri 2 May 22, 2022
kindly is a simple Rust implementation of a set-user-ID-root program, similar to sudo but in a much reduced way.

kindly is a simple Rust implementation of a set-user-ID-root program, similar to sudo but in a much reduced way.

Vinícius Miguel 26 Dec 5, 2022