A localized open-source AI server that is better than ChatGPT.

顾真牛

Last update: Aug 23, 2023

Related tags

Command-line ai openai openai-api llm rwkv chatgpt chatgpt-api chatgpt4 chatgpt4free

Overview

💯AI00 RWKV Server


AI00 RWKV Server is an inference API server based on the RWKV model.

It supports Vulkan inference acceleration and can run on any GPU that supports Vulkan. No Nvidia card is needed! AMD cards and even integrated graphics can be accelerated!

It needs no bulky PyTorch, CUDA, or other runtime environments. It is compact and ready to use out of the box!

Compatible with OpenAI's ChatGPT API interface.

100% open source and commercially usable, under the MIT license.

If you are looking for a fast, efficient, and easy-to-use LLM API server, then AI00 RWKV Server is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.

Join the AI00 RWKV Server community now and experience the charm of AI!

QQ Group for communication: 30920262

💥Features

Based on the RWKV model, with high performance and accuracy
Supports Vulkan inference acceleration, so you get GPU acceleration without CUDA. Works with AMD cards, integrated graphics, and any GPU that supports Vulkan
Needs no bulky PyTorch, CUDA, or other runtime environments; compact and ready to use out of the box
Compatible with OpenAI's ChatGPT API interface

⭕Usages

Chatbots
Text generation
Translation
Q&A
Any other task an LLM can do

👻Other

Based on the web-rwkv project
Model download

Installation, Compilation, and Usage

📦Direct Download and Installation

Download the latest version from the Releases page
After downloading the model, place it in the assets/models/ directory, for example, assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st

Run in the command line

$ ./ai00_server --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st

Open the browser and visit the WebUI http://127.0.0.1:65530

📜Compile from Source Code

Install Rust

Clone this repository

$ git clone https://github.com/cgisky1980/ai00_rwkv_server.git
$ cd ai00_rwkv_server

After downloading the model, place it in the assets/models/ directory, for example, assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st

Compile
```
$ cargo build --release
```

After compilation, run

$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st

Open the browser and visit the WebUI http://127.0.0.1:65530

📝Supported Arguments

--model: Model path
--tokenizer: Tokenizer path
--port: Port to listen on
--quant: Number of layers to quantize
--adepter: Adapter (GPU and backend) selection

Example

The server listens on port 3000, loads the 0.4B model with every layer quantized (the requested 32 exceeds the model's 24 layers), and selects adapter 0. To find the adapter number, omit this parameter on the first run and the program will show an adapter selection page.

$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adepter 0
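
As a rough sketch of how these options compose, the helper below assembles the same command line from Python. The function name `build_server_command` and the default binary path `./ai00_server` are assumptions made for illustration (the path is guessed from the release archive names); only the flags listed above are used.

```python
def build_server_command(model, port=None, quant=None, adapter=None,
                         binary="./ai00_server"):
    """Return the argument list for launching the server.

    Hypothetical helper: only flags listed in the README are emitted, and
    options left as None are omitted so the server uses its own defaults.
    """
    # Assumption: the binary is named ai00_server, as in the release archives.
    command = [binary, "--model", model]
    if port is not None:
        command += ["--port", str(port)]
    if quant is not None:
        command += ["--quant", str(quant)]
    if adapter is not None:
        # The flag is spelled "--adepter" in the README, so it is kept as-is.
        command += ["--adepter", str(adapter)]
    return command
```

The resulting list can be handed to `subprocess.run(command, check=True)` to start the server from a script.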

📙Currently Available APIs

The API service listens on port 65530 by default, and its input and output formats follow the OpenAI API specification.

/v1/models
/models
/v1/chat/completions
/chat/completions
/v1/completions
/completions
/v1/embeddings
/embeddings
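
Because the endpoints follow the OpenAI format, a client can be sketched with the Python standard library alone. This is a minimal sketch, not the project's own client: the helper names, the placeholder `"model"` value, and the sampling values are assumptions, and the response layout is assumed to match OpenAI's `choices[0].message.content`.

```python
import json
import urllib.request

# Default address from the README; change it if the server runs with --port.
BASE_URL = "http://127.0.0.1:65530"


def build_chat_request(prompt, base_url=BASE_URL, max_tokens=256):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        # Assumption: placeholder name; /v1/models should list the real ones.
        "model": "rwkv",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # OpenAI-style sampling parameters; the values are illustrative.
        "temperature": 1.0,
        "top_p": 0.5,
        "presence_penalty": 0.3,
        "frequency_penalty": 0.3,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def first_reply(response_json):
    """Extract the assistant text, assuming the OpenAI response layout."""
    return response_json["choices"][0]["message"]["content"]
```

With a server running locally, `first_reply(send(build_chat_request("Hello")))` should return the model's answer, provided the response matches the assumed layout.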

📝TODO List

Support for text_completions and chat_completions
Support for sse push
Add embeddings
Integrate basic front-end
Parallel inference via batch serve
Support for int8 quantization
Support for SpQR quantization
Support for LoRA model
Hot loading and switching of LoRA model

👥Join Us

We are always looking for people interested in helping us improve the project. If you are interested in any of the following, please join us!

💀Writing code
💬Providing feedback
🔆Proposing ideas or needs
🔍Testing new features
✏Translating documentation
📣Promoting the project
🏅Anything else that would be helpful to us

No matter your skill level, we welcome you to join us. You can join us in the following ways:

Join our Discord channel
Join our QQ group
Submit issues or pull requests on GitHub
Leave feedback on our website

We can't wait to work with you to make this project better! We hope the project is helpful to you!

Thank you to these awesome, insightful, and outstanding individuals for their support and selfless dedication to the project.

顾真牛
📖 💻 🖋 🎨 🧑‍🏫

研究社交
💻 💡 🤔 🚧 👀 📦

josc146
🐛 💻 🤔 🔧

l15y
🔧 🔌 💻

Comments

The server crashed

Hi, since the precompiled version doesn't work, as mentioned in https://github.com/cgisky1980/ai00_rwkv_server/issues/15, I compiled the source code and ran it, but it crashed after I submitted any text in the web user interface. Here are the messages in the console:

$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st 
    Finished release [optimized] target(s) in 1.53s
     Running `target/release/ai00_server --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st`
2023-08-01T01:24:06.717Z WARN  [wgpu_core::instance] Missing downlevel flags: DownlevelFlags(SURFACE_VIEW_FORMATS)
The underlying API or device in use does not support enough features to be a fully compliant implementation of WebGPU. A subset of the features can still be used. If you are running this program on native and not in a browser and wish to limit the features you use to the supported subset, call Adapter::downlevel_properties or Device::downlevel_properties to get a listing of the features the current platform supports.
WARNING: lavapipe is not a conformant vulkan implementation, testing use only.
2023-08-01T01:24:06.768Z INFO  [ai00_server] AdapterInfo {
    name: "llvmpipe (LLVM 12.0.0, 256 bits)",
    vendor: 65541,
    device: 0,
    device_type: Cpu,
    driver: "llvmpipe",
    driver_info: "Mesa 21.2.6 (LLVM 12.0.0)",
    backend: Vulkan,
}
2023-08-01T01:24:08.293Z INFO  [ai00_server] ModelInfo {
    num_layers: 24,
    num_emb: 1024,
    num_vocab: 65536,
}
2023-08-01T01:24:08.554Z INFO  [ai00_server] server started at http://0.0.0.0:65530
2023-08-01T01:24:21.065Z TRACE [ai00_server] Sampler {
    top_p: 0.5,
    temperature: 1.0,
    presence_penalty: 0.3,
    frequency_penalty: 0.3,
}
2023-08-01T01:24:21.065Z TRACE [ai00_server] state cache miss
2023-08-01T01:24:21.065Z TRACE [ai00_server] User: 现在的时间是2023 8月 1日 星期二 早上

Assistant: 好的我知道了！

User: 你是谁？

Assistant: Hello, I am your AI assistant. If you have any questions or instructions, please let me know!

User: test

Assistant:
Segmentation fault

opened by cahya-wirawan 3

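Some comments and suggestions on standardizing the front-end UI text
Suggest using standard modern Chinese: "愉快地聊天吧！" ("Let's have a pleasant chat!")

The /models endpoint could be used here to fetch the real model name.

These labels could also be standardized as "Max Tokens", "Top P", "Presence Penalty", and "Frequency Penalty" to look more professional.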

good first issue
opened by cryscan 3

`GLIBC_2.33' not found

Hi, I tried the compiled version and ran ./ai00_server --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st but I got the following error message:

$ ./ai00_server --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st
./ai00_server: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by ./ai00_server)
./ai00_server: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./ai00_server)
./ai00_server: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by ./ai00_server)

opened by cahya-wirawan 0

An error occurred when the wgpu library compiled a compute shader

2023-08-18T04:24:01.957Z WARN [wgpu::backend::direct] Shader translation error for stage ShaderStages(COMPUTE): HLSL: Unimplemented("write_expr_math Unpack4x8unorm")
2023-08-18T04:24:01.958Z WARN [wgpu::backend::direct] Please report it to https://github.com/gfx-rs/naga
2023-08-18T04:24:01.958Z ERROR [wgpu::backend::direct] Handling wgpu errors as fatal by default
thread 'main' panicked at 'wgpu error: Validation Error

Caused by:
    In Device::create_compute_pipeline
      note: label = matmul
    Internal error: HLSL: Unimplemented("write_expr_math Unpack4x8unorm")

', C:\Users\runneradmin.cargo\registry\src\index.crates.io-6f17d22bba15001f\wgpu-0.16.3\src\backend\direct.rs:3019:5
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

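The following is the explanation given by OpenAI: Based on the error message you provided, it appears that an error occurred while the wgpu library was compiling a compute shader.

The error message points to an HLSL (High-Level Shading Language) compilation error, specifically an unimplemented feature: "Unimplemented("write_expr_math Unpack4x8unorm")". This may mean that your compute shader uses an operation or function that is not yet implemented for HLSL.

Steps to resolve this issue:

Report the issue: As the error message suggests, you can report this to the wgpu developers so they know the feature is not yet implemented and can possibly provide a fix or workaround. You can visit https://github.com/gfx-rs/naga and submit a new issue.

Check the compute shader code: Review your compute shader code, paying particular attention to where the "Unpack4x8unorm" operation or function is used. If possible, try replacing it with another available operation or function.

Update versions: Make sure you are using the latest version of the wgpu library. Someone may already have reported this issue, and it may have been fixed in a newer release.

Backtrace: As the error message suggests, you can set the environment variable RUST_BACKTRACE to 1 to display the full backtrace. This may give more information about where the error occurred and help you troubleshoot.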

I am already on the latest driver. How can I solve this? 😔
opened by TralMac 1

GPTQ quantization support

For some low-performance devices, using GPTQ for 4-bit quantization is very important.

Do you have any plans to add GPTQ quantization support?

Here is a GPTQ quantization project that can be run on a computer: https://github.com/3outeille/GPTQ-for-RWKV

opened by Pevernow 1

Mobile platform support?

This project seems to use Vulkan to provide hardware acceleration, which means it can be easily ported to mobile platforms. Do you currently support mobile platforms, or do you have plans to support them in the future?

opened by Pevernow 2

Add a text conversion API

Add a text conversion API that converts common document formats such as docx and PDF into txt, to facilitate later development of features such as document chat and personal knowledge bases.

Suggestions for other document formats to support are welcome, and so are PRs.

Needs Pr
good first issue

opened by cgisky1980 0

Releases(v0.1.11)

v0.1.11(Aug 2, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.11-x86_64-apple-darwin.zip(7.03 MB)
ai00_server-v0.1.11-x86_64-pc-windows-msvc.zip(7.26 MB)
ai00_server-v0.1.11-x86_64-unknown-linux-gnu.zip(7.53 MB)
v0.1.10(Aug 1, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.10-x86_64-apple-darwin.zip(7.03 MB)
ai00_server-v0.1.10-x86_64-pc-windows-msvc.zip(7.26 MB)
ai00_server-v0.1.10-x86_64-unknown-linux-gnu.zip(7.53 MB)
v0.1.9(Jul 31, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.9-x86_64-apple-darwin.zip(7.00 MB)
ai00_server-v0.1.9-x86_64-pc-windows-msvc.zip(7.23 MB)
ai00_server-v0.1.9-x86_64-unknown-linux-gnu.zip(7.51 MB)
v0.1.8(Jul 28, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.8-x86_64-apple-darwin.zip(6.77 MB)
ai00_server-v0.1.8-x86_64-pc-windows-msvc.zip(7.08 MB)
ai00_server-v0.1.8-x86_64-unknown-linux-gnu.zip(7.26 MB)
v0.1.7(Jul 28, 2023)
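Fix webui #13

Front end: efficient prevention of the model going off track

Back end: code cleanup

WebUI: added parameter controls

WebUI: improved i18n

Made the model aware of the current time

Many small fixes

Also made a cute logo along the way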

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.7-x86_64-apple-darwin.zip(6.75 MB)
ai00_server-v0.1.7-x86_64-pc-windows-msvc.zip(7.08 MB)
ai00_server-v0.1.7-x86_64-unknown-linux-gnu.zip(7.25 MB)
v0.1.6(Jul 25, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.6-x86_64-apple-darwin.zip(5.32 MB)
ai00_server-v0.1.6-x86_64-pc-windows-msvc.zip(5.65 MB)
ai00_server-v0.1.6-x86_64-unknown-linux-gnu.zip(5.83 MB)
v0.1.5(Jul 24, 2023)
What's Changed

docs: add cgisky1980 as a contributor for doc by @allcontributors in https://github.com/cgisky1980/ai00_rwkv_server/pull/7

docs: add cryscan as a contributor for code, example, and 4 more by @allcontributors in https://github.com/cgisky1980/ai00_rwkv_server/pull/8

docs: add josStorer as a contributor for bug, code, and 2 more by @allcontributors in https://github.com/cgisky1980/ai00_rwkv_server/pull/9

docs: add l15y as a contributor for tool, plugin, and code by @allcontributors in https://github.com/cgisky1980/ai00_rwkv_server/pull/10

New Contributors

@allcontributors made their first contribution in https://github.com/cgisky1980/ai00_rwkv_server/pull/7

Full Changelog: https://github.com/cgisky1980/ai00_rwkv_server/compare/v0.1.4...v0.1.5
Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.5-x86_64-apple-darwin.zip(5.31 MB)
ai00_server-v0.1.5-x86_64-pc-windows-msvc.zip(5.63 MB)
ai00_server-v0.1.5-x86_64-unknown-linux-gnu.zip(5.79 MB)
v0.1.4(Jul 24, 2023)
What's Changed

Add the Wenda paper front end by @l15y in https://github.com/cgisky1980/ai00_rwkv_server/pull/4

fix max_tokens by @l15y in https://github.com/cgisky1980/ai00_rwkv_server/pull/5

Merge chat and Wenda paper front ends by @cgisky1980 in https://github.com/cgisky1980/ai00_rwkv_server/pull/6

New Contributors

@l15y made their first contribution in https://github.com/cgisky1980/ai00_rwkv_server/pull/4

@cgisky1980 made their first contribution in https://github.com/cgisky1980/ai00_rwkv_server/pull/6

Full Changelog: https://github.com/cgisky1980/ai00_rwkv_server/compare/v0.1.3...v0.1.4
Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.4-x86_64-apple-darwin.zip(5.15 MB)
ai00_server-v0.1.4-x86_64-pc-windows-msvc.zip(5.48 MB)
ai00_server-v0.1.4-x86_64-unknown-linux-gnu.zip(5.65 MB)
v0.1.3(Jul 23, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.3-x86_64-apple-darwin.zip(2.63 MB)
ai00_server-v0.1.3-x86_64-pc-windows-msvc.zip(2.96 MB)
ai00_server-v0.1.3-x86_64-unknown-linux-gnu.zip(3.13 MB)
v0.1.2(Jul 22, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.2-x86_64-apple-darwin.zip(2.56 MB)
ai00_server-v0.1.2-x86_64-pc-windows-msvc.zip(2.88 MB)
ai00_server-v0.1.2-x86_64-unknown-linux-gnu.zip(3.03 MB)
v0.1.1(Jul 22, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.1-x86_64-apple-darwin.zip(2.53 MB)
ai00_server-v0.1.1-x86_64-pc-windows-msvc.zip(2.85 MB)
ai00_server-v0.1.1-x86_64-unknown-linux-gnu.zip(3.00 MB)
v0.1.0(Jul 21, 2023)

Source code(tar.gz)
Source code(zip)
ai00_server-v0.1.0-x86_64-apple-darwin.zip(2.51 MB)
ai00_server-v0.1.0-x86_64-pc-windows-msvc.zip(2.81 MB)
ai00_server-v0.1.0-x86_64-unknown-linux-gnu.zip(3.00 MB)
up(Jul 21, 2023)
First beta release

APIs: /v1/chat/completions /chat/completions /v1/completions /completions

SSE support: set stream = true

Others
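
For readers consuming the stream, the sketch below shows one way to parse server-sent events from a response requested with stream = true. It assumes the stream follows the OpenAI convention: each event is a `data:` line carrying a JSON chunk, the text arrives under `choices[0].delta.content`, and `data: [DONE]` ends the stream. The function names are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield decoded JSON payloads from an iterable of SSE text lines.

    Assumption: OpenAI-style streaming, where each event is a line starting
    with "data:" and the stream ends with "data: [DONE]".
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel; nothing after it is read
        yield json.loads(data)


def collect_text(events):
    """Join the incremental text from chat chunks (assumed "delta" layout)."""
    parts = []
    for event in events:
        delta = event["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)
```

An HTTP response object from `urllib.request.urlopen` yields bytes when iterated, so each line would need `.decode("utf-8")` before being passed to `iter_sse_events`.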

Full Changelog: https://github.com/cgisky1980/ai00_rwkv_server/commits/up
Source code(tar.gz)
Source code(zip)

Owner

顾真牛