pyke Diffusers is a modular Rust library for pretrained diffusion model inference to generate images, videos, or audio, using ONNX Runtime as a backend for extremely optimized generation on both CPU & GPU.
Prerequisites
You'll need Rust v1.62.1+ to use pyke Diffusers.
- If using CPU: recent (no earlier than Haswell/Zen) x86-64 CPU for best results. ARM64 supported but not recommended. For acceleration, see notes for OpenVINO, oneDNN, ACL, SNPE
- If using CUDA: CUDA v11.x, cuDNN v8.2.x more info
- If using TensorRT: CUDA v11.x, TensorRT v8.4 more info
- If using ROCm: ROCm v5.2 more info
- If using DirectML: DirectX 12 compatible GPU, Windows 10 v1903+ more info
Only generic CPU, CUDA, and TensorRT have prebuilt binaries available. Other execution providers will require you to build them manually; see the ONNX Runtime docs for more info. Additionally, you'll need to link to your custom-built binaries.
LMS notes
Note: By default, the LMS scheduler is not enabled, and this section can simply be skipped.
If you plan to enable the all-schedulers or scheduler-lms feature, you will need to install binaries for the GNU Scientific Library. See the installation instructions for rust-GSL to set up GSL.
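For example, once GSL is installed, opting into the LMS scheduler is just a matter of enabling the feature named above in Cargo.toml:
[dependencies]
pyke-diffusers = { version = "0.1", features = [ "scheduler-lms" ] }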
Installation
[dependencies]
pyke-diffusers = "0.1"
# if you'd like to use CUDA:
pyke-diffusers = { version = "0.1", features = [ "cuda" ] }
The default features enable some commonly used schedulers and pipelines.
Usage
use std::sync::Arc;
use pyke_diffusers::{Environment, EulerDiscreteScheduler, StableDiffusionOptions, StableDiffusionPipeline, StableDiffusionTxt2ImgOptions};
let environment = Arc::new(Environment::builder().build()?);
let mut scheduler = EulerDiscreteScheduler::default();
let pipeline = StableDiffusionPipeline::new(&environment, "./stable-diffusion-v1-5", &StableDiffusionOptions::default())?;
let imgs = pipeline.txt2img("photo of a red fox", &mut scheduler, &StableDiffusionTxt2ImgOptions::default())?;
imgs[0].clone().into_rgb8().save("result.png")?;
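The pipeline may return more than one image; a minimal sketch for saving every element of the returned collection, using only the calls shown above:
// Save every generated image; the index keeps the filenames unique.
for (i, img) in imgs.iter().enumerate() {
    img.clone().into_rgb8().save(format!("result-{i}.png"))?;
}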
See the docs for more detailed information & examples.
Converting models
To convert a model from a HuggingFace diffusers model:
- Create and activate a virtual environment.
- Install script requirements:
python3 -m pip install -r requirements.txt
- If you are converting a model directly from HuggingFace, log in to HuggingFace Hub with huggingface-cli login (this can be skipped if you have the model on disk)
- Convert your model with scripts/hf2pyke.py:
- To convert a float32 model from HF (recommended for CPU):
python3 scripts/hf2pyke.py runwayml/stable-diffusion-v1-5 ~/pyke-diffusers-sd15/
- To convert a float32 model from disk:
python3 scripts/hf2pyke.py ~/stable-diffusion-v1-5/ ~/pyke-diffusers-sd15/
- To convert a float16 model from HF (recommended for GPU):
python3 scripts/hf2pyke.py runwayml/stable-diffusion-v1-5@fp16 ~/pyke-diffusers-sd15-fp16/
- To convert a float16 model from disk:
python3 scripts/hf2pyke.py ~/stable-diffusion-v1-5-fp16/ ~/pyke-diffusers-sd15-fp16/ -f16
Float16 models are faster on GPUs, but are not hardware-independent (due to an ONNX Runtime issue). Float16 models must be converted on the hardware they will be run on. Float32 models are hardware-independent, but are recommended only for x86 CPU inference or older NVIDIA GPUs.
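Putting the steps above together, a typical conversion session might look like this (the virtual-environment commands are standard Python tooling, and the output path is just an example):
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
huggingface-cli login
python3 scripts/hf2pyke.py runwayml/stable-diffusion-v1-5 ~/pyke-diffusers-sd15/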
ONNX Runtime binaries
On Windows (or other platforms), you may want to copy the ONNX Runtime dylibs to the target folder by enabling the onnx-copy-dylibs Cargo feature.
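For example, in Cargo.toml:
[dependencies]
pyke-diffusers = { version = "0.1", features = [ "onnx-copy-dylibs" ] }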
When running the examples in this repo on Windows, you'll need to also manually copy the dylibs from target/debug/ to target/debug/examples/ on first run. You'll also need to copy the dylibs to target/debug/deps/ if your project uses pyke Diffusers in a Cargo test.
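On Windows, that manual copy might look something like this from the project root (a sketch; it assumes the only .dll files in target\debug\ are the ONNX Runtime DLLs):
:: copy the ONNX Runtime DLLs next to the example and test binaries
copy target\debug\*.dll target\debug\examples\
copy target\debug\*.dll target\debug\deps\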