Orkhon: ML Inference Framework and Server Runtime

What is it?

Orkhon is a Rust framework for machine learning that runs inference/prediction code written in Python, serves frozen models, and processes unseen data. It is mainly focused on serving models and processing unseen data in a performant manner. Instead of using Python directly and running into server scalability problems, Orkhon addresses them with a built-in async API.

Main features

  • Sync & Async API for models.
  • Easily embeddable engine for well-known Rust web frameworks.
  • API contract for interacting with Python code.
  • High processing throughput:
    • ~4.8361 GiB/s prediction throughput
    • 3_000 concurrent requests take ~4ms on average
  • Python module caching.

Installation

You can include Orkhon in your project with:

[dependencies]
orkhon = "0.2"

Dependencies

You will need:

  • If you use the pymodel feature, the Python development packages must be installed and a working Python runtime must be available to use Orkhon in your project (see the feature flag sketch after this list).
  • If you want TensorFlow inference, TensorFlow must be installed as a library for linking.
  • The ONNX interface doesn't need any extra system-side dependencies.
  • Point your PYTHONHOME environment variable to your Python installation.
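
If you only need one backend, the corresponding Cargo feature can be enabled explicitly. This is a minimal sketch, assuming the pymodel and onnxmodel feature names mentioned above; check the crate documentation for the full and current feature list.

[dependencies]
# Enable only the ONNX backend; use "pymodel" instead for the Python backend.
orkhon = { version = "0.2", features = ["onnxmodel"] }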

Python API contract

For the Python API contract, take a look at the Project Documentation.

Examples

Request a Tensorflow prediction asynchronously

use orkhon::prelude::*;
use orkhon::tcore::prelude::*;
use orkhon::ttensor::prelude::*;
use rand::*;
use std::path::PathBuf;

// Configure the engine with a [10, 100] f32 input fact, register the frozen
// TensorFlow model under a name, and build a shareable runtime handle.
let o = Orkhon::new()
    .config(
        OrkhonConfig::new()
            .with_input_fact_shape(InferenceFact::dt_shape(f32::datum_type(), tvec![10, 100])),
    )
    .tensorflow(
        "model_which_will_be_tested",
        PathBuf::from("tests/protobuf/manual_input_infer/my_model.pb"),
    )
    .shareable();

// Generate random input data and reshape it to match the declared input fact.
let mut rng = thread_rng();
let vals: Vec<_> = (0..1000).map(|_| rng.gen::<f32>()).collect();
let input = tract_ndarray::arr1(&vals).into_shape((10, 100)).unwrap();

// Issue the prediction request asynchronously and wait for the response.
// `block_on` is assumed to come from your async executor of choice
// (e.g. `futures::executor::block_on`).
let o = o.get();
let handle = async move {
    let processor = o.tensorflow_request_async(
        "model_which_will_be_tested",
        ORequest::with_body(TFRequest::new().body(input.into())),
    );
    processor.await
};
let resp = block_on(handle).unwrap();
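
The asynchronous TensorFlow example above and the synchronous ONNX example below build the same random [10, 100] input. As a convenience, the input preparation can be factored into a small helper. This is a minimal sketch: random_input is not part of Orkhon's API, it only uses the tract_ndarray types re-exported through the preludes.

use rand::{thread_rng, Rng};

// Build a random f32 tensor whose shape matches the configured input fact.
fn random_input(rows: usize, cols: usize) -> tract_ndarray::Array2<f32> {
    let mut rng = thread_rng();
    let vals: Vec<f32> = (0..rows * cols).map(|_| rng.gen::<f32>()).collect();
    tract_ndarray::arr1(&vals)
        .into_shape((rows, cols))
        .expect("rows * cols must match the number of generated values")
}

With this helper, random_input(10, 100) replaces the three input-preparation lines in either example.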

Request an ONNX prediction synchronously

This example needs the onnxmodel feature enabled.

use orkhon::prelude::*;
use orkhon::tcore::prelude::*;
use orkhon::ttensor::prelude::*;
use rand::*;
use std::path::PathBuf;

// Configure the engine with a [10, 100] f32 input fact, register the ONNX
// model under a name, and build the runtime.
let o = Orkhon::new()
    .config(
        OrkhonConfig::new()
            .with_input_fact_shape(InferenceFact::dt_shape(f32::datum_type(), tvec![10, 100])),
    )
    .onnx(
        "model_which_will_be_tested",
        PathBuf::from("tests/protobuf/onnx_model/example.onnx"),
    )
    .build();

// Generate random input data and reshape it to match the declared input fact.
let mut rng = thread_rng();
let vals: Vec<_> = (0..1000).map(|_| rng.gen::<f32>()).collect();
let input = tract_ndarray::arr1(&vals).into_shape((10, 100)).unwrap();

// Issue the prediction request synchronously; the response body holds the
// model's output tensors.
let resp = o
    .onnx_request(
        "model_which_will_be_tested",
        ORequest::with_body(ONNXRequest::new().body(input.into())),
    )
    .unwrap();
assert_eq!(resp.body.output.len(), 1);

License

Orkhon is licensed under the MIT License.

Documentation

Official documentation is hosted on docs.rs.

Getting Help

Please head to our Gitter channel or ask on Stack Overflow.

Discussion and Development

We use Gitter for development discussions. Also, please don't hesitate to open issues on GitHub to ask for features, report bugs, comment on the design, and more! More interaction and more ideas are better!

Contributing to Orkhon

All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.

A detailed overview of how to contribute can be found in the CONTRIBUTING guide on GitHub.

Releases

  • v0.2.0 (Nov 17, 2020)

    This release comes with:

    • ONNX interface
    • New asynchronous servicing methods
    • Shareable server runtime
    • Nuclei asynchronous runtime
    • Inferring input facts for frozen models
    • Improved throughput:
      • ~4.8361 GiB/s prediction throughput
      • 3_000 concurrent requests take ~4ms on average