Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference

Overview

Badges: rustc >= 1.50.0 · MIT/Apache 2 · native Linux test status · embedded targets status · docs

Sonos' Neural Network inference engine.

This project used to be called tfdeploy, or Tensorflow-deploy-rust.

What?

tract is a Neural Network inference toolkit. It can read TensorFlow 1, ONNX, or NNEF models, optimize them, and run data through them.

Quick start
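
A minimal sketch of the ONNX path follows; the model path and input shape are placeholders, and the TensorFlow and NNEF front-ends follow the same load/optimize/run flow (check the tract-onnx documentation for the exact builder calls on your version):

    use tract_onnx::prelude::*;

    fn main() -> TractResult<()> {
        // Load the ONNX model, declare the input type and shape so every fact
        // can be resolved, optimize the graph, then make it runnable.
        let model = tract_onnx::onnx()
            .model_for_path("mobilenet_v2.onnx")?
            // placeholder input fact: f32, NCHW 1x3x224x224
            .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 224, 224)))?
            .into_optimized()?
            .into_runnable()?;

        // Dummy all-zero, image-shaped input, just to exercise the network.
        let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();
        let outputs = model.run(tvec!(input))?;
        println!("{:?}", outputs[0]);

        Ok(())
    }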

Tract in the landscape

ONNX

As of today (October 2020), tract successfully passes about 85% of ONNX backend tests. All "real life" integration tests in the ONNX test suite are passing: bvlc_alexnet, densenet121, inception_v1, inception_v2, resnet50, shufflenet, squeezenet, vgg19, zfnet512.

The following operators are implemented and tested.

Abs, Acos, Acosh, Add, And, ArgMax, ArgMin, Asin, Asinh, Atan, Atanh, AveragePool, BatchNormalization, Cast, CategoryMapper, Ceil, Clip, Compress, Concat, Constant, ConstantLike, ConstantOfShape, Conv, ConvInteger, Cos, Cosh, DequantizeLinear, Div, Dropout, Elu, Equal, Erf, Exp, Expand, EyeLike, Flatten, Floor, GRU, Gather, Gemm, GlobalAveragePool, GlobalLpPool, GlobalMaxPool, Greater, GreaterOrEqual, HardSigmoid, Hardmax, Identity, InstanceNormalization, IsInf, IsNaN, LRN, LSTM, LeakyRelu, Less, LessOrEqual, Log, LogSoftmax, MatMul, MatMulInteger, Max, MaxPool, Mean, Min, Mod, Mul, Neg, NonZero, Not, Or, PRelu, Pad, ParametricSoftplus, Pow, QLinearConv, QLinearMatMul, QuantizeLinear, RNN, Reciprocal, ReduceL1, ReduceL2, ReduceLogSum, ReduceLogSumExp, ReduceMax, ReduceMean, ReduceMin, ReduceProd, ReduceSum, ReduceSumSquare, Relu, Reshape, Resize, Round, Rsqrt, ScaledTanh, Scan, Selu, Shape, Shrink, Sigmoid, Sign, Sin, Sinh, Size, Slice, Softmax, Softplus, Softsign, Split, Sqrt, Squeeze, Sub, Sum, Tan, Tanh, ThresholdedRelu, Tile, Transpose, Unsqueeze, Where, Xor

We test these operators against ONNX 1.4.1 (operator set 9), ONNX 1.5.0 (operator set 10), ONNX 1.6.0 (operator set 11), and ONNX 1.7.0 (operator set 12). Many networks using operator set 8 also work.

TensorFlow

While tract is far from supporting arbitrary models, it can run Google Inception v3 and Snips wake word models. Missing operators are relatively easy to add. The lack of an easily reusable test suite and the wide diversity of operators in TensorFlow make full support difficult to target.

The following operators are implemented and tested:

Abs, Add, AddN, AddV2, Assign, AvgPool, BatchToSpaceND, BiasAdd, BlockLSTM, Cast, Ceil, ConcatV2, Const, Conv2D, DepthwiseConv2dNative, Div, Enter, Equal, Exit, ExpandDims, FakeQuantWithMinMaxVars, Fill, FloorMod, FusedBatchNorm, GatherNd, GatherV2, Greater, GreaterEqual, Identity, Less, LessEqual, Log, LogicalAnd, LogicalOr, LoopCond, MatMul, Max, MaxPool, Maximum, Mean, Merge, Min, Minimum, Mul, Neg, NoOp, Pack, Pad, Placeholder, Pow, Prod, RandomUniform, RandomUniformInt, Range, RealDiv, Relu, Relu6, Reshape, Rsqrt, Shape, Sigmoid, Slice, Softmax, SpaceToBatchND, Squeeze, StridedSlice, Sub, Sum, Switch, Tanh, Tile, Transpose, VariableV2

TensorFlow-Lite

TensorFlow-Lite is a TensorFlow subproject that also focuses on inference on smaller devices. It uses a precompiler to transform a TensorFlow network to its own format. It only supports a subset of operators from TensorFlow though, and is only optimised for devices with Arm Neon support.

Tract supports a wider subset of TensorFlow operators, and has been optimised for CPUs of the previous generation (Arm VFP), also targeting devices in the Raspberry Pi Zero family that TensorFlow Lite does not address.

NNEF

Long story short, the TensorFlow and ONNX formats are good for designing and training networks. They need to move fast to follow the research field and tend to integrate new features and operators greedily. They also exhibit a high level of expressivity to facilitate network design.

On the other hand, only a subset of operators and network features actually reach production, so systems running production networks do not have to deal with as many operators. Furthermore, some information required for training can be stripped from the network before it goes to production for prediction.

NNEF tries to bridge the gap between training frameworks and inference by proposing a format dedicated to production and prediction.

Tract supports NNEF:

  • tract_nnef can load and execute NNEF networks (see the sketch after this list)
  • tract supports most of the NNEF specification, the most notable exceptions being the ROI operators and deconvolution
  • tract introduces tract-OPL, a series of NNEF extensions that support other operators (or extend some operators' semantics) in order to represent the full range of tract-core neural network support: any network understood by tract should be serializable to tract-OPL. This is a work in progress.
  • the tract command line can translate networks from TensorFlow or ONNX to NNEF/OPL.
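
As a sketch of the first point, loading an NNEF network looks roughly like this; the archive name is a placeholder, and with_tract_core() is only needed for networks that use the tract-OPL core extensions (check the tract-nnef documentation for the exact builder calls):

    use tract_nnef::prelude::*;

    fn main() -> TractResult<()> {
        // Load an NNEF model (a directory or a .nnef.tar archive), enable the
        // tract-OPL core extensions, then optimize it and make it runnable.
        let model = tract_nnef::nnef()
            .with_tract_core()
            .model_for_path("model.nnef.tar")?
            .into_optimized()?
            .into_runnable()?;

        // From here, tensors are fed with model.run(...) exactly as in the
        // ONNX quick-start sketch above.
        let _ = model;
        Ok(())
    }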

Examples of supported networks

These models, among others, are used to track tract performance evolution as part of the Continuous Integration jobs. See .travis/README.md and .travis/bundle-entrypoint.sh for more information.

Keyword spotting on Arm Cortex-M Microcontrollers

https://github.com/ARM-software/ML-KWS-for-MCU

ARM demonstrated the capabilities of the Cortex-M family by providing tutorials and pre-trained models for keyword spotting. While the exercise is ultimately meant for microcontrollers, tract can run the intermediate TensorFlow models.

For instance, on a Raspberry Pi Zero, the "CNN M" model runs in about 70 microseconds, and in 11 microseconds on a Raspberry Pi 3.

Snips wake word models

https://arxiv.org/abs/1811.07684

Snips uses tract to run the wake word detectors. While earlier models were class-based and did not require any special treatment, tract's pulsing capabilities made it possible to run WaveNet models efficiently enough for a Raspberry Pi Zero.

Inception v3

Device               Family         TensorFlow-Lite   tract
Raspberry Pi Zero    Armv6 VFP      113s              39s
Raspberry Pi 2       Armv7 NEON     25s               7s
Raspberry Pi 3       aarch32 NEON   5s                5s

Notes:

  • while the Raspberry Pi 3 is an Armv8 device, this bench runs on Raspbian, an Armv6 operating system, which cripples the performance of both contenders
  • other benches on the internet show better performance for TensorFlow (not -Lite) on the Pi 3, but they use all four cores of the device. Both TensorFlow-Lite and tract here have been restricted to a single core.

Roadmap

One important cross-cutting concern guides the roadmap: this library must cross-compile as easily as practical to small-ish devices (think $20 boards).

License

Note: files in the tensorflow/protos directory are copied from the TensorFlow project and are not covered by the following license statement.

Note: files in the onnx/protos directory are copied from the ONNX project and are not covered by the following license statement.

Apache 2.0/MIT

All original work licensed under either of:

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Comments
  • Integer-sizing a decluttered streaming TypedModel without Pulse (for non causal models)


    Hey, I came across another problem trying the bidirectional LSTM model in a browser. It is the same LSTM that is now in CI (download link). Now normally I'd use code similar to this:

    use tract_onnx::prelude::*;
    
    fn main() -> TractResult<()> {
        let model = tract_onnx::onnx()
            .model_for_path("model.onnx")?
            .into_optimized()?
            .into_runnable()?;
    
        let input: Tensor = tract_ndarray::Array2::<u8>::zeros((1, 100)).into();
        model.run(tvec!(input))?;
    
        Ok(())
    }
    

    but I get an error:

    ➜  ~/Documents/Experiments/sblstmtest git:(master) ✗ cargo run
       Compiling sblstmtest v0.1.0 (/Users/bminixhofer/Documents/Experiments/sblstmtest)
        Finished dev [unoptimized + debuginfo] target(s) in 4.51s
         Running `target/debug/sblstmtest`
    Error: TractError(Msg("Translating node #1 \"input\" Source ToTypedTranslator"), State { next_error: Some(TractError(Msg("Output type not determined"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), backtrace: InternalBacktrace { backtrace: None } })
    

    Running it without into_optimized, or with an input fact, works. So I understand that the model cannot be optimized because the shape of the input (batch size and seq len) is not known at the time of building. Is that correct? In practice I don't want to fix the input shape at build time because it has to work with different batch sizes.
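
    For reference, the input-fact variant that does work looks roughly like this (the shape is hard-coded to match the zeros((1, 100)) input above, which is exactly what I want to avoid):

    use tract_onnx::prelude::*;
    
    fn main() -> TractResult<()> {
        // Same as above, but declaring the input type and shape up front so
        // into_optimized() can resolve every fact.
        let model = tract_onnx::onnx()
            .model_for_path("model.onnx")?
            .with_input_fact(0, InferenceFact::dt_shape(u8::datum_type(), tvec!(1, 100)))?
            .into_optimized()?
            .into_runnable()?;
    
        let input: Tensor = tract_ndarray::Array2::<u8>::zeros((1, 100)).into();
        model.run(tvec!(input))?;
    
        Ok(())
    }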

    So far that wouldn't be a problem; I'd just add an optimize option to the JS API to turn optimization on or off depending on whether dynamic shapes are needed during inference.

    The problem comes when I try to store the model that I got by calling into_runnable without calling into_optimized before.

    I get a model of type SimplePlan<InferenceFact, Box<dyn InferenceOp>, ModelImpl<InferenceFact, Box<dyn InferenceOp>>>. When I want to store such a model in a struct like:

    use tract_onnx::prelude::*;
    
    struct Model {
        inner: SimplePlan<InferenceFact, Box<dyn tract_hir::infer::ops::InferenceOp>, InferenceModel>,
    }
    

    I get an error which says that the module ops is private:

    ➜  ~/Documents/Experiments/sblstmtest git:(master) ✗ cargo run
       Compiling sblstmtest v0.1.0 (/Users/bminixhofer/Documents/Experiments/sblstmtest)
    error[E0603]: module `ops` is private
      --> src/main.rs:4:64
       |
    4  |     inner: SimplePlan<InferenceFact, Box<dyn tract_hir::infer::ops::InferenceOp>, InferenceModel>,
       |                                                                ^^^ private module
       |
    note: the module `ops` is defined here
      --> /Users/bminixhofer/.cargo/registry/src/github.com-1ecc6299db9ec823/tract-hir-0.7.0/src/infer/mod.rs:12:1
       |
    12 | mod ops;
       | ^^^^^^^^
    
    error: aborting due to previous error
    
    For more information about this error, try `rustc --explain E0603`.
    error: could not compile `sblstmtest`.
    
    To learn more, run the command again with --verbose.
    

    So I can't store the result. Am I missing something? And if not, is there some way to work around this?

    Thanks for all your help :)

    opened by bminixhofer 36
  • Input fact propagation wonky for NNEF


    Maybe I was a bit too quick to close #718, as it seems to still have some issues when running, depending on the exact flags I pass.

    I'll post these in one go as I think they're related; but we'll see.

    As a base, I'm using the image.nnef.tar we can now generate. I'm using the following base command line:

    tract image.nnef.tar --nnef-tract-core --nnef-tract-onnx -i input:1,3,224,224,f32 --allow-random-input 
    

    This always works with dump (except when passing --profile), but run fails with the following error:

    [2022-06-14T13:55:45.961766439Z WARN  tract::tensor] Using random input for input called "input": 1,3,224,224,F32
    [2022-06-14T13:55:45.969444249Z ERROR tract] Evaluating #1 "ConstantOfShape_25" MultiBroadcastTo
    
        Caused by:
            Undetermined symbol in expression: N
    

    Adding --set N=1 to the run fixes this. I'd have expected something like --override-fact input:1,3,224,224,f32 to also work, as a more aggressive version of -i input:1,3,224,224,f32.

    If I attempt to optimize the graph, it fails with the following wonky error, where it seems to fail to unify two compatible shapes:

    [2022-06-14T13:42:11.116655254Z ERROR tract] Error at stage optimize
    
        Caused by:
            0: codegen node #4 "Conv_0" ConvUnary
            1: Trying to substitute a N,768,7,7,F32 by 1,768,7,7,F32.
               ModelPatch { context: ["wire_as_lazy_im2col"], dont_apply_twice: None, model: Graph { nodes: [Node { id: 0, name: "incoming-3/0", inputs: [], op: TypedSource { fact: 1,3,224,224,F32 }, outputs: [1,3,224,224,F32 >1/0] }, Node { id: 1, name: "Conv_0.matmatmul", inputs: [0/0>], op: LirMatMulUnary { c_fact: 1,768,49,F32, c_m_axis: 1, c_n_axis: 2, micro_ops: [(2359296,F32 0.0050811768, 0.002538681, -0.0051002502, -0.0015630722, 0.0034770966, 0.0017652512, -0.0231781, -0.0051574707, -0.013504028, 0.002796173, 0.00044894218, -0.0076141357..., [Store])], shape=[1], strides=[1], layout=CFcf (0xf), dynamic ndim=1, c_final_shape: 1,768,49, geometry: Concrete(ConcreteMatMulGeometry { m: 768, k: 3072, n: 49, b_storage: VirtualPacking { packer: Packer { r: 5, alignment: 4, end_padding_record: 0 }, func: LazyIm2colSpec { n_bytes_offsets: [0, ...], k_bytes_offsets: [0, 4, ...] }, k: 3072 } }), mmm: MMM (fma_mmm_f32_16x5 16x5), reshape_post: [] }, outputs: [1,768,49,F32 >2/0] }, Node { id: 2, name: "Conv_0", inputs: [1/0>], op: Reshape(2, [Val(49)], [Val(7), Val(7)]), outputs: [1,768,7,7,F32 ] }], inputs: [0/0>], outputs: [], outlet_labels: {}, properties: {} }, inputs: {}, incoming: {0/0>: 3/0>}, shunt_outlet_by: {}, obliterate: [] }`
    

    Looks like it somehow fails to propagate the input facts?

    opened by tgolsson 32
  • Add support for GPU inference


    Address #688 .

    Tasks:

    • [x] GPUTensor
      • [x] Import
        • [x] Proper type for imported tensor
      • [x] Export
      • [x] Intermediate data in GPU memory
      • [x] Pass tensor strides as uniforms
      • [x] Have way of processing rank 4 tensors
    • [ ] Ops
      • [x] Validate tensor props before applying ops, at least in debug builds
      • [ ] Convolution
      • [ ] Activations
        • [x] tanh
        • [x] sigmoid
        • [ ] relu
      • [ ] Fully-connected
      • [ ] Pooling
      • [ ] Softmax
    • [ ] Runner for models
      • [ ] Managing GPU memory
        • [ ] Free buffers no longer in use to allow for models larger than GPU memory
    • [ ] Examples working
      • [ ] tensorflow-mobilenet-v2
    • [ ] Test various platforms
      • [ ] Linux
        • [x] Vulkan + RADV
        • [ ] Other GPUs
        • [ ] Various embedded systems like RPi
      • [ ] Windows
      • [ ] macOS and iOS
      • [ ] Android
      • [ ] WASM
        • [ ] WebGPU
        • [ ] WebGL
    opened by sh7dm 27
  • Some tensorflow extensions for keras layers support


    I'm trying to load a model into rust and I'm getting an error when I run the model.

    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: 
    TractError(Msg("Translating #30 \"global_average_pooling1d/Mean\" Unimplemented(Mean)"), 
    State { next_error: Some(TractError(Msg("Operator can not be made a TypedOp."), 
    State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })), 
    backtrace: InternalBacktrace { backtrace: None } })
    

    It seems that the mean operation of global_average_pooling1d is not supported; does anyone know any more about this?

    opened by CharlieBickerton 25
  • Unimplemented Unimplemented(RandomNormalLike) ToTypedTranslator

    I'm trying to make a Soft Actor-Critic model from stable_baselines3 work with ONNX and get the following error.

    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Translating node #39 "RandomNormalLike_25" Unimplemented(RandomNormalLike) ToTypedTranslator
    
    Caused by:
        0: translating op UnimplementedOp { outputs: 1, name: "RandomNormalLike", message: "NodeProto { input: [\"onnx::RandomNormalLike_102\"], output: [\"onnx::Mul_103\"], name: \"RandomNormalLike_25\", op_type: \"RandomNormalLike\", domain: \"\", attribute: [], doc_string: \"\" }" }
    

    Do you plan to support the RandomNormalLike op?

    opened by stillonearth 23
  • BERT support


    Not too sure what specific operators the BERT architecture will require but:

    • https://github.com/onnx/models/tree/master/text/machine_comprehension/bert-squad uses OneHot, which is not implemented but easy to add (as an ONNX primitive)
    • as reported here, https://github.com/snipsco/tract/issues/313#issuecomment-661937254 we may encounter ConstantOfShape with dynamic inputs
    opened by kali 22
  • Tree ensemble ONNX ML ops [WIP]


    @kali Opening this draft PR so as to collect some potential early feedback.

    The core generic tree ensemble engine seems to be working (at least what I've managed to test so far; I was actually super surprised that the tests vs lightgbm passed on the first try after I got it to compile, lol). I now need to pin it to the protobuf config, set up the rules, run a few basic benchmarks, wrap it in classifier/regressor types, etc. All features that can be provided in the ONNX protobuf config (including score post-transforms except probit, the various comparison ops, input NaN handling, etc.) are already supported here.

    It's definitely not implemented in the most efficient way right now, but I think it shouldn't be too bad (although until there are benchmarks, I have no idea how much worse it would be than the existing lightgbm/xgboost C++ engines).

    opened by aldanor 22
  • WebAssembly support


    Hi!

    At the moment, there is no easy way to reliably run ONNX models in the browser. ONNX.js exists but is apparently unmaintained and lacks support for important operations like Conv1d and LSTM.

    The alternative is TensorFlow.js, which does not directly support ONNX, so a model would have to be converted from ONNX to TF, then to a TFJS model, which also does not work at the moment (see https://github.com/onnx/onnx-tensorflow/issues/490).

    So there is a bit of a gap in the ecosystem there.

    That gap could be filled by compiling tract to WASM and exposing a higher-level API (i.e. load and predict functions) to JavaScript. WebGL support would of course be missing, but that is out of scope.

    I did some prototyping today, and got the latest release (tract-onnx = "0.6.3") to work in the browser without any changes. So I think a JS wrapper would not be too hard to make.
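
    The kind of wrapper I have in mind is roughly this (just a sketch, assuming wasm-bindgen, an f32 model whose input shapes are fully specified in the ONNX file, and a flat Vec<f32> plus shape at the JS boundary; all names are placeholders):

    use wasm_bindgen::prelude::*;

    // Hypothetical one-shot helper: load an ONNX model from bytes and run a
    // single f32 inference, returning the flattened first output.
    #[wasm_bindgen]
    pub fn predict(model_bytes: &[u8], input: Vec<f32>, shape: Vec<u32>) -> Result<Vec<f32>, JsValue> {
        use tract_onnx::prelude::*;
        let err = |e: TractError| JsValue::from_str(&e.to_string());

        let model = tract_onnx::onnx()
            .model_for_read(&mut std::io::Cursor::new(model_bytes))
            .map_err(err)?
            .into_optimized()
            .map_err(err)?
            .into_runnable()
            .map_err(err)?;

        // Rebuild the input tensor from the flat data and the shape passed from JS.
        let dims: Vec<usize> = shape.iter().map(|&d| d as usize).collect();
        let tensor: Tensor = tract_ndarray::ArrayD::from_shape_vec(tract_ndarray::IxDyn(&dims), input)
            .map_err(|e| JsValue::from_str(&e.to_string()))?
            .into();

        let outputs = model.run(tvec!(tensor)).map_err(err)?;
        let view = outputs[0].to_array_view::<f32>().map_err(err)?;
        Ok(view.iter().cloned().collect())
    }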

    I'll start working on this in the next couple of days. Depending on how it goes and if there is interest on your side, this could be merged back into tract at a later point.

    It would be great if you could officially support compiling to WASM, and add WASM to the CI (e.g. the current master branch does not compile to WASM because of the memory maps from https://github.com/snipsco/tract/commit/99c622ad8279e676fde4485ab9b4db0e537418e4).

    Thanks, and please let me know what you think!

    opened by bminixhofer 21
  • Support TreeEnsembleClassifier op


    (I'm aware that it overlaps somewhat with #56, but it's a bit more specific, hence opening it as a separate issue)

    Given that it's now officially possible to convert LightGBM (and xgboost) tree ensemble classifiers into ONNX, how realistic would it be to expect tract to support the TreeEnsembleClassifier op in the foreseeable future? This would potentially be a huge feature, instantly unlocking a whole universe of tree ensemble classifiers (and potentially regressors as well).

    // I'd be glad to help if there was some guidance on what to do and where, if needed; not quite sure how much work it is to implement this since I'm not very familiar with the internals of tract.

    Thanks!

    opened by aldanor 21
  • Internal multithreading


    Hi!

    From #326:

    tract does not make any effort to run a computation using multiple cores, but is safe to use in multiple threads. So you may get better results by calling run() on several inputs (or several copies) from different threads (using a parallel iterator may do the trick).
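
    For what it's worth, that external pattern is easy to express with rayon. A rough sketch (assuming a tract version where run() accepts plain Tensors, as in the snippets above, and placeholder model path and shapes):

    use rayon::prelude::*;
    use tract_onnx::prelude::*;

    fn main() -> TractResult<()> {
        let model = tract_onnx::onnx()
            .model_for_path("model.onnx")?
            .into_optimized()?
            .into_runnable()?;

        // One independent input tensor per inference; shape is a placeholder.
        let batch: Vec<Tensor> = (0..8)
            .map(|_| tract_ndarray::Array2::<f32>::zeros((1, 100)).into())
            .collect();

        // The plan's run() takes &self, so the same plan can be shared by all
        // rayon worker threads; each call stays a single-core run.
        let results: Vec<TractResult<_>> = batch
            .into_par_iter()
            .map(|input| model.run(tvec!(input)))
            .collect();

        for result in results {
            let _outputs = result?;
        }
        Ok(())
    }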

    Are there any plans to support internal multithreading? Tract is already very fast. With internal multithreading it could possibly be faster than onnxruntime and ONNX.js*.

    *That is, if we can exploit multithreading in the browser, but there is already a working wasm-bindgen example with rayon so I'm confident we would get there.

    I'm not very familiar with parallelized implementations of neural nets but I think there are three major points where parallelization is possible:

    1. Slicing the input into chunks that get computed on different cores, e.g. with batch sizes > 1.
    2. Computing different operators on different cores; each operator could start its computation once all its inputs are computed.
    3. Internal parallelization of an operation, e.g. different convolution filters on different cores.

    Feel free to close this issue if this does not align with your Roadmap for tract.

    opened by bminixhofer 20
  • MobileNet ops not supported


    Hi

    I wanted to run the pretrained frozen .pb models from mobilenetv1 and mobilenetv2 with

    let tfd = ::tract_tensorflow::tensorflow().model_for_path(mobilenetv1_frozen).unwrap();
    let plan = ::tract::SimplePlan::new(&tfd).unwrap();
    let input = load_image(img);
    let outputs = plan.run(tvec![input]).unwrap();
    

    But for MobilenetV1 I get

    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TractError(Msg("Evaluating #13 \"MobilenetV1/MobilenetV1/Conv2d_0/Relu6\" Unimplemented(Relu6): unimplemented operation: Relu6"), State { next_error: None, backtrace: InternalBacktrace })', src/libcore/result.rs:997:5
    

    and for MobilenetV2

    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: TractError(Msg("Node named MobilenetV2/Conv/BatchNorm/FusedBatchNorm not found"), State { next_error: None, backtrace: InternalBacktrace })', src/libcore/result.rs:997:5
    

    Any plans to support Relu6 or FusedBatchNorm? Would you be willing to point me to where I can add those?

    opened by ehsanmok 20
  • DepthWise conv Inner loop f16 support


    https://github.com/Rikorose/DeepFilterNet/pull/211#issuecomment-1353637586

    Digging a bit into why I was seeing so many f32/f16 conversions despite the A55 supporting fp16 storage and arithmetic, it seems like this is just a limitation of Rust's f16 support.

    To fully take advantage of FP16, I think avoiding these conversions is necessary… though, I’m not sure what the best solution is…

    Maybe just rewriting the inner loop in assembly for f16 when the CPU says it supports f16?

    Overriding the operators in the half crate might work too.

    opened by VariantXYZ 22
  • Instructions on training new cost_models


    It would be great to take advantage of the cost_model setup for arbitrary ARM CPUs (like the A57) and generate better cost models for them, but I'm not really sure of the procedure.

    Digging in a little, it looks like the cost_model binary gets run on the platform and then that data gets processed by the train script, which generates the file. Seems straightforward, but there are a lot of parameters that I'm not really sure about…

    opened by VariantXYZ 18
  • Supporting ARM64 CPUs on systems without /proc/cpuinfo


    Related to #847

    It is possible to override the detector with environment variables... It's not documented; I thought it only made sense in qemu test contexts where detection is confused. See https://github.com/sonos/tract/blob/main/linalg/src/arm64.rs#L61/

    (I just happened to hit this issue since /proc/cpuinfo didn't exist)

    Which part of what getrandom does do you refer to? The ability to register a generator at runtime?

    Right, the ability to provide a fallback implementation for unsupported targets.

    https://docs.rs/getrandom/latest/getrandom/macro.register_custom_getrandom.html

    I'm not a huge fan of it though, as it involves modifying the crate to add the fallback implementation for a platform.

    The environment variable setup would work, but it would be more ideal to have this available during compile-time.

    I was thinking about using features, but ARM is a bit strange such that in-order/out-of-order execution is actually implied by the CPU name, which is a bit of a pain. Things like fp16, neon, etc... can be handled via this though (with the caveat that target_feature can only be applied to unsafe functions). Maybe the CPU ID could be provided via a regular feature?

    In a situation where a user is building on a platform that isn't windows/linux, I think it's not unreasonable to expect them to provide a .cargo/config.toml or something similar to define their setup better.

    opened by VariantXYZ 8
  • Unnecessary copy of inputs


    Hello,

    In the run method of SimplePlan, the inputs parameter is a TVec<Tensor>. Since the tensors eventually get converted into Arc<Tensor>, each input tensor gets copied once.

    Of course this doesn't matter most of the time because the input is small enough, but we have a use case in which the inputs would be huge and reused between different runs.

    Therefore I was wondering if it would be possible to allow the user to pass a TVec<Arc<Tensor>> to the run method, hence avoiding the conversion later on. In case you don't want to change the API, there is also the option of adding a new run_with_arcs method that would have the same effect.

    If you agree with one of the two possibilities, I can do the corresponding PR shortly as there is almost nothing to do.

    wdyt?

    opened by mbrunel 1