High-performance runtime for data analytics applications

Overview

Weld

Weld is a language and runtime for improving the performance of data-intensive applications. It optimizes across libraries and functions by expressing the core computations in libraries using a common intermediate representation, and optimizing across each framework.

Modern analytics applications combine multiple functions from different libraries and frameworks to build complex workflows. Even though individual functions can achieve high performance in isolation, the performance of the combined workflow is often an order of magnitude below hardware limits due to extensive data movement across the functions. Weld addresses this problem by lazily building up a computation for the entire workflow, and then optimizing and evaluating it only when a result is needed.
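
The lazy approach described above can be illustrated with a toy sketch in plain Python. This is not Weld's API: the `Lazy` class and its `map` and `evaluate` methods are hypothetical names invented for this illustration. It only shows the idea of recording operations first and running them in a single pass once a result is requested.

```python
class Lazy:
    """Toy deferred computation: records element-wise functions, runs them later."""

    def __init__(self, data, funcs=()):
        self.data = data
        self.funcs = tuple(funcs)

    def map(self, func):
        # Nothing is computed here; the function is only recorded.
        return Lazy(self.data, self.funcs + (func,))

    def evaluate(self):
        # A single pass applies every recorded function to each element,
        # so no intermediate list is materialized between steps.
        out = []
        for x in self.data:
            for f in self.funcs:
                x = f(x)
            out.append(x)
        return out


pipeline = Lazy([1, 2, 3]).map(lambda x: x + 1).map(lambda x: x * 2)
result = pipeline.evaluate()
print(result)  # [4, 6, 8]
```

Calling `map` twice builds a description of the workflow; only `evaluate` touches the data, which is where a real system would have the chance to optimize across both steps.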

You can join the discussion on Weld on our Google Group or post on the Weld mailing list at [email protected].

Building

To build Weld, you need the latest stable version of Rust and LLVM/Clang++ 6.0.

To install Rust, follow the steps here. You can verify that Rust was installed correctly on your system by typing rustc into your shell. If you already have Rust and rustup installed, you can upgrade to the latest stable version with:

rustup update stable

MacOS LLVM Installation

To install LLVM on macOS, first install Homebrew. Then:

brew install llvm@6

Weld's dependencies require llvm-config on $PATH, so you may need to create a symbolic link so the correct llvm-config is picked up (note that you might need to add sudo at the start of this command):

ln -sf `brew --prefix llvm@6`/bin/llvm-config /usr/local/bin/llvm-config

To make sure this worked correctly, run llvm-config --version. You should see 6.0.x.

Ubuntu LLVM Installation

To install LLVM on Ubuntu, add the LLVM 6.0 apt sources and then install the packages with apt-get:

On Ubuntu 16.04 (Xenial):

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main"
sudo apt-get update
sudo apt-get install llvm-6.0-dev clang-6.0

On Ubuntu 14.04 (Trusty):

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/trusty/ llvm-toolchain-trusty-6.0 main"

# gcc backport is required on 14.04, for libstdc++. See https://apt.llvm.org/
sudo apt-add-repository "deb http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu trusty main"
sudo apt-get update
sudo apt-get install llvm-6.0-dev clang-6.0

Weld's dependencies require llvm-config, so you may need to create a symbolic link so the correct llvm-config is picked up. sudo may be required:

ln -s /usr/bin/llvm-config-6.0 /usr/local/bin/llvm-config

To make sure this worked correctly, run llvm-config --version. You should see 6.0.x or newer.

You will also need zlib:

sudo apt-get install zlib1g-dev

Building Weld

With LLVM and Rust installed, you can build Weld. Clone this repository, set the WELD_HOME environment variable, and build using cargo:

git clone https://www.github.com/weld-project/weld
cd weld/
export WELD_HOME=`pwd`
cargo build --release

Weld builds two dynamically linked libraries (.so files on Linux and .dylib files on Mac): libweld and libweldrt.

Finally, run the unit and integration tests:

cargo test

Documentation

The Rust Weld crate is documented here.

The docs/ directory contains documentation for the different components of Weld.

  • language.md describes the syntax of the Weld IR.
  • api.md describes the low-level C API for interfacing with Weld.
  • python.md gives an overview of the Python API.
  • tutorial.md contains a tutorial for how to build a small vector library using Weld.

Python Bindings

Weld's Python bindings are in python, with examples in examples/python.

Grizzly

Grizzly is a subset of Pandas integrated with Weld. Details on how to use Grizzly are in python/grizzly. Some example workloads that make use of Grizzly are in examples/python/grizzly. To run Grizzly, you will also need the WELD_HOME environment variable to be set, because Grizzly needs to find its own native library through this variable.

Testing

cargo test runs unit and integration tests. A test name substring filter can be used to run a subset of the tests:

cargo test <substring to match in test name>

Tools

This repository contains a number of useful command line tools which are built automatically with the main Weld repository, including an interactive REPL for inspecting and debugging programs. More information on those tools can be found under docs/tools.md.

Comments
  • Codegen cleanup

    On leg 2 of my winter break journey across the US, I made a prototype of a new LLVM code generator using llvm-rs. Three major changes:

    1. Code generation through builders. Instead of generating LLVM code strings, builders provide a more type-safe (both at compile time and runtime) and concise means of creating an IR.
    2. New code execution runtime. The llvm-rs JitEngine is pretty comparable to what's already implemented in easy_ll, except it integrates well with the code builders instead of relying on a string intermediary.
    3. Revamped the REPL to actually produce output and use a command line parser.
    opened by willcrichton 11
  • Build refactor

    1. Removes the make commands from build.rs that were used to build the convertor dylib.
    2. Changes the package name grizzly to pygrizzly.
    3. Adds a binary extension for libweld in pyweld/setup.py, which somehow allowed auditwheel to stop complaining about libweld.so being part of the Python wheel. I could run auditwheel successfully, and it changed the platform tag, which allowed me to upload to PyPI.
    opened by rahulpalamuttam 10
  • String ~ vec[i8] comparisons Python3

    I am attempting, in baloo, to encode strings for Weld so that e.g. sr[sr != 'abc'] works; however, there seems to be a bug somewhere. Are vec[i8] <comparison> vec[i8] operations expected to work correctly at the Weld level?

    For example:

    // _inp2 here is the index associated with the _inp0 strings data
    |_inp0: vec[vec[i8]], _inp1: vec[i8], _inp2: vec[i64]| let obj100 = (_inp0);
    let obj101 = (map(
        obj100,
        |a: vec[i8]| 
            a != _inp1
    ));
    result(
        for(
            zip(_inp2, obj101),
            appender[i64],
            |b: appender[i64], i: i64, e: {i64, bool}| 
                if (e.$1, 
                    merge(b, e.$0), 
                    b)
        )
    )
    

    This only seems to work when _inp1 is of length 1. So for:

    sr = Series(np.array(['abc', 'Burgermeister', 'b'], dtype=np.bytes_))
    sr[sr != 'b']  # will correctly return the first 2 elements
    sr[sr != 'abc']  # does not; (returns all elements)
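
For comparison, the result the Weld program above is expected to produce can be written as a plain-Python reference. This is only a sketch of the intended semantics, not of Weld's behavior; `filter_not_equal` is a hypothetical helper name, and the index values 0, 1, 2 are assumed for illustration.

```python
def filter_not_equal(strings, needle, index):
    # Mirrors map(obj100, |a| a != _inp1): one boolean per string.
    mask = [s != needle for s in strings]
    # Mirrors the for/zip/appender loop: keep index entries whose flag is True.
    return [i for i, keep in zip(index, mask) if keep]


data = [b"abc", b"Burgermeister", b"b"]
index = [0, 1, 2]

print(filter_not_equal(data, b"b", index))    # [0, 1]
print(filter_not_equal(data, b"abc", index))  # [1, 2]
```

The second call is the case reported as failing: the expected output drops the first element rather than returning all three.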
    

    The most likely culprit is the encoding with Python3. The only changes I made are essentially moving from PyString_AsString and PyString_Size to the PyBytes_* equivalents (in the .cpp file) and encoding the str to utf-8, e.g. abc.encode('utf-8') (in the encoders.py file):

    extern "C"
    weld::vec<uint8_t> str_to_weld_char_arr(PyObject* in) {
      int64_t dimension = (int64_t) PyBytes_Size(in);
      weld::vec<uint8_t> t;
      t.size = dimension;
      t.ptr = (uint8_t*) PyBytes_AsString(in);
      return t;
    }
    
    ...
    if isinstance(obj, str):
            numpy_to_weld = self.utils.str_to_weld_char_arr
            numpy_to_weld.restype = WeldVec(WeldChar()).ctype_class
            numpy_to_weld.argtypes = [py_object]
    
            return numpy_to_weld(obj.encode('utf-8'))
    

    Note that

    1. En-/decoding numpy arrays of bytes works fine with the grizzly encoders (and using PyBytes_FromStringAndSize instead of PyString_FromStringAndSize).
    2. Also toyed around with modifying WeldChar.ctype_class to c_char_p as opposed to c_wchar_p which seemed more appropriate yet produces the same result.
    3. Encoding as ascii would probably be more appropriate, since Weld can't handle unicode from what I can tell. Nevertheless, the tested data is ascii.
    4. This is with the master branch Weld.

    Any feedback/idea on what the issue might be?

    opened by radujica 9
  • Python encoder

    @sppalkia: just as an FYI, not ready to merge.

    Major issues right now:

    • Some benchmarks don't yet work correctly
    • Encoder / decoder much slower than C++ encoder / decoder
    opened by deepakn94 8
  • Use typed null pointers instead of i64 0-values

    Fixes #473 when using Weld with an LLVM 6.0 distribution that has LLVM_ENABLE_ASSERTIONS enabled.

    Also fixes some README issues.

    The issue was that some places in the code generation used an i64 0 literal as a substitute for null, which was okay with LLVM's module verifier but caused certain debug assertions to complain.

    opened by sppalkia 7
  • Rust program using Weld exits with "LLVM ERROR: Program used external function ..." after calling FindFunction

    System: Ubuntu 16.04
    Rust: stable-x86_64-unknown-linux-gnu (1.28.0)

    I can successfully compile the Weld library. I wrote a client application in Rust, similar to the example application at https://www.weld.rs/docs/weld/:

    extern crate weld;
    
    use weld::*;
    
    #[repr(C)]
    struct MyArgs {
        a: i32,
        b: i32,
    }
    
    fn main() {
        set_log_level(WeldLogLevel::Trace);
        let code = "|a: i32, b: i32| a + b";
        let ref mut conf = WeldConf::new();
        conf.set("weld.compile.dumpCode", "true");
        conf.set("weld.compile.dumpCodeDir", "/tmp");
        let mut module = WeldModule::compile(code, conf).unwrap();
    
        // Weld accepts packed C structs as an argument.
        let ref args = MyArgs { a: 1, b: 50 };
        let ref input = WeldValue::new_from_data(args as *const _ as Data);
    
        // Running a Weld module and reading a value out of it is unsafe!
        unsafe {
            // Run the module, which returns a wrapper `WeldValue`.
            let result = module.run(conf, input).unwrap();
            // The data is just a pointer: cast it to the expected type
            let data = result.data() as *const i32;
    
            let result = (*data).clone();
            assert_eq!(args.a + args.b, result);
        }
    }
    

    However, the program terminated with the following output

    [debug] 23:56:48.124: Started compiling LLVM
    [debug] 23:56:48.124: Done creating LLVM context
    [debug] 23:56:48.124: Done parsing module
    [debug] 23:56:48.124: Done parsing bytecode file
    [debug] 23:56:48.125: Done linking bytecode file
    [debug] 23:56:48.125: Done validating module
    [debug] 23:56:48.128: Done optimizing module
    [debug] 23:56:48.128: Done creating execution engine
    [debug] 23:56:48.128: Before Calling FindFunction
    LLVM ERROR: Program used external function 'weld_rt_get_run_id' which could not be resolved!
    

    I tried to investigate the Weld source code and found that it calls llvm::execution_engine::LLVMGetFunctionAddress, and that it is LLVM that forcefully terminates the whole program. What is going wrong?

    opened by stevenybw 7
  • Sequential loops

    This implements sequential loops using the iterate(initial_value, update_func) construct, where initial_value is of some type T and update_func is of type T => {T, bool}. It works the same way as the sequential loop in NVL. We call update_func repeatedly on values starting from initial_value and stop when the bool it returns is false. Then the final value of the expression is the last T it produced.
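
The semantics described above can be sketched in plain Python. This is a model of the described behavior, not the Weld implementation; the doubling example is made up for illustration.

```python
def iterate(initial_value, update_func):
    """Call update_func repeatedly; stop when the returned flag is False."""
    value = initial_value
    while True:
        # update_func has type T => {T, bool}, modeled here as a tuple.
        value, keep_going = update_func(value)
        if not keep_going:
            # The result is the last T that update_func produced.
            return value


# Double a number while the doubled value is still below 100.
final = iterate(1, lambda x: (x * 2, x * 2 < 100))
print(final)  # 128
```

Note that the update function always runs at least once, and the value returned alongside `False` is the one that becomes the result.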

    I made this work slightly differently if the loop body is sequential vs parallel: in the sequential case, it just adds basic blocks in the current function, while in the parallel case, it adds new functions for the continuation, etc. This is the same way If generates code.

    opened by mateiz 7
  • Weld caches the result of UDF calls?

    Modifying the Weld program here https://github.com/weld-project/weld/blob/b6ef6748cec3f2740032df164fbeff0aeb0b236a/examples/cpp/udfs_from_library/udfs.cpp#L24 into |x:i64| cudf[add_five,i64](x) + cudf[add_five,i64](x) returns the expected result (2*x + 10) but only calls add_five once.
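
A toy Python sketch of how the call count can differ from what the expression suggests. It assumes, without confirmation from this thread, that the two identical calls are treated as a common subexpression; `add_five` below is a local Python stand-in with a call counter, not the C UDF from the linked example.

```python
call_count = {"add_five": 0}


def add_five(x):
    # Stand-in for the UDF; counts how often it is invoked.
    call_count["add_five"] += 1
    return x + 5


def as_written(x):
    # Evaluates the expression literally: two calls.
    return add_five(x) + add_five(x)


def after_cse(x):
    # Evaluates the shared subexpression once and reuses it: one call.
    shared = add_five(x)
    return shared + shared


call_count["add_five"] = 0
literal_result = as_written(3)
literal_calls = call_count["add_five"]

call_count["add_five"] = 0
cse_result = after_cse(3)
cse_calls = call_count["add_five"]

print(literal_result, literal_calls)  # 16 2
print(cse_result, cse_calls)          # 16 1
```

Both versions return 2*x + 10, but reusing the result is only equivalent when the UDF has no side effects.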

    opened by mihai-varga 6
  • Nditer

    The main structural changes involve adding a new kind, NdIter, to IterKind:

    pub enum IterKind {
        ScalarIter, // A standard scalar iterator.
        SimdIter,   // A vector iterator.
        FringeIter, // A fringe iterator, handling the fringe of a vector iter.
        NdIter,     // multi-dimensional nd-iter
    }

    and a couple of fields relevant to NdIter in struct Iter and struct ParallelForIter, e.g.:

    pub struct ParallelForIter {
        pub data: Symbol,
        pub start: Option,
        pub end: Option,
        pub stride: Option,
        pub kind: IterKind,
        // NdIter specific fields
        pub strides: Option,
        pub shapes: Option,
    }

    The code follows the same path as for other IterKinds. The main changes are in the code that generates the index of the next element in llvm.rs (gen_loop_iteration_start and gen_loop_iteration_end), and in changing the bounds/num_iterations based on the shapes parameter, also in llvm.rs (gen_loop_bounds_check and gen_num_iters_and_fringe_start).

    Besides this, I added a struct in llvm.rs (not sure if this was the best place for this?):

    pub struct VecLLVMInfo {
        pub ty_str: String,
        pub arr_str: String,
        pub prefix: String,
        pub len_str: String,
        pub el_ty_str: String,
    }

    which was useful for many of the llvm routines I had.

    I have only added two tests for it (testing with zip, and a basic op (log)), but I did test it further using the numpy API where it was more natural as I could compare different outputs of non-contig arrays with results numpy produces. Here, I had a simple case that simulates a non-contiguous array using a 1-d rust array.

    One issue that still remains is how to set the size of the array correctly: right now it is being done in transforms.rs --> infer_size using the formula len = end-start / stride. Instead, the new formula should be shapes[0]*shapes[1]*...*shapes[n-1], but I wasn't sure how to emit Weld code for this using exprs. In Weld it should be something like:

    for(shapes, merger[i64, *], |b, i, e| merge(b, e));
    

    And using exprs, as in infer_size, I thought it would be something like:

    let b = exprs::newbuilder_expr(Merger(Scalar(I64), BinOpKind::Multiply))?;
    let m = exprs::merge_expr(b, ???);
    exprs::for_expr(iters[0].data.clone(), b, m, false)?

    but I wasn't sure how to do the merge_expr --> in weld code I guess we have access to element "e", but how do we get it here?

    If this is resolved, then we won't need to pass in "end", and "stride" to nditer. So far, I was just using stride=1, and end = start + real_len, which is functionally correct, but adds too many parameters to nditer...
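
The proposed size computation, the product of all entries in shapes, can be sketched in plain Python. This mirrors what the merger[i64, *] loop above is meant to compute; it says nothing about how to build the expression with exprs, and `merger_product` is a hypothetical name.

```python
from functools import reduce
import operator


def merger_product(shapes):
    # Start from the multiplicative identity 1 and merge each element
    # with *, as for(shapes, merger[i64, *], |b, i, e| merge(b, e)) would.
    return reduce(operator.mul, shapes, 1)


print(merger_product([2, 3, 4]))  # 24
```

For a 2 x 3 x 4 array this gives 24 elements, independent of the strides used to walk the underlying buffer.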

    opened by parimarjan 6
  • Reduce small loop overhead

    1. increase outer loop grain size to 4096
    2. eliminate bounds checks for simple single-iter (no explicit start and end) cases
    3. minimize thread ID retrievals
    4. local stack-based mergers with lazy creation of global mergers

    performance on a worst-case loop like this

    merger[+] m
    for v in vs {
      merger[+] n
      for e in v { // only one iteration
        merge(n, e)
      }
      merge(m, result(n))
    }
    result(m)
    

    is still 4x off the C version ... used to be 500x though

    opened by jjthomas 6
  • Eliminated shared runtime library

    Compile times may have gone up ... tests seem to take a bit longer to run. @mateiz maybe you can measure this?

    weld_module_free and weld_module_mem_free are now used to free memory (they free all memory allocated by the module, which is what weld_value_free did before), and weld_module_memory_usage is used to determine a module's total memory usage. weld_value_free and weld_value_memory_usage now do nothing. I think we should implement these later once we figure out how to correctly free and determine the memory usage of a single value.

    opened by jjthomas 6
  • movielens_grizzly.py code not working.

    Hello,

    I tried to run "movielens_grizzly.py" but I got the following error:

    weld_type = grizzly_impl.numpy_to_weld_type_mapping[dtype]
    KeyError: '|S1'

    Is this a known issue?

    Thanks in advance.

    opened by kchasialis 0
  • Bump numpy from 1.18.1 to 1.22.0 in /weld-python

    Bumps numpy from 1.18.1 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • unique() function on weld-capi.

    Hello,

    I want to implement the grizzly_impl.unique() function using weld-capi.

    After looking at the grizzly_impl.py code I found out that this is the code for unique()

    map(
             tovec(
               result(
                 for(
                   map(
                     obj_id,
                     |p: vec[i8]| {p,0}
                   ),
                   dictmerger[vec[i8],i32,+],
                   |b, i, e| merge(b,e)
                 )
               )
             ),
             |p: {vec[i8], i32}| p.$0
           )
    

    However, obj_id is retrieved at runtime, and I do not know how to do that using weld-capi. Basically, my question is how to write a unique() function in the Weld IR that can be compiled and called using weld-capi.
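
For reference, the computation expressed by the IR above can be mirrored in plain Python. This sketch only models what the dictmerger-based program computes; it does not show how to pass obj_id through the C API, and `unique` here is a local illustrative function, not grizzly_impl.unique.

```python
def unique(values):
    merged = {}
    for p in values:
        # Mirrors merging {p, 0} into dictmerger[vec[i8], i32, +]:
        # the value stays 0, only the set of keys matters.
        merged[p] = merged.get(p, 0) + 0
    # Mirrors tovec(...) followed by map(|p| p.$0): keep only the keys.
    # A Python dict keeps insertion order; no ordering is assumed for Weld.
    return list(merged.keys())


print(unique([b"a", b"b", b"a", b"c"]))  # [b'a', b'b', b'c']
```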

    Thanks in advance!

    opened by kchasialis 0
  • Running Python UDFs in Weld.

    I am trying to run a UDF pipeline on a dataset using Weld (or grizzly, I suppose).

    Grizzly, however, (as far as I know) does not offer an optimized function to apply for example a scalar UDF on a specific column of the dataset.

    I found that one way to do it is to access the internal data using to_pandas() which has a function called “apply” and use this function to run a Python UDF on a column.

    The problem is that I want to measure Weld’s performance on UDFs, and accessing the internal data and applying the functions just as a normal Python program would is not a fair way to measure Weld’s performance for (Python) UDF execution.

    How can I apply a python UDF on a column of the dataset in an optimized way using Weld?

    Thanks in advance!

    opened by kchasialis 0
  • example udfs run error

    When I build the example udfs and run it, I get the error: LLVM ERROR: Program used external function 'add_five' which could not be resolved! I think the cause is a missing -rdynamic in the gcc flags. I want to submit a PR to fix that, but I found that there are still 2 unmerged PRs. Who is maintaining this project, and what is the PR standard?

    opened by bakey 0
Releases(v0.4.0)
  • v0.4.0(Feb 13, 2020)

    Changes

    • Deprecates the old constructors API in lieu of the NewExpr trait on Expr
    • Adds a new SIR pass that allows for improved performance in some cases
  • v0.3.1(Aug 23, 2019)

    This release is equivalent to the tagged v0.3.0 release, but it increments the Cargo.toml version number. This allows the version of Weld published on Cargo to be in sync with the version tagged in releases on GitHub from now on.

  • v0.3.0(Aug 23, 2019)

    This is the first tagged release of Weld. See the release notes below.

    v0.3.0

    This release brings a new, optimized LLVM backend and updates to the core Weld APIs. It also removes multi-threading support for the time being, because the old multi-threading design imposed undue overhead on single-threaded programs. In the future, threading will be re-introduced.

    Detailed Notes

    • Introduces the WeldContext API, which provides a context for Weld runs to allocate their memory and to reuse objects allocated by other Weld runs. See the WeldContext documentation (docs/latest/weld/struct.WeldContext.html) for more details on how this is used.

    • Adds an optlookup operator that performs keyexists and lookup with a single hash operation. The operation returns a value and a boolean indicating whether the value was found. This operator will replace lookup on dictionaries eventually. See the language docs for more detail.

    • Changes the sort API to take a comparator function similar to libc qsort.

    • Adds an assert operator, which crashes the program if the assertion fails and evaluates to true otherwise.

    • Removes support for the nditer iterator. This will be added back in a future release.

    • Adds new options for dumping code. See the new configuration options (docs/latest/weld/#constants) for details.

    • Adds the ability to make type aliases:

      type mytuple = {i32,i32};
      
    • Internally, the backend now uses LLVM's C builder API. This improves both compilation times and execution time (since LLVM's optimizer does a better job overall).

    • Changes the hash function to CRC32. This is only supported on x86/x64.

    • Implements common subexpression elimination.

    • Bug fixes (see the PRs on the Weld repository).
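
The optlookup behavior described in the notes above, a value plus a found flag from one lookup, can be modeled in plain Python. This is only a sketch of the described semantics; the `default` parameter and the use of a Python dict are assumptions made for illustration, not part of the Weld API.

```python
def optlookup(table, key, default=None):
    # A single dictionary access replaces a keyexists check followed by lookup.
    try:
        return table[key], True
    except KeyError:
        # The value returned for a missing key is an assumption of this sketch.
        return default, False


counts = {"a": 1, "b": 2}
print(optlookup(counts, "a"))  # (1, True)
print(optlookup(counts, "z"))  # (None, False)
```

Callers check the flag before using the value, which avoids hashing the key twice.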

Owner
Weld
The Weld Project