High-performance runtime for data analytics applications





Weld is a language and runtime for improving the performance of data-intensive applications. It optimizes across libraries and functions by expressing the core computations in libraries in a common intermediate representation and optimizing across the entire workflow.

Modern analytics applications combine multiple functions from different libraries and frameworks to build complex workflows. Even though individual functions can achieve high performance in isolation, the performance of the combined workflow is often an order of magnitude below hardware limits due to extensive data movement across the functions. Weld's approach to solving this problem is to lazily build up a computation for the entire workflow, and then to optimize and evaluate it only when a result is needed.
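To make the lazy-evaluation idea concrete, here is a toy sketch in Rust (not Weld's actual IR or API; all names are illustrative): operations only record a tree of work, and nothing runs until a result is demanded.

```rust
// Toy lazy-pipeline sketch. A real system like Weld would additionally
// optimize the recorded tree (e.g. fuse the two maps) before evaluating.
enum Lazy {
    Data(Vec<i64>),
    Map(Box<Lazy>, fn(i64) -> i64),
    Sum(Box<Lazy>),
}

impl Lazy {
    fn map(self, f: fn(i64) -> i64) -> Lazy {
        Lazy::Map(Box::new(self), f) // records work; performs none
    }
    fn sum(self) -> Lazy {
        Lazy::Sum(Box::new(self))
    }
    // Evaluation happens once, over the whole recorded workflow.
    fn eval(&self) -> Vec<i64> {
        match self {
            Lazy::Data(v) => v.clone(),
            Lazy::Map(inner, f) => inner.eval().into_iter().map(|x| f(x)).collect(),
            Lazy::Sum(inner) => vec![inner.eval().into_iter().sum()],
        }
    }
}
```

Building `Lazy::Data(vec![1, 2, 3]).map(|x| x + 1).map(|x| x * 2)` does no work; only calling `.eval()` walks the tree, at which point a cross-function optimizer would have the entire workflow in view.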

You can join the discussion on Weld on our Google Group or post on the Weld mailing list at [email protected].



To build Weld, you need the latest stable version of Rust and LLVM/Clang++ 6.0.

To install Rust, follow the steps here. You can verify that Rust was installed correctly on your system by typing rustc into your shell. If you already have Rust and rustup installed, you can upgrade to the latest stable version with:

rustup update stable

macOS LLVM Installation

To install LLVM on macOS, first install Homebrew. Then:

brew install llvm@6

Weld's dependencies require llvm-config on $PATH, so you may need to create a symbolic link so the correct llvm-config is picked up (note that you might need to add sudo at the start of this command):

ln -sf `brew --prefix llvm@6`/bin/llvm-config /usr/local/bin/llvm-config

To make sure this worked correctly, run llvm-config --version. You should see 6.0.x.

Ubuntu LLVM Installation

To install LLVM on Ubuntu, add the LLVM 6.0 apt repository and install the packages with apt-get:

On Ubuntu 16.04 (Xenial):

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main"
sudo apt-get update
sudo apt-get install llvm-6.0-dev clang-6.0

On Ubuntu 14.04 (Trusty):

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/trusty/ llvm-toolchain-trusty-6.0 main"

# gcc backport is required on 14.04, for libstdc++. See https://apt.llvm.org/
sudo apt-add-repository "deb http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu trusty main"
sudo apt-get update
sudo apt-get install llvm-6.0-dev clang-6.0

Weld's dependencies require llvm-config, so you may need to create a symbolic link so the correct llvm-config is picked up. sudo may be required:

ln -s /usr/bin/llvm-config-6.0 /usr/local/bin/llvm-config

To make sure this worked correctly, run llvm-config --version. You should see 6.0.x or newer.

You will also need zlib:

sudo apt-get install zlib1g-dev

Building Weld

With LLVM and Rust installed, you can build Weld. Clone this repository, set the WELD_HOME environment variable, and build using cargo:

git clone https://github.com/weld-project/weld
cd weld/
export WELD_HOME=`pwd`
cargo build --release

Weld builds two dynamically linked libraries (.so files on Linux and .dylib files on Mac): libweld and libweldrt.

Finally, run the unit and integration tests:

cargo test


Documentation

The Rust Weld crate is documented here.

The docs/ directory contains documentation for the different components of Weld.

  • language.md describes the syntax of the Weld IR.
  • api.md describes the low-level C API for interfacing with Weld.
  • python.md gives an overview of the Python API.
  • tutorial.md contains a tutorial for how to build a small vector library using Weld.

Python Bindings

Weld's Python bindings are in python, with examples in examples/python.


Grizzly

Grizzly is a subset of Pandas integrated with Weld. Details on how to use Grizzly are in python/grizzly. Some example workloads that make use of Grizzly are in examples/python/grizzly. To run Grizzly, you will also need the WELD_HOME environment variable to be set, because Grizzly needs to find its own native library through this variable.


Testing

cargo test runs unit and integration tests. A test name substring filter can be used to run a subset of the tests:

cargo test <substring to match in test name>


Tools

This repository contains a number of useful command line tools which are built automatically with the main Weld repository, including an interactive REPL for inspecting and debugging programs. More information on those tools can be found under docs/tools.md.

  • Codegen cleanup

    On leg 2 of my winter break journey across the US, I made a prototype of a new LLVM code generator using llvm-rs. Three major changes:

    1. Code generation through builders. Instead of generating LLVM code strings, builders provide a more type-safe (both at compile time and runtime) and concise means of creating an IR.
    2. New code execution runtime. The llvm-rs JitEngine is pretty comparable to what's already implemented in easy_ll, except it integrates well with the code builders instead of relying on a string intermediary.
    3. Revamped the REPL to actually produce output and use a command line parser.
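The builder approach in point 1 can be sketched with hypothetical types (not the llvm-rs API): a typed builder validates each instruction as it is created, instead of deferring every error to an IR string parser.

```rust
// Hypothetical IR-builder sketch: registers are checked when instructions
// are built, so malformed IR is caught immediately rather than at parse time.
#[derive(Debug, PartialEq)]
enum Inst {
    Add(u32, u32, u32), // dst, lhs, rhs registers
    Ret(u32),
}

struct FnBuilder {
    next_reg: u32,
    insts: Vec<Inst>,
}

impl FnBuilder {
    // Registers 0..params are predefined function arguments.
    fn new(params: u32) -> Self {
        FnBuilder { next_reg: params, insts: Vec::new() }
    }
    fn add(&mut self, lhs: u32, rhs: u32) -> u32 {
        let dst = self.next_reg;
        assert!(lhs < dst && rhs < dst, "use of undefined register");
        self.next_reg += 1;
        self.insts.push(Inst::Add(dst, lhs, rhs));
        dst
    }
    fn ret(mut self, reg: u32) -> Vec<Inst> {
        assert!(reg < self.next_reg, "use of undefined register");
        self.insts.push(Inst::Ret(reg));
        self.insts
    }
}
```

Building `|a, b| a + b` is then `FnBuilder::new(2)`, `add(0, 1)`, `ret(2)`; an out-of-range register panics at build time instead of producing an unparseable string.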
    opened by willcrichton 11
  • Build refactor

    1. Removes the make commands from build.rs that were used to build the convertor dylib.
    2. Changes the package name grizzly to pygrizzly.
    3. Adds a binary extension for libweld in pyweld/setup.py, which stopped auditwheel from complaining about libweld.so being part of the Python wheel. auditwheel now runs successfully and changes the platform tag, which allowed uploading to PyPI.
    opened by rahulpalamuttam 10
  • String ~ vec[i8] comparisons Python3

    I am attempting in baloo to encode strings for Weld so that e.g. sr[sr != 'abc'] works, however there seems to be a bug somewhere. Are vec[i8] <comparison> vec[i8] expressions expected to work correctly at the Weld level?

    For example:

    // _inp2 here is the index associated with the _inp0 strings data
    |_inp0: vec[vec[i8]], _inp1: vec[i8], _inp2: vec[i64]| let obj100 = (_inp0);
    let obj101 = (map(
        |a: vec[i8]| 
            a != _inp1
            zip(_inp2, obj101),
            |b: appender[i64], i: i64, e: {i64, bool}| 
                if (e.$1, 
                    merge(b, e.$0), 

    This only seems to work when _inp1 is of length 1. So for:

    sr = Series(np.array(['abc', 'Burgermeister', 'b'], dtype=np.bytes_))
    sr[sr != 'b']  # will correctly return the first 2 elements
    sr[sr != 'abc']  # does not; (returns all elements)

    The most likely culprit is the encoding with Python3. The only changes I made are essentially moving from PyString_AsString and PyString_Size to the PyBytes_* equivalents (in the .cpp file) and encoding the str to utf-8, e.g. abc.encode('utf-8') (in the encoders.py file):

    extern "C"
    weld::vec<uint8_t> str_to_weld_char_arr(PyObject* in) {
      int64_t dimension = (int64_t) PyBytes_Size(in);
      weld::vec<uint8_t> t;
      t.size = dimension;
      t.ptr = (uint8_t*) PyBytes_AsString(in);
      return t;
    }

    if isinstance(obj, str):
        numpy_to_weld = self.utils.str_to_weld_char_arr
        numpy_to_weld.restype = WeldVec(WeldChar()).ctype_class
        numpy_to_weld.argtypes = [py_object]
        return numpy_to_weld(obj.encode('utf-8'))

    Note that

    1. En-/decoding numpy arrays of bytes works fine with the grizzly encoders (and using PyBytes_FromStringAndSize instead of PyString_FromStringAndSize).
    2. Also toyed around with modifying WeldChar.ctype_class to c_char_p as opposed to c_wchar_p which seemed more appropriate yet produces the same result.
    3. Encoding as ascii would probably be more appropriate, since Weld can't handle unicode from what I can tell. Nevertheless, the tested data is ascii.
    4. This is with the master branch Weld.

    Any feedback/idea on what the issue might be?

    opened by radujica 9
  • Python encoder

    @sppalkia: just as an FYI, not ready to merge.

    Major issues right now:

    • Some benchmarks don't yet work correctly
    • Encoder / decoder much slower than C++ encoder / decoder
    opened by deepakn94 8
  • Use typed null pointers instead of i64 0-values

    Fixes #473 when using Weld with an LLVM 6.0 distribution that has LLVM_ENABLE_ASSERTIONS enabled.

    Also fixes some README issues.

    The issue was that some places in the code generation used an i64 0 literal as a substitute for null, which was okay with LLVM's module verifier but caused certain debug assertions to complain.

    opened by sppalkia 7
  • Rust program using Weld exits with "LLVM ERROR: Program used external function ..." after calling FindFunction

    System: Ubuntu 16.04 Rust: stable-x86_64-unknown-linux-gnu (1.28.0)

    I can successfully compile the Weld library. I wrote a client application in Rust. Similar to the example application at https://www.weld.rs/docs/weld/, it is:

    extern crate weld;
    use weld::*;
    struct MyArgs {
        a: i32,
        b: i32,
    }

    fn main() {
        let code = "|a: i32, b: i32| a + b";
        let ref mut conf = WeldConf::new();
        conf.set("weld.compile.dumpCode", "true");
        conf.set("weld.compile.dumpCodeDir", "/tmp");
        let mut module = WeldModule::compile(code, conf).unwrap();
        // Weld accepts packed C structs as an argument.
        let ref args = MyArgs { a: 1, b: 50 };
        let ref input = WeldValue::new_from_data(args as *const _ as Data);
        // Running a Weld module and reading a value out of it is unsafe!
        unsafe {
            // Run the module, which returns a wrapper `WeldValue`.
            let result = module.run(conf, input).unwrap();
            // The data is just a pointer: cast it to the expected type
            let data = result.data() as *const i32;
            let result = (*data).clone();
            assert_eq!(args.a + args.b, result);
        }
    }

    However, the program terminated with the following output

    [debug] 23:56:48.124: Started compiling LLVM
    [debug] 23:56:48.124: Done creating LLVM context
    [debug] 23:56:48.124: Done parsing module
    [debug] 23:56:48.124: Done parsing bytecode file
    [debug] 23:56:48.125: Done linking bytecode file
    [debug] 23:56:48.125: Done validating module
    [debug] 23:56:48.128: Done optimizing module
    [debug] 23:56:48.128: Done creating execution engine
    [debug] 23:56:48.128: Before Calling FindFunction
    LLVM ERROR: Program used external function 'weld_rt_get_run_id' which could not be resolved!

    I investigated the Weld source code and found that it calls llvm::execution_engine::LLVMGetFunctionAddress, and it is LLVM that forcefully terminates the whole program. What's going wrong?

    opened by stevenybw 7
  • Sequential loops

    This implements sequential loops using the iterate(initial_value, update_func) construct, where initial_value is of some type T and update_func is of type T => {T, bool}. It works the same way as the sequential loop in NVL. We call update_func repeatedly on values starting from initial_value and stop when the bool it returns is false. Then the final value of the expression is the last T it produced.
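The iterate(initial_value, update_func) semantics described above can be sketched in plain Rust (a hypothetical helper, not part of the Weld crate): call update repeatedly, stop when the returned bool is false, and yield the last value produced.

```rust
// Sketch of Weld's iterate(initial_value, update_func) construct:
// update is applied repeatedly, starting from the initial value, until
// it returns false; the result is the last value it produced.
fn iterate<T>(initial: T, mut update: impl FnMut(T) -> (T, bool)) -> T {
    let mut value = initial;
    loop {
        let (next, keep_going) = update(value);
        value = next;
        if !keep_going {
            return value;
        }
    }
}
```

For example, iterate(1, |x| (x * 2, x * 2 <= 100)) keeps doubling until the doubled value exceeds 100, then returns that last value.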

    I made this work slightly differently if the loop body is sequential vs parallel: in the sequential case, it just adds basic blocks in the current function, while in the parallel case, it adds new functions for the continuation, etc. This is the same way If generates code.

    opened by mateiz 7
  • Weld caches the result of UDF calls?

    Modifying the Weld program here https://github.com/weld-project/weld/blob/b6ef6748cec3f2740032df164fbeff0aeb0b236a/examples/cpp/udfs_from_library/udfs.cpp#L24 into |x:i64| cudf[add_five,i64](x) + cudf[add_five,i64](x) returns the expected result (2*x + 10) but only calls add_five once.

    opened by mihai-varga 6
  • Nditer


    The main structural change involves adding a new kind, NdIter, to IterKind:

        pub enum IterKind {
            ScalarIter, // A standard scalar iterator.
            SimdIter,   // A vector iterator.
            FringeIter, // A fringe iterator, handling the fringe of a vector iter.
            NdIter,     // A multi-dimensional nd-iter.
        }

    and a couple of fields relevant to NdIter in struct Iter and struct ParallelForIter, e.g.:

        pub struct ParallelForIter {
            pub data: Symbol,
            pub start: Option,
            pub end: Option,
            pub stride: Option,
            pub kind: IterKind,
            // NdIter specific fields
            pub strides: Option,
            pub shapes: Option,
        }

    The code follows the same path as for other IterKinds, and the main changes are in the code to generate the index of the next element in llvm.rs (gen_loop_iteration_start, and gen_loop_iteration_end). And changing the bounds/num_iterations based on the shapes parameter also in llvm.rs (gen_loop_bounds_check, and gen_num_iters_and_fringe_start).

    Besides this, I added a struct in llvm.rs (not sure if this was the best place for it?):

        pub struct VecLLVMInfo {
            pub ty_str: String,
            pub arr_str: String,
            pub prefix: String,
            pub len_str: String,
            pub el_ty_str: String,
        }

    which was useful for many of the llvm routines I had.

    I have only added two tests for it (testing with zip, and a basic op (log)), but I did test it further using the numpy API where it was more natural as I could compare different outputs of non-contig arrays with results numpy produces. Here, I had a simple case that simulates a non-contiguous array using a 1-d rust array.

    One issue that still remains is how to set the size of the array correctly: right now it is being done in transforms.rs --> infer_size using the formula len = (end - start) / stride. Instead, the new formula should be shapes[0] * shapes[1] * ... * shapes[n-1], but I wasn't sure how to emit Weld code for this using exprs... in Weld it should be something like:

    for(shapes, merger[i64, *], |b, i, e| merge(b, e));

    And using exprs, as in infer_size, I thought it would be something like:

        let b = exprs::newbuilder_expr(Merger(Scalar(I64), BinOpKind::Multiply))?;
        let m = exprs::merge_expr(b, ???);
        exprs::for_expr(iters[0].data.clone(), b, m, false)?

    but I wasn't sure how to do the merge_expr --> in weld code I guess we have access to element "e", but how do we get it here?

    If this is resolved, then we won't need to pass in "end", and "stride" to nditer. So far, I was just using stride=1, and end = start + real_len, which is functionally correct, but adds too many parameters to nditer...
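For reference, the intended length of an n-dimensional view, the product of its shape dimensions, is easy to state outside of Weld; a Rust sketch:

```rust
// Number of elements in an n-dimensional view is the product of its shape
// dimensions, independent of strides (strides only affect addressing).
fn num_elements(shapes: &[i64]) -> i64 {
    shapes.iter().product()
}
```

For example, a view with shape [3, 4, 5] has 60 elements, regardless of whether the underlying buffer is contiguous.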

    opened by parimarjan 6
  • Reduce small loop overhead

    1. increase outer loop grain size to 4096
    2. eliminate bounds checks for simple single-iter (no explicit start and end) cases
    3. minimize thread ID retrievals
    4. local stack-based mergers with lazy creation of global mergers

    performance on a worst-case loop like this

    merger[+] m
    for v in vs {
      merger[+] n
      for e in v { // only one iteration
        merge(n, e)
      }
      merge(m, result(n))
    }

    is still 4x off the C version ... used to be 500x though
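For reference, the computation in that worst-case loop is just a sum of per-vector sums; in plain Rust:

```rust
// What the nested-merger Weld loop computes: the outer merger sums the
// results of the inner per-vector mergers.
fn sum_of_sums(vs: &[Vec<i64>]) -> i64 {
    vs.iter().map(|v| v.iter().sum::<i64>()).sum()
}
```

The worst case arises when each inner vector has a single element, so the per-iteration merger setup dominates the actual work.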

    opened by jjthomas 6
  • Eliminated shared runtime library

    Compile times may have gone up ... tests seem to take a bit longer to run. @mateiz maybe you can measure this?

    weld_module_free and weld_module_mem_free are now used to free memory (they free all memory allocated by the module, which is what weld_value_free did before), and weld_module_memory_usage is used to determine a module's total memory usage. weld_value_free and weld_value_memory_usage now do nothing. I think we should implement these later once we figure out how to correctly free and determine the memory usage of a single value.

    opened by jjthomas 6
  • movielens_grizzly.py code not working.


    I tried to run "movielens_grizzly.py" but I got the following error:

    weld_type = grizzly_impl.numpy_to_weld_type_mapping[dtype]
    KeyError: '|S1'

    Is this a known issue?

    Thanks in advance.

    opened by kchasialis 0
  • Bump numpy from 1.18.1 to 1.22.0 in /weld-python

    Bumps numpy from 1.18.1 to 1.22.0.

    Release notes

    Sourced from numpy's releases.


    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across applications such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.


    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.


    ... (truncated)


    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    opened by dependabot[bot] 0
  • unique() function on weld-capi.


    I want to implement the grizzly_impl.unique() function using weld-capi.

    After looking at the grizzly_impl.py code I found out that this is the code for unique()

                     |p: vec[i8]| {p,0}
                   |b, i, e| merge(b,e)
             |p: {vec[i8], i32}| p.$0

    However, obj_id is retrieved during runtime and I do not know how to do that using weld-capi. Basically my question is how to write a unique() function using weld-ir that can be compiled and called using weld-capi.

    Thanks in advance!

    opened by kchasialis 0
  • Running Python UDFs in Weld.

    I am trying to run a UDF pipeline on a dataset using Weld (or grizzly, I suppose).

    Grizzly, however, does not (as far as I know) offer an optimized function to apply, for example, a scalar UDF to a specific column of the dataset.

    I found that one way to do it is to access the internal data using to_pandas() which has a function called “apply” and use this function to run a Python UDF on a column.

    The problem is that I want to measure Weld's performance on UDFs, and accessing the internal data and applying the functions just as a normal Python program would is not a fair way to measure Weld's performance for (Python) UDF execution.

    How can I apply a python UDF on a column of the dataset in an optimized way using Weld?

    Thanks in advance!

    opened by kchasialis 0
  • example udfs run error

    When I try to build and run the example udfs, I get the error: LLVM ERROR: Program used external function 'add_five' which could not be resolved! I figure this is due to the missing -rdynamic in the gcc flags. I want to submit a PR to fix that, but I found that there are still 2 unmerged PRs. I want to know who is maintaining this project and what the PR standard is.

    opened by bakey 0
  • v0.4.0(Feb 13, 2020)


    • Deprecates the old constructors API in favor of the NewExpr trait on Expr
    • Adds a new SIR pass that allows for improved performance in some cases
    Source code(tar.gz)
    Source code(zip)
  • v0.3.1(Aug 23, 2019)

    This release is equivalent to the tagged v0.3.0 release, but it increments the Cargo.toml version number. This allows the version of Weld published on Cargo to be in sync with the version tagged in releases on GitHub from now on.

    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(Aug 23, 2019)

    This is the first tagged release of Weld. See the release notes below.


    This release brings a new, optimized LLVM backend and updates to the core Weld APIs. It also removes multi-threading support for the time being, because the old multi-threading design imposed undue overhead on single-threaded programs. In the future, threading will be re-introduced.

    Detailed Notes

    • Introduces the WeldContext API, which provides a context for Weld runs to allocate their memory and to reuse objects allocated by other Weld runs. See the WeldContext documentation for more details on how this is used.

    • Adds an optlookup operator that performs keyexists and lookup with a single hash operation. The operation returns a value and a boolean indicating whether the value was found. This operator will replace lookup on dictionaries eventually. See the language docs for more detail.

    • Changes the sort API to take a comparator function similar to libc qsort.

    • Adds an assert operator, which crashes the program if the assertion fails and evaluates to true otherwise.

    • Removes support for the nditer iterator. This will be added back in a future release.

    • Adds new options for dumping code. See the new configuration options for details.

    • Adds the ability to make type aliases:

      type mytuple = {i32,i32};
    • Internally, the backend now uses LLVM's C builder API. This improves both compilation times and execution time (since LLVM's optimizer does a better job overall)

    • Change the hash function to CRC32. This is only supported on x86/x64.

    • Implements common subexpression elimination.

    • Bug fixes (see the PRs on the Weld repository).
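As an aside, the optlookup semantics noted above can be sketched over a Rust HashMap (illustrative only, not the Weld API): a single probe yields both the value and a found flag, instead of separate keyexists and lookup calls.

```rust
use std::collections::HashMap;

// Sketch of optlookup semantics: one hash probe returning {value, found}
// rather than hashing once for keyexists and again for lookup.
fn optlookup(dict: &HashMap<i64, i64>, key: i64) -> (i64, bool) {
    match dict.get(&key) {
        Some(&v) => (v, true),
        None => (0, false), // the value is unspecified when not found
    }
}
```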

    Source code(tar.gz)
    Source code(zip)
The Weld Project