An experimental implementation of Arc against Apache DataFusion

box

This is an experimental repository for a proof-of-concept replacement of the Apache Spark executor for Arc with Apache DataFusion.

This is a very simple proof of concept which, with community collaboration, could form the basis of much more efficient Arc execution. If you can see value in this approach and would like to get involved, please raise an issue. If there is sufficient demand we can set up a more formal discussion forum.

How to run

Clone the repository

This repository has a submodule containing the TPC-H data for an easy execution demonstration, so when cloning add the recursive option:

git clone --recurse-submodules https://github.com/tripl-ai/box.git
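If the repository was already cloned without `--recurse-submodules`, the submodule can usually be fetched afterwards with Git's standard `submodule update` command. The following is a minimal sketch; the helper name `init_submodules` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not shipped with this repository).
# Fetch all submodules, including nested ones, for an existing clone.
# $1 is the path to the clone; it defaults to the current directory.
init_submodules() {
  git -C "${1:-.}" submodule update --init --recursive
}
```

For example, `init_submodules box` run from the directory that contains the clone, or `init_submodules` with no argument from inside it.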

Command Line

To execute a job via the command line, use the provided ./box.sh file, which executes job.json and is intended to show the basic functionality.

You will need to have Rust installed (see rustup). After the initial Rust install, add the nightly channel:

rustup toolchain install nightly

The Rust nightly version is currently required for SIMD support. Some packages, such as cmake, may need to be installed in order to compile; the build output should indicate any missing packages.
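Before building, it can help to confirm that the required tools are on your PATH. The sketch below uses a hypothetical `check_tools` helper (not part of this repository). Which tools to pass to it is an assumption based on the prerequisites mentioned here, and the build output remains the authoritative source for missing packages.

```shell
# Hypothetical helper (not shipped with this repository).
# Report which of the named commands are not available on PATH.
# Returns 0 when every command is found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_tools rustup cargo cmake` prints a line to standard error for each tool that cannot be found and returns a non-zero status if any are missing.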

Please note that if running on WSL or Windows you may need to convert the line endings to Unix format (LF) in order to run the script. When the code is checked out, line endings may be automatically changed to Windows line endings (CRLF) depending on your Git configuration. If you would like Git not to convert line endings, set core.autocrlf to false:

 git config --global core.autocrlf false

See Customizing Git for more information.
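If the scripts have already been checked out with CRLF line endings, they can be converted in place. This is a minimal sketch using a hypothetical `to_unix_line_endings` helper (not part of this repository). It removes every carriage return character from the given files, which is appropriate for plain shell scripts but should not be applied to binary files.

```shell
# Hypothetical helper (not shipped with this repository).
# Remove every carriage return (\r) from the given files, turning CRLF into LF.
# The result is written back into the original file so that its permissions
# (for example the executable bit) are kept.
to_unix_line_endings() {
  for f in "$@"; do
    tr -d '\r' < "$f" > "$f.tmp" && cat "$f.tmp" > "$f" && rm "$f.tmp"
  done
}
```

For example, `to_unix_line_endings box.sh notebook.sh` run from the repository root.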

Notebook

To use the notebook functionality, run the provided ./notebook.sh file. The box.ipynb file is a demonstration intended to show the basic notebook functionality. You will need Docker installed (see Docker).

Licenses

The notebook functionality relies on code copied and modified from the evcxr crate.

Owner
tripl.ai