An AWS Lambda for automatically loading JSON files as they're created into Delta tables

Overview

Delta S3 Loader

This AWS Lambda serves a singular purpose: bring JSON files from an S3 bucket into Delta Lake. This can be highly useful for legacy or external processes which rely on uploading JSON to an S3 bucket and cannot be properly updated to write directly to Delta Lake.

You might also like...
Easy c̵̰͠r̵̛̠ö̴̪s̶̩̒s̵̭̀-t̶̲͝h̶̯̚r̵̺͐e̷̖̽ḁ̴̍d̶̖̔ ȓ̵͙ė̶͎ḟ̴͙e̸̖͛r̶̖͗ë̶̱́ṉ̵̒ĉ̷̥e̷͚̍ s̷̹͌h̷̲̉a̵̭͋r̷̫̊ḭ̵̊n̷̬͂g̵̦̃ f̶̻̊ơ̵̜ṟ̸̈́ R̵̞̋ù̵̺s̷̖̅ţ̸͗!̸̼͋

Rust S̵̓i̸̓n̵̉ I̴n̴f̶e̸r̵n̷a̴l mutability! Howdy, friendly Rust developer! Ever had a value get m̵̯̅ð̶͊v̴̮̾ê̴̼͘d away right under your nose just when

Rs.aws-login - A command line utility to simplify logging into AWS services.

aws-login A command line utility to simplify logging into AWS accounts and services. $ aws-login use ? Please select a profile to use: › ❯ dev-read

A tool to run web applications on AWS Lambda without changing code.
A tool to run web applications on AWS Lambda without changing code.

AWS Lambda Adapter A tool to run web applications on AWS Lambda without changing code. How does it work? AWS Lambda Adapter supports AWS Lambda functi

AWS Lambda with Rust

Set up (MacOS) Add x86_64 target architecture for Rust rustup target add x86_64-unknown-linux-musl Install cross system C linker sudo brew install fil

A Rust runtime for AWS Lambda

Rust Runtime for AWS Lambda This package makes it easy to run AWS Lambda Functions written in Rust. This workspace includes multiple crates: lambda-ru

A highly efficient daemon for streaming data from Kafka into Delta Lake

kafka-delta-ingest The kafka-delta-ingest project aims to build a highly efficient daemon for streaming data through Apache Kafka into Delta Lake. Thi

A highly efficient daemon for streaming data from Kafka into Delta Lake

A highly efficient daemon for streaming data from Kafka into Delta Lake

🦖 Evolve your fixed length data files into Apache Arrow tables, fully parallelized!

🦖 Evolve your fixed length data files into Apache Arrow tables, fully parallelized! 🔎 Overview ... 📦 Installation The easiest way to install evolut

Aws-sdk-rust - AWS SDK for the Rust Programming Language

The AWS SDK for Rust This repo contains the new AWS SDK for Rust (the SDK) and its public roadmap. Please Note: The SDK is currently released as a dev

Generate markdown comparison tables from `cargo-criterion` JSON output

criterion-table Generate markdown comparison tables from Cargo Criterion benchmark JSON output. Currently, the tool is limited to Github Flavored Mark

A trading bot written in Rust based on the orderbook delta volume.

The strategy based on the concept of mean reversion. We look for large deviations in the volume delta of BTC-PERP on FTX at a depth of 1. These deviations could be caused by over-enthusiastic and over-leveraged market participants.

Quick poc of the rsync wire protocol in Rust. Supports delta transfer.

Rsync wire protocol in Rust This is a quick poc of the rsync wire protocol in Rust. It supports delta transfer. The code is really a mess right now, a

Convert rust coverage data to HTML reports, LCOV files or terminal tables

Coverage Prepare Convert coverage data to HTML reports, LCOV files or terminal tables. coverage-prepare --help: Convert "profraw" coverage data to: *

PostQuet: Stream PostgreSQL tables/queries to Parquet files seamlessly with this high-performance, Rust-based command-line tool.

STATUS: IN DEVELOPMENT PostQuet: Streaming PostgreSQL to Parquet Exporter PostQuet is a powerful and efficient command-line tool written in Rust that

Tool to allow parsing large JSON files without laoding into memory
Tool to allow parsing large JSON files without laoding into memory

Tool to allow parsing large JSON files without laoding into memory. Developed in Rust with adapters in other programming langauges for easy adoption

My own image file format created for fun! Install the "hif_opener.exe" to open hif files. clone the repo and compile to make your own hif file

Why am i creating this? I wanted to create my own image format since I was 12 years old using Windows 7, tryna modify GTA San Andreas. That day, when

Merge together and efficiently time-sort compressed .pcap files stored in AWS S3 object storage (or locally) to stdout for pipelined processing.

Merge together and efficiently time-sort compressed .pcap files stored in AWS S3 object storage (or locally) to stdout for pipelined processing. High performance and parallel implementation for 10 Gbps playback throughput with large numbers of files (~4k).

Get JSON values quickly - JSON parser for Rust
Get JSON values quickly - JSON parser for Rust

get json values quickly GJSON is a Rust crate that provides a fast and simple way to get values from a json document. It has features such as one line

Schema2000 is a tool that parses exsiting JSON documents and tries to derive a JSON schema from these documents.

Schema 2000 Schema2000 is a tool that parses exsiting JSON documents and tries to derive a JSON schema from these documents. Currently, Schema2000 is

Comments
  • How delta s3 loader works with a bucket containing old files ?

    How delta s3 loader works with a bucket containing old files ?

    Hello !

    I read all docs and I don't understand onething, if my bucket has old files, like 10k and I start delta-s3-loader, he will load all 10k files and after load the new files that will arrive in bucket ou just the new files ?

    Thank you !

    opened by luancaarvalho 0
Owner
R. Tyler Croy
person
R. Tyler Croy
🦖 Evolve your fixed length data files into Apache Arrow tables, fully parallelized!

?? Evolve your fixed length data files into Apache Arrow tables, fully parallelized! ?? Overview ... ?? Installation The easiest way to install evolut

Firelink Data 3 Dec 22, 2023
PostQuet: Stream PostgreSQL tables/queries to Parquet files seamlessly with this high-performance, Rust-based command-line tool.

STATUS: IN DEVELOPMENT PostQuet: Streaming PostgreSQL to Parquet Exporter PostQuet is a powerful and efficient command-line tool written in Rust that

Per Arneng 4 Apr 11, 2023
Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. 🚀

flaco Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. ?? Have a gander at the initial benchmarks

Miles Granger 14 Oct 31, 2022
Dig into ClickHouse with TUI interface. PRE ALPHA version, everything will be changed.

chdig Dig into ClickHouse with TUI interface. Motivation The idea is came from everyday digging into various ClickHouse issues. ClickHouse has a appro

Azat Khuzhin 5 Feb 10, 2023
A Rust crate that reads and writes tfrecord files

tfrecord-rust The crate provides the functionality to serialize and deserialize TFRecord data format from TensorFlow. Features Provide both high level

null 22 Nov 3, 2022
Work out how to read Parquet files in a browser using web assembly (via the Rust toolchain)

wasm-pack-template A template for kick starting a Rust and WebAssembly project using wasm-pack. Tutorial | Chat Built with ?? ?? by The Rust and WebAs

null 5 Oct 11, 2022
cargo-lambda a Cargo subcommand to help you work with AWS Lambda

cargo-lambda cargo-lambda is a Cargo subcommand to help you work with AWS Lambda. This subcommand compiles AWS Lambda functions natively and produces

David Calavera 184 Jan 5, 2023
cargo-lambda is a Cargo subcommand to help you work with AWS Lambda.

cargo-lambda cargo-lambda is a Cargo subcommand to help you work with AWS Lambda. The new subcommand creates a basic Rust package from a well defined

null 184 Jan 5, 2023
The lambda-chaos-extension allows you to inject faults into Lambda functions without modifying the function code.

Chaos Extension - Seamless, Universal & Lightning-Fast The lambda-chaos-extension allows you to inject faults into Lambda functions without modifying

AWS CLI Tools 5 Aug 2, 2023
📦 🚀 a smooth-talking smuggler of Rust HTTP functions into AWS lambda

lando ?? maintenance mode ahead ?? As of this announcement AWS not officialy supports Rust through this project. As mentioned below this projects goal

Doug Tangren 68 Dec 7, 2021