PostQuet: Stream PostgreSQL tables/queries to Parquet files seamlessly with this high-performance, Rust-based command-line tool.

Overview

Build and Test

STATUS: IN DEVELOPMENT

logo

PostQuet: Streaming PostgreSQL to Parquet Exporter

PostQuet is a powerful and efficient command-line tool written in Rust that enables you to stream PostgreSQL tables to Parquet files seamlessly. Designed for data engineers, analysts, and developers, this tool simplifies the process of converting PostgreSQL data into the widely used columnar storage format, Parquet.

Key Features

  • Streaming data conversion: PostQuet allows for the streaming of data between PostgreSQL and Parquet, reducing memory overhead and enabling the processing of large datasets.
  • High-performance: Leveraging Rust's speed and safety, PostQuet offers excellent performance while ensuring the reliability of your data conversion.
  • Asynchronous processing: The tool utilizes asynchronous processing techniques to optimize efficiency and minimize the time spent on data conversion tasks.
  • Support for complex data types and table structures: PostQuet handles a wide range of data types and table structures, ensuring compatibility with diverse PostgreSQL databases.
  • User-friendly command-line interface: With an intuitive CLI, PostQuet makes it easy for users to export PostgreSQL tables to Parquet files, regardless of their technical expertise.

With PostQuet, you can quickly and easily convert your PostgreSQL data into the Parquet format, unlocking new possibilities for data analysis and processing.

You might also like...
A rust library built to support building time-series based projection models

TimeSeries TimeSeries is a framework for building analytical models in Rust that have a time dimension. Inspiration The inspiration for writing this i

Rustic - a backup tool that provides fast, encrypted, deduplicated backups

Rustic is a backup tool that provides fast, encrypted, deduplicated backups. It can read the restic repo format desribed in the design document and writes a compatible repo format which can also be read by restic.

Yet Another Technical Analysis library [for Rust]

YATA Yet Another Technical Analysis library YaTa implements most common technical analysis methods and indicators. It also provides you an interface t

Dataframe structure and operations in Rust

Utah Utah is a Rust crate backed by ndarray for type-conscious, tabular data manipulation with an expressive, functional interface. Note: This crate w

Rust DataFrame library

Polars Blazingly fast DataFrames in Rust & Python Polars is a blazingly fast DataFrames library implemented in Rust. Its memory model uses Apache Arro

A Rust DataFrame implementation, built on Apache Arrow

Rust DataFrame A dataframe implementation in Rust, powered by Apache Arrow. What is a dataframe? A dataframe is a 2-dimensional tabular data structure

Rayon: A data parallelism library for Rust

Rayon Rayon is a data-parallelism library for Rust. It is extremely lightweight and makes it easy to convert a sequential computation into a parallel

Official Rust implementation of Apache Arrow

Native Rust implementation of Apache Arrow Welcome to the implementation of Arrow, the popular in-memory columnar format, in Rust. This part of the Ar

Apache TinkerPop from Rust via Rucaja (JNI)

Apache TinkerPop from Rust An example showing how to call Apache TinkerPop from Rust via Rucaja (JNI). This repository contains two directories: java

Owner
Per Arneng
Per Arneng
An AWS Lambda for automatically loading JSON files as they're created into Delta tables

Delta S3 Loader This AWS Lambda serves a singular purpose: bring JSON files from an S3 bucket into Delta Lake. This can be highly useful for legacy or

R. Tyler Croy 4 Jan 12, 2022
Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow

Parquet2 This is a re-write of the official parquet crate with performance, parallelism and safety in mind. The five main differentiators in compariso

Jorge Leitao 237 Jan 1, 2023
A high-performance, high-reliability observability data pipeline.

Quickstart • Docs • Guides • Integrations • Chat • Download What is Vector? Vector is a high-performance, end-to-end (agent & aggregator) observabilit

Timber 12.1k Jan 2, 2023
A tool to stream the chats of Twitch channels as a CSV.

twitch2csv A tool to stream the chats of Twitch channels as a CSV. Installation You can use cargo to install this tool: cargo install -f twitch2csv Us

Clément Renault 2 Nov 20, 2021
High-performance runtime for data analytics applications

Weld Documentation Weld is a language and runtime for improving the performance of data-intensive applications. It optimizes across libraries and func

Weld 2.9k Dec 28, 2022
Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. 🚀

flaco Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. ?? Have a gander at the initial benchmarks

Miles Granger 14 Oct 31, 2022
A cross-platform library to retrieve performance statistics data.

A toolkit designed to be a foundation for applications to monitor their performance.

Lark Technologies Pte. Ltd. 155 Nov 12, 2022
A Rust crate that reads and writes tfrecord files

tfrecord-rust The crate provides the functionality to serialize and deserialize TFRecord data format from TensorFlow. Features Provide both high level

null 22 Nov 3, 2022
Create full-fledged APIs for static datasets without writing a single line of code.

ROAPI ROAPI automatically spins up read-only APIs for static datasets without requiring you to write a single line of code. It builds on top of Apache

null 2.5k Dec 31, 2022
PyO3-based Rust binding of NumPy C-API

rust-numpy Rust bindings for the NumPy C-API API documentation Latest release (possibly broken) Current Master Requirements Rust >= 1.41.1 Basically,

PyO3 759 Jan 3, 2023