Fill Apache Arrow record batches from an ODBC data source in Rust.

Overview

arrow-odbc

Docs Licence Crates.io

Fill Apache Arrow arrays from ODBC data sources. This crate is built on top of the arrow and odbc-api crates and enables you to read the data of an ODBC data source as a sequence of Apache Arrow record batches.

About Arrow

Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.
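
If you have not used the arrow crate before, the minimal sketch below shows what a record batch looks like on the Rust side: a schema describing the columns, plus one columnar array per field. The column names and values are purely illustrative.

use std::sync::Arc;

use arrow::array::{Int32Array, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::error::ArrowError;
use arrow::record_batch::RecordBatch;

fn example_batch() -> Result<RecordBatch, ArrowError> {
    // The schema declares one non-nullable Int32 column and one nullable Utf8 column.
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int32, false),
        Field::new("name", DataType::Utf8, true),
    ]));
    // Each array holds the values of one column.
    let ids = Int32Array::from(vec![1, 2, 3]);
    let names = StringArray::from(vec![Some("a"), None, Some("c")]);
    RecordBatch::try_new(schema, vec![Arc::new(ids), Arc::new(names)])
}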

About ODBC

ODBC (Open Database Connectivity) is a standard that enables you to access data from a wide variety of data sources using SQL.

Usage

use arrow_odbc::{odbc_api::Environment, OdbcReader};

const CONNECTION_STRING: &str = "\
    Driver={ODBC Driver 17 for SQL Server};\
    Server=localhost;\
    UID=SA;\
    PWD=My@Test@Password1;\
";

fn main() -> Result<(), anyhow::Error> {
    // Creating the ODBC environment is unsafe in this version of odbc-api. This is
    // sound as long as your application creates only one Environment.
    let odbc_environment = unsafe {
        Environment::new().unwrap()
    };
    
    // Connect to the database.
    let connection = odbc_environment.connect_with_connection_string(CONNECTION_STRING)?;

    // This SQL statement does not require any arguments.
    let parameters = ();

    // Execute query and create result set
    let cursor = connection
        .execute("SELECT * FROM MyTable", parameters)?
        .expect("SELECT statement must produce a cursor");

    // Each batch contains at most 10,000 rows.
    let max_batch_size = 10_000;

    // Read result set as arrow batches. Infer Arrow types automatically using the meta
    // information of `cursor`.
    let arrow_record_batches = OdbcReader::new(cursor, max_batch_size)?;

    for batch in arrow_record_batches {
        // ... process batch ...
    }
    Ok(())
}
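
The `// ... process batch ...` placeholder above is left open on purpose. As one possible way to fill it in, the hedged sketch below assumes that each item yielded by the reader is a Result wrapping an arrow RecordBatch (so you would call `let batch = batch?;` inside the loop) and that the first column was mapped to Int32; the column layout is illustrative, not prescribed by this crate.

use arrow::array::{Array, Int32Array};
use arrow::record_batch::RecordBatch;

/// Prints the values of the first column, assuming it was inferred as Int32
/// (see the type matching table below).
fn process_batch(batch: &RecordBatch) {
    let ids = batch
        .column(0)
        .as_any()
        .downcast_ref::<Int32Array>()
        .expect("first column is expected to be Int32");
    for row in 0..ids.len() {
        if ids.is_null(row) {
            println!("id: NULL");
        } else {
            println!("id: {}", ids.value(row));
        }
    }
}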

Matching of ODBC to Arrow types

ODBC                 | Arrow
---------------------|---------------------
Numeric(p <= 38)     | Decimal
Decimal(p <= 38)     | Decimal
Integer              | Int32
SmallInt             | Int16
Real                 | Float32
Float(p <= 24)       | Float32
Double               | Float64
Float(p > 24)        | Float64
Date                 | Date32
LongVarbinary        | Binary
Timestamp(p = 0)     | TimestampSecond
Timestamp(p: 1..3)   | TimestampMillisecond
Timestamp(p: 4..6)   | TimestampMicrosecond
Timestamp(p >= 7)    | TimestampNanosecond
BigInt               | Int64
TinyInt              | Int8
Bit                  | Boolean
Varbinary            | Binary
Binary               | FixedSizeBinary
All others           | Utf8

Supported Arrow types

Apart from the aforementioned Arrow types, UInt8 is also supported when specifying the Arrow schema directly, as shown in the sketch below.
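
As one possible way of specifying the Arrow schema directly, the sketch below assumes the `OdbcReader::with_arrow_schema` constructor offered by recent versions of this crate (check the docs of your version for the exact name and signature). The table and column names are made up for illustration; the point is mapping a TinyInt column to UInt8 instead of the inferred Int8.

use std::sync::Arc;

use arrow::datatypes::{DataType, Field, Schema};
use arrow_odbc::{odbc_api::Environment, OdbcReader};

fn read_as_uint8() -> Result<(), anyhow::Error> {
    let odbc_environment = unsafe { Environment::new()? };
    let connection = odbc_environment.connect_with_connection_string(
        "Driver={ODBC Driver 17 for SQL Server};Server=localhost;UID=SA;PWD=My@Test@Password1;",
    )?;
    let cursor = connection
        .execute("SELECT tiny_int_column FROM MyTable", ())?
        .expect("SELECT statement must produce a cursor");

    // Describe the result set explicitly instead of inferring it from `cursor`.
    let schema = Arc::new(Schema::new(vec![Field::new(
        "tiny_int_column",
        DataType::UInt8,
        true,
    )]));

    // Assumed constructor: takes the cursor, the maximum batch size and the desired schema.
    let reader = OdbcReader::with_arrow_schema(cursor, 10_000, schema)?;
    for batch in reader {
        let _batch = batch?;
        // ... process batch ...
    }
    Ok(())
}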

Comments
  • Update odbc-api requirement from >= 0.40.2, < 0.43 to >= 0.40.2, < 0.45

    Updates the requirements on odbc-api to permit the latest version.

    Changelog

    Sourced from odbc-api's changelog.

    0.44.0

    • All methods on the ResultSetMetaData trait now require exclusive (&mut) references.
    • The trait BorrowMutStatement has been replaced by AsStatementRef.
    • ColumnarBulkInserter is now generic over statement ownership.
    • It is now possible to create a bulk inserter which just borrows a prepared statement, rather than taking ownership of it. Use Prepared::any_column_inserter to do so. This is useful for dynamically growing the capacity of the bound array parameters. So far users were forced to create a new ColumnarBulkInserter to do so. This is still true, but creating a new ColumnarBulkInserter no longer implies that a new prepared statement has to be created.

    0.43.0

    • BoundInputSlice, TextColumnSliceMut and BinColumnSliceMut now only track the lifetime of the statement. They no longer need to track the lifetime of the Connection.

    0.42.0

    • Removed TextColumn::set_max_len as resize_max_str_len can do everything the former can and does it better.
    • ColumnBuffer and TextRowSet can no longer be used to insert data into the database. They can now only be used to receive data. This is due to different invariants which must hold true for input and output buffers. The new ColumnarBulkInserter, which can be constructed using into_any_column_inserter or into_text_inserter on a prepared statement, closes this safety hole. It also allows for faster insertion into a database in case you transmit the data in several batches, because it does not require the user to rebind the parameter buffers in safe code. This significantly reduces the overhead for each batch.

    0.41.0

    • Refactored traits for passing parameters to Connection::execute. Should not break user syntax, though.
    • Allow using fixed size parameters like i32 directly as output parameters without wrapping them in a Nullable first.
    • Renamed TextColumnWriter::rebind to TextColumnWriter::resize_max_str.
    • Fast insertion into the database using TextRowSet without rebinding parameter buffers for individual batches.

    0.40.2

    • Introduce alias for ColumnarAnyBuffer for ColumnarBuffer<AnyColumnBuffer>.

    0.40.1

    • Introduce feature iodbc. This feature enables all features required for this crate to work with the IODBC driver manager.

      • narrow ODBC function calls
      • Use of odbc_version_3_5 symbols and declaration of ODBC API version 3.0.
      • Linking of upstream odbc-sys vs iodbc as opposed to plain odbc.
    • Introduce ColumnDescription::new to allow for instantiating ColumnDescription in test cases without caring about whether SqlChar is u8 or u16.

    0.40.0

    • The fallible allocations introduced for buffer_from_description had performance implications. Therefore the fallible allocations are now available to users of this library via try_buffer_from_description. buffer_for_description is now panicking again in case there is not enough memory available to allocate the buffers.

    0.39.1

    • Fixed an issue introduced in 0.39.1 where the terminating zero at the end of diagnostic messages was not removed correctly.

    0.39.0

    • Support for checking for truncation after bulk fetch with: RowSetCursor::fetch_with_truncation_check.

    ... (truncated)

    Commits
    • db0420e bump version to 0.44.0
    • 6b35d23 Prepared::any_column_inserter
    • a1fa0aa ensure parameters are reset then binding new params
    • 9aeaccb ColumnarBulkInserter generic over Statement
    • bc44106 AsStatement ref now has now associated types
    • 0082419 BorrowMutStatement -> AsStatementRef
    • d59527d bump version to 0.4.7
    • 2ffbd72 update changelog
    • 277beb3 BoundInputSlice, now only tracks lifetime of stmt
    • 941987b bin-column now only tracks lifetime of statement
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 2
  • Update arrow requirement from >= 25, < 28 to >= 25, < 29

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    28.0.0 (2022-11-25)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • Add iterator to RowSelection #3172 [parquet]
    • create an integration test set for parquet crate against pyspark for working with bloom filters #3167 [parquet]
    • Row Format Size Tracking #3160 [arrow]
    • Add ArrayBuilder::finish_cloned() #3154 [arrow]
    • Optimize memory usage of json reader #3150
    • Add Field::size and DataType::size #3147 [parquet] [arrow]
    • Add like_utf8_scalar_dyn kernel #3145 [arrow]
    • support comparison for decimal128 array with scalar in kernel #3140 [arrow]
    • audit and create a document for bloom filter configurations #3138 [parquet]
    • Should be the rounding vs truncation when cast decimal to smaller scale #3137 [arrow]
    • Upgrade chrono to 0.4.23 #3120
    • Implements more temporal kernels using time_fraction_dyn #3108 [arrow]
    • Upgrade to thrift 0.17 #3105 [parquet] [arrow]
    • Be able to parse time formatted strings #3100 [arrow]
    • Improve "Fail to merge schema" error messages #3095 [arrow]
    • Expose SortingColumn when reading and writing parquet metadata #3090 [parquet]
    • Change Field::metadata to HashMap #3086 [parquet] [arrow]
    • Support bloom filter reading and writing for parquet #3023 [parquet]
    • API to take back ownership of an ArrayRef #2901 [arrow]
    • Specialized Interleave Kernel #2864 [arrow]

    Fixed bugs:

    • arithmatic overflow leads to segfault in concat_batches #3123 [arrow]
    • Clippy failing on master : error: use of deprecated associated function chrono::NaiveDate::from_ymd: use from_ymd_opt() instead #3097 [parquet] [arrow]
    • Pretty print for interval types has wrong formatting #3092 [arrow]
    • Field is not serializable with binary formats #3082 [arrow]
    • Decimal Casts are Unchecked #2986 [arrow]

    Closed issues:

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 22, < 25 to >= 22, < 26

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    25.0.0 (2022-10-14)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    Fixed bugs:

    • Don't try to infer nulls in CSV schema inference #2859 [arrow]
    • parquet::arrow::arrow_writer::ArrowWriter ignores page size properties #2853 [parquet]
    • Introducing ArrowNativeTypeOp made it impossible to call kernels from generics #2839 [arrow]
    • Unsound ArrayData to Array Conversions #2834 [parquet] [arrow]
    • Regression: the trait bound for<'de> arrow::datatypes::Schema: serde::de::Deserialize<'de> is not satisfied #2825 [arrow]
    • convert string to timestamp shouldn't apply local timezone offset if there's no explicit timezone info in the string #2813 [arrow]

    Closed issues:

    • Add pub api for checking column index is sorted #2848 [parquet]

    Merged pull requests:

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 21, < 22 to >= 21, < 23

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    22.0.0 (2022-09-02)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • Add Macros to assist with static dispatch #2635 [arrow]
    • Support comparison between DictionaryArray and BooleanArray #2617 [arrow]
    • Use total_cmp for floating value ordering and remove nan_ordering feature flag #2613 [arrow]
    • Support empty projection in CSV, JSON readers #2603 [arrow]
    • Support SQL-compliant NaN ordering between for DictionaryArray and non-DictionaryArray #2599 [arrow]
    • Add dyn_cmp_dict feature flag to gate dyn comparison of dictionary arrays #2596 [arrow]
    • Add max_dyn and min_dyn for max/min for dictionary array #2584 [arrow]
    • Allow FlightSQL implementers to extend do_get() #2581 [arrow-flight]
    • Support SQL-compliant behavior on eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #2569 [arrow]
    • Add sql-compliant feature for enabling sql-compliant kernel behavior #2568
    • Calculate sum for dictionary array #2565 [arrow]
    • Add test for float nan comparison #2556 [arrow]
    • Compare dictionary with string array #2548 [arrow]
    • Compare dictionary with primitive array in lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #2538 [arrow]
    • Compare dictionary with primitive array in eq_dyn and neq_dyn #2535 [arrow]
    • UnionBuilder Create Children With Capacity #2523 [arrow]
    • Speed up like_utf8_scalar for %pat% #2519 [arrow]
    • Replace macro with TypedDictionaryArray in comparison kernels #2513 [arrow]
    • Use same codebase for boolean kernels #2507 [arrow]
    • Use u8 for Decimal Precision and Scale #2496 [arrow]
    • Integrate skip row without pageIndex in SerializedPageReader in Fuzz Test #2475 [parquet]
    • Avoid unecessary copies in Arrow IPC reader #2437 [arrow]
    • Add GenericColumnReader::skip_records Missing OffsetIndex Fallback #2433 [parquet]
    • Support Reading PageIndex with ParquetRecordBatchStream #2430 [parquet]
    • Specialize FixedLenByteArrayReader for Parquet #2318 [parquet]
    • Make JSON support Optional via Feature Flag #2300 [arrow]

    Fixed bugs:

    • Casting timestamp array to string should not ignore timezone #2607 [arrow]
    • Ilike_ut8_scalar kernals have incorrect logic #2544 [arrow]
    • Always validate the array data when creating array in IPC reader #2541 [arrow]

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 20, < 21 to >= 20, < 22

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    21.0.0 (2022-08-18)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • add into_inner method to ArrowWriter #2491 [parquet]
    • Remove byteorder dependency #2472 [parquet]
    • Return Structured ColumnCloseResult from GenericColumnWriter::close #2465 [parquet]
    • Push ChunkReader into SerializedPageReader #2463 [parquet]
    • Support SerializedPageReader::skip_page without OffsetIndex #2459 [parquet]
    • Support Time64/Time32 comparison #2457 [arrow]
    • Revise FromIterator for Decimal128Array to use Into instead of Borrow #2441 [parquet]
    • Support RowFilter within ParquetRecordBatchReader #2431 [parquet]
    • Remove the field StructBuilder::len #2429 [arrow]
    • Standardize creation and configuration of parquet --> Arrow readers ( ParquetRecordBatchReaderBuilder) #2427 [parquet]
    • Use OffsetIndex to Prune IO in ParquetRecordBatchStream #2426 [parquet]
    • Support peek_next_page and skip_next_page in InMemoryPageReader #2406 [parquet]
    • Support casting from Utf8/LargeUtf8 to Binary/LargeBinary #2402 [arrow]
    • Support casting between Decimal128 and Decimal256 arrays #2375 [arrow]
    • Combine multiple selections into the same batch size in skip_records #2358 [parquet]
    • Add API to change timezone for timestamp array #2346 [arrow]
    • Change the output of read_buffer Arrow IPC API to return Result<_> #2342 [arrow]
    • Allow skip_records in GenericColumnReader to skip across row groups #2331 [parquet]
    • Optimize the validation of Decimal256 #2320 [arrow]
    • Implement Skip for DeltaBitPackDecoder #2281 [parquet]
    • Changes to ParquetRecordBatchStream to support row filtering in DataFusion #2270 [parquet]
    • Add ArrayReader::skip_records API #2197 [parquet]

    Fixed bugs:

    • Panic in SerializedPageReader without offset index #2503 [parquet]
    • MapArray columns don't handle null values correctly #2484 [arrow]
    • There is no compiler error when using an invalid Decimal type. #2440 [arrow]
    • Flight SQL Server sends incorrect response for DoPutUpdateResult #2403 [arrow-flight]
    • AsyncFileReader No Longer Object-Safe #2372 [parquet]
    • StructBuilder Does not Verify Child Lengths #2252 [arrow]

    Closed issues:

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update odbc-api requirement from 0.46.0 to 0.47.0

    Updates the requirements on odbc-api to permit the latest version.

    Changelog

    Sourced from odbc-api's changelog.

    0.47.0

    • Support getting number of rows affected by INSERT, UPDATE or DELETE.
      • New method Preallocated::row_count
      • New method Prepared::row_count
    • TextRowSet::from_max_str_lens now takes IntoIterator<Item=usize> instead of Iterator<Item=usize>.
    • trait method RowSetBuffer::bind_to_cursor has been renamed to bind_columns_to_cursor and now takes a StatementRef instead of a Cursor. This change is unlikely to affect user code, as far as I know all downstream crates use the provided RowSetBuffer implementations and do not implement their own versions.

    0.46.0

    • Minimal support for asynchronous code in the handle module.
    • SqlResult has a new variant SqlResult::StillExecuting.
    • New function Statement::set_async_enable can be used to enable polling for statements.
    • Functions returning Option<SqlResult<_>> now return SqlResult<Option<_>> or SqlResult<bool> in order to allow for top level asynchronous code to build on the same abstraction as the synchronous code.

    Since most users would not engage with the unsafe functions of the handle module, their code should be unaffected.

    0.45.1

    • TextRowSet::for_cursor now only performs a fallible allocation if no upper bound for the string length is specified. This has been changed to remedy performance regressions, but still keeps the fallible allocation in situations where it could likely occur.

    0.45.0

    • Add Connection::into_prepared to allow for prepared queries to take ownership of the connection.
    • Prepared::describe_param now requires an exclusive reference
    • Remove Prebound.

    0.44.3

    • Add NullableSliceMut::set_cell

    0.44.2

    • Add Bit::from_bool.

    0.44.1

    • Add ColumnarBulkInserter::capacity.

    0.44.0

    • All methods on the ResultSetMetaData trait now require exclusive (&mut) references.
    • The trait BorrowMutStatement has been replaced by AsStatementRef.
    • ColumnarBulkInserter is now generic over statement ownership.
    • It is now possible to create a bulk inserter which just borrows a prepared statement, rather than taking ownership of it. Use Prepared::any_column_inserter to do so. This is useful for dynamically growing the capacity of the bound array parameters. So far users were forced to create a new ColumnarBulkInserter to do so. This is still true, but creating a new ColumnarBulkInserter no longer implies that a new prepared statement has to be created.

    0.43.0

    • BoundInputSlice, TextColumnSliceMut and BinColumnSliceMut now only track the lifetime of the statement. They no longer need to track the lifetime of the Connection.

    ... (truncated)

    Commits
    • 7635f89 bump version to 0.47.0
    • 17e03ca row count for prepared queries
    • 72c75f4 row count for preallocated
    • b5e97a5 row count in safe code
    • b45ebd8 row count for StatementImpl
    • c3d6987 add test for fetching row count
    • bed7b75 Pass sleep explicitly
    • 57c3f1f fetch for async row set cursor
    • a54f62d dedupliace error handling for fetch
    • 4890015 async fetch with truncation check
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 19, < 20 to >= 19, < 21

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    20.0.0 (2022-08-05)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • Add the constant data type constructors for ListArray #2311 [arrow]
    • Update FlightSqlService trait to pass session info along #2308 [arrow-flight]
    • Optimize take_bits for non-null indices #2306 [arrow]
    • Make FFI support optional via Feature Flag ffi #2302 [arrow]
    • Mark ffi::ArrowArray::try_new is safe #2301 [arrow]
    • Remove test_utils from default arrow-rs features #2298 [arrow]
    • Remove JsonEqual trait #2296 [arrow]
    • Move with_precision_and_scale to Decimal array traits #2291 [arrow]
    • Improve readability and maybe performance of string --> numeric/time/date/timetamp cast kernels #2285 [arrow]
    • Add vectorized unpacking for 8, 16, and 64 bit integers #2276 [parquet]
    • Use initial capacity for interner hashmap #2273 [arrow]
    • Impl FromIterator for Decimal256Array #2248 [arrow]
    • Separate ArrayReader::next_batch with ArrayReader::read_records and ArrayReader::consume_batch #2236 [parquet]
    • Rename DataType::Decimal to DataType::Decimal128 #2228 [arrow]
    • Automatically Grow Parquet BitWriter Buffer #2226 [parquet]
    • Add append_option support to Decimal128Builder and Decimal256Builder #2224 [arrow]
    • Split the FixedSizeBinaryArray and FixedSizeListArray from array_binary.rs and array_list.rs #2217 [arrow]
    • Don't Box Values in PrimitiveDictionaryBuilder #2215 [arrow]
    • Use BitChunks in equal_bits #2186 [arrow]
    • Implement Hash for Schema #2182 [arrow]
    • read decimal data type from parquet file with binary physical type #2159 [parquet]
    • The GenericStringBuilder should use GenericBinaryBuilder #2156 [arrow]
    • Update Rust version to 1.62 #2143 [parquet] [arrow] [arrow-flight]
    • Check precision and scale against maximum value when constructing Decimal128 and Decimal256 #2139 [arrow]
    • Use ArrayAccessor in Decimal128Iter and Decimal256Iter #2138 [arrow]
    • Use ArrayAccessor and FromIterator in Cast Kernels #2137 [arrow]
    • Add TypedDictionaryArray for more ergonomic interaction with DictionaryArray #2136 [arrow]
    • Use ArrayAccessor in Comparison Kernels #2135 [arrow]
    • Support peek_next_page() and skip_next_page in InMemoryColumnChunkReader #2129 [parquet]
    • Lazily materialize the null buffer builder for all array builders. #2125 [arrow]
    • Do value validation for Decimal256 #2112 [arrow]
    • Support skip_def_levels for ColumnLevelDecoder #2107 [parquet]
    • Add integration test for scan rows with selection #2106 [parquet]
    • Support for casting from Utf8/String to Time32 / Time64 #2053 [arrow]
    • Update prost and tonic related crates #2268 [arrow-flight] (carols10cents)

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 7.0.0, < 19 to >= 7.0.0, < 20

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    19.0.0 (2022-07-22)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • Use total_cmp from std #2130 [arrow]
    • Permit parallel fetching of column chunks in ParquetRecordBatchStream #2110 [parquet]
    • The GenericBinaryBuilder should use buffer builders directly. #2104 [arrow]
    • Pass generate_decimal256_case arrow integration test #2093 [arrow]
    • Rename weekday and weekday0 kernels to to num_days_from_monday and days_since_sunday #2065 [arrow]
    • Improve performance of filter_dict #2062 [arrow]
    • Improve performance of set_bits #2060 [arrow]
    • Lazily materialize the null buffer builder of BooleanBuilder #2058 [arrow]
    • BooleanArray::from_iter should omit validity buffer if all values are valid #2055 [arrow]
    • FFI_ArrowSchema should set DICTIONARY_ORDERED flag if a field's dictionary is ordered #2049 [arrow]
    • Support peek_next_page() and skip_next_page in SerializedPageReader #2043 [parquet]
    • Support FFI / C Data Interface for MapType #2037 [arrow]
    • The DecimalArrayBuilder should use FixedSizedBinaryBuilder #2026 [arrow]
    • Enable serialized_reader read specific Page by passing row ranges. #1976 [parquet]

    Fixed bugs:

    • type_id and value_offset are incorrect for sliced UnionArray #2086 [arrow]
    • Boolean take kernel does not handle null indices correctly #2057 [arrow]
    • Don't double-count nulls in write_batch_with_statistics #2046 [parquet]
    • Parquet Writer Ignores Statistics specification in WriterProperties #2014 [parquet]

    Documentation updates:

    • Improve docstrings + examples for as_primitive_array cast functions #2114 [arrow] (alamb)

    Closed issues:

    • Why does serde_json specify the preserve_order feature in arrow package #2095 [arrow]
    • Support skip_values in DictionaryDecoder #2079 [parquet]
    • Support skip_values in ColumnValueDecoderImpl #2078 [parquet]
    • Support skip_values in ByteArrayColumnValueDecoder #2072 [parquet]
    • Several Builder::append methods returning results even though they are infallible #2071
    • Improve formatting of logical plans containing subqueries #2059
    • Return reference from UnionArray::child #2035

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 7.0.0, < 18 to >= 7.0.0, < 19

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    18.0.0 (2022-07-08)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • Add DataType::Dictionary support to subtract_scalar, multiply_scalar, divide_scalar #2019 [arrow]
    • Support DictionaryArray in add_scalar kernel #2017 [arrow]
    • Enable column page index read test for all types #2010 [parquet]
    • Simplify FixedSizeBinaryBuilder #2007 [arrow]
    • Support Decimal256Builder and Decimal256Array #1999 [arrow]
    • Support DictionaryArray in unary kernel #1989 [arrow]
    • Add kernel to quickly compute comparisons on Arrays #1987 [arrow]
    • Support DictionaryArray in divide kernel #1982 [arrow]
    • Implement Into<ArrayData> for T: Array #1979 [arrow]
    • Support DictionaryArray in multiply kernel #1972 [arrow]
    • Support DictionaryArray in subtract kernel #1970 [arrow]
    • Declare DecimalArray::length as a constant #1967 [arrow]
    • Support DictionaryArray in add kernel #1950 [arrow]
    • Add builder style methods to Field #1934 [arrow]
    • Make StringDictionaryBuilder faster #1851 [arrow]
    • concat_elements_utf8 should accept arbitrary number of input arrays #1748 [arrow]

    Fixed bugs:

    • Array reader for list columns fails to decode if batches fall on row group boundaries #2025 [parquet]
    • ColumnWriterImpl::write_batch_with_statistics incorrect distinct count in statistics #2016 [parquet]
    • ColumnWriterImpl::write_batch_with_statistics can write incorrect page statistics #2015 [parquet]
    • RowFormatter is not part of the public api #2008 [parquet]
    • Infinite Loop possible in ColumnReader::read_batch For Corrupted Files #1997 [parquet]
    • PrimitiveBuilder::finish_dict does not validate dictionary offsets #1978 [arrow]
    • Incorrect n_buffers in FFI_ArrowArray #1959 [arrow]
    • DecimalArray::from_fixed_size_list_array fails when offset > 0 #1958 [arrow]
    • Incorrect (but ignored) metadata written after ColumnChunk #1946 [parquet]
    • Send + Sync impl for Allocation may not be sound unless Allocation is Send + Sync as well #1944 [arrow]
    • Disallow cast from other datatypes to NullType #1923 [arrow]

    Documentation updates:

    • The doc of FixedSizeListArray::value_length is incorrect. #1908 [arrow]

    ... (truncated)

    Commits
    • 330505c Prepare for 18.0.0 release: Update CHANGELOG and versions (#2023)
    • 5a76697 Fix record delimiting on row group boundaries (#2025) (#2027)
    • ca1bfb8 Add Decimal256Builder and Decimal256Array (#2000)
    • d70c2ea Consistent case in Index enumeration (#2029)
    • 373ac81 Add page index reader test for all types and support empty index. (#2012)
    • 9333a85 Don't generate empty google.protobuf.rs (#2005)
    • 58848c4 Fix parquet test_common feature flags (#2003)
    • d9dbf72 Fix several bugs in parquet writer statistics generation, add `EnabledStatist...
    • 7542f7d Add builder style APIs For Field: with_name, with_data_type and `with_n...
    • ef02bf8 Support dictionary array for divide kernel (#1983)
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • Update arrow requirement from >= 7.0.0, < 17 to >= 7.0.0, < 18

    Updates the requirements on arrow to permit the latest version.

    Changelog

    Sourced from arrow's changelog.

    17.0.0 (2022-06-24)

    Full Changelog

    Breaking changes:

    Implemented enhancements:

    • add a small doc example showing ArrowWriter being used with a cursor #1927 [parquet]
    • Support cast to/from NULL and DataType::Decimal #1921 [arrow]
    • Add Decimal256 API #1913 [arrow]
    • Add DictionaryArray::key function #1911 [arrow]
    • Support specifying capacities for ListArrays in MutableArrayData #1884 [arrow]
    • Explicitly declare the features used for each dependency #1876 [parquet] [arrow] [arrow-flight]
    • Add Decimal128 API and use it in DecimalArray and DecimalBuilder #1870 [arrow]
    • PrimitiveArray::from_iter should omit validity buffer if all values are valid #1856 [arrow]
    • Add from(v: Vec<Option<&[u8]>>) and from(v: Vec<&[u8]>) for FixedSizedBInaryArray #1852 [arrow]
    • Add Vec-inspired APIs to BufferBuilder #1850 [arrow]
    • PyArrow intergation test for C Stream Interface #1847 [arrow]
    • Add nilike support in comparison #1845 [arrow]
    • Split up arrow::array::builder module #1843 [arrow]
    • Add quarter support in temporal kernels #1835 [arrow]
    • Rename ArrayData::validate_dictionary_offset to ArrayData::validate_values #1812 [arrow]
    • Clean up the testing code for substring kernel #1801 [arrow]
    • Speed up substring_by_char kernel #1800 [arrow]

    Fixed bugs:

    • unable to write parquet file with UTC timestamp #1932 [parquet]
    • Incorrect max and min decimals #1916 [arrow]
    • dynamic_types example does not print the projection #1902 [arrow]
    • log2(0) panicked at 'attempt to subtract with overflow', parquet/src/util/bit_util.rs:148:5 #1901 [parquet]
    • Final slicing in combine_option_bitmap needs to use bit slices #1899 [arrow]
    • Dictionary IPC writer writes incorrect schema #1892 [arrow]
    • Creating a RecordBatch with null values in non-nullable fields does not cause an error #1888 [arrow]
    • Upgrade regex dependency #1874 [arrow]
    • Miri reports leaks in ffi tests #1872 [arrow]
    • AVX512 + simd binary and/or kernels slower than autovectorized version #1829 [arrow]

    Documentation updates:

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Update half requirement from 1.8.2 to 2.0.0

    Updates the requirements on half to permit the latest version.

    Release notes

    Sourced from half's releases.

    2.0.0

    Changed

    • Breaking Change Minimum supported Rust version is now 1.58.
    • Breaking Change std is now enabled as a default cargo feature. Disable default features to continue using no_std support.
    • Migrated to Rust Edition 2021.
    • Added #[must_use] attributes to functions, as appropriate.

    Fixed

    • Fix a soundness bug with slice::as_ptr not correctly using mutable reference. By @Nilstrieb.

    Added

    • Added const conversion methods to both f16 and bf16. These methods never use hardware intrinsics, unlike the current conversion methods, which is why they are separated into new methods. The following const methods were added:
      • from_f32_const
      • from_f64_const
      • to_f32_const
      • to_f64_const
    • Added Neg trait support for borrowed values &f16 and &bf16. By @pthariensflame.
    • Added AsPrimitive implementations from and to self, usize, and isize. By @kali.

    Removed

    • Breaking Change The deprecated serialize cargo feature has been removed. Use serde cargo feature instead.
    • Breaking Change The deprecated consts module has been removed. Use associated constants on f16 instead.
    • Breaking Change The following deprecated functions have been removed:
      • f16::as_bits
      • slice::from_bits_mut
      • slice::to_bits_mut
      • slice::from_bits
      • slice::to_bits
      • vec::from_bits
      • vec::to_bits
    Changelog

    Sourced from half's changelog.

    [2.0.0] - 2022-06-21

    Changed

    • Breaking Change Minimum supported Rust version is now 1.58.
    • Breaking Change std is now enabled as a default cargo feature. Disable default features to continue using no_std support.
    • Migrated to Rust Edition 2021.
    • Added #[must_use] attributes to functions, as appropriate.

    Fixed

    • Fix a soundness bug with slice::as_ptr not correctly using mutable reference. By [@Nilstrieb].

    Added

    • Added const conversion methods to both f16 and bf16. These methods never use hardware intrinsics, unlike the current conversion methods, which is why they are separated into new methods. The following const methods were added:
      • from_f32_const
      • from_f64_const
      • to_f32_const
      • to_f64_const
    • Added Neg trait support for borrowed values &f16 and &bf16. By [@pthariensflame].
    • Added AsPrimitive implementations from and to self, usize, and isize. By [@kali].

    Removed

    • Breaking Change The deprecated serialize cargo feature has been removed. Use serde cargo feature instead.
    • Breaking Change The deprecated consts module has been removed. Use associated constants on f16 instead.
    • Breaking Change The following deprecated functions have been removed:
      • f16::as_bits
      • slice::from_bits_mut
      • slice::to_bits_mut
      • slice::from_bits
      • slice::to_bits
      • vec::from_bits
      • vec::to_bits

    [1.8.2] - 2021-10-22

    Fixed

    • Remove cargo resolver=2 from manifest to resolve errors in older versions of Rust that still worked with 1.8.0. Going forward, MSRV increases will be major version increases. Fixes #48.

    [1.8.1] - 2021-10-21 - Yanked

    Yanked

    Not recommended due to introducing compilation error in Rust versions that worked with 1.8.0.

    Changed

    • Now uses cargo resolver version 2 to prevent dev-dependencies from enabling std feature on optional dependencies.

    Fixed

    • Fixed compile failure when std feature is not enabled and num-traits is enabled under new

    ... (truncated)

    Commits
    • 7ae587c Version bump for release
    • 2a463c3 Add miri to CI
    • 36c1e3f Fix exclusive reference obtained from shared reference
    • 4e4ef87 update changelog with PR changes
    • 723ca07 add AsPrimitive impls: from and to self, from and to usize and isize
    • 0389edf Add impl Neg for references
    • f8a4b83 bump rustc min to 1.58
    • 005ad4e allow uninit_vec clippy lint for now
    • ce0279c allow uninit_vec clippy lint for now
    • 37ed710 added #[must_use] attributes
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
Releases: v0.8.1
Owner: Markus Klein (Data Engineering)