Provides multiple-dtype columnar storage, known as DataFrame in pandas/R

Overview

brassfibre

Provides multiple-dtype columnar storage, known as DataFrame in pandas/R.

Series

Single-dtype 1-dimensional vector with labels (index).

Creation

extern crate brassfibre;
use brassfibre::prelude::*;

let values: Vec<i64> = vec![1, 2, 3, 4, 3];
let index: Vec<i64> = vec![10, 20, 30, 40, 50];
let s = Series::<i64, i64>::new(values, index);

println!("{:?}", &s);
// 10 1
// 20 2
// 30 3
// 40 4
// 50 3

Selection

println!("{:?}", &s.locs(&vec![10, 40, 50]));
// 10 1
// 40 4
// 50 3

println!("{:?}", &s.ilocs(&vec![2, 3, 4]));
// 30 3
// 40 4
// 50 3
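The difference between label-based selection (locs) and position-based selection (ilocs) can be sketched with plain slices. This is only an illustration using the standard library, not brassfibre's implementation; the function names select_by_labels and select_by_positions are hypothetical, and skipping unknown labels is an assumption of this sketch.

```rust
/// Select (label, value) pairs by index labels, in the order requested.
/// Labels that are missing from `index` are skipped in this sketch.
fn select_by_labels(index: &[i64], values: &[i64], labels: &[i64]) -> Vec<(i64, i64)> {
    labels
        .iter()
        .filter_map(|label| {
            index
                .iter()
                .position(|i| i == label) // first position whose label matches
                .map(|pos| (*label, values[pos]))
        })
        .collect()
}

/// Select (label, value) pairs by zero-based positions.
/// Panics if a position is out of bounds.
fn select_by_positions(index: &[i64], values: &[i64], positions: &[usize]) -> Vec<(i64, i64)> {
    positions.iter().map(|&p| (index[p], values[p])).collect()
}
```

With the index and values from the Creation example, selecting labels 10, 40, 50 and selecting positions 2, 3, 4 give the same rows as the locs and ilocs outputs above.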

Calculation

println!("{:?}", &(&s + 1));
// 10 2
// 20 3
// 30 4
// 40 5
// 50 4

println!("{:?}", &s.sum());
// 13

println!("{:?}", &s.describe());
// count                 5
//  mean               2.6
//   std 1.019803902718557
//   min                 1
//   max                 4

println!("{:?}", s.value_counts());
// 3 2
// 2 1
// 1 1
// 4 1
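The std value in the describe output above is consistent with a population standard deviation (dividing by n rather than n - 1); the sample version would be about 1.14 for the same data. A minimal standard-library sketch of that calculation follows. The helper names mean and population_std are hypothetical and are not part of brassfibre.

```rust
/// Mean of a slice of integers as f64. Returns NaN for an empty slice.
fn mean(values: &[i64]) -> f64 {
    values.iter().sum::<i64>() as f64 / values.len() as f64
}

/// Population standard deviation: the squared deviations are divided by n, not n - 1.
fn population_std(values: &[i64]) -> f64 {
    let m = mean(values);
    let variance = values
        .iter()
        .map(|&v| (v as f64 - m).powi(2))
        .sum::<f64>()
        / values.len() as f64;
    variance.sqrt()
}
```

For the values 1, 2, 3, 4, 3 the squared deviations from the mean 2.6 add up to 5.2, so the variance is 1.04 and its square root is the 1.0198... shown above.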

Group By

let sg = s.groupby(vec![1, 1, 1, 2, 2]);

println!("{:?}", sg.get_group(&1));
// 10 1
// 20 2
// 30 3

println!("{:?}", sg.sum());
// 1 6
// 2 7
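What the grouped sum computes can be sketched with a BTreeMap from the standard library: each value is added to the running total of its group key. This is an illustration only, not how brassfibre implements groupby; group_sum is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Sum `values` per group key; `keys[i]` is the group that `values[i]` belongs to.
/// The result is ordered by key because BTreeMap keeps its keys sorted.
fn group_sum(values: &[i64], keys: &[i64]) -> BTreeMap<i64, i64> {
    assert_eq!(values.len(), keys.len(), "one key per value is required");
    let mut sums = BTreeMap::new();
    for (&k, &v) in keys.iter().zip(values) {
        *sums.entry(k).or_insert(0) += v;
    }
    sums
}
```

With values 1, 2, 3, 4, 3 and keys 1, 1, 1, 2, 2 this yields 6 for group 1 and 7 for group 2, matching the sg.sum() output above.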

DataFrame

Multiple-dtype 2-dimensional table with labels (index and columns).

Creation

#[macro_use]
extern crate brassfibre;
use brassfibre::prelude::*;

let values = vec![array![1, 2, 3, 4, 5],
                  array![6.1, 7.1, 8.1, 9.1, 10.1],
                  array![11, 12, 13, 14, 15]];
let df = DataFrame::from_vec(values,
                             vec![10, 20, 30, 40, 50],
                             vec!["X", "YYY", "ZZ"]);
println!("{:?}", &df);
//    X  YYY ZZ
// 10 1  6.1 11
// 20 2  7.1 12
// 30 3  8.1 13
// 40 4  9.1 14
// 50 5 10.1 15
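One common way to hold columns of different dtypes in a single container is an enum with one variant per dtype. The sketch below shows that idea with the standard library only; it is an assumption for illustration and does not describe brassfibre's internal layout. Column, Frame, and shape are hypothetical names.

```rust
/// A column holding values of exactly one dtype.
#[derive(Debug, PartialEq)]
enum Column {
    Int(Vec<i64>),
    Float(Vec<f64>),
}

impl Column {
    fn len(&self) -> usize {
        match self {
            Column::Int(v) => v.len(),
            Column::Float(v) => v.len(),
        }
    }
}

/// A toy frame: row labels, column labels, and one `Column` per column label.
#[derive(Debug)]
struct Frame {
    index: Vec<i64>,
    columns: Vec<&'static str>,
    data: Vec<Column>,
}

impl Frame {
    /// Build a frame, checking that every column has one value per index label.
    fn new(data: Vec<Column>, index: Vec<i64>, columns: Vec<&'static str>) -> Result<Frame, String> {
        if data.len() != columns.len() {
            return Err(format!("{} columns but {} column labels", data.len(), columns.len()));
        }
        if let Some(bad) = data.iter().position(|c| c.len() != index.len()) {
            return Err(format!("column {} has the wrong length", columns[bad]));
        }
        Ok(Frame { index, columns, data })
    }

    /// (number of rows, number of columns)
    fn shape(&self) -> (usize, usize) {
        (self.index.len(), self.columns.len())
    }
}
```

Storing data column by column keeps each dtype in its own contiguous Vec, which is the sense in which the storage is columnar.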

Selection

by multiple index labels

println!("{:?}", &df.locs(&vec![20, 30, 40]));
//    X YYY ZZ
// 20 2 7.1 12
// 30 3 8.1 13
// 40 4 9.1 14

by multiple index locations

println!("{:?}", &df.ilocs(&vec![0, 2, 1]));
//    X YYY ZZ
// 10 1 6.1 11
// 30 3 8.1 13
// 20 2 7.1 12

Group By

get group

let dg = df.groupby(vec!["A", "A", "B", "A", "B"]);
println!("{:?}", &dg.get_group(&"A"));
//    X YYY ZZ
// 10 1 6.1 11
// 20 2 7.1 12
// 40 4 9.1 14

Reshaping

inner join

let values2 = vec![array![1.1, 2.1, 3.1],
                   array![6, 7, 8]];
let df2 = DataFrame::from_vec(values2,
                              vec![20, 30, 40],
                              vec!["X2", "Y2"]);
let j = df.join_inner(&df2);
println!("{:?}", &j);
//    X YYY ZZ  X2 Y2
// 20 2 7.1 12 1.1  6
// 30 3 8.1 13 2.1  7
// 40 4 9.1 14 3.1  8
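The row matching behind an inner join on index labels can be sketched as follows: keep only the labels present in both indexes and record where each one sits on either side. This is a standard-library illustration under the assumption that labels are unique; inner_join_positions is a hypothetical helper, not a brassfibre function.

```rust
/// Inner join on row labels: keep labels present in both indexes, in the order
/// they appear in `left`, paired with their positions on each side.
/// Assumes labels are unique within each index.
fn inner_join_positions(left: &[i64], right: &[i64]) -> Vec<(i64, usize, usize)> {
    left.iter()
        .enumerate()
        .filter_map(|(li, label)| {
            right
                .iter()
                .position(|r| r == label) // linear scan; fine for small indexes
                .map(|ri| (*label, li, ri))
        })
        .collect()
}
```

For the indexes of df (10, 20, 30, 40, 50) and df2 (20, 30, 40), only labels 20, 30, and 40 survive, which is why the joined output above has three rows. For large indexes a HashMap from label to position would avoid the repeated scan.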
Comments
  • Fixes some issues found by clippy

    Fixes some issues found by clippy. The only change in the usage of the library is that group_by now takes a slice instead of a Vec. I think this is a good change, since there is no need for a Vec here.

    opened by JordiPolo 1
  • Updates of other crates and fixes

    • Fixes warnings (unused parameters and variables)
    • Updates dependencies and related code. Most notably csv to 1.0, which no longer uses rustc-serialize. That led me to update rust-nullvec as well.

    This will not compile until https://github.com/sinhrks/rust-nullvec/pull/6 is merged and nullvec 0.2.0 is released.

    opened by JordiPolo 1
  • A better way to access individual elements?

    Currently, to get a single element out of a DataFrame, I use a macro like the following:

    macro_rules! get_df_value {
        ($df: ident, $what: expr, $x: expr) => {
            {
                let x = $df.gets(&[String::from($what)]);
                x.values[0].iloc(&$x).into()
            }
        }
    }
    

    Unfortunately, that isn't the full story, since type inference doesn't always work with this, so sometimes, I need to additionally do the following:

    let t1: f64 = get_df_value!(data, "col1", row);
    let t2: f64 = get_df_value!(data, "col2", row);
    println!("{}", (t1 - t2).abs());
    

    This is a lot to just access and use single values, which I would assume is a very common action. At least, it is very common for me.

    Best, Rob

    opened by rennis250 2
Releases(v0.1.0)
Owner
Sinhrks