Provides multiple-dtype columner storage, known as DataFrame in pandas/R

Overview

brassfibre

Provides multiple-dtype columner storage, known as DataFrame in pandas/R.

Release Documentation Build Status

Series

Single-dtype 1-dimentional vector with label (index).

Creation

extern crate brassfibre;
use brassfibre::prelude::*;

let values: Vec<i64> = vec![1, 2, 3, 4, 3];
let index: Vec<i64> = vec![10, 20, 30, 40, 50];
let s = Series::<i64, i64>::new(values, index);

println!("{:?}", &s);
// 10 1
// 20 2
// 30 3
// 40 4
// 50 3

Selection

println!("{:?}", &s.locs(&vec![10, 40, 50]));
// 10 1
// 40 4
// 50 3

println!("{:?}", &s.ilocs(&vec![2, 3, 4]));
// 30 3
// 40 4
// 50 3

Calculation

println!("{:?}", &(&b + 5));
// 10 2
// 20 3
// 30 4
// 40 5
// 50 4

println!("{:?}", &s.sum());
// 13

println!("{:?}", &s.describe());
// count                 5
//  mean               2.6
//   std 1.019803902718557
//   min                 1
//   max                 4

println!("{:?}", s.value_counts());
// 3 2
// 2 1
// 1 1
// 4 1

Group By

let sg = s.groupby(vec![1, 1, 1, 2, 2]);

println!("{:?}", sg.get_group(&1));
// 10 1
// 20 2
// 30 3

println!("{:?}", sg.sum());
// 1 6
// 2 7

DataFrame

Multiple-dtype 2-dimentional vector with labels (index and columns).

Creation

#[macro_use]
extern crate brassfibre;
use brassfibre::prelude::*;

let values = vec![array![1, 2, 3, 4, 5],
                  array![6.1, 7.1, 8.1, 9.1, 10.1],
                  array![11, 12, 13, 14, 15]];
let df = DataFrame::from_vec(values,
                             vec![10, 20, 30, 40, 50],
                             vec!["X", "YYY", "ZZ"]);
println!("{:?}", &df);
//    X  YYY ZZ
// 10 1  6.1 11
// 20 2  7.1 12
// 30 3  8.1 13
// 40 4  9.1 14
// 50 5 10.1 15

Selection

by multiple index labels

println!("{:?}", &df.locs(&vec![20, 30, 40]));
//    X YYY ZZ
// 20 2 7.1 12
// 30 3 8.1 13
// 40 4 9.1 14

by multiple index locations

println!("{:?}", &df.ilocs(&vec![0, 2, 1]));
//    X YYY ZZ
// 10 1 6.1 11
// 30 3 8.1 13
// 20 2 7.1 12

Group By

get group

let dg = df.groupby(vec!["A", "A", "B", "A", "B"]);
println!("{:?}", &dg.get_group(&"A"));
//    X YYY ZZ
// 10 1 6.1 11
// 20 2 7.1 12
// 40 4 9.1 14

Reshaping

inner join

let values2 = vec![array![1.1, 2.1, 3.1],
                   array![6, 7, 8]];
let df2 = DataFrame::from_vec(values2,
                              vec![20, 30, 40],
                              vec!["X2", "Y2"]);
let j = df.join_inner(&df2);
println!("{:?}", &j);
//    X YYY ZZ  X2 Y2
// 20 2 7.1 12 1.1  6
// 30 3 8.1 13 2.1  7
// 40 4 9.1 14 3.1  8
You might also like...
Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. 🚀

flaco Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. 🚀 Have a gander at the initial benchmarks

Network Block Storage server, written in Rust. Supports pluggable and chainable underlying storage

nbd-rs Disclaimer DO NEVER USE THIS FOR PRODUCTION Do not use this for any data that you cannot afford to lose any moment. Expect data loss, corruptio

Plugin for macro-, mini-quad (quads) to save data in simple local storage using Web Storage API in WASM and local file on a native platforms.

quad-storage This is the crate to save data in persistent local storage in miniquad/macroquad environment. In WASM the data persists even if tab or br

Dataframe structure and operations in Rust

Utah Utah is a Rust crate backed by ndarray for type-conscious, tabular data manipulation with an expressive, functional interface. Note: This crate w

Rust DataFrame library

Polars Blazingly fast DataFrames in Rust & Python Polars is a blazingly fast DataFrames library implemented in Rust. Its memory model uses Apache Arro

A Rust DataFrame implementation, built on Apache Arrow

Rust DataFrame A dataframe implementation in Rust, powered by Apache Arrow. What is a dataframe? A dataframe is a 2-dimensional tabular data structure

DataFrame / Series data processing in Rust

black-jack While PRs are welcome, the approach taken only allows for concrete types (String, f64, i64, ...) I'm not sure this is the way to go. I want

DataFrame & its adaptors

Fabrix Fabrix is a lib crate, who uses Polars Series and DataFrame as fundamental data structures, and is capable to communicate among different data

A dataframe manipulation tool inspired by dplyr and powered by polars.

dply is a command line tool for viewing, querying, and writing csv and parquet files, inspired by dplyr and powered by polars. Usage overview A dply p

Peakrs Dataframe is a library and framework facilitates the extraction, transformation, and loading (ETL) of data.

Peakrs Dataframe Peakrs Dataframe is a library and framework facilitates the extraction, transformation, and loading (ETL) of data. Its first applicat

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values. Authors: Sanjay Ghem

archive-rs provides a generic way of dealing with multiple archive and compression formats in Rust

archive-rs A Rust crate that aims to provide a generic way of dealing with multiple archive and compression formats by providing a generic abstraction

Rust read/write support for well-known text (WKT)

wkt Rust read/write support for well-known text (WKT). License Licensed under either of Apache License, Version 2.0 (LICENSE-APACHE or http://www.apac

Rust read/write support for well-known text (WKT)

wkt Rust read/write support for well-known text (WKT). License Licensed under either of Apache License, Version 2.0 (LICENSE-APACHE or http://www.apac

Faster division by constants that aren't known at compile-time

Baseline implementation of division by constants When dividing integers by compile-time constants, compilers (LLVM) can be trusted to convert those to

A list of known SS58 account types as an enum.

A list of known SS58 account types as an enum.

Unofficial Bitwarden compatible server written in Rust, formerly known as bitwarden_rs

Alternative implementation of the Bitwarden server API written in Rust and compatible with upstream Bitwarden clients*, perfect for self-hosted deploy

Unofficial Bitwarden compatible server written in Rust, formerly known as bitwarden_rs

Alternative implementation of the Bitwarden server API written in Rust and compatible with upstream Bitwarden clients*, perfect for self-hosted deploy

A typemap for a set of known types optionally without heap allocation, and supporting iterating by traits

fixed_typemap docs.rs GitHub Sponsors Implements typemaps that support a lot of extra funcctionality using procedural macros. docs.rs has a lot more t

Comments
  • Fixes some issues found by clippy

    Fixes some issues found by clippy

    Fixes some issues found by clippy. The only changes in the usage of the library is that group_by now uses a slice instead of a Vec. I think this is a good change, no need for a Vec here

    opened by JordiPolo 1
  • Updates of other crates and fixes

    Updates of other crates and fixes

    • Fixes warnings (unused parameters and variables)
    • Updates dependencies and related code. Most notably csv to 1.0 which does not use rusc-serialize anymore. And that brought me to update rust-nullvec also.

    This will not compile till this is merged https://github.com/sinhrks/rust-nullvec/pull/6 and nullvec 0.2.0 is released

    opened by JordiPolo 1
  • A better way to access individual elements?

    A better way to access individual elements?

    Currently, to get a single element out of a Dataframe, I use a macro like the following:

    macro_rules! get_df_value {
        ($df: ident, $what: expr, $x: expr) => {
            {
                let x = $df.gets(&[String::from($what)]);
                x.values[0].iloc(&$x).into()
            }
        }
    }
    

    Unfortunately, that isn't the full story, since type inference doesn't always work with this, so sometimes, I need to additionally do the following:

    let t1: f64 = get_df_value!(data, "col1", row);
    let t2: f64 = get_df_value!(data, "col2", row);
    println!("{}", (t1 - t2).abs());
    

    This is a lot to just access and use single values, which I would assume is a very common action. At least, it is very common for me.

    Best, Rob

    opened by rennis250 2
Releases(v0.1.0)
Owner
Sinhrks
Sinhrks
Dataframe structure and operations in Rust

Utah Utah is a Rust crate backed by ndarray for type-conscious, tabular data manipulation with an expressive, functional interface. Note: This crate w

Suchin 139 Sep 26, 2022
Rust DataFrame library

Polars Blazingly fast DataFrames in Rust & Python Polars is a blazingly fast DataFrames library implemented in Rust. Its memory model uses Apache Arro

Ritchie Vink 11.9k Jan 8, 2023
A Rust DataFrame implementation, built on Apache Arrow

Rust DataFrame A dataframe implementation in Rust, powered by Apache Arrow. What is a dataframe? A dataframe is a 2-dimensional tabular data structure

Wakahisa 287 Nov 11, 2022
DataFrame / Series data processing in Rust

black-jack While PRs are welcome, the approach taken only allows for concrete types (String, f64, i64, ...) I'm not sure this is the way to go. I want

Miles Granger 30 Dec 10, 2022
DataFrame & its adaptors

Fabrix Fabrix is a lib crate, who uses Polars Series and DataFrame as fundamental data structures, and is capable to communicate among different data

Jacob Xie 18 Dec 12, 2022
A dataframe manipulation tool inspired by dplyr and powered by polars.

dply is a command line tool for viewing, querying, and writing csv and parquet files, inspired by dplyr and powered by polars. Usage overview A dply p

null 14 May 29, 2023
Cloud native log storage and management for Kubernetes, containerised workloads

Live Demo | Website | API Workspace on Postman Parseable is an open source, cloud native, log storage and management platform. Parseable helps you ing

Parseable, Inc. 715 Jan 1, 2023
Provides a way to use enums to describe and execute ordered data pipelines. 🦀🐾

enum_pipline Provides a way to use enums to describe and execute ordered data pipelines. ?? ?? I needed a succinct way to describe 2d pixel map operat

Ben Greenier 0 Oct 29, 2021
This library provides a data view for reading and writing data in a byte array.

Docs This library provides a data view for reading and writing data in a byte array. This library requires feature(generic_const_exprs) to be enabled.

null 2 Nov 2, 2022
Rustic - a backup tool that provides fast, encrypted, deduplicated backups

Rustic is a backup tool that provides fast, encrypted, deduplicated backups. It can read the restic repo format desribed in the design document and writes a compatible repo format which can also be read by restic.

null 266 Jan 2, 2023