rsv is a command line tool to deal with small and big CSV, TXT, EXCEL files (especially >10G)

Overview

csv, excel toolkit written in Rust

rsv is a command-line tool for working with small and large CSV, TXT, and EXCEL files (especially files over 10 GB). rsv has the following features:

  • written in Rust
  • fast and parallel data processing (based on Rayon)
  • real-time progress bar
  • simple usage
  • supports command pipelines

Usage

Download rsv.exe from the release tab and add its directory to the system PATH.

Available commands

  • head - Show the first n lines of a CSV, TXT, or EXCEL file.
  • headers - Show file headers.
  • count - Count the number of lines in a file 🏃.
  • estimate - Quickly estimate the number of lines.
  • clean - Clean file with escape char (e.g. ") or other strings 🏃.
  • frequency - Show frequency table for column(s) 🏃 ⭐.
  • split - Split file into separate files sequentially or based on column value 🏃 ⭐.
  • select - Select rows and columns by filter 🏃.
  • flatten - Prints flattened records to view records one by one.
  • slice - Prints a slice of rows from file.
  • search - Search with regexes 🏃 ⭐.
  • sort - In-memory data sorting, support for at most two columns ⭐ .
  • stats - Statistics for column(s), including min, max, mean, unique, null 🏃 ⭐.
  • excel2csv - Convert excel to csv.
  • to - Save command output data to disk, can be one of TXT, CSV, TSV, XLSX or XLS.
  • table - Format data as an aligned table.

Tips 1:

  • 🏃 means the command shows a real-time progress bar.
  • ⭐ means the command supports parallel data processing.

Tips 2:

All commands, except "clean" and "excel2csv", are allowed to be chained.

Tips 3:

You can always check the usage of any command with rsv command --help or rsv command -h, for example, rsv frequency --help.

Basic Usage

  • rsv head
rsv head data.csv                   # print as the file is
rsv head --tabled data.csv          # tabled
rsv head -t data.csv                # tabled too
rsv head -s \t data.csv             # CSV file with a tab separator
rsv head data.xlsx                  # EXCEL file
rsv head --help                     # help info on all flags
  • rsv headers
rsv headers data.csv                # separator "," (default)
rsv headers -s \t data.csv          # separator tab
rsv headers data.xlsx               # EXCEL file
rsv headers --help                  # help info on all flags
  • rsv count
rsv count data.csv                  # plain-text file
rsv count data.xlsx                 # EXCEL file
rsv count --no-header data.csv
rsv count --help                    # help info on all flags
  • rsv estimate
rsv estimate data.csv
rsv estimate data.xlsx
rsv estimate --help                 # help info on all flags
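
The text above does not say how rsv estimates the line count. One common way to get a fast estimate is to read only a sample from the start of the file and scale the newline density up to the full file size. The Python sketch below illustrates that idea only; `estimate_line_count` and `sample_bytes` are hypothetical names, and this is an assumption about the general technique, not rsv's implementation.

```python
import os


def estimate_line_count(path, sample_bytes=1 << 20):
    """Estimate the number of lines from a sample at the start of the file.

    Assumption: line lengths in the sample are representative of the
    whole file. This is NOT taken from rsv's source code.
    """
    size = os.path.getsize(path)
    if size == 0:
        return 0
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    newlines = sample.count(b"\n")
    if len(sample) == size:
        # The whole file fit in the sample, so the count is exact.
        return newlines + (0 if sample.endswith(b"\n") else 1)
    if newlines == 0:
        # No line break seen in the sample: report at least one line.
        return 1
    # Scale the newline density of the sample to the full file size.
    return round(size * newlines / len(sample))
```

The estimate is only as good as the sample: files whose early rows are much shorter or longer than the rest will be over- or under-counted.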
  • rsv clean
rsv clean data.csv                               # default to clean escape char "
rsv clean -e \"content-to-delete\" data.csv      # escape is a str, clean str to empty
rsv clean -o new-file.csv data.csv               # save to new-file.csv, the default is data-cleaned.csv
rsv clean --help                                 # help info on all flags
  • rsv frequency
rsv frequency -c 0 data.csv              # default to the first column, descending order
rsv frequency -c 0 data.xlsx             # EXCEL file
rsv frequency -c 0,1,2,5 data.csv        # columns 0, 1, 2, and 5
rsv frequency -c 0-2,5 data.csv          # same as above
rsv frequency -c 0-2 --export data.csv   # export result to data-frequency.csv
rsv frequency -n 10 data.csv             # keep top 10 frequent items
rsv frequency -a 10 data.csv             # in ascending order
rsv frequency --help                     # help info on all flags

column selection syntax:
-c 0,1,2,5   -->    cols [0,1,2,5]
-c 0-2,5     -->    same as cols [0,1,2,5]
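
The column selection syntax above reads as a comma-separated list of zero-based indices and inclusive ranges. The Python sketch below illustrates that reading and uses it to build a frequency table. `parse_columns` and `frequency` are hypothetical helpers written for this illustration, not part of rsv, and the sketch assumes ranges are inclusive, as the `0-2,5` example suggests.

```python
from collections import Counter


def parse_columns(spec):
    """Expand a spec such as "0-2,5" into [0, 1, 2, 5]."""
    cols = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = part.split("-", 1)
            cols.extend(range(int(start), int(end) + 1))  # inclusive range
        else:
            cols.append(int(part))
    return cols


def frequency(rows, spec, top_n=None):
    """Count combinations of the selected columns, most frequent first."""
    cols = parse_columns(spec)
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    return counts.most_common(top_n)


rows = [["a", "x"], ["b", "y"], ["a", "x"]]
print(parse_columns("0-2,5"))
print(frequency(rows, "0-1"))
```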
  • rsv split
rsv split data.csv                # default to first column and field separator of ,
rsv split data.xlsx               # EXCEL file
rsv split -s \t data.csv          # tab separator
rsv split -c 1 data.csv           # split based on second column
rsv split -c 0 -s \t data.csv     # first column, \t separator
rsv split --size 1000 data.xlsx   # Sequential split, 1000 records in a file.
rsv split --help                  # help info on all flags
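
The two split modes described above (by column value and sequentially by size) can be sketched in Python as follows. `split_by_column` and `split_by_size` are hypothetical names used only for this illustration. The sketch works on rows already held in memory and ignores header handling and file output, which the text above does not specify.

```python
from collections import defaultdict
from itertools import islice


def split_by_column(rows, col=0):
    """Group rows by the value found in column `col`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return dict(groups)


def split_by_size(rows, size):
    """Yield consecutive chunks of at most `size` rows."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


rows = [["a", "1"], ["b", "2"], ["a", "3"]]
print(split_by_column(rows, 0))
print(list(split_by_size(rows, 2)))
```

Each group or chunk would then be written to its own output file.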
  • rsv select
rsv select -f 0=a,b,c data.csv          # first column has values of a, b, or c
rsv select -f 0=a,b,c data.xlsx         # EXCEL file, sheet can be specified with the --sheet flag
rsv select -f "0N>10&1=c" data.csv      # first column > 10 numerically, AND the second column equals c
rsv select -f 0!= --export data.csv     # export result, in which the first column is non-empty
rsv select --help                       # help info on other options

Filter syntax, supporting =, !=, >, >=, <, <= and &:
-f 0=a,b,c           -->  first column is a, b, or c
-f 0N=1,2            -->  first column numerically equals 1 or 2
-f 0!=               -->  first column is not empty
-f "0>=2022-01-21"   -->  first column is greater than or equal to 2022-01-21, lexicographically
-f "0N>10"           -->  first column > 10 numerically
-f "0N>10&2=pattern" -->  first column > 10 numerically, AND the third column equals <pattern>

NOTE: 1. only & (AND) operation is supported, | (OR) operation is not supported.
      2. The filter option can be omitted to select all rows.

column selection syntax:
-c 0,1,2,5   -->    cols [0,1,2,5]
-c 0-2,5     -->    same as cols [0,1,2,5]
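
To make the filter syntax concrete, here is a minimal Python sketch of how such expressions could be parsed and applied to a row. It is an illustration, not rsv's implementation. `parse_filter` and `row_passes` are hypothetical names, and the sketch makes several assumptions: `!=` with several values means "none of them", ordering comparisons use only the first value, and a cell that is not a number fails any numeric (`N`) clause.

```python
import re

# <column index><optional N><operator><comma-separated values>
_CLAUSE = re.compile(r"^(\d+)(N?)(>=|<=|!=|=|>|<)(.*)$")


def parse_filter(expr):
    """Split an expression such as "0N>10&1=c" into clause tuples."""
    clauses = []
    for part in expr.split("&"):
        m = _CLAUSE.match(part.strip())
        if m is None:
            raise ValueError(f"cannot parse filter clause: {part!r}")
        col, numeric, op, raw = m.groups()
        values = raw.split(",") if raw else [""]  # "0!=" compares with ""
        clauses.append((int(col), bool(numeric), op, values))
    return clauses


def _matches(cell, numeric, op, values):
    if numeric:
        try:
            left = float(cell)
            rights = [float(v) for v in values]
        except ValueError:
            return False  # assumption: non-numeric data never matches
    else:
        left, rights = cell, values  # lexicographic comparison
    if op == "=":
        return left in rights
    if op == "!=":
        return left not in rights
    right = rights[0]
    if op == ">":
        return left > right
    if op == ">=":
        return left >= right
    if op == "<":
        return left < right
    return left <= right


def row_passes(row, clauses):
    """A row passes only if every clause matches (AND semantics)."""
    return all(_matches(row[c], n, op, vals) for c, n, op, vals in clauses)


clauses = parse_filter("0N>10&1=c")
print(row_passes(["11", "c"], clauses))
print(row_passes(["9", "c"], clauses))
```

The `N` marker matters: without it, `"9" > "10"` holds because strings are compared character by character.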
  • rsv flatten
rsv flatten data.csv                       # default to show first 5 records
rsv flatten -n 50 data.csv                 # show 50 records
rsv flatten data.xls                       # EXCEL file
rsv flatten --delimiter \"--\" data.csv    # change line delimiter to anything
rsv flatten --help                         # help info on all flags
  • rsv slice
rsv slice -s 100 -e 150 data.csv           # set start and end index
rsv slice -s 100 -l 50 data.csv            # set start index and the length
rsv slice -s 100 -l 50 data.xlsx           # EXCEL FILE
rsv slice -s 100 -l 50 --export data.csv   # export to data-slice.csv
rsv slice -e 10 --export data.csv          # set end index and export data
rsv slice -i 9 data.csv                    # the 10th line sliced only
rsv slice --help                           # help info on all flags
  • rsv search
rsv search PATTERN data.csv                # search PATTERN
rsv search "^\d{4}-\d{2}-\d{2}$" data.csv  # search dates
rsv search --export PATTERN data.csv       # export result
rsv search PATTERN data.xlsx               # search EXCEL file
rsv search --help                          # help info on all flags
  • rsv sort
rsv sort -c 0 data.csv        # default to sort by first column in ascending order
rsv sort -c 0D data.csv       # descending sort
rsv sort -c 0DN data.csv      # sort as numeric values
rsv sort -c 0DN,2N data.csv   # sort two columns
rsv sort -E data.csv          # export result
rsv sort data.xlsx            # sort EXCEL file
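
The sort keys above combine a zero-based column index with optional `D` (descending) and `N` (numeric) markers, for example `0DN,2N`. The Python sketch below shows one way to read such a spec and sort rows in memory. `parse_sort_keys` and `sort_rows` are hypothetical names, the markers are assumed to be accepted in either order, and nothing here is taken from rsv's source.

```python
import re

_KEY = re.compile(r"^(\d+)([DN]*)$")


def parse_sort_keys(spec):
    """Turn "0DN,2N" into [(0, True, True), (2, False, True)].

    Each tuple is (column, descending, numeric).
    """
    keys = []
    for part in spec.split(","):
        m = _KEY.match(part.strip())
        if m is None:
            raise ValueError(f"cannot parse sort key: {part!r}")
        col, flags = m.groups()
        keys.append((int(col), "D" in flags, "N" in flags))
    return keys


def _key_for(col, numeric):
    if numeric:
        return lambda row: float(row[col])
    return lambda row: row[col]


def sort_rows(rows, spec):
    """Sort by several keys using repeated stable sorts."""
    result = list(rows)
    # Apply the least significant key first; list.sort is stable,
    # so earlier passes survive as tie-breakers.
    for col, descending, numeric in reversed(parse_sort_keys(spec)):
        result.sort(key=_key_for(col, numeric), reverse=descending)
    return result


rows = [["2", "b", "1"], ["10", "a", "3"], ["2", "a", "2"]]
print(sort_rows(rows, "0DN,2N"))
```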
  • rsv stats
rsv stats data.csv                       # all columns, statistics include: min, max, mean, unique, null
rsv stats data.xlsx                      # EXCEL FILE
rsv stats -c 0,1 data.csv                # first two columns
rsv stats -c 0,1 --export data.csv       # export to data-stats.csv
rsv stats --help                         # help info on all flags
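
The statistics listed above (min, max, mean, unique, null) can be sketched for a single column as follows. `column_stats` is a hypothetical helper for illustration only. It assumes an empty cell counts as null and that min, max, and mean are computed over the cells that parse as numbers; rsv's exact rules are not stated above.

```python
def column_stats(rows, col):
    """Compute min, max, mean, unique and null counts for one column."""
    values = [row[col] for row in rows]
    non_null = [v for v in values if v != ""]  # assumption: "" means null
    numbers = []
    for v in non_null:
        try:
            numbers.append(float(v))
        except ValueError:
            pass  # non-numeric cells are skipped for min/max/mean
    return {
        "min": min(numbers) if numbers else None,
        "max": max(numbers) if numbers else None,
        "mean": sum(numbers) / len(numbers) if numbers else None,
        "unique": len(set(non_null)),
        "null": len(values) - len(non_null),
    }


print(column_stats([["1"], ["3"], [""], ["3"]], 0))
```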
  • rsv excel2csv
rsv excel2csv data.xlsx                 # apply to xlsx file, default to first sheet (or sheet1)
rsv excel2csv data.xls                  # apply also to xls file
rsv excel2csv --sheet 1 data.xls        # second sheet, e.g., sheet 2
rsv excel2csv -S 1 data.xls             # same as above
  • rsv table
rsv head data.csv | rsv table                   # convert result to an aligned table
rsv slice -s 10 -e 15 data.csv | rsv table      # convert result to an aligned table
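
Formatting data as an aligned table comes down to padding every cell to the widest value in its column. The Python sketch below shows that idea. `format_table` is a hypothetical helper, and the two-space column gap is an assumption; the exact layout rsv prints is not described above.

```python
def format_table(rows, sep="  "):
    """Left-align every column to the width of its longest cell."""
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    widths = [0] * ncols
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    lines = []
    for row in rows:
        padded = (cell.ljust(widths[i]) for i, cell in enumerate(row))
        lines.append(sep.join(padded).rstrip())  # drop trailing padding
    return "\n".join(lines)


print(format_table([["id", "name"], ["1", "alice"], ["20", "bob"]]))
```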

Command pipeline

  • two commands pipelined
rsv search "^\d{4}-\d{2}-\d{2}$" data.csv | rsv table     # search date and print in an aligned table
rsv select -f 0=a,b data.csv | rsv frequency -c 0         # filter rows and get its frequency table
rsv select -f "0!=&2N>10" data.csv | rsv head -n 5        # filter rows, and show head 5 records
rsv select -f "2N=10,20" -c 0-4 data.csv | rsv stats      # filter rows, select columns and make statistics
rsv select -f "2N=10,20" -c 0-4 data.csv | rsv sort -c 2  # filter rows, select columns and sort data
  • more commands pipelined
rsv search pattern1 data.csv | rsv sort -c 1ND | rsv table             # search, sort and print
rsv select -f 1=a,b data.csv | rsv search pattern | rsv stats          # select, search, and make statistics
rsv select -f "0N>=10&0N<20" data.csv | rsv search pattern | rsv table # select, search, and print in a table

Data export

  • method 1: use the --export or -E flag, which exports to a CSV file only
rsv slice -s 1000 -e 2000 --export data.csv           # the data export flag
rsv slice -s 1000 -e 2000 -E data.csv                 # same as above
rsv search --export pattern data.xlsx                 # export search data
rsv select -f "0N>=10" --export data.xlsx             # export select data
  • method 2: use the "rsv to" subcommand, which supports csv, txt, tsv, and excel
rsv slice -s 1000 -e 2000 data.csv | rsv to out.csv          # export to CSV
rsv slice -s 1000 -e 2000 data.csv | rsv to out.xlsx         # export to EXCEL
rsv search pattern data.xlsx | rsv to out.tsv                # export to TSV
rsv select -f "0N>=10" data.xlsx | rsv to out.txt            # export to TXT

Bug report and suggestion

QQ Chat Room: 219352261

Next

New features will be added in the future.

Owner
Zhuang Dai