qsv: Ultra-fast CSV data-wrangling toolkit

Joel Natividad

Last update: Jan 3, 2023

Related tags

Command-line rust cli tsv csv opendata data-engineering ckan data-wrangling

Overview


qsv is a command line program for indexing, slicing, analyzing, splitting, enriching, validating & joining CSV files. Commands are simple, fast & composable.

Table of Contents
* Available Commands
* Installation
* Whirlwind Tour
* Cookbook
* FAQ
* Changelog
* Performance Tuning
* Benchmarks
* NYC School of Data 2022 slides
* Sponsor

ℹ️ NOTE: qsv is a fork of the popular xsv utility, merging several pending PRs since xsv 0.13.0's May 2018 release. It also has numerous new features & 53 additional commands/subcommands/operations (for a total of 73). See FAQ for more details.

Available commands

Command	Description
apply¹²	Apply a series of string, date, currency & geocoding transformations to a CSV column. It also has some basic NLP functions (similarity, sentiment analysis, profanity, eudex & language detection).
behead	Drop headers from a CSV.
cat	Concatenate CSV files by row or by column.
count³	Count the rows in a CSV file. (Instantaneous with an index.)
dedup⁴²	Remove redundant rows.
enum	Add a new column enumerating rows by adding a column of incremental or uuid identifiers. Can also be used to copy a column or fill a new column with a constant value.
excel	Exports a specified Excel/ODS sheet to a CSV file.
exclude³	Removes a set of CSV data from another set based on the specified columns.
explode	Explode rows into multiple ones by splitting a column value based on the given separator.
extsort²	Sort an arbitrarily large CSV/text file using a multithreaded external merge sort algorithm.
fetch	Fetches HTML/data from web pages or web services for every row. Comes with jql JSON query language support, dynamic throttling (RateLimit) & caching with optional Redis support for persistent caching.
fill	Fill empty values.
fixlengths	Force a CSV to have same-length records by either padding or truncating them.
flatten	A flattened view of CSV records. Useful for viewing one record at a time. e.g. `qsv slice -i 5 data.csv | qsv flatten`.
fmt	Reformat a CSV with different delimiters, record terminators or quoting rules. (Supports ASCII delimited data.)
foreach¹	Loop over a CSV to execute bash commands. (not available on Windows)
frequency³⁵	Build frequency tables of each column. (Uses multithreading to go faster if an index is present.)
generate¹	Generate test data by profiling a CSV using Markov decision process machine learning.
headers	Show the headers of a CSV. Or show the intersection of all headers between many CSV files.
index	Create an index for a CSV. This is very quick & provides constant time indexing into the CSV file. Also enables multithreading for `frequency`, `split`, `stats` and `schema` commands.
input³	Read CSV data with special quoting, trimming, line-skipping and UTF-8 transcoding rules. Typically used to "normalize" a CSV for further processing with other qsv commands.
join³	Inner, outer, cross, anti & semi joins. Uses a simple hash index to make it fast.
jsonl	Convert newline-delimited JSON (JSONL/NDJSON) to CSV.
lua¹	Execute a Lua 5.4.4 script over CSV lines to transform, aggregate or filter them.
partition	Partition a CSV based on a column value.
pseudo	Pseudonymise the value of the given column by replacing them with an incremental identifier.
py¹	Evaluate a Python expression over CSV lines to transform, aggregate or filter them. Python's f-strings are particularly useful for extended formatting (Python 3.8+ required).
rename	Rename the columns of a CSV efficiently.
replace	Replace CSV data using a regex.
reverse⁴	Reverse order of rows in a CSV. Unlike the `sort --reverse` command, it preserves the order of rows with the same key.
sample³	Randomly draw rows (with optional seed) from a CSV using reservoir sampling (i.e., use memory proportional to the size of the sample).
schema⁵	Infer schema from CSV data and output in JSON Schema format. Uses multithreading to go faster if an index is present. See `validate` command.
search	Run a regex over a CSV. Applies the regex to each field individually & shows only matching rows.
searchset	Run multiple regexes over a CSV in a single pass. Applies the regexes to each field individually & shows only matching rows.
select	Select, re-order, duplicate or drop columns.
slice³⁴	Slice rows from any part of a CSV. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice).
sniff³	Quickly sniff CSV metadata (delimiter, header row, preamble rows, quote character, flexible, is_utf8, number of records, number of fields, field names & data types).
sort²	Sorts CSV data in alphabetical, numerical, reverse or random (with optional seed) order.
split³⁵	Split one CSV file into many CSV files of N chunks. (Uses multithreading to go faster if an index is present.)
stats³⁴⁵	Infer data type & compute descriptive statistics for each column in a CSV (sum, min/max, min/max length, mean, stddev, variance, nullcount, quartiles, IQR, lower/upper fences, skewness, median, mode & cardinality). Uses multithreading to go faster if an index is present.
table⁴	Show aligned output of a CSV using elastic tabstops.
transpose⁴	Transpose rows/columns of a CSV.
validate³²	Validate CSV data with JSON Schema (See `schema` command). If no jsonschema file is provided, validates if a CSV conforms to the RFC 4180 standard.
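
The `sample` entry above mentions reservoir sampling, which keeps memory proportional to the sample size rather than the file size. The following is a minimal Python sketch of the classic technique (often called Algorithm R). It is an illustration only, not qsv's implementation, and the function name `reservoir_sample` is hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k rows drawn uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)  # an optional seed makes the draw reproducible
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (i + 1) by picking a slot
            # uniformly in [0, i] and replacing only if it falls in the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir
```

Only `k` rows are held in memory at any time, so the input can be a generator over a file of any size.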

Installation

Pre-built binaries for Windows, Linux and macOS are available from GitHub, including binaries compiled with Rust Nightly/Unstable (click here for more info).

There are three versions of qsv. qsv enables all features valid for the target platform⁶; qsvlite has all features disabled (half the size of qsv); qsvdp is optimized for use with DataPusher+, with only DataPusher+ relevant commands and the self-update engine removed (a sixth of the size of qsv).

Alternatively, you can compile from source by installing Cargo (Rust's package manager) and installing qsv using Cargo:

cargo install qsv --features full

If you encounter compilation errors, ensure you're using the exact version of the dependencies qsv was built with by issuing:

cargo install qsv --locked --features full

Compiling from this repository also works similarly:

git clone git@github.com:jqnatividad/qsv.git
cd qsv
cargo build --release --features full
# or if you encounter compilation errors
cargo build --release --locked --features full

The compiled binary will end up in ./target/release/.

To enable optional features, use cargo --features (see Feature Flags for more info):

cargo install qsv --features apply,generate,lua,fetch,foreach,python,full
# or to build qsvlite
cargo install qsv --features lite
# or to build qsvdp
cargo install qsv --features datapusher_plus

# or when compiling from a local repo
cargo build --release --features apply,generate,lua,fetch,foreach,python,full
# for qsvlite
cargo build --release --features lite
# for qsvdp
cargo build --release --features datapusher_plus

Minimum Supported Rust Version

Building qsv requires Rust stable - currently version 1.62.0.

Tab Completion

qsv's command-line options are quite extensive. Thankfully, since it uses docopt for CLI processing, we can take advantage of docopt.rs' tab completion support to make it easier to use qsv at the command-line (currently, only bash shell is supported):


# install docopt-wordlist
cargo install docopt

# IMPORTANT: run these commands from the root directory of your qsv git repository
# to setup bash qsv tab completion
echo "DOCOPT_WORDLIST_BIN=\"$(which docopt-wordlist)\"" >> $HOME/.bash_completion
echo "source \"$(pwd)/scripts/docopt-wordlist.bash\"" >> $HOME/.bash_completion
echo "complete -F _docopt_wordlist_commands qsv" >> $HOME/.bash_completion

File formats

qsv recognizes UTF-8/ASCII encoded CSV (.csv) and TSV files (.tsv and .tab). CSV files are assumed to have "," (comma) as a delimiter, and TSV files, "\t" (tab) as a delimiter. The delimiter is a single ASCII character that can be set with the --delimiter command-line option or the QSV_DEFAULT_DELIMITER environment variable, or detected automatically when QSV_SNIFF_DELIMITER is set.

When using the --output option, note that qsv will UTF-8 encode the file and automatically change the delimiter used in the generated file based on the file extension - i.e. comma for .csv, tab for .tsv and .tab files.
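
The extension rule above (comma for .csv, tab for .tsv and .tab) can be sketched in Python as follows. This is only an illustration of the stated mapping. The function name `delimiter_for_output`, the case-insensitive match and the fallback for other extensions are assumptions, not documented qsv behavior.

```python
from pathlib import Path


def delimiter_for_output(path: str, default: str = ",") -> str:
    """Pick an output delimiter from the file extension, as the text describes."""
    ext = Path(path).suffix.lower()  # assumption: extensions match case-insensitively
    if ext in (".tsv", ".tab"):
        return "\t"
    if ext == ".csv":
        return ","
    # Assumption: any other extension falls back to a caller-supplied default.
    return default
```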

JSONL/NDJSON files are also recognized and converted to CSV with the jsonl command.

The fetch command also produces JSONL files when it's invoked without the --new-column option.

The sniff and validate commands produce JSON files with their --json and --pretty-json options.

The schema command produces JSON Schema Validation (Draft 7) files with the ".schema.json" file extension, which can be used with the validate command.

The excel command recognizes Excel and Open Document Spreadsheet (ODS) files (.xls, .xlsx, .xlsm, .xlsb and .ods files).

RFC 4180

qsv validates against the RFC 4180 CSV standard. In practice, however, CSV formats vary significantly, so qsv is deliberately not strictly compliant with the specification in order to process "real-world" CSV files. qsv leverages the awesome Rust CSV library, which, in turn, is built on top of the csv-core library to read CSV files.

Click here to find out more about how qsv conforms to the standard with csv-core.

UTF-8 Encoding

qsv requires UTF-8 encoded input files (ASCII is a subset of UTF-8). On startup, it checks whether the input is UTF-8 encoded (for files, it scans the first 8k; for stdin, the entire buffer) and aborts if it is not, unless QSV_SKIPUTF8_CHECK is set. On Linux and macOS, UTF-8 encoding is the default.
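
A startup check of this kind can be sketched in Python as follows. It is not qsv's code: the helper names are hypothetical, and "8k" is taken here to mean 8192 bytes. The one subtle point is that a fixed-size window can end in the middle of a multi-byte character, which an incremental decoder tolerates when more data follows.

```python
import codecs

SCAN_BYTES = 8 * 1024  # assumption: "the first 8k" means 8192 bytes


def prefix_is_utf8(head: bytes, more_follows: bool) -> bool:
    """Return True if head is valid UTF-8, allowing a cut-off final character
    only when more bytes follow the scanned window."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False buffers an incomplete trailing sequence instead of failing;
        # final=True rejects a truncated sequence at the true end of the data.
        decoder.decode(head, final=not more_follows)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_utf8(path: str, scan_bytes: int = SCAN_BYTES) -> bool:
    """Check only the first scan_bytes of a file."""
    with open(path, "rb") as f:
        head = f.read(scan_bytes)
        more = f.read(1)  # non-empty when the file continues past the window
    return prefix_is_utf8(head, more_follows=bool(more))
```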

Previously, several qsv commands worked with non utf-8 encoded files. But to increase performance, qsv versions since 0.38 make extensive use of from_utf8_unchecked so as not to pay the repetitive utf-8 validation penalty, no matter how small, even for already utf-8 encoded files.

Only two commands still work with non utf-8 encoded files - sniff and input. sniff is meant to screen files and detect if a file is utf-8 encoded; and input, to prepare files for further qsv processing.

Should you need to reencode CSV/TSV files, you can use the input command to transcode to UTF-8. It will replace all invalid UTF-8 sequences with �. Alternatively, there are several utilities you can use to do so on Linux/macOS and Windows.
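
The replacement behavior described above, where each invalid UTF-8 sequence becomes �, is the standard "lossy" decode. A minimal Python sketch of the same idea follows. The function name `to_utf8_lossy` is hypothetical, and this shows the general technique rather than the input command's implementation.

```python
def to_utf8_lossy(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD (�) for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


# A Latin-1 encoded "café" contains the byte 0xE9, which is not valid UTF-8 here.
print(to_utf8_lossy(b"caf\xe9,1\n"))
```

Because the substitution discards the original bytes, transcoding from the known source encoding is preferable whenever that encoding can be identified.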

Windows Usage Note

Unlike other modern operating systems, Windows' default encoding is UTF16-LE. This will cause problems when redirecting qsv's output to a CSV file and trying to open it with Excel (which ignores the comma delimiter, with everything in the first column):

qsv stats wcp.csv > wcpstats.csv

Which is weird, since you would think Microsoft Excel would properly recognize UTF16-LE encoded CSV files. Regardless, to create a properly UTF-8 encoded file, use the --output option instead:

qsv stats wcp.csv --output wcpstats.csv

Environment Variables

Variable	Description
`QSV_DEFAULT_DELIMITER`	single ASCII character to use as delimiter. Overrides `--delimiter` option. Defaults to "," (comma) for CSV files and "\t" (tab) for TSV files when not set. Note that this will also set the delimiter for qsv's output to stdout. However, using the `--output` option, regardless of this environment variable, will automatically change the delimiter used in the generated file based on the file extension - i.e. comma for `.csv`, tab for `.tsv` and `.tab` files.
`QSV_SNIFF_DELIMITER`	if set, the delimiter is automatically detected. Overrides `QSV_DEFAULT_DELIMITER` and `--delimiter` option. Note that this does not work with stdin.
`QSV_NO_HEADERS`	if set, the first row will NOT be interpreted as headers. Supersedes `QSV_TOGGLE_HEADERS`.
`QSV_TOGGLE_HEADERS`	if set to `1`, toggles header setting - i.e. inverts qsv header behavior, with no headers being the default, and setting `--no-headers` will actually mean headers will not be ignored.
`QSV_AUTOINDEX`	if set, automatically create an index when none is detected. Also automatically update stale indices.
`QSV_COMMENT_CHAR`	set to an ASCII character. If set, any lines (including the header) that start with this character are ignored.
`QSV_MAX_JOBS`	number of jobs to use for multithreaded commands (currently `apply`, `dedup`, `extsort`, `frequency`, `schema`, `sort`, `split`, `stats` and `validate`). If not set, max_jobs is set to the detected number of logical processors. See Multithreading for more info.
`QSV_NO_UPDATE`	if set, prohibit self-update version check for the latest qsv release published on GitHub.
`QSV_PREFER_DMY`	if set, date parsing will use DMY format. Otherwise, use MDY format (used with `apply datefmt`, `schema`, `sniff` & `stats` commands).
`QSV_REGEX_UNICODE`	if set, makes `search`, `searchset` and `replace` commands unicode-aware. For increased performance, these commands are not unicode-aware and will ignore unicode values when matching and will panic when unicode characters are used in the regex.
`QSV_SKIPUTF8_CHECK`	if set, skip UTF-8 encoding check. Otherwise, qsv scans the first 8k of files. For stdin, it scans the entire buffer.
`QSV_RDR_BUFFER_CAPACITY`	reader buffer size (default (bytes): 16384)
`QSV_WTR_BUFFER_CAPACITY`	writer buffer size (default (bytes): 65536)
`QSV_LOG_LEVEL`	desired level (default - off; `error`, `warn`, `info`, `trace`, `debug`).
`QSV_LOG_DIR`	when logging is enabled, the directory where the log files will be stored. If the specified directory does not exist, qsv will attempt to create it. If not set, the log files are created in the directory where qsv was started. See Logging for more info.
`QSV_REDIS_CONNECTION_STRING`	the `fetch` command can use Redis to cache responses. Set to connect to the desired Redis instance. (default: `redis://127.0.0.1:6379`)
`QSV_REDIS_MAX_POOL_SIZE`	the maximum Redis connection pool size. (default: 20).
`QSV_REDIS_TTL_SECONDS`	set time-to-live of Redis cached values (default (seconds): 2419200 (28 days)).
`QSV_REDIS_TTL_REFRESH`	if set, enables cache hits to refresh TTL of cached values.

Several dependencies also have environment variables that influence qsv's performance & behavior:

Memory Management (mimalloc): When incorporating qsv into a data pipeline that runs in batch mode, particularly with very large CSV files using qsv commands that load entire CSV files into memory, you can fine-tune mimalloc's behavior using its environment variables.
Network Access (reqwest): qsv uses reqwest for its fetch, validate and --update functions and will honor proxy settings set through HTTP_PROXY, HTTPS_PROXY and NO_PROXY.

ℹ️ NOTE: To get a list of all active qsv-relevant environment variables, run qsv --envlist.

Feature Flags

qsv has several features:

mimalloc (default) - use the mimalloc allocator (see Memory Allocator for more info).
apply - enable apply command. This swiss-army knife of CSV transformations is very powerful, but it has a lot of dependencies that increase both compile time and binary size.
fetch - enable fetch command.
generate - enable generate command.
full - enable to build qsv.
lite - enable to build qsvlite.
datapusher_plus - enable to build qsvdp - the DataPusher+ optimized qsv binary.
nightly - enable to turn on nightly/unstable features in the rand and regex crates when building with Rust nightly/unstable.

The following "power-user" commands can be abused and present "foot-shooting" scenarios.

lua - enable lua command. Embeds a Lua 5.4.4 interpreter into qsv.
foreach - enable foreach command (not valid for Windows).
python - enable py command (requires Python 3.8+). Note that qsv will automatically use the currently activated python version when running in a virtual environment.

ℹ️ NOTE: qsvlite, as the name implies, always has non-default features disabled. qsv can be built with any combination of the above features using the cargo --features & --no-default-features flags. The pre-built qsv binaries have all applicable features valid for the target platform⁶.

License

Dual-licensed under MIT or the UNLICENSE.

Sponsor

qsv was made possible by

Standards-based, best-of-breed, open source solutions to make your Data Useful, Usable & Used.

Naming Collision

This project is unrelated to Intel's Quick Sync Video.

¹ enabled by optional feature flag. Not available on qsvlite.
² multithreaded even without an index.
³ uses an index when available.
⁴ loads the entire CSV into memory. Note that dedup, stats & transpose have modes that do not load the entire CSV into memory.
⁵ multithreaded when an index is available.
⁶ The foreach feature is not available on Windows. The python feature is not enabled on cross-compiled pre-built binaries as we don't have access to a native python interpreter for those platforms (aarch64, i686, and arm) on GitHub's action runners. Compile natively on those platforms with Python 3.8+ installed, if you want to enable the python feature.

Comments

Create `schema` command
stats does a great job of not only getting descriptive stats about a CSV, it also infers the data type. frequency compiles a frequency table.

The schema command will use the output of the stats, and optionally frequency (to specify the valid range of a field), to create a json schema file that can be used with the validate command (#46) to validate a CSV against the generated schema.

With the combo addition of schema and validate, qsv can be used in a more bullet-proof automated data pipeline that can fail gracefully when there are data quality issues:

use schema to create a json schema from a representative CSV file for a feed

adjust the schema to fine-tune the validation rules

use validate at the beginning of a data pipeline and fail gracefully when validate fails

for extra large files, use sample to validate against a sample

or alternatively, partition the CSV to break down the pipeline into smaller jobs
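
The pipeline steps above can be sketched as a small Python wrapper around the qsv command line. The two invocations mirror the forms that appear elsewhere on this page (`schema FILE` writing `FILE.schema.json`, and `validate FILE SCHEMA`). The helper names are hypothetical, and treating a non-zero exit status as "validation failed" is an assumption.

```python
import subprocess
from typing import List


def schema_command(csv_path: str) -> List[str]:
    """Command that infers a JSON Schema from a representative CSV file."""
    return ["qsv", "schema", csv_path]


def validate_command(csv_path: str, schema_path: str) -> List[str]:
    """Command that validates a CSV file against a previously generated schema."""
    return ["qsv", "validate", csv_path, schema_path]


def validate_or_stop(csv_path: str, schema_path: str) -> bool:
    """Run validation as the first pipeline step.

    Returns False instead of raising, so the caller can stop the pipeline cleanly.
    Assumption: a non-zero exit status signals invalid records or an error.
    """
    result = subprocess.run(validate_command(csv_path, schema_path))
    return result.returncode == 0
```

A caller would run `validate_or_stop("feed.csv", "feed.csv.schema.json")` (hypothetical file names) before any other step and skip the rest of the job when it returns False.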

enhancement
opened by jqnatividad 38
`fetch` command

fetch will allow qsv to fetch HTML or data from web pages or services, to enrich a CSV (e.g. geocoding, wikidata api, etc.)

It will support authentication, concurrent requests, thresholds, etc.

Reminiscent of OpenRefine's fetch url... (https://docs.openrefine.org/manual/columnediting#add-column-by-fetching-urls) , but optimized for the command line.
enhancement

opened by jqnatividad 20
Auto-detect delimiter
Is your feature request related to a problem? Please describe.

In my daily work, a .csv file is comma-separated 50% of the time, and semi-colon-separated the other 50%. I use this command-line tool hundreds of times a day. It's incredibly frustrating to have to first figure out which separator character is being used, and then adjust my command appropriately.

Describe the solution you'd like

Ideally qsv would auto-detect a delimiter by default. However, this would break backwards compatibility, so I suggest having an environment variable to turn this on.

$ export QSV_AUTO_DETECT_DELIMITER=1
$ qsv table my_file.csv

When this environment variable is set, any value of QSV_DELIMITER would be ignored.

Describe alternatives you've considered

I tried to achieve this using a bash wrapper, but it was a bit fiddly because I need to do different things depending on whether qsv is being passed a file (in which case I sniff the file and then pass the delimiter to the qsv command) or a stream (in which case I sniff the stream, and then pass the amount I've already sniffed plus the rest of the stream to qsv).
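
The sniffing step of such a wrapper can be sketched in Python. This is a deliberately simple heuristic for the comma versus semicolon case described in this request, not the algorithm behind qsv's sniff command or QSV_SNIFF_DELIMITER. The function name and candidate list are hypothetical, and delimiters inside quoted fields are not handled.

```python
from typing import Optional, Sequence


def sniff_delimiter(sample: str, candidates: Sequence[str] = (",", ";", "\t", "|")) -> Optional[str]:
    """Guess the delimiter from a text sample.

    A candidate qualifies when it appears the same non-zero number of times on
    every non-empty line; among qualifying candidates, the most frequent wins.
    """
    lines = [line for line in sample.splitlines() if line]
    best, best_count = None, 0
    for cand in candidates:
        counts = {line.count(cand) for line in lines}
        if len(counts) == 1:  # consistent count across all lines
            (count,) = counts
            if count > best_count:
                best, best_count = cand, count
    return best
```

For a file, the sample would be the first few kilobytes; for a stream, the wrapper still has to replay the consumed sample ahead of the remaining input, which is the fiddly part the request describes.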

Additional context

Duplicate of https://github.com/BurntSushi/xsv/issues/294
opened by harrybiddle 16
Feature Request: deduplicate columns/extract unique columns

Cross reference https://github.com/BurntSushi/xsv/issues/283

We can use qsv dedup or the Unix command line tools sort and uniq to remove duplicate rows in plain text table, but I find myself wanting to do something similar with duplicated columns.

For example, after doing qsv join ... there will be at least one pair of duplicated columns (the values used for the join).

I am hoping for something like a column based version of the row based qsv dedup command (see #26).

I suspect I could workaround this via the qsv transpose command (see #3).
no-issue-activity

opened by peterjc 16
Add a subrule to field inferencing
Hi, daily I work with data where some columns have only numerical values in the cells, but which are actually string fields.

A typical case is administrative codes for regions, provinces and cities. In Italy (see the example file below), regions have official codes: Lombardy has the code 03, which is obviously a string. But if you run stats on it, the state_code field is wrongly inferred as an integer.

| field | type | sum | min | max | min_length | max_length | mean | stddev | variance | nullcount |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| state_code | Integer | 71 | 2 | 20 | 2 | 2 | 10.1429 | 7.7354 | 59.8367 | 0 |
| label | String | | Aosta Valley | Veneto | 6 | 29 | | | | 0 |

I know, octal numbers exist, but in 99% of cases (I'm talking about my experience), if I have fields with cells starting with zero they are not octal numbers, but string codes.

If it were possible I would add a subrule for all fields mapped as numbers: if there are cells that start with a zero, not followed by a , or ., set that inferenced field as string and not as number.
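
The proposed subrule can be sketched in Python as follows. This is a hypothetical illustration of the request, not how qsv's stats command infers types: the function names are invented for the example, and only the Integer versus String decision is modeled.

```python
from typing import Iterable


def has_leading_zero_code(value: str) -> bool:
    """True for values like "03": a leading zero not followed by "," or "."."""
    v = value.strip()
    return len(v) > 1 and v[0] == "0" and v[1] not in ",."


def infer_type(values: Iterable[str]) -> str:
    """Return "Integer" or "String" for a column, applying the proposed subrule."""
    cells = [v.strip() for v in values if v.strip() != ""]
    # Assumption: an empty column, or any non-digit cell, is treated as String.
    if not cells or not all(c.isdigit() for c in cells):
        return "String"
    # Subrule: all-digit cells with a leading zero mark the column as a code field.
    if any(has_leading_zero_code(c) for c in cells):
        return "String"
    return "Integer"
```

With the state_code values from this report (19, 18, 20, 02, 03, 04, 05), the column is classified as String because of the zero-padded codes.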

Thank you.

Example file

state_code,label
19,Sicily
18,Calabria
20,Sardinia
02,Aosta Valley
03,Lombardy
04,Trentino-Alto Adige/Südtirol
05,Veneto
enhancement
opened by aborruso 13
qsv excel only works with Little Endian?
I have a number of spreadsheets written by the Perl module Spreadsheet::WriteExcel.

When I extract with qsv excel I get something like ^@ for every character and further processing such as search becomes impossible.

When I look at the spreadsheets with file I get something like

a.xls: CDFV2 Microsoft Excel

whereas the "normal" files (which extract properly) look like

b.xls: Composite Document File V2 Document, Little Endian...

Desktop (please complete the following information):

OS: Mac Ventura

qsv 0.72.0-standard-apply;fetch;foreach;generate;luau;self_update-8-8 (x86_64-apple-darwin compiled with Rust 1.65) prebuilt
opened by ondohotola 13

Glibc & Python library not found

Bug qsv binaries downloaded from release page of the repo are throwing the error as below. For linux-64bit-glibc

[~]$ qsv
qsv: error while loading shared libraries: libpython3.8.so.1.0: cannot open shared object file: No such file or directory
[~]$ qsvlite
qsvlite: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by qsvlite)
qsvlite: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by qsvlite)
qsvlite: /lib64/libc.so.6: version `GLIBC_2.27' not found (required by qsvlite)
qsvlite: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by qsvlite)
[~]$ qsvdp
qsvdp: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by qsvdp)
qsvdp: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by qsvdp)
qsvdp: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by qsvdp)

For linux-64bit-musl

[~]$ qsv
Segmentation fault (core dumped)

To Reproduce Download latest (0.61.1) linux 64bit binaries.

Expected behavior Should run

Desktop:

OS: CentOS 7 x86_64
qsv Version: 0.61.1

Additional context I believe the shared libraries aren't bundled with the release of this version. I have tried a few older versions too, but they are not working either.

bug

opened by useraccessdenied 13

Add the UNIX epoch timestamp to the list of time formats that qsv-dateparser can parse.

Please add the UNIX epoch timestamp format (seconds since the UNIX epoch) to the list of time formats that qsv-dateparser can parse, including timestamps having fractional seconds to nanosecond or microsecond precision. As an alternative to fractional seconds, the timestamp might instead be an integer representing the number of nanoseconds or microseconds since the epoch.

I would like to transform columns in CSV files that contain UNIX epoch timestamps to ISO 8601 format. Many of these timestamps include fractional seconds (milliseconds, microsecond, or nanoseconds).

Alternatives:

csvq

$ csvq --format csv "SELECT UTC(DATETIME(1671673426.123456789)) AS timestamp;"
timestamp
2022-12-22T01:43:46.1234567Z

DuckDB

$ duckdb -csv -c "SELECT epoch_ms('1671673426.123456789'::BIGINT * 1000) + INTERVAL (string_split('1671673426.123456789', '.')[2]::BIGINT/1000) MICROSECONDS AS timestamp"
timestamp
"2022-12-22 01:43:46.123456"

goawk and GNU date

$ printf "timestamp\n1671673426.123456789\n" | goawk -H -i csv -o csv 'BEGIN { print "timestamp" } { system("date --date=@" @"timestamp" " --utc +%FT%T.%N") }'
timestamp
2022-12-22T01:43:46.123456789

Miller

$ echo timestamp=1671673426.123456789 | mlr put '$timestamp = strftime($timestamp, "%FT%H:%M:%3S");'
timestamp=2022-12-22T01:43:46.123
$ echo timestamp=1671673426.123456789 | mlr put '$timestamp = strftime($timestamp, "%FT%H:%M:%6S");'
timestamp=2022-12-22T01:43:46.123456
$ echo timestamp=1671673426.123456789 | mlr put '$timestamp = strftime($timestamp, "%FT%H:%M:%9S");'
timestamp=2022-12-22T01:43:46.123456716

Notice in the last example using format string %9S that function strftime in Miller suffers from a precision problem.

SQLite

$ echo | sqlite3 -csv -header -cmd "SELECT strftime('%Y-%m-%dT%H:%M:%f', 1671673426.123456789, 'unixepoch') AS timestamp;"
timestamp
2022-12-22T01:43:46.123

Notice that function strftime in SQLite prints seconds to at most millisecond precision.
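
For comparison, the requested conversion can be sketched in Python. This is not qsv-dateparser code, and the function name is hypothetical. The fractional part is carried over as text rather than parsed as a float, which avoids the kind of precision loss noted in the examples above. Non-negative timestamps are assumed.

```python
from datetime import datetime, timezone


def epoch_to_iso8601(timestamp: str) -> str:
    """Convert "seconds[.fraction]" since the UNIX epoch to an ISO 8601 UTC string."""
    text = timestamp.strip()
    if text.startswith("-"):
        # Assumption: only non-negative timestamps are supported in this sketch.
        raise ValueError(f"negative timestamps are not supported: {timestamp!r}")
    whole, _, fraction = text.partition(".")
    base = datetime.fromtimestamp(int(whole), tz=timezone.utc)
    result = base.strftime("%Y-%m-%dT%H:%M:%S")
    if fraction:
        if not fraction.isdigit():
            raise ValueError(f"invalid fractional seconds: {timestamp!r}")
        # Append the digits verbatim so nanosecond precision is preserved.
        result += "." + fraction
    return result + "Z"


print(epoch_to_iso8601("1671673426.123456789"))
```

Integer timestamps in microseconds or nanoseconds would need the unit to be known in advance, since the digits alone do not identify it reliably.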

enhancement

opened by derekmahar 12

BUG Python library not found on WSL2 Pengwin
I get this error on qsv from this zip file - qsv-0.64.0-x86_64-unknown-linux-gnu.zip

qsv: error while loading shared libraries: libpython3.8.so.1.0: cannot open shared object file: No such file or directory

$ python3 --version
Python 3.9.2

libpython3.9 is installed. qsvnp works fine.

Desktop (please complete the following information):

OS: Pengwin on WSL2 (based on Debian 11)

qsv Version 0.64
opened by eggbean 12
Validate: improve performance
Validate currently runs under a single-thread and could become a bottleneck for data validation pipelines.

Would like to improve performance through higher concurrency

~~Concurrency should be controlled via --jobs option or QSV_MAX_JOBS env var~~

~~Concurrency should not exceed CPU count~~

~~When jobs is set to 0, apply same rules as stats to calculate optimal concurrency~~

Use Rayon to automatically control concurrency

Include validate performance numbers in performance suite

$ head -50000 NYC_311_SR_2010-2020-sample-1M.csv > NYC-short.csv
$ qsvlite index NYC-short.csv
$ time qsvlite schema NYC-short.csv --value-constraints --enum-threshold=25
Schema written to NYC-short.csv.schema.json

real 0m6.941s
user 0m12.050s
sys 0m3.960s
$ time qsvlite validate NYC-short.csv NYC-short.csv.schema.json
[00:00:08] [==================== 100% validated 49,999 records.] (6,015/sec)
0 out of 49,999 records invalid.

real 0m8.424s
user 0m8.202s
sys 0m0.128s
opened by mhuang74 12

filter feature

Is your feature request related to a problem? Please describe. filter is part of data wrangling process and usually one of the steps

Describe the solution you'd like qsv is very fast and capable, but it lacks (arithmetic) filtering. String filtering can be achieved with the search command, but simple filtering such as Population<100000 or Age>50 is not possible (as far as I know). So, qsv filter -f 'Age>50' should be a possibility.

Describe alternatives you've considered

Obvious example is awk. awk '$3 > 50' will quickly filter according to third column values.
csvtk has both filter and filter2 (latter is awk-like filtering).
Finally, tsv-utils has tsv-filter, which offers comprehensive filtering (checking null, checking is.numeric, etc.). I added its help section as an example:

Operators:
* Test if a field is empty (no characters) or blank (empty or whitespace only).
  Syntax:  --empty|not-empty|blank|not-blank  FIELD
  Example: --empty name               # True if the 'name' field is empty

* Test if a field is numeric, finite, NaN, or infinity
  Syntax:  --is-numeric|is-finite|is-nan|is-infinity FIELD
  Example: --is-numeric 5 --gt 5:100  # Ensure field 5 is numeric before --gt test.

* Compare a field to a number (integer or float)
  Syntax:  --eq|ne|lt|le|gt|ge  FIELD:NUM
  Example: --lt size:1000 --gt weight:0.5  # ('size' < 1000) and ('weight' > 0.5)

* Compare a field to a string
  Syntax:  --str-eq|str-ne|istr-eq|istr-ne  FIELD:STR
  Example: --str-eq color:red         # True if 'color' field is "red"

* Test if a field contains a string (substring search)
  Syntax:  --str-in-fld|str-not-in-fld|istr-in-fld|istr-not-in-fld  FIELD:STR
  Example: --str-in-fld color:dark    # True if 'color field contains "dark"

* Test if a field matches a regular expression.
  Syntax:  --regex|iregex|not-regex|not-iregex  FIELD:REGEX
  Example: --regex '3:ab*c'     # True if field 3 contains "ac", "abc", "abbc", etc.

* Test a field's character or byte length
  Syntax:  --char-len-[le|lt|ge|gt|eq|ne] FIELD:NUM
           --byte-len-[le|lt|ge|gt|eq|ne] FIELD:NUM
  Example: --char-len-lt 2:10   # True if field 2 is less than 10 characters long.
           --byte-len-gt 2:10   # True if field 2 is greater than 10 bytes long.

* Field to field comparisons - Similar to field vs literal comparisons, but field vs field.
  Syntax:  --ff-eq|ff-ne|ff-lt|ff-le|ff-gt|ff-ge  FIELD1:FIELD2
           --ff-str-eq|ff-str-ne|ff-istr-eq|ff-istr-ne  FIELD1:FIELD2
  Example: --ff-eq 2:4          # True if fields 2 and 4 are numerically equivalent
           --ff-str-eq 2:4      # True if fields 2 and 4 are the same strings

* Field to field difference comparisons - Absolute and relative difference
  Syntax:  --ff-absdiff-le|ff-absdiff-gt FIELD1:FIELD2:NUM
           --ff-reldiff-le|ff-reldiff-gt FIELD1:FIELD2:NUM
  Example: --ff-absdiff-lt 1:3:0.25   # True if abs(field1 - field2) < 0.25
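
The kind of arithmetic filter requested here, such as Age>50, can be sketched in Python. This is a hypothetical illustration of the idea, not a proposal for how a qsv filter command would be implemented. The expression grammar (one field, one comparison operator, one number) is an assumption made for the example.

```python
import csv
import io
import operator
import re
from typing import Dict, List

_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
        ">=": operator.ge, "==": operator.eq, "!=": operator.ne}
# FIELD OP NUMBER, e.g. "Age>50" or "Population < 100000"
_EXPR = re.compile(
    r"^\s*(?P<field>[^<>=!]+?)\s*(?P<op><=|>=|==|!=|<|>)\s*(?P<num>-?\d+(?:\.\d+)?)\s*$"
)


def filter_rows(csv_text: str, expression: str) -> List[Dict[str, str]]:
    """Return the rows whose numeric field satisfies the comparison."""
    m = _EXPR.match(expression)
    if m is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    field, compare, threshold = m["field"], _OPS[m["op"]], float(m["num"])
    reader = csv.DictReader(io.StringIO(csv_text))
    if field not in (reader.fieldnames or []):
        raise KeyError(f"no such column: {field!r}")
    kept = []
    for row in reader:
        try:
            value = float(row[field])
        except (TypeError, ValueError):
            continue  # skip rows where the field is missing or not numeric
        if compare(value, threshold):
            kept.append(row)
    return kept
```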

opened by alperyilmaz 9

`tojsonl` crash when convert csv contains both boolean and null

To Reproduce

echo "a,b\nfalse,1\n,2" | qsv tojsonl

crash with error

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/cmd/tojsonl.rs:124:28
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

opened by binhunstatic 0

Auto Inferencing date(time) columns

Hi, if I import in duckdb this input CSV file

field1,date1
4500233821,2018-11-29 12:15:06.353
4500245641,2018-11-29 12:15:06.353
4500242807,2018-11-29 12:55:03.097
4500233852,2018-11-29 12:55:03.097
4500234511,2018-11-29 13:09:25.697
4500242786,2018-11-29 13:17:48.977
4500242786,2018-11-29 13:17:48.977
4500253356,2018-11-29 13:28:45.723

I get this schema, by automatic inferencing:

CREATE TABLE test(field1 BIGINT, date1 TIMESTAMP);

And if I apply datefmt to this input CSV I have a right datetime field

field1,date1
4500233821,2018-11-29T12:15:06.353+00:00
4500245641,2018-11-29T12:15:06.353+00:00
4500242807,2018-11-29T12:55:03.097+00:00
4500233852,2018-11-29T12:55:03.097+00:00
4500234511,2018-11-29T13:09:25.697+00:00
4500242786,2018-11-29T13:17:48.977+00:00
4500242786,2018-11-29T13:17:48.977+00:00
4500253356,2018-11-29T13:28:45.723+00:00

But if I run schema or to (to create a parquet file) on the original file, the date1 field is mapped as a string field.

It would be great to have auto inferencing of datetime fields, also when they are not written perfectly (in my input I do not have the T and I have a space).
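
The inference requested here can be sketched in Python by trying a short list of layouts, including the space-separated one from the sample data. This is a hypothetical illustration, not how qsv's schema or to commands work. The format list is illustrative rather than exhaustive, and treating naive timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional

# Candidate layouts, tried in order.
FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
)


def parse_datetime(value: str) -> Optional[datetime]:
    """Return a datetime if the value matches any candidate layout, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    return None


def infer_column_type(values: Iterable[str]) -> str:
    """Classify a column as "DateTime" only when every non-empty cell parses."""
    cells = [v for v in values if v.strip()]
    if cells and all(parse_datetime(c) is not None for c in cells):
        return "DateTime"
    return "String"


def to_rfc3339_utc(value: str) -> str:
    """Normalize a value to the T-separated form, assuming it is in UTC."""
    parsed = parse_datetime(value)
    if parsed is None:
        raise ValueError(f"not a recognized datetime: {value!r}")
    return parsed.replace(tzinfo=timezone.utc).isoformat(timespec="milliseconds")
```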

Thank you

opened by aborruso 3

currencytonum/numtocurrency still returns wrong results

Describe the bug

#660 still has a corner case :-)-O

To Reproduce

cat > test.csv<<EOF
CUR,RATE
EUR,1
GBP,1.14088829562698
NAD,18.3529
EOF

qsv table test.csv

returns

CUR  RATE
EUR  1
GBP  1.14088829562698
NAD  18.3529

qsv apply operations currencytonum RATE test.csv | qsv table

returns

CUR  RATE
EUR  1.00
GBP  319.79
NAD  18.35

as does

qsv apply operations numtocurrency RATE test.csv | qsv table

by the way

Expected behavior

The result should be something like

CUR  RATE
EUR  1
GBP  1.14
NAD  18.35
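For reference, the expected two-decimal values can be reproduced with decimal arithmetic. This Python sketch only illustrates the rounding the report expects and says nothing about how qsv implements currencytonum; the helper name `to_two_places` and the choice of half-up rounding are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_two_places(raw: str) -> str:
    """Round a numeric string to two decimal places without going through binary floats."""
    return str(Decimal(raw).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


# The three rates from test.csv, including the one with 14 decimals
for cur, rate in [("EUR", "1"), ("GBP", "1.14088829562698"), ("NAD", "18.3529")]:
    print(cur, to_two_places(rate))
```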

Screenshots/Backtrace/Sample Data N/A

Desktop (please complete the following information): macOS 13.1 qsv 0.80.0-standard-apply;fetch;foreach;generate;luau;to;self_update-8-8 (x86_64-apple-darwin compiled with Rust 1.66) prebuilt

Additional context

This behavior starts when there are more than 11 decimals :-)-O
enhancement

opened by ondohotola 9

`to` and `stats` commands: field inferencing not coherent

Hi, if I convert this CSV to Parquet using to, the luogo_esecuzione_istat field is mapped as an integer in the output Parquet file.

This is wrong: it's a string, and stats correctly maps it as a string.

I think the to command should use the same field inferencing as stats.

Thank you

opened by aborruso 2

Add MySQL support and generate SQL script instead of/in addition to connecting

Is your feature request related to a problem? Please describe. I like the new PostgreSQL and SQLite support, so of course one would like MySQL/MariaDB as well.

I would (sometimes) like to be able to read the SQL script produced before connecting and adding.

Describe the solution you'd like As above.

Describe alternatives you've considered Nil

Additional context Nil.
enhancement

opened by ondohotola 3

add index support to `luau` and `py` commands

This will allow random access of CSV files from luau and python scripts.

This will be done by using a special variable _index that can be accessed from the scripts to get/set the underlying index position.

If an index is not available/stale, and "_index" is referred to in the scripts, the index will be created/updated before executing the scripts proper and _rowcount will be set to the number of records.

If _index is set to a negative value, it starts from the last record.
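The idea of index-backed random access can be sketched in Python as follows. This is not the proposed luau/py implementation: the helper names `build_index` and `read_record` are hypothetical, the sketch assumes one record per line (no newlines inside quoted fields), and it reads a negative position as counting back from the last record.

```python
import io


def build_index(data: bytes) -> list:
    """Record the byte offset of each data record (the header row is excluded)."""
    offsets = []
    pos = 0
    for i, line in enumerate(io.BytesIO(data)):
        if i > 0:  # skip the header row
            offsets.append(pos)
        pos += len(line)
    return offsets


def read_record(data: bytes, offsets: list, index: int) -> str:
    """Fetch one record by position; a negative index counts back from the last record."""
    if index < 0:
        index += len(offsets)
    stream = io.BytesIO(data)
    stream.seek(offsets[index])  # jump straight to the record, no scan needed
    return stream.readline().decode("utf-8").rstrip("\r\n")


csv_bytes = b"id,name\n1,alpha\n2,beta\n3,gamma\n"
index = build_index(csv_bytes)
rowcount = len(index)  # plays the role of _rowcount in the description above
print(rowcount, read_record(csv_bytes, index, 1), read_record(csv_bytes, index, -1))
```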
enhancement

opened by jqnatividad 0

Releases(0.81.0)

0.81.0(Jan 2, 2023)

Added

stats: added range statistic https://github.com/jqnatividad/qsv/pull/691

stats: added additional mode stats. For mode, added mode_count and mode_occurrences. Added "antimode" (opposite of mode - least frequently non-zero occurring value), antimode_count and antimode_occurrences. https://github.com/jqnatividad/qsv/pull/694
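To make the new statistics concrete, here is a small Python sketch. It is not the qsv-stats code; it assumes that the `_count` fields give the number of tied values and the `_occurrences` fields give how often each of them appears, and it reads "non-zero occurring" as "occurs at least once". The function name `mode_stats` is hypothetical.

```python
from collections import Counter


def mode_stats(values):
    """Compute mode/antimode style statistics for a column of strings."""
    counts = Counter(v for v in values if v != "")
    if not counts:
        return None
    highest = max(counts.values())
    lowest = min(counts.values())
    modes = sorted(v for v, c in counts.items() if c == highest)
    antimodes = sorted(v for v, c in counts.items() if c == lowest)
    return {
        "mode": modes,
        "mode_count": len(modes),
        "mode_occurrences": highest,
        "antimode": antimodes,
        "antimode_count": len(antimodes),
        "antimode_occurrences": lowest,
    }


result = mode_stats(["a", "b", "a", "c", "b", "a", "d"])
print(result)
```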

qsv-dateparser now recognizes unix timestamp values with fractional seconds to nanosecond precision as dates. stats, sniff, apply datefmt and schema, which all use qsv-dateparser, now infer unix timestamps as dates - https://github.com/jqnatividad/qsv/commit/a29ff8ea255d5aed9992556a0a23ab76117c8340 https://github.com/jqnatividad/qsv/pull/702

USAGE NOTE: As timestamps can be float or integer, and data type inferencing will guess dates last, first preprocess timestamp columns with apply datefmt to convert them to more date-like, non-timestamp formats, so they are recognized as dates by other qsv commands.
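The conversion that the usage note recommends, turning a raw Unix timestamp into a date-like string, looks roughly like this in Python. It is only an illustration, not qsv-dateparser's logic; the function name `unix_to_rfc3339` is hypothetical, the sketch assumes non-negative timestamps, and Python's datetime keeps microseconds at most, so nanosecond digits are truncated.

```python
from datetime import datetime, timezone
from decimal import Decimal


def unix_to_rfc3339(raw: str) -> str:
    """Convert a Unix timestamp string (optionally fractional) to an RFC 3339 UTC string."""
    seconds = Decimal(raw)  # Decimal avoids float rounding of long fractions
    whole = int(seconds)    # whole seconds since the epoch
    micros = int((seconds - whole) * 1_000_000)  # truncate to microseconds
    moment = datetime.fromtimestamp(whole, tz=timezone.utc).replace(microsecond=micros)
    return moment.isoformat()


print(unix_to_rfc3339("1672531200"))
print(unix_to_rfc3339("1672531200.123456789"))
```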

Changed

apply: document numtocurrency --comparand & --replacement behavior https://github.com/jqnatividad/qsv/commit/cc88fe921d8cdf7eedcb0008e16ebb5c46744f33

index: explicitly flush buffer after creating index https://github.com/jqnatividad/qsv/commit/ee5d790af1cde73dfc57b028bf52fa88e83cdaa4

sample: no longer requires an index to do percentage sampling https://github.com/jqnatividad/qsv/commit/45d4657713ebe2ae8388ce55f4cb1a733e727024

slice: removed unneeded utf8 check https://github.com/jqnatividad/qsv/commit/5a199f4442bd025cec31309bee44ac71bacbdfaa

schema: expand usage text regarding --strict-dates https://github.com/jqnatividad/qsv/commit/3d22829f3cf0441961e854555cd0c333bcb3ffb1

stats: date stats refactor. Date stats are returned in rfc3339 format. Dates are converted to timestamps with millisecond precision while calculating date stats. https://github.com/jqnatividad/qsv/pull/690 https://github.com/jqnatividad/qsv/commit/e7c297795ff5e82cf1dc242090be11ecced6da9a

filter out variance/stddev in tests as float precision issues are causing flaky CI tests https://github.com/jqnatividad/qsv/pull/696

Bump qsv-dateparser from 0.4.4 to 0.6.0

Bump qsv-stats from 0.4.6 to 0.5.2

Bump qsv-sniffer from 0.5.0 to 0.6.0

Bump serde from 1.0.151 to 1.0.152 by @dependabot in https://github.com/jqnatividad/qsv/pull/692

Bump csvs_convert from 0.7.7 to 0.7.8 by @dependabot in https://github.com/jqnatividad/qsv/pull/693

Bump once_cell from 0.16.0 to 0.17.0 https://github.com/jqnatividad/qsv/commit/d3ac2556c74e2ddd66dcee00e5e836d284b662a7

Bump self-update from 0.32.0 to 0.34.0 https://github.com/jqnatividad/qsv/commit/5f95933f01e2e0c592b52d7424b6a832aafd3591

Bump cpc from 1.8 to 1.9; set csvs_convert dependency to minor version https://github.com/jqnatividad/qsv/commit/ee9164810559f5496dfafba0e789b9cd84000a17

applied select clippy recommendations

deeplink to Cookbook from Table of Contents

pin Rust nightly to 2023-01-01

implementation comments on stats, sample, sort & Python distribution

Fixed

stats: prevent premature rounding, and make sum statistic use the same rounding method https://github.com/jqnatividad/qsv/commit/879214a1f3032f140f0207fe8807e1bb641110d7 https://github.com/jqnatividad/qsv/commit/1a1362031de8973b623598748bea4bc5fc6e08d3

fix autoindex so we return the index path properly https://github.com/jqnatividad/qsv/commit/d3ce6a3918683d66bf0f3246c7d6e8518eead392

fetch & fetchpost: corrected typo https://github.com/jqnatividad/qsv/commit/684036bbc237d5b80ea060f9ee8b8d46c1a2ad88

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.80.0...0.81.0
qsv-0.81.0-aarch64-apple-darwin.zip(12.56 MB)
qsv-0.81.0-aarch64-unknown-linux-gnu.zip(12.75 MB)
qsv-0.81.0-i686-pc-windows-msvc.zip(13.05 MB)
qsv-0.81.0-i686-unknown-linux-gnu.zip(13.80 MB)
qsv-0.81.0-x86_64-apple-darwin.zip(29.64 MB)
qsv-0.81.0-x86_64-pc-windows-gnu.zip(26.20 MB)
qsv-0.81.0-x86_64-pc-windows-msvc.zip(30.15 MB)
qsv-0.81.0-x86_64-unknown-linux-gnu.zip(30.57 MB)
qsv-0.81.0-x86_64-unknown-linux-musl.zip(29.80 MB)
0.80.0(Dec 23, 2022)
Added

new to command. Converts CSVs "to" PostgreSQL, SQLite, XLSX, Parquet and Data Package by @kindly in https://github.com/jqnatividad/qsv/pull/656

apply: add numtocurrency operation https://github.com/jqnatividad/qsv/pull/670

sort: add --ignore-case option https://github.com/jqnatividad/qsv/pull/673

stats: now computes summary statistics for dates as well https://github.com/jqnatividad/qsv/pull/684

added --updatenow option, resolves https://github.com/jqnatividad/qsv/issues/661 https://github.com/jqnatividad/qsv/pull/662

replace footnotes in Available Commands list with emojis :smile:

Changed

apply & applydp: expose --batch size option https://github.com/jqnatividad/qsv/pull/679

validate: add last valid row to validation error https://github.com/jqnatividad/qsv/commit/7680011a2fcc459aa621414122ecaa869e98ae83

input: add last valid row to error message https://github.com/jqnatividad/qsv/commit/492e51f85ab5a0637c201d7020d7ac2fdb72be96

upgrade to csvs-convert 0.7.5 by @kindly in https://github.com/jqnatividad/qsv/pull/668

Bump serial_test from 0.9.0 to 0.10.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/671

Bump csvs_convert from 0.7.5 to 0.7.7 by @dependabot in https://github.com/jqnatividad/qsv/pull/674

Bump num_cpus from 1.14.0 to 1.15.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/678

Bump robinraju/release-downloader from 1.6 to 1.7 by @dependabot in https://github.com/jqnatividad/qsv/pull/677

Bump actions/stale from 6 to 7 by @dependabot in https://github.com/jqnatividad/qsv/pull/676

Bump actions/setup-python from 4.3.1 to 4.4.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/683

added concurrency check to CI tests so that redundant CI tests are canceled when new ones are launched

instead of saying "descriptive statistics", use more understandable "summary statistics"

changed publishing workflows to enable the `to` feature for applicable target platforms

cargo update bump dependencies, notably qsv-stats from 0.4.5 to 0.4.6 and qsv_currency from 0.5.0 to 0.6.0

pin Rust nightly to 2022-12-22

Fixed

stats: fix leading zero handling https://github.com/jqnatividad/qsv/pull/667

apply: fix currencytonum bug https://github.com/jqnatividad/qsv/pull/669

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.79.0...0.80.0
qsv-0.80.0-aarch64-apple-darwin.zip(12.56 MB)
qsv-0.80.0-aarch64-unknown-linux-gnu.zip(12.74 MB)
qsv-0.80.0-i686-pc-windows-msvc.zip(13.05 MB)
qsv-0.80.0-i686-unknown-linux-gnu.zip(13.81 MB)
qsv-0.80.0-x86_64-apple-darwin.zip(29.64 MB)
qsv-0.80.0-x86_64-pc-windows-gnu.zip(26.20 MB)
qsv-0.80.0-x86_64-pc-windows-msvc.zip(30.15 MB)
qsv-0.80.0-x86_64-unknown-linux-gnu.zip(30.56 MB)
qsv-0.80.0-x86_64-unknown-linux-musl.zip(29.80 MB)
0.79.0(Dec 16, 2022)
Added

safenames: add --reserved option, allowing user to specify additional "unsafe" names https://github.com/jqnatividad/qsv/pull/657

safenames: add --prefix option https://github.com/jqnatividad/qsv/pull/658

fetch & fetchpost: added simple retry backoff multiplier - https://github.com/jqnatividad/qsv/commit/e343398ddd9c804237e73bbc652cc9e51c657b78

Changed

excel: refactored --metadata processing; added more debug messages; minor perf tweaks https://github.com/jqnatividad/qsv/commit/f137bab42f81518acd3ef825cd223b9970d70b02

set MSRV to Rust 1.66

cargo update bump several dependencies, notably qsv-dateparser

pin Rust nightly to 2022-12-15

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.78.2...0.79.0
qsv-0.79.0-aarch64-apple-darwin.zip(12.53 MB)
qsv-0.79.0-aarch64-unknown-linux-gnu.zip(12.72 MB)
qsv-0.79.0-i686-pc-windows-msvc.zip(13.02 MB)
qsv-0.79.0-i686-unknown-linux-gnu.zip(13.78 MB)
qsv-0.79.0-x86_64-apple-darwin.zip(25.00 MB)
qsv-0.79.0-x86_64-pc-windows-gnu.zip(26.15 MB)
qsv-0.79.0-x86_64-pc-windows-msvc.zip(25.56 MB)
qsv-0.79.0-x86_64-unknown-linux-gnu.zip(25.90 MB)
qsv-0.79.0-x86_64-unknown-linux-musl.zip(25.08 MB)
0.78.2(Dec 13, 2022)
Changed

cargo update bump paste 1.0.9 to 1.0.10

pin Rust nightly to 2022-12-12

Removed

excel: remove --safenames option. If you need safenames, use the safenames command https://github.com/jqnatividad/qsv/commit/e5da73bcc64ef3a8c66c611fd6247fa331117544

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.78.1...0.78.2
qsv-0.78.2-aarch64-apple-darwin.zip(12.52 MB)
qsv-0.78.2-aarch64-unknown-linux-gnu.zip(12.71 MB)
qsv-0.78.2-i686-pc-windows-msvc.zip(13.02 MB)
qsv-0.78.2-i686-unknown-linux-gnu.zip(13.77 MB)
qsv-0.78.2-x86_64-apple-darwin.zip(24.99 MB)
qsv-0.78.2-x86_64-pc-windows-gnu.zip(26.14 MB)
qsv-0.78.2-x86_64-pc-windows-msvc.zip(25.55 MB)
qsv-0.78.2-x86_64-unknown-linux-gnu.zip(25.87 MB)
qsv-0.78.2-x86_64-unknown-linux-musl.zip(25.06 MB)
0.78.1(Dec 12, 2022)
Changed

qsvdp: apply is now available in qsvdp as applydp - removing the geocode and calcconv subcommands, and removing all operations that require third-party crates EXCEPT dynfmt and datefmt, which are needed for Datapusher+ https://github.com/jqnatividad/qsv/pull/652

excel: fine-tune --metadata processing https://github.com/jqnatividad/qsv/commit/09530d4f65b06060d24b7ed3948aeab25b2aa7c8

bump serde from 1.0.149 to 1.0.150

qsvdp is now included in CI tests

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.78.0...0.78.1
qsv-0.78.1-aarch64-apple-darwin.zip(12.53 MB)
qsv-0.78.1-aarch64-unknown-linux-gnu.zip(12.71 MB)
qsv-0.78.1-i686-pc-windows-msvc.zip(13.02 MB)
qsv-0.78.1-i686-unknown-linux-gnu.zip(13.77 MB)
qsv-0.78.1-x86_64-apple-darwin.zip(24.99 MB)
qsv-0.78.1-x86_64-pc-windows-gnu.zip(26.15 MB)
qsv-0.78.1-x86_64-pc-windows-msvc.zip(25.55 MB)
qsv-0.78.1-x86_64-unknown-linux-gnu.zip(25.87 MB)
qsv-0.78.1-x86_64-unknown-linux-musl.zip(25.06 MB)
0.78.0(Dec 11, 2022)
Added

stats: added leading zero handling when inferring types (e.g. zipcodes like "07094" are strings not integers) https://github.com/jqnatividad/qsv/pull/648
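The leading-zero rule can be illustrated with a short Python sketch. It is not the stats implementation: the function name `infer_type` and the type labels are chosen for illustration only.

```python
def infer_type(value: str) -> str:
    """Classify a CSV value as "Integer", "Float" or "String"."""
    digits = value.lstrip("+-")
    # A leading zero on a multi-digit whole number suggests an identifier, not a number
    if len(digits) > 1 and digits.startswith("0") and digits.isdigit():
        return "String"
    try:
        int(value)
        return "Integer"
    except ValueError:
        pass
    try:
        float(value)
        return "Float"
    except ValueError:
        return "String"


for sample in ["07094", "7094", "0", "0.5", "Main St"]:
    print(sample, infer_type(sample))
```

Without the leading-zero check, "07094" would parse as the integer 7094 and the zip code would lose its first digit.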

stats: added --typesonly option, which infers only data types with date inferencing enabled for all columns https://github.com/jqnatividad/qsv/pull/650

stats: added underflow handling to sum statistic https://github.com/jqnatividad/qsv/commit/1b5e5451f929ad1c7dc5fb7f17b2a3261809ab05

excel: expanded --metadata functionality, with the option to return workbook metadata as JSON as well https://github.com/jqnatividad/qsv/pull/651

added platform-specific README for prebuilt zip archives https://github.com/jqnatividad/qsv/commit/15e247e523dbc22a50ebff1b15d7d0c4eb668bd5

Changed

safenames: improved usage text

stats: minor performance tweaks https://github.com/jqnatividad/qsv/commit/88be38b542fc61470a7b0331e7be3a3cad62a7bb and https://github.com/jqnatividad/qsv/commit/8aa58c5ad733116d246e171bcea622c1378b8e48

join: minor performance tweaks https://github.com/jqnatividad/qsv/commit/92d41910077148f769ccf2c8a283be2c30d68bbf

exclude: minor performance tweaks https://github.com/jqnatividad/qsv/commit/f3cc0ac29c5f3e6cec5a08d3aac3371d32b5eb0f

sniff: minor performance tweak https://github.com/jqnatividad/qsv/commit/d2a4676fcb5189fc9232538e68854cfcf4ef808b

sortcheck: minor performance tweak https://github.com/jqnatividad/qsv/commit/83c22ae5a623a8b0740f7024aac9448ee809eabd

switch GitHub Actions to use ubuntu-20.04 so as not to link to too new glibc libraries, preventing older distros from running the linux-gnu prebuilts.

switch GitHub Actions to use macos-12 to minimize flaky CI tests

expanded qsvdp description in README

Bump actions/setup-python from 4.3.0 to 4.3.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/645

cargo update bump several indirect dependencies

pin Rust nightly to 2022-12-10

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.77.0...0.78.0
qsv-0.78.0-aarch64-apple-darwin.zip(12.49 MB)
qsv-0.78.0-aarch64-unknown-linux-gnu.zip(12.67 MB)
qsv-0.78.0-i686-pc-windows-msvc.zip(12.98 MB)
qsv-0.78.0-i686-unknown-linux-gnu.zip(13.73 MB)
qsv-0.78.0-x86_64-apple-darwin.zip(24.90 MB)
qsv-0.78.0-x86_64-pc-windows-gnu.zip(26.05 MB)
qsv-0.78.0-x86_64-pc-windows-msvc.zip(25.46 MB)
qsv-0.78.0-x86_64-unknown-linux-gnu.zip(25.78 MB)
qsv-0.78.0-x86_64-unknown-linux-musl.zip(24.99 MB)
0.77.0(Dec 9, 2022)
Added

safenames: added Verbose JSON options https://github.com/jqnatividad/qsv/pull/644

Changed

py & luau: improved usage text

opt-in self-update in https://github.com/jqnatividad/qsv/pull/640 and https://github.com/jqnatividad/qsv/pull/641

Create README in prebuilt zip archive with platform specific notes (Logic created but not implemented until next release) https://github.com/jqnatividad/qsv/pull/642

Simplify python map_datetime test so it works on older Python versions https://github.com/jqnatividad/qsv/commit/e85e4e7bf9bf379f8478b066a9f6dea21afbf0e8

include date.lua in qsv package so cargo install works https://github.com/jqnatividad/qsv/commit/11a0ff8edc5405afd9cc6637de026bf2138a7df0

Bump data-encoding from 2.3.2 to 2.3.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/638

cargo update bump several dependencies

pin Rust nightly to 2022-12-07

Fixed:

safenames: fixed calculation of unsafe headers as it was dupe-counting some unsafe headers - https://github.com/jqnatividad/qsv/pull/644

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.76.3...0.77.0
qsv-0.77.0-aarch64-apple-darwin.zip(12.55 MB)
qsv-0.77.0-aarch64-unknown-linux-gnu.zip(12.66 MB)
qsv-0.77.0-i686-pc-windows-msvc.zip(12.98 MB)
qsv-0.77.0-i686-unknown-linux-gnu.zip(13.72 MB)
qsv-0.77.0-x86_64-apple-darwin.zip(24.95 MB)
qsv-0.77.0-x86_64-pc-windows-gnu.zip(26.04 MB)
qsv-0.77.0-x86_64-pc-windows-msvc.zip(25.45 MB)
qsv-0.77.0-x86_64-unknown-linux-gnu.zip(25.74 MB)
qsv-0.77.0-x86_64-unknown-linux-musl.zip(13.73 MB)
0.76.3(Dec 5, 2022)
Changed

cargo update bump serde from 1.0.148 to 1.0.149

simplify python datetime test so it runs on Python 3.6 and above

Fixed

reverted not_luau_compatible introduced in 0.76.1 and 0.76.2. Adjusted the GitHub Actions publish workflow instead to properly build luau in qsvdp when the platform supports it.

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.76.2...0.76.3
qsv-0.76.3-aarch64-apple-darwin.zip(12.56 MB)
qsv-0.76.3-aarch64-unknown-linux-gnu.zip(12.66 MB)
qsv-0.76.3-i686-pc-windows-msvc.zip(12.99 MB)
qsv-0.76.3-i686-unknown-linux-gnu.zip(13.73 MB)
qsv-0.76.3-x86_64-apple-darwin.zip(24.95 MB)
qsv-0.76.3-x86_64-pc-windows-gnu.zip(26.03 MB)
qsv-0.76.3-x86_64-pc-windows-msvc.zip(25.43 MB)
qsv-0.76.3-x86_64-unknown-linux-gnu.zip(25.74 MB)
qsv-0.76.3-x86_64-unknown-linux-musl.zip(13.73 MB)
0.76.2(Dec 5, 2022)
Fixed

tweak not_luau_compatible feature so we can more easily disable luau feature when cross-compiling for some platforms where we cannot properly build luau.

NOTE: Not published on crates.io due to problems creating prebuilt binaries

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.76.1...0.76.2
0.76.1(Dec 5, 2022)
Fixed

added not_luau_compatible feature so we can more easily disable luau feature when cross-compiling for some platforms where we cannot properly build luau.

NOTE: Not published on crates.io due to problems creating prebuilt binaries

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.76.0...0.76.1
0.76.0(Dec 5, 2022)
Added

qsvdp: add luau in anticipation of Datapusher+ optional preprocessing https://github.com/jqnatividad/qsv/pull/634

luau: added ability to load libraries using "require"; preload LuaDate library https://github.com/jqnatividad/qsv/pull/633

luau: added more extensive debug logging support, adding _idx to debug log messages; trace log level support showing global vars and record values when an error occurs https://github.com/jqnatividad/qsv/pull/636 and https://github.com/jqnatividad/qsv/pull/637

Changed

py and luau: when errors encountered, return non-zero exit code, along with error count to stderr https://github.com/jqnatividad/qsv/pull/631

safenames and excel: Unsafe empty column/header names are replaced with "_blank" instead of "_" https://github.com/jqnatividad/qsv/pull/632

frequency: replace foreach iterator with regular for; remove unneeded assert https://github.com/jqnatividad/qsv/commit/74eb321defbf294675872a7dd891e8a7aedd31f1

bumped qsv-stats from 0.4.1 to 0.4.5 - fixing sum rounding and variance precision errors.

cargo update bump several indirect dependencies

pin Rust nightly to 2022-12-03

Fixed

stats: fix sum rounding and variance precision errors https://github.com/jqnatividad/qsv/pull/635

NOTE: Not published on crates.io due to problems creating prebuilt binaries

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.75.0...0.76.0
0.75.0(Dec 1, 2022)
Added:

py: added python datetime module by default in https://github.com/jqnatividad/qsv/pull/629

qsvdp (Datapusher+ optimized binary variant): added self-update. However, unlike qsv and qsvlite binary variants, qsvdp will not automatically prompt for a self-update, and will only inform the user if there is a new release. The user will need to invoke the --update option explicitly. https://github.com/jqnatividad/qsv/pull/622

Changed:

stats: Speedup type checking by @kindly in https://github.com/jqnatividad/qsv/pull/625

validate: Added a useful note about validate output by @aborruso in https://github.com/jqnatividad/qsv/pull/624

luau: Now precompiles all scripts, including the --prologue & --epilogue scripts, into bytecode https://github.com/jqnatividad/qsv/commit/e97c2caf81316bcf655875a9bee4c78dac5a8b70

frequency: remove unsafe from_utf8_unchecked https://github.com/jqnatividad/qsv/commit/16642e8ee3364309c1a774142976f6207ba5c594

More robust autoindexing in https://github.com/jqnatividad/qsv/pull/623

minor clippy performance tweaks to rust-csv fork

Bump serde from 1.0.147 to 1.0.148 by @dependabot in https://github.com/jqnatividad/qsv/pull/620

cargo update bump several indirect dependencies

improved README; use :sparkle: to indicate commands behind a feature flag

pin Rust nightly to 2022-11-30

New Contributors

@aborruso made their first contribution in https://github.com/jqnatividad/qsv/pull/624

@kindly made their first contribution in https://github.com/jqnatividad/qsv/pull/625

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.74.0...0.75.0
qsv-0.75.0-aarch64-apple-darwin.zip(12.55 MB)
qsv-0.75.0-aarch64-unknown-linux-gnu.zip(12.65 MB)
qsv-0.75.0-i686-pc-windows-msvc.zip(12.97 MB)
qsv-0.75.0-i686-unknown-linux-gnu.zip(13.71 MB)
qsv-0.75.0-x86_64-apple-darwin.zip(24.39 MB)
qsv-0.75.0-x86_64-pc-windows-gnu.zip(25.38 MB)
qsv-0.75.0-x86_64-pc-windows-msvc.zip(24.87 MB)
qsv-0.75.0-x86_64-unknown-linux-gnu.zip(25.14 MB)
qsv-0.75.0-x86_64-unknown-linux-musl.zip(13.72 MB)
0.74.0(Nov 27, 2022)
Added:

safenames: added --verify and --verbose modes in https://github.com/jqnatividad/qsv/pull/610 and https://github.com/jqnatividad/qsv/pull/615

Changed:

excel: align --safenames option to safenames command in https://github.com/jqnatividad/qsv/pull/611 and https://github.com/jqnatividad/qsv/pull/616

luau: Now precompiles main script to bytecode; now allow loading luau script from file for main, prologue and epilogue scripts in https://github.com/jqnatividad/qsv/pull/619

sniff: increase default sample size from 100 to 1000 in https://github.com/jqnatividad/qsv/commit/40d52cf0c67e39d645a1c76a26ae234999317b0b

validate: applied various optimizations in https://github.com/jqnatividad/qsv/commit/bfed127f28c4ccf6e9a18a5998588396594831d2 and https://github.com/jqnatividad/qsv/commit/06c109a0335326f57d903211334b4f2fb1ab7ccc

updated GitHub Actions workflows to reflect removal of luajit feature

Bump sysinfo from 0.26.7 to 0.26.8 by @dependabot in https://github.com/jqnatividad/qsv/pull/614

Bump rust_decimal from 1.26.1 to 1.27.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/617

cargo update bump several indirect dependencies

applied various clippy recommendations

pin Rust nightly to 2022-11-25

Removed:

luajit: removed, as it has been deprecated by the optimized luau command, which now supports precompiling to bytecode, largely obviating the main feature of LuaJIT (Just-in-Time compilation), in https://github.com/jqnatividad/qsv/pull/619

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.73.2...0.74.0
qsv-0.74.0-aarch64-apple-darwin.zip(12.41 MB)
qsv-0.74.0-aarch64-unknown-linux-gnu.zip(12.55 MB)
qsv-0.74.0-i686-pc-windows-msvc.zip(12.82 MB)
qsv-0.74.0-i686-unknown-linux-gnu.zip(13.60 MB)
qsv-0.74.0-x86_64-apple-darwin.zip(24.15 MB)
qsv-0.74.0-x86_64-pc-windows-gnu.zip(25.14 MB)
qsv-0.74.0-x86_64-pc-windows-msvc.zip(24.59 MB)
qsv-0.74.0-x86_64-unknown-linux-gnu.zip(24.85 MB)
qsv-0.74.0-x86_64-unknown-linux-musl.zip(24.63 MB)
0.73.2(Nov 22, 2022)
Changed:

Link to tests as examples from usage text in https://github.com/jqnatividad/qsv/pull/608

Bump serde_json from 1.0.88 to 1.0.89 by @dependabot in https://github.com/jqnatividad/qsv/pull/607

cargo update bump to get latest crossbeam crates to replace yanked crates https://github.com/jqnatividad/qsv/commit/5108a87b0f5e2d5a7cfef3f60f4cd6b3659bce7d

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.73.1...0.73.2
qsv-0.73.2-aarch64-apple-darwin.zip(12.40 MB)
qsv-0.73.2-aarch64-unknown-linux-gnu.zip(12.82 MB)
qsv-0.73.2-i686-pc-windows-msvc.zip(13.07 MB)
qsv-0.73.2-i686-unknown-linux-gnu.zip(13.59 MB)
qsv-0.73.2-x86_64-apple-darwin.zip(24.12 MB)
qsv-0.73.2-x86_64-pc-windows-gnu.zip(25.07 MB)
qsv-0.73.2-x86_64-pc-windows-msvc.zip(24.57 MB)
qsv-0.73.2-x86_64-unknown-linux-gnu.zip(24.81 MB)
qsv-0.73.2-x86_64-unknown-linux-musl.zip(25.26 MB)
0.73.1(Nov 22, 2022)
Changed:

rename safename command to safenames for consistency

cargo update bump indirect dependencies

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.73.0...0.73.1
qsv-0.73.1-aarch64-apple-darwin.zip(12.40 MB)
qsv-0.73.1-aarch64-unknown-linux-gnu.zip(12.81 MB)
qsv-0.73.1-i686-pc-windows-msvc.zip(13.07 MB)
qsv-0.73.1-i686-unknown-linux-gnu.zip(13.59 MB)
qsv-0.73.1-x86_64-apple-darwin.zip(24.12 MB)
qsv-0.73.1-x86_64-pc-windows-gnu.zip(25.07 MB)
qsv-0.73.1-x86_64-pc-windows-msvc.zip(24.57 MB)
qsv-0.73.1-x86_64-unknown-linux-gnu.zip(24.81 MB)
qsv-0.73.1-x86_64-unknown-linux-musl.zip(25.26 MB)
0.73.0(Nov 21, 2022)
Added

safenames: new command to modify header names to db-safe names in https://github.com/jqnatividad/qsv/pull/606

apply: added censor-count operation in https://github.com/jqnatividad/qsv/pull/599

apply: added escape operation in https://github.com/jqnatividad/qsv/pull/600

excel: added --safe-names option in https://github.com/jqnatividad/qsv/pull/598

Changed

apply: refactored to use enums instead of strings for operations in https://github.com/jqnatividad/qsv/pull/601

fetch & fetchpost: --http-header -H shortcut in https://github.com/jqnatividad/qsv/pull/596

excel: smarter date parsing for XLSX files; rename --safe-column-names to --safe-names in https://github.com/jqnatividad/qsv/pull/603

Smarter safe names in https://github.com/jqnatividad/qsv/pull/605

Bump uuid from 1.2.1 to 1.2.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/594

Bump mimalloc from 0.1.31 to 0.1.32 by @dependabot in https://github.com/jqnatividad/qsv/pull/595

Bump censor from 0.2.0 to 0.3.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/597

Bump Swatinem/rust-cache from 1 to 2 by @dependabot in https://github.com/jqnatividad/qsv/pull/602

cargo update bump several indirect dependencies

pin Rust nightly to 2022-11-19

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.72.0...0.73.0
qsv-0.73.0-aarch64-apple-darwin.zip(12.41 MB)
qsv-0.73.0-aarch64-unknown-linux-gnu.zip(12.82 MB)
qsv-0.73.0-i686-pc-windows-msvc.zip(13.07 MB)
qsv-0.73.0-i686-unknown-linux-gnu.zip(13.59 MB)
qsv-0.73.0-x86_64-apple-darwin.zip(24.13 MB)
qsv-0.73.0-x86_64-pc-windows-gnu.zip(25.08 MB)
qsv-0.73.0-x86_64-pc-windows-msvc.zip(24.58 MB)
qsv-0.73.0-x86_64-unknown-linux-gnu.zip(24.82 MB)
qsv-0.73.0-x86_64-unknown-linux-musl.zip(25.27 MB)
0.72.0(Nov 14, 2022)
Added

apply: added --keep-zero-time option in https://github.com/jqnatividad/qsv/pull/590

lua and luajit: added --prologue & --epilogue options in https://github.com/jqnatividad/qsv/pull/592

luau & luajit: switched from Lua to Luau; added special vars _idx and _rowcount in https://github.com/jqnatividad/qsv/pull/593

luau & luajit: return exitcode 1 if interpretation error is encountered https://github.com/jqnatividad/qsv/commit/655041b86c86c3ce0024d1e20599c98dfab28658

Changed

schema & validate: expand description/usage text in https://github.com/jqnatividad/qsv/commit/60dfebc9f401045467417b2065481b657ff82c92

validate: return exitcode 0 if CSV is valid; exitcode 1 otherwise in https://github.com/jqnatividad/qsv/pull/591

Bump hashbrown from 0.12.3 to 0.13.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/586

cargo update bump indirect dependencies, notably chrono from 0.4.22 to 0.4.23

Shortened command descriptions for luau & luajit and added salient notes to new interpreter section

adjust GitHub Actions workflows to use luau feature

pin Rust nightly to 2022-11-14

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.71.1...0.72.0
qsv-0.72.0-aarch64-apple-darwin.zip(13.17 MB)
qsv-0.72.0-aarch64-unknown-linux-gnu.zip(12.75 MB)
qsv-0.72.0-i686-pc-windows-msvc.zip(12.99 MB)
qsv-0.72.0-i686-unknown-linux-gnu.zip(13.52 MB)
qsv-0.72.0-x86_64-apple-darwin.zip(24.01 MB)
qsv-0.72.0-x86_64-pc-windows-gnu.zip(24.99 MB)
qsv-0.72.0-x86_64-pc-windows-msvc.zip(24.46 MB)
qsv-0.72.0-x86_64-unknown-linux-gnu.zip(24.71 MB)
qsv-0.72.0-x86_64-unknown-linux-musl.zip(25.16 MB)
0.71.1(Nov 9, 2022)
Changed

python feature is no longer enabled in the prebuilt binaries to avoid distribution issues and qsv panicking if the exact python version it was statically linked against is not available. If you require the python feature, you'll have to install/build from source.

Fixed

whirlwind tour: join's --no-case option has been replaced by --ignore-case by @alperyilmaz in https://github.com/jqnatividad/qsv/pull/585

New Contributors

@alperyilmaz made their first contribution in https://github.com/jqnatividad/qsv/pull/585

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.71.0...0.71.1
qsv-0.71.1-aarch64-apple-darwin.zip(12.46 MB)
qsv-0.71.1-aarch64-unknown-linux-gnu.zip(12.60 MB)
qsv-0.71.1-i686-pc-windows-msvc.zip(12.97 MB)
qsv-0.71.1-i686-unknown-linux-gnu.zip(13.67 MB)
qsv-0.71.1-x86_64-apple-darwin.zip(24.12 MB)
qsv-0.71.1-x86_64-pc-windows-gnu.zip(25.03 MB)
qsv-0.71.1-x86_64-pc-windows-msvc.zip(24.56 MB)
qsv-0.71.1-x86_64-unknown-linux-gnu.zip(24.74 MB)
qsv-0.71.1-x86_64-unknown-linux-musl.zip(25.14 MB)
0.71.0(Nov 8, 2022)
Added

apply: new encode and decode operations in https://github.com/jqnatividad/qsv/pull/569

apply: add ability to show confidence to whatlang language detection in https://github.com/jqnatividad/qsv/pull/579

count: add --width option in https://github.com/jqnatividad/qsv/pull/582

fetch & fetchpost: Added --user_agent option by @minhajuddin2510 in https://github.com/jqnatividad/qsv/pull/565 and https://github.com/jqnatividad/qsv/commit/f59bd8591079f22df3c40e5f036c5e2ff83e77f8

Documented Homebrew installer :rocket: created by @FnControlOption

Changed

apply: refactor operations validation in https://github.com/jqnatividad/qsv/pull/564 and https://github.com/jqnatividad/qsv/commit/f83ec6f7e7fa7bed9bcc2b5e55516a61e5154b52

sortcheck: expand usage text and use fail_clierror macro https://github.com/jqnatividad/qsv/commit/8513b53eaac594d20106b3f77f73f3d1b63e227d

stats: minor refactoring https://github.com/jqnatividad/qsv/commit/38795134e3ed66bf0816eeee2a68aa9b557c4908

tojsonl: it does "smart" conversion of CSV to JSONL https://github.com/jqnatividad/qsv/commit/af98290bf1803ae5ab3e01df5f20f5b007912e02

validate: also show --progressbar when doing schemaless validation https://github.com/jqnatividad/qsv/commit/aae550aa0b1042e205689ae40d19c0532e7ae584

only show enabled commands in command list in https://github.com/jqnatividad/qsv/pull/583

Updated the benchmark script by @minhajuddin2510 in https://github.com/jqnatividad/qsv/pull/581

Switch back to using num_cpus for detecting parallelism https://github.com/jqnatividad/qsv/commit/b7dbed88f7d931e03a835ca4a929328c2c4a34b6

qsv now links against Python 3.11 for the py command in https://github.com/jqnatividad/qsv/pull/576

Bump robinraju/release-downloader from 1.5 to 1.6 by @dependabot in https://github.com/jqnatividad/qsv/pull/568

Bump newline-converter from 0.2.0 to 0.2.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/567

Bump sysinfo from 0.26.5 to 0.26.7 by @dependabot in https://github.com/jqnatividad/qsv/pull/566 and https://github.com/jqnatividad/qsv/pull/572

Bump ahash from 0.8.0 to 0.8.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/575

Bump flexi_logger from 0.24.0 to 0.24.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/574

Bump pyo3 from 0.17.2 to 0.17.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/573

Bump jql from 5.1.1 to 5.1.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/577

Bump num_cpus from 1.13.1 to 1.14.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/580

Bump mimalloc from 0.1.30 to 0.1.31

Bump indicatif from 0.17.1 to 0.17.2

cargo update bump several indirect dependencies

updated rustfmt.toml with comment and string formatting options

bump MSRV to 1.65.0

pin Rust Nightly to 2022-11-07

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.70.0...0.71.0
qsv-0.71.0-aarch64-apple-darwin.zip(12.46 MB)
qsv-0.71.0-aarch64-unknown-linux-gnu.zip(12.60 MB)
qsv-0.71.0-i686-pc-windows-msvc.zip(12.97 MB)
qsv-0.71.0-i686-unknown-linux-gnu.zip(13.67 MB)
qsv-0.71.0-x86_64-apple-darwin.zip(40.87 MB)
qsv-0.71.0-x86_64-pc-windows-gnu.zip(39.95 MB)
qsv-0.71.0-x86_64-pc-windows-msvc.zip(39.28 MB)
qsv-0.71.0-x86_64-unknown-linux-gnu.zip(56.47 MB)
qsv-0.71.0-x86_64-unknown-linux-musl.zip(57.06 MB)
0.70.0(Oct 24, 2022)
Added

apply: additional operations - squeeze0, strip_prefix and strip_suffix https://github.com/jqnatividad/qsv/pull/518 & https://github.com/jqnatividad/qsv/pull/519

apply: add calcconv subcommand, which parses & evaluates math expressions, with support for units & conversions. https://github.com/jqnatividad/qsv/pull/560
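As a rough illustration of what the new apply string operations do to a single cell value, here is a minimal Python sketch. It is not qsv's implementation (qsv is written in Rust). The function names simply mirror the operation names, and reading squeeze0 as "remove all whitespace" is an assumption.

```python
def strip_prefix(value: str, prefix: str) -> str:
    """Remove `prefix` from the start of `value`, if it is there."""
    if prefix and value.startswith(prefix):
        return value[len(prefix):]
    return value


def strip_suffix(value: str, suffix: str) -> str:
    """Remove `suffix` from the end of `value`, if it is there."""
    if suffix and value.endswith(suffix):
        return value[:-len(suffix)]
    return value


def squeeze0(value: str) -> str:
    """Assumed behavior: drop every whitespace character."""
    # str.split() with no arguments splits on any run of whitespace.
    return "".join(value.split())


if __name__ == "__main__":
    print(strip_prefix("Dr. Jane", "Dr. "))
    print(strip_suffix("report.csv", ".csv"))
    print(squeeze0(" a  b\tc "))
```

Values that do not carry the given prefix or suffix are returned unchanged in this sketch.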

Changed

search & searchset: make match count optional https://github.com/jqnatividad/qsv/pull/526

jsonl: remove panic and do proper error handling; add --ignore-errors option https://github.com/jqnatividad/qsv/pull/531

py: py command does not do aggregation (reduce) operations https://github.com/jqnatividad/qsv/pull/548

lua & luajit can do aggregations across CSV rows and py cannot https://github.com/jqnatividad/qsv/pull/549

py: add more complex f-string formatting example https://github.com/jqnatividad/qsv/pull/556

Standardize ignore case option https://github.com/jqnatividad/qsv/pull/535

Use rustfmt nightly to take advantage of advanced features like StdExternalCrate https://github.com/jqnatividad/qsv/pull/514 & https://github.com/jqnatividad/qsv/pull/517

Update benchmark-basic.sh by @minhajuddin2510 in https://github.com/jqnatividad/qsv/pull/542

Use fail macros more consistently https://github.com/jqnatividad/qsv/pull/545

Use Redis ahash feature for performance

Added wix file for future Windows Installer by @minhajuddin2510 in https://github.com/jqnatividad/qsv/pull/546

Bump console from 0.15.1 to 0.15.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/512

Bump pyo3 from 0.17.1 to 0.17.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/522

Bump jql from 5.0.2 to 5.1.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/521

Bump titlecase from 2.2.0 to 2.2.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/534

Bump itoa from 1.0.3 to 1.0.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/533

Bump sysinfo from 0.26.4 to 0.26.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/554

Bump mlua from 0.8.3 to 0.8.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/540

Bump uuid from 1.1.2 to 1.2.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/539

Bump flexi_logger from 0.23.3 to 0.24.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/538

Bump serde_json from 1.0.85 to 1.0.86 by @dependabot in https://github.com/jqnatividad/qsv/pull/537

Bump actions/setup-python from 4.2.0 to 4.3.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/541

Bump filetime from 0.2.17 to 0.2.18 by @dependabot in https://github.com/jqnatividad/qsv/pull/559

Bump redis from 0.21.6 to 0.22.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/555

Bump cached from 0.39.0 to 0.40.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/561

Bump whatlang from 0.16.1 to 0.16.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/562

cargo update bump several indirect dependencies

Pin Rust nightly to 2022-10-22

Fixed

excel: xls float temporary workaround for #516 that was ultimately fixed in PR 558 https://github.com/jqnatividad/qsv/pull/520

tojsonl: escape newlines and double quotes. Fixes #552 https://github.com/jqnatividad/qsv/pull/553

tojsonl: better error handling; when checking stdin for utf8, make sure it's not empty. Fixes #530 https://github.com/jqnatividad/qsv/pull/536
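To see why the tojsonl escaping fix matters: a JSONL file holds one JSON object per line, so a raw newline or an unescaped double quote inside a field value would break the record. The following is a minimal Python sketch that uses the standard json module, not qsv's own code. The helper name to_jsonl_line and the sample row are made up for illustration.

```python
import json


def to_jsonl_line(row: dict) -> str:
    """Serialize one record as a single JSONL line."""
    # json.dumps escapes embedded double quotes and newlines,
    # so the record always stays on one physical line.
    return json.dumps(row)


if __name__ == "__main__":
    # Made-up sample row with both problem characters in one field.
    sample = {"id": "1", "note": 'She said "hi"\nand left'}
    print(to_jsonl_line(sample))
```

Parsing the emitted line with json.loads gives back the original field values, including the embedded quote and newline.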

Removed

excel: removed xls float workaround now that calamine crate has been fixed. Fixes #516 removing need for PR 520 workaround. https://github.com/jqnatividad/qsv/pull/558

removed obsolete Rust Nightly workflow https://github.com/jqnatividad/qsv/commit/2a99318242040300130c323dc3e7df504a6e3b2e

New Contributors

@minhajuddin2510 made their first contribution in https://github.com/jqnatividad/qsv/pull/542

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.69.0...0.70.0
qsv-0.70.0-aarch64-apple-darwin.zip(23.12 MB)
qsv-0.70.0-aarch64-unknown-linux-gnu.zip(12.60 MB)
qsv-0.70.0-i686-pc-windows-msvc.zip(12.86 MB)
qsv-0.70.0-i686-unknown-linux-gnu.zip(13.62 MB)
qsv-0.70.0-x86_64-apple-darwin.zip(38.40 MB)
qsv-0.70.0-x86_64-pc-windows-gnu.zip(39.60 MB)
qsv-0.70.0-x86_64-pc-windows-msvc.zip(38.86 MB)
qsv-0.70.0-x86_64-unknown-linux-gnu.zip(51.96 MB)
qsv-0.70.0-x86_64-unknown-linux-musl.zip(52.55 MB)
0.69.0(Sep 28, 2022)
Added

luajit: new command using LuaJIT, which is much faster than Lua https://github.com/jqnatividad/qsv/pull/500

Changed

python: tweaks. Expanded usage text. Only show python version when logging is on. https://github.com/jqnatividad/qsv/pull/507

fetch & fetchpost: apply clippy recommendation https://github.com/jqnatividad/qsv/commit/dd7220bce2811d9e8248c379af5d5c38da3b02d5

excel: use winfo! macro https://github.com/jqnatividad/qsv/commit/7211ff214a58394d68c8c7484e8ef4505d75b482

Removed anyhow dependency https://github.com/jqnatividad/qsv/pull/508

Bump actions/stale from 5 to 6 by @dependabot in https://github.com/jqnatividad/qsv/pull/505

Bump sysinfo from 0.26.3 to 0.26.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/510

Cargo update bump several indirect dependencies

include Python 3.10 shared libraries when publishing for select platforms

bump MSRV to Rust 1.64.0

Pin Rust nightly to 2022-09-26

Fixed

python: corrected erroneous --helper example. Included hashhelper.py example.

extsort: fixed --help bug (https://github.com/jqnatividad/qsv/issues/506)

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.68.0...0.69.0
qsv-0.69.0-aarch64-apple-darwin.zip(12.09 MB)
qsv-0.69.0-aarch64-unknown-linux-gnu.zip(12.39 MB)
qsv-0.69.0-i686-pc-windows-msvc.zip(12.69 MB)
qsv-0.69.0-i686-unknown-linux-gnu.zip(13.42 MB)
qsv-0.69.0-x86_64-apple-darwin.zip(37.23 MB)
qsv-0.69.0-x86_64-pc-windows-gnu.zip(38.46 MB)
qsv-0.69.0-x86_64-pc-windows-msvc.zip(38.05 MB)
qsv-0.69.0-x86_64-unknown-linux-gnu.zip(50.79 MB)
qsv-0.69.0-x86_64-unknown-linux-musl.zip(51.37 MB)
0.68.0(Sep 16, 2022)
Changed

Simplify python support. For prebuilt binaries, Python 3.10 is now required and the python 3.10 shared libraries are bundled for select platforms. If you require an earlier version of Python (3.6 and up), you'll have to install/compile from source. https://github.com/jqnatividad/qsv/pull/492

Smarter self update. --update can still be explicitly invoked even when self-update feature has been disabled. Further, if you compiled qsv from source, self-update will only notify you of new releases, instead of proceeding with self-update. https://github.com/jqnatividad/qsv/pull/490 and https://github.com/jqnatividad/qsv/pull/493

lua: switch from Lua 5.4 to LuaJIT 2.1, primarily for performance https://github.com/jqnatividad/qsv/pull/495

lua: when filtering using floats, "0.0" is false

join: removed unneeded utf8 check

search: simplify regex_unicode check

fetch & fetchpost: optimize imports; remove unneeded utf8 check

Bump anyhow from 1.0.64 to 1.0.65 by @dependabot in https://github.com/jqnatividad/qsv/pull/498

Bump self_update from 0.31.0 to 0.32.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/499

add additional copyright holder to MIT License

Improved publishing workflow for prebuilt binaries

cargo update bumped several dependencies

pin Rust nightly to 2022-09-14

Fixed

fix typos by @kianmeng in https://github.com/jqnatividad/qsv/pull/491

python: better error handling. When mapping/filtering, python expression errors no longer cause a panic, but instead fail to map/filter as expected (when mapping, "<ERROR>" is returned, when filtering, the filter is not applied), and continue processing. Also, other errors are properly propagated instead of panicking. https://github.com/jqnatividad/qsv/pull/496

lua: better error handling. When mapping/filtering, Lua errors no longer cause a panic, but instead fail to map/filter as expected (when mapping, "<ERROR>" is returned, when filtering, the filter is not applied), and continue processing. https://github.com/jqnatividad/qsv/pull/497
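The error-handling behavior described in the python and lua fixes above can be sketched roughly as follows. This is an illustrative Python sketch, not qsv's code. The helper names map_rows and filter_rows are hypothetical, and it assumes that "the filter is not applied" means the failing row passes through unfiltered.

```python
ERROR_MARKER = "<ERROR>"


def map_rows(rows, expr):
    """Evaluate `expr` per row; on failure emit the marker and keep going."""
    out = []
    for row in rows:
        try:
            out.append(str(expr(row)))
        except Exception:
            # An expression error no longer aborts the whole run.
            out.append(ERROR_MARKER)
    return out


def filter_rows(rows, predicate):
    """Keep rows where `predicate` is truthy; rows that raise are kept."""
    kept = []
    for row in rows:
        try:
            keep = bool(predicate(row))
        except Exception:
            # Assumption: the filter is skipped for this row, so it passes.
            keep = True
        if keep:
            kept.append(row)
    return kept


if __name__ == "__main__":
    rows = [{"a": "4"}, {"a": "x"}, {"a": "0"}]
    print(map_rows(rows, lambda r: int(r["a"]) * 2))
    print(filter_rows(rows, lambda r: int(r["a"]) > 1))
```

In both helpers the row with the non-numeric value raises inside the expression, yet processing continues with the next row.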

New Contributors

@kianmeng made their first contribution in https://github.com/jqnatividad/qsv/pull/491

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.67.0...0.68.0
publish-testing(Sep 17, 2022)

This release is for testing the Release Github Action publishing workflow.

These downloads are not meant for general use.
qsv-publish-testing-aarch64-apple-darwin-testing.zip(12.63 MB)
qsv-publish-testing-i686-pc-windows-msvc-testing.zip(13.05 MB)
qsv-publish-testing-x86_64-apple-darwin-testing.zip(16.16 MB)
qsv-publish-testing-x86_64-pc-windows-gnu-testing.zip(14.11 MB)
qsv-publish-testing-x86_64-pc-windows-msvc-testing.zip(16.75 MB)
qsv-publish-testing-x86_64-unknown-linux-gnu-testing.zip(16.80 MB)
qsv-publish-testing-x86_64-unknown-linux-musl-testing.zip(16.42 MB)
0.67.0(Sep 9, 2022)
Added

added self_update feature, so users can build qsv without self-update engine.
Previously, self-update was enabled by default. https://github.com/jqnatividad/qsv/pull/483 and https://github.com/jqnatividad/qsv/pull/484

Changed

search & searchset: --quick option returns first match row to stderr https://github.com/jqnatividad/qsv/pull/475

python: make --batch size configurable https://github.com/jqnatividad/qsv/pull/485

stats: added more implementation comments; standardize string creation

replace: add conditional compilation to eliminate dead_code warning

lua: when filtering, non-zero integers are true

refactored workdir.rs test helpers

refactored util:init_logger() to log command-line arguments

Bump url from 2.3.0 to 2.3.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/489

Bump anyhow from 1.0.63 to 1.0.64 by @dependabot in https://github.com/jqnatividad/qsv/pull/478

Bump sysinfo from 0.26.1 to 0.26.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/477

Bump robinraju/release-downloader from 1.4 to 1.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/481

cargo update bump indirect dependencies

pin Rust nightly to 2022-09-07

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.66.0...0.67.0
qsv-0.67.0-aarch64-apple-darwin.zip(19.00 MB)
qsv-0.67.0-aarch64-unknown-linux-gnu.zip(12.44 MB)
qsv-0.67.0-arm-unknown-linux-gnueabihf.zip(12.33 MB)
qsv-0.67.0-arm-unknown-linux-musleabihf.zip(12.45 MB)
qsv-0.67.0-i686-pc-windows-msvc.zip(12.57 MB)
qsv-0.67.0-i686-unknown-linux-gnu.zip(13.47 MB)
qsv-0.67.0-x86_64-apple-darwin.zip(8.00 MB)
qsv-0.67.0-x86_64-pc-windows-gnu.zip(6.47 MB)
qsv-0.67.0-x86_64-pc-windows-msvc.zip(35.87 MB)
qsv-0.67.0-x86_64-unknown-linux-gnu.zip(21.04 MB)
qsv-0.67.0-x86_64-unknown-linux-musl.zip(21.13 MB)
0.66.0(Sep 2, 2022)
Added

apply: added Multi-column subcommands by @udsamani in https://github.com/jqnatividad/qsv/pull/462

stats: added --round option https://github.com/jqnatividad/qsv/pull/474

created fail_format! macro for more concise error handling in https://github.com/jqnatividad/qsv/pull/471

Changed

Move command usage text to beginning of cmd source code, so we don't need to move around deeplinks to usage texts from README https://github.com/jqnatividad/qsv/pull/467

Optimize conditional compilation of various qsv binary variants, removing dead code https://github.com/jqnatividad/qsv/pull/473

fetch & fetchpost: removed initial burst of requests, making the commands "friendlier" to rate-limited APIs

search, searchset & replace: minor performance optimizations

created dedicated rustfmt GitHub action workflow to ensure code is always rust formatted. Previously, rustfmt check was in Linux workflow.

applied some clippy recommendations

Bump actix-governor from 0.3.1 to 0.3.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/461

cargo update bumped several dependencies

pin Rust nightly to 2022-08-31

set RUSTFLAGS to emit=asm when compiling pre-built binaries for performance (see http://likebike.com/posts/How_To_Write_Fast_Rust_Code.html#emit-asm)

Fixed

extsort code was being compiled for qsvdp even if it was not enabled

bump sysinfo from 0.25.2 to 0.26.0, fixing segfault on Apple Silicon

fixed qsvnp on Windows so it doesn't look for python shared libraries even if python is not enabled

fixed CliError::Other so it returns bad exitcode (exitcode 1) instead of incorrect_usage (exit code 2)
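The distinction in the CliError::Other fix can be pictured as a small mapping from error kind to process exit code. This is an illustrative Python sketch, not qsv's Rust code. Only the two codes named above (1 for a general failure, 2 for incorrect usage) come from the changelog; the names ErrorKind and exit_code_for are hypothetical.

```python
from enum import Enum


class ErrorKind(Enum):
    OTHER = "other"                      # general failure
    INCORRECT_USAGE = "incorrect_usage"  # bad command-line usage


def exit_code_for(kind: ErrorKind) -> int:
    """Return the exit code a CLI would report for this error kind."""
    if kind is ErrorKind.INCORRECT_USAGE:
        return 2
    # Everything else, including OTHER, is a plain failure.
    return 1


if __name__ == "__main__":
    for kind in ErrorKind:
        print(kind.name, exit_code_for(kind))
```

Keeping the two codes distinct lets a calling script tell a malformed invocation apart from a command that ran and failed.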

New Contributors

@udsamani made their first contribution in https://github.com/jqnatividad/qsv/pull/462

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.65.0...0.66.0
qsv-0.66.0-aarch64-apple-darwin.zip(19.03 MB)
qsv-0.66.0-aarch64-unknown-linux-gnu.zip(12.48 MB)
qsv-0.66.0-arm-unknown-linux-gnueabihf.zip(12.39 MB)
qsv-0.66.0-arm-unknown-linux-musleabihf.zip(12.51 MB)
qsv-0.66.0-i686-pc-windows-msvc.zip(12.65 MB)
qsv-0.66.0-i686-unknown-linux-gnu.zip(13.53 MB)
qsv-0.66.0-x86_64-apple-darwin.zip(35.15 MB)
qsv-0.66.0-x86_64-pc-windows-gnu.zip(36.55 MB)
qsv-0.66.0-x86_64-pc-windows-msvc.zip(36.03 MB)
qsv-0.66.0-x86_64-unknown-linux-gnu.zip(36.13 MB)
qsv-0.66.0-x86_64-unknown-linux-musl.zip(36.72 MB)
0.65.0(Aug 28, 2022)
Added

Major refactoring of main variants - moved redundant code to a new module - clitypes.rs. Added custom exit codes. Removed --exitcode option in several commands as qsv now returns exit codes for ALL commands in a standard way. https://github.com/jqnatividad/qsv/pull/460

Major refactoring of CI test helpers in workdir.rs

Changed

py: use python interning to amortize allocs https://github.com/jqnatividad/qsv/pull/457

search & searchset: return num of matches to stderr; add --quick option; remove --exitcode option https://github.com/jqnatividad/qsv/pull/458

extsort: improved error handling

fetch & fetchpost: better --report option handling https://github.com/jqnatividad/qsv/pull/451

lua: faster number to string conversion using itoa and ryu

replace: removed --exitcode option; return number of replaced patterns to stderr

sortcheck: --json options now always cause full scan of CSV

stats: expanded usage text, explicitly listing stats that require loading the entire CSV into memory. Mentioned data type inferences are guaranteed.

cargo update bumped several dependencies

pin Rust nightly to 2022-08-27

Fixed

py: batched python processing refactor. Instead of using one GILpool for one session, py now processes in batches of 30,000 rows, releasing memory after each batch. This resulted in memory consumption levelling out, instead of increasing to gigabytes of memory with very large files. As an added bonus, this made the py command ~30% faster in testing. 😄 https://github.com/jqnatividad/qsv/pull/456
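The batching pattern behind this fix can be sketched in a few lines. This is a minimal Python illustration of processing rows in fixed-size batches, not qsv's Rust/pyo3 code, and it does not model GIL pool memory behavior. The 30,000 figure comes from the entry above; the helper names batches and process are hypothetical.

```python
from itertools import islice

BATCH_SIZE = 30_000  # batch size named in the changelog entry


def batches(rows, size=BATCH_SIZE):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process(rows, fn, size=BATCH_SIZE):
    """Apply `fn` to every row, holding only one batch in memory at a time."""
    for batch in batches(rows, size):
        results = [fn(row) for row in batch]
        yield from results
        # `batch` and `results` are rebound on the next iteration,
        # so the previous batch becomes eligible for garbage collection.


if __name__ == "__main__":
    doubled = list(process(range(10), lambda x: x * 2, size=3))
    print(doubled)
```

Because only one batch is alive at a time, peak memory depends on the batch size and not on the total number of rows.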

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.64.0...0.65.0
qsv-0.65.0-aarch64-apple-darwin.zip(12.63 MB)
qsv-0.65.0-aarch64-unknown-linux-gnu.zip(12.47 MB)
qsv-0.65.0-arm-unknown-linux-gnueabihf.zip(12.38 MB)
qsv-0.65.0-arm-unknown-linux-musleabihf.zip(12.50 MB)
qsv-0.65.0-i686-pc-windows-msvc.zip(12.63 MB)
qsv-0.65.0-i686-unknown-linux-gnu.zip(13.52 MB)
qsv-0.65.0-x86_64-apple-darwin.zip(35.56 MB)
qsv-0.65.0-x86_64-pc-windows-gnu.zip(36.43 MB)
qsv-0.65.0-x86_64-pc-windows-msvc.zip(35.89 MB)
qsv-0.65.0-x86_64-unknown-linux-gnu.zip(36.02 MB)
qsv-0.65.0-x86_64-unknown-linux-musl.zip(36.53 MB)
0.64.0(Aug 24, 2022)
Added

added sortcheck command https://github.com/jqnatividad/qsv/pull/445

replace: added --exitcode and --progressbar options

Changed

apply: improved usage text

excel: replace --list-sheets option with expanded --metadata option https://github.com/jqnatividad/qsv/pull/448

sortcheck improvements https://github.com/jqnatividad/qsv/pull/447

extsort: improved error handling

progressbar messages are now logged

bump pyo3 from 0.16 to 0.17

bump reqwest & redis "patches" further upstream

cargo update bump several indirect dependencies

pin Rust nightly to 2022-08-22

Fixed

extsort: fixed sysinfo segfault on Apple Silicon by pinning sysinfo to 0.25.2 https://github.com/jqnatividad/qsv/pull/446

tojsonl: fixed panic with stdin input

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.63.2...0.64.0
qsv-0.64.0-aarch64-apple-darwin.zip(12.03 MB)
qsv-0.64.0-aarch64-unknown-linux-gnu.zip(12.48 MB)
qsv-0.64.0-arm-unknown-linux-gnueabihf.zip(12.39 MB)
qsv-0.64.0-arm-unknown-linux-musleabihf.zip(12.51 MB)
qsv-0.64.0-i686-pc-windows-msvc.zip(12.64 MB)
qsv-0.64.0-i686-unknown-linux-gnu.zip(13.53 MB)
qsv-0.64.0-x86_64-apple-darwin.zip(35.75 MB)
qsv-0.64.0-x86_64-pc-windows-gnu.zip(36.58 MB)
qsv-0.64.0-x86_64-pc-windows-msvc.zip(36.05 MB)
qsv-0.64.0-x86_64-unknown-linux-gnu.zip(36.22 MB)
qsv-0.64.0-x86_64-unknown-linux-musl.zip(36.72 MB)
0.63.2(Aug 18, 2022)
🥳 🎉 100th release of qsv! 🎆 💯

Added

fetchpost: added formdata to report https://github.com/jqnatividad/qsv/pull/434

search & searchset: added --exitcode option https://github.com/jqnatividad/qsv/pull/439

search & searchset: added --progressbar option

progressbars are now optional and off by default; added QSV_PROGRESSBAR env var to override this setting

search, searchset & replace: added mem-limit options for regex-powered commands https://github.com/jqnatividad/qsv/pull/440

Changed

Bump jql from 4.0.7 to 5.0.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/436

progressbars are now off by default, and are disabled with stdin input https://github.com/jqnatividad/qsv/pull/438

lua & py: improved error-handling when loading script files

stats: changed to using AtomicBool instead of OnceCell, and use with_capacity in the hot compute loop to minimize allocs - hyperfine shows an 18% perf increase with these changes

qsv now uses custom exit codes introduced in Rust 1.61

self-update now gives a proper error message when GitHub is rate-limiting updates

cargo update bump several dependencies

document MSRV policy

pin Rust Nightly to 2022-08-16

Fixed

fixed stdin input causing an error when progressbars are enabled https://github.com/jqnatividad/qsv/pull/438

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.62.0...0.63.2

ℹ️ NOTE: 0.63.0 and 0.63.1 were yanked due to publishing/packaging errors
qsv-0.63.2-aarch64-apple-darwin.zip(12.01 MB)
qsv-0.63.2-aarch64-unknown-linux-gnu.zip(12.45 MB)
qsv-0.63.2-arm-unknown-linux-gnueabihf.zip(12.37 MB)
qsv-0.63.2-arm-unknown-linux-musleabihf.zip(12.48 MB)
qsv-0.63.2-i686-pc-windows-msvc.zip(12.62 MB)
qsv-0.63.2-i686-unknown-linux-gnu.zip(13.51 MB)
qsv-0.63.2-x86_64-apple-darwin.zip(35.79 MB)
qsv-0.63.2-x86_64-pc-windows-gnu.zip(36.52 MB)
qsv-0.63.2-x86_64-pc-windows-msvc.zip(36.04 MB)
qsv-0.63.2-x86_64-unknown-linux-gnu.zip(36.23 MB)
qsv-0.63.2-x86_64-unknown-linux-musl.zip(36.73 MB)
0.62.0(Aug 12, 2022)
Added

fetchpost: new command that uses HTTP POST, as opposed to fetch, which uses HTTP GET https://github.com/jqnatividad/qsv/pull/431

Added qsvnp binary variant to prebuilt binaries - qsv with all the features EXCEPT python

Changed

fetch: refactor report parameter processing https://github.com/jqnatividad/qsv/pull/426

Bump serde from 1.0.142 to 1.0.143 by @dependabot in https://github.com/jqnatividad/qsv/pull/423

Bump ahash from 0.7.6 to 0.8.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/425

Bump serial_test from 0.8.0 to 0.9.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/428

Bump anyhow from 1.0.60 to 1.0.61 by @dependabot in https://github.com/jqnatividad/qsv/pull/427

Bump sysinfo from 0.25.1 to 0.25.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/429

Bump actix-governor from 0.3.0 to 0.3.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/430

cargo update bump various indirect dependencies

pin Rust nightly to 2022-08-11

change MSRV to 1.63

Fixed

excel: fixed empty sheet handling https://github.com/jqnatividad/qsv/pull/422

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.61.4...0.62.0
qsv-0.62.0-aarch64-apple-darwin.zip(12.00 MB)
qsv-0.62.0-aarch64-unknown-linux-gnu.zip(12.44 MB)
qsv-0.62.0-arm-unknown-linux-gnueabihf.zip(12.35 MB)
qsv-0.62.0-arm-unknown-linux-musleabihf.zip(12.47 MB)
qsv-0.62.0-i686-pc-windows-msvc.zip(12.61 MB)
qsv-0.62.0-i686-unknown-linux-gnu.zip(13.50 MB)
qsv-0.62.0-x86_64-apple-darwin.zip(35.77 MB)
qsv-0.62.0-x86_64-pc-windows-gnu.zip(36.49 MB)
qsv-0.62.0-x86_64-pc-windows-msvc.zip(36.06 MB)
qsv-0.62.0-x86_64-unknown-linux-gnu.zip(36.21 MB)
qsv-0.62.0-x86_64-unknown-linux-musl.zip(36.71 MB)
0.61.4(Aug 8, 2022)
Changed

py: qsv uses the present working directory to find python shared library

py: show python version info on startup

publish qsvnp - another binary variant with all features except python

bumped once_cell from 1.12 to 1.13

use reqwest upstream, with its MSRV bumped from 1.49 to 1.56 and lazy_static replaced by once_cell

update calamine fork with chrono time feature disabled

reformatted Cargo.toml with BetterTOML

pin Rust nightly to 2022-08-06

Fixed

excel: remove unneeded checkutf8 for writer

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.61.2...0.61.4
qsv-0.61.4-aarch64-apple-darwin.zip(11.94 MB)
qsv-0.61.4-aarch64-unknown-linux-gnu.zip(12.41 MB)
qsv-0.61.4-arm-unknown-linux-gnueabihf.zip(12.34 MB)
qsv-0.61.4-arm-unknown-linux-musleabihf.zip(12.45 MB)
qsv-0.61.4-i686-pc-windows-msvc.zip(12.58 MB)
qsv-0.61.4-i686-unknown-linux-gnu.zip(13.47 MB)
qsv-0.61.4-x86_64-apple-darwin.zip(35.60 MB)
qsv-0.61.4-x86_64-pc-windows-gnu.zip(36.38 MB)
qsv-0.61.4-x86_64-pc-windows-msvc.zip(35.95 MB)
qsv-0.61.4-x86_64-unknown-linux-gnu.zip(36.09 MB)
qsv-0.61.4-x86_64-unknown-linux-musl.zip(36.60 MB)