fselect

Find files with SQL-like queries

Why use fselect?

While it isn't meant to fully replace traditional find and ls, fselect has these nice features:

  • SQL-like (not real SQL, but highly relaxed!) grammar easily understandable by humans
  • complex queries
  • aggregate, statistics, date, and other functions
  • search within archives
  • .gitignore, .hgignore, and .dockerignore support (experimental)
  • search by width and height of images, EXIF metadata
  • search by MP3 info
  • search by extended file attributes
  • search by file hashes
  • search by MIME type
  • shortcuts to common file types
  • interactive mode
  • various output formatting (CSV, JSON, and others)

More is under way!

Installation

Latest release from source

  • Install Rust with Cargo and its dependencies to build a binary
  • Run cargo install fselect

Arch Linux

AUR package, thanks to @asm0dey

NixOS

fselect in nixpkgs, thanks to @filalex77

Other Linux

Static build with musl.

Windows 64bit

A statically precompiled binary is available at GitHub downloads.

Windows from Chocolatey

Mac via Homebrew

  • Install brew
  • Run brew install fselect

Mac via MacPorts

  • Install MacPorts
  • Run:
    sudo port selfupdate
    sudo port install fselect
    

Usage

fselect [ARGS] COLUMN[, COLUMN...] [from ROOT[, ROOT...]] [where EXPR] [order by COLUMNS] [limit N] [into FORMAT]
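
The clause order in the grammar above can be illustrated with a small shell sketch. The build_query helper is hypothetical (it is not part of fselect) and covers only the from, where, and limit clauses. It assembles a query string that can then be passed to fselect as a single quoted argument:

```shell
# Hypothetical helper: assemble an fselect query string.
# Only the column list is required; empty clauses are skipped.
build_query() {
    columns=$1
    root=$2
    filter=$3
    limit=$4
    query=$columns
    [ -n "$root" ] && query="$query from $root"
    [ -n "$filter" ] && query="$query where $filter"
    [ -n "$limit" ] && query="$query limit $limit"
    printf '%s\n' "$query"
}

build_query "size, path" /home/user "name = '*.cfg'" 5
# With fselect installed, the result could be run as:
#   fselect "$(build_query "size, path" /home/user "name = '*.cfg'" 5)"
```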

Interactive mode

fselect -i

Documentation

More detailed description. Look at examples first.

Examples

Find temporary or config files (full path and size):

fselect size, path from /home/user where name = '*.cfg' or name = '*.tmp'

Windows users may omit the quotes:

fselect size, path from C:\Users\user where name = *.cfg or name = *.tmp

Or put all the arguments into the quotes like this:

fselect "name from /home/user/tmp where size > 0"

Find files (just names) with any content (size > 0):

fselect name from /home/user/tmp where size gt 0
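
The difference between the quoted and unquoted forms comes from the shell rather than from fselect: in a POSIX shell an unquoted > is an output redirection, which is presumably why the unquoted example uses gt. A small sketch of what a program receives in each case (show_args is a hypothetical helper, not part of fselect):

```shell
# Hypothetical helper: print each argument in brackets, as a program would receive them.
show_args() {
    printf '[%s]' "$@"
    echo
}

# Unquoted: the shell splits on spaces, so the comparison has to avoid > (redirection).
show_args name from /tmp where size gt 0

# Quoted: the whole query arrives as one argument and > is passed through literally.
show_args "name from /tmp where size > 0"
```

The same reasoning applies to glob patterns such as *.cfg, which the shell may expand before fselect sees them unless they are quoted.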

Specify file size, get absolute path, and add it to the results:

cd /home/user
fselect size, abspath from ./tmp where size gt 2g
fselect fsize, abspath from ./tmp where size = 5m
fselect hsize, abspath from ./tmp where size lt 8k

More complex query:

fselect "name from /tmp where (name = *.tmp and size = 0) or (name = *.cfg and size > 1000000)"

Aggregate functions:

fselect "MIN(size), MAX(size), AVG(size), SUM(size), COUNT(*) from /home/user/Downloads"

Formatting functions:

fselect "LOWER(name), UPPER(name), LENGTH(name), YEAR(modified) from /home/user/Downloads"

Get the year of the oldest file:

fselect "MIN(YEAR(modified)) from /home/user"

Use single quotes if you need to address files with spaces:

fselect "path from '/home/user/Misc stuff' where name != 'Some file'"

Regular expressions of Rust flavor are supported:

fselect name from /home/user where path =~ '.*Rust.*'

Negate regular expressions:

fselect "name from . where path !=~ '^\./config'"

Simple globs expand automatically and work with = and != operators:

fselect name from /home/user where path = '*Rust*'

Classic LIKE:

fselect "path from /home/user where name like '%report-2018-__-__???'"

Exact match operators to search with regexps disabled:

fselect "path from /home/user where name === 'some_*_weird_*_name'"

Find files by date:

fselect path from /home/user where created = 2017-05-01
fselect path from /home/user where modified = today
fselect path from /home/user where accessed = yesterday
fselect "path from /home/user where modified = 'apr 1'"
fselect "path from /home/user where modified = 'last fri'"

Be more specific to match all files created between 3 PM and 4 PM:

fselect path from /home/user where created = '2017-05-01 15'

And even more specific:

fselect path from /home/user where created = '2017-05-01 15:10'
fselect path from /home/user where created = '2017-05-01 15:10:30'

Date and time intervals are possible (find everything updated since May 1st):

fselect path from /home/user where modified gte 2017-05-01

Default is current directory:

fselect path, size where name = '*.jpg'

Search within multiple locations:

fselect path from /home/user/oldstuff, /home/user/newstuff where name = '*.jpg'

With minimum and/or maximum depth specified (depth is a synonym for maxdepth):

fselect path from /home/user/oldstuff depth 5 where name = '*.jpg'
fselect path from /home/user/oldstuff mindepth 2 maxdepth 5, /home/user/newstuff depth 10 where name = '*.jpg'

Optionally follow symlinks:

fselect path, size from /home/user symlinks where name = '*.jpg'

Search within archives (currently only zip-archives are supported):

fselect path, size from /home/user archives where name = '*.jpg'

Or in combination:

fselect size, path from /home/user depth 5 archives symlinks where name = '*.jpg' limit 100

Enable .gitignore or .hgignore support:

fselect size, path from /home/user/projects gitignore where name = '*.cpp'
fselect size, path from /home/user/projects hgignore where name = '*.py'        

Search by image dimensions:

fselect "CONCAT(width, 'x', height), path from /home/user/photos where width gte 2000 or height gte 2000"

Find square images:

fselect path from /home/user/Photos where width = height

Find old-school rap MP3 files:

fselect duration, path from /home/user/music where genre = Rap and bitrate = 320 and mp3_year lt 2000  

Shortcuts to common file extensions:

fselect path from /home/user where is_archive = true
fselect path, mime from /home/user where is_audio = 1
fselect path, mime from /home/user where is_book != false
fselect path from /home/user where is_doc != 1
fselect path from /home/user where is_image = false
fselect path from /home/user where is_video != true

Find files with dangerous permissions:

fselect mode, path from /home/user where other_write = true or other_exec = true
fselect mode, path from /home/user where other_all = true

Simple glob-like expressions or even regular expressions on file mode are possible:

fselect mode, path from /home/user where mode = '*rwx'
fselect mode, path from /home/user where mode =~ '.*rwx$'

Find files by owner's uid or gid:

fselect uid, gid, path from /home/user where uid != 1000 or gid != 1000

Or by owner's or group's name:

fselect user, group, path from /home/user where user = mike or group = mike

Find special files:

fselect name from /usr/bin where suid = true
fselect path from /tmp where is_pipe = true
fselect path from /tmp where is_socket = 1

Find files with xattrs, check if particular xattr exists, or get its value:

fselect "path, has_xattrs, has_xattr(user.test), xattr(user.test) from /home/user"

Include arbitrary text as columns:

fselect "name, ' has size of ', size, ' bytes'"

Order results:

fselect path from /tmp order by size desc, name
fselect modified, fsize, path from ~ order by 1 desc, 3

Finally limit the results:

fselect name from /home/user/samples limit 5 

Format output:

fselect size, path from /home/user limit 5 into json
fselect size, path from /home/user limit 5 into csv
fselect size, path from /home/user limit 5 into html
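
CSV output is convenient for piping into standard tools. A minimal sketch, assuming the rows are plain size,path pairs with no quoted fields; the sample rows are made up and sum_sizes is a hypothetical helper, not part of fselect:

```shell
# Hypothetical helper: sum the first CSV column (sizes in bytes) read from stdin.
# Non-numeric values, such as a header row, count as 0.
sum_sizes() {
    awk -F, '{ total += $1 } END { print total + 0 }'
}

# Made-up sample rows standing in for real fselect output
printf '30,./a.ts\n2750,./b.ts\n' | sum_sizes

# With fselect installed, the real output could be piped in instead:
#   fselect size, path from /home/user limit 5 into csv | sum_sizes
```

For plain totals, the built-in SUM(size) aggregate is simpler; a pipeline like this is mainly useful when the rows need further processing.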

License

MIT/Apache-2.0


Supported by JetBrains IDEA open source license

Comments
  • thread 'main' panicked

    thread 'main' panicked at 'called Result::unwrap() on an Err value: Os { code: 63, kind: Other, message: "File name too long" }', libcore/result.rs:945:5 note: Run with RUST_BACKTRACE=1 for a backtrace.

    opened by adals 12
  • Enhancement: directory globbing for `from` clause

    It would be useful to be able to glob the directories specified for from similar to the way bash does it. I.e., it would be nice to:

    fselect abspath from ~/.[a-zA-Z]* where ...
    

    For example, to find executables in my "hidden" directories right now I need to do something like (I think):

    fselect abspath from $(fselect abspath from ~ maxdepth 1 where abspath =~ '^/home/rick/\.[a-zA-Z]+' and is_dir=true into json | jq -r 'map(.AbsPath)|join(",")') where is_file=true and user_exec=true
    

    If instead I try:

    fselect abspath from ~ where abspath =~ '^/home/rick/\.[a-zA-Z]+' and is_file=true and user_exec=true
    

    ... it takes forever because of the very many directories/files fselect would need to parse.

    opened by rickhg12hs 10
  • gitignore option only uses `.gitignore` from current directory

    When using the gitignore option, it only reads the .gitignore file from the current directory. It is pretty common to have multiple tiers of gitignore files affecting different parts of a repo.

    It would be great if fselect could use the repo's true gitignore list. You can get the list of ignored files and folders with git status --ignored --porcelain (git v1.7.6 and later). There are also some other options.

    Looking at gitignore.rs it looks like you are manually parsing the ignore file and basically implementing the logic, so shelling out to git to get the files would be a bit of a change. There is also a Rust libgit2 binding that might be an option, although that is probably even more work and a new dependency.

    opened by giodamelio 7
  • DIRNAME, BASENAME functions?

    Would be useful to have functions that operate on paths using the system-local path separator, so you can extract directory and filename portions of a path.

    Example: /this/is/a/sample/directory.txt

    DIRNAME - /this/is/a/sample
    BASENAME - directory.txt
    FILEEXT - txt

    in progress 
    opened by herbygillot 6
  • Better config path handling

    Instead of hardcoded path C:\Users\user_name\AppData\Roaming\jhspetersson\fselect\config.toml

    It is better if you use %HOME% or even %APPDATA% instead.

    A related improvement is to look for the config file next to the executable first (for portable use), or allow defining location using a switch.

    Thanks

    opened by gd4c 5
  • Add Cargo.lock file

    In the process of making changes to formulae in the Homebrew package manager, I noticed that fselect was one of a handful of Rust binary projects without a Cargo.lock file in version control. The Cargo book recommends the following (source):

    If you’re building an end product, which are executable like command-line tool or an application, or a system library with crate-type of staticlib or cdylib, check Cargo.lock into git.

    More information about the reasoning can be found in the "Why do binaries have Cargo.lock in version control, but not libraries?" section of the Cargo FAQ.

    The Cargo.lock file helps package managers to keep builds reproducible, since cargo install simply uses the latest dependency versions unless the --locked flag is added to the command, in which case it will use the versions outlined in Cargo.lock. Without a Cargo.lock file, there's a chance that a dependency update will break the build sometime in the future, which is something I've already encountered with other Rust binary projects.

    Would you please consider checking Cargo.lock into version control?

    opened by samford 5
  • gitignore doesn't work on Windows

    I am doing a quick test of fselect on Windows 10 on the top-level of an Angular project. I started with the simple request adapted from readme: size, path from . where name = *.ts or name = *.css or name = *.html but it started to collect all files in node_modules. So I changed the request to: size, path from . gitignore where name = *.ts or name = *.css or name = *.html but I get the same.

    1087    .\node_modules\ignore\index.d.ts
    2861    .\node_modules\ipaddr.js\lib\ipaddr.js.d.ts
    110     .\node_modules\is-plain-object\index.d.ts
    97      .\node_modules\isobject\index.d.ts
    

    and so on. In .gitignore, we have the classical paths:

    /build/
    /bin/
    /node_modules
    

    My suspicion is that you check against slashes but get backslashes out of the filesystem API calls. If I change the request to: size, path from ./src where name = *.ts or name = *.css or name = *.html I get a listing of:

    30      ./src\shared\file\index.ts
    2750    ./src\shared\file\item.model.ts
    60      ./src\shared\main\index.ts
    1297    ./src\shared\main\modal.service.ts
    66      ./src\shared\sharing\index.ts
    

    with mixed slashes and backslashes, which isn't very nice... :-) Since forward slashes are well understood overall in Windows, I suggest to normalize on them.

    Thanks for this very interesting and flexible utility.

    opened by PhiLhoSoft 5
  • Colorize output when printing to terminal

    Just like some other file-searching related tools do (fd, ripgrep, exa - to mention some), fselect could use colors too. Since fselect doesn't support command line options, there might be a simple rule, colorize the output if it prints to terminal, but disable colors if the output is redirected.

    Personally, I'd like to see paths colorized according to LS_COLORS variable, that would help to visually distinguish the files by types.


    Thank you for a great tool! :wrench:

    opened by sudo-nice 5
  • Support count(*)

    Would be nice to have built-in support for counting the resultset.

    fselect 'count(*) from node_modules where name = package.json'
    

    gets the total amount of installed packages in a JS project, for example.

    opened by passcod 5
  • Columns are printed in separate lines

    When I select multiple columns, they get printed in separate lines:

    $ fselect name size from /usr depth 1
    lib32
    36864
    etc
    4096
    share
    20480
    lib64
    270336
    include
    69632
    man
    4096
    bin
    139264
    lib
    270336
    local
    4096
    sbin
    139264
    src
    4096
    

    I would have expected a tabular format:

    lib32   36864
    etc     4096
    share   20480
    lib64   270336
    include 69632
    man     4096
    bin     139264
    lib     270336
    local   4096
    sbin    139264
    src     4096
    

    Or, at least, CSV (or even tab separated):

    lib32,36864
    etc,4096
    share,20480
    lib64,270336
    include,69632
    man,4096
    bin,139264
    lib,270336
    local,4096
    sbin,139264
    src,4096
    
    opened by idanarye 5
  • Incorrect MODE returned

    | Product | Version |
    |---|---|
    | Debian | buster |
    | fselect-x86_64-linux-musl.gz | 0.7.6 |


    Query:

    fselect name,mode FROM ./ INTO csv >result.csv
    

    Non matching results:

    | ls -la | fselect |
    |---|---|
    | -rwxr-xr-x 1 xxx xxx 2922344 Jul 6 10:54 fselect | /fselect,-wxr--r-x |
    | -rw-r--r-- 1 xxx xxx 1228202 Jul 6 10:53 fselect-x86_64-linux-musl.gz | fselect-x86_64-linux-musl.gz,------r-- |
    | -rw-r--r-- 1 xxx xxx 756 Jul 6 11:26 result.csv | /result.csv,------r-- |

    opened by CompuRoot 4
  • Print a message if a newer version is available

    Hi everyone! And thank you for this awesome project!

    I found this feature very useful in the GitHub CLI application and npm.

    I think it'll be useful for users who install fselect using cargo. And this will let them know that a new version is available and it's time to update.

    Ready-made crates to do that:

    • https://github.com/mgrachev/update-informer
    • https://github.com/tarikeshaq/update-notifier
    opened by nobsaibot 0
  • Automatically build debs with each release

    Hi! It seems that newer releases don't have .deb assets, perhaps automating the production of binary assets in CI would be worthwhile?

    Several examples from other projects:

    https://github.com/dandavison/delta/blob/master/.github/workflows/cd.yml
    https://github.com/sharkdp/fd/blob/1b71425419a2bfdbbb9b47c52d78ba373e2465d6/.github/workflows/CICD.yml#L224

    opened by ethanhs 0
  • SQLite backend in the future?

    Hi,

    As fselect is SQL-oriented, would a SQLite backend be conceivable as an alternative direct output format for storing and querying fselect results? It would enable many use cases for getting to know a possibly large dataset better. SQLite could be easily updated from a new run of fselect. Importing into SQLite from CSV files is not ideal. What's your feeling about it? Thanks

    opened by jgranduel 0
  • created not working

    In version 0.8.1, fselect path,created from . order by created does not seem to work: the created date is not shown, nor are the results sorted by the created date.

    opened by Dialga 2
  • Suggestion for how to compare contents of two paths?

    This is a fantastic little utility, thanks for putting it together! One of my projects (a statically generated site with over 100k files) has an "audit" requirement that I was wondering whether fselect might fulfill.

    Specifically we need to see if each *.md file in the source directory (e.g. content) has generated one or more files in the destination directory (e.g. public). Would fselect be able to "diff" the entries in the public directory and indicate whether something in the content directory appears in the same path in the public directory?

    Something like below would be ideal but since the documentation says fselect is not real SQL I wanted to see if it was possible anyway :-).

    SELECT path
      FROM public
     WHERE NOT EXISTS (
           SELECT * FROM content 
           WHERE content.path_parent = public.path_parent AND content.path = CONCAT(public.path_without_extn, '.md')) 
       AND name = '*.html' 
    
    opened by shah 0
  • permission denied when using fselect in folder with lost+found

    $ fselect size, abspath from . where size gt 2g
    ./lost+found: Permission denied (os error 13)
    

    I can't even get around it by adding user=myusername or group=mygroup.

    opened by Dialga 0
Releases(0.8.1)