🕵️‍♀️ Find, locate, and query files for ops and security experts ⚡️⚡️⚡️

Recon

Key Features

  • Query with SQL over files - find files using the full power of SQL queries
  • Find content with digests - use SHA256/512, md5, crc32 for duplicates, and other matchers for nontrivial matches to locate artifacts on hosts
  • Find malware or binaries with YARA - use YARA rules for matching against binary files efficiently
  • Finetune your search runtime - you can choose only the processing you need in order to cover more files quickly
  • Build your own scripts - pipe recon results to your own scripts with --xargs
  • Export - use --csv or --json, or upload recon.db to your own servers for analysis

How to Use

$ recon --help
SQL over files with security processing and tests

Usage: recon [OPTIONS]

Options:
  -c, --config <CONFIG_FILE>  Point to a configuration
  -r, --root <ROOT>           Target folder to scan
  -q, --query <SQL>           Query with SQL
  -f, --file <DB_FILE>        Use a specific DB file (file or :inmem: for in memory) [default: recon.db]
  -d, --delete                Clear data: delete existing cache database before running
  -u, --update                Always walk files and update DB before query. Leave off to run query on existing recon.db.
  -a, --all                   Walk all files (dont consider .gitignore)
      --no-spinner            Don't display a spinner for progress
      --xargs                 Output as xargs formatted list
      --json                  Output as JSON
      --csv                   Output as CSV
      --fail-some             Exit code failure if *some* files are found
      --fail-none             Exit code failure if *no* files are found
      --verbose               Show logs
  -h, --help                  Print help information
  -V, --version               Print version information

Query files

By default recon walks the current directory and allows you to perform standard SQL queries against it:

$ recon -q 'select path,is_binary,mode from files limit 5'
┌───────────────┬───────────┬────────────┐
│ path          │ is_binary │ mode       │
├───────────────┼───────────┼────────────┤
│ ./os.rs       │ null      │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./out.rs      │ null      │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./bin/main.rs │ null      │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./config.rs   │ null      │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./lib.rs      │ null      │ -rw-r--r-- │
└───────────────┴───────────┴────────────┘
5 files in 16.872083ms

recon also caches the metadata it indexes in recon.db (an sqlite DB), to avoid re-indexing when you're experimenting with different queries.

To query against the cache (fastest, creates the cache if missing):

$ recon -q <your query>

To always update the cache before runs (good for incomplete index runs or changed folders between runs):

$ recon -u -q <your query>

To delete the cache and recreate it before runs (good for starting from scratch):

$ recon -d -q <your query>

Add processors and/or matchers

Configure processors and matchers when you need compute-heavy data to be available for querying or matching, such as:

  • Digests or checksums (md5, SHAxxx, crc32)
  • Path, content matching
  • YARA rules
  • For more, see config.yaml.example

How do you know which ones you need?

  • If there aren't many files, you can enable all of them and generally not worry about it.
  • If you have a performance constraint (many files and limited time to run a full index), enable only the fields you need to query on.

Let's add binary detection:

source:
  computed_fields:
    is_binary: true

Did you know? The common way to detect whether a file is binary is to read a chunk of data from it (usually 1k bytes) and search for any non-text bytes. In general, there is no built-in OS metadata "flag" that indicates a file is binary.
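The heuristic described above can be sketched in a few lines of Python. This is an illustration only, not recon's actual implementation: the 1024-byte chunk size follows the "1k bytes" figure above, while the function names and the exact set of bytes treated as text are assumptions made for this example.

```python
# Illustrative sketch of chunk-based binary detection; NOT recon's implementation.
# Assumption: "text" means printable ASCII, common control characters
# (bell, backspace, tab, LF, FF, CR, ESC), and high bytes (0x80-0xFF, e.g. UTF-8).
TEXT_BYTES = bytes(
    {7, 8, 9, 10, 12, 13, 27} | set(range(0x20, 0x7F)) | set(range(0x80, 0x100))
)


def chunk_looks_binary(chunk: bytes) -> bool:
    """Return True if the chunk contains bytes that rarely appear in text."""
    if not chunk:
        return False  # treat an empty file as text
    if b"\x00" in chunk:
        return True  # NUL bytes are a strong binary signal
    # Delete every byte considered text; anything left over is suspicious.
    return bool(chunk.translate(None, TEXT_BYTES))


def file_looks_binary(path: str, chunk_size: int = 1024) -> bool:
    """Apply the heuristic to the first chunk_size bytes of a file."""
    with open(path, "rb") as handle:
        return chunk_looks_binary(handle.read(chunk_size))
```

Under this heuristic a SQLite file is flagged as binary, because its header contains a NUL byte within the first chunk.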

After changing the configuration, run with -u to update the local cache database.

$ recon -u -c ../config.yaml -q 'select path,is_binary,mode from files limit 6'
┌───────────────┬───────────┬────────────┐
│ path          │ is_binary │ mode       │
├───────────────┼───────────┼────────────┤
│ ./os.rs       │ 0         │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./out.rs      │ 0         │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./bin/main.rs │ 0         │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./config.rs   │ 0         │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./lib.rs      │ 0         │ -rw-r--r-- │
├───────────────┼───────────┼────────────┤
│ ./recon.db    │ 1         │ -rw-r--r-- │
└───────────────┴───────────┴────────────┘
6 files in 145.81675ms

Download

For macOS:

brew tap rusty-ferris-club/tap && brew install recon

Otherwise, grab a release from the releases page and run recon --help.

Examples

Fully configured examples (open a PR to add your own!):

Example               Purpose
all-processors.yaml   Turn on all processors by default
custom-walking.yaml   Compute fields only for part of the walked directory tree
file-classes.yaml     Configure and classify your own file classes
find-log4shell.yaml   An example for finding the log4shell vulnerability using known digests
using-yara.yaml       Using a simple YARA ruleset for matching

FAQ

What can I use this tool for?

Recon is a multi-purpose tool for indexing, searching, and querying file metadata, with a focus on content types.

You can use it as:

  • A better find: locate files with more expressive SQL queries.
  • A forensics tool: locate files matching advanced or simple criteria from machines that have issues.
  • A reconnaissance tool for security experts: explore infected or suspected machines, and discover and capture malware or harmful content.

What are some example queries?

Make sure you have a config file which configures computed fields for digests:

# ...
  computed_fields:
    byte_type: true
    is_binary: true
    file_magic: true
    # crc32: true
    sha256: true
    sha512: true
    simhash: true

You can try any of these:

Show binary files only:

$ recon -c config.yaml -q 'select path from files where is_binary = 1'

Create a list of file digests:

$ recon -c config.yaml -q 'select path,sha256,sha512 from files'
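To make the digest columns more concrete, here is a minimal Python sketch of how a SHA-256 digest can be computed in chunks and checked against a set of known digests, which is the idea behind digest-based matching. It is not recon's code: the function names are hypothetical, and the assumption that digests are compared as lowercase hex strings is mine, not something the tool's documentation states.

```python
import hashlib
from typing import Iterable, Iterator, Set


def file_chunks(path: str, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield a file's contents in fixed-size chunks to keep memory use flat."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(chunk_size)
            if not block:
                return
            yield block


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Feed chunks into SHA-256 and return the lowercase hex digest."""
    digest = hashlib.sha256()
    for block in chunks:
        digest.update(block)
    return digest.hexdigest()


def is_known(digest_hex: str, known_digests: Set[str]) -> bool:
    """Check a digest against a set of known lowercase hex digests."""
    return digest_hex.lower() in known_digests
```

For a file on disk, the two helpers compose as sha256_hex(file_chunks(path)).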

Export everything:

$ recon -c config.yaml -q 'select * from files' --csv

How can I configure cache behavior?

Recon uses a sqlite database file for caching, query, and capture.

The default is to use a file called recon.db. You can pick your own database file name:

$ recon -f DB_FILE -q <query>

To use an in-memory database, pass the special file name :memory: as the DB file:

$ recon -f :memory: -q <query>

Alternatively, you can use an in-memory database by giving a full URL to your database via the DATABASE_URL environment variable. This environment variable takes precedence over specifying a file.

$ DATABASE_URL=sqlite::memory: recon <..args..>

How can I see what fields to select from?

You can view the full schema here, or issue an ad-hoc sqlite query:

$ recon -q 'PRAGMA table_info(files)'
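Because the cache is a SQLite database, the same PRAGMA can also be issued from any SQLite client. The sketch below uses Python's standard sqlite3 module. The stand-in table is hypothetical: it only mirrors the three columns shown in the example output earlier (path, is_binary, mode), and the column types are guesses rather than recon's real schema.

```python
import sqlite3
from typing import List, Tuple


def list_columns(conn: sqlite3.Connection, table: str = "files") -> List[Tuple[str, str]]:
    """Return (column name, declared type) pairs using PRAGMA table_info."""
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]


def make_standin_db() -> sqlite3.Connection:
    """Build a small in-memory table for demonstration (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT, is_binary INTEGER, mode TEXT)")
    conn.execute("INSERT INTO files VALUES ('./recon.db', 1, '-rw-r--r--')")
    return conn
```

Against a real cache you would pass sqlite3.connect("recon.db") instead of the stand-in connection.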
What are computed fields and how can I pick them?

Computed fields are fields that are either compute-intensive or not always needed, so you enable them explicitly in the configuration. A fully annotated example configuration is available in config.yaml.example.

You can run with a configuration file:

$ recon -c your-config.yaml

How can I use this in shell scripts?

You can pipe results to xargs to run follow-up actions on the files you discover:

$ recon -q <your query> --xargs | xargs -I {} echo {}
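For scripted checks, the --fail-some flag from the help output can turn a query into a pass/fail gate. The Python sketch below shows one way to wrap that. The helper names are hypothetical, and it only assumes that a non-zero exit status means the gate failed; the exact exit code is not documented here, and a non-zero status could also come from an unrelated error.

```python
import subprocess
from typing import Callable, List, Optional


def build_recon_command(
    query: str, config: Optional[str] = None, fail_some: bool = False
) -> List[str]:
    """Assemble a recon invocation from flags listed in `recon --help`."""
    cmd = ["recon"]
    if config is not None:
        cmd += ["-c", config]
    cmd += ["-q", query]
    if fail_some:
        cmd.append("--fail-some")
    return cmd


def gate_passes(cmd: List[str], run: Callable = subprocess.run) -> bool:
    """Run the command and report whether it exited with status 0."""
    return run(cmd).returncode == 0
```

The run parameter exists so the gate can be exercised with a fake runner in tests, without recon installed.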
How can I capture remote state?

You can send query results off a machine as JSON:

$ recon -q <your query> --json | your-http-post-command

Alternatively, export CSV with --csv, or copy the recon.db file off the machine as-is.
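For the JSON route, the your-http-post-command placeholder could be a small script such as the Python sketch below. The collector URL is hypothetical, and the script forwards whatever bytes arrive on standard input without parsing them, so it makes no assumption about the shape of recon's JSON output.

```python
import sys
import urllib.request

# Hypothetical endpoint; replace with your own collector.
COLLECTOR_URL = "https://collector.example.invalid/recon"


def build_request(url: str, payload: bytes) -> urllib.request.Request:
    """Wrap raw JSON bytes in an HTTP POST request."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_stdin(url: str = COLLECTOR_URL, timeout: float = 10.0) -> int:
    """Read all of stdin, POST it to url, and return the HTTP status code."""
    payload = sys.stdin.buffer.read()
    with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as response:
        return response.status
```

Saved as, say, post_results.py with a final line calling post_stdin(), it would sit at the end of the pipe in place of your-http-post-command.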

How do I run on large folders or complete disks?

recon can run without interruption on very large folders. Each run consists of two stages:

  1. Walking the file system. The total number of files is not known in advance, so recon shows the number of files indexed and the time elapsed.
  2. Processing compute-intensive fields. The list of target files is known at this point, so recon shows a progress bar.

You can interrupt walking the file system and resume it later. Use the -u flag to make recon always update the DB before querying; this also makes it resume any unprocessed files.

$ recon -u <other args>

Contributing

We are accepting PRs. Feel free to submit one.

To all Contributors - you make this happen, thanks!

License

Copyright (c) 2022 @jondot. See LICENSE for further details.
