Hopper

Hopper is a tool that automatically generates fuzzing test cases for libraries using interpretative fuzzing. It transforms the problem of library fuzzing into the problem of interpreter fuzzing, which lets it explore a vast range of API usages out of the box. Key features of Hopper include:

  • Interpretative API invoking without any fuzz driver.
  • Type-aware mutation for arguments.
  • Automatic learning of intra- and inter-API constraints.
  • Binary instrumentation support.

To learn more about Hopper, check out our paper at CCS '23.

Build Hopper

Build Requirements

  • Linux-amd64 (Tested on Ubuntu 20.04 and Debian Buster)
  • Rust stable (>= 1.60), can be obtained using rustup
  • Clang (>= 5.0, see Install Clang). rust-bindgen uses libclang to preprocess, parse, and type-check C and C++ header files.

Build Hopper itself

./build.sh

The script creates an install directory in Hopper's root directory, after which you can use hopper. To use the command from anywhere, add your project directory to your PATH variable.

Using Docker

Alternatively, you can use the Dockerfile, which builds the requirements and Hopper.

docker build -t hopper ./
docker run --name hopper_dev --privileged -v /path-to-lib:/fuzz -it --rm hopper /bin/bash

Compile library with Hopper

Take cJSON as an example (More examples).

hopper compile --header ./cJSON.h --library ./libcjson.so --output output

Use hopper compile --help to see detailed usage. If compilation reports errors about the header file, refer to the documentation of rust-bindgen, which Hopper uses to parse header files. You may need to wrap the header file in a new one that adds the missing definitions. Hopper uses E9Patch to instrument binaries by default.

After running compile, you will find that it generates the following files in the output directory:

  • bin/hopper-fuzzer: generates inputs, maintains states, and uses the harness to execute the inputs.
  • bin/hopper-harness: executes the inputs.
  • bin/hopper-translate: translates inputs to C source code.
  • bin/hopper-generator: replays the generation process.
  • bin/hopper-sanitizer: sanitizes and minimizes crashes.

Header files

  • If there are multiple header files, you can create a new header file that includes all of them.
  • If the header files are compiled differently depending on specific environment variables, you can set them via BINDGEN_EXTRA_CLANG_ARGS.
  • If the header file includes API functions that you do not want to test, use --func-pattern to filter them while running the fuzzer.

Environment variable for compiling

  • HOPPER_MAP_SIZE_POW2: controls the size of the coverage map, as a power of two. The default value is 16, and it should be in the range [16, 20]. e.g. HOPPER_MAP_SIZE_POW2=18.
  • HOPPER_INST_RATIO: controls how likely a block will be chosen for instrumentation. The default value is 100, and it should be in the range of (0, 100]. e.g. HOPPER_INST_RATIO=75.
  • HOPPER_INCLUDE_SEARCH_PATH: sets the search path for files included by the header files. e.g. HOPPER_INCLUDE_SEARCH_PATH=../.
  • HOPPER_FUNC_BLACKLIST: lists functions that Hopper won't compile. bindgen will not generate code for these functions. e.g. HOPPER_FUNC_BLACKLIST=f1,f2.
  • HOPPER_TYPE_BLACKLIST: lists types that Hopper won't compile. bindgen will not generate code for these types. e.g. HOPPER_TYPE_BLACKLIST=type1,type2.
  • HOPPER_ITEM_BLACKLIST: lists items (constants/variables) that Hopper won't compile. bindgen will not generate code for these items. e.g. HOPPER_ITEM_BLACKLIST=IPPORT_RESERVED.
  • HOPPER_CUSTOM_OPAQUE_LIST: lists custom types that should be treated as opaque. e.g. HOPPER_CUSTOM_OPAQUE_LIST=type1.

Tips

  • You can set the arguments and environment variables for compiling and running in a configuration file named hopper.config, see examples/* for details.

  • Reduce density: If the density is larger than 20%, the edge IDs are likely to have hash collisions. You can either a) increase HOPPER_MAP_SIZE_POW2 or b) reduce HOPPER_INST_RATIO.

  • Multiple libraries: either (1) merge the archives into one shared library, e.g. gcc -shared -o c.so -Wl,--whole-archive a.a b.a -Wl,--no-whole-archive; or (2) pass all of them to hopper compile with --library a.so b.so.
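
To make the density tip concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from Hopper's source: it assumes that the coverage map has 2^HOPPER_MAP_SIZE_POW2 slots and that edge IDs are hashed uniformly at random into it. The function names and the edge count used in the usage note are hypothetical.

```python
def expected_density(num_edges: int, map_size_pow2: int) -> float:
    """Expected fraction of occupied slots when num_edges edge IDs are
    hashed uniformly at random into 2**map_size_pow2 slots (an assumption)."""
    slots = 2 ** map_size_pow2
    return 1.0 - (1.0 - 1.0 / slots) ** num_edges


def smallest_map_size_pow2(num_edges: int, threshold: float = 0.20,
                           lo: int = 16, hi: int = 20):
    """Smallest value in [lo, hi] whose expected density stays at or below
    the threshold, or None if even the largest map is too dense."""
    for pow2 in range(lo, hi + 1):
        if expected_density(num_edges, pow2) <= threshold:
            return pow2
    return None
```

For a hypothetical library with 40,000 instrumented edges, calling `smallest_map_size_pow2(40_000)` under these assumptions suggests that the default of 16 is too small and that a larger HOPPER_MAP_SIZE_POW2 is needed to stay under 20% density.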

Fuzz Library with Hopper

hopper fuzz output --func-pattern cJSON_*

Use hopper fuzz output --help to see detailed usage.

After running fuzz, Hopper generates the following directories.

  • queue: generated normal inputs.
  • hangs: generated timeout inputs.
  • crashes: generated crash inputs.
  • misc: stores temporary files and stats.

Environment variable for running

  • DISABLE_CALL_DET: disables deterministic mutation of calls.
  • DISABLE_GEN_FAIL: disables generating programs for functions that have failed to be invoked.
  • HOPPER_SEED_DIR: provides seeds for byte-like arguments (default: output/seeds if it exists).
  • HOPPER_DICT: provides dictionary for byte-like arguments. The grammar is the same as AFL's.
  • HOPPER_API_INSENSITIVE_COV: disables API-sensitive branch counting.
  • HOPPER_FAST_EXECUTE_LOOP: the number of programs executed (in a loop) for each fork. Set it to 0 or 1 to disable the loop. e.g. HOPPER_FAST_EXECUTE_LOOP=10.
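
Since HOPPER_DICT uses AFL's dictionary grammar, a dictionary can be produced with a few lines of Python. The sketch below only formats entries in the `name="value"` form used by AFL dictionaries and hex-escapes every byte that is not plain printable ASCII (including quotes and backslashes). The helper name and the token names in the usage note are hypothetical, and it is an assumption that HOPPER_DICT expects a path to such a file.

```python
def afl_dict_entry(name: str, token: bytes) -> str:
    """Format one AFL-style dictionary line: name="value".

    Bytes outside printable ASCII, as well as the double quote and the
    backslash, are written as \\xNN escapes.
    """
    escaped = "".join(
        chr(b) if 0x20 <= b < 0x7F and b not in (0x22, 0x5C) else f"\\x{b:02X}"
        for b in token
    )
    return f'{name}="{escaped}"'


def afl_dict(tokens: dict) -> str:
    """Join several entries into the text of a dictionary file."""
    return "\n".join(afl_dict_entry(n, t) for n, t in tokens.items()) + "\n"
```

For example, `afl_dict({"kw_true": b"true", "kw_null": b"null"})` yields text that could be saved as, say, `json.dict` and then referenced through HOPPER_DICT when running hopper fuzz.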

System configuration

Configure system core dumps in the same way as for AFL (on the host, if you run Hopper in a Docker container).

echo core | sudo tee /proc/sys/kernel/core_pattern

Function pattern

Hopper generates inputs for all functions in the libraries by default. However, there are two ways to filter functions in Hopper: excluding functions or including functions. This way, Hopper can focus on the interesting functions.

--func-pattern

hopper fuzz output --func-pattern @cJSON_parse,!cJSON_InitHook,cJSON_*
  • The pattern can be a function name, e.g. cJSON_parse, or a simple pattern, e.g. cJSON_*.
  • If you have multiple patterns, join them with commas, e.g. cJSON_*,HTTP_*.
  • Use the @ prefix to restrict the fuzzer to specific functions only, while the other matched functions remain candidates for providing values for fields or arguments, e.g. @cJSON_parse,cJSON_*.
  • Use the ! prefix to exclude specific functions, e.g. !cJSON_InitHook,cJSON_*.
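
The rules above can be illustrated with a small Python sketch. It is a reading of the documented behavior, not Hopper's actual implementation: the function names are hypothetical, and shell-style matching via `fnmatchcase` is an assumption about what a "simple pattern" means.

```python
from fnmatch import fnmatchcase


def parse_func_pattern(spec: str):
    """Split a --func-pattern value into (targets, excludes, includes)."""
    targets, excludes, includes = [], [], []
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue
        if item.startswith("@"):
            targets.append(item[1:])
        elif item.startswith("!"):
            excludes.append(item[1:])
        else:
            includes.append(item)
    return targets, excludes, includes


def classify(func_name: str, spec: str) -> str:
    """Return 'excluded', 'target', 'candidate', or 'ignored' for a function."""
    targets, excludes, includes = parse_func_pattern(spec)

    def matches(patterns):
        return any(fnmatchcase(func_name, p) for p in patterns)

    if matches(excludes):
        return "excluded"
    if matches(targets):
        return "target"
    if matches(includes):
        # Without any @ targets, every included function is fuzzed directly.
        return "candidate" if targets else "target"
    return "ignored"
```

With the pattern `@cJSON_parse,!cJSON_InitHook,cJSON_*`, this sketch treats cJSON_parse as the fuzzing target, drops cJSON_InitHook, and keeps every other cJSON_* function as a value provider.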

--custom-rules

The patterns can be defined in the file passed by --custom-rules.

// hopper fuzz output --custom-rules path-to-file
func_target cJSON_parse
func_exclude cJSON_InitHook
func_include cJSON_*,HTTP_*

Constraints

Hopper infers both intra- and inter-API constraints so that it can invoke the APIs correctly.
The constraints are written to output/misc/constraint.config. You can remove the file to reset the constraints. Additionally, users can define a file that describes custom constraints for API invocations and pass it via --custom-rules. The custom constraints override the inferred ones.

// hopper fuzz output --custom-rules path-to-file
// Grammar: 
// func, type : prefix for adding a rule for function or type
// $[0-9]+    : function's i-th argument, or index in array
// [a-zA-Z_]+ : object field
// 0, 128 ..  : integer constants
// "xxxx"     : string constants
// methods    : $len, $range, $null, $non_null, $need_init, $read_file, $write_file, $ret_from, $cast_from, $use, $arr_len, $opaque, $len_factors
// others     :  pointer(&) , option(?), e.g &.$0.len,  `len` field in the pointer's first element
//
// Set one argument in a function to be specific constant
func test_add[$0] = 128
// One argument must be the length of another one
func test_arr[$1] = $len($0)
// Or one field must be the length of another field
func test_arr[$0][len] = $len([$0][name])
// One argument must be in a certain range
func test_arr[$1] = $range(0, $len($0))
// Argument should be non-null
func test_non_null[$0] = $non_null
// Argument should be null
func test_null[$0] = $null
// Argument should be specific string
func test_magic[$0] = "magic"
// Argument should be a file and the file will be read
func test_path[$0] = $read_file
// Argument should use the return value of a specific function
func test_use[$0] = $ret_from(test_create)
// Argument should be cast to a specific type for a void pointer. The type should start with *mut or *const.
func test_void[$0] = $cast_from(*mut u8)
// The array is supposed to have a minimal length
func test_void[$0][&] = $arr_len(256)
// The array's length is formed by the factors
func fread[$0][&] = $len_factors(1, $2)
// Or
func gzfread[$0][&] = $len_factors($1, $2)
// Field in argument should be specific constant
func test_field[$0][len] = 128
// Deeper fields
func test_field[$0][&.elements.$0] = 128

// One field `len` in a type must be the length of another field `p`
type ArrayWrap[len] = $len(p)
// One nested union `inner_union` in a type must be set to `member2` 
type ComplicatedStruct[inner_union] = $use(member2)
// The type is opaque and is only used through an opaque pointer
type Partial = $opaque
// A type should be initialized with a specific function
type Partial = $init_with(test_init, 0)

// ctx: set context for specific function
// Add a context for function
ctx test_use[$0] <- test_init
// Add implicit context
ctx test_use[*] <- test_init
// Add an optional context that is preferred
ctx test_use[$0] <- test_init ?
// Add forbidden context
ctx test_use[$0] <- ! test_init

// alias: alias types across different functions
alias handleA <- useA($0),createA($ret),freeA($0)

// assert: add specific assertions for calls
assert test_one == 1
assert test_non_zero != 0

Seeds for bytes arguments

If there is a seeds directory (set by HOPPER_SEED_DIR), Hopper will try to read the files inside it and use them as seeds for bytes arguments (e.g. char*). You can also provide seeds for a specific argument via its parameter name, e.g. create a subdirectory named @buf for the parameter whose name is buf.
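
The layout described above can be prepared with a short script. The following Python sketch only creates directories and files; the helper name and the seed file names are hypothetical, and only the @buf subdirectory convention comes from the text above.

```python
from pathlib import Path


def write_seeds(seed_dir, shared: dict, per_param: dict):
    """Create a seed directory.

    shared:    {file name: bytes} stored directly in seed_dir.
    per_param: {parameter name: {file name: bytes}} stored in seed_dir/@<name>.
    Returns the list of files written.
    """
    root = Path(seed_dir)
    root.mkdir(parents=True, exist_ok=True)
    written = []
    for name, data in shared.items():
        path = root / name
        path.write_bytes(data)
        written.append(path)
    for param, seeds in per_param.items():
        sub = root / f"@{param}"  # e.g. @buf for a parameter named buf
        sub.mkdir(exist_ok=True)
        for name, data in seeds.items():
            path = sub / name
            path.write_bytes(data)
            written.append(path)
    return written
```

For instance, `write_seeds("output/seeds", {"empty.json": b"{}"}, {"buf": {"array.json": b"[1, 2]"}})` would place one general seed in output/seeds and one seed reserved for parameters named buf in output/seeds/@buf.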

Logging

Hopper uses Rust's log crate to print log information. The default log level is INFO. To print all logging information (including DEBUG and TRACE), set the environment variable LOG_TYPE when running Hopper, e.g. LOG_TYPE=trace ./hopper. Detailed logs are written to output/fuzzer_r*.log and output/harness_r*.log.

Reproduce execution

Hopper can reproduce the execution of programs stored in the output directories.

  • hopper-harness can parse and interpret the inputs with Hopper's runtime. It prints the internal states during execution in detail.
./bin/hopper-harness ./queue/id_000000
  • hopper-translate can translate the input to C source code. The C files can be a witness for reporting issues.
./bin/hopper-translate --input ./queue/id_000000  --header path-to/xx.h --output test.c
# then compile it with specific library
gcc -I/path-to-head -L/path-to-lib -l:libcjson.so test.c -o test
  • hopper-generator replays the generation of an input without executing it. You can use it to analyse how the input was generated or mutated.
./bin/hopper-generator ./queue/id_000000
  • hopper-sanitizer can minimize and verify the crashes generated by Hopper. It excludes crashes that violate constraints and deduplicates crashes according to their call stacks.
./bin/hopper-sanitizer
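
When many crashes need to be turned into C witnesses, the hopper-translate invocation shown above can be scripted. The Python sketch below only builds the command lines, using the --input, --header, and --output flags from the example; the helper name, the default binary path, and the `.c` output naming are illustrative assumptions.

```python
from pathlib import Path


def translate_commands(crash_dir, header, out_dir,
                       translate_bin="./bin/hopper-translate"):
    """Build one hopper-translate command (argv list) per file in crash_dir."""
    commands = []
    for crash in sorted(Path(crash_dir).iterdir()):
        if not crash.is_file():
            continue
        out_file = Path(out_dir) / f"{crash.name}.c"
        commands.append([
            translate_bin,
            "--input", str(crash),
            "--header", str(header),
            "--output", str(out_file),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from within the output directory, after which the generated C files can be compiled against the library as shown above.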

Test

Test rust code

  • Run all testcases
RUST_BACKTRACE=1 cargo test -- --nocapture

Testsuite (test libraries)

Real world examples

Evaluating results via source-based code coverage

  • Build the library with Clang's source-based coverage flags, e.g.
export CFLAGS="${CFLAGS:-} -fprofile-instr-generate -fcoverage-mapping -gline-tables-only -g"
make
  • Compile the library with Hopper in cov instrumentation mode, e.g.
hopper compile --instrument cov --header ./cJSON.h --library ./libcjson_cov.so --output output_cov
  • Run the interpreter with all generated seed inputs (SEED_DIR).
# run hopper and use llvm-cov to compute the coverage.
SEED_DIR=./output/queue hopper cov output_cov

Contributing guidelines

We have listed some tasks in Roadmap. If you are interested, please feel free to discuss with us and contribute your code.

Coding

  • Zero cargo check warnings
  • Zero cargo clippy warnings
  • Zero FAILED in cargo test
  • Try to write tests for your code

Profiling

perf record --call-graph=dwarf ./bin/hopper-fuzzer
# use flamegraph directly
perf script | stackcollapse-perf.pl | rust-unmangle | flamegraph.pl > flame.svg
# use inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg

perf produces a huge amount of intermediate data for analysis, so do not run the fuzzer for more than 2 minutes while recording.
