Binary coverage tool without binary modification for Windows

Summary

Mesos is a tool to gather binary code coverage on all user-land Windows targets without need for source or recompilation. It also provides an automatic mechanism to save a full minidump of a process if it crashes under mesos.

Mesos is technically just a really fast debugger, capable of handling tens of millions of breakpoints. Using this debugger, we apply breakpoints to every single basic block in a program. These breakpoints are removed as they are hit. Thus, mesos converges to 0-cost coverage as gathering coverage only has a cost the first time the basic block is hit.
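The one-shot breakpoint idea can be sketched as a small simulation. This is a model of the concept only, not mesos's actual implementation; the class and method names are hypothetical.

```python
class OneShotCoverage:
    """Simulates one-shot coverage breakpoints (hypothetical model, not the mesos API)."""

    def __init__(self, block_addresses):
        # Breakpoints still installed, one per basic block.
        self.pending = set(block_addresses)
        # Blocks that have been observed at least once.
        self.covered = set()
        # Number of times the (expensive) breakpoint handler ran.
        self.handler_calls = 0

    def execute(self, address):
        """Simulate the target executing the basic block at `address`."""
        if address in self.pending:
            # First hit: pay the debugger round-trip once, then remove the breakpoint.
            self.handler_calls += 1
            self.pending.remove(address)
            self.covered.add(address)
        # Otherwise the block runs with no debugger involvement at all.


cov = OneShotCoverage([0x1000, 0x1010, 0x1020])
for addr in [0x1000, 0x1010, 0x1000, 0x1000, 0x1010]:
    cov.execute(addr)

print(sorted(hex(a) for a in cov.covered))
print(cov.handler_calls, len(cov.pending))
```

Five block executions trigger the handler only twice, which is why the cost converges to zero once the reachable blocks have all been seen.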

Why?

This is effectively the successor of my 5+ year old Chrome IPC fuzzer. It doesn't have any fuzz components in it, but it is a high-performance debugger. This debugger can apply millions of breakpoints to gather coverage, and handle thousands of breakpoints per second to modify memory to inject inputs.

This strategy has worked out well for me historically and still is my go-to tooling for fuzzing targets on live systems.

Out of the box it can be used to gather simple code coverage but it's designed to be easily modified to add fast breakpoint handlers to inject inputs. For example, put a breakpoint after NtReadFile() returns and modify the buffer in flight. I used this in Chrome to modify inbound IPC traffic in the browser.

Features

Code coverage

code coverage

Automatic full minidump saving

Crash being saved

IDA Coloring

IDA gettin colored up

Quick Usage Guide

Set %PATH% such that idat64.exe is in it:

path %PATH%;"C:\Program Files\IDA 7.2"

Generate mesos (the first time will be slow):

powershell .\offline_meso.ps1 <pid>
python generate_mesos.py process_ida

Gather coverage on target!

cargo build --release
target\release\mesos.exe <pid>

Applying 1.6 million breakpoints? No big deal.

C:\dev\mesos>target\release\mesos.exe 13828
mesos  is 64-bit: true
target is 64-bit: true
[      0.003783] Applied       5629 breakpoints (      5629 total breakpoints) notepad.exe
[      0.028071] Applied      61334 breakpoints (     66963 total breakpoints) ntdll.dll
[      0.035298] Applied      25289 breakpoints (     92252 total breakpoints) kernel32.dll
[      0.058815] Applied      55611 breakpoints (    147863 total breakpoints) kernelbase.dll
...
[      0.667417] Applied      11504 breakpoints (   1466344 total breakpoints) oleacc.dll
[      0.676151] Applied      19557 breakpoints (   1485901 total breakpoints) textinputframework.dll
[      0.705431] Applied      66650 breakpoints (   1552551 total breakpoints) coreuicomponents.dll
[      0.717276] Applied      25202 breakpoints (   1577753 total breakpoints) coremessaging.dll
[      0.720487] Applied       7557 breakpoints (   1585310 total breakpoints) ntmarta.dll
[      0.732045] Applied      28569 breakpoints (   1613879 total breakpoints) iertutil.dll

API

Currently this tool has a debugger lib you can easily bring in and start using to make custom debuggers. However this API is not finalized yet. I suggest not building anything using it yet as it may change very quickly. Once this message is removed it's probably stable :)

Performance

  • We can register breakpoints (request that they be applied at module load) at about 6 million/second
  • We can apply them (actually install the breakpoints into the target) at about 3 million/second
  • We can clear breakpoints at about 15 million/second
  • We can hit and handle about 10k breakpoints/second

Given breakpoints are cleared as they're hit for coverage, that means you can observe about 10k new blocks per second. Once a breakpoint has been hit, it no longer has a performance cost!

C:\dev\mesos\examples\benchmark>cargo run --release
    Finished release [optimized + debuginfo] target(s) in 0.03s
     Running `target\release\benchmark.exe`
mesos  is 64-bit: true
target is 64-bit: true
Registered    1000000 breakpoints in   0.162230 seconds |  6164072.8 / second
Applied       1000000 breakpoints in   0.321347 seconds |  3111897.0 / second
Cleared       1000000 breakpoints in   0.067024 seconds | 14920028.6 / second
Hit            100000 breakpoints in  10.066440 seconds |     9934.0 / second
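As a back-of-the-envelope illustration of what these rates mean in practice, the sketch below estimates the one-time debugger overhead for a target. It uses the rounded rates from the list above; the breakpoint and block counts are hypothetical, and real runs will also include module load time and other costs not modeled here.

```python
# Rounded rates from the performance list (breakpoints per second).
APPLY_RATE = 3_000_000
HIT_RATE = 10_000


def estimate_overhead_seconds(total_breakpoints, blocks_reached):
    """Rough one-time cost: install every breakpoint, then handle each first hit."""
    apply_cost = total_breakpoints / APPLY_RATE
    hit_cost = blocks_reached / HIT_RATE
    return apply_cost + hit_cost


# Hypothetical target: 1.5 million breakpoints, 150,000 blocks actually executed.
overhead = estimate_overhead_seconds(1_500_000, 150_000)
print(f"{overhead:.1f} seconds of one-time debugger overhead")
```

The hit rate dominates: applying the breakpoints is nearly free compared with handling the first hit of each reached block.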

Usage

To use mesos there are 3 major steps. First, the modules of a running process are saved. Second, these modules are loaded in IDA which then outputs a list of all basic blocks into the meso format. And finally, mesos is run against a target process to gather coverage!

Creating meso_deps.zip

This step is the first thing we have to do. We create a ZIP file containing all of the modules loaded into a given PID.

This script requires no internet and is designed to be easily dropped onto new VMs so mesos can be generated for your target application. It depends on PowerShell v5.0 or later which is installed by default on Windows 10 and Windows Server 2016.

Run, with <pid> replaced with the process ID you want to gather coverage on:

C:\dev\mesos>powershell .\offline_meso.ps1 8484
Powershell is 64-bit: True
Target     is 64-bit: True

C:\dev\mesos>

Optionally, you can supply -OutputZip <zipfile> to change the output zip file name.

This creates a meso_deps.zip containing all of the modules loaded in the process you targeted with the script.

Example output:

C:\dev\mesos>powershell .\offline_meso.ps1 8484 -OutputZip testing.zip
Powershell is 64-bit: True
Target     is 64-bit: True

C:\dev\mesos>powershell Expand-Archive testing.zip -DestinationPath example

C:\dev\mesos>powershell Get-ChildItem example -rec -File -Name
cache\c_\program files\common files\microsoft shared\ink\tiptsf.dll
cache\c_\program files\intel\optaneshellextensions\iastorafsserviceapi.dll
cache\c_\program files\widcomm\bluetooth software\btmmhook.dll
cache\c_\program files (x86)\common files\adobe\coresyncextension\coresync_x64.dll
...

Generating meso files

To generate meso files we operate on the meso_deps.zip we created in the last step. It doesn't matter where this zip came from, so it can come from a VM on which the PowerShell script was run.

Basic usage is:

python generate_mesos.py process_ida

This will use the meso_deps.zip file as an input, and use IDA to process all executables in the zip file and figure out where their basic blocks are.

This will create a cache folder with a bunch of files in it. These files are named based on the module name, the module's TimeDateStamp in the PE header, and the SizeOfImage field in the PE header. This is how DLLs are keyed in the symbol store, so it should be good enough for us here too.
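For readers who want to see where those two values live, here is a minimal sketch that reads them from the raw bytes of a PE file using only the standard library. The header offsets follow the PE format; the `cache_name` helper and its output format are hypothetical, and the real generate_mesos.py naming may differ.

```python
import struct


def pe_identity(data):
    """Return (TimeDateStamp, SizeOfImage) from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF header follows the 4-byte signature; TimeDateStamp is 4 bytes in.
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 4 + 4)
    # The optional header follows the 20-byte COFF header; SizeOfImage sits
    # at offset 56 in both the PE32 and PE32+ layouts.
    (size_of_image,) = struct.unpack_from("<I", data, pe_offset + 4 + 20 + 56)
    return timestamp, size_of_image


def cache_name(module_name, data):
    # Hypothetical naming scheme for illustration only.
    timestamp, size_of_image = pe_identity(data)
    return "%s_%x_%x" % (module_name.lower(), timestamp, size_of_image)
```

Typical use would be `cache_name("ntdll.dll", open(path, "rb").read())`. Two builds of the same DLL differ in at least one of the two fields, which is what makes the pair a usable key.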

You'll see there are files with no extension (these are the original binaries), there are files with .meso extensions (the breakpoint lists), and .i64 files (the cached IDA database for the original binary).

Symbol resolution

There is no limitation on what can make these meso files. The quality of the symbol resolution depends on the tool you used to generate them and its ability to resolve symbols. For example, with IDA, if you have public/private symbols, your _NT_SYMBOL_PATH should be configured correctly.

More advanced usage

Check the program's usage output for the most up-to-date options. There are _whitelist and _blacklist variants of process_ida that take a list of strings to filter which mesos are generated.

This is helpful as coverage outside of your target module is probably not relevant and just introduces overheads and unnecessary processing.

C:\dev\mesos>python generate_mesos.py
Usage:
    generate_mesos.py process_ida
        Processes all files in the meso_deps.zip file

    generate_mesos.py process_ida_whitelist <str 1> <str 2> <str ...>
        Processes files only containing one of the strings provided

    generate_mesos.py process_ida_blacklist <str 1> <str 2> <str ...>
        Processes files all files except for those containing one of the provided strings

Examples:

    python generate_mesos.py process_ida_whitelist system32
        Only processes files in `system32`

    python generate_mesos.py process_ida_blacklist ntdll.dll
        Process all files except for `ntdll.dll`

Path requirements for process_ida_*: must have `idat64.exe` in your PATH

Example usage

C:\dev\mesos>python generate_mesos.py process_ida_whitelist system32
Processing cache/c_/windows/system32/advapi32.dll
Processing cache/c_/windows/system32/bcryptprimitives.dll
Processing cache/c_/windows/system32/cfgmgr32.dll
...
Processing cache/c_/windows/system32/user32.dll
Processing cache/c_/windows/system32/uxtheme.dll
Processing cache/c_/windows/system32/win32u.dll
Processing cache/c_/windows/system32/windows.storage.dll
Processing cache/c_/windows/system32/wintypes.dll
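The whitelist and blacklist modes described in the usage text amount to substring filtering over the module paths. The sketch below illustrates that behavior; it is not the code from generate_mesos.py, the function name is hypothetical, and it assumes a plain case-sensitive substring match, which the real script may or may not use.

```python
def select_modules(paths, whitelist=None, blacklist=None):
    """Filter module paths by substring, mirroring the two modes in the usage text."""
    selected = []
    for path in paths:
        if whitelist is not None and not any(s in path for s in whitelist):
            continue  # whitelist mode: keep only paths containing a listed string
        if blacklist is not None and any(s in path for s in blacklist):
            continue  # blacklist mode: drop paths containing a listed string
        selected.append(path)
    return selected


modules = [
    "cache/c_/windows/system32/ntdll.dll",
    "cache/c_/windows/system32/user32.dll",
    "cache/c_/program files/example/app.exe",  # hypothetical path
]
print(select_modules(modules, whitelist=["system32"]))
print(select_modules(modules, blacklist=["ntdll.dll"]))
```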

Meso usage

Now we're onto the actual debugger. We've created meso files to tell it where to put breakpoints in each module.

First we need to build it with Rust!

cargo build --release

And then we can simply run it with a PID!

target\release\mesos.exe <pid>

Command-line options

Currently there are few options to mesos. Run mesos without arguments to get the most recent list.

C:\dev\mesos>target\release\mesos.exe
Usage: mesos.exe <pid> [--freq | --verbose | --print] <explicit meso file 1> <explicit meso file ...>
    --freq               - Treats all breakpoints as frequency breakpoints
    --verbose            - Enables verbose prints for debugging
    --print              - Prints breakpoint info on every single breakpoint
    [explicit meso file] - Load a specific meso file regardless of loaded modules

Standard usage: mesos.exe <pid>

Example usage

C:\dev\mesos>target\release\mesos.exe 13828
mesos  is 64-bit: true
target is 64-bit: true
[      0.004033] Applied       5629 breakpoints (      5629 total breakpoints) notepad.exe
[      0.029248] Applied      61334 breakpoints (     66963 total breakpoints) ntdll.dll
[      0.037032] Applied      25289 breakpoints (     92252 total breakpoints) kernel32.dll
[      0.062844] Applied      55611 breakpoints (    147863 total breakpoints) kernelbase.dll
...
[      0.739059] Applied      66650 breakpoints (   1552551 total breakpoints) coreuicomponents.dll
[      0.750266] Applied      25202 breakpoints (   1577753 total breakpoints) coremessaging.dll
[      0.754485] Applied       7557 breakpoints (   1585310 total breakpoints) ntmarta.dll
[      0.766119] Applied      28569 breakpoints (   1613879 total breakpoints) iertutil.dll
...
[     23.544097] Removed 5968 breakpoints in imm32.dll
[     23.551529] Syncing code coverage database...
[     23.675103] Sync complete (169694 total unique coverage entries)
Detached from process 13828

Why not use cargo run?

When run via cargo run, the Ctrl+C handler does not work correctly and does not allow us to detach from the target program cleanly.

Limitations

Since this relies on a tool (IDA) to identify blocks, if the tool incorrectly identifies a block it could result in us inserting a breakpoint over data. Further it's possible to miss coverage if a block is not correctly found.

Why doesn't it do more?

Well. It really just allows fast breakpoints. Feel free to rip it apart and add your own hooks to functions. It could easily be used to fuzz things :)

Why IDA?

I tried a bunch of tools and IDA was the only one that seemed to work well. Binja probably would also work well but I don't have it installed and I'm not familiar with the API. I have a coworker who wrote a plugin for it and that'll probably get pull requested in soon.

The meso files are just simple files; anyone can generate them from any tool.

Technical Details

Minidump autogenned filenames

The generated minidump filenames are designed to give a high level of glance value for crashes. They include things like the exception type, faulting address, and a rough classification of the bug.

Currently if it's an access violation we apply the following classification:

  • Determine the access type (read, write, execute)
    • For reads the filename contains: "read"
    • For writes the filename contains: "WRITE"
    • For execute the filename contains: "DEP"
  • Determine if it's a non-canonical 64-bit address
    • For non-canonical addresses the filename contains: NONCANON
  • Otherwise determine if it's a NULL dereference (within ±32 KiB of NULL)
    • Will put "null" in the filename
  • Otherwise it's considered a non-null deref and "HIGH" appears in the filename

It's intended that more severe things are in all caps to give higher glance value of prioritizing which crash dumps to look into more.

Example minidump filename for chrome:

crash_c0000005_chrome_child.dll+0x2c915c0_WRITE_null.dmp
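The classification rules above can be sketched as follows. This is an illustration, not the code mesos uses: the function name is hypothetical, it assumes 48-bit canonical addressing, and it treats the 32 KiB boundary as exclusive, which the text does not specify. The access codes 0, 1, and 8 are the values Windows reports in the first ExceptionInformation entry of an access violation.

```python
MASK64 = (1 << 64) - 1
NULL_WINDOW = 32 * 1024  # "within 32 KiB" of NULL, per the list above

# First ExceptionInformation value of an access violation on Windows.
ACCESS_NAMES = {0: "read", 1: "WRITE", 8: "DEP"}


def classify_access_violation(access_code, address):
    """Return the (access, address-class) tags for an access violation."""
    access = ACCESS_NAMES.get(access_code, "unknown")
    address &= MASK64

    # Canonical 48-bit x86-64 addresses have bits 47..63 all equal.
    top_bits = address >> 47
    if top_bits not in (0, 0x1FFFF):
        return access, "NONCANON"

    # Interpret as signed so small negative offsets from NULL also count.
    signed = address - (1 << 64) if address >> 63 else address
    if abs(signed) < NULL_WINDOW:
        return access, "null"
    return access, "HIGH"


print("_".join(classify_access_violation(1, 0x10)))
print("_".join(classify_access_violation(0, 0x4141414141414141)))
```

The first call yields the same `WRITE_null` suffix seen in the example filename above.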

Meso file format

Coming soon (once it's stable)

Comments
  • Ghidra support

    I created a script to generate Mesos-files from Ghidra. Should I submit a merge request for that file?

    https://github.com/marpie/mesos/blob/7740833f02bd3bc9cb8f9681e42768a6c12169f0/mesogen_scripts/ghidra.py

    opened by marpie 1
  • Introduces register_debug_event_callback()

    Introduces Debugger::register_debug_event_callback() callback which fires on debug events.

    Updates Debugger::register_modload_callback() (and Debugger::register_debug_event_callback()) to take Box<Fn()> as opposed to a standard function pointer.

    opened by toshipiazza 1
  • Color entire blocks instead of just the first instruction

    When coloring the graph in IDA, I think it would be better to color entire blocks rather than just the first instruction of each block. This will make the graph look something like this: [image]

    We can get the boundaries of each block using this small utility function:

    import idaapi

    def addr2block(addr):
        # Find the function containing addr; bail out if there is none.
        f = idaapi.get_func(addr)
        if not f:
            print("No function at 0x%x" % addr)
            return None

        # Walk the function's basic blocks and return the one covering addr.
        fc = idaapi.FlowChart(f)
        for block in fc:
            if block.startEA <= addr < block.endEA:
                return (block.startEA, block.endEA)
        return None
    
    opened by assafcarlsbad 1
  • --follow-fork is broken

    There appear to be several issues with --follow-fork. In no particular order:

    • Debugger loop sees multiple CREATE_PROCESS_DEBUG_EVENT debug events and the process handle gets overwritten
    • Debugger loop exits on first EXIT_PROCESS_DEBUG_EVENT debug event or on EXCEPTION_ACCESS_VIOLATION which could correspond to a child process
    • hit_initial_break needs to keep track of new pids
    • Need to keep track of modules and breakpoints per-process
    opened by toshipiazza 0
  • $path is null

    Hi! First, thank you for this tool. I ran the PowerShell script and got an error:

    Copy-Item : Cannot bind argument to parameter 'Path' because it is null.
    At C:\Users\hsc1209\Downloads\mesos-master\offline_meso.ps1:77 char:20
        +     Copy-Item -Path <<<<  $path -Destination $hirearchy
        + CategoryInfo          : InvalidData: (:) [Copy-Item], ParameterBindingValidationException
        + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,Microsoft.PowerShell.Commands.CopyItemCommand

    I think this program loaded some hidden executable files (DLL or EXE). How can I solve this problem?

    opened by kismp123 0
  • Code examples about inject inputs

    In the readme file i can see the following text:

    "Out of the box it can be used to gather simple code coverage but it's designed to be easily modified to add fast breakpoint handlers to inject inputs. For example, put a breakpoint after NtReadFile() returns and modify the buffer in flight. I used this in Chrome to modify inbound IPC traffic in the browser."

    This is really awesome feature, would you be so kind as to provide an example of functional code?

    Thank you very much

    opened by Pl0414141 0
  • Add Thread-ID to output

    In my current workflow it helps me to have the Thread-ID present in the output so I can visually check whether I reached the right point.

    Maybe it's helpful for others...

    opened by marpie 0
  • Lighthouse support

    Hi Brandon,

    Amazing tool! Are there plans to make mesos emit a file format recognized by https://github.com/gaasedelen/lighthouse? Lighthouse supports IDA Pro and Binary Ninja, plus some pretty sophisticated features for visualizing the differences between coverage data sets.

    The main coverage format Lighthouse supports is the same one as DynamoRIO's drcov. The development branch also supports a simpler module+offset format like:

    a+4141
    b+5242
    b+5243
    a+4142
    c+6361
    a+4143
    
    opened by Ayrx 1