supply-chain security for Rust

Last update: Jun 20, 2022

cargo-vet


Note: cargo-vet is under heavy development and not quite ready for general use. Feel free to give it a try if you're feeling adventurous; feedback is most welcome!

The cargo vet subcommand is a tool to help projects ensure that third-party Rust dependencies have been audited by a trusted source. It strives to be lightweight and easy to integrate.

More details available in the book.

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

GitHub

https://github.com/mozilla/cargo-vet
Comments
  • 1. treat dependency overrides as third-party

    STR:

    cd fresh-mozilla-central-clone
    cargo vet init
    cargo vet suggest

    Resulting output: mio.txt

    Note that 'mio' appears several times as a reverse-dependency, but doesn't show up as a recommendation for inspect. We should figure out why that is and resolve the inconsistency.

    Reviewed by bholley at 2022-05-23 22:13
  • 2. How "temp" should our temp fetches be?

    There's kind of 4 tiers of "temp" in my books:

    1. So temp that it evaporates when the process exits
    2. In your system "temp location" with a randomly generated directory name, so nothing can "know" where to find it, not even ourselves on subsequent runs (security people will be happy if you do this)
    3. In your system "temp location" with a deterministic name
    4. Actually not temp at all, just in the current working directory

    I currently implement 3 with the name "cargo-vet-checkout" because it's helpful for debugging, but 2 and 4 are both reasonable (your example output implies 3). 1 is total gibberish for our use case, but I am a pedant so I had to list it.

    No matter what default we pick, we should absolutely expose a CLI flag for configuring this a little, if only so the user can opt into some flavour of 4 if they want.

    Worth noting that something like cargo vet suggest is actually a very heavyweight process, because we basically need to download two separate copies of your crategraph and then individually diff each copy! This suggests it might be desirable to be able to cache results between runs, to some extent?

    My current implementation is very conservative and will just completely destroy and recreate cargo-vet-checkout whenever anything wants to write results to temp (suggest, fetch, diff).
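    The deterministic and user-chosen tiers (3 and 4) could be sketched roughly as below. `CheckoutLocation` and `checkout_dir` are hypothetical names for illustration, not cargo-vet's actual types; only the directory name "cargo-vet-checkout" comes from the comment above. Tier 2 would additionally need a randomly generated directory name, which this sketch leaves out.

```rust
use std::env;
use std::path::PathBuf;

/// Directory name mentioned in the comment above.
pub const CHECKOUT_DIR_NAME: &str = "cargo-vet-checkout";

/// Where fetched crates get checked out (hypothetical type for this sketch).
pub enum CheckoutLocation {
    /// Tier 3: deterministic name inside the system temp directory.
    SystemTemp,
    /// Tier 4: a base directory chosen by the user, e.g. via a CLI flag.
    Custom(PathBuf),
}

/// Resolve the checkout directory for the chosen tier.
pub fn checkout_dir(location: &CheckoutLocation) -> PathBuf {
    match location {
        CheckoutLocation::SystemTemp => env::temp_dir().join(CHECKOUT_DIR_NAME),
        CheckoutLocation::Custom(base) => base.join(CHECKOUT_DIR_NAME),
    }
}
```

    A CLI flag would then only need to map onto `CheckoutLocation::Custom`, with `SystemTemp` as the default.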

    Reviewed by Gankra at 2022-04-12 21:32
  • 3. create skeleton of app

    This is mostly me setting up my "preferred" machinery for a CLI app (although this is my first time using clap's derive-based machinery, I am quite liking it here).

    Some extra stuff like --verbose, --log-file, and --output-file are just "standard" things I like a CLI app to have, both for debugging and to make it more flexible for the user.

    What actually is implemented:

    • minimal vet init
    • trivial vet (does not apply trusted/audits or deltas yet)

    A lot of the machinery set up here should make many of the other commands relatively trivial (open toml, push value to a Vec, write toml back).

    Reviewed by Gankra at 2022-04-01 22:21
  • 4. Consider storing the cache in $XDG_CACHE_HOME on platforms where that's applicable

    Currently the cache is stored in the system's temporary directory. On platforms where XDG is relevant, we could consider using $XDG_CACHE_HOME instead.

    https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html

    $XDG_CACHE_HOME defines the base directory relative to which user-specific non-essential data files should be stored. If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.
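    The fallback rule quoted above can be sketched as a small pure function. The environment values are passed in explicitly so the logic is easy to test; `xdg_cache_base` and `cargo_vet_cache_dir` are hypothetical names, and the "cargo-vet" subdirectory is an assumption rather than an agreed location.

```rust
use std::path::PathBuf;

/// Resolve the cache base directory per the rule quoted above.
/// A caller would typically pass `std::env::var("XDG_CACHE_HOME").ok().as_deref()`
/// and the same for `HOME`.
pub fn xdg_cache_base(xdg_cache_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match xdg_cache_home {
        // Set and non-empty: use it as-is.
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        // Unset or empty: fall back to $HOME/.cache when HOME is usable.
        _ => home
            .filter(|h| !h.is_empty())
            .map(|h| PathBuf::from(h).join(".cache")),
    }
}

/// Hypothetical cargo-vet cache location under the resolved base.
pub fn cargo_vet_cache_dir(xdg_cache_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    xdg_cache_base(xdg_cache_home, home).map(|base| base.join("cargo-vet"))
}
```

    Returning `None` when neither variable is usable leaves room to fall back to the current behaviour (the system temporary directory).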

    Reviewed by mystor at 2022-06-14 23:04
  • 5. Sometimes doesn't automatically pick a criteria

    When running cargo vet certify, sometimes it picks a criteria on its own, sometimes not. Example of case where it doesn't: unicode-ident:1.0.0 -> 1.0.1 in m-c.

    Reviewed by glandium at 2022-06-14 22:37
  • 6. implement guessing heuristics for cargo vet certify

    These were outlined in https://github.com/bholley/cargo-vet/issues/85#issuecomment-1133091909 and https://github.com/bholley/cargo-vet/issues/85#issuecomment-1133374812 , but not implemented in #140.

    Reviewed by bholley at 2022-06-07 22:43
  • 7. cargo vet diff shows diffs of files that aren't packaged

    For instance, cargo vet diff semver 1.0.9 1.0.10 shows differences for .cargo_vcs_info.json, .github/workflows/ci.yml, Cargo.toml.orig. None of these files are (presumably) shipped by cargo publish, and they're not pulled by cargo vendor.

    Reviewed by glandium at 2022-06-14 21:59
  • 8. Async processing

    Currently the MVP is quickly hacked up with sync reqwest and no multi-threading, just to keep things simple while everything is in flux. In the medium term we should properly clean up the code to be async and spin up multiple concurrent tasks where it makes sense.

    I've already done some of that work in rust-minidump here: https://github.com/rust-minidump/rust-minidump/blob/main/breakpad-symbols/src/http.rs

    Which only gets more complicated here: https://github.com/rust-minidump/rust-minidump/pull/558

    Although even that is only "partially" async, because the primary concurrency model for minidump-stackwalk is actually just spawning multiple instances in separate processes, each of which knows how to transactionally add things to the cache. That said, I have been thinking about how to make it "fully" async+multithreaded here: https://github.com/rust-minidump/rust-minidump/issues/561

    I am finding myself doing a lot of duplicated logic and work between these two codebases, and it feels like there could be a very useful nugget to factor out for all the concurrent locking+caching+downloading+parsing machinery.


    Things that can potentially be done concurrently in their own task:

    • Each package's suggest
      • Want to be able to start up multiple suggests at once to queue up all the expensive subtasks below
    • Each download of an imported audits.toml
      • Can we start parsing/validating it while it's downloading? (probably not worth it...?)
    • Each fetch of a package from crates.io
      • Can we start SHAing and untarring it while it's downloading like we do streaming-parse in rust-minidump?
    • Each diffstat of two packages (unclear if git diff --no-index is safe/useful to spawn multiple parallel instances of)

    See https://github.com/rust-minidump/rust-minidump/issues/561 where I detail some of the machinery you would want to have to make this work.

    Reviewed by Gankra at 2022-06-06 20:04
  • 9. regenerate-unaudited reorders entries

    When I add the following to m-c's config.toml:

    +[policy.geckodriver]
    +criteria = "safe-to-run"
    +
    +[policy.webdriver]
    +criteria = "safe-to-run"
    

    And then run cargo vet regenerate-unaudited, I get things like:

     criteria = "safe-to-deploy"
     
     [[unaudited.bytes]]
    -version = "0.5.6"
    +version = "1.1.0"
     criteria = "safe-to-deploy"
     
     [[unaudited.bytes]]
    -version = "1.1.0"
    -criteria = "safe-to-deploy"
    +version = "0.5.6"
    +criteria = "safe-to-run"
     
     [[unaudited.cc]]
     version = "1.0.73"
    

    Seems like the original order (sorted by version) is preferable.
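    One way to restore that order, sketched under the assumption that versions are plain major.minor.patch strings, is to sort each crate's entries by parsed version before writing them back. `UnauditedEntry`, `version_key` and `sort_by_version` are hypothetical names, not cargo-vet's own.

```rust
/// One `[[unaudited.<crate>]]` entry, reduced to the two fields shown above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UnauditedEntry {
    pub version: String,
    pub criteria: String,
}

/// Parse "major.minor.patch" into a tuple that compares numerically.
/// Simplification: pre-release and build metadata are not handled.
pub fn version_key(version: &str) -> Option<(u64, u64, u64)> {
    let mut parts = version.split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = parts.next()?.ok()?;
    let patch = parts.next()?.ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Sort entries by version. Unparsable versions sort first (`None < Some`),
/// and the sort is stable, so ties keep their existing relative order.
pub fn sort_by_version(entries: &mut [UnauditedEntry]) {
    entries.sort_by_key(|e| version_key(&e.version));
}
```

    Comparing parsed tuples rather than raw strings matters for cases like 0.10.0 versus 0.9.0, where a string comparison would give the wrong order.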

    Reviewed by bholley at 2022-05-31 23:54
  • 10. Implement the two kinds of "suggest"

    Writing this out because it's too complicated to fit all in my head at once. There are two suggests:

    • Real Suggest
    • Vet Suggest

    For both commands you are conceptually trying to find the smallest possible audit you could perform that would properly connect up the graph. This is vague/complicated, so let's define some concepts/terms.

    Methodology / Terminology

    Your audit graph is conceptually a DAG (forest), with nodes containing version numbers and edges containing deltas. This DAG has minimum values and one maximum value.

    The minimum values are:

    • 0.0.0
    • full audits
    • entries in unaudited (may be multiple for one crate)

    The maximum value is the current versions in the Cargo.lock (there may be multiple versions of one crate, but we redo the whole algorithm for each version, so we can ignore that... kinda).

    A path from either direction conceptually starts with all criteria, and then has its criteria intersected by the edge's criteria.

    The goal of the vet resolver is to find every path between the minimums and the maximum. This can be done in either direction -- currently we search from the maximums because there are fewer of them and we can conceptually terminate searches much faster (I think we don't realize this right now, to do so we would need to compare the currently evaluated criteria against validated_criteria and discard that work item if it's a subset, since it can't possibly add more information; right now we only discard paths when the intersection becomes the empty set).
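    A minimal sketch of that search, including the subset-based early termination, might look like the following. It is not the actual resolver: criteria are encoded as a bitmask, minimums are treated as plain versions that carry no criteria of their own, and the names `Delta` and `validated_criteria` (as a function) are assumptions made for illustration.

```rust
/// Criteria as a bitmask (hypothetical encoding for this sketch).
pub type Criteria = u8;
pub const SAFE_TO_RUN: Criteria = 0b01;
pub const SAFE_TO_DEPLOY: Criteria = 0b10;
pub const ALL: Criteria = 0b11;

/// A delta audit: the change `from -> to` was reviewed under `criteria`.
pub struct Delta {
    pub from: &'static str,
    pub to: &'static str,
    pub criteria: Criteria,
}

/// Search downwards from `maximum` and return the union of the criteria
/// carried by every path that reaches one of `minimums`.
pub fn validated_criteria(deltas: &[Delta], minimums: &[&str], maximum: &str) -> Criteria {
    let mut validated: Criteria = 0;
    // Work items: (version reached so far, criteria still held on this path).
    let mut stack: Vec<(&str, Criteria)> = vec![(maximum, ALL)];
    while let Some((version, criteria)) = stack.pop() {
        // Drop the item if nothing is left, or if everything it still holds
        // has already been validated (it cannot add more information).
        if criteria == 0 || (criteria & !validated) == 0 {
            continue;
        }
        if minimums.contains(&version) {
            validated |= criteria;
            continue;
        }
        // Follow every delta that ends at this version, intersecting criteria.
        for delta in deltas.iter().filter(|d| d.to == version) {
            stack.push((delta.from, criteria & delta.criteria));
        }
    }
    validated
}
```

    A result of 0 covers both failure modes described next (no path at all, or paths whose criteria intersect to nothing); telling them apart would need extra bookkeeping.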

    The resolver can fail for two conceptually distinct reasons (although it doesn't distinguish them):

    1. There exist no paths between the minimums and the maximum
    2. Although paths exist, none of them have sufficient criteria, not even their union

    The first is easier to think about, so let's start with that. The graph is disconnected, but conceptually we have some set of paths reaching down and some set of paths reaching up. In my head this is like two hands outstretched towards each other. In a simplified model where there is only one criteria, the solution we're interested in is just delta(max(min_graph), min(max_graph)). That is, the smallest distance between the two outstretched hands' fingers.
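    As a worked instance of delta(max(min_graph), min(max_graph)) in the single-criteria model: given the versions reachable upwards from the minimums and the versions reachable downwards from the maximum, the suggested delta runs between the closest pair. `Version` and `suggest_delta` are hypothetical names for this sketch.

```rust
/// (major, minor, patch); tuples compare lexicographically, i.e. numerically.
pub type Version = (u64, u64, u64);

/// Suggest the delta that bridges a disconnected graph: from the highest
/// version reachable from the minimums to the lowest version reachable from
/// the maximum. Returns `None` if either side is empty.
pub fn suggest_delta(
    reachable_from_min: &[Version],
    reachable_from_max: &[Version],
) -> Option<(Version, Version)> {
    let from = reachable_from_min.iter().copied().max()?;
    let to = reachable_from_max.iter().copied().min()?;
    Some((from, to))
}
```

    For example, with audits reaching up to 1.1.0 and existing deltas reaching down from the current version to 1.3.0, the suggestion is the delta 1.1.0 -> 1.3.0.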

    The second is harder to think about, and I'm not sure how to properly detect/communicate what edges could be cheaply added to make the graph connected from the perspective of Sufficient Criteria. This is especially brain-melty to think about if you want to try to do a Really Amazing Job and actually take path unioning into account.

    Ok now with this established, let's see what these two commands are actually asking.

    Vet Suggest

    This is to help you get 'vet' passing. Everything described above is exactly the case. Usually the answer will just be "you need a delta from the maximum to the last audit you did".

    Real Suggest

    This is to help you clean out your unaudited backlog. This is the same as vet suggest except all the minimums from unaudited entries have been removed. Assuming you run this command when vet is passing, the answer will usually just be "you need a full audit for the exact version of the unaudited entry we just removed".

    Implementation strategies

    Quick And Dirty

    Assume the "usual" case is good enough, and just recommend exactly that.

    Real Suggest would just say "make a full audit for that version", which would always be sufficient but potentially overkill if you actually have a decent audit-chain near the unaudited version and can actually just add a small delta.

    Vet Suggest would say "make a delta from the largest known version mentioned by an audit that is still <= the current version". This could actually be insufficient if done naively enough, because it might be that this is a random floating delta we got from a foreign auditfile without the context of someone they import. In this case connecting up to that delta might still result in failure.

    Compute The Fingers

    Actually run the algorithm forwards and backwards and compare the minimums and maximums. If criteria are uniform (no unioning shenanigans), then this will get you some pretty good results that are hard to complain about. Unfortunately this may still miss some extremely galaxy brain optimal solutions involving doing two small deltas to bridge a floating delta (I don't think anyone can reasonably expect us to find and explain that, right?)

    Unfortunately this wouldn't help with situations where a review needs to be made "stricter" or weird situations where you can fix some minor path and make the unioning work out. Maybe that's Too Weird and fine to miss?

    ??? Magic ???

    Somehow figure out how to detect and express places where making a review more strict could fix things. I genuinely can't imagine what kind of algorithm this would be right now.

    Ascend Beyond

    Get really smart about cargo's dependency resolution and start pointing out places where you could upgrade or downgrade specific packages to repair the delta graph. I am getting a headache just thinking about this.

    Reviewed by Gankra at 2022-04-15 02:38
  • 11. policy enforcement and better diagnostics

    See #19 for the issues.

    I have temporarily checked in the test failure this introduces so you can see what breaks. Policy as currently specified makes it impossible to have "meh" dependency edges without weakening the global policy.

    Reviewed by Gankra at 2022-04-15 01:27
  • 12. Introduce an Out trait which abstracts over the program's output

    This is certainly not a final solution. It doesn't integrate at all with our error reporting or tracing codepaths, and has the limitation that it requires mutable access thanks to using io::Write directly.

    This allows us to mock out more of terminal interaction to allow it to be tested from within our unit tests.
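    A rough sketch of the idea, assuming the trait is little more than a marker over io::Write: the real output and an in-memory buffer both implement it, so tests can capture what a command prints. `report_certified` and its message are made up for illustration and are not cargo-vet's actual output.

```rust
use std::io::{self, Write};

/// Abstraction over the program's output (sketch; the trait in the PR may
/// carry more methods than this).
pub trait Out: Write {}

/// Real output: standard output.
impl Out for io::Stdout {}

/// Mock output for unit tests: an in-memory byte buffer.
impl Out for Vec<u8> {}

/// Hypothetical command helper. It needs `&mut` access because `io::Write`
/// methods take `&mut self`, which is the limitation noted above.
pub fn report_certified(out: &mut dyn Out, krate: &str, version: &str) -> io::Result<()> {
    writeln!(out, "certified {} {}", krate, version)
}
```

    In a unit test, passing a `Vec<u8>` and converting it with `String::from_utf8` is enough to assert on the exact text a command produced.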

    Combined with a new ability to create mock cfgs from real argument lists, this is then used to add a basic mocked-out test for the certify command.

    This is currently built on top of #210, so contains the changes from that PR.

    Reviewed by mystor at 2022-06-21 21:34
  • 13. Bump insta from 1.14.1 to 1.15.0

    Bumps insta from 1.14.1 to 1.15.0.

    Changelog

    Sourced from insta's changelog.

    1.15.0

    • Bump minimum version of Rust to 1.56.1. This was done because the used serde-yaml dependency no longer supports older versions of Rust.
    Reviewed by dependabot[bot] at 2022-06-21 10:44
  • 14. Improve criteria selection in certify

    This patch makes 2 major changes to certify:

    First, certify will no longer attempt to guess the version you are trying to certify unless there is a previous diff or inspect command in history for the crate being certified. This is to avoid potential mistakes, as it would be easy to miss the fact that the wrong version was selected.

    Second, when suggesting criteria to certify, the command will try harder to find the relevant set of criteria. If the diff or version being certified isn't found in the suggest output history, it will now also check the unaudited table, and fall back to running the resolver to determine what criteria would heal the graph when specified for this revision.

    This pass with the resolver is intended to be more relaxed than suggest, as the precise delta being checked is already known, so it should be able to give suggested criteria even for audits which would not be suggested by the suggest algorithm.
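    The lookup order described above (suggest history, then the unaudited table, then the resolver) can be sketched as a simple fallback chain. `pick_criteria` and its parameters are hypothetical; the actual patch may structure this differently.

```rust
/// Pick the criteria to offer for a certify run, trying each source in order.
/// The resolver is passed as a closure so it only runs when both cheaper
/// lookups come up empty.
pub fn pick_criteria<F>(
    from_suggest_history: Option<Vec<String>>,
    from_unaudited: Option<Vec<String>>,
    run_resolver: F,
) -> Vec<String>
where
    F: FnOnce() -> Vec<String>,
{
    from_suggest_history
        .or(from_unaudited)
        .unwrap_or_else(run_resolver)
}
```

    Taking the resolver as a closure keeps the expensive pass lazy, which matches its role as a last resort.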

    Fixes #192

    Reviewed by mystor at 2022-06-20 20:39
  • 15. Log explanations for long-running pauses

    Sometimes cargo-vet will pause for a short while due to running an I/O operation. Often these pauses are frequent and short enough that we don't need to mention them in the UI, but if they are running for a long enough amount of time, we should consider logging a notice to the user that we're doing something which might take a bit.

    This could be done with an async timer running alongside whatever long operation we're performing which will log if the timer expires before the operation is complete. We'd likely want this for anything doing network access (e.g. running cargo metadata, fetching audits from another repository or downloading crates from crates.io) so that the user isn't confused why cargo vet is taking a bit longer this time.
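    To illustrate the shape of this without committing to an async runtime, here is a sketch that swaps the async timer for a worker thread and a channel timeout from the standard library. `with_pause_notice` is a hypothetical helper; an async version would race the operation against a timer in the same way.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `op` on a worker thread. If it has not finished after `threshold`,
/// call `notify` once (e.g. to log a notice), then keep waiting for the result.
pub fn with_pause_notice<T, F, N>(threshold: Duration, op: F, notify: N) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
    N: FnOnce(),
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver outlives the worker, so a send error is not expected.
        let _ = tx.send(op());
    });
    match rx.recv_timeout(threshold) {
        // Finished quickly: stay silent.
        Ok(value) => value,
        // Still running: tell the user, then block until the result arrives.
        Err(mpsc::RecvTimeoutError::Timeout) => {
            notify();
            rx.recv().expect("worker thread exited without a result")
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            panic!("worker thread exited without a result")
        }
    }
}
```

    The same wrapper could serve the file-lock case mentioned below, since waiting on a lock is just another operation that may or may not outlast the threshold.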

    If we build up a language for how to communicate these indeterminate pauses, we'll likely also want to use it for the log messages emitted when waiting for a file lock due to another running instance of cargo vet.

    Reviewed by mystor at 2022-06-20 19:54
  • 16. Fix: Auditor information is not read from global git config

    The git configuration is used (in get_user_info() in src/main.rs) by calling git config --get, which is not the same as reading the global git configuration, as git config considers the repository-local configuration file as well as the $HOME git configuration file.

    Hence, adapt the wording here.

    Reviewed by matthiasbeyer at 2022-06-20 13:00