Cargo subcommand for optimizing binaries with PGO and BOLT.

Overview

cargo-pgo

Cargo subcommand that makes it easier to use PGO and BOLT to optimize Rust binaries.

Installation

$ cargo install cargo-pgo

You will also need the llvm-profdata binary for PGO, and the llvm-bolt and merge-fdata binaries for BOLT.

You can install the PGO helper binary by adding the llvm-tools-preview component to your toolchain with rustup:

$ rustup component add llvm-tools-preview

For BOLT, it's unfortunately more complicated. See the BOLT installation section below.

BOLT support is currently experimental.

PGO/BOLT workflow

It is important to understand the workflow of using feedback-directed optimizations. Put simply, it consists of three general steps:

  1. Build binary with instrumentation
    • Perform a special build of your executable which will add additional instrumentation code to it.
  2. Gather performance profiles
    • Run your instrumented binary on representative workloads. The binary will generate profile files on disk, which will then be used to optimize the binary.
    • Try to gather as much data as possible. Ideally, run your binary for at least a minute.
  3. Build an optimized binary using generated profiles
    • The compiler will use the generated profiles to build an optimized version of your binary.
    • The binary will be optimized with respect to the profiled workloads. If you execute it on a substantially different workload, the optimizations might not work (or they might even make your binary slower!).
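
The three steps above can be sketched as a small shell helper. This is only an illustration, not part of cargo-pgo: the run and pgo_workflow functions and the DRY_RUN variable are hypothetical names, and the path to the instrumented binary has to be supplied by you.

```shell
# Hypothetical helper: print a command, and run it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Run the three workflow steps in order, stopping at the first failure.
# $1 is the path to the instrumented binary; the remaining arguments are
# passed to it as the representative workload.
pgo_workflow() {
    instrumented_bin="$1"
    shift
    # 1. Build the binary with instrumentation.
    run cargo pgo build || return
    # 2. Gather profiles by running the instrumented binary.
    run "$instrumented_bin" "$@" || return
    # 3. Build the optimized binary using the gathered profiles.
    run cargo pgo optimize
}
```

Calling it with DRY_RUN=1 only prints the three commands, which is a cheap way to check the order before running a long workload.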

Example

[Image: example usage of the tool]

Usage

Before you start to optimize your binaries, you should first check if your environment is set up correctly, at least for PGO (BOLT is more complicated). You can do that using the info command:

$ cargo pgo info

PGO

cargo-pgo provides subcommands that wrap common Cargo commands. It will automatically add --release to wrapped commands where it is applicable, since it doesn't really make sense to perform PGO on debug builds.

Generating the profiles

First, you need to generate the PGO profiles by performing an instrumented build. You can currently do that in several ways. The most generic command for creating an instrumented artifact is cargo pgo instrument:

$ cargo pgo instrument [<command>] -- [cargo-args]

The <command> argument specifies which Cargo command will be executed. It is optional and defaults to build. You can pass additional arguments to cargo after --.

There are several ways of producing the profiles:

  • Building a binary

    $ cargo pgo build
    # or
    $ cargo pgo instrument build

    This is the simplest and recommended approach. You build an instrumented binary and then run it on some workloads. Note that the binary will be located at <target-dir>/<target-triple>/release/<binary-name>.

  • Running an instrumented program

    $ cargo pgo run
    # or
    $ cargo pgo instrument run

    You can also directly execute an instrumented binary with the cargo pgo run command, which is a shortcut for cargo pgo instrument run. This command will instrument the binary and then execute it right away.

  • Run instrumented tests

    $ cargo pgo test
    # or
    $ cargo pgo instrument test

    This command will generate profiles by executing tests. Note that unless your test suite is really comprehensive, it might be better to create a binary and run it on some specific workloads instead.

  • Run instrumented benchmarks

    $ cargo pgo bench
    # or
    $ cargo pgo instrument bench

    This command will generate profiles by executing benchmarks.
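
If you build the instrumented binary and run it yourself, you need its path. Here is a minimal sketch for composing <target-dir>/<target-triple>/release/<binary-name>, assuming the build used the host target. The function names are hypothetical, and "target" and "myapp" in the usage comment are placeholder values.

```shell
# Extract the "host:" field from `rustc -vV` output read on stdin.
parse_host_triple() {
    sed -n 's/^host: //p'
}

# Compose <target-dir>/<target-triple>/release/<binary-name>.
# $1 = target directory, $2 = target triple, $3 = binary name
instrumented_binary_path() {
    printf '%s/%s/release/%s\n' "$1" "$2" "$3"
}

# Usage (requires rustc; the workload arguments are up to you):
#   triple="$(rustc -vV | parse_host_triple)"
#   "$(instrumented_binary_path target "$triple" myapp)" <workload-args>
```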

Building an optimized binary

Once you have generated some profiles, you can execute cargo pgo optimize to build an optimized version of your binary.

If you want, you can also pass a command to cargo pgo optimize to e.g. run PGO-optimized benchmarks or tests:

$ cargo pgo optimize bench
$ cargo pgo optimize test

BOLT

Using BOLT with cargo-pgo is similar to using PGO. However, you have to build BOLT manually, and support for it is currently experimental.

BOLT is not supported directly by rustc, so the instrumentation and optimization commands are not directly applied to binaries built by rustc. Instead, cargo-pgo creates additional binaries that you have to use for gathering profiles and executing the optimized code.

Generating the profiles

First, you need to generate the BOLT profiles. To do that, execute the following command:

$ cargo pgo bolt build

The instrumented binary will be located at <target-dir>/<target-triple>/release/<binary-name>-bolt-instrumented. Execute it on several workloads to gather as much data as possible.

Note that for BOLT, the profile gathering step is optional. You can also simply run the optimization step (see below) without any profiles, although it will probably not have a large effect.

Building an optimized binary

Once you have generated some profiles, you can execute cargo pgo bolt optimize to build an optimized version of your binary. The optimized binary will be named <binary-name>-bolt-optimized.
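
Since BOLT works on additional binaries, it can help to derive their paths from the regular release binary and to check that they exist before running them. The sketch below uses hypothetical function names. It also assumes that the optimized binary is placed next to the instrumented one, which the text above does not state explicitly.

```shell
# Derive the BOLT artifact paths from the path of the regular release binary.
# Assumption: both artifacts live in the same directory as that binary.
bolt_instrumented_path() {
    printf '%s-bolt-instrumented\n' "$1"
}

bolt_optimized_path() {
    printf '%s-bolt-optimized\n' "$1"
}

# Fail early with a readable message if an expected artifact is missing
# or is not executable.
require_executable() {
    if [ ! -x "$1" ]; then
        echo "error: expected executable not found: $1" >&2
        return 1
    fi
}
```

For example, require_executable "$(bolt_instrumented_path <path-to-release-binary>)" can be run right after cargo pgo bolt build, before starting a long profiling session.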

BOLT + PGO

Yes, BOLT and PGO can even be combined :) To do that, you should first generate PGO profiles and then use BOLT on the already PGO-optimized binaries. You can do that using the --with-pgo flag:

# Build PGO instrumented binary
$ cargo pgo build
# Run binary to gather PGO profiles
$ ./target/.../<binary>
# Build BOLT instrumented binary using PGO profiles
$ cargo pgo bolt build --with-pgo
# Run binary to gather BOLT profiles
$ ./target/.../<binary>-bolt-instrumented
# Optimize a PGO-optimized binary with BOLT
$ cargo pgo bolt optimize --with-pgo

BOLT installation

Here's a short guide on how to compile LLVM with BOLT. You will need a recent compiler, CMake, and Ninja.

  1. Download LLVM
    $ git clone https://github.com/llvm/llvm-project
    $ cd llvm-project 
  2. (Optional) Check out a stable version, at least 14.0.0
    $ git checkout llvmorg-14.0.5
    Note that BOLT is being actively fixed, so a trunk version of LLVM might actually work better.
  3. Prepare the build
    $ cmake -S llvm -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=${PWD}/llvm-install \
      -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt;bolt"
  4. Compile LLVM with BOLT
    $ cd build
    $ ninja
    $ ninja install 
    The built files should be located at <llvm-dir>/llvm-install/bin. You should add this directory to $PATH to make BOLT usable with cargo-pgo.
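
After installing, you can check that the BOLT tools are visible on $PATH. This is a minimal sketch: missing_tools is a hypothetical helper, and the usage comment assumes LLVM was installed into ./llvm-install inside the llvm-project directory, as in the guide above.

```shell
# Print the name of every given tool that cannot be found on PATH.
# Prints nothing if all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (run from the llvm-project directory):
#   export PATH="$PWD/llvm-install/bin:$PATH"
#   missing_tools llvm-bolt merge-fdata
```

The cargo pgo info command remains the authoritative check. This helper only tells you whether the binaries can be found by name.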

Related work

  • cargo-pgo: I basically reimplemented this crate independently. It uses an almost identical approach, but doesn't support BOLT. It is no longer maintained, and I got permission from its author to (re)use its name.

License

MIT

Comments
  • cargo pgo bolt build gives nonexistent executable to run

    [lofty@MISO yukari]$ rustc -V
    rustc 1.65.0-nightly (2befdefdd 2022-08-06)
    [lofty@MISO yukari]$ cargo pgo -V
    cargo-pgo 0.1.0
    [lofty@MISO yukari]$ llvm-bolt --version
    LLVM (http://llvm.org/):
      LLVM version 16.0.0git
      Optimized build.
      Default target: x86_64-unknown-linux-gnu
      Host CPU: skylake
    
    BOLT revision 6bb51bf06214af3690af7034f4edeb265732c481
    [lofty@MISO yukari]$ cargo pgo bolt build
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] BOLT profile directory will be cleared.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] BOLT profiles will be stored into /home/lofty/yukari/target/bolt-profiles.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] Binary yukari built successfully. It will be now instrumented with BOLT.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] Binary yukari instrumented successfully. Now run /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented on your workload.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] BOLT instrumentation build finished successfully.
    [lofty@MISO yukari]$ /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented
    -bash: /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented: No such file or directory
    [lofty@MISO yukari]$ find target -name yukari-bolt-instrumented
    [lofty@MISO yukari]$
    

    This was while I was working on https://github.com/yukarichess/yukari.

    I wish I could provide more to work with, but I don't know where to begin.

    opened by Ravenslofty 9
  • cargo pgo optimize: Invalid cross-device link (os error 18)

    I get this error when I run cargo pgo optimize on a project under /home, which is mounted on a separate partition. If I copy the project to e.g. /tmp and run the command on the copy - no problem whatsoever.

    opened by burjui 6
  • Stream output of `cargo`

    Currently, cargo-pgo runs cargo, gathers its output and then writes it to the output. However, it would be much nicer if we could instead stream the output of cargo interactively, so that the user can see it immediately and not just at the end.

    Use-case: https://github.com/Kobzol/cargo-pgo/issues/5#issuecomment-1265031077

    opened by Kobzol 4
  • Reason for stdin(stdio::null)?

    I have a command-line app that I wish to use with PGO and BOLT. It takes a few inputs from stdin at startup. It works well with cargo run.

    cargo pgo seems to be overriding stdin by passing stdio::null().

    Is it necessary for cargo pgo functionalities? cargo pgo run obviously works differently than cargo run due to this.

    opened by whatmatrix 3
  • Run without clearing PGO directory

    Hi, I'd like to run my program multiple times with different flags, merge their profiles, and optimize the binary based on the result. However, cargo pgo run seems to clear the profiles directory on every invocation. Would it be possible to add a flag to avoid doing this?

    opened by ajeetdsouza 2
  • Flags not passed to `rustc`

    When cargo-pgo is used, the [build.rustflags] in .cargo\config.toml are ignored. LTO in Cargo.toml seems to be ignored as well, judging by performance. The only way to pass flags seems to be through the RUSTFLAGS environment variable.

    opened by human-0 2
  • pgo/env: Account for .exe extensions during path resolution

    Currently cargo-pgo does not work out of the box on Windows, because llvm-tools-preview binaries have .exe extensions. This commit extends find_pgo_env to look up llvm-profdata with .exe extension in case one without it is not available.

    I have verified manually that renaming llvm-profdata.exe to llvm-profdata works with current main. bolt/env probably has the same issue on Windows, but I didn't look into that. Should we tackle BOLT in this PR as well? By the way, thanks for this library - it looks very neat!

    opened by osiewicz 2
  • Stream `cargo` output instead of buffering it all at once

    @pkolaczk Can you please check that this solves your issue? You can install this version with this command:

    $ cargo install --git https://github.com/kobzol/cargo-pgo --branch cargo-stream
    

    The output from criterion is not colored, I'm not sure how to do it yet. Setting cargo --color always doesn't help.

    Fixes: https://github.com/Kobzol/cargo-pgo/issues/19

    opened by Kobzol 0
  • Hash PGO profile files to properly invalidate them

    This PR changes merged PGO profile naming so that they will be named with the (BLAKE-3) hash of their contents. This will make sure that the compilation session will be invalidated if the profiles change (before this wouldn't happen, since the profiles used the same path in each compilation session).

    This PR also fixes the merging of already merged profiles, which happened before. Now only the .profraw files will be merged. This should make it faster to run cargo pgo optimize if neither the code nor the profiles have changed.

    Related issue: https://github.com/rust-lang/rust/issues/100397

    opened by Kobzol 0
  • Remove `cargo` dependency

    This should make the crate much easier and quicker to build and also resolve problems with newer Cargo features not being available by the cargo crate version that was used.

    Fixes: https://github.com/Kobzol/cargo-pgo/issues/2

    opened by Kobzol 0
  • Evaluate PGO coverage

    Hi!

    Is there a way to calculate PGO coverage of the profiled program? I see now cargo-pgo shows warnings for functions without a profile. I want to see more "advanced" statistics regarding PGO coverage. I guess something like gcov + lcov from C++ world (with different report capabilities, etc).

    I guess it could be done somehow with llvm-cov report profile.profdata but I am not sure.

    Thanks in advance!

    opened by zamazan4ik 0
  • A couple PGO+BOLT limitations worth documenting?

    In order to get PGO+BOLT working, I needed to

    • Switch from fat LTO to thin
    • Disable strip = "symbols" in my release profile
    • Use trunk LLVM (you already have this in the README, this is just to affirm that it's still needed - I was trying with the 15.0.4 tag previously)

    Is it worth documenting those, and/or is there a way to make those settings work in RUSTFLAGS when you build the PGO-optimized binary that you're going to pass to llvm-bolt --instrument? The errors come from cc, which makes them super cryptic and unhelpful; I found the workarounds by just throwing possible fixes at the wall until something stuck :joy:

    I can retrigger the errors tonight if having a copy of them would be helpful

    opened by mcronce 3
  • Support running benchmarks

    Currently I was only able to build an optimized benchmark binary with cargo pgo --bench <bench>. But I could find no way to feed it into the benchmark runner. I suspect something like cargo pgo bench and cargo pgo bolt bench would be required to support this. The only workaround I found was to copy the bolt optimized binary to the path of the regular cargo bench generated one, essentially replacing the binary. But that's rather cumbersome.

    opened by mvtec-bergdolll 7
Owner
Jakub Beránek