Cargo subcommand for optimizing binaries with PGO and BOLT.

Overview

cargo-pgo

Cargo subcommand that makes it easier to use PGO and BOLT to optimize Rust binaries.

Installation

$ cargo install cargo-pgo

You will also need the llvm-profdata binary for PGO and llvm-bolt and merge-fdata binaries for BOLT.

You can install the PGO helper binary by adding the llvm-tools-preview component to your toolchain with rustup:

$ rustup component add llvm-tools-preview

For BOLT, it's unfortunately more complicated. See the BOLT installation section below for a guide.

BOLT support is currently experimental.

PGO/BOLT workflow

It is important to understand the workflow of using feedback-directed optimizations. Put simply, it consists of three general steps:

  1. Build binary with instrumentation
    • Perform a special build of your executable which will add additional instrumentation code to it.
  2. Gather performance profiles
    • Run your instrumented binary on representative workloads. The binary will generate profile files on disk, which will then be used to optimize the binary.
    • Try to gather as much data as possible. Ideally, run your binary for at least a minute.
  3. Build an optimized binary using generated profiles
    • The compiler will use the generated profiles to build an optimized version of your binary.
    • The binary will be optimized with respect to the profiled workloads. If you execute it on a substantially different workload, the optimizations might not work (or they might even make your binary slower!).

Example

[Image: example usage of the tool]

Usage

Before you start to optimize your binaries, you should first check if your environment is set up correctly, at least for PGO (BOLT is more complicated). You can do that using the info command:

$ cargo pgo info

PGO

cargo-pgo provides subcommands that wrap common Cargo commands. It will automatically add --release to wrapped commands where it is applicable, since it doesn't really make sense to perform PGO on debug builds.

Generating the profiles

First, you need to generate the PGO profiles by performing an instrumented build. You can currently do that in several ways. The most generic command for creating an instrumented artifact is cargo pgo instrument:

$ cargo pgo instrument [<command>] -- [cargo-args]

The <command> argument specifies which Cargo command will be executed. It is optional and defaults to build. You can pass additional arguments for cargo after --.

There are several ways of producing the profiles:

  • Building a binary

    $ cargo pgo build
    # or
    $ cargo pgo instrument build

    This is the simplest and recommended approach. You build an instrumented binary and then run it on some workloads. Note that the binary will be located at <target-dir>/<target-triple>/release/<binary-name>.

  • Running an instrumented program

    $ cargo pgo run
    # or
    $ cargo pgo instrument run

    You can also directly execute an instrumented binary with the cargo pgo run command, which is a shortcut for cargo pgo instrument run. This command will instrument the binary and then execute it right away.

  • Run instrumented tests

    $ cargo pgo test
    # or
    $ cargo pgo instrument test

    This command will generate profiles by executing tests. Note that unless your test suite is really comprehensive, it might be better to create a binary and run it on some specific workloads instead.

  • Run instrumented benchmarks

    $ cargo pgo bench
    # or
    $ cargo pgo instrument bench

    This command will generate profiles by executing benchmarks.

Building an optimized binary

Once you have generated some profiles, you can execute cargo pgo optimize to build an optimized version of your binary.

If you want, you can also pass a command to cargo pgo optimize to e.g. run PGO-optimized benchmarks or tests:

$ cargo pgo optimize bench
$ cargo pgo optimize test
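
The whole PGO cycle described above can be chained in a small shell script. The following is only a sketch under a few assumptions: the default target directory (target) is used, the binary is built for the host target triple, and the functions binary_path, host_triple and pgo_cycle are hypothetical helpers that are not part of cargo-pgo.

```shell
# Hypothetical helper: compose the path of the built binary from the layout
# <target-dir>/<target-triple>/release/<binary-name>.
binary_path() {
    printf '%s/%s/release/%s\n' "$1" "$2" "$3"
}

# Host target triple as reported by rustc, e.g. x86_64-unknown-linux-gnu.
host_triple() {
    rustc -vV | sed -n 's/^host: //p'
}

# Usage: pgo_cycle <binary-name> [workload arguments...]
pgo_cycle() {
    name=$1
    shift
    # 1. Build the instrumented binary.
    cargo pgo build || return 1
    # 2. Run it on a workload to gather profiles.
    "$(binary_path target "$(host_triple)" "$name")" "$@" || return 1
    # 3. Build the optimized binary from the gathered profiles.
    cargo pgo optimize
}
```

For example, pgo_cycle myapp --input workload.txt would run the three steps for a binary named myapp. Both the name and the arguments are placeholders.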

BOLT

Using BOLT with cargo-pgo is similar to using PGO. However, you have to build BOLT manually, and support for it is currently experimental.

BOLT is not supported directly by rustc, so the instrumentation and optimization commands are not directly applied to binaries built by rustc. Instead, cargo-pgo creates additional binaries that you have to use for gathering profiles and executing the optimized code.

Generating the profiles

First, you need to generate the BOLT profiles. To do that, execute the following command:

$ cargo pgo bolt build

The instrumented binary will be located at <target-dir>/<target-triple>/release/<binary-name>-bolt-instrumented. Execute it on several workloads to gather as much data as possible.

Note that for BOLT, the profile gathering step is optional. You can also simply run the optimization step (see below) without any profiles, although it will probably not have a large effect.

Building an optimized binary

Once you have generated some profiles, you can execute cargo pgo bolt optimize to build an optimized version of your binary. The optimized binary will be named <binary-name>-bolt-optimized.
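
The BOLT steps can be scripted in a similar way when you want to profile several workloads in a row. This is a rough sketch, not a definitive recipe. It assumes the default target directory (target), a build for the host target triple, and a program that accepts a workload file as its only argument. bolt_binary_path and bolt_cycle are hypothetical helper names.

```shell
# Hypothetical helper: path of a BOLT artifact, given the target directory,
# the target triple, the binary name and a suffix such as "instrumented".
bolt_binary_path() {
    printf '%s/%s/release/%s-bolt-%s\n' "$1" "$2" "$3" "$4"
}

# Usage: bolt_cycle <binary-name> <workload-file>...
bolt_cycle() {
    name=$1
    shift
    # Host target triple as reported by rustc.
    triple=$(rustc -vV | sed -n 's/^host: //p')
    # Build the binary and instrument it with BOLT.
    cargo pgo bolt build || return 1
    # Run the instrumented binary once per workload to gather profiles.
    for workload in "$@"; do
        "$(bolt_binary_path target "$triple" "$name" instrumented)" "$workload" || return 1
    done
    # Build the BOLT-optimized binary from the gathered profiles.
    cargo pgo bolt optimize
}
```

A call such as bolt_cycle myapp small.txt large.txt would gather profiles from two placeholder workload files before optimizing.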

BOLT + PGO

Yes, BOLT and PGO can even be combined :) To do that, you should first generate PGO profiles and then use BOLT on the already PGO-optimized binaries. You can do that using the --with-pgo flag:

# Build PGO instrumented binary
$ cargo pgo build
# Run binary to gather PGO profiles
$ ./target/.../<binary>
# Build BOLT instrumented binary using PGO profiles
$ cargo pgo bolt build --with-pgo
# Run binary to gather BOLT profiles
$ ./target/.../<binary>-bolt-instrumented
# Optimize a PGO-optimized binary with BOLT
$ cargo pgo bolt optimize --with-pgo

BOLT installation

Here's a short guide on how to compile LLVM with BOLT. You will need a recent compiler, CMake and ninja.

  1. Download LLVM
    $ git clone https://github.com/llvm/llvm-project
    $ cd llvm-project 
  2. (Optional) Checkout a stable version, at least 14.0.0
    $ git checkout llvmorg-14.0.5
    Note that BOLT is being actively fixed, so a trunk version of LLVM might actually work better.
  3. Prepare the build
    $ cmake -S llvm -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=${PWD}/llvm-install \
      -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt;bolt"
  4. Compile LLVM with BOLT
    $ cd build
    $ ninja
    $ ninja install 
    The built files should be located at <llvm-dir>/llvm-install/bin. You should add this directory to $PATH to make BOLT usable with cargo-pgo.
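
Once the build has finished, adding the install directory to $PATH and checking that the BOLT tools are found might look like the sketch below. LLVM_DIR is a hypothetical variable and its default location ($HOME/llvm-project) is an assumption, so point it at wherever you cloned LLVM. check_tools is an illustrative helper, not something provided by cargo-pgo or LLVM.

```shell
# Assumption: LLVM was cloned into $LLVM_DIR and installed as described above,
# so the tools live in $LLVM_DIR/llvm-install/bin.
LLVM_DIR="${LLVM_DIR:-$HOME/llvm-project}"
export PATH="$LLVM_DIR/llvm-install/bin:$PATH"

# Report which of the given tools can be found on PATH.
# Returns a non-zero status if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools llvm-bolt merge-fdata || echo "BOLT tools are not on PATH yet"
```

To make the change permanent, the export line can be placed in your shell's startup file.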

Related work

  • cargo-pgo: I basically reimplemented this crate independently. It uses an almost identical approach, but doesn't support BOLT. It is no longer maintained, and I got permission from its author to (re)use its name.

License

MIT

Comments
  • cargo pgo bolt build gives nonexistent executable to run

    [[email protected] yukari]$ rustc -V
    rustc 1.65.0-nightly (2befdefdd 2022-08-06)
    [[email protected] yukari]$ cargo pgo -V
    cargo-pgo 0.1.0
    [[email protected] yukari]$ llvm-bolt --version
    LLVM (http://llvm.org/):
      LLVM version 16.0.0git
      Optimized build.
      Default target: x86_64-unknown-linux-gnu
      Host CPU: skylake
    
    BOLT revision 6bb51bf06214af3690af7034f4edeb265732c481
    [[email protected] yukari]$ cargo pgo bolt build
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] BOLT profile directory will be cleared.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] BOLT profiles will be stored into /home/lofty/yukari/target/bolt-profiles.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] Binary yukari built successfully. It will be now instrumented with BOLT.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] Binary yukari instrumented successfully. Now run /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented on your workload.
    [2022-08-07T14:04:55Z INFO  cargo_pgo::bolt::instrument] BOLT instrumentation build finished successfully.
    [[email protected] yukari]$ /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented
    -bash: /home/lofty/yukari/target/x86_64-unknown-linux-gnu/release/yukari-bolt-instrumented: No such file or directory
    [[email protected] yukari]$ find target -name yukari-bolt-instrumented
    [[email protected] yukari]$
    

    This was while I was working on https://github.com/yukarichess/yukari.

    I wish I could provide more to work with, but I don't know where to begin.

    opened by Ravenslofty 9
  • cargo pgo optimize: Invalid cross-device link (os error 18)

    I get this error when I run cargo pgo optimize on a project under /home, which is mounted on a separate partition. If I copy the project to e.g. /tmp and run the command on the copy - no problem whatsoever.

    opened by burjui 2
  • pgo/env: Account for .exe extensions during path resolution

    Currently cargo-pgo does not work out of the box on Windows, because llvm-tools-preview binaries have .exe extensions. This commit extends find_pgo_env to look up llvm-profdata with .exe extension in case one without it is not available.

    I have manually verified that renaming llvm-profdata.exe to llvm-profdata works with current main. bolt/env probably has the same issue on Windows, but I didn't look into that. Should we tackle BOLT in this PR as well? By the way, thanks for this library - it looks very neat!

    opened by osiewicz 2
  • Hash PGO profile files to properly invalidate them

    This PR changes merged PGO profile naming so that they will be named with the (BLAKE-3) hash of their contents. This will make sure that the compilation session will be invalidated if the profiles change (before this wouldn't happen, since the profiles used the same path in each compilation session).

    This PR also fixes the merging of already merged profiles, which happened before. Now only the .profraw files will be merged. This should make it faster to run cargo pgo optimize if neither the code nor the profiles have changed.

    Related issue: https://github.com/rust-lang/rust/issues/100397

    opened by Kobzol 0
  • Remove `cargo` dependency

    This should make the crate much easier and quicker to build and also resolve problems with newer Cargo features not being available by the cargo crate version that was used.

    Fixes: https://github.com/Kobzol/cargo-pgo/issues/2

    opened by Kobzol 0
  • Support running benchmarks

    Currently I was only able to build an optimized benchmark binary with cargo pgo --bench <bench>. But I could find no way to feed it into the benchmark runner. I suspect something like cargo pgo bench and cargo pgo bolt bench would be required to support this. The only workaround I found was to copy the bolt optimized binary to the path of the regular cargo bench generated one, essentially replacing the binary. But that's rather cumbersome.

    opened by mvtec-bergdolll 3
Owner
Jakub Beránek