The fastest and safest AV1 encoder.

Last update: May 25, 2022

rav1e

Overview

rav1e is an AV1 video encoder. It is designed to eventually cover all use cases, though in its current form it is most suitable for cases where libaom (the reference encoder) is too slow.

Features

  • Intra, inter, and switch frames
  • 64x64 superblocks
  • 4x4 to 64x64 RDO-selected square and rectangular blocks
  • DC, H, V, Paeth, smooth, and all directional prediction modes
  • DCT, (FLIP-)ADST and identity transforms (up to 64x64, 16x16 and 32x32 respectively)
  • 8-, 10- and 12-bit depth color
  • 4:2:0, 4:2:2 and 4:4:4 chroma sampling
  • 11 speed settings (0-10, exhaustive to near real-time)
  • Constant quantizer and target bitrate (single- and multi-pass) encoding modes
  • Still picture mode

Documentation

Find the documentation in doc/

Releases

For the foreseeable future, a weekly pre-release of rav1e will be published every Tuesday.

Building

Toolchain: Rust

rav1e currently requires Rust 1.44.1 or later to build.

Dependency: NASM

Some x86_64-specific optimizations require NASM 2.14.02 or newer and are enabled by default.

The CI tests against nasm 2.15.05, so bugs specific to other versions may go unnoticed. If you find one, please open an issue!

Install nasm

ubuntu 20.04 (nasm 2.14.02)

sudo apt install nasm

ubuntu 18.04 (nasm 2.14.02)

sudo apt install nasm-mozilla
# link nasm into $PATH
sudo ln /usr/lib/nasm-mozilla/bin/nasm /usr/local/bin/

fedora 31, 32 (nasm 2.14.02)

sudo dnf install nasm

windows (nasm 2.15.05)
Have a NASM binary in your system PATH. In PowerShell:

$NASM_VERSION="2.15.05" # or newer
$LINK="https://www.nasm.us/pub/nasm/releasebuilds/$NASM_VERSION/win64"
# curl.exe bypasses the Windows PowerShell alias of curl to Invoke-WebRequest
curl.exe --ssl-no-revoke -LO "$LINK/nasm-$NASM_VERSION-win64.zip"
# 7z expects the output directory directly after -o, without a space
7z e -y "nasm-$NASM_VERSION-win64.zip" "-oC:\nasm"
# set path for the current session
$env:PATH += ";C:\nasm"

macOS (nasm 2.15.05)

brew install nasm

Release binary

To build a release binary at target/release/rav1e, run:

cargo build --release

Unstable features

Experimental APIs and features can be enabled with the unstable feature.

cargo build --features <feature>,unstable

Current unstable features

  • Channel API:
cargo build --features channel-api,unstable

These features and APIs are bound to change and evolve. Do not rely on them staying the same across releases.

Target-specific builds

The Rust autovectorizer can produce a binary that is about 6%-7% faster if it is allowed to use AVX2 in the general code. You can allow it by running:

RUSTFLAGS="-C target-cpu=native" cargo build --release
# or
RUSTFLAGS="-C target-feature=+avx2,+fma" cargo build --release

The resulting binary will not work on CPUs that lack any of the SIMD extensions enabled at build time.

NOTE: You may use rustc --print target-cpus to check whether the CPU is supported. If it is not, -C target-cpu=native is a no-op.
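Because a binary built with these flags fails on CPUs without the enabled extensions, it can be useful to check at run time what the current machine offers. The following is a minimal sketch using the standard library's is_x86_feature_detected! macro. The helper name supports_avx2_fma is hypothetical and not part of rav1e.

```rust
/// Returns true when the running CPU reports both AVX2 and FMA.
/// Hypothetical helper for illustration; not part of rav1e.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn supports_avx2_fma() -> bool {
    is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma")
}

/// On other architectures these x86 extensions never exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn supports_avx2_fma() -> bool {
    false
}
```

If this returns false, a binary built with -C target-feature=+avx2,+fma should not be run on that machine.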

Building the C-API

rav1e provides a C-compatible library, header and pkg-config file.

To build and install it you can use cargo-c:

cargo install cargo-c
cargo cinstall --release

Usage

Compressing video

Input videos must be in y4m format. The monochrome color format is not supported.

cargo run --release --bin rav1e -- input.y4m -o output.ivf

(Find a y4m-file for testing at tests/small_input.y4m or at http://ultravideo.cs.tut.fi/#testsequences)
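To check up front whether an input is monochrome, one can inspect the stream header, which is the first line of a y4m file. The sketch below assumes the common y4m convention that the colorspace is the space-separated parameter starting with C (for example C420jpeg or Cmono). The function names are hypothetical and this is not how rav1e itself parses input.

```rust
/// Extracts the colorspace tag (the parameter starting with `C`) from a
/// y4m stream header line such as
/// "YUV4MPEG2 W352 H288 F30:1 Ip A1:1 C420jpeg".
/// Returns None if the line is not a y4m header or carries no `C` tag.
pub fn y4m_colorspace(header: &str) -> Option<&str> {
    let mut fields = header.trim_end().split(' ');
    if fields.next()? != "YUV4MPEG2" {
        return None;
    }
    fields.find_map(|field| field.strip_prefix('C'))
}

/// True when the header explicitly declares a monochrome stream.
pub fn is_monochrome(header: &str) -> bool {
    y4m_colorspace(header).map_or(false, |tag| tag.starts_with("mono"))
}
```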

Decompressing video

Encoder output should be compatible with any AV1 decoder compliant with the v1.0.0 specification. You can build compatible aomdec using the following:

mkdir aom_test && cd aom_test
cmake /path/to/aom -DAOM_TARGET_CPU=generic -DCONFIG_AV1_ENCODER=0 -DENABLE_TESTS=0 -DENABLE_DOCS=0 -DCONFIG_LOWBITDEPTH=1
make -j8
./aomdec ../output.ivf -o output.y4m

Configuring

rav1e has several optional features that can be enabled by passing --features to cargo. Passing --all-features is discouraged.

Features

Find the full list in the features table in Cargo.toml.

  • asm - enabled by default. When enabled, assembly is built for the platforms supporting it.
    • x86_64: Requires nasm.
    • aarch64
      • Requires gas
      • Alternative: Use clang assembler by setting CC=clang

NOTE: SSE2 is always enabled on x86_64 and NEON is always enabled on aarch64. You may set the environment variable RAV1E_CPU_TARGET to rust to disable all the assembly-optimized routines at runtime.

Contributing

Please read our guide to contributing to rav1e.

Getting in Touch

Come chat with us on the IRC channel #daala on Freenode! If you don't have IRC set up you can easily connect from your web browser.

GitHub

https://github.com/xiph/rav1e
Comments
  • 1. Don't restrict partition sizes to exactly fit on the right or bottom frame boundaries

    • Current block partitioning approach on the right or bottom frame boundaries in rav1e
    • Both the top-down and bottom-up functions keep splitting the input SB (SuperBlock) until no partitioned block straddles the right or bottom frame boundary. (Whether to keep splitting is decided by 'must_split', which is set to true if the current partition straddles the frame boundary, or on other conditions such as the current size being larger than the desired maximum partition size, as shown in the code: https://github.com/xiph/rav1e/blob/8f273bcbde77e5f3711138a57c13dec9dc793973/src/encoder.rs#L2546)
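The splitting condition described above can be sketched as follows. This is an illustration under simplifying assumptions (square blocks, sizes in pixels) and not the code from encoder.rs; the function signature is hypothetical.

```rust
/// Hypothetical stand-in for the `must_split` condition: a square block of
/// `block_size` pixels at (x, y) has to be split when it crosses the right
/// or bottom frame edge, or when it exceeds the allowed maximum size.
pub fn must_split(
    x: usize,
    y: usize,
    block_size: usize,
    frame_width: usize,
    frame_height: usize,
    max_block_size: usize,
) -> bool {
    let straddles_right = x + block_size > frame_width;
    let straddles_bottom = y + block_size > frame_height;
    straddles_right || straddles_bottom || block_size > max_block_size
}
```

With a 100x100 frame, a 64x64 block at x = 64 must be split, while a 32x32 block at the same position fits and may stay whole.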

    Affected areas of codebase can be:

    • [x] Partition search functions should not restrict partition sizes to exactly fit on the right or bottom frame boundaries
    • [x] Distortion compute functions to use visible area, which can be any size not defined by av1 partition sizes
    • ~~[x] Intra prediction to predict only visible pixels only~~
    • [x] Ref pixels for intra pred (i.e. intra edge pixels) should not use invisible pixels.
    • ~~[ ] Inter prediction (Motion Estimation) to predict only visible pixels only~~
    • ~~[ ] Define the pixel values for invisible area as input to forward transforms, i.e. what kind of padding to use for input and reconstructed frame? Already defined that extension of directly adjacent and last available pixel.~~
    Reviewed by ycho at 2020-02-12 17:47
  • 2. Tile encoding

    (description updated on 16 april 2019)

    This PR implements tile encoding (#631).

    1. The (many) first commits introduce tiling structures, which make it possible to expose several tiled regions of the whole frame data simultaneously.
    2. Following commits use the tiling structures where necessary in the whole codebase.
    3. Then, command line arguments are added, the encoder encodes tiles separately, and separate tiles are written to the bitstream.
    4. Finally, parallelization is enabled (spoiler: 1 line of code).

    Context

    Encoding a frame first involves frame-wise accesses (initialization, etc.), then tile-wise accesses (to encode tiles in parallel), then frame-wise accesses using the results of tile-encoding (deblocking, cdef, …):

                                    \
          +----------------+         |
          |                |         |
          |                |         |  Frame-wise accesses
          |                |          >
          |                |         |   - FrameState<T>
          |                |         |   - Frame<T>
          +----------------+         |   - Plane<T>
                                    /    - ...
    
                  ||   tiling views
                  \/
                                    \
      +---+  +---+  +---+  +---+     |
      |   |  |   |  |   |  |   |     |  Tile encoding (possibly in parallel)
      +---+  +---+  +---+  +---+     |
                                     |
      +---+  +---+  +---+  +---+     |  Tile-wise accesses
      |   |  |   |  |   |  |   |      >
      +---+  +---+  +---+  +---+     |   - TileStateMut<'_, T>
                                     |   - TileMut<'_, T>
      +---+  +---+  +---+  +---+     |   - PlaneRegionMut<'_, T>
      |   |  |   |  |   |  |   |     |
      +---+  +---+  +---+  +---+     |
                                    /
    
                  ||   vanishing of tiling views
                  \/
                                    \
          +----------------+         |
          |                |         |
          |                |         |  Frame-wise accesses
          |                |          >
          |                |         |  (deblocking, CDEF, ...)
          |                |         |
          +----------------+         |
                                    /
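The three phases in the diagram can be sketched with the standard library alone. This is only an illustration on a single row-major plane split into bands of rows, using std::thread::scope (available since Rust 1.63, newer than the minimum version named in this README). The function name and the per-band work are hypothetical, and rav1e's actual tiling structures are described below.

```rust
use std::thread;

/// Runs a three-stage pass over one row-major plane: frame-wise setup,
/// tile-wise work on disjoint row bands (one thread per band), then a
/// frame-wise pass over the results. Returns the sum of all samples.
/// `width * band_rows` must be non-zero.
pub fn encode_like_pipeline(plane: &mut [u32], width: usize, band_rows: usize) -> u64 {
    // Frame-wise access: initialize the whole plane.
    plane.iter_mut().for_each(|p| *p = 1);

    // Tile-wise access: each band is an exclusive, non-overlapping view.
    thread::scope(|s| {
        for (band_index, band) in plane.chunks_mut(width * band_rows).enumerate() {
            s.spawn(move || {
                for p in band.iter_mut() {
                    *p += band_index as u32;
                }
            });
        }
    });

    // The band views are gone, so frame-wise access is possible again.
    plane.iter().map(|&p| u64::from(p)).sum()
}
```

The scope guarantees that all band threads have finished before the final frame-wise pass starts, which mirrors the "vanishing of tiling views" step.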
    

    Tiling

    As you know, in Rust it is not sufficient to avoid reading/writing the same memory from several threads: it must be impossible to write (safe) code that could do it. More precisely, a mutable reference may not alias any other reference to the same memory.

    That's the reason why, as a preliminary step, I replaced accesses using the whole plane as a raw slice in addition to the stride information by PlaneSlice (#1035) and PlaneMutSlice (#1043).

    But Plane(Mut)Slice still borrows the whole plane slice, so it does not, in itself, solve the problem.
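To illustrate the aliasing rule, here is a minimal sketch of what safe Rust does allow out of the box: split_at_mut hands out two mutable views that provably do not overlap. The function name is hypothetical.

```rust
/// Fills the top and bottom halves of a row-major plane through two
/// mutable views that exist at the same time. `split_at_mut` guarantees
/// the two halves cannot overlap, so the borrow checker accepts it.
pub fn fill_halves(plane: &mut [u8], stride: usize, top: u8, bottom: u8) {
    let rows = plane.len() / stride;
    let (top_half, bottom_half) = plane.split_at_mut((rows / 2) * stride);
    top_half.fill(top);
    bottom_half.fill(bottom);
}
```

This only works for bands of whole rows, because those are contiguous in memory. Tiles that split a plane into columns interleave in the underlying slice, which is one reason dedicated region structures are needed.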

    There are several structures to be tiled, which form a tree:

     +- FrameState → TileState
     |  +- Frame → Tile
     |  |  +- Plane → PlaneRegion 
     |  +  RestorationState → TileRestorationState
     |  |  +- RestorationPlane → TileRestorationPlane
     |  |     +- FrameRestorationUnits → TileRestorationUnits
     |  +  FrameMotionVectors → TileMotionVectors
     +- FrameBlocks → TileBlocks
    

    Most of them exist in both const and mutable versions (e.g. PlaneRegion and PlaneRegionMut).

    Tiling structures

    PlaneRegion

    This is a view of a bounded region of a Plane. It is similar to PlaneSlice, except that it does not borrow the whole underlying raw slice. That way, it is possible to get several non-overlapping regions simultaneously.

    In the end, we should probably merge it with PlaneSlice, but it requires more work because some frame-wise code still uses PlaneSlice in the code base.

    It is possible to retrieve a subregion of a region (which may not exceed its parent). In theory, a subregion is defined by a rectangle (for example: x, y, width, height), but in practice, we need more flexibility. For example, we often need to retrieve a region from an offset, using the same bottom-right corner as its parent without providing width and height.

    For that purpose, I propose a specific Area structure (actually, a Rust enum) to describe subregion bounds. Here are some usage examples:

    let region = plane.region(Area::Rect { x: 32, y: 32, width: 512, height: 512 });
    
    // the area is relative to the parent region
    let subregion = region.subregion(Area::StartingAt { x: 128, y: 128 });
    // it is equivalent to
    let subregion = region.subregion(Area::Rect { x: 128, y: 128, width: 384, height: 384 });
    // or
    let subregion = plane.region(Area::Rect { x: 160, y: 160, width: 384, height: 384 });
    

    Retrieving a subregion from a BlockOffset is so common across the code base that I decided to expose it directly:

    let bo = BlockOffset { x: 2, y: 3 };
    let subregion = region.subregion(Area::BlockStartingAt { bo });
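How the relative areas resolve to absolute plane coordinates can be sketched as below. This is a simplified model and not the PR's implementation: the Rect struct and the resolve method are hypothetical, and the BlockStartingAt variant is left out because it depends on block-size details not given here.

```rust
/// Absolute rectangle in plane coordinates (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rect {
    pub x: usize,
    pub y: usize,
    pub width: usize,
    pub height: usize,
}

/// Subregion bounds, expressed relative to a parent region.
#[derive(Debug, Clone, Copy)]
pub enum Area {
    Rect { x: usize, y: usize, width: usize, height: usize },
    StartingAt { x: usize, y: usize },
}

impl Area {
    /// Resolves this area against `parent`; panics if it exceeds the parent.
    pub fn resolve(self, parent: Rect) -> Rect {
        match self {
            Area::Rect { x, y, width, height } => {
                assert!(x + width <= parent.width && y + height <= parent.height);
                Rect { x: parent.x + x, y: parent.y + y, width, height }
            }
            // Keeps the parent's bottom-right corner.
            Area::StartingAt { x, y } => {
                assert!(x <= parent.width && y <= parent.height);
                Rect {
                    x: parent.x + x,
                    y: parent.y + y,
                    width: parent.width - x,
                    height: parent.height - y,
                }
            }
        }
    }
}
```

With a 512x512 region at (32, 32), StartingAt { x: 128, y: 128 } resolves to a 384x384 rectangle at (160, 160) in plane coordinates, matching the equivalences listed in the usage examples.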
    

    Like Plane(Mut)Slice, it provides operator[] and iterators over its rows:

    let row5 = &region[5];
    let value = region[3][4];
    for row in region.rows_iter() {
        let _first_four_values = &row[..4];
    }
    

    The mutable versions of the structure (PlaneRegionMut) and methods are also provided.

    Tile

    A Tile is a view of 3 colocated plane regions (Tile is to a PlaneRegion as a Frame is to a Plane).

    The mutable version (TileMut) is also provided.

    TileState

    The way the FrameState fields are mapped in TileState depends on how they are accessed tile-wise and frame-wise.

    Some fields (like qc) are only used during tile-encoding, so they are only stored in TileState.

    Some other fields (like input or segmentation) are not written tile-wise, so they just reference the matching field in FrameState.

    Some others (like rec) are written tile-wise, but must be accessible frame-wise once the tile views vanish (e.g. for deblocking).

    It contains 2 tiled views: TileRestorationState and a vector of TileMotionVectorsMut (a tiled view of FrameMotionVectors).

    This structure is only provided as mutable (TileStateMut). A const version is not necessary, and would require instantiating a const version of all its embedded tiled views.

    TileBlocks

    TileBlocks is a tiled view of FrameBlocks. It exposes the blocks associated to the tile.

    The mutable version (TileBlocksMut) is also provided.

    Splitting into tiles

    A TilingInfo structure computes all the details about tiling from the frame width and height and the (log2 of the) number of tile columns and rows. The details are accessible for initializing data or writing into the bitstream.

    It provides an iterator over tiles (yielding one TileStateMut and one TileBlocksMut for each tile).

    Frame offsets vs tile offsets

    In encode_tile(), super-block, block and plane offsets are expressed relative to the tile. The tiling views expose their data relative to the tile:

    • plane_region[y][x] is pixel (x, y) relative to the plane region,
    • tile_blocks[boy][box] contains the Block at (box, boy) relative to the tile.

    TileStateMut exposes some references to frame-level data stored in FrameState:

    • input is a reference to the whole frame,
    • input_hres and input_qres are references to the whole planes.

    When accessing these frame-level data, tile offsets are converted to frame offsets, for example by:

    let frame_bo = ts.to_frame_block_offset(bo);
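Conceptually, the conversion adds the tile's origin to the tile-relative offset. A minimal sketch, assuming the tile stores its origin in block units (the TileOrigin type is hypothetical, standing in for the relevant part of TileStateMut):

```rust
/// Offset expressed in block units.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockOffset {
    pub x: usize,
    pub y: usize,
}

/// Hypothetical tile descriptor: where the tile starts in the frame,
/// in block units.
pub struct TileOrigin {
    pub bo: BlockOffset,
}

impl TileOrigin {
    /// Converts a tile-relative block offset to a frame-relative one.
    pub fn to_frame_block_offset(&self, tile_bo: BlockOffset) -> BlockOffset {
        BlockOffset {
            x: self.bo.x + tile_bo.x,
            y: self.bo.y + tile_bo.y,
        }
    }
}
```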
    

    Current state

    It works.

    Need more tests and reviews.

    Usage

    Pass the requested log2 number of tiles, with --tile-cols-log2 and --tile-rows-log2. For example, to request 2x2 tiles:

    rav1e video.y4m -o video.ivf --tile-cols-log2 1 --tile-rows-log2 1
    

    Currently, the number of tiles is passed in log2 (like in libaom, even if the aomenc options are called --tile-columns and --tile-rows), to avoid any confusion. Maybe we could find a correct user-friendly option later.

    Note that the actual number of tiles may be smaller (e.g. if the image size has fewer super-blocks).
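Why the actual count can be smaller is easiest to see with numbers. The sketch below assumes 64x64 superblocks and uniformly sized tiles whose size in superblocks is rounded up. The function name is hypothetical and this is not the TilingInfo code from the PR.

```rust
/// Number of tiles along one dimension for a frame of `frame_size` pixels
/// when 2^`tiles_log2` uniformly sized tiles are requested.
/// Assumes 64x64 superblocks; tiles contain whole superblocks.
pub fn actual_tile_count(frame_size: usize, tiles_log2: u32) -> usize {
    const SB_SIZE: usize = 64;
    assert!(frame_size > 0);
    let sb_count = (frame_size + SB_SIZE - 1) / SB_SIZE;
    let requested = 1usize << tiles_log2;
    // Tile size in superblocks, rounded up so every superblock is covered.
    let tile_size_sb = (sb_count + requested - 1) / requested;
    (sb_count + tile_size_sb - 1) / tile_size_sb
}
```

For example, a 320-pixel dimension has 5 superblocks. Requesting 4 tiles (log2 = 2) gives tiles of 2 superblocks each, so only 3 tiles are produced.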

    Reviewed by rom1v at 2019-03-18 16:22
  • 3. [WIP] Add horz/vert (2:1 only) partitions to top-down partition rdo

    Not complete yet.

    TODO :

    • [x] Required to add rectangular transforms, i.e. 8x4, 4x8, 16x8, 8x16, 32x16, 16x32, 32x64, 64x32
    • [x] Check the rectangular transforms work correctly with existing helper functions
    • [x] Check currently enabled intra prediction code can also work for rectangular cases
    • [x] Test top-down partition rdo with new partitions added
    • [x] Check with each minimum block size
    • [x] Check that both intra and inter modes work correctly with rectangular block sizes
    Reviewed by ycho at 2018-11-06 23:42
  • 4. Allow inclusion in static builds

    I'm trying to build a static version of ffmpeg including rav1e, but it seems like one can't compile a static library that links -lgcc_s, because there's only a shared version of it.

    gcc-10 -L<prefix>/lib -I<prefix>/lib -L/usr/lib/gcc/x86_64-linux-gnu/10 -static -Wl,--as-needed -Wl,-z,noexecstack -I<prefix>/include/rav1e -L<prefix>/lib -o /tmp/ffconf.EpJq5glV/test /tmp/ffconf.EpJq5glV/test.o -lrav1e -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -lm -lpthread
    /usr/bin/ld: cannot find -lgcc_s
    collect2: error: ld returned 1 exit status
    ERROR: rav1e >= 0.1.0 not found using pkg-config
    

    I'm not very experienced with C, so I'm not sure whether this is something that can be changed at all, or whether there's an option I couldn't find that allows this (although I searched quite thoroughly, and all of the "solutions" either didn't work or aren't applicable).

    Reviewed by cromefire at 2021-03-12 16:41
  • 5. Desync with speed 0 and 1

    Tested with commit: acb7b69227ab29b42f8258b66d5c0dfee9a32470

    ./target/release/rav1e ~/Downloads/crowd_run_2160p50.y4m -o test.ivf -r test_rec.y4m --quantizer 128 --speed=1 --limit=3 --low_latency --tune=Psnr

    ../aom_build/aomdec test.ivf -o test_dec.y4m -v
    ../aom_build/aomdec test.ivf --rawvideo -o test_dec.yuv -v
    ffmpeg -i test_rec.y4m test_rec.yuv -v 0
    cmp test_rec.yuv test_dec.yuv

    test_rec.yuv test_dec.yuv differ: char 8324073, line 1 ==> difference in 2nd frame

    Similar desync with speed 0. The 1st frame is fine. I checked that the mismatch first happens at the 2nd frame when two frames are encoded, and does not happen if only one frame is encoded. Speeds 2, 6 and 10 were fine.

    Reviewed by ycho at 2019-11-19 03:48
  • 6. Adaptive keyframe selection

    Detects the level of difference between the current frame and the previous frame, and if the difference is above a preset threshold, encodes the current frame as a keyframe.

    Introduces a new CLI option, --max-keyint (-i), to set the maximum interval between two keyframes. Reuses the existing --keyint (-I) option as the minimum interval between two keyframes. Adjusts the default settings to be 15 frames minimum and 150 frames maximum. In theory, 300 frames may even be a safe default, but a conservative default was selected given that decoder performance is still insufficient for higher definition videos, and particularly long keyframe intervals negatively impact seek time.
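The decision logic described above can be sketched as follows. The difference metric (mean absolute difference of samples) and the threshold value are assumptions made for illustration, since the description does not specify them. Only the minimum and maximum interval behavior comes from the text, and all names are hypothetical.

```rust
/// Keyframe placement settings; 15 and 150 are the defaults named above.
pub struct KeyframeConfig {
    pub min_keyint: u64,
    pub max_keyint: u64,
    /// Hypothetical threshold on the frame difference metric.
    pub threshold: f64,
}

/// Mean absolute difference between two frames' samples.
/// An assumed metric; the changeset may measure difference differently.
pub fn mean_abs_diff(prev: &[u8], cur: &[u8]) -> f64 {
    assert_eq!(prev.len(), cur.len());
    assert!(!prev.is_empty());
    let total: u64 = prev
        .iter()
        .zip(cur)
        .map(|(&a, &b)| (i16::from(a) - i16::from(b)).abs() as u64)
        .sum();
    total as f64 / prev.len() as f64
}

/// Decides whether the current frame becomes a keyframe.
pub fn is_keyframe(cfg: &KeyframeConfig, frames_since_key: u64, diff: f64) -> bool {
    if frames_since_key >= cfg.max_keyint {
        return true; // never exceed the maximum interval
    }
    if frames_since_key < cfg.min_keyint {
        return false; // too soon after the previous keyframe
    }
    diff > cfg.threshold
}
```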

    Compression improvements were up to 25% on test sources.

    Before:

    Frame 1499 - Inter frame - 756 bytes - encoded 1500 frames, 1.15 fps, 656.97 Kb/s
    
    Key Frames:     50    avg size:   27489 B
    Inter:        1450    avg size:    2676 B
    Intra Only:      0    avg size:       0 B
    Switch:          0    avg size:       0 B
    

    After:

    Frame 1499 - Inter frame - 725 bytes - encoded 1500 frames, 1.15 fps, 538.58 Kb/s
    
    Key Frames:     13    avg size:   27876 B
    Inter:        1487    avg size:    2653 B
    Intra Only:      0    avg size:       0 B
    Switch:          0    avg size:       0 B
    

    Closes #717

    Reviewed by shssoichiro at 2018-11-21 04:53
  • 7. Add tiles support

    • [ ] Implement the configuration settings
      • Context structure
      • Expose CLI options for it
    • [ ] have the Frame provide a tile iterator example
      • [ ] Have a FrameTileIterator
      • [ ] Have a FrameTile
      • [ ] Make the FrameTile produce PlaneSlice
    • [ ] Use the tile elements in the codebase
    Reviewed by lu-zero at 2018-10-02 17:28
  • 8. Enable open partition on frame boundary

    Issue #2166

    • The AV1 spec allows a partition that straddles the frame boundary if certain conditions are met, i.e. 'Not Split' is allowed if more than half of the partition size is available (based on the coded frame size, i.e. a multiple of 8 pixels) in either the horizontal or the vertical direction.

    • For top-down partition search (i.e. current speed levels 2~10), use bottom-up search on the frame boundary.

    • A function for computing the visible area of a partition or a tx block based on the actual frame size (not the coded size) is added. In general it should only be used for computing distortion, so do not use it for adjusting the block size. One critical example: the size of a coefficient block, when it is encoded, is not affected by visibility.

    • CFL specifically requires reading outside the coded frame when it computes the average and obtains the AC components for a luma block, i.e. luma_ac().
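The visible-area computation mentioned in the list can be sketched as a simple clamp against the actual frame size. The function name and signature are hypothetical and not those used in the PR.

```rust
/// Returns (visible_width, visible_height) in pixels for a block of
/// `block_w` x `block_h` placed at (x, y) in a frame whose actual size
/// is `frame_w` x `frame_h`. Meant for distortion computation only.
pub fn visible_size(
    x: usize,
    y: usize,
    block_w: usize,
    block_h: usize,
    frame_w: usize,
    frame_h: usize,
) -> (usize, usize) {
    let w = block_w.min(frame_w.saturating_sub(x));
    let h = block_h.min(frame_h.saturating_sub(y));
    (w, h)
}
```

For a 100x100 frame, a 64x64 block at x = 64 has only 36 visible columns, so distortion would be accumulated over a 36x64 area while the coded block size stays 64x64.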

    Reviewed by ycho at 2020-06-15 18:48
  • 9. Failed to spawn process

    Hi there, my OS is Windows 10. I'm trying to build rav1e and I got this error:

    error: failed to run custom build command for `rav1e v0.3.1`
    
    Caused by:
      process didn't exit successfully: `C:\Users\Серега\Desktop\RUST\stream\target\debug\build\rav1e-f617848cd6c89f31\build-script-build` (exit code: 101)
    --- stdout
    cargo:rustc-cfg=nasm_x86_64
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/ipred.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/ipred.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/me.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/me.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/itx_ssse3.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/itx_ssse3.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/sad_sse2.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/sad_sse2.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/sad_avx.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/sad_avx.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/satd.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/satd.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/cdef.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/cdef.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/tables.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/tables.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/mc.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/mc.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/ipred_ssse3.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/ipred_ssse3.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/itx.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/itx.o"
    running: "nasm" "-fwin64" "-g" "-IC:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out/" "-Isrc/" "C:\\Users\\Серега\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\rav1e-0.3.1\\src/x86/mc_ssse3.asm" "-o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/mc_ssse3.o"
    running: "lib" "/OUT:C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\rav1easm.lib" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/ipred.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/ipred_ssse3.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/itx.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/itx_ssse3.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/mc.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/mc_ssse3.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/me.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/sad_sse2.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/sad_avx.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/satd.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/cdef.o" "C:\\Users\\Серега\\Desktop\\RUST\\stream\\target\\debug\\build\\rav1e-14e80f4d321818c9\\out\\src/x86/tables.o"
    
    --- stderr
    thread 'main' panicked at 'failed to spawn process: No such file or directory. (os error 2)', C:\Users\Серега\.cargo\registry\src\github.com-1ecc6299db9ec823\nasm-rs-0.1.7\src\lib.rs:339:19
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    
    Reviewed by TheRadioGuy at 2020-05-01 10:52
  • 10. Frame flushing regression - last sub GOP constructed incorrectly

    I perform semi-regular encoder benchmarks on a set of anime-centered clips. While preparing this month's report I noticed a massive drop in BD rate (up to 12% less efficiency).

    I bisected the commits between the last "good" tests, performed on commit 64b9f501, and the one where I first noticed the drop, commit 57e4f926. Turned out the bad commit is 3202bf70 (ping @shssoichiro):

    # sh bd_rate.sh /c/Users/utente/Desktop/regressionTesting/rav1e-0.1.0-1219-be245fc0.out /c/Users/utente/Desktop/regressionTesting/rav1e-0.1.0-1222-3202bf70.out
            RATE (%) DSNR (dB)
     MSSSIM 12.5511 -0.502696
    PSNRHVS 12.6137 -0.699069
      HVMAF 12.4122 -1.06813
    

    BD rate report up to the commit before the bad one:

    # sh bd_rate.sh /c/Users/utente/Desktop/regressionTesting/rav1e-0.1.0-977-64b9f501.out /c/Users/utente/Desktop/regressionTesting/rav1e-0.1.0-1219-be245fc0.out
            RATE (%) DSNR (dB)
     MSSSIM -3.70748 0.161186
    PSNRHVS -3.5851 0.215709
      HVMAF -5.32006 0.524379
    

    BD rate between 64b9f501 and a more recent one (d3b81ca1):

    # sh bd_rate.sh /c/Users/utente/Desktop/regressionTesting/rav1e-0.1.0-977-64b9f501.out /c/Users/utente/Desktop/regressionTesting/rav1e-0.1.0-1768-d3b81ca1.out
            RATE (%) DSNR (dB)
     MSSSIM 9.60217 -0.394358
    PSNRHVS 9.65825 -0.547198
      HVMAF 6.88864 -0.659584
    

    The regression can be reliably replicated on a variety of anime clips with different resolutions (1280x720, 1920x1080), so I'm sure it's not just a pathological case with this particular clip.

    Note that the regression is not nearly as bad on more "real life" content, for example ToS:

    # sh bd_rate.sh /c/Users/utente/Desktop/codecComparison/TearsOfSteel/statistics/rav1e-0.1.0-977-64b9f501.out /c/Users/utente/Desktop/codecComparison/TearsOfSteel/statistics/rav1e-0.1.0-1726-57e4f926.out
            RATE (%) DSNR (dB)
     MSSSIM 0.557544 -0.0240243
    PSNRHVS 0.898872 -0.0503387
      HVMAF -0.0647014 -0.0338935
    

    I'm using the AppVeyor Win64 builds in an MSYS2 environment on Win7 64bits. The clip used during bisection can be found here (FFV1 video in MKV container): https://mega.nz/#!QwhCxIob!I_aWvxX_igulPzZw9nr__nxvr7X-jYPcOlCLUfqL_-I

    Cmdlines used during bisection:

    seq -w 112 16 160 | xargs -n1 -P6 -I {} ./rav1e-0.1.0-1222-3202bf70.exe --low_latency false -o rav1e-0.1.0-1222-3202bf70/test.rav1e.cq{}.ivf --quantizer {} -s 2 --tune Psnr orig.i420.y4m

    Reviewed by SmilingWolf at 2019-03-07 10:37
  • 11. add mp4 muxer.

    @kodabb and @tdaede This will fix https://github.com/xiph/rav1e/issues/998. I am not familiar with rust. Any suggestion is welcome.

    Please ignore de2611a and 32b57af, I already sent them out at https://github.com/xiph/rav1e/pull/1259

    thanks

    Reviewed by xuguangxin at 2019-05-08 09:33
  • 12. Possible speed upgrade, Use HW hevc encoding for first pass

    Maybe the first pass could be run on a hardware accelerator such as nvenc or vceenc, with the resulting data then passed to the AV1 encoder.

    Reviewed by LuisB79 at 2022-05-20 02:35
  • 13. In x86inc, `pxor m0, m0` is translated inefficiently

    It seems like in x86inc.asm, pxor m0, m0 gets translated into vpxor ymm0, ymm0 when mmsize == 32. This is a bit inefficient because vpxor xmm0, xmm0 does the same thing and is preferred, because although XORing xmm0 or ymm0 are both 4 byte instructions, on AMD CPUs before Zen 2 the ymm version takes 2 extra uops, and the EVEX version of the instruction actually does take extra bytes to encode.

    Not exactly sure if this should be changed in x86inc itself or if all places in the code that do this should fix it themselves by using xm0 instead of m0 (there are surprisingly quite a few places that don't do this).

    Reviewed by redzic at 2022-05-16 17:33
  • 14. Remove uses of Arc::make_mut to avoid clones

    Profiling revealed a measurable number of clones coming out of TileStateMut::new, which was traceable to usage of Arc::make_mut. A characteristic of Arc::make_mut is that it clones the inner data whenever other Arc pointers to the same allocation exist at the time of the call. In theory, I have concerns that this could cause data inconsistencies, although that does not seem to have happened in practice in rav1e. However, the clones are avoidable, because each place where we use Arc::make_mut can be safely assumed to be independent, i.e. there is never more than one thread accessing the same portion of the data.
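The clone-on-write behavior of Arc::make_mut is easy to observe in isolation. The helper below is a hypothetical illustration, not code from this changeset: it reports whether a call had to move the data to a new allocation.

```rust
use std::sync::Arc;

/// Appends a byte through `Arc::make_mut` and returns true if the call
/// left `handle` pointing at a different allocation than before, which
/// happens when the Arc was shared and the inner data had to be cloned.
pub fn make_mut_cloned(handle: &mut Arc<Vec<u8>>) -> bool {
    let before = Arc::as_ptr(handle);
    Arc::make_mut(handle).push(0);
    before != Arc::as_ptr(handle)
}
```

When the Arc is the only handle to its data, the write happens in place. As soon as a second handle exists, the same call silently copies the whole vector, and the two handles diverge.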

    • 640x360 with 1 tile: 1.5% memory usage reduction
    • 640x360 with 4 tiles: 4% memory usage reduction
    • 1920x1080 with 4 tiles: 6% memory usage reduction

    No measurable runtime improvement with 1 tile. ~1% runtime improvement with 4 tiles.

    Reviewed by shssoichiro at 2022-05-09 15:41
  • 15. Compute segmentation offsets dynamically

    Rather than reusing variance calculations to do spatial AQ, this changeset relies on the fact that the spatiotemporal scales we already have account for both spatial and temporal importance. It uses that existing metric, together with an assessment of its distribution within a frame, to determine how many segments to use and what the deltaq offset should be for each of them, instead of using a constant set of +21/-21/0.

    The segment splitting code in this changeset is originally from @cyanreg in #2247.

    As an immediate followup to this pull request, I still plan to add luma-based AQ bias to provide more bits to dark areas and adapt AQ for HDR clips. Ideas for future improvements are also very welcome.

    AWCY shows promising results--neutral metrics and, in my opinion, good psychovisual improvements.

    Closes #752

    Reviewed by shssoichiro at 2022-05-07 08:58
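To make the idea in item 15 concrete, here is a rough sketch of deriving per-segment offsets from the distribution of a per-block importance metric. Everything in it is an assumption made for illustration: the function name `dynamic_segment_offsets`, the equal-count grouping, the log2 mapping, and the `strength` parameter are hypothetical and are not taken from the pull request, whose actual segment-splitting logic is not described here.

```rust
/// Hypothetical sketch, not rav1e's implementation.
/// `scales` holds one positive importance value per block (larger means
/// more important). Returns one delta-q offset per segment, ordered from
/// the least to the most important group of blocks.
fn dynamic_segment_offsets(scales: &[f64], max_segments: usize, strength: f64) -> Vec<i32> {
    if scales.is_empty() || max_segments == 0 {
        return Vec::new();
    }
    // Work in the log domain so ratios between scales become differences.
    let mut logs: Vec<f64> = scales.iter().map(|s| s.log2()).collect();
    logs.sort_by(|a, b| a.total_cmp(b));
    let mean = logs.iter().sum::<f64>() / logs.len() as f64;
    let spread = logs[logs.len() - 1] - logs[0];
    // A flat frame gains nothing from segmentation: use a single segment.
    let segments = if spread < 1e-9 { 1 } else { max_segments.min(logs.len()) };
    let mut offsets = Vec::with_capacity(segments);
    for i in 0..segments {
        // Equal-count groups over the sorted values (assumed splitting rule).
        let start = i * logs.len() / segments;
        let end = (i + 1) * logs.len() / segments;
        let group = &logs[start..end];
        let group_mean = group.iter().sum::<f64>() / group.len() as f64;
        // Groups above the frame mean get a negative offset (more bits).
        offsets.push((-(group_mean - mean) * strength).round() as i32);
    }
    offsets
}
```

With this sketch a frame whose blocks all share one scale collapses to a single segment with offset 0, while a frame with two distinct groups gets offsets of opposite sign, in contrast to a fixed +21/-21/0 set.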
  • 16. Add pre-processing denoising

    This changeset adds a --denoise CLI option which enables denoising prior to encoding. This takes a strength value from 0-50, where 0 disables denoising. The default is 0, or half of the --photon-noise setting if --photon-noise is enabled. --denoise can be set manually and will override the denoise strength chosen by --photon-noise.

    The denoiser implemented is an FFT-based spatio-temporal denoiser based on the DFTTest plugin from VapourSynth. This was chosen because it provides a reasonable balance of speed and quality.

    This also moves the --photon-noise and --photon-noise-table args into the stable feature set, as was discussed in #2924.

    Denoising performance is currently rather slow. Per @tdaede on IRC, we're thinking it makes sense to open this up for merging and improve performance in a followup changeset, given that denoising is not turned on by default.

    Reviewed by shssoichiro at 2022-05-02 05:44
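The defaulting rule described in item 16 can be written down as a small function. This is a sketch under stated assumptions, not the rav1e CLI code: the name `effective_denoise` is hypothetical, a photon-noise value of 0 is assumed to mean "disabled", and "half" is assumed to round down.

```rust
/// Hypothetical helper mirroring the rule stated for --denoise.
/// `denoise` is the value passed on the command line, if any.
/// `photon_noise` is the --photon-noise setting; 0 is assumed to mean disabled.
fn effective_denoise(denoise: Option<u8>, photon_noise: u8) -> Result<u8, String> {
    match denoise {
        // Strength is documented as 0-50; reject anything above that.
        Some(strength) if strength > 50 => {
            Err(format!("denoise strength {} is outside 0-50", strength))
        }
        // An explicit value always overrides the photon-noise default.
        Some(strength) => Ok(strength),
        // Default: half of the photon-noise setting, which is 0 when it is
        // disabled (integer division, rounding down, is an assumption).
        None => Ok(photon_noise / 2),
    }
}
```

For example, `effective_denoise(None, 20)` yields a strength of 10, while `effective_denoise(Some(5), 20)` keeps the explicit 5.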
Related tags
Plays back videos in your terminal in an insanely slow and inefficient way.

term-video I guess this is usable now... Compilation Since this project is built using Rust, install its toolchain first, for example using rustup. gi

Feb 23, 2022
Pure-rust implementation of legacy H.263 video codec and associated color transforms

website | demo | nightly builds | wiki h263-rs h263-rs is a pure-Rust implementation of ITU-T Recommendation H.263 (2005/08), a video codec commonly u

Apr 22, 2022
A CLI tool that converts videos to ASCII and displays them to the terminal on the fly

A CLI tool that converts videos to ASCII and displays them to the terminal on the fly

Mar 12, 2022
Gyroflow is an application that can stabilize your video by using motion data from a gyroscope and optionally an accelerometer

Gyroflow is an application that can stabilize your video by using motion data from a gyroscope and optionally an accelerometer. Modern cameras record that data internally (GoPro, Sony, Insta360, etc.), and this application uses it to stabilize the captured footage precisely. It can also use gyro data from an external source (e.g. from Betaflight blackbox).

May 27, 2022
Lumiere is a proof-of-concept/example video player built with the Slint UI framework and libmpv

Lumiere is a proof-of-concept/example video player built with the Slint UI framework and libmpv. This was built quickly to try out Slint and its new OpenGL underlay feature.

May 20, 2022
AV1 encoding tool with fast VMAF sampling. Uses svt-av1, ffmpeg & vmaf

ab-av1 AV1 encoding tool with fast VMAF sampling. Uses svt-av1, ffmpeg & vmaf. Command: auto-encode Automatically determine the best crf to deliver th

May 22, 2022
Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow

Parquet2 This is a re-write of the official parquet crate with performance, parallelism and safety in mind. The five main differentiators in compariso

May 15, 2022
Aptos-core strives towards being the safest and most scalable layer one blockchain solution.

Aptos-core strives towards being the safest and most scalable layer one blockchain solution. Today, this powers the Aptos Devnet, tomorrow Mainnet in order to create universal and fair access to decentralized assets for billions of people.

May 24, 2022
A DHCP parser and encoder for DHCPv4/DHCPv6

dhcproto A DHCP parser and encoder for DHCPv4/DHCPv6. dhcproto aims to be a functionally complete DHCP implementation. Many common option types are im

May 20, 2022
mico (minimalistic config file format) encoder and decoder

mico This library implements a parser and emitter for mico (minimalistic config file format). Format example: Name: mico Description: minimalistic con

Jan 30, 2022
A binary encoder / decoder implementation in Rust.

Bincode A compact encoder / decoder pair that uses a binary zero-fluff encoding scheme. The size of the encoded object will be the same or smaller tha

May 24, 2022
A Rust encoder/decoder for Dominic Szablewski's QOI format for fast, lossless image compression.

QOI - The “Quite OK Image” format This is a Rust encoder and decoder for Dominic Szablewski's QOI format for fast, lossless image compression. See the

Mar 8, 2022
An experimental gif encoder for Windows 10.

giffun An experimental gif encoder for Windows 10. Resources Based on "Techniques for GPU-based Color Quantization" (2019) by Matthias Trapp, Sebastia

Dec 22, 2021
A binary encoder / decoder implementation in Rust.

Bincode A compact encoder / decoder pair that uses a binary zero-fluff encoding scheme. The size of the encoded object will be the same or smaller tha

May 27, 2022
A basic rust QOI decoder/encoder

libqoi A basic rust QOI decoder/encoder. Why QOI QOI is a lossless image format with a one page specification. It can achieve better compression than

May 21, 2022
Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. 🚀

flaco Perhaps the fastest and most memory efficient way to pull data from PostgreSQL into pandas and numpy. 🚀 Have a gander at the initial benchmarks

May 4, 2022
ConnectorX - Fastest library to load data from DB to DataFrames in Rust and Python

ConnectorX enables you to load data from databases into Python in the fastest and most memory efficient way.

May 23, 2022
Finding the shortest and fastest path in the graph to optimize traffic.

Finding the shortest and fastest path in the graph to optimize traffic. Because we don't want to spend unnecessary time searching for the optimal route back to home.

Oct 26, 2021
Fastest lz4 compression library in Node.js, powered by napi-rs and lz4-flex.

Lz4 Fastest lz4 compression library in Node.js, powered by napi-rs and lz4-flex. Install this package yarn add lz4-napi API export function compress:

May 23, 2022
The Fastest and most Advanced Ethereum Client

The Fastest and most Advanced Ethereum Client. » Download the latest release « Table of Contents Description Technical Overview Building 3.1 Building

Feb 17, 2022