Fast Function Dispatch: Improving the performance of Rust's dynamic function calls

Overview

A safe, pragmatic toolkit for high-performance virtual function calls.

This library provides alternatives to types like Box<dyn Fn(...) -> _> that are more performant in a range of scenarios.

Feature flags

  • nightly: Implements Fn for Func, as well as allowing Func::new to accept multi-argument closures

Why?

You'll often hear it said that Rust is packed full of zero-cost abstractions.

In spirit, this is often true! Many of Rust's fancy features do indeed compile down to machine code that's close enough to what one might write by hand in a 'low-level' language like C that the differences are fairly meaningless.

Sadly, an exception to this rule is function dispatch.

Rust's strategy, upon seeing a trait like the following, and a corresponding dyn coercion, is to generate a vtable.

trait MyTrait {
    fn do_something(&self);
    fn do_something_else(&self, x: i32);
}

struct MyStruct { a: i32 }

impl MyTrait for MyStruct {
    fn do_something(&self) { println!("{}", self.a); }
    fn do_something_else(&self, a: i32) { println!("{a}"); }
}

The vtable might look something like this:

struct MyTraitVtable {
    // `*const ()` represents the `&self` argument of `do_something`
    do_something: fn(*const ()),
    do_something_else: fn(*const (), i32),
}

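// Conceptual only: these casts erase `&MyStruct` to `*const ()` and will not
// compile as written. A real vtable also stores the type's size, alignment,
// and drop glue.
static MYSTRUCT_MYTRAIT_VTABLE: MyTraitVtable = MyTraitVtable {
    do_something: MyStruct::do_something as fn(_),
    do_something_else: MyStruct::do_something_else as fn(_, _),
};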

By and large, this is a reasonable strategy: when the compiler sees &dyn MyTrait, it'll internally represent this as a wide pointer, somewhat akin to the following tuple:

(*const (), *const MyTraitVtable)

The first field represents the pointer to the data, &self. The second field is the vtable, allowing us to look up methods at runtime.
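One quick way to observe the wide pointer is to compare reference sizes. The sketch below is illustrative only: `reference_sizes` is a hypothetical helper, and the two-word size reflects how current Rust compilers lay out trait-object references rather than a layout you should rely on in unsafe code.

```rust
use std::mem::size_of;

trait MyTrait {
    fn do_something(&self);
}

/// Returns (size of a thin reference, size of a trait-object reference) in bytes.
/// `reference_sizes` is a hypothetical helper for this example only.
fn reference_sizes() -> (usize, usize) {
    (size_of::<&i32>(), size_of::<&dyn MyTrait>())
}
```

On current compilers the second value is twice the first: one word for the data pointer and one for the vtable pointer.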

When calling a method on the trait object, the compiler will generate code that first dereferences the vtable pointer to find the vtable, and then selects the field corresponding to the method being invoked. This field is a function pointer: so we can now call this function pointer using the data pointer as its argument.
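To make the two steps concrete, here is a hand-rolled sketch of the dispatch described above. It is not what the compiler literally emits: `DynMyTrait`, `from_mystruct`, and the `mystruct_*` shim functions are hypothetical names for this example, and the methods return values instead of printing so the result is easy to check.

```rust
use std::marker::PhantomData;

struct MyStruct {
    a: i32,
}

impl MyStruct {
    // Returns values rather than printing so results are easy to check.
    fn do_something(&self) -> i32 {
        self.a
    }
    fn do_something_else(&self, x: i32) -> i32 {
        self.a + x
    }
}

// One function pointer per trait method; `*const ()` is the erased `&self`.
struct MyTraitVtable {
    do_something: unsafe fn(*const ()) -> i32,
    do_something_else: unsafe fn(*const (), i32) -> i32,
}

// Shims that restore the concrete type before calling the real method.
// Safety: `this` must point to a live `MyStruct`.
unsafe fn mystruct_do_something(this: *const ()) -> i32 {
    let this = unsafe { &*(this as *const MyStruct) };
    this.do_something()
}

unsafe fn mystruct_do_something_else(this: *const (), x: i32) -> i32 {
    let this = unsafe { &*(this as *const MyStruct) };
    this.do_something_else(x)
}

static MYSTRUCT_MYTRAIT_VTABLE: MyTraitVtable = MyTraitVtable {
    do_something: mystruct_do_something,
    do_something_else: mystruct_do_something_else,
};

// Hand-rolled stand-in for `&'a dyn MyTrait`: data pointer plus vtable pointer.
#[derive(Clone, Copy)]
struct DynMyTrait<'a> {
    data: *const (),
    vtable: &'static MyTraitVtable,
    _borrow: PhantomData<&'a ()>,
}

impl<'a> DynMyTrait<'a> {
    fn from_mystruct(value: &'a MyStruct) -> Self {
        DynMyTrait {
            data: value as *const MyStruct as *const (),
            vtable: &MYSTRUCT_MYTRAIT_VTABLE,
            _borrow: PhantomData,
        }
    }

    fn do_something(self) -> i32 {
        // First hop: load the function pointer from the vtable.
        let f = self.vtable.do_something;
        // Second hop: indirect call through it, passing the data pointer.
        unsafe { f(self.data) }
    }

    fn do_something_else(self, x: i32) -> i32 {
        let f = self.vtable.do_something_else;
        unsafe { f(self.data, x) }
    }
}
```

With `let s = MyStruct { a: 7 };`, calling `DynMyTrait::from_mystruct(&s).do_something_else(5)` goes through the vtable load and then the indirect call before reaching `MyStruct::do_something_else`.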

This works brilliantly for most traits.

Sadly, Rust also uses the same strategy for dispatching dynamic function calls: the Fn traits appear, to Rust, like any other trait. This is unnecessarily inefficient! The Fn trait only has one very commonly invoked method, Fn::call: why should we need to perform double indirection, jumping through two locations in memory, when we could just carry the Fn::call function pointer around directly as the pointer metadata? Worse still, this double-indirection can severely pessimise the code generation of both the caller and callee, trashing register state and requiring unnecessary stack operations.
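The alternative hinted at above can be sketched as a pair of a data pointer and the call function pointer itself, with no vtable in between. This is a minimal illustration of the idea, not this library's implementation: `FnRef` and its methods are hypothetical names, it only borrows the closure, and it is fixed to the signature `Fn(i32) -> i32` to keep the example short.

```rust
use std::marker::PhantomData;

/// A borrowed, type-erased `Fn(i32) -> i32` (hypothetical type for this sketch).
/// The second word is the call function pointer itself, not a vtable pointer.
#[derive(Clone, Copy)]
struct FnRef<'a> {
    data: *const (),
    call: unsafe fn(*const (), i32) -> i32,
    _borrow: PhantomData<&'a ()>,
}

impl<'a> FnRef<'a> {
    fn new<F: Fn(i32) -> i32>(f: &'a F) -> Self {
        // Monomorphised per closure type: restores `F` and invokes it.
        // Safety: `data` must point to a live `F`.
        unsafe fn shim<F: Fn(i32) -> i32>(data: *const (), arg: i32) -> i32 {
            let f = unsafe { &*(data as *const F) };
            f(arg)
        }
        FnRef {
            data: f as *const F as *const (),
            call: shim::<F>,
            _borrow: PhantomData,
        }
    }

    fn call(&self, arg: i32) -> i32 {
        // Single hop: the function pointer is already in hand, so no vtable load.
        unsafe { (self.call)(self.data, arg) }
    }
}
```

For example, given `let offset = 10; let add = |x: i32| x + offset;`, the expression `FnRef::new(&add).call(5)` evaluates to 15. The trade-off is that a bare call pointer carries no drop or size information, so an owning variant would need to store that somewhere else.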

99% of the time, this relatively tiny inefficiency is of no consequence. However, there are circumstances in which this overhead really starts to matter, and it is for those circumstances that this library exists.

Planned features

  • Covering concurrency use-cases: Send and Sync functions
  • Covering more of the Fn traits: FnMut, FnOnce, etc.
  • Different representation strategies: drop function in pointer metadata instead?

Owner
Joshua Barretto