A tokio-uring backed runtime for Rust

tokio-uring

A proof-of-concept runtime backed by io-uring while maintaining compatibility with the Tokio ecosystem. This is a proof of concept and not intended to be used yet. There are probably tons of bugs.

Design doc: https://github.com/tokio-rs/tokio-uring/pull/1

License

This project is licensed under the MIT license.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in tokio-uring by you, shall be licensed as MIT, without any additional terms or conditions.

Comments
  • rt: Hang on too small completion queue

    Similar to #145, and again exercised by #144. If the size of the completion queue is smaller than the number of concurrent writers to the submission queue, the runtime just hangs.

    Probably not committing?

    opened by ollie-etl 35
  • rt: refactor runtime to avoid Rc<RefCell<...>>

    This change refactors the runtime to not use reference counting directly in the ops themselves. Instead, ops access the driver via thread local variables.

    This is sound because dropping the driver (which happens when it is removed from its thread-local state) blocks the thread until all ops complete, ensuring that we do not free the contents of the driver until after all operations have completed.
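
The thread-local access pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the tokio-uring implementation: `Driver`, `CURRENT`, `install_driver`, `remove_driver` and `submit_op` are hypothetical names, and the driver here only counts submissions rather than talking to io-uring or blocking on drop.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for the io-uring driver; it only counts submissions.
#[derive(Default)]
struct Driver {
    submitted: usize,
}

thread_local! {
    // One driver slot per thread; `None` until a runtime installs a driver.
    static CURRENT: RefCell<Option<Driver>> = RefCell::new(None);
}

/// Installs a fresh driver for the current thread.
fn install_driver() {
    CURRENT.with(|slot| *slot.borrow_mut() = Some(Driver::default()));
}

/// Removes the driver from thread-local state and reports how many ops it saw.
fn remove_driver() -> Option<usize> {
    CURRENT.with(|slot| slot.borrow_mut().take().map(|d| d.submitted))
}

/// An op reaches the driver through the thread-local instead of holding its
/// own reference-counted handle.
fn submit_op() -> Result<usize, &'static str> {
    CURRENT.with(|slot| -> Result<usize, &'static str> {
        let mut slot = slot.borrow_mut();
        let driver = (*slot)
            .as_mut()
            .ok_or("no driver installed on this thread")?;
        driver.submitted += 1;
        Ok(driver.submitted)
    })
}

fn main() {
    install_driver();
    submit_op().unwrap();
    submit_op().unwrap();
    println!("ops submitted: {:?}", remove_driver());
}
```

In this sketch an op carries no handle at all, so nothing needs to be cloned or counted per operation; the cost is that an op submitted on a thread without a driver must fail at runtime.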

    opened by Noah-Kennedy 28
  • Fixed buffers to support ReadFixed and WriteFixed ops

    Add infrastructure to manage pre-registered buffers, and operation methods File::read_fixed_at and File::write_fixed_at to make use of them. Exclusive access to buffer data between the application and in-flight ops is controlled at runtime.

    This is initial API to enable fixed buffers. Future developments may include:

    • :white_check_mark: Improved parameter polymorphism with ~#53~ #172;
    • An internal linked list of free buffers, to be able to check out the next available buffer from FixedBufRegistry without enumerating or keeping track of indices;
    • Support IORING_REGISTER_BUFFERS2/IORING_REGISTER_BUFFERS_UPDATE.

    Needs #52 to get tests to pass.
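
One way to picture "exclusive access controlled at runtime" is a registry whose slots are emptied while a buffer is checked out. The sketch below is an assumption-laden model using only the standard library: `Registry`, `CheckedOut`, `check_out` and `check_in` are hypothetical names, and no buffers are actually registered with the kernel.

```rust
/// Hypothetical registry of pre-allocated buffers; not the tokio-uring type.
struct Registry {
    // `None` marks a buffer that is currently checked out.
    slots: Vec<Option<Vec<u8>>>,
}

/// A checked-out buffer remembers its index so it can be returned later.
struct CheckedOut {
    index: usize,
    data: Vec<u8>,
}

impl Registry {
    /// Creates `count` zeroed buffers of `size` bytes each.
    fn new(count: usize, size: usize) -> Self {
        Registry {
            slots: (0..count).map(|_| Some(vec![0u8; size])).collect(),
        }
    }

    /// Hands out buffer `index`, or `None` if the index is out of range or
    /// the buffer is already in use.
    fn check_out(&mut self, index: usize) -> Option<CheckedOut> {
        let data = self.slots.get_mut(index)?.take()?;
        Some(CheckedOut { index, data })
    }

    /// Returns a buffer to its slot, making it available again.
    fn check_in(&mut self, buf: CheckedOut) {
        self.slots[buf.index] = Some(buf.data);
    }
}

fn main() {
    let mut registry = Registry::new(2, 4096);
    let buf = registry.check_out(0).expect("buffer 0 is free");
    // A second check-out of the same index is refused at runtime.
    println!("second check-out allowed: {}", registry.check_out(0).is_some());
    println!("buffer length: {}", buf.data.len());
    registry.check_in(buf);
}
```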

    opened by mzabaluev 27
  • driver: MultiCqe support for Driver Drop

    Handles LifeCycle::CompletionList in driver Drop logic

    During the first pass over all lifecycles, identifies if CompletionLists are finished or not.

    • If they are, it changes the Lifecycle to Completed with null data.
    • If they are not, adds them to the cancel list, and converts to Ignored

    This means that after this pass, we will never observe CompletionList in the lifecycles. This means the existing termination logic is ok unchanged. The ignored entry will stay until removed from the Lifecycle slab by the call to complete.
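
The first pass described above can be sketched as follows. The enum is a simplified, hypothetical model: the real `Lifecycle` variants carry wakers and completion data, the `finished` flag stands in for inspecting the stored completions, and `first_pass` is an illustrative name.

```rust
/// Simplified, hypothetical model of the per-op lifecycle states.
#[derive(Debug, PartialEq)]
enum Lifecycle {
    Submitted,
    /// `None` models a completion carrying "null data".
    Completed(Option<i32>),
    /// A multi-completion op; `finished` records whether the last CQE arrived.
    CompletionList { finished: bool },
    Ignored,
}

/// Rewrites every `CompletionList` entry and returns the indices of the ops
/// that still need to be cancelled.
fn first_pass(lifecycles: &mut [Lifecycle]) -> Vec<usize> {
    let mut cancel = Vec::new();
    for (index, lifecycle) in lifecycles.iter_mut().enumerate() {
        if let Lifecycle::CompletionList { finished } = *lifecycle {
            if finished {
                *lifecycle = Lifecycle::Completed(None);
            } else {
                cancel.push(index);
                *lifecycle = Lifecycle::Ignored;
            }
        }
    }
    cancel
}

fn main() {
    let mut lifecycles = vec![
        Lifecycle::Submitted,
        Lifecycle::CompletionList { finished: true },
        Lifecycle::CompletionList { finished: false },
    ];
    let cancel = first_pass(&mut lifecycles);
    println!("to cancel: {:?}, states: {:?}", cancel, lifecycles);
}
```

After the pass no `CompletionList` remains, which is why the later termination logic in this model would only ever see the other variants.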

    Fixes a gap left by #158 and unblocks #123

    opened by ollie-etl 24
  • Support zero-copy send

    Linux 6.0 introduced IORING_OP_SEND_ZC. This PR introduces the function send_zc on udp (todo: tcp) sockets which takes advantage of it.

    ~This is blocked on this PR on io-uring~ Merged

    As IORING_OP_SEND_ZC can return multiple CQEs per SQE, this also extends the Lifecycle handling in impl Fut for Op<T> to check for these occurrences. It does this using a new method update in Completable, which I added to minimise the scope of the changes: it just falls back to the complete method for implementations which are just single entry.

    This approach could probably be used as the basis of streaming interfaces for some other submission types.

    Since being split, is now blocked on #130

    opened by ollie-etl 22
  • Add TCP, UDP and Unix socket support

    Add TCP, UDP and Unix socket support.

    • [x] TCP
      • [x] Stream
      • [x] Listener
    • [x] UDP socket
    • [ ] Unix
      • [ ] Stream
      • [ ] Listener
      • [ ] Datagram

    Open to feedback

    opened by LinkTed 22
  • rt: cancel all in-flight ops on driver drop

    This fixes an immediate issue where ops like Accept can cause runtime shutdown to hang.

    This may need to be revisited once we have a real cancellation story. It may be better to do cancel-on-drop for ops.

    opened by Noah-Kennedy 21
  • chore: reduce noise in benchmarks

    Removes FuturesUnordered from benchmarks and alters timing to ignore task spawning, so as to home in specifically on the processing of ops and driver performance.

    opened by Noah-Kennedy 18
  • Improve the ergonomics of .slice with IntoSlice trait

    The owned slice API has a flaw: b"hello".to_vec().slice(1..).slice(..3) produces a Slice<Slice<Vec<u8>>>. It is easy to inadvertently end up with nested Slice wrappers, which adds unnecessary overhead to the buffer accessor methods and makes the results of .begin() and .end() difficult to interpret.

    Move the slice method to a new trait IntoSlice, which is used to uniformly convert buffer parameters to a Slice<T>, with the impl of IntoSlice for Slice<T> closing onto Slice<T>.

    Instead of IoBuf which can no longer be implemented for Slice, the buffer parameter type in the generic I/O operation methods is bound with IntoSlice to keep the convenient polymorphism, while reducing code bloat in monomorphization when the application passes both buffers and their slices to tokio-uring operations. Consequently, all buffers returned from the operations are wrapped into Slice.
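
The "closing onto Slice<T>" idea can be sketched with a small stand-alone model. The `Slice` and `IntoSlice` definitions below are simplified assumptions written for illustration (the associated type `Buf` and the helper `resolve` are hypothetical), not the signatures from the PR.

```rust
use std::ops::{Bound, RangeBounds};

/// Simplified owned slice: a buffer plus the window [begin, end) into it.
#[derive(Debug)]
struct Slice<T> {
    buf: T,
    begin: usize,
    end: usize,
}

impl<T> Slice<T> {
    fn begin(&self) -> usize { self.begin }
    fn end(&self) -> usize { self.end }
    fn into_inner(self) -> T { self.buf }
}

/// Hypothetical shape of the trait: slicing always yields `Slice<Self::Buf>`.
trait IntoSlice: Sized {
    type Buf;
    fn slice(self, range: impl RangeBounds<usize>) -> Slice<Self::Buf>;
}

/// Converts a range into absolute (start, end) offsets within `len` bytes.
fn resolve(range: impl RangeBounds<usize>, len: usize) -> (usize, usize) {
    let start = match range.start_bound() {
        Bound::Included(&n) => n,
        Bound::Excluded(&n) => n + 1,
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        Bound::Included(&n) => n + 1,
        Bound::Excluded(&n) => n,
        Bound::Unbounded => len,
    };
    assert!(start <= end && end <= len, "range out of bounds");
    (start, end)
}

impl IntoSlice for Vec<u8> {
    type Buf = Vec<u8>;
    fn slice(self, range: impl RangeBounds<usize>) -> Slice<Vec<u8>> {
        let (begin, end) = resolve(range, self.len());
        Slice { buf: self, begin, end }
    }
}

/// Re-slicing narrows the existing window instead of nesting wrappers.
impl<T> IntoSlice for Slice<T> {
    type Buf = T;
    fn slice(self, range: impl RangeBounds<usize>) -> Slice<T> {
        let (start, end) = resolve(range, self.end - self.begin);
        Slice { buf: self.buf, begin: self.begin + start, end: self.begin + end }
    }
}

fn main() {
    // The type stays Slice<Vec<u8>> rather than Slice<Slice<Vec<u8>>>.
    let s: Slice<Vec<u8>> = b"hello".to_vec().slice(1..).slice(..3);
    println!("begin = {}, end = {}", s.begin(), s.end());
}
```

In this model the second range is interpreted relative to the first window, so `.begin()` and `.end()` always refer to offsets in the underlying buffer.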

    This PR incorporates #52 in order to get tests to pass.

    opened by mzabaluev 17
  • `FixedBufPool`: a dynamic fixed buffer collection

    Make the internal infrastructure for fixed buffers more flexible, allowing FixedBuf handles to be backed by alternative collections. Add one new collection as FixedBufPool, enabling dynamic retrieval of free buffers.
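
As a rough picture of "dynamic retrieval of free buffers", the sketch below keeps free buffers on a stack so that callers never deal with indices. `Pool`, `try_next` and `give_back` are hypothetical names chosen for this illustration; the real FixedBufPool also has to register its buffers with the kernel, which is omitted here.

```rust
/// Hypothetical pool that hands out whichever buffer is free next.
struct Pool {
    // Free buffers are kept on a stack; checked-out buffers are not tracked.
    free: Vec<Vec<u8>>,
}

impl Pool {
    /// Creates a pool of `count` zeroed buffers of `size` bytes each.
    fn new(count: usize, size: usize) -> Self {
        Pool { free: (0..count).map(|_| vec![0u8; size]).collect() }
    }

    /// Returns a free buffer if one is available, without any index lookup.
    fn try_next(&mut self) -> Option<Vec<u8>> {
        self.free.pop()
    }

    /// Puts a buffer back so that a later `try_next` can reuse it.
    fn give_back(&mut self, buf: Vec<u8>) {
        self.free.push(buf);
    }
}

fn main() {
    let mut pool = Pool::new(1, 1024);
    let buf = pool.try_next().expect("one buffer is free");
    println!("pool exhausted: {}", pool.try_next().is_none());
    pool.give_back(buf);
    println!("buffer available again: {}", pool.try_next().is_some());
}
```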

    opened by mzabaluev 16
  • Support multi completion events

    Support of multi completion events, originally proposed in https://github.com/tokio-rs/tokio-uring/pull/123, and submitted separately for review as suggested by @Noah-Kennedy

    Multi completion events contain a CQE_F_MORE flag, which indicates that more events are expected for the same submission entry. This leads to a 1:M correspondence, whereas all ops currently supported are 1:1.

    This PR modifies struct Op<T> to support both types. It does this by adding the update method to the Completable trait. The semantics of this are:

    Take 1 or more completions, update the state of the operation, and return if the operation is Finished, with the result, or if More are required

    For single event Ops, the result will always be Finished. As this is the most likely behaviour, the default method is provided.
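
The default-method arrangement can be sketched like this. It is a simplified model under stated assumptions: `Cqe`, `Update`, `Read` and `SendZc` are hypothetical types, the real trait works on io-uring completion entries, and summing the results is only a stand-in for whatever state a real multi-completion op would keep.

```rust
/// Hypothetical completion entry: a result code and the "more" flag.
#[derive(Debug, Clone, Copy)]
struct Cqe {
    result: i32,
    more: bool,
}

/// Outcome of feeding completions to an op: either a final result, or the
/// op handed back because more completions are expected.
enum Update<Op, Out> {
    Finished(Out),
    More(Op),
}

trait Completable: Sized {
    type Output;

    /// Single-completion path, as used by 1:1 ops.
    fn complete(self, cqe: Cqe) -> Self::Output;

    /// Multi-completion path. The default treats the first completion as
    /// final, so single-event ops only need to implement `complete`.
    fn update(self, cqes: Vec<Cqe>) -> Update<Self, Self::Output> {
        let cqe = cqes.into_iter().next().expect("at least one completion");
        Update::Finished(self.complete(cqe))
    }
}

/// A 1:1 op that relies on the default `update`.
struct Read;

impl Completable for Read {
    type Output = i32;
    fn complete(self, cqe: Cqe) -> i32 {
        cqe.result
    }
}

/// A 1:M op that accumulates results until a completion without `more`.
struct SendZc {
    bytes: i32,
}

impl Completable for SendZc {
    type Output = i32;
    fn complete(self, cqe: Cqe) -> i32 {
        self.bytes + cqe.result
    }
    fn update(mut self, cqes: Vec<Cqe>) -> Update<Self, i32> {
        let mut finished = false;
        for cqe in cqes {
            self.bytes += cqe.result;
            if !cqe.more {
                finished = true;
            }
        }
        if finished { Update::Finished(self.bytes) } else { Update::More(self) }
    }
}

fn main() {
    if let Update::Finished(n) = Read.update(vec![Cqe { result: 7, more: false }]) {
        println!("read finished with {n}");
    }
    match (SendZc { bytes: 0 }).update(vec![Cqe { result: 10, more: true }]) {
        Update::More(_) => println!("send_zc expects more completions"),
        Update::Finished(n) => println!("send_zc finished with {n}"),
    }
}
```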

    To support this with the current Lifecycle handling, the Lifecycle::Completed variant now contains 1 or more completions, rather than 1. This is because there is a data race between reading the completion queue and operations calling poll() to consume Lifecycle::Completed. To manage this, completions are stored in a growable container. smallvec (a vec that stores a small number of elements inline before spilling to the heap) has been chosen to avoid allocations in the vast majority of cases. The default size is chosen to be small (under a cache line), by no rigorous method. It may want to be bigger (8 or 16).

    Other changes to Lifecycle include the Ignored variant. This now has to wait for all outstanding completions for an Op before being dropped. It just checks the more flag to do this.

    There are design decisions here which want discussing.

    The major one in my eyes is the decision to have an unbounded collection of completion events. The other option would be to stall the completion queue handling once some predetermined threshold of completion events for a single op is reached. A waker would be registered, and completion queue handling would be woken when the Lifecycle::Completed was consumed by the Op's poll(). I deemed this unpalatable, as it stalls all Ops.

    opened by ollie-etl 16
  • Provide example showing a file descriptor leak

    Provide an example where a single binary both calls TCP listen accept and connect, and that tracks the file descriptor values, to show that the tokio_uring accept operation can leak file descriptors if it is canceled at an inopportune time: when the uring accept operation is being run by the kernel, a client makes the connection, but the server side has canceled the future.

    It's a small time window where this can happen, but it potentially points to a larger problem with uring operations that are in flight: having to deal with cleanup when the future is cancelled.

    Run with

    cargo run --release --example investigate_accept_socket_leak
    

    It will list the open file descriptors to stdout for a few rounds and will finish with an error if file descriptor leaks were evident.

    opened by FrankReh 1
  • chore: fix criterion benchmarks

    A number of errors got overlooked in #151 and earlier PRs targeting the criterion benchmarks. In particular, the throughput, concurrency, and iteration count values were all wrong.

    This change addresses those issues.

    opened by Noah-Kennedy 0
  • Builder API consistency and ergonomics

    Problem

    A rather awkward Builder API is presented by tokio_uring::Builder.

    Presently the tokio_uring::Builder (TUB) is a thin wrapper over the io_uring::Builder (IUB), but it serves a noble purpose, which is to allow a developer to tweak some settings without having to declare a dependency on io_uring.

    Focusing on *entries, the IUB allows one to set, as a builder method, the size of the Completion Queue, but not the Submission Queue. The SQ Entries value can only be supplied as a (required) parameter to the Builder::build() method. As a result, the TUB stores 1 parameter, the SQ entries, which it then supplies to the build() function. So really, the TUB isn't really a builder at all, but a convenience wrapper to store and supply the SQ Entries value to the IUB build invocation.

    Setting the Completion Queue Size requires the construction of the inner builder:

        tokio_uring::builder()
            .entries(64)
            .uring_builder(
                tokio_uring::uring_builder()
                .setup_cqsize(1024)
            )
            .start(async {
                ...
            })
    

    Options

    1. We rename TUB to something other than "Builder" since it's sort of misleading and, at best, incomplete.
    2. We alter the IUB API to be consistent with regards to its setup_parameters. Then wrap it with a higher-fidelity Builder API in tokio-uring.
      1. Add setup_sqsize to achieve parity with setup_cqsize .
      2. Give it an intelligent default.
      3. Remove the parameter in the io_uring::Builder::build() method, so that it takes its sqsize from its internal member instead.
      4. Update the TUB API to achieve parity with the IUB API (expose all relevant parameters)

    The proposed new API would look something like this:

        tokio_uring::builder()
            .sq_entries(64)
            .cq_entries(1024)
            .iopoll(true)
            .start(async {
                ...
            })
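
To make the proposal concrete, here is a stand-alone sketch of a builder that stores both queue sizes itself. Everything in it is hypothetical: the struct, the default of 256 submission entries, and `effective_cq_entries` are assumptions made for illustration, and no ring is created. The fallback of twice the SQ size mirrors the kernel's default when no CQ size is requested.

```rust
/// Hypothetical builder holding every setting itself, so that `build` or
/// `start` would need no extra parameters.
#[derive(Debug, Clone, PartialEq)]
struct Builder {
    sq_entries: u32,
    cq_entries: Option<u32>,
    iopoll: bool,
}

impl Builder {
    /// Assumed defaults: 256 SQ entries, CQ size left to the fallback.
    fn new() -> Self {
        Builder { sq_entries: 256, cq_entries: None, iopoll: false }
    }

    fn sq_entries(mut self, entries: u32) -> Self {
        self.sq_entries = entries;
        self
    }

    fn cq_entries(mut self, entries: u32) -> Self {
        self.cq_entries = Some(entries);
        self
    }

    fn iopoll(mut self, enabled: bool) -> Self {
        self.iopoll = enabled;
        self
    }

    /// CQ size that would be requested: the explicit value if set, otherwise
    /// twice the SQ size.
    fn effective_cq_entries(&self) -> u32 {
        self.cq_entries.unwrap_or(self.sq_entries * 2)
    }
}

fn main() {
    let builder = Builder::new().sq_entries(64).cq_entries(1024).iopoll(true);
    println!("{:?}, cq = {}", builder, builder.effective_cq_entries());
}
```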
    
    opened by rrichardson 0
  • Update docs to conform to Tokio standards

    This PR does ~3~ 2 things:

    1. ~Alters the Runtime crate::Builder API slightly to make it easier to correctly set sq and cq entry counts.~
    2. Updates method documentation to use indicative mood. This is not a huge change, as most public method docs were conformant.
    3. Enhances other documentation in a few places for clarity.

    opened by rrichardson 7
  • Replace squeue::Entry with squeue::Entry128, Add UringCmd16 and UringCmd80

    This is definitely not in a ready-to-merge state, but I would like to open a conversation about integrating squeue::Entry128 into tokio-uring to unblock integration of io_uring methods that require squeue::Entry128. For context, I am working on implementing a Rust-native ublk server library, and would like to make use of tokio-uring.

    opened by thomasbarrett 1
Releases(v0.4.0)
  • v0.4.0(Nov 5, 2022)

    Fixed

    • Fix panic in Deref/DerefMut for Slice extending into uninitialized part of the buffer (#52)
    • docs: all-features = true (#84)
    • fix fs unit tests to avoid parallelism (#121)
    • Box the socket address to allow moving the Connect future (#126)
    • rt: Fix data race (#146)

    Added

    • Implement fs::File::readv_at()/writev_at() (#87)
    • fs: implement FromRawFd for File (#89)
    • Implement AsRawFd for TcpStream (#94)
    • net: add TcpListener.local_addr method (#107)
    • net: add TcpStream.write_all (#111)
    • driver: add Builder API as an option to start (#113)
    • Socket and TcpStream shutdown (#124)
    • fs: implement fs::File::from_std (#131)
    • net: implement FromRawFd for TcpStream (#132)
    • fs: implement OpenOptionsExt for OpenOptions (#133)
    • Add NoOp support (#134)
    • Add writev to TcpStream (#136)
    • sync TcpStream, UnixStream and UdpSocket functionality (#141)
    • Add benchmarks for no-op submission (#144)
    • Expose runtime structure (#148)

    Changed

    • driver: batch submit requests and add benchmark (#78)
    • Depend on io-uring version ^0.5.8 (#153)

    Internal Improvements

    • chore: fix clippy lints (#99)
    • io: refactor post-op logic in ops into Completable (#116)
    • Support multi completion events: v2 (#130)
    • simplify driver operation futures (#139)
    • rt: refactor runtime to avoid Rc<RefCell<...>> (#142)
    • Remove unused dev-dependencies (#143)
    • chore: types and fields explicitly named (#149)
    • Ignore errors from uring while cleaning up (#154)
    • rt: drop runtime before driver during shutdown (#155)
    • rt: refactor drop logic (#157)
    • rt: fix error when calling block_on twice (#162)

    CI changes

    • chore: update actions/checkout action to v3 (#90)
    • chore: add all-systems-go ci check (#98)
    • chore: add clippy to ci (#100)
    • ci: run cargo test --doc (#135)
  • v0.3.0(Mar 2, 2022)

  • v0.2.0(Jan 9, 2022)

    0.2.0 (January 9th, 2022)

    Fixed

    • fs: fix error handling related to changes in rustc (#69)
    • op: fix 'already borrowed' panic (#39)

    Added

    • fs: add fs::remove_file (#66)
    • fs: implement Debug for File (#65)
    • fs: add remove_dir and unlink (#63)
    • buf: impl IoBuf/IoBufMut for bytes::Bytes/BytesMut (#43)

Owner
Tokio
Rust's asynchronous runtime.