Magical Automatic Deterministic Simulator for distributed systems in Rust.

Overview

MadSim

Magical Automatic Deterministic Simulator for distributed systems.

Deterministic simulation

MadSim is a Rust async runtime similar to tokio, but with a key feature called deterministic simulation.

The main idea is borrowed from the sled simulation guide and FoundationDB. Part of the implementation is inspired by tokio-rs/simulation.

Ensuring Determinism

Developers should eliminate any randomness in the application code. That's not easy.

Here are some tips to avoid randomness:

To make sure your code is deterministic, run your test with the following environment variable:

MADSIM_TEST_CHECK_DETERMINISTIC=1

Your test will be run at least twice with the same seed. If any non-determinism is detected, the test panics as soon as possible.
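
The idea behind such a check can be sketched without MadSim at all. The example below is only an illustration under stated assumptions: the `XorShift64` generator, `run_once` and `check_deterministic` are hypothetical names, and nothing here reflects how MadSim implements the check internally. It runs a workload twice with the same seed, records the random choices, and panics if the two traces differ.

```rust
/// Tiny xorshift64 generator, used only so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(seed.max(1))
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical workload: every random choice is drawn from the seeded
/// generator and recorded, so the whole run is a function of the seed.
fn run_once(seed: u64) -> Vec<u64> {
    let mut rng = XorShift64::new(seed);
    (0..8).map(|_| rng.next_u64() % 100).collect()
}

/// Runs the workload twice with the same seed and panics on any divergence.
fn check_deterministic(seed: u64) -> Vec<u64> {
    let first = run_once(seed);
    let second = run_once(seed);
    assert_eq!(first, second, "non-determinism detected for seed {}", seed);
    first
}
```

If `run_once` drew from the system clock or an unseeded generator instead, the two traces would be expected to differ and the assertion would fire.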

Related Projects

  • MadRaft: Labs on the Raft consensus algorithm, derived from MIT 6.824 and PingCAP Talent Plan.

License

Apache License 2.0

Comments
  • support tokio's tracing feature

    error: failed to select a version for `madsim-tokio`.
        ... required by package `risingwave_ctl v0.1.9 (/Users/skyzh/Work/risingwave/src/ctl)`
        ... which satisfies path dependency `risingwave_ctl` (locked to 0.1.9) of package `risingwave_cmd v0.1.9 (/Users/skyzh/Work/risingwave/src/cmd)`
    versions that meet the requirements `=0.2.0-alpha.3` (locked to 0.2.0-alpha.3) are: 0.2.0-alpha.3
    
    the package `risingwave_ctl` depends on `madsim-tokio`, with features: `tracing` but `madsim-tokio` does not have these features.
    

    We could check whether this feature flag can be passed through to tokio.

    opened by skyzh 1
  • Don't expose `NetLocalHandle` or make it !Send

    NetLocalHandle is the entry point for all network operations, but it must not be sent to other threads: a thread that uses a handle created on another thread will block forever. The current API design invites misuse; for example, users tend to store the handle in global state. To prevent that, we should either hide this type and make its methods static, or make the type !Send.
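
One common way to make a type !Send on stable Rust is a zero-sized raw-pointer marker. The sketch below uses a hypothetical `LocalHandle` type with a made-up `node_id` field; it is not the real `NetLocalHandle` and only illustrates the second option.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a per-thread handle (not madsim's actual type).
pub struct LocalHandle {
    node_id: u64,
    // Raw pointers are neither Send nor Sync, so this zero-sized marker
    // removes both auto traits from the struct without any runtime cost.
    _not_send: PhantomData<*const ()>,
}

impl LocalHandle {
    /// Creates a handle for the given (hypothetical) node id.
    pub fn current(node_id: u64) -> Self {
        LocalHandle {
            node_id,
            _not_send: PhantomData,
        }
    }

    pub fn node_id(&self) -> u64 {
        self.node_id
    }
}

// With the marker in place, the following no longer compiles, because the
// closure would have to move a non-Send value to another thread:
//
//     let handle = LocalHandle::current(1);
//     std::thread::spawn(move || handle.node_id());
```

The same marker also keeps the handle out of a `static` wrapped in a `Mutex`, since `Mutex<T>` is only `Sync` when `T` is `Send`.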

    opened by wangrunji0408 1
  • bug: the first RPC will timeout after kill-restart

    https://github.com/huang-jl/madsim-bug/blob/main/src/lib.rs

    [ WARN][0.114723s][0.0.0.0:0][madsim_test::tests] Server crash!
    [ WARN][5.114723s][0.0.0.0:0][madsim_test::tests] Server restart!
    [ INFO][10.114723s][10.0.0.2:1234][madsim_test] Send Incr request from 10.0.0.2:1234 to 10.0.0.1:8000
    [ WARN][11.114723s][10.0.0.2:1234][madsim_test] Send incr fail: RPC timeout  <---
    [ INFO][11.114723s][10.0.0.2:1234][madsim_test] Send Incr request from 10.0.0.2:1234 to 10.0.0.1:8000
    [ INFO][11.124569s][10.0.0.1:8000][madsim_test] Receive increment request, local addr = 10.0.0.1:8000
    [ INFO][11.128317s][10.0.0.2:1234][madsim_test] Send incr success
    
    bug 
    opened by wangrunji0408 1
  • Add `#[service]` macro

    Currently, building a server on top of the RPC interface needs a lot of boilerplate code:

    pub struct Server {...}
    
    impl Server {
        pub fn new() -> Arc<Self> {
            let server = Arc::new(Server {...});
            server.add_rpc_handler();
            server
        }
    
        // boilerplate!
        fn add_rpc_handler(self: &Arc<Self>) {
            let net = NetLocalHandle::current();
    
            let this = self.clone();
            net.add_rpc_handler(move |req: Ping| {
                let this = this.clone();
                async move { this.ping(req) }
            });
        }
    
        fn ping(&self, req: Ping) {
            // handle RPC...
        }
    }
    

    We hope to provide a procedural macro to make that easy:

    #[madsim::service]
    impl Server {
        #[ctor]
        pub fn new() -> Arc<Self> {
            Arc::new(Server {...})
        }
    
        #[rpc]
        fn ping(&self, req: Ping) {
            // handle RPC...
        }
    }
    
    enhancement 
    opened by wangrunji0408 1
  • fix(time): fix time interception on ARM64 macOS

    Signed-off-by: Runji Wang

    https://github.com/rust-lang/rust/pull/103594 changed the internal structure of Instant on ARM64 macOS, which breaks the build after nightly-2022-11-19 (roughly) and stable-1.67.

    opened by wangrunji0408 0
  • feat(etcd): support load and dump

    This PR introduces the Client::dump() and SimServer::load() functions to dump and load etcd data. The data is dumped as pretty-printed JSON, with bytes encoded in Rust's ASCII escape format to keep it human-readable.

    Example dump data:

    {
      "revision": 516,
      "kv": {
        "cf/meta/leader": "\\n\\x0c0.0.0.0:5690"
      },
      "lease": {}
    }
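
The escaped value in the example can be reproduced with the standard library alone. The sketch below assumes the encoding matches `std::ascii::escape_default`; the PR text does not say which function is used, and `escape_bytes` is a hypothetical helper name.

```rust
use std::ascii::escape_default;

/// Encodes raw bytes with Rust's ASCII escape rules: printable ASCII stays
/// as-is, `\n`, `\t`, `\r`, quotes and backslashes get short escapes, and
/// every other byte becomes `\xNN`.
fn escape_bytes(bytes: &[u8]) -> String {
    bytes
        .iter()
        .flat_map(|&b| escape_default(b))
        .map(char::from)
        .collect()
}
```

Each backslash produced here is escaped once more when the string is serialized to JSON, which is why the dump shows `\\n\\x0c` for the bytes 0x0a and 0x0c.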
    
    opened by wangrunji0408 0
  • feat(tonic): support interceptor

    This PR adds interceptor support to the tonic simulator. It also refactors the internal implementation and fixes the passing of metadata in tonic requests and responses.

    Added API:

    • tonic::{service, Extensions}
    opened by wangrunji0408 0
  • fix(time): fix panic cause by std internal structure change

    https://github.com/rust-lang/rust/pull/102368 (nightly-2022-09-29) changed the memory layout of std::time::{SystemTime, Instant} on Linux, so they no longer match the layout of libc::timespec. The previous transmute between these types therefore corrupts data and triggers a panic.

    This PR fixes this issue and releases v0.2.9.
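
For illustration, a conversion that does not depend on the private layout of `SystemTime` can go through the public `duration_since` API. This is only a sketch of one layout-independent approach, not necessarily what the PR does; `TimeSpec` is a hypothetical mirror of the `tv_sec`/`tv_nsec` pair in `libc::timespec`.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical mirror of the seconds/nanoseconds pair in `libc::timespec`.
#[derive(Debug, PartialEq)]
struct TimeSpec {
    tv_sec: i64,
    tv_nsec: i64,
}

/// Converts through the public API instead of transmuting, so the result
/// does not depend on how std lays out `SystemTime` internally.
fn to_timespec(t: SystemTime) -> TimeSpec {
    match t.duration_since(UNIX_EPOCH) {
        Ok(after) => TimeSpec {
            tv_sec: after.as_secs() as i64,
            tv_nsec: after.subsec_nanos() as i64,
        },
        Err(err) => {
            // `t` lies before the epoch; keep tv_nsec in [0, 1_000_000_000).
            let before = err.duration();
            let secs = before.as_secs() as i64;
            let nanos = before.subsec_nanos() as i64;
            if nanos == 0 {
                TimeSpec { tv_sec: -secs, tv_nsec: 0 }
            } else {
                TimeSpec { tv_sec: -secs - 1, tv_nsec: 1_000_000_000 - nanos }
            }
        }
    }
}
```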

    opened by wangrunji0408 0
  • Tcp connect failure error not catched

    https://github.com/madsim-rs/madsim/blob/15e0c805d0c5ea90aa6aca73dc7f27e6497ecb8d/madsim/src/std/net/tcp.rs

            let stream = TcpStream::connect(addr).await.unwrap();
            self.inner
                .setup_connection(self.addr, Some((addr, recver)), stream)
                .await;
            sender
    

    If TcpStream::connect(addr) fails (e.g. the target is not up), the unwrap panics.

    It could return an error instead.
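
A minimal sketch of that suggestion, assuming the surrounding function can be changed to return `io::Result`. The `open_connection` helper and its `connect` parameter are hypothetical; the connect step is passed in as a closure only so the example runs without a network, and in real code it would be the actual `TcpStream::connect` call.

```rust
use std::io;
use std::net::SocketAddr;

/// Hypothetical connection setup, generic over the connect step.
fn open_connection<S, C>(addr: SocketAddr, connect: C) -> io::Result<S>
where
    C: FnOnce(SocketAddr) -> io::Result<S>,
{
    // `?` hands a failed connect back to the caller instead of panicking.
    let stream = connect(addr)?;
    // Any setup that needs the stream would follow here.
    Ok(stream)
}
```

The caller can then decide whether a refused connection should be retried, reported, or treated as fatal.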

    opened by login256 0
  • [NEED HELP] How to call a service from another service?

    I am trying to simulate several services communicating via madsim RPC and I don't know how to call a service from another service. The Endpoint::call method is applied on an instance of Endpoint but I haven't found a way to get the Endpoint of the current service. See below a small example of what I'm trying to do.

    #[derive(Serialize, Deserialize, Request)]
    #[rtype("String")]
    pub struct HttpGetRequest(pub String);
    
    #[derive(Clone)]
    pub struct FrontendService {
        redisAddr: SocketAddr,
    }
    
    #[madsim::service]
    impl FrontendService {
        pub fn new(redisAddr: SocketAddr) -> Self {
            FrontendService {
                redisAddr,
            }
        }
    
        #[rpc]
        async fn serve_http_request(&self, req: HttpGetRequest) -> String {
            let reply = <CURRENT_ENDPOINT?>.call(self.redisAddr, redis::GetRedisRequest(req.0.clone())).await.unwrap();
            reply
        }
    }
    

    I tried multiple approaches:

    • The method serve_on consumes the endpoint, and Endpoint is not cloneable, so I can't pass the endpoint on after the initial binding.
    • Creating an Endpoint with the method Endpoint::connect doesn't work in this context.
    • I can't use Arc, as the serve_on method consumes an instance of Endpoint.

    Is this a supported scenario? If so, how can I do it?

    If I can solve this problem, I will submit a PR containing an example of a microservices topology communicating via RPC and running on several simulated nodes. I think this kind of complete example doesn't exist yet in the repo and it would probably save time for future users. Feel free to point me to such an example if it exists.

    opened by lquerel 3
  • Replace std::sync::<Lock> with parking_lot lib

    The main API difference between the two libraries is poison detection. Poison detection is useful when a panic happens in one thread while others are still running, but that is not the case in this library. In an RPC library, we usually panic only when the whole system is broken and cannot recover; otherwise we should not panic. In other words, there is no need to detect mutex poisoning in madsim.

    Apart from poison detection, parking_lot locks perform better in most cases, so switching to parking_lot's synchronous locks seems the obvious choice to me.
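
For readers unfamiliar with poisoning, the sketch below shows the std behaviour being discussed, using only the standard library. The function name `poisoned_after_panic` is made up for this example. A thread panics while holding a `std::sync::Mutex`, and later `lock()` calls return an error; parking_lot mutexes have no poison flag, so their `lock()` returns the guard directly.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a std mutex by panicking while the guard is held, then reports
/// whether the mutex carries the poison flag afterwards.
fn poisoned_after_panic() -> bool {
    let counter = Arc::new(Mutex::new(0u32));

    let worker = {
        let counter = Arc::clone(&counter);
        thread::spawn(move || {
            let _guard = counter.lock().unwrap();
            panic!("simulated failure while holding the lock");
        })
    };

    // join() returns Err because the worker panicked; that is expected here.
    let _ = worker.join();

    // Every later lock() on a poisoned std mutex returns Err(PoisonError).
    let poisoned = counter.lock().is_err();
    poisoned
}
```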

    opened by rogercloud 3
Owner
MadSys Research Group
Fluvio is a high-performance distributed streaming platform that's written in Rust

Fluvio is a high-performance distributed streaming platform that's written in Rust, built to make it easy to develop real-time applications.

InfinyOn 1.6k Dec 30, 2022
Raft distributed consensus for WebAssembly in Rust

WRaft: Raft in WebAssembly What is this? A toy implementation of the Raft Consensus Algorithm for WebAssembly, written in Rust. Basically, it synchron

Emanuel Evans 60 Oct 22, 2022
Paxakos is a pure Rust implementation of a distributed consensus algorithm

Paxakos is a pure Rust implementation of a distributed consensus algorithm based on Leslie Lamport's Paxos. It enables distributed systems to consistently modify shared state across their network, even in the presence of failures.

Pavan Ananth Sharma 2 Jul 5, 2022
The lightest distributed consensus library. Run your own replicated state machine! ❤️

Little Raft The lightest distributed consensus library. Run your own replicated state machine! ❤️ Installing Simply import the crate. In your Cargo.to

Ilya Andreev 359 Dec 26, 2022
Sorock is an experimental "so rocking" scale-out distributed object storage

Sorock is an experimental "so rocking" scale-out distributed object storage

Akira Hayakawa 6 Jun 13, 2022
A universal, distributed package manager

Cask A universal, distributed package manager. Installation | Usage | How to publish package? | Design | Contributing | Cask.toml If you are tired of:

null 39 Dec 30, 2022
Easy Hadoop Streaming and MapReduce interfaces in Rust

Efflux Efflux is a set of Rust interfaces for MapReduce and Hadoop Streaming. It enables Rust developers to run batch jobs on Hadoop infrastructure wh

Isaac Whitfield 31 Nov 22, 2022
A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka

rust-rdkafka A fully asynchronous, futures-enabled Apache Kafka client library for Rust based on librdkafka. The library rust-rdkafka provides a safe

Federico Giraud 1.1k Jan 8, 2023
Rust client for Apache Kafka

Kafka Rust Client Project Status This project is starting to be maintained by John Ward, the current status is that I am bringing the project up to da

Yousuf Fauzan 902 Jan 2, 2023
Easy-to-use beanstalkd client for Rust (IronMQ compatible)

rust-beanstalkd Easy-to-use beanstalkd client for Rust (IronMQ compatible) Install Add this dependency to your Cargo.toml beanstalkd = "*" Documentati

Johannes Schickling 44 Oct 4, 2022
libhdfs binding and wrapper APIs for Rust

hdfs-rs libhdfs binding library and rust APIs which safely wraps libhdfs binding APIs Current Status Alpha Status (Rust wrapping APIs can be changed)

Hyunsik Choi 32 Dec 1, 2022
The Raft algorithm implement by Rust.

Raft The Raft algorithm implement by Rust. This project refers to Eli Bendersky's website, the link as follows: https://eli.thegreenplace.net/2020/imp

Qiang Zhao 1 Oct 23, 2021
Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

MichelNowak 0 Mar 29, 2022
Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

Damavand is a code that simulates quantum circuits. In order to learn more about damavand, refer to the documentation. Development status Core feature

prevision.io 6 Mar 29, 2022
Easy c̵̰͠r̵̛̠ö̴̪s̶̩̒s̵̭̀-t̶̲͝h̶̯̚r̵̺͐e̷̖̽ḁ̴̍d̶̖̔ ȓ̵͙ė̶͎ḟ̴͙e̸̖͛r̶̖͗ë̶̱́ṉ̵̒ĉ̷̥e̷͚̍ s̷̹͌h̷̲̉a̵̭͋r̷̫̊ḭ̵̊n̷̬͂g̵̦̃ f̶̻̊ơ̵̜ṟ̸̈́ R̵̞̋ù̵̺s̷̖̅ţ̸͗!̸̼͋

Rust S̵̓i̸̓n̵̉ I̴n̴f̶e̸r̵n̷a̴l mutability! Howdy, friendly Rust developer! Ever had a value get m̵̯̅ð̶͊v̴̮̾ê̴̼͘d away right under your nose just when

null 294 Dec 23, 2022
Project Masterpass is a deterministic databaseless key management algorithm, aimed to help those who cannot protect their encryption keys in storage

Project Masterpass (working title) Attention! This project is still under heavy development, and SHOULD NOT be used in practice, as the algorithms cou

Gyorgy Wang 2 Sep 11, 2022
open source training courses about distributed database and distributed systemes

Welcome to learn Talent Plan Courses! Talent Plan is an open source training program initiated by PingCAP. It aims to create or combine some open sour

PingCAP 8.3k Dec 30, 2022
A simple to use rust package to generate or parse Twitter snowflake IDs,generate time sortable 64 bits unique ids for distributed systems

A simple to use rust package to generate or parse Twitter snowflake IDs,generate time sortable 64 bits unique ids for distributed systems (inspired from twitter snowflake)

houseme 5 Oct 6, 2022
The fly.io distributed systems challenges solved in Rust

The fly.io distributed systems challenges solved in Rust. Live-streamed in https://youtu.be/gboGyccRVXI License Licensed under either of Apache Licens

Jon Gjengset 162 Apr 19, 2023
A model checker for implementing distributed systems.

A model checker for implementing distributed systems.

Stateright Actor Framework 1.3k Dec 15, 2022