Redirects your plumbing for you.

Valve

valve creates multi-threaded Plumber APIs powered by Rust's tokio async runtime and axum web framework.

Motivation

Plumber is an R package that creates RESTful APIs from R functions. It is limited in that each API runs in a single R process and thus on a single thread, so multiple requests are executed one at a time, in the order they arrive. Scaling Plumber APIs is not easy. The intention of valve is to make scaling Plumber APIs, and thus R itself, easier. We can make R better by leveraging Rust's "fearless concurrency."

Installation

Install the R package using {remotes}. Note that this compiles the package from source, which requires Rust to be installed. If you don't have Rust installed, follow the official Rust installation instructions. Rust is the second easiest programming language to install after R.

I also recommend installing the development version of {rextendr} via pak::pak("extendr/rextendr"). It provides rextendr::rust_sitrep(), which reports whether you have a compatible Rust installation.

remotes::install_github("josiahparry/valve")

When the R package is built, it also includes the binary executable at inst/valve. So if you ever find yourself needing the executable, system.file("valve", package = "valve") will point you right to it! This will always be the version of the executable that your R package is using.

To install only the executable, run:

cargo install --git https://github.com/josiahparry/valve/ --no-default-features

Creating the app

The R package exports only one function: valve_run(). The most important argument is filepath, which determines which Plumber API will be executed. The host and port arguments determine where your app will run. Additional configuration can be done with the n_max, workers, check_unused, and max_age arguments, which specify how your app will scale.

library(valve)
# get included plumber API path
plumber_api_path <- system.file("plumber.R", package = "valve")

valve_run(plumber_api_path, n_max = 5)
#> Docs hosted at <http://127.0.0.1:3000/__docs__/>

n_max is the maximum number of background Plumber APIs that can be spawned, whereas workers specifies how many main worker threads are available to handle incoming requests. Generally, the number of workers should be equal to the number of Plumber APIs, because Plumber is single threaded. This is the default. If workers is less than n_max, you'll never spawn the maximum number of APIs.

Plumber connections are automatically spawned, pooled, and terminated using deadpool. App connections are automatically pooled by hyper.

Running this from your R session will block the session. If you are comfortable doing so, it is recommended to install the CLI and run valve from your terminal, so that your R session stays free to call the Plumber APIs.

Calling valve with multiple workers

The way valve works is by accepting requests on a main port (3000 by default) and then distributing the requests round robin to the plumber APIs that are spawned on random ports. Requests are captured by axum and proxied to a plumber API process.
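The round-robin idea can be sketched in a few lines of R. This is only an illustration of the scheduling pattern, not valve's actual Rust implementation; the port numbers and the round_robin_port() helper are hypothetical.

```r
# Hypothetical ports of three background Plumber APIs
# (valve picks the real ones at random).
api_ports <- c(35219, 41872, 50233)

# Return the port that should serve the i-th incoming request (1-based),
# cycling through the pool in order.
round_robin_port <- function(i, ports) {
  ports[(i - 1) %% length(ports) + 1]
}

# Assign seven incoming requests to the three APIs.
assignments <- vapply(1:7, round_robin_port, numeric(1), ports = api_ports)
print(assignments)
```

Requests 1, 4, and 7 land on the first port, requests 2 and 5 on the second, and requests 3 and 6 on the third, so no single API receives a burst while the others sit idle.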

First I'm going to define a function to call my /sleep endpoint. The function will take two parameters: the port and the duration of sleep. The port will be used to change between the valve app and a single plumber API.

sleep <- function(port, secs) {
  # Build the URL for the /sleep endpoint on the given port
  url <- paste0("http://127.0.0.1:", port, "/sleep?zzz=", secs)

  httr2::request(url) |>
    httr2::req_perform() |>
    httr2::resp_body_string()
}

Using this function, we'll create a total of 5 R sessions, each of which will make a request to sleep for 2 seconds.

library(furrr)
plan(multisession, workers = 5)

First, we'll ping the main valve app which will distribute requests round robin.

start <- Sys.time()
multi_sleep <- future_map(1:5, ~ sleep(3000, 2))
multi_total <- Sys.time() - start

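Next, we select only one of the available Plumber APIs and query it directly. Because the background APIs are spawned on random ports, the port used here (35219) will differ on your machine.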

start <- Sys.time()
single_sleep <- furrr::future_map(1:5, ~ sleep(35219, 2))
single_total <- Sys.time() - start

Notice the performance difference.

print(paste0("Multiple Plumber APIs: ", round(multi_total, 2)))
#> [1] "Multiple Plumber APIs: 2.63"
print(paste0("One Plumber API: ", round(single_total, 2)))
#> [1] "One Plumber API: 10.08"

In the former, the five requests are handled concurrently, so each one completes in approximately the same amount of time. In the latter, each request has to wait for the previous one to finish before it can start. So we've effectively distributed the workload.
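A back-of-the-envelope estimate shows why the timings come out this way. The sketch below assumes ideal conditions (no network or startup overhead) and that valve has 5 Plumber APIs available, as with n_max = 5 above; the variable names are illustrative.

```r
n_requests <- 5
sleep_secs <- 2

# One Plumber API: requests are handled one after another.
sequential_estimate <- n_requests * sleep_secs

# valve with up to 5 background APIs: requests run concurrently,
# so the total is the number of "rounds" times the sleep duration.
n_apis <- 5
parallel_estimate <- ceiling(n_requests / n_apis) * sleep_secs

c(single = sequential_estimate, valve = parallel_estimate)
```

The estimates are 10 seconds for the single API and 2 seconds for valve, close to the measured 10.08 and 2.63. The remainder is overhead that this simple model ignores.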

Benchmarks with drill

Simple benchmarks using drill can be found in inst/bench-sleep-plumber.yml and bench-sleep-valve.yml.

The benchmark calls the /sleep endpoint 100 times with 5 concurrent threads, sleeping for 500ms on each call. This alone illustrates how much valve can speed up a single Plumber API's response time.

Plumber's benchmark:

Time taken for tests      50.7 seconds
Total requests            100
Successful requests       100
Failed requests           0
Requests per second       1.97 [#/sec]
Median time per request   2540ms
Average time per request  2482ms
Sample standard deviation 272ms
99.0'th percentile        2556ms
99.5'th percentile        2556ms
99.9'th percentile        2556ms

Valve's benchmark:

Time taken for tests      10.2 seconds
Total requests            100
Successful requests       100
Failed requests           0
Requests per second       9.78 [#/sec]
Median time per request   510ms
Average time per request  510ms
Sample standard deviation 2ms
99.0'th percentile        516ms
99.5'th percentile        518ms
99.9'th percentile        518ms
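As a rough sanity check on both sets of results, the expected numbers can be derived from the benchmark setup alone. This sketch assumes ideal conditions and that valve had at least 5 background APIs available during the benchmark (the source does not state the value used); the estimate() helper is hypothetical.

```r
n_requests  <- 100
concurrency <- 5
sleep_secs  <- 0.5

estimate <- function(n_apis) {
  # Requests that can actually be processed at the same time.
  in_flight <- min(concurrency, n_apis)
  total_secs <- n_requests * sleep_secs / in_flight
  data.frame(
    n_apis = n_apis,
    total_secs = total_secs,
    requests_per_sec = n_requests / total_secs,
    # Each client also waits for the requests queued ahead of it.
    secs_per_request = sleep_secs * concurrency / in_flight
  )
}

estimates <- rbind(estimate(1), estimate(5))
print(estimates)
```

With one API the model predicts about 50 seconds in total, 2 requests per second, and 2.5 seconds per request. With five it predicts about 10 seconds, 10 requests per second, and 0.5 seconds per request. Both line up with the drill output above.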

With all that said....

valve is best suited for light to medium sized workloads. Each background Plumber API holds its own copy of its R objects. So if you are serving a machine learning model that is a GB big, that model has to be loaded into each background R process, and that can quickly bloat your RAM. So be smart! If you have massive objects in your R session, try to reduce the clutter and thin it out.
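One way to gauge the memory cost before deploying is to measure the object you plan to serve and multiply by n_max. The snippet below is a minimal sketch using a small built-in example model as a stand-in; it only estimates the served object's footprint and ignores the baseline memory of each R process.

```r
# Stand-in for the model your API would serve (illustrative only).
model <- lm(mpg ~ wt, data = mtcars)

# Maximum number of background Plumber APIs you intend to allow.
n_max <- 5

# Each background API is a separate R process with its own copy of the model.
size_bytes <- as.numeric(object.size(model))
worst_case_bytes <- size_bytes * n_max

cat("One copy:", size_bytes, "bytes\n")
cat("Worst case with n_max =", n_max, ":", worst_case_bytes, "bytes\n")
```

If the worst-case figure approaches the memory available on your machine, lowering n_max or slimming down the object is likely the safer choice.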

Comments
  • Implement connection pooling with Deadpool

    This PR changes valve to utilize deadpool to create custom connection pools of plumber APIs. Spawning and termination are handled mostly by deadpool. There is no longer a fixed number of APIs in the background. One downside is that it is possible to have no background apps, but this can be handled later, I think, and it is intended by deadpool to be fixed.

    The main thread app is always running, though. So even if there isn't a background plumber API, a request to the main thread will spawn one.

    opened by JosiahParry
Owner
Josiah Parry
Social Scientist. Spatial Stats @ Esri