Overview

Fluvio

Data streaming for real-time applications

Fluvio is a high-performance distributed streaming platform that's written in Rust, built to make it easy to develop real-time applications.

Release Status

Fluvio is currently in Alpha and is not ready for production use. Our CLI and APIs are also under rapid development and may introduce breaking changes at any time. We do our best to adhere to semantic versioning, but until our R1 release we cannot guarantee it.

Contributing

If you'd like to contribute to the project, please read our Contributing guide.

License

This project is licensed under the Apache license.

Comments
  • [Merged by Bors] - Feature/fluvio cli partition list show size

    Add a human-readable partition size field to the fluvio partition list command.

    $ fluvio partition list
     TOPIC      PARTITION  LEADER  REPLICAS  RESOLUTION  SIZE    HW  LEO  LRS  FOLLOWER OFFSETS 
     cat-facts  0          5001    []        Online      6 KB    12  12   0    [] 
     cat-facts  1          5001    []        Online      2.1 KB  10  10   0    [] 
     cat-facts  2          5001    []        Online      4.7 KB  10  10   0    [] 
     cat-facts  3          5001    []        Online      4.6 KB  9   9    0    []
    

    Prints ERROR when the leader is unable to fetch the log size.

    TOPIC      PARTITION  LEADER  REPLICAS  RESOLUTION  SIZE   HW  LEO  LRS  FOLLOWER OFFSETS 
    cat-facts  0          5001    []        Online      ERROR  12  12   0    [] 
    cat-facts  1          5001    []        Online      ERROR  10  10   0    [] 
    cat-facts  2          5001    []        Online      ERROR  10  10   0    [] 
    cat-facts  3          5001    []        Online      ERROR  9   9    0    []
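
The human-readable SIZE column above can be sketched with a small helper. This is an illustrative conversion only; the function name and rounding rules are assumptions, not Fluvio's actual code:

```rust
// Hypothetical helper: render a byte count the way the SIZE column above
// does ("6 KB", "2.1 KB"). Not Fluvio's implementation.
fn human_size(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KB", "MB", "GB", "TB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    // Drop the decimal when the value is (nearly) whole: "6 KB", not "6.0 KB".
    if (value - value.round()).abs() < 0.05 {
        format!("{} {}", value.round(), UNITS[unit])
    } else {
        format!("{:.1} {}", value, UNITS[unit])
    }
}

fn main() {
    println!("{}", human_size(6 * 1024)); // 6 KB
    println!("{}", human_size(2150));     // 2.1 KB
}
```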
    

    Forces the bash shell in k8-util/docker/build.sh so the script also works in environments whose default shell is not bash-compatible.

    Solves #2148

    opened by Afourcat 35
  • Add Producer API for sending keyless records

    After #965 lands, we will have deprecated the TopicProducer::send_record API, which allowed users to send a record without specifying a key for it. This API was deprecated because send_record accepted a partition number as one of the arguments, and we decided that for now, users should not be able to manually specify partitions.

    This leaves a gap in our API for a particular use-case: sending a single record with no key. Without send_record, the only way to send a record with no key is to use the send_all function, which forces users to wrap their single record in some kind of iterator before passing it to the producer, like this:

    producer.send_all(Some( (None, "My record data with no key") )).await?;
    

    I believe that this is a workaround that we should not force users to deal with, as it is non-obvious and non-ergonomic. I think we need to introduce a new API method for the "single record with no key" use-case. I originally proposed a send_keyless API in #965 (discussion at https://github.com/infinyon/fluvio/pull/965#discussion_r615967714), but we decided to exclude it from that PR in order to discuss it more, which is what this issue is for.

    For reference, the send_keyless API that I would like to introduce looks like this:

    impl TopicProducer {
        pub async fn send_keyless<V>(&self, value: V) -> Result<(), FluvioError>
        where
            V: Into<Vec<u8>> { ... }
    }
    

    So the questions I would like to resolve in this discussion are:

    1. Do we believe that a new API to fill this use-case is necessary? (I believe it is.)
    2. Is there a different API that better serves this use-case? (I have not thought of one.)
    3. Are we happy with the name send_keyless, or would we prefer something else?

    I think it makes sense to put a timeline on this. If nobody has any strong reasons why we should NOT introduce this API, I would like to add this issue to our roadmap as an action item by this Wednesday (4/21/21) to give enough time to ship it with the 0.7.4 release this Thursday.


    Discussion Results

    After debating the API, we've decided to implement the following:

    pub struct RecordKey(RecordKeyInner);
    
    impl RecordKey {
        pub const NULL: Self = Self(RecordKeyInner::Null);
    }
    
    enum RecordKeyInner {
        Null,
        Key(RecordData),
    }
    
    impl<K: Into<Vec<u8>>> From<K> for RecordKey {
        fn from(k: K) -> Self {
            Self(RecordKeyInner::Key(RecordData::from(k)))
        }
    }
    
    /// A type to hold the contents of a record's value.
    pub struct RecordData(bytes::Bytes);
    
    impl<V: Into<Vec<u8>>> From<V> for RecordData {
        fn from(v: V) -> Self {
            let value: Vec<u8> = v.into();
            Self(value.into())
        }
    }
    
    fn send<K, V>(key: K, value: V)
        where
            K: Into<RecordKey>,
            V: Into<RecordData>,
    { ... }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_conversion() {
            send(RecordKey::NULL, vec![2]);
            send(vec![3],vec![4]);
            send("Hello", "world");
            send("Hello".to_string(), "World".to_string());
        }
    }
    
    Client API producer 
    opened by nicholastmosher 31
  • [Merged by Bors] - fix: clap-command updates

    Overview

    1. Adds tests for valid and invalid fluvio cli arguments.
    2. Changes to make flag names clearer and more consistent.
    3. Some bug fixes to make command usage consistent with help.
    4. Simplifies consume status message and adds tests.

    Changes to fluvio consume options for specifying offsets.

      -B, --beginning
              Consume records from the beginning of the log
    
      -H, --head <integer>
              Consume records starting <integer> from the beginning of the log
    
      -T, --tail <integer>
              Consume records starting <integer> from the end of the log
    
          --start <integer>
              The absolute offset of the first record to begin consuming from
    
          --end <integer>
              Consume records until end offset (inclusive)
    

    Command Summary

    Read messages from a topic/partition
    
    Usage: fluvio consume [OPTIONS] <topic>
    
    Arguments:
      <topic>
              Topic name
    
    Options:
      -p, --partition <integer>
              Partition id
              
              [default: 0]
    
      -A, --all-partitions
              Consume records from all partitions
    
      -d, --disable-continuous
              Disable continuous processing of messages
    
          --disable-progressbar
              Disable the progress bar and wait spinner
    
      -k, --key-value
              Print records in "[key] value" format, with "[null]" for no key
    
      -F, --format <FORMAT>
              Provide a template string to print records with a custom format. See --help for details.
              
              Template strings may include the variables {{key}}, {{value}}, {{offset}}, {{partition}} and {{time}} which will have each record's contents substituted in their place. Note that timestamp is displayed using RFC3339, is always UTC and ignores system timezone.
              
              For example, the following template string:
              
              Offset {{offset}} has key {{key}} and value {{value}}
              
              Would produce a printout where records might look like this:
              
              Offset 0 has key A and value Apple
    
          --table-format <TABLE_FORMAT>
              Consume records using the formatting rules defined by TableFormat name
    
      -B, --beginning
              Consume records from the beginning of the log
    
      -H, --head <integer>
              Consume records starting <integer> from the beginning of the log
    
      -T, --tail <integer>
              Consume records starting <integer> from the end of the log
    
          --start <integer>
              The absolute offset of the first record to begin consuming from
    
          --end <integer>
              Consume records until end offset (inclusive)
    
      -b, --maxbytes <integer>
              Maximum number of bytes to be retrieved
    
          --suppress-unknown
              Suppress items that have an unknown output type
    
      -O, --output <type>
              Output
              
              [possible values: dynamic, text, binary, json, raw, table, full-table]
    
          --smartmodule <SMARTMODULE>
              Name of the smart module
    
          --smartmodule-path <SMARTMODULE_PATH>
              Path to the SmartModule
    
          --aggregate-initial <AGGREGATE_INITIAL>
              (Optional) Path to a file to use as an initial accumulator value with --aggregate
    
      -e, --params <PARAMS>
              (Optional) Extra input parameters passed to the SmartModule. They should be passed in key=value format, e.g. fluvio consume topic-name --filter filter.wasm -e foo=bar -e key=value -e one=1
    
          --isolation <ISOLATION>
              Isolation level that consumer must respect. Supported values: read_committed (ReadCommitted) - consume only committed records, read_uncommitted (ReadUncommitted) - consume all records accepted by leader
    
      -h, --help
              Print help information (use `-h` for a summary)
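
The --format templating described in the help text above can be sketched in a few lines. This is a minimal substitution using plain string replacement; the render helper is hypothetical, not the CLI's actual implementation:

```rust
// Minimal sketch of rendering a --format template for one record.
// The {{key}}/{{value}}/{{offset}} variables match the help text above;
// the function itself is hypothetical.
fn render(template: &str, key: &str, value: &str, offset: u64) -> String {
    template
        .replace("{{key}}", key)
        .replace("{{value}}", value)
        .replace("{{offset}}", &offset.to_string())
}

fn main() {
    let template = "Offset {{offset}} has key {{key}} and value {{value}}";
    println!("{}", render(template, "A", "Apple", 0));
    // Offset 0 has key A and value Apple
}
```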
    
    

    Loose ends

    1. Integration tests for the command behavior
    2. Consistency for offset datatype.
    3. --end is inclusive so that fluvio consume --end 1 displays 2 records.
    CLI 
    opened by davidbeesley 28
  • [Merged by Bors] - Add bors configuration

    This PR is intended to be merged after https://github.com/infinyon/fluvio/pull/937

    @sehz if we want to use this, you'll need to add the Bors bot to this repository. You can do that here: https://bors.tech/ at the top you click "Log into dashboard". It will ask to add Bors to your Github account, but you can selectively choose which repositories it has access to, so you can choose infinyon/fluvio. Once that's done, I believe we should be able to use the bors commands in this PR.

    Here's the command documentation for how to use Bors: https://bors.tech/documentation/. The general gist though is just that if you are reviewing a PR, instead of clicking the "Merge/Rebase/Squash" button, you just comment with bors r+ and it will take care of adding it to a queue to be merged, then it will merge it once all CI passes.

    opened by nicholastmosher 26
  • [Merged by Bors] - feat: Improved SmartStream API using proc macro

    Closes #1033.

    • Adds a #[smartstream] attribute that generates glue code to pass data to and from WASM
    • Detects attribute arguments #[smartstream(filter)] and #[smartstream(map)]
    • Adds SimpleRecord that may be passed to user functions, while hiding internal DefaultRecord details
    opened by nicholastmosher 25
  • [Merged by Bors] - feat: support to generate action tokens in other environments

    Introduces the get_action_auth_with_token method for HubAccess to allow tokens in other applications of the ecosystem to generate action tokens similar to how the Hub CLI does.

    opened by EstebanBorai 24
  • [Merged by Bors] - feat: generate SmartModules using SMDK

    Provides support to generate new SmartModules using cargo-generate as a library.

    Resolve: https://github.com/infinyon/fluvio/issues/2621


    Examples

    Generates SmartModule Projects using smdk generate

    ➜  Desktop smdk generate log-filter
    Generating new SmartModule project: log-filter
    🔧   Destination: /Users/esteban/Desktop/log-filter ...
    🔧   Generating template ...
    ✔ 🤷   Which type of SmartModule would you like? · filter
    ✔ 🤷   Want to use SmartModule parameters? · false
    🤷   SmartModule Version [default: git = "https://github.com/infinyon/fluvio.git"]: git = "https://github.com/infinyon/fluvio.git"
    ✔ 🤷   Want to use SmartModule init? · true
    Ignoring: /var/folders/0g/l702s31x2mj4zdlw2_43rpyr0000gn/T/.tmpwQ4Yql/cargo-generate.toml
    [1/5]   Done: Cargo.toml
    [2/5]   Done: README.md
    [3/5]   Done: Smart.toml
    [4/5]   Done: src/lib.rs
    [5/5]   Done: src
    🔧   Moving generated files into: `/Users/esteban/Desktop/log-filter`...
    💡   Initializing a fresh Git repository
    ✨   Done! New project created /Users/esteban/Desktop/log-filter
    

    Generated Project

    total 24
    drwxr-xr-x  7 esteban  staff   224B Oct  2 13:54 .
    drwx------+ 6 esteban  staff   192B Oct  2 13:54 ..
    drwxr-xr-x  9 esteban  staff   288B Oct  2 13:54 .git
    -rw-r--r--  1 esteban  staff   358B Oct  2 13:54 Cargo.toml
    -rw-r--r--  1 esteban  staff   1.8K Oct  2 13:54 README.md
    -rw-r--r--  1 esteban  staff   152B Oct  2 13:54 Smart.toml
    drwxr-xr-x  3 esteban  staff    96B Oct  2 13:54 src
    

    Generated src/lib.rs with the provided options

    use fluvio_smartmodule::dataplane::smartmodule::{SmartModuleExtraParams, SmartModuleInitError};
    
    use once_cell::sync::OnceCell;
    use fluvio_smartmodule::eyre;
    
    
    
    use fluvio_smartmodule::{smartmodule, Result, Record};
    
    #[smartmodule(filter)]
    pub fn filter(record: &Record) -> Result<bool> {
        let string = std::str::from_utf8(record.value.as_ref())?;
        Ok(string.contains('a'))
    }
    
    
    
    
    static CRITERIA: OnceCell<String> = OnceCell::new();
    
    #[smartmodule(init)]
    fn init(params: SmartModuleExtraParams) -> Result<()> {
        // You can refer to the example SmartModules in Fluvio's GitHub Repository
        // https://github.com/infinyon/fluvio/tree/master/smartmodule
        if let Some(key) = params.get("key") {
            CRITERIA.set(key.clone()).map_err(|err| eyre!("failed setting key: {:#?}", err))
        } else {
            Err(SmartModuleInitError::MissingParam("key".to_string()).into())
        }
    }
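
The generated code above stores an init-time parameter in a OnceCell and reads it later. The same pattern can be shown with the standard library's OnceLock alone; this is a sketch independent of the fluvio_smartmodule crate, where init and filter are plain functions rather than SmartModule hooks:

```rust
use std::sync::OnceLock;

// Init-time parameter: set once, read by every later filter call.
static CRITERIA: OnceLock<String> = OnceLock::new();

fn init(param: Option<&str>) -> Result<(), String> {
    match param {
        Some(key) => CRITERIA
            .set(key.to_string())
            .map_err(|v| format!("failed setting key: {v:?}")),
        None => Err("missing param `key`".to_string()),
    }
}

fn filter(value: &str) -> bool {
    // If init never ran, reject everything rather than panic.
    CRITERIA.get().map(|c| value.contains(c.as_str())).unwrap_or(false)
}

fn main() {
    init(Some("a")).unwrap();
    println!("{}", filter("tasty"));   // true  ("tasty" contains "a")
    println!("{}", filter("testing")); // false
}
```

With the parameter set to "a", this mirrors the smdk test runs shown below: "testing" is filtered out, "tasty" passes.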
    

    Building SMDK Project

    ➜  Desktop cd ./log-filter
    ➜  log-filter git:(main) ✗ smdk build
        Updating git repository `https://github.com/infinyon/fluvio.git`
        Updating crates.io index
       Compiling proc-macro2 v1.0.46
    

    Testing SM Project using smdk test

    ➜  log-filter git:(main) ✗ smdk test --params "key=a" --text "testing"
    project name: "log-filter"
    loading module at: target/wasm32-unknown-unknown/release-lto/log_filter.wasm
    0 records outputed
    ➜  log-filter git:(main) ✗ smdk test --params "key=a" --text "tasty"
    project name: "log-filter"
    loading module at: target/wasm32-unknown-unknown/release-lto/log_filter.wasm
    1 records outputed
    tasty
    
    opened by EstebanBorai 21
  • [Merged by Bors] - Introduce channels into CLI

    This adds the concept of separate channels in the CLI, with the ability to change between them.

    3 default channels:

    • stable
    • latest
    • dev

    By default, we use the stable channel, but you can switch to another channel with:

    fluvio version switch <channel>

    To add a version channel, you need to create it first

    For example, to create a channel for 0.9.15 run fluvio version create 0.9.15

    Then you can switch over to it with

    fluvio version switch 0.9.15


    Simplifies starting a cluster to fluvio cluster start for all channels (only if using the fluvio-channel binary)

    Changes fluvio-cli to respect 3 environment variables to control cluster and plugin interactions

    • FLUVIO_RELEASE_CHANNEL
    • FLUVIO_EXTENSIONS_DIR
    • FLUVIO_IMAGE_TAG_STRATEGY

    Manual tests:

    • Copy the fluvio-channel binary to ~/.fluvio/bin/fluvio
    • Switch to stable channel fluvio version switch stable, start a cluster with fluvio cluster start
    • Switch to latest, fluvio version switch latest, start a cluster with fluvio cluster start
    • Verify the image that is used for pods is correct

    Additionally, modifies install.sh to support the installation of fluvio-channel to ~/.fluvio/bin/fluvio, and a target fluvio binary as ~/.fluvio/bin/fluvio-<channel>
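
Under that layout, the launcher's job is essentially to map the active channel to the right binary path. A sketch of the resolution logic (an assumption based on this description; the real fluvio-channel code also handles switching and version creation):

```rust
use std::env;

// Resolve the versioned binary a launcher installed at ~/.fluvio/bin/fluvio
// would exec: ~/.fluvio/bin/fluvio-<channel>. The channel is read from
// FLUVIO_RELEASE_CHANNEL (one of the env vars listed above), falling back
// to "stable", the default channel.
fn channel_binary(home: &str) -> String {
    let channel = env::var("FLUVIO_RELEASE_CHANNEL").unwrap_or_else(|_| "stable".to_string());
    format!("{home}/.fluvio/bin/fluvio-{channel}")
}

fn main() {
    println!("{}", channel_binary("/home/user"));
}
```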

    Resolves #1663

    opened by tjtelan 21
  • [Merged by Bors] - feat: Add `smdk` install support in CLI and release

    Closes #2644

    Adds a name check to fluvio install. If the name starts with fluvio-, it is an extension and is installed in the channel extension dir. Otherwise, it is installed in the fluvio bin dir.
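
The name check described above amounts to a prefix test; a minimal sketch (the helper name is hypothetical):

```rust
// Names starting with "fluvio-" are extensions and go to the channel
// extension dir; everything else (e.g. smdk) goes to the fluvio bin dir.
fn is_extension(name: &str) -> bool {
    name.starts_with("fluvio-")
}

fn main() {
    println!("{}", is_extension("fluvio-cloud")); // true  -> extension dir
    println!("{}", is_extension("smdk"));         // false -> fluvio bin dir
}
```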

    Running fluvio install smdk --develop installs the latest published smdk into my fluvio path.

    --

    Added smdk build and publish steps to CI

    Moving a lot of publish and release workflows into the Makefile so I can test them more easily locally

    Got rid of the cargo make configs since they are out of date and unmaintained

    opened by tjtelan 20
  • [Merged by Bors] - Migrate to `comfy-table`

    I've migrated all components from prettytable to comfy-table.

    Although the APIs of the two libraries are quite different and many features differ, I've done my best to preserve the expected behavior.

    I would appreciate it if you could review this PR and merge it if the tables render as expected.

    enhancement 
    opened by suptejas 19
  • [Merged by Bors] - Add check for unused dependencies

    Resolves #415

    Runs in a separate workflow, on the staging branch (when Bors performs the PR merge to master).

    • Install nightly in CI (to be used by udeps)
    • Add udeps to Makefile
    • Remove unused crates and minor-bump affected crates
    opened by tjtelan 18
  • Tracking: Connector Development Kit Phase 1

    CDK is a new component for Fluvio. It facilitates the development, testing, and deployment of connectors.

    $ cdk initialize
    
    
    $ cdk test
    Testing connector
    
    $ cdk deploy
    

    TODO:

    • [x] Initial Skeleton
    • [x] Build cdk build @sehz
    • [ ] Schema validation @sehz
    • [ ] Initial cdk initialize @tjtelan
    • [ ] Test cdk test @sehz
    • [ ] Docker Deploy cdk deploy @sehz
    feature/connector DX/Developer Experience 
    opened by sehz 0
  • cli: Add `fluvio cluster status`

    Behavior

    Adds fluvio cluster status to the CLI

    Tries to connect to the SC and reports "none" if unsuccessful; if successful, it reports where the cluster is deployed (local/k8s/cloud). If the SC is up, it then reports whether any SPUs are down or whether there are none at all.
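
The decision tree described above can be sketched as follows; the states and the all-SPUs-up threshold are assumptions based on this description, not the PR's actual types:

```rust
// Hypothetical status model for `fluvio cluster status`.
#[derive(Debug, PartialEq)]
enum ClusterStatus {
    None,     // could not connect to the SC
    SpusDown, // SC is up but some (or all) SPUs are not
    Running,  // SC and all SPUs are up
}

fn cluster_status(sc_reachable: bool, spus_up: usize, spus_total: usize) -> ClusterStatus {
    if !sc_reachable {
        ClusterStatus::None
    } else if spus_up < spus_total {
        ClusterStatus::SpusDown
    } else {
        ClusterStatus::Running
    }
}

fn main() {
    println!("{:?}", cluster_status(false, 0, 0)); // None
    println!("{:?}", cluster_status(true, 1, 2));  // SpusDown
    println!("{:?}", cluster_status(true, 2, 2));  // Running
}
```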

    Comment

    I'm not sure if this was still wanted, but it looked like a fun first issue. It's also possible I misinterpreted the original request. Specifically, I'm not sure how we could determine if the status should be "stopped (no sc, spu running but data exists)", as I'm not sure how to determine whether there is data in the SPUs if the connection to the SC fails.

    I would love some feedback and would be happy to make modifications!

    https://github.com/infinyon/fluvio/issues/1673

    CLI cluster 
    opened by crajcan 2
  • Bring back stat features in the CLI

    Prior to 0.10.0, stat features displayed consumer/producer statistics. They depended on the stat feature, which was removed along with other metrics features. We should bring this feature back in the CLI. Some options for metric analysis are:

    • Bring back stat in the CLI or other metric crates.
    • Use OpenTelemetry pipeline
    • Other metric calculation
    CLI Performance 
    opened by sehz 0
A highly efficient daemon for streaming data from Kafka into Delta Lake

kafka-delta-ingest The kafka-delta-ingest project aims to build a highly efficient daemon for streaming data through Apache Kafka into Delta Lake. Thi

Delta Lake 164 Nov 19, 2022
Magical Automatic Deterministic Simulator for distributed systems in Rust.

MadSim Magical Automatic Deterministic Simulator for distributed systems. Deterministic simulation MadSim is a Rust async runtime similar to tokio, bu

MadSys Research Group 242 Dec 1, 2022
Raft distributed consensus for WebAssembly in Rust

WRaft: Raft in WebAssembly What is this? A toy implementation of the Raft Consensus Algorithm for WebAssembly, written in Rust. Basically, it synchron

Emanuel Evans 60 Oct 22, 2022
Paxakos is a pure Rust implementation of a distributed consensus algorithm

Paxakos is a pure Rust implementation of a distributed consensus algorithm based on Leslie Lamport's Paxos. It enables distributed systems to consistently modify shared state across their network, even in the presence of failures.

Pavan Ananth Sharma 2 Jul 5, 2022
A model checker for implementing distributed systems.

A model checker for implementing distributed systems.

Stateright Actor Framework 1.3k Dec 1, 2022
The lightest distributed consensus library. Run your own replicated state machine! ❤️

Little Raft The lightest distributed consensus library. Run your own replicated state machine! ❤️ Installing Simply import the crate. In your Cargo.to

Ilya Andreev 354 Nov 27, 2022
Sorock is an experimental "so rocking" scale-out distributed object storage

Sorock is an experimental "so rocking" scale-out distributed object storage

Akira Hayakawa 6 Jun 13, 2022
A universal, distributed package manager

Cask A universal, distributed package manager. Installation | Usage | How to publish package? | Design | Contributing | Cask.toml If you are tired of:

null 35 Nov 20, 2022
A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka

rust-rdkafka A fully asynchronous, futures-enabled Apache Kafka client library for Rust based on librdkafka. The library rust-rdkafka provides a safe

Federico Giraud 1.1k Dec 2, 2022
Rust client for Apache Kafka

Kafka Rust Client Project Status This project is starting to be maintained by John Ward, the current status is that I am bringing the project up to da

Yousuf Fauzan 871 Nov 22, 2022
Easy-to-use beanstalkd client for Rust (IronMQ compatible)

rust-beanstalkd Easy-to-use beanstalkd client for Rust (IronMQ compatible) Install Add this dependency to your Cargo.toml beanstalkd = "*" Documentati

Johannes Schickling 44 Oct 4, 2022
libhdfs binding and wrapper APIs for Rust

hdfs-rs libhdfs binding library and rust APIs which safely wraps libhdfs binding APIs Current Status Alpha Status (Rust wrapping APIs can be changed)

Hyunsik Choi 31 Apr 4, 2022
The Raft algorithm implement by Rust.

Raft The Raft algorithm implement by Rust. This project refers to Eli Bendersky's website, the link as follows: https://eli.thegreenplace.net/2020/imp

Qiang Zhao 1 Oct 23, 2021
Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

MichelNowak 0 Mar 29, 2022
Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such CPU distributed architectures or multi GPU distributed architectures.

Damavand is a code that simulates quantum circuits. In order to learn more about damavand, refer to the documentation. Development status Core feature

prevision.io 6 Mar 29, 2022
The rust client for CeresDB. CeresDB is a high-performance, distributed, schema-less, cloud native time-series database that can handle both time-series and analytics workloads.

The rust client for CeresDB. CeresDB is a high-performance, distributed, schema-less, cloud native time-series database that can handle both time-series and analytics workloads.

null 12 Nov 18, 2022
High performance and distributed KV store w/ REST API. 🦀

About Lucid KV High performance and distributed KV store w/ REST API. Introduction Lucid is a high performance, secure and distributed key-value s

Lucid ᵏᵛ 300 Nov 18, 2022
High performance distributed framework for training deep learning recommendation models based on PyTorch.

PERSIA (Parallel rEcommendation tRaining System with hybrId Acceleration) is developed by AI platform@Kuaishou Technology, collaborating with ETH. It

null 337 Nov 24, 2022
A high-performance, distributed, schema-less, cloud native time-series database

CeresDB is a high-performance, distributed, schema-less, cloud native time-series database that can handle both time-series and analytics workloads.

null 1.7k Nov 27, 2022
A high-performance, high-reliability observability data pipeline.

Quickstart • Docs • Guides • Integrations • Chat • Download What is Vector? Vector is a high-performance, end-to-end (agent & aggregator) observabilit

Timber 11.9k Dec 1, 2022