Implementation of the Kademlia DHT protocol in Rust

Overview

kademlia-dht

Simple implementation of the Kademlia DHT protocol in Rust with state dumping features for educational purposes (not production-ready).

Table of contents

Lib structure

src/
  key.res       ---> Implementation of the 256bits unique ID
  node.rs       ---> Node struct definition
  network.rs    ---> Network module used to issue RPCs
  routing.rs    ---> Routing Table implementation using vectors
  protocol.rs   ---> Main library API
  utils.rs      ---> General utilities functions
  main.rs       ---> Example program
  lib.rs        ---> Main lib file

Usage

Interface creation

In order to join the network you must create an interface with the Protocol::new method:

// BRAND NEW NETWORK
// if you want you can explicitely create a node first
let root = Node::new(utils::get_local_ip().unwrap(), 8080);

// it needs an IP, a PORT and an Option<Node> (bootstrap node)
let root_interface = Protocol::new(root.ip.clone(), root.port.clone(), None);

If you want to join a network and you already know a peer you can provide it as a bootstrap node:

// this is the contact we already know
let root = Node::new(utils::get_local_ip().unwrap(), 8080);

let our_node = Node::new(utils::get_local_ip().unwrap(), 8081);
let our_interface = Protocol::new(our_node.ip, our_node.port, Some(root.clone())));

Main operations

These are the main operations, there are more methods you can use but these are the ones you probably need (see Docs for more).

PUT

Store a <key, value> pair in the network:

// interface is already defined
interface.put("some_key", "some_value");

GET

Retreive a value from the network given its key:

// interface is already defined
let value = interface.get("some_key"); // some_value

Example program

I've written an example program to test the lib out. In order to run it issue the following command:

cargo run

It will spin up 10 nodes and it will test the PUT and the GET method.

If you want to run tests, issue:

cargo test

Documentation

Very brief and not detailed explaination of the library. I left some comments in the code to help people understand it better. If this project will be useful for some people I will expand this section.

Kademlia node

A node is a struct containing an IP, a PORT and a unique ID of type Key (see key.rs).

The node.rs module exposes the following methods:

Node::new

Creates a node on a given address and port

let node = Node::new("192.168.1.10", 8080);

We can also use this utils.rs method to automatically grab the local address:

use kademlia_dht::utils;

let node = Node::new(utils::get_local_ip().unwrap(), 8080);

get_info

Returns a string containing the IP, PORT and ID of the given node:

let node = Node::new(utils::get_local_ip().unwrap(), 8080);

printl!("node: {}", node.get_info()); // 192.168.1.10:8080:<SOME_256bits_ID>

get_addr

Returns a string containing the IP and the PORT of the given node. See get_info for similar behavior.

256bits Key and Distance

Both a Key and a Distance are struct wrappers around a [u8] slice of KEY_LEN length (32 bytes, 256bits).

The key.rs module provides methods to create a 256bits unique ID and to calculate the distance (XOR) between two IDs.

Key::new

let key = Key::new("some string".to_string());

Distance::new

Caluclates the distance between two Keys.

// for example, to calculate the distance between 2 nodes

let node0 = Node::new(utils::get_local_ip().unwrap(), 1335);
let node1 = Node::new(utils::get_local_ip().unwrap(), 1336);

let dist = Distance::new(&node0.id, &node1.id); // as we know, the id field is of type Key

Routing Table

The routing table is a struct containing a node field, representing the current node instance, a kbuckets field which is a Vec of KBucket (a struct containing a Vec of nodes and a size field) and a crossbeam_channel sender and receiver(external crate used to communicate with the protocol module).

The routing table communicates with the protocol.rs module for some actions such as pinging nodes that must be checked. The following struct (coming from utils.rs) is used in the crossbeam_channel:

pub enum ChannelPayload {
    Request((network::Request, Node)),
    Response(network::Response),
    NoData,
}

Through the channel we can see a Request, a Response or NoData if the contacted peer doesn't reply to the messages coming from the routing table.

For more information about Request and Response see the Network module.

Routing::new

Creates a new routing table:

pub fn new(
    node: Node, // current node
    bootstrap: Option<Node>, // bootstrap node
    sender: crossbeam_channel::Sender<ChannelPayload>, // sender of type ChannelPayload
    receiver: crossbeam_channel::Receiver<ChannelPayload>, // receiver of type ChannelPayload
) -> Self

get_lookup_bucket_index

Computes the corresponding bucket index for a given node ID with bitwise operations:

fn get_lookup_bucket_index(&self, key: &Key) -> usize

contact_via_rpc

Method used to communicate internally with the protocol.rs module. Used to send Pings to other nodes:

fn contact_via_rpc(&self, dst: Node) -> bool

Here we use the crossbream_channel.

update

Inserts a given node into the routing table. If there's space for it the node gets pushed to the vector, otherwise the necessary checks are performed on the other nodes (see official paper for more details).

pub fn update(&mut self, node: Node)

remove

Removes a given node from the routing table:

pub fn remove(&mut self, node: &Node)

get_closest_nodes

In this method the NodeAndDistance struct is used, which is a tuple of a Node and a Distance.

Returns a Vector of NodeAndDistance for a given Key target:

pub fn get_closest_nodes(&self, key: &Key, count: usize) -> Vec<NodeAndDistance>

Network

The network.rs module provides methods to communicate to other network nodes. Here we issue RPCs (Remote Procedure Calls) through the Rpc struct.

The Rpc contains a socket field which is an Arc to a std::net::UdpSocket, a pending field which is an Arc Mutex around a HashMap of Keys and mpsc::Sender<Option<Response>> and a node field representing the current node.

pub struct Rpc {
    pub socket: Arc<UdpSocket>,
    pub pending: Arc<Mutex<HashMap<Key, mpsc::Sender<Option<Response>>>>>,
    pub node: Node,
}

Request

Enum around Kademlia RPCs.

pub enum Request {
    Ping,
    Store(String, String),
    FindNode(Key),
    FindValue(String),
}

Response

pub enum Response {
    Ping,
    FindNode(Vec<NodeAndDistance>),
    FindValue(FindValueResult),
}

Where FindValueResult comes from routing.rs and it wraps either a vector of NodeAndDistance or the String value that we had looked for.

Message

pub enum Message {
    Abort,
    Request(Request),
    Response(Response),
}

RpcMessage

This is what gets sent to other network nodes.

pub struct RpcMessage {
    pub token: Key, // token of the message, crafted from source addr and timestamp
    pub src: String,
    pub dst: String,
    pub msg: Message,
}

Rpc::new

Creates a new RPC around a node without starting communications:

pub fn new(node: Node) -> Self

Rpc::open

Starts listening and sending modes:

pub fn open(rpc: Rpc, sender: mpsc::Sender<ReqWrapper>) {

Where ReqWrapper is a wrapper around the Request enum, used to keep track of metadata about the request (who sent it):

pub struct ReqWrapper {
    pub token: Key,
    pub src: String,
    pub payload: Request,
}

In this method, as soon as we receive a request we send that through the channel to the protocol.rs module, which handles it.

send_msg

Forwards a RpcMessage to another node using the UdpSocket:

pub fn send_msg(&self, msg: &RpcMessage)

handle_response

Method used to handle incoming responses from other nodes:

pub fn handle_response(self, token: Key, res: Response)

Here we keep track of the pending HashMap.

make_request

Makes a Request to a dst node that is then forwared to the protocol.rs module, also waits for the corresponding Response from the contacted node. It also handles the pending HashMap

pub fn make_request(&self, req: Request, dst: Node) -> mpsc::Receiver<Option<Response>>

Kademlia interface creation

The interface has the following structure:

pub struct Protocol {
    pub routes: Arc<Mutex<routing::RoutingTable>>,
    pub store: Arc<Mutex<HashMap<String, String>>>,
    pub rpc: Arc<network::Rpc>,
    pub node: Node,
}

It includes the routing table, the store (HashMap used to store <key, value> pairs), the rpc coming from the network module and the current active node.

The protocol.rs module exposes the following methods:

Protocol::new

pub fn new(ip: String, port: u16, bootstrap: Option<Node>) -> Self

Creates an interface to use Kademlia on a given address and port. A bootstrap node is a node that we already know in the network.

With bootstrap node:

// some already existing node
let root = Node::new("192.168.1.10", 8080);

// cloning the node it's not mandatory
let root_interface = Protocol::new("192.168.1.10".to_string(), 8081, Some(root.clone()));

Without bootstrap node:

let interface = Protocol::new("192.168.1.10", 8080, None);

In this method we also establish communications with the routing.rs module and the network.rs one by using channels, after of course creating them.

rt_forwarder

Used internally to forward requests issued by the Routing table:

fn rt_forwarder(
    self,
    sender: crossbeam_channel::Sender<utils::ChannelPayload>,
    receiver: crossbeam_channel::Receiver<utils::ChannelPayload>,
) {

request_handler

Used to handle incoming requests thorugh the mpsc channel. Here we send (see reply) responses to the requests.

fn requests_handler(self, receiver: mpsc::Receiver<network::ReqWrapper>) {

craft_res

Simply crafts responses for requests and executes RPCs coming from those requests (this means that we mutate the routing table and the store).

fn craft_res(&self, req: network::ReqWrapper) -> (network::Response, network::ReqWrapper) {

reply

Used to reply to requests. Calls send_msg.

fn reply(&self, packet_details: (network::Response, network::ReqWrapper)) {

Kademlia API

Here there are the implementations for the needed API calls:

pub fn ping(&self, dst: Node) -> bool // pings a node, returns true in case of response

pub fn store(&self, dst: Node, key: String, val: String) -> bool // rpc to store a <key, value> pair on a given destination. Returns true in case of response


pub fn find_node(
    &self,
    dst: Node,
    id: super::key::Key,
) -> Option<Vec<routing::NodeAndDistance>> // finds a node given the current node id. Returns a NodeAndDistance struct for that node or None in case it doesnt get a response


pub fn find_value(&self, dst: Node, k: String) -> Option<routing::FindValueResult> // finds a given value using the provided key on a given node. Returns a FindValueResult or None in case it doesnt get a response

nodes_lookup

Method used to lookup nodes given a starting ID.

pub fn nodes_lookup(&self, id: &super::key::Key) -> Vec<routing::NodeAndDistance> {

value_lookup

Method used to lookup a value given a String key:

pub fn value_lookup(&self, k: String) -> (Option<String>, Vec<routing::NodeAndDistance>) {

put

Method used to put a <key, value> pair into the network. It calls nodes_lookup and store.

pub fn put(&self, k: String, v: String)

get

Method used to extract a value from the network given a key. It calls value_lookup but also store.

pub fn get(&self, k: String) -> Option<String>

State dumping

There are two utils.rs methods used to dump the internal state of a Kademlia node:

pub fn dump_interface_state(interface: &Protocol, path: &str)

Dumps the Protocol object to a given file path (must be dumps/<name>.json, where you choose name). It dumps it as json and as plantuml.

Here's an example of the rendered dump using PlantUML:

example

pub fn dump_node_and_distance(
    entries: &Vec<NodeAndDistance>,
    target: &super::key::Key,
    path: &str,
) {

Dumps a vector of NodeAndDistances in json format. Example:

{
    "found": [
        {
            "distance": "00000000000000000000000000000000",
            "node": {
                "id": "9278733FBB7F4C6914839C98A54912F4F18B3F15EAED15178663AA5FC63",
                "ip": "192.168.1.10",
                "port": 1339
            }
        },
        {
            "distance": "3D0C24670ACCA14C1DEE576D7AF2D85486F125E4E0BFD664CCDABA9E532ED2",
            "node": {
                "id": "342BA354F17B558A8CA66EA4F0A6497BC9E99615BE11735CBD2DCA4F6D2B1",
                "ip": "192.168.1.10",
                "port": 1338
            }
        },
        {
            "distance": "B8F339EA9FB5DECF23F9C7D754476AD9C6698AC5787281EF371456C766F96C88",
            "node": {
                "id": "B1F14199A0EA18325688FEE9DCD3E48E9269276892C2F3E66135EA15C5C90EB",
                "ip": "192.168.1.10",
                "port": 1337
            }
        }
    ],
    "target": "9278733FBB7F4C6914839C98A54912F4F18B3F15EAED15178663AA5FC63"
}

Implemented features

Features specified in the paper that are implemented in this lib

  • Keys

  • XOR Distance between Keys

  • KBuckets

    • represented as a Vec of Vecs. A max of 256 kbuckets is set, each of them containing up to 20 elements
  • PING

  • STORE

  • FIND_NODE

  • FIND_VALUE

  • Node lookup

  • Republishing of <key, value> pairs every hour

    • technically, the original publisher should republish ever 24 hours
  • ability to dump a node internal state to JSON and plantuml

  • ability to dump distances between nodes to JSON

Missing features

  • expiration date on <key, value> pairs

    • this isn't wanted when kademlia is used in a decentralized storage system
  • replicate closest <key, value> pairs when a node joins the network

  • if no lookup has been performed for an hour in a kbucket, that bucket must be refreshed

Enhancements

  • better nodes lookup algorithm, as described in the paper

References

  • Kademlia: A Peer-to-peer Information System Based on the XOR Metric by Petar Maymounkov and David Mazières PDF

  • Implementation of the Kademlia Distributed Hash Table by Bruno Spori PDF

  • Kademlia: A Design Specification by XLattice project PDF

  • TinyTorrent: Implementing a Kademlia Based DHT for File Sharing by Sierra Kaplan-Nelson, Jestin Ma, Jake Rachleff PDF

Thanks

Thanks for taking the time to check out my library. If you found this good enough to be on crates.io please let me know!

I will also make a small article about Kademlia in general. Check out my blog!

You might also like...
Custom Ethereum vanity address generator made in Rust
Custom Ethereum vanity address generator made in Rust

ethaddrgen Custom Ethereum address generator Get a shiny ethereum address and stand out from the crowd! Disclaimer: Do not use the private key shown i

The new, performant, and simplified version of Holochain on Rust (sometimes called Holochain RSM for Refactored State Model)

Holochain License: This repository contains the core Holochain libraries and binaries. This is the most recent and well maintained version of Holochai

DEPRECATED. The Holochain framework implemented in rust with a redux style internal state-model.
DEPRECATED. The Holochain framework implemented in rust with a redux style internal state-model.

Holochain-rust Travis: Circle CI: Codecov: License: This code is loosely based on the previous Golang prototype. Code Status: This Rust version is alp

Rust Ethereum 2.0 Client
Rust Ethereum 2.0 Client

Lighthouse: Ethereum 2.0 An open-source Ethereum 2.0 client, written in Rust and maintained by Sigma Prime. Documentation Overview Lighthouse is: Read

rust client libraries to deal with the current cardano mainnet (byron / cardano-sl)

Rust implementation of Cardano primitives, helpers, and related applications Cardano Rust is a modular toolbox of Cardano’s cryptographic primitives,

Tendermint in Rust!

tendermint.rs Tendermint in Rust with TLA+ specifications. Tendermint is a high-performance blockchain consensus engine for Byzantine fault tolerant a

A Rust library for generating cryptocurrency wallets
A Rust library for generating cryptocurrency wallets

Table of Contents 1. Overview 2. Build Guide 2.1 Install Rust 2.2a Build from Homebrew 2.2b Build from Crates.io 2.2c Build from Source Code 3. Usage

Rust port of the Terry Davis' (RIP) "god says" program

RIP Terry A. Davis 1969-2018 god says Rust port of the programmer Terry Davis' "god says" (AKA GodSpeaks) program. Terrence Andrew Davis (December 15,

Collection of Key Derivation Functions written in pure Rust

RustCrypto: Key Derivation Functions Collection of Key Derivation Functions (KDF) written in pure Rust. Supported Algorithms Algorithm Crate Crates.io

Comments
  • Network module doesn't stream more than 4096 bytes

    Network module doesn't stream more than 4096 bytes

    Issue

    In Rpc::open (network.rs, line 72), we stream only 4096 bytes at the time. This means that if a packet size exceeds this limit, the message will be cut off.

    loop {
        let (len, src_addr) = rpc
            .socket
            .recv_from(&mut buf)
            .expect("[FAILED] Rpc::open --> Failed to receive data from peer");
    

    Behavior wanted

    We want to stream the full message before processing it

    We can't use the following method:

    PSEUDO
    
    fn stream(...) {
        let (recv, src) = socket.recv_from()
        recv
    }
    
    while stream(...) != 0
          ()
    

    Because this will hang the thread until the sender disconnects completely.

    What we want is:

    1. listener: listens for incoming packets
    2. sender: sends packet
    3. listener: adjusts buffer size to match sender's packet size OR streams with fixed buffer size until bytes recv == bytes sent
    4. listener: processes packet
    5. loop to 1)
    

    Possible solutions

    1. Attach header to a message where we include the size of the entire message

    EDIT: this doesn't solve the problem as the header can grow beyond the allocated buffer to read it. For example we could read the first 10 digits of the size but if the size is larger than that, problems will occur.

    Something like

    PACKET
    +---------------------------------------------+
    |   size: 5463                                |
    |   --------------------------------------    |
    |    <data>                                   |
    +---------------------------------------------+
    

    Where the header could be parsed at the start of the stream of bytes:

    message = b'5463                                  0x770xF10xA7'
    

    By isolating it from the data with some padding.

    2. Send the header as a separate packet, used to announce the next packet.

    In this case the header packet MUST be FIXED size, still don't sure how to make this possible... EDIT: it doesn't seem possible to make it fixed size

    Header
    {
        size_of_incoming_msg: 5463
        // other stuff
    } 
    
    |
    |
    |
    v
    
    MESSAGE
    

    3. Set a max size on packets

    I don't like this idea because in large networks many messages would be stopped from being sent, or we would need some checks on the length of, for example, the received NodeAndDistance array to tell when to stop receiving them. This would though make us lose some information.

    4. If one packet isn't enough, then what about 2?

    This is probably the only solution: if a packet is bigger than the receiver's buffer size, we proceed to send 2 or more packets until we stream the full message. In this case we would need some sort of "confirmation" for packets, something like "Wait for more packets/go ahead I'm done". Not sure how to implement this but possible

    bug 
    opened by f0lg0 2
Owner
Leonardo Folgoni
decentralize everything
Leonardo Folgoni
Rust implementation of Zcash protocol

The Parity Zcash client. Gitter Blog: Parity teams up with Zcash Foundation for Parity Zcash client Installing from source Installing the snap Running

Parity Technologies 183 Sep 8, 2022
Minimal implementation of the Mimblewimble protocol.

Grin Grin is an in-progress implementation of the Mimblewimble protocol. Many characteristics are still undefined but the following constitutes a firs

null 5k Dec 28, 2022
Reference client for NEAR Protocol

Reference implementation of NEAR Protocol About NEAR NEAR's purpose is to enable community-driven innovation to benefit people around the world. To ac

NEAR 2k Dec 29, 2022
IBC modules and relayer - Formal specifications and Rust implementation

ibc-rs Rust implementation of the Inter-Blockchain Communication (IBC) protocol. This project comprises primarily four crates: The ibc crate defines t

Informal Systems 296 Jan 4, 2023
A Rust implementation of BIP-0039

bip39-rs A Rust implementation of BIP0039 Changes See the changelog file, or the Github releases for specific tags. Documentation Add bip39 to your Ca

Infincia LLC 49 Dec 9, 2022
Martinez is vNext Ethereum implementation written in pure Rust with Erigon architecture as design.

?? Martinez ?? Next-generation implementation of Ethereum protocol ("client") written in Rust, based on Erigon architecture. Why run Martinez? Look at

Arthur·Thomas 23 Jul 3, 2022
Polkadot Node Implementation

Polkadot Implementation of a https://polkadot.network node in Rust based on the Substrate framework. NOTE: In 2018, we split our implementation of "Po

Parity Technologies 6.5k Jan 6, 2023
Official implementation of the YeeCo Root Chain (Layer 1)

yeeroot Official implementation of the YeeCo Root Chain (Layer 1) YeeCo is a permissionless, secure, high performance and scalable public blockchain p

YeeCo 29 Sep 20, 2022
A Rust library for working with Bitcoin SV

Rust-SV A library to build Bitcoin SV applications in Rust. Documentation Features P2P protocol messages (construction and serialization) Address enco

Brenton Gunning 51 Oct 13, 2022
Coinbase pro client for Rust

Coinbase pro client for Rust Supports SYNC/ASYNC/Websocket-feed data support Features private and public API sync and async support websocket-feed sup

null 126 Dec 30, 2022