Shrike

Data analysis infrastructure for the Neo N3 blockchain.

Overview

Shrike is a set of tools built for Neo blockchain data analysis. The infrastructure comprises three components:

Indexer - Synchronizes a NeoGo node, retrieves blockchain data, and processes it into a relational DB.

API - Serves a set of useful queries for indexed data over a REST API. Used to power the GUI and hopefully other third-party applications in the future.

GUI - A simple web interface for interacting with the data made available by Shrike. A hosted version of this application may be found here.

You can find instructions on how to operate each of the components independently in the respective sections below.

Pull requests and suggestions are welcome for any of the components. There are innumerable ways to improve the code and broaden the feature set.

Indexer

A 🔥 blazingly fast 🔥 chain indexer for Neo N3.

Indexer is the first and primary component. It was built with personal projects in mind, and as a learning experience, so it is very much a WIP. However, it should be just about safe for human consumption.

The Indexer oversees three functions:

  • Synchronize a NeoGo instance.
  • Fetch block, transaction, and application log data.
  • Process and store the chain data into SQLite tables.
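
The fetch step above boils down to standard Neo JSON-RPC calls against the local node. As a minimal, self-contained sketch (not Indexer's actual code; it assumes reqwest with its blocking and json features plus serde_json, and uses the conventional mainnet RPC port, which may differ in your config), fetching a single verbose block might look like this:

    use serde_json::json;

    fn main() -> Result<(), reqwest::Error> {
        let client = reqwest::blocking::Client::new();
        // "getblock" is a standard Neo N3 JSON-RPC method; params are
        // [block index, verbosity]. 10332 is the usual mainnet RPC port.
        let resp: serde_json::Value = client
            .post("http://127.0.0.1:10332")
            .json(&json!({
                "jsonrpc": "2.0",
                "method": "getblock",
                "params": [1, 1],
                "id": 1
            }))
            .send()?
            .json()?;
        println!("{}", resp["result"]["hash"]);
        Ok(())
    }

The real Indexer batches these requests and also pulls application logs (getapplicationlog) before writing rows to SQLite.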

Requirements

  • The latest stable Rust version. I recommend using Rustup.
  • The NeoGo v0.101.0 binary for your platform. Get that here. Indexer has not been tested on any platform except Windows 10.
  • (Optional) An SQLite-compatible DB browser/query editor. For simple use I enjoy DB Browser; if you want something more advanced, you might prefer DBeaver.

Quickstart

  1. Clone or otherwise download the Indexer folder.
  2. Drop your NeoGo binary in the root directory (where Cargo.toml lives). On Windows, rename the binary to neogo.exe. On other platforms, you'll likely need to edit main.rs to use the correct path in spawn::NeoGo::new(); see the sketch after this list.
  3. Open the root directory in a terminal and enter cargo run --release to build and run.
  4. Do something else for a while.
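
For non-Windows platforms, the change amounts to pointing the process spawn at your local binary. A rough sketch of the idea (the actual constructor is spawn::NeoGo::new() in main.rs, and the arguments Indexer passes to the node are not shown here):

    use std::process::{Child, Command};

    // Hypothetical: launch a locally downloaded NeoGo binary instead of neogo.exe.
    // Mirror whatever arguments spawn::NeoGo::new() actually passes.
    fn start_node() -> std::io::Result<Child> {
        Command::new("./neogo").arg("node").spawn()
    }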

Notes

All figures below are accurate on my machine as of block height ~2.7M.

Database structure

The database has two tables: blocks and transactions. They are modelled to closely match their typical Neo RPC forms, with some allowances made for SQL and for folding the relevant parts of their respective application logs into each table.
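
For orientation, here is a minimal sketch of creating two such tables with the rusqlite crate; the crate choice and every column shown are illustrative assumptions, not the exact Shrike schema:

    use rusqlite::Connection;

    fn create_tables(conn: &Connection) -> rusqlite::Result<()> {
        // Columns are placeholders; the real tables mirror the Neo RPC block and
        // transaction shapes plus fields lifted from application logs.
        conn.execute_batch(
            "CREATE TABLE IF NOT EXISTS blocks (
                 height INTEGER PRIMARY KEY,
                 hash   TEXT NOT NULL,
                 time   INTEGER NOT NULL
             );
             CREATE TABLE IF NOT EXISTS transactions (
                 hash         TEXT PRIMARY KEY,
                 block_height INTEGER NOT NULL,
                 sender       TEXT,
                 sysfee       TEXT,
                 netfee       TEXT
             );",
        )
    }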

I'm not against changing the tables if there's good reason for it, depending on feedback. I also plan to add contracts, and perhaps balances or transfers, depending on whether I have a use case for them. Feel free to make a PR if you want to expedite that process.

NeoGo sync time

Indexer will wait for its NeoGo instance to sync before it starts fetching data. Syncing NeoGo currently takes a little over an hour. You can speed it up by enabling SkipBlockVerification in the config, but this is not advised. Once you have caught up to the chain head, subsequent sync time is generally negligible.
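
If you do take that risk, the switch lives in the NeoGo node configuration file (YAML); the exact section can vary between NeoGo versions, but in recent releases it looks roughly like this:

    ApplicationConfiguration:
      SkipBlockVerification: true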

Indexing time

Indexer works quickly and quietly; you can use your machine as you usually would while it runs. Once syncing is complete, fully populating the block and transaction tables from scratch takes me less than 15 minutes.

Storage requirements

You'll need a healthy amount of storage space to use the Indexer, slightly more than is required to sync a node on its own. My chain folder is currently 26.6GB and the Shrike DB is 7.18GB. Extrapolate from there to determine how much headroom you need to account for future blockchain growth, depending on your use case.

Alternative networks

You can point Indexer at any Neo N3 network that is compatible with the current NeoGo version used by the program. This can be done by adjusting the protocol config file. References can be found here. You may have to adjust the NODE_PATH in rpc.rs if you alter the RPC port.
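
For reference, that constant is just the RPC URL Indexer queries; a hypothetical example assuming the conventional Neo N3 mainnet RPC port (the real value in rpc.rs may differ):

    // rpc.rs (illustrative): the JSON-RPC endpoint Indexer talks to.
    // Change the port here if you changed it in your NeoGo config.
    pub const NODE_PATH: &str = "http://127.0.0.1:10332";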

Acknowledgements

Thanks to the NeoGo team for their excellent software and documentation. Also thanks to @liaojinghui, whose work on neo-rs saved me a lot of headache with the cumbersome task of converting script hashes to public addresses.

API

An Actix Web-based service that performs various queries against the indexed data and serves the responses. Only relatively basic queries are implemented so far. There is currently no caching for queries that only need to be run once per block, so the API will scale poorly to multiple users until that is added.
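
As a rough illustration of the service's shape (the route and response here are hypothetical, not one of Shrike's actual endpoints):

    use actix_web::{get, web, App, HttpServer, Responder};
    use serde_json::json;

    // Hypothetical endpoint returning the current indexed height as JSON.
    #[get("/v1/stats/height")]
    async fn height() -> impl Responder {
        // The real service would answer this from the Shrike SQLite database.
        web::Json(json!({ "height": 2_700_000 }))
    }

    #[actix_web::main]
    async fn main() -> std::io::Result<()> {
        HttpServer::new(|| App::new().service(height))
            .bind(("127.0.0.1", 8080))?
            .run()
            .await
    }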

Quickstart

  1. Clone or otherwise download the API folder.
  2. Get a copy of the Shrike DB from the download page (TODO) or by running the Indexer. Adjust the file path in main.rs via the DB_PATH constant (see the example after this list).
  3. Use cargo run or cargo run --release to serve the API.
  4. Make your requests! The default path for the API when run locally is as follows: http://127.0.0.1:8080/v1/module/method/parameter.
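
The constant mentioned in step 2 is just a file path; a hypothetical example (the file name here is a guess, so point it at wherever your copy of the database actually lives):

    // main.rs (illustrative): location of the Shrike SQLite database.
    const DB_PATH: &str = "../indexer/shrike.db3";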

A hosted version will be provided in the future.

API Reference

TODO

GUI

A simple web application built using SolidJS (SolidStart) and PicoCSS. It was created to give regular users a way to leverage Shrike, but power users will be better served by running custom queries against their own copy of the Shrike DB.
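
For example, with a local copy of the database and the rusqlite crate (the crate choice and file path here are assumptions), a trivial custom query might look like this:

    use rusqlite::Connection;

    fn main() -> rusqlite::Result<()> {
        // Hypothetical path; point at your own copy of the Shrike DB.
        let conn = Connection::open("shrike.db3")?;
        // Count every indexed transaction.
        let count: i64 =
            conn.query_row("SELECT COUNT(*) FROM transactions", [], |row| row.get(0))?;
        println!("{count} transactions indexed");
        Ok(())
    }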

Quickstart

  1. Clone or otherwise download the GUI folder.
  2. Run the API following the above instructions, or update the path in /constants/index.js to use the hosted version (coming soon).
  3. Serve the GUI locally with npm run dev and open it in your browser at http://127.0.0.1:5173/.