📝 Web-based, reactive Datalog notebooks for data analysis and visualization

Overview

Percival

Percival is a declarative data query and visualization language. It provides a reactive, web-based notebook environment for exploring complex datasets, producing interactive graphics, and sharing results.


percival.ink

Percival combines the flexibility of Datalog as a query language for relational data with the beauty of exploratory visualization grammars. These declarative components interact through a reactive dataflow system. Because Percival uses web technologies (including Web Workers for multithreaded, sandboxed execution), fully interactive notebooks can be shared with anyone on the Internet, making data analyses more tangible to others.

At the core of Percival is a custom Datalog compiler, built with Rust and WebAssembly, which integrates with its notebook runtime. The compiler translates the query language to JavaScript through a staged evaluation process that also allows users to embed their own JavaScript code. The interface aims to be lightweight, friendly, and accessible, and there is no hidden workspace state.

This is an early-stage research project, and we welcome your feedback, so please feel free to say hello at our discussions page!

Getting Started

If you've gotten to this point in the README, please first try out the web application and demo notebook at percival.ink! The information below is technical documentation intended for contributors.

Building Percival from scratch requires Node v16+, NPM v8+, Rust 1.56+, Cargo, and wasm-pack installed on your machine. To build the Rust/WebAssembly portion of the project, use the command:

wasm-pack build --target web crates/percival-wasm

Next, run npm install to install JavaScript dependencies, then run the following command to start the development server:

npm run dev

This should open a Percival notebook in your browser, with live reloading.

Architecture

This section outlines the high-level technical design of Percival.

User Interface

Percival is a client-side web application running fully in the user's browser. The notebook interface is built with Svelte and styled with Tailwind CSS. It relies on numerous other open source libraries, including CodeMirror 6 for live code editing and syntax highlighting, Remark and KaTeX for Markdown rendering, and Vite for frontend bundling.

The code for the web frontend is located in src/, which contains a mix of Svelte (in src/components/) and TypeScript (in src/lib/). These modules are bundled into a static website at build time, and there is no dynamic server-side rendering.

JIT Compiler

Users write code cells in a custom dialect of Datalog, and they are translated to JavaScript by a Rust compiler, which itself is compiled to WebAssembly using wasm-bindgen. The Percival compiler's code is located in the crates/ folder. For ergonomic parsing with human-readable error messages, the compiler relies on chumsky, a parser combinator library.
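
To make the translation step more concrete, here is a hand-written TypeScript sketch of the kind of loop a recursive Datalog rule reduces to. It is not the output of the Percival compiler: the relation names `edge` and `path`, the tuple shape, and the naive fixpoint strategy are all assumptions chosen for illustration.

```typescript
// Hypothetical relations for illustration; not taken from Percival's generated code.
type Edge = { x: number; y: number };
type Path = { x: number; y: number };

// Datalog rules being evaluated (generic syntax, assumed for illustration):
//   path(x, y) :- edge(x, y).
//   path(x, z) :- path(x, y), edge(y, z).
function evaluatePath(edges: Edge[]): Path[] {
  const seen = new Set<string>();
  const paths: Path[] = [];

  // Add a tuple if it is new, and report whether the relation grew.
  const add = (x: number, y: number): boolean => {
    const key = `${x},${y}`;
    if (seen.has(key)) return false;
    seen.add(key);
    paths.push({ x, y });
    return true;
  };

  // First rule: every edge is a path.
  for (const e of edges) add(e.x, e.y);

  // Second rule: naive evaluation, repeated until no new tuples appear.
  let changed = true;
  while (changed) {
    changed = false;
    // Iterate over a snapshot; tuples added in this round are joined in the next one.
    for (const p of paths.slice()) {
      for (const e of edges) {
        if (p.y === e.x && add(p.x, e.y)) changed = true;
      }
    }
  }
  return paths;
}
```

Calling `evaluatePath` on a chain of edges returns the transitive closure of that chain. A production compiler would normally use a more efficient strategy (such as semi-naive evaluation) than the plain fixpoint loop shown here.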

After the percival-wasm crate is compiled to WebAssembly, it can be used by client-side code. The compiler processes code cells, then sends the resulting JavaScript to separate web workers that sandbox the code and execute it just-in-time. As the user writes queries, their notebook automatically tracks inter-cell dependencies and evaluates cells in topological order, spawning and terminating worker threads on demand.
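
As a rough illustration of evaluating cells in topological order, the sketch below orders cells with Kahn's algorithm. The `Cell` shape (an id plus the relation names a cell reads and defines) and the function name `evaluationOrder` are hypothetical, and the actual notebook runtime may track dependencies differently.

```typescript
// Hypothetical cell description: which relations a cell reads and which it defines.
interface Cell {
  id: string;
  inputs: string[];
  outputs: string[];
}

// Return cell ids so that every cell comes after the cells it depends on
// (Kahn's algorithm). Throws if the dependencies between cells form a cycle.
function evaluationOrder(cells: Cell[]): string[] {
  // Map each relation name to the id of the cell that produces it.
  const producer = new Map<string, string>();
  for (const cell of cells) {
    for (const name of cell.outputs) producer.set(name, cell.id);
  }

  // dependents: producer id -> ids of cells that consume its outputs.
  // pending: cell id -> number of producers it is still waiting for.
  const dependents = new Map<string, string[]>();
  const pending = new Map<string, number>();
  for (const cell of cells) {
    dependents.set(cell.id, []);
    pending.set(cell.id, 0);
  }
  for (const cell of cells) {
    const sources = new Set<string>();
    for (const name of cell.inputs) {
      const source = producer.get(name);
      // Inputs with no producer are ignored; a cell may also read its own outputs.
      if (source !== undefined && source !== cell.id) sources.add(source);
    }
    sources.forEach((source) => {
      dependents.get(source)!.push(cell.id);
      pending.set(cell.id, pending.get(cell.id)! + 1);
    });
  }

  // Repeatedly emit cells whose dependencies have all been emitted.
  const ready = cells.filter((c) => pending.get(c.id) === 0).map((c) => c.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of dependents.get(id)!) {
      const remaining = pending.get(next)! - 1;
      pending.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (order.length !== cells.length) {
    throw new Error("cyclic dependency between cells");
  }
  return order;
}
```

The sketch assumes cell ids are unique. In a reactive notebook, the same ordering could be recomputed whenever a cell's inputs or outputs change, so that only downstream cells are re-run.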

Data Visualization

Plotting is done using a specialized web worker that runs JavaScript code with access to the Observable Plot library. In order for this library (and D3) to run in a worker context, we patch the global document with a lightweight virtual DOM implementation ported from Domino.

Deployment

In production, the main branch of this repository is continuously deployed to percival.ink via Vercel, which hosts the static website. It also runs a serverless function (see api/index.go) that allows users to share notebooks through the GitHub Gist API.

Development

To build, lint, and format the Svelte project, use the corresponding scripts:

npm run build
npm run check
npm run format

For the Rust crates, you can run unit tests for the core functionality with:

cargo test

You can also run tests for the WebAssembly component using a headless Chrome or Firefox browser:

wasm-pack test --chrome --headless crates/percival-wasm

Since Percival uses a Rust-based compiler but outputs JavaScript, the easiest way to test code generation functionality is within the browser. We use Mocha and Puppeteer for this, and tests can be run with:

npm test

Acknowledgements

Created by Eric Zhang (@ekzhang1). Licensed under the MIT license.

Comments
  • Adjust Divider UX

    Hi, nice project! Datalog/prolog-like facts+rules is a great way to work with data, web-based notebooks are an ideal experience for exploring and sharing it, and I've been interested in trying out svelte + tailwindcss, so your project is surprisingly synchronous for me. Thank you for sharing it!

    While exploring the project I noticed a couple small things so I fixed them and put them in separate commits for you to review:

    Smaller one first: I factored the divider buttons to generate from a list. This should make it easier to add/remove buttons for new types of cells in the future and I think it's a tiny bit easier to read (also maybe an opportunity to generate from centrally registered cell types in some future version?). This is my first time using Svelte so I wanted to poke some (very!) minor low-hanging fruit first. 😸 Note that I moved the <hr> to the bottom of the div to reproduce identical button classes with :not(:first-child); I believe the result is the same.

    Second, I noticed that you had to be unusually precise to trigger the opacity hover in order to see the divider buttons. In fact, I was confused for a few minutes trying to figure out how to add new cells before I discovered that you just have to hover in the right place. So I adjusted the styles to make the divider hover target easier to activate, specifically:

    • Remove CellDivider h-2 so the div takes up the whole space between cells
    • Remove margin on Cells
    • Use negative margin on divider to slightly overlap neighboring cells' padding. Keeps both compact spacing and easy hover target.
    • Increase divider z-index so the overlap works correctly

    With some added borders to visualize, before:

    image

    Here you can see that you have to hover the buttons directly or that tiny sliver around the <hr> for the buttons to become visible.

    After:

    image

    Now, hovering anywhere on the overlapped area triggers the hover effect of the divider which makes it much easier to find. I checked and none of the regular cells use this area so this change shouldn't impede their functionality. This does have a small effect on the vertical spacing of neighboring cells (3% taller based on the screenshots), but it is not very noticeable and worth the UX imo.

    I'm sorry for writing so many words for such a small PR but I don't have time to make it shorter, please forgive me! 😅🙏 What do you think?

    opened by infogulch 1
  • code cell with syntax error gets erased on toggle

    Steps to reproduce:

    1. Create a code cell.
    2. Write invalid code, such as: 1, 2, 3. The code block is highlighted in orange.
    3. Click the down arrow to toggle the visibility of the code block, rendering it not visible.
    4. Click the right arrow to toggle visibility of the code block, rendering it visible.
    5. The code block is empty, while expected to contain 1, 2, 3.
    opened by namin 1
  • Finish writing the Percival "starter notebook" tutorial

    I will finish the tutorial and fill in the missing section with more complex data analyses, as well as nicer examples of data visualizations using Observable Plot.

    This starter notebook will be finished when I have time, and I'll advertise the project after that.

    See also: #5.

    (It would also be good to write a "How it works" section to the README explaining the Rust Datalog compilation and reactive WebAssembly components + Web Workers.)

    opened by ekzhang 1
  • Better error messages with a lexical grammar

    Right now, the Percival compiler simply parses inputs as a character stream with parser combinators. Although this is simple and was sufficient for the purposes of the basic language, some error messages are a little bit suspect, especially ones that mention "expected ... in input." For example, the error below is misleading. It should suggest either an import, rule, or end-of-file.

    image

    By switching to a two-phase lexical and syntactic analysis step on a stream of tokens, rather than characters, the parser will be much more flexible, while also having clearer error messages.
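
As a minimal sketch of the first (lexical) phase, the TypeScript below splits a Datalog-like input into tokens that a later parser could consume. The token kinds and the punctuation set are assumptions for illustration, not Percival's actual lexical grammar, and the real compiler is written in Rust rather than TypeScript.

```typescript
// Token kinds are assumptions for illustration, not Percival's lexical grammar.
type TokenKind = "ident" | "number" | "punct";

interface Token {
  kind: TokenKind;
  text: string;
  pos: number; // character offset of the token in the source
}

function tokenize(src: string): Token[] {
  const rules: Array<[TokenKind, RegExp]> = [
    ["ident", /^[A-Za-z_][A-Za-z0-9_]*/],
    ["number", /^[0-9]+/],
    ["punct", /^(:-|[(),.:])/],
  ];
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < src.length) {
    const rest = src.slice(pos);
    // Skip whitespace between tokens.
    const space = /^\s+/.exec(rest);
    if (space) {
      pos += space[0].length;
      continue;
    }
    let matched = false;
    for (const [kind, pattern] of rules) {
      const m = pattern.exec(rest);
      if (m) {
        tokens.push({ kind, text: m[0], pos });
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`unexpected character '${src[pos]}' at offset ${pos}`);
    }
  }
  return tokens;
}

// With tokens available, a parser can phrase errors in terms of whole tokens
// ("expected '.', found ident 'edge' at offset 12") instead of single characters.
function describeToken(token: Token | undefined): string {
  return token
    ? `${token.kind} '${token.text}' at offset ${token.pos}`
    : "end of input";
}
```

The second (syntactic) phase would then match on `kind` and `text`, which is what lets an error message name the constructs the parser expected rather than a raw character.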

    opened by ekzhang 0
  • Fix pre-bundling issue by updating Vite config

    Previously, I was using a hack in runtime.ts, adding some lines that were necessary because certain libraries were not present in the non-worker bundle. This confused the tests: the libraries were discovered mid-execution and the page reloaded, breaking Puppeteer.

    By updating the optimizeDeps config option of Vite, I can tell Vite which entry points to crawl for pre-bundled dependencies at boot time, avoiding this reloading issue entirely.
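
For readers unfamiliar with this option, a config fragment along these lines shows the general shape. This is a sketch only: the entry paths below are hypothetical placeholders, not the values used in this repository.

```typescript
// Sketch of a vite.config.ts fragment. The file paths are hypothetical placeholders.
const config = {
  optimizeDeps: {
    // Entry points that Vite crawls at startup to find dependencies to pre-bundle,
    // so that they are not discovered (and the page reloaded) mid-execution.
    entries: ["index.html", "src/lib/example.worker.ts"],
  },
};
// In vite.config.ts, this object would be the default export: export default config;
```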

    opened by ekzhang 0
  • Adjustments for mobile

    I would love to run Datalog on my phone. This requires some small tweaks to the current UI:

    • [ ] Make all elements visible without hovering for mobile devices (or for all devices?). This includes the gutter and buttons.
    • [ ] Add a "run code" button. Shift + enter doesn't work on mobile; even if it did, a "run" button would be more convenient.
    opened by remysucre 0
  • `dev` cells throws module import error

    I suspect this is something rather obvious to someone more familiar with the JavaScript build tools:

    image

    when running npm run dev.

    npm run serve allows cell execution (as expected) -- I'm not exactly sure why this throws during dev.

    opened by femtomc 3
  • Sidebar UX

    Orient sidebar buttons vertically, as described in https://github.com/ekzhang/percival/pull/16#issuecomment-1120137337. This will conflict with #16; that PR should probably be merged first, and I'll rebase this one on top of it afterwards.

    Changes:

    • Orient buttons vertically, with the chevron in the top right corner. This makes it easier to add more buttons, and also makes the buttons clickable on mobile.
    • Adjust the spacing of the sidebar and Notebook so that the sidebar and buttons are always visible, even with a very narrow view. It's a tiny bit off-center at this width but worth actually being able to interact with the UI.
    • Add small border to the right of the sidebar, visible on cell hover, and small cell padding left to space it. This makes it easier to see which input and output views are part of the same cell; this was visually confusing at first and will get worse as we add more views to cells.
    • Remove FaChevronDown in favor of a conditional rotate-90 transform class when the input is shown. Combined with transition-all this nicely animates the chevron when you click on the sidebar.
    • Only show the delete button when the input is visible. This helps prevent taps on the invisible delete button on mobile (for closed cells at least), and works around an issue on closed markdown cells with one line where the delete button sticks out below the bottom of the sidebar.
    • Adjust the positioning calculation of the sidebar so that the left side reaches just to the edge of the view, not far past. This will make it easier to position items in the sidebar in general, and covers the case where the page is very zoomed out and 2000px is not enough. Generally this is minor but feels more 'correct' to me.
    opened by infogulch 0
  • Plot cell: allow async, multiple inputs, one output

    This PR updates Plot cells to work as more fully-featured JS cells.

    • Plot cells can take multiple input dependencies: (dep1, dep2) => ... or no inputs, () => ....
    • Plot cells can be async, allowing them to access modules with await import(...): async dep => (await import('https://esm.sh/...'))(dep).
    • Plot cells can write a reactive result, using assignment syntax: resultName = () => [{ x: 1 }, { x: 2 }]. Currently this emits one relation, not a set of relations like code cells. A future change could use syntax like { table1, table2 } = () => ... to emit multiple relation outputs.
    • Cell outputs with outerHTML are rendered as HTML (as before). Array outputs are rendered as relations. Otherwise, the output is rendered as JSON. If #16 is merged, I'll replace the simple JSON renderer with the JSON tree inspector used there.
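
The output rendering rule in the last bullet can be sketched as a small classification function. The `Rendered` type and the `renderOutput` name are hypothetical and exist only to illustrate the three cases; the PR's actual code may be organized differently.

```typescript
// Hypothetical result type describing how a cell output would be displayed.
type Rendered =
  | { kind: "html"; html: string }
  | { kind: "relation"; rows: unknown[] }
  | { kind: "json"; text: string };

function renderOutput(value: unknown): Rendered {
  // Case 1: anything exposing a string outerHTML (e.g. a DOM node) is shown as HTML.
  if (typeof value === "object" && value !== null && "outerHTML" in value) {
    const html = (value as { outerHTML: unknown }).outerHTML;
    if (typeof html === "string") return { kind: "html", html };
  }
  // Case 2: arrays are shown as relations (one row per element).
  if (Array.isArray(value)) return { kind: "relation", rows: value };
  // Case 3: everything else falls back to JSON text.
  const text: string | undefined = JSON.stringify(value);
  return { kind: "json", text: text === undefined ? "undefined" : text };
}
```

For example, `renderOutput([{ x: 1 }, { x: 2 }])` takes the relation branch, while a plain object such as `{ a: 1 }` falls through to the JSON branch.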

    After these changes "plot" vs "code" seem like misnomers - perhaps we should rename these concepts to "datalog" and "js". However, that's a bigger policy/design change.

    image
    opened by justjake 0
  • JS in Percival / Percival in JS

    Percival makes Datalog so much more accessible! Now instead of needing to pick an implementation, install it, read the manual, figure out syntax highlighting in your editor, etc etc etc - we can just visit https://percival.ink. There's a bunch of examples, syntax highlighting, and a quick syntax to fetch data. That's great.

    But though you've broken the barrier to access a Datalog implementation, there's more we can do to make Datalog usable.

    1. Bring JS into the Percival notebook: We should add JS code blocks that can consume relations and emit new data to the DAG. This will allow learners to use Percival for analysis even if they're Datalog beginners. The user can attempt analysis in Datalog, and bail out to JS if they hit a roadblock. As they gain fluency, users gradually transition to Datalog.

      If we allow cells to return Promise, Percival becomes end-user extendable via ES6 modules. We can now import anything from NPM - not just CSV & JSON data.

      The simplest implementation of this idea would be to parse plot cells of the form output_name = (more, than, one, arg) => ....

    2. Bring Datalog rules out of Percival into the wider JS world: publish percival-wasm and/or src/lib/** as experimental packages on NPM. Advanced users can then build on top of the work here, which incentivizes further contribution. The risk of publishing to NPM is a backwards compatibility burden. But, given the niche nature of Datalog today, I don't expect this to be a substantial issue if we publish using alpha tags like 0.0.0-alpha001.
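
To illustrate the "simplest implementation" mentioned in item 1, here is a rough TypeScript sketch that recognizes a plot cell header of the form output_name = (more, than, one, arg) => .... The `PlotCellHeader` type and `parsePlotHeader` function are hypothetical names, and a regular expression like this one would not handle destructuring or default parameters.

```typescript
// Hypothetical description of the header of a plot/JS cell.
interface PlotCellHeader {
  output: string | null; // name of the relation the cell writes, if any
  isAsync: boolean;      // whether the arrow function is declared async
  inputs: string[];      // names of the relations the cell depends on
}

function parsePlotHeader(src: string): PlotCellHeader | null {
  // Optional "name =", optional "async", then "(a, b)" or a bare "a", then "=>".
  const pattern =
    /^\s*(?:([A-Za-z_]\w*)\s*=\s*)?(async\s+)?(?:\(([^)]*)\)|([A-Za-z_]\w*))\s*=>/;
  const m = pattern.exec(src);
  if (!m) return null;
  // Group 3 holds a parenthesized parameter list, group 4 a single bare parameter.
  const params = m[3] ?? m[4] ?? "";
  const inputs = params
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return { output: m[1] ?? null, isAsync: m[2] !== undefined, inputs };
}
```

Given `"output_name = (more, than, one, arg) => ..."`, the sketch yields the output name plus four inputs, which is the information a notebook would need to place the cell in its dependency graph.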

    For both (1) and (2), I'm personally excited to explore integrating Percival <--> ObservableHQ.

    • We use Plot already. We could also use @observablehq/inspector and @observablehq/stdlib in plot/JS code blocks.
    • Observable's runtime is open source. We could investigate replacing the notebook toposort engine with @observablehq/runtime. This could allow importing Observable notebooks into Percival.
    • If we publish runtime.build to NPM, we can import Percival and run datalog within ObservableHQ notebooks. But, we won't have syntax highlighting :(
    opened by justjake 1
  • Load and save percival files in git repos

    I think it would be neat to be able to load, save, edit, and commit percival files directly against a git repo.

    Something like isomorphic-git could be useful here:

    https://github.com/isomorphic-git/isomorphic-git

    It seems there are some potential complications to supporting a git workflow, especially related to CORS.

    Thoughts?

    opened by infogulch 3
Owner
Eric Zhang
Student at Harvard, competitive programmer, and researcher in machine learning & programming languages. Previously @get-convex @jumptrading @scaleapi @NVIDIA.