Symbolically Executable Stack Machines in Rust

Overview

Symbolically Executable Stack Machines in Rust

Symbolic Stack Machines is a library for implementing symbolically executable stack-based virtual machines.

This works by providing various traits for a Stack, Memory (including read-only and writeable memory), VMInstruction, as well as generic Val structure to represent stack or memory values. Finally, it provides a set of abstract Machines which can be instantiated with a specific instruction set and can execute a sequence of instructions concretely or symbolically.

Implementing a new VM involves, simply, implementing the VMInstruction trait for an enum that describes the instruction set of the operations.

Each operation should output a state diff (ExecRecord), which describes the updates to the stack, memory, any new path constraints (in the case of symbolic execution), and machine's program counter. The machine can apply the update with ExecRecord::apply.

Reachability

The Machine constructs the updates performed on its state by instructions. Operations can add path constraints via the ExecRecord structure. The Machine incrementally constructs a set of constraints for each possible path of execution. Once these paths are constructed, the machine checks whether such paths are reachable. See src/machine/mod.rs, specifically, the BaseMachine::run_sym method for the implementation.

Currently, there is no intermediate path pruning performed.

Supported Memory & Stack Models

Currently, two forms of symbolic memory are built-in: Memory based on the theory of arrays, and finite concrete memory that can store possibly symbolic values.

The only built-in stack model right now is a finite stack that can store symbolic or concrete values.

The only symbolic values that have built-in support right now are integers.

Usage

See lib.rs for a toy instruction set and its symbolic execution.

Open Questions

  • How to handle endianness of various machines w.r.t bit vectors?

  • Best approach for modular plug-and-play style machine creation (storage, mem, stack, etc)?

  • How to handle special, niche environments? E.g., EVM has GAS opcode which requires a notion of gas within the machine.

  • Copy on write when storing machine states?

  • Niche exec environments:

    • Optional undefined context;
    • Pass this as an implicit argument to VMInstruction::Exec
    • Opcode author is responsible for writing the interaction
    • ExecRecord is extensible by this generic context as well

Notes

  1. Type constraints on Machine to ensure that the values stored on stack are convertible to the val type stored in memory as well as the val type used to index the memory
  2. Remove direct z3 dependency and generate an IR + transformation from IR -> target (e.g., smtlib2, rust-z3 bindings)
  3. For niche exec environment, provide custom context definition and access on the machine
  4. Add a generic context switch method; useful for describing behavior of one program calling another (such as smart contract calls)
Comments
  • remove generic from stack

    remove generic from stack

    Remove the generic from stack. This also requires removing the test where we test when symbolic values are on the stack. We can add that back in when StackVal is the more general Val that uses an ast.

    opened by williamberman 1
  • remove core/instructions/val in favor of core/value

    remove core/instructions/val in favor of core/value

    core/instructions/val is being replaced with the core/value module some of the non abstract machine + tests were dependent on the core/instructions/val module so removed those as well

    the contrib/tests/simple_lang.rs tests were dependent on core/instructions/val so I removed and replaced with contrib/tests/simple_langg.rs tests

    also ran cargo fmt :P

    opened by williamberman 1
  • Abstract Core

    Abstract Core

    Decouple core features from convenient features.

    Notable differences:

    1. Since most built in Abstract components are actually well defined data structures that are only abstract over the data types they contain, we have replaced the reliance on type signatures as the primary way to configure a machine with, instead, a "universal" value type that can be lazily interpreted as a different type depending on the intent of the machine author. This dramatically simplifies the type signatures.
    2. Configuration is thus done through a Config object that (will in the near future) contain the "hooks" provided by core::ast which enable customizing the way in which an abstract value is interpreted by the machine's interpreters.

    Still TODO:

    1. Make use of AST hooks to configure various Abstract-X structures via their Config objects
    2. Replace the StackVal, MemVal etc., with ast::Sentence
    3. Review the use of core::Constraint now that both values and constraints can be expressed as sentences
    4. Review the inner_interpreter-outer_interpreter architecture to integrate its execution with ast::Sentence
    5. Bring back the symbolic execution phase of the outer interpreter
    opened by WilfredTA 0
  • Made return type for AbstractInstruction abstract; Separated Abstract…

    Made return type for AbstractInstruction abstract; Separated Abstract…

    …Machine into AbstractMachine, InnerInterpreter, and OuterInterpreter

    Made return type for AbstractInstruction abstract.

    Because instructions can now return different types, the InnerInterpreter trait constrains the set of possible return types from individual instructions. All existing instructions still return AbstractExecRecord. All existing instructions are concrete so they return ConcreteAbstractExecRecord which has its constraint type parameter set to the unit type. This allows concrete instructions to not have any generic type for constraints while still using the same AbstractExecRecord return type.

    Separated AbstractMachine into AbstractMachine, InnerInterpreter, and OuterInterpreter.

    AbstractMachine - purely encapsulates the current state of the machine. It does not include constraints as concrete machines do not have constraints.

    InnerInterpreter - handles the return value and state update from execution of an individual instruction. We have two example inner interpreters. The concrete inner interpreter simply updates the current machine state. The symbolic inner interpreter returns a vec of new machine states and constraints.

    OuterInterpreter - handles orchestration of which machine state(s) to run. For the example outer interpreters we implement, we embed the inner interpreter in the outer interpreter struct as a trait object. This allows for some flexibility in choosing different inner interpreters so long as they have a step function of the right type for the outer interpreter. The concrete outer interpreter just steps the concrete inner interpreter until the machine can’t execute any further. The symbolic outer interpreter performs DFS on the search graph of machine states. Note that this architecture allows us to substitute out an alternative symbolic outer interpreter that performs BFS or any other alternative search strategy.

    I temporarily removed the PathSummary portion yet where we use the solver to find the reachable paths and return models. The type signature of SymbolicOuterInterpreter is already incredibly complex and I'd prefer to not add the additional generics that the solver is going to entail in a separate PR.

    opened by williamberman 0
  • Add z3 solver

    Add z3 solver

    rebased on top of https://github.com/WilfredTA/symbolic-stack-machines/pull/16

    Creates the abstract interface to the constraint solver. Running the symbolic machine now returns

    • the model mapping symbolic values to concrete values in the machine
    • the symbolic return value of the machine
    • the concrete return value of the machine
    opened by williamberman 0
  • init symbolic machine

    init symbolic machine

    rebased on top of https://github.com/WilfredTA/symbolic-stack-machines/pull/15

    This creates the symbolic machine that wraps a concrete machine. This initial version does not execute the SMT solver and only allows for collecting constraints and splitting the machine state.

    opened by williamberman 0
  • implement concrete machine

    implement concrete machine

    rebased on top of https://github.com/WilfredTA/symbolic-stack-machines/pull/14

    separating out the core concrete machine into its own trait+implementation

    opened by williamberman 0
  • instruction implementations

    instruction implementations

    rebased on top of https://github.com/WilfredTA/symbolic-stack-machines/pull/13

    Opcode implementations. Opcodes are written so they can be dynamically dispatched to.

    opened by williamberman 0
  • separate out concrete and symbolic instruction

    separate out concrete and symbolic instruction

    rebased on top of https://github.com/WilfredTA/symbolic-stack-machines/pull/12

    The separate concrete and symbolic instruction allows for clean interfaces and opcode implementations.

    concrete instructions

    • only operate on concrete machine state

    symbolic instructions

    • operate on concrete machine state
    • can split into multiple machine states
    • can add path constraints

    This gives the nice ability to define both concrete and symbolic implementations of the same opcode. I.e. see {concrete, symbolic} JUMPI.

    https://github.com/WilfredTA/symbolic-stack-machines/blob/aec8aa99c2c0c19370f8402c3976a41300ed8f70/src/instructions/sym.rs#L12-L48

    https://github.com/WilfredTA/symbolic-stack-machines/blob/aec8aa99c2c0c19370f8402c3976a41300ed8f70/src/instructions/misc.rs#L47-L71

    The concrete machine will only use the concrete instructions. The symbolic machine will use the hybrid instructions.

    opened by williamberman 0
  • add concrete indexed memory

    add concrete indexed memory

    this is rebased on https://github.com/WilfredTA/symbolic-stack-machines/pull/11

    Adds the ConcreteInt struct which just wraps i64 and implements some of the interfaces the machine needs

    opened by williamberman 0
  • remove lifetime from machine

    remove lifetime from machine

    This is rebased on top of https://github.com/WilfredTA/symbolic-stack-machines/pull/9 and https://github.com/WilfredTA/symbolic-stack-machines/pull/10

    Removing the lifetime from the machine means we have to remove the symbolic context. We'll separate out machines into two separate structs, one that is the concrete machine and a symbolic machine that wraps the concrete machine. So the current machine w/out the lifetime will become the concrete machine.

    The two {concrete,symbolic} machines work together with the {concrete,symbolic} instructions. We setup the types so that both machines can take a reference to the same program, but the symbolic machine's job is to make sure that symbolic instructions are never executed by the concrete machine. When they are, the opcode panics.

    https://github.com/WilfredTA/symbolic-stack-machines/blob/aec8aa99c2c0c19370f8402c3976a41300ed8f70/src/instructions/mod.rs#L59-L71

    https://github.com/WilfredTA/symbolic-stack-machines/blob/aec8aa99c2c0c19370f8402c3976a41300ed8f70/src/machine/symbolic.rs#L65-L106

    The symbolic machine will hold a list of constraints but again the constraints will be abstract and not have the lifetime.

    opened by williamberman 0
  • Make Extension Environment Generic

    Make Extension Environment Generic

    In a previous version of symbolic-stack-machines (SSM), each modular component (memory, stack, etc) were all generic over Instruction. The idea was to use Rust's type system to configure the machine's behavior. This turned out to be detrimental to the library's ease of use & ergonomics more specifically.

    While climbing out of the so-called "generics-hell", we realized that most of the generics in various components were due to being generic over the type of values stored in the stack, memory, storage, etc. Thus, we created a universal value type that is lazily evaluated; the universal value is actually an AST.

    We also introduced a Configuration object for each modular component (StackConfig, MemConfig, etc). These config objects are meant to receive hooks which can be passed to the universal value's AST interpreter. This allows us to treat the universal value in Rust's type system as a single type (thereby drastically simplifying the library's API), while also allowing various components to customize the semantics of that type and specialize it to whichever type is relevant for that component. In other words, it gives us polymorphism across different components as well as polymorphism across different values within a particular component, since the hooks passed to the universal value's interpreter can conditionally evaluate the value as a different type based on the component's state.

    The only component that does in fact need to be generic is the ExtEnv. This is because it is meant to be a catch all extension mechanism for implementing symbolic machines with more components, as well as those with bespoke features (e.g., the concept of "gas" in the EVM). Notably, the ExtEnv can also be leveraged to construct register-based machines. The crucial difference between the extension environment component and other components is that it does not have a predefined structure. While other components are well defined though generic over the relevant values on which they operate (a memory is treated as a flat array with some index set and value set; a stack is an abstract data type with API to push, pop, and peek), the ExtEnv has no predefined API nor structural constraints. Thus, we should re-introduce generics to the definition of this component.

    enhancement help wanted 
    opened by WilfredTA 0
Owner
TannrA
Working on the distributed web
TannrA
ttasm is an assembler for assembling source code to TTVM executable.

ttasm is an assembler for assembling source code to TTVM executable.

maDeveloper 1 Nov 8, 2021
Auto launch any application or executable at startup

Auto Launch Auto launch any application or executable at startup. Supports Windows, Mac (via AppleScript or Launch Agent), and Linux. How does it work

GyDi 38 Dec 27, 2022
Executable memory allocator with support for dual mapping and W^X protection

jit-allocator A simple memory allocator for executable code. Use JitAllocator type to allocate/release memory and virtual_memory module functions to e

playX 5 Jul 5, 2023
LIBFFM - field-aware factorization machines - in Rust

LIBFFM Rust LIBFFM - field-aware factorization machines - in Rust Getting Started LIBFFM Rust is available as a Rust library and a command line tool.

Andrew Kane 1 Jan 8, 2022
Cassette A simple, single-future, non-blocking executor intended for building state machines.

Cassette A simple, single-future, non-blocking executor intended for building state machines. Designed to be no-std and embedded friendly. This execut

James Munns 50 Jan 2, 2023
Frame is a markdown language for creating state machines (automata) in 7 programming languages as well as generating UML documentation.

Frame Language Transpiler v0.5.1 Hi! So very glad you are interested in Frame. Frame system design markdown language for software architects and engin

Mark Truluck 35 Dec 31, 2022
Envwoman is an application, to sync your .env-files across multiple machines

Envwoman is an application, to sync your .env-files across multiple machines. The main goal is to make Envwoman secure and trustworthy, so everything is open-source and the data will never in plain-text on the server. Encryption happens client-sided via aes-gcm.

Mawoka 3 Sep 28, 2022
Source code and documentation for our 'full stack on rust' meetup on 29-9-2022

Full stack on Rust This is the code and documentation repository for our 'Full stack on Rust' meetup on 29-9-2022. It includes step-by-step documentat

Baseflow 7 Oct 23, 2022
Ruxnasm is an assembler for Uxntal — a programming language for the Uxn stack-machine by Hundred Rabbits

Ruxnasm is an assembler for Uxntal — a programming language for the Uxn stack-machine by Hundred Rabbits. Ruxnasm strives to be an alternative to Uxnasm, featuring more user-friendly error reporting, warnings, and helpful hints, reminiscent of those seen in modern compilers for languages such as Rust or Elm.

Karol Belina 44 Oct 4, 2022
The language that eats the stack. Heavily inspired by porth which is inspired off of forth

Snack The language that eats the stack. Heavily inspired by porth which is inspired off of forth Install To use Snack you will need Rust and fasm Afte

Cowboy8625 2 Mar 20, 2022
A toy-level BLE peripheral stack

bleps - A toy-level BLE peripheral stack This is a BLE peripheral stack in Rust. (no-std / no-alloc) To use it you need an implementation of embedded-

Björn Quentin 4 Oct 17, 2022
Open Source Application Stack & PaaS

mycelia Open Source Application Stack & PaaS Installation cargo xtask build NOTE: We opted for cargo-xtask because Cargo build.rs is not supported for

Mycelia 3 Sep 6, 2023
First Git on Rust is reimplementation with rust in order to learn about rust, c and git.

First Git on Rust First Git on Rust is reimplementation with rust in order to learn about rust, c and git. Reference project This project refer to the

Nobkz 1 Jan 28, 2022
A stupid macro that compiles and executes Rust and spits the output directly into your Rust code

inline-rust This is a stupid macro inspired by inline-python that compiles and executes Rust and spits the output directly into your Rust code. There

William 19 Nov 29, 2022
Learn-rust - An in-depth resource to learn Rust 🦀

Learning Rust ?? Hello friend! ?? Welcome to my "Learning Rust" repo, a home for my notes as I'm learning Rust. I'm structuring everything into lesson

Lazar Nikolov 7 Jan 28, 2022
A highly modular Bitcoin Lightning library written in Rust. Its Rust-Lightning, not Rusty's Lightning!

Rust-Lightning is a Bitcoin Lightning library written in Rust. The main crate, lightning, does not handle networking, persistence, or any other I/O. Thus, it is runtime-agnostic, but users must implement basic networking logic, chain interactions, and disk storage. More information is available in the About section.

Lightning Dev Kit 850 Jan 3, 2023
Telegram bot help you to run Rust code in Telegram via Rust playground

RPG_BOT (Rust Playground Bot) Telegram bot help you to run Rust code in Telegram via Rust playground Bot interface The bot supports 3 straightforward

TheAwiteb 8 Dec 6, 2022
`Debug` in rust, but only supports valid rust syntax and outputs nicely formatted using pretty-please

dbg-pls A Debug-like trait for rust that outputs properly formatted code Showcase Take the following code: let code = r#" [ "Hello, World!

Conrad Ludgate 12 Dec 22, 2022
Playing with web dev in Rust. This is a sample Rust microservice that can be deployed on Kubernetes.

Playing with web dev in Rust. This is a sample Rust microservice that can be deployed on Kubernetes.

André Gomes 10 Nov 17, 2022