🍄 A disassembler for the UEFI Bytecode Virtual Machine.

Spore - UEFI Bytecode Disassembler

🍄 A disassembler for the UEFI Bytecode Virtual Machine. For details on the VM, see section 22 of the UEFI Specification.

Installation

$ cargo install spore-disassembler

# Or alternatively
$ cargo install --git https://github.com/Pebaz/spore

Demo

Given the following FASMG-EBC UEFI Bytecode Assembly file:

;; Adapted from https://github.com/pbatard/fasmg-ebc/blob/master/hello.asm

include 'ebc.inc'
include 'efi.inc'
include 'format.inc'
include 'utf8.inc'

format peebc efi  ;; PE executable format, EFI Byte Code

entry efi_main

section '.text' code executable readable

efi_main:
    MOVn   R1, @R0(EFI_MAIN_PARAMETERS.SystemTable)
    MOVn   R1, @R1(EFI_SYSTEM_TABLE.ConOut)
    MOVREL R2, string_hello
    PUSHn  R2
    PUSHn  R1
    CALLEX @R1(SIMPLE_TEXT_OUTPUT_INTERFACE.OutputString)
    MOV R0, R0(+2,0)
    JMP efi_main
    RET

section '.data' data readable writeable
    string_hello: du "Hello World!", 0x0A, 0x0

To assemble it, clone the FASMG-EBC project and save the file in the project root under a familiar name such as "bc.asm".

# Generates `bc.efi`
$ make bc.asm

# This is a PE executable that contains UEFI Bytecode
$ file bc.efi
bc.efi: PE32+ executable (DLL) (EFI application) EFI byte code, for MS Windows

Now that we have a bootable PE executable, we can output the bytecode instructions inside of it:

$ spore bc.efi

The disassembled bytecode instructions are then emitted by Spore:

      72 81 41 10  MOVnw R1, @R0(+1, +16)
      72 91 85 21  MOVnw R1, @R1(+5, +24)
      79 02 F4 0F  MOVRELw R2, 4084
            35 02  PUSHn R2
            35 01  PUSHn R1
83 29 01 00 00 10  CALL32EXa @R1(+1, +0)
      60 00 02 10  MOVqw R0, R0(+2, +0)
            02 F2  JMP8 -14
               04  RET
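
The index operands in the listing, such as @R0(+1, +16), are packed into the two bytes that follow the operand byte. The following is a minimal sketch in Python of how such a 16-bit index can be unpacked, assuming the layout described in the specification (a sign bit, a 3-bit width field, then the constant and natural units). The helper name decode_index16 is hypothetical and is not taken from Spore's source.

```python
def decode_index16(raw):
    """Decode a 16-bit EBC index into (natural_units, constant_units).

    Assumed layout: bit 15 is the sign, bits 12..14 give the width of the
    natural-units field in steps of 2 bits, and the low 12 bits hold the
    natural units (lowest bits) followed by the constant units.
    """
    sign = -1 if raw & 0x8000 else 1
    natural_width = ((raw >> 12) & 0x7) * 2
    payload = raw & 0x0FFF
    natural = payload & ((1 << natural_width) - 1)
    constant = payload >> natural_width
    return sign * natural, sign * constant


# The index bytes are stored little-endian, so "41 10" is the value 0x1041.
first = int.from_bytes(bytes([0x41, 0x10]), "little")
second = int.from_bytes(bytes([0x85, 0x21]), "little")

print(decode_index16(first))   # (1, 16)
print(decode_index16(second))  # (5, 24)
```

Both results match the first two MOVnw lines of the listing above.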

Usage

Why

I am learning about making operating systems because I think it is fun. It is actually quite difficult, but the rewards are pretty satisfying.

While learning about operating systems, I discovered the Unified Extensible Firmware Interface (UEFI). It's essentially a set of interfaces, exposed through a C ABI, that let you write pre-OS recovery software, OS installers, or boot managers.

The UEFI Virtual Machine comes preinstalled on most modern motherboards. Most of the time, you write applications against the UEFI API in C and then boot into them directly. Astoundingly, you can also boot right into applications written in UEFI Bytecode (EBC)!

This is astonishing because EBC is cross-platform, which makes it a lot like Java bytecode in that regard.

Even more surprising is that EBC has access to the same Boot & Runtime Services that normal UEFI applications have. This means you can write a bootloader using EBC.

After I discovered this I set out to learn UEFI Bytecode, but didn't find very much on it. Thankfully, Pete Batard built FASMG-EBC, an assembler for EBC.

I wanted to better understand the output of the assembler and to do that, I would need to work with it more. That's when I decided to build Spore. 😄

How did I do it?

I started out by perusing the UEFI Specification very carefully and found that all of the information I would need was contained in section 22. From there, I decided to use Rust to build the disassembler since it is my current favorite language for many reasons.

I then created a Python script that would generate some specific bytecode sequences without having to deal with the PE file format. This was one of the most important steps in the entire project because I learned how to interpret the binary structure of a few of the instructions.

Once I had some bytes to work with, I wrote some Rust code that would open the file and iterate through the bytes. Amazingly enough, the entire application runs by passing a single iterator to all the functions that parse out instructions! All the parsing functions just advance the iterator and return any errors.

This made the whole process easier because all I had to do was look at the first byte to determine the instruction's opcode and route it to the appropriate parser function. Since instructions vary in length, the iterator approach worked amazingly well.
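
The single-iterator idea can be sketched in a few lines. This is an illustration in Python rather than Spore's actual Rust code, and it only handles the two simplest encodings from the demo listing (PUSHn and JMP8, without index or condition bits). The function names and the PARSERS table are hypothetical.

```python
def parse_pushn(opcode_byte, stream):
    """PUSHn: one operand byte follows the opcode (index form not handled)."""
    operand = next(stream)
    indirect = "@" if operand & 0x08 else ""
    return f"PUSHn {indirect}R{operand & 0x07}"


def parse_jmp8(opcode_byte, stream):
    """JMP8: one signed 8-bit offset follows (condition bits ignored here)."""
    offset = int.from_bytes(bytes([next(stream)]), "little", signed=True)
    return f"JMP8 {offset}"


# The low 6 bits of the first byte select the parser.
PARSERS = {0x35: parse_pushn, 0x02: parse_jmp8}


def disassemble(data):
    stream = iter(data)  # the one iterator shared by every parser
    lines = []
    for first_byte in stream:
        parser = PARSERS.get(first_byte & 0x3F)
        if parser is None:
            raise ValueError(f"unsupported opcode byte 0x{first_byte:02X}")
        # Each parser consumes exactly the bytes its instruction needs.
        lines.append(parser(first_byte, stream))
    return lines


print(disassemble(bytes([0x35, 0x02, 0x35, 0x01, 0x02, 0xF2])))
# ['PUSHn R2', 'PUSHn R1', 'JMP8 -14']
```

Because every parser advances the same iterator, the main loop never needs to know how long an instruction is.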

After I had written a fair bit, I noticed that many of the instructions used the exact same parsing routines. I was also implementing one instruction at a time, and with 55 instructions in total this was not going to scale. I stepped back to see what could be reused, read through all 55 instructions in the specification, and grouped them by how they were parsed. Once that was done, I only had to write one parsing function per group (7 in total)!
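
As a rough illustration of that grouping, several arithmetic instructions share one encoding shape and can therefore share one parser. The sketch below is in Python, not Spore's Rust, and covers only three opcodes in their plain two-register form. The opcode values come from the specification's instruction tables as best understood here and should be checked against it. The output format and all names are hypothetical.

```python
# Opcode -> mnemonic for a few instructions with the same encoding shape:
# an opcode byte followed by one operand byte that holds both registers.
TWO_REGISTER_OPS = {0x0C: "ADD", 0x0D: "SUB", 0x0E: "MUL"}


def parse_two_register(opcode_byte, stream):
    """Shared parser for every mnemonic in TWO_REGISTER_OPS."""
    mnemonic = TWO_REGISTER_OPS[opcode_byte & 0x3F]
    width = "64" if opcode_byte & 0x40 else "32"  # bit 6 selects the width
    operand = next(stream)
    op1 = ("@" if operand & 0x08 else "") + f"R{operand & 0x07}"
    op2 = ("@" if operand & 0x80 else "") + f"R{(operand >> 4) & 0x07}"
    return f"{mnemonic}{width} {op1}, {op2}"


# Every opcode in the group routes to the same function.
PARSERS = {opcode: parse_two_register for opcode in TWO_REGISTER_OPS}

stream = iter(bytes([0x4C, 0x21, 0x0D, 0x21]))
for first_byte in stream:
    print(PARSERS[first_byte & 0x3F](first_byte, stream))
# ADD64 R1, R2
# SUB32 R1, R2
```

Adding another instruction of the same shape then only means adding one entry to the table.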

At this point, I knew what I was getting myself into and determined that I needed to have some better unit testing to ensure I was not messing anything up during refactoring. I had been incrementally adding to the Python script to output each and every instruction as well as every combination of arguments but I knew that I would need to somehow verify that I was not breaking anything.

The bytecode generator script looked like this:

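# Open in binary write mode so that raw bytes can be written
bc = open('bc.bin', 'wb')

# The trailing "$" comment records the assembly this byte sequence encodes
bc.write(0b00101010_00000001.to_bytes(2, 'big'))  # $ STORESP R1, FLAGS

# The rest of the instructions ...
# This totaled 1162 lines of code

bc.close()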

I wrote another Python script that would scrape out all of the assembly instructions from the bytecode generation script and write them to a file.

It worked by looking for all the comments that contained the dollar sign $ and writing them to a file. The result looked like this:

STORESP R1, FLAGS

;; The rest of the instructions
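
A scraper along these lines only needs a few lines of Python. This is a minimal sketch, not the original script: the regular expression and the function name extract_expected are assumptions, and the input is an inline string rather than the real generator file.

```python
import re

# Matches the "# $ <assembly>" marker at the end of a generator line.
MARKER = re.compile(r"#\s*\$\s*(.+?)\s*$")


def extract_expected(source_text):
    """Collect the assembly text from every '$' comment, in file order."""
    expected = []
    for line in source_text.splitlines():
        match = MARKER.search(line)
        if match:
            expected.append(match.group(1))
    return expected


generator_source = """\
bc.write(0b00101010_00000001.to_bytes(2, 'big'))  # $ STORESP R1, FLAGS
# The rest of the instructions ...
"""

print("\n".join(extract_expected(generator_source)))
# STORESP R1, FLAGS
```

Ordinary comments are skipped because they lack the dollar sign marker.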

Now that I had what I expected to come out of the disassembler, I could simply compare the output with this file to determine if I had any regressions!
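
The comparison itself can be as simple as a line diff. The sketch below uses Python's standard difflib module and assumes that both inputs are already lists of instruction text (that is, any hex byte column has been stripped from the disassembler output). The function name find_regressions is hypothetical.

```python
import difflib


def find_regressions(expected_lines, actual_lines):
    """Return unified-diff lines; an empty list means nothing regressed."""
    return list(
        difflib.unified_diff(
            expected_lines,
            actual_lines,
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


expected = ["STORESP R1, FLAGS", "PUSHn R2"]
good_output = ["STORESP R1, FLAGS", "PUSHn R2"]
bad_output = ["STORESP R1, FLAGS", "PUSHn R3"]

print(find_regressions(expected, good_output))  # []
for line in find_regressions(expected, bad_output):
    print(line)
```

The second call prints a diff that points at the mismatched PUSHn line.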

Once I had this, I could implement the rest of the instruction parsers and modify the existing ones at will and be sure that nothing would break.

Implementing the rest of the parsers took a while, but once I was done, I needed to convert the Python test scripts to Rust unit tests.

Again, I started by converting them one at a time by hand, but this soon became slightly ridiculous.

I wrote another Python script that parsed the bc.write(...) directives from the bytecode generator script and converted them to Rust unit tests. This was immensely satisfying! 🙃
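
A converter of this kind might look like the sketch below. It is not the original script: the regular expression only handles bc.write(...) lines shaped like the one shown earlier, and the emitted Rust calls a hypothetical disassemble function whose real name and signature in Spore are unknown here.

```python
import re

# Matches: bc.write(0b....to_bytes(N, 'big'))  # $ ASSEMBLY
WRITE_LINE = re.compile(
    r"bc\.write\((0b[01_]+)\.to_bytes\((\d+),\s*'(big|little)'\)\)"
    r"\s*#\s*\$\s*(.+?)\s*$"
)


def to_rust_test(line, test_number):
    """Turn one generator line into Rust test source, or None if no match."""
    match = WRITE_LINE.search(line)
    if match is None:
        return None
    literal, length, byteorder, assembly = match.groups()
    raw = int(literal.replace("_", ""), 2).to_bytes(int(length), byteorder)
    byte_list = ", ".join(f"0x{b:02X}" for b in raw)
    return (
        "#[test]\n"
        f"fn test_instruction_{test_number}() {{\n"
        f"    let bytes = [{byte_list}];\n"
        # `disassemble` is a placeholder for whatever the crate exposes.
        f'    assert_eq!(disassemble(&bytes), "{assembly}");\n'
        "}\n"
    )


line = "bc.write(0b00101010_00000001.to_bytes(2, 'big'))  # $ STORESP R1, FLAGS"
print(to_rust_test(line, 0))
```

For the STORESP line this emits a test whose byte array is [0x2A, 0x01] and whose expected string is "STORESP R1, FLAGS".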

Once I had working Rust unit tests, I went on to polish the CLI and fix bits and bobs of miscellaneous todo items. The rest is what you see in this README!

All in all, the project took me 20 days to complete.

Some things that I noticed while working on Spore:

  • Python was absolutely essential for rapid prototyping. I will continue to remember how well Rust + Python pair together for future projects. I used Python for:
    • Generating bytecode
    • Verifying disassembler output
    • Converting Python tests to Rust unit tests
  • Rust was amazing to use because it did not force me to think about structuring my code. It was just functions + structs + 1 trait 😛
    • One other thing that was amazing was that after I got the code to compile, very rarely was it incorrect. Rust's compiler is world-class. It also made refactoring fearless because I knew it would catch all the things.
  • The Rust ecosystem is amazing. I needed a PE file loader, so I used the pelite crate, and it was a breeze.
  • The colored crate made outputting colored text very fun!
  • Rust's standard library doesn't have a Vec-like data structure that is allocated on the stack. I know this might be a strange requirement, but I don't like allocating everything on the heap all the time. The arrayvec crate allowed me to have a fixed-size buffer that had a Vec-like interface.
  • I used include_str!() for the CLI usage information and this was oddly satisfying! 😆
  • Strings. Let's talk about strings. I mean, why can't I stack allocate them very easily? I essentially want like a stack-allocated string builder to allow me to not make even a single allocation but implementing this would have taken forever. 😕

Notes

  • 🍄 Spore is named after mushroom spores.
  • 👏 Thank you to Pete Batard for creating FASMG-EBC which is based on the Flat Assembler. Without this tool, I would not have had the assembly files to disassemble!
  • 🤯 The UEFI Specification is exceptionally well-written and contained all the information necessary to implement Spore.
  • 🤷‍♂️ Although Spore is cross-platform (Windows, macOS, Linux), I have not tested whether FASMG-EBC works on other platforms.

Releases

  • v0.1.1 (Nov 3, 2021)

    Fixes some usage issues when disassembling output from FASMG-EBC.

    Updated crate published to crates.io.

    Changes

    Breaking changes are indicated in bold

    • Fix handling of multiple zeroes in a row
    • Skip lone zeroes, as the FASMG-EBC assembler uses them to separate labels
Owner
Samuel Wilder
Rustacean 🦀. Azure Observability @Microsoft 🐝. Formerly @AWS, @CapitalOne