MLIR Rust multi-level compiler framework

MLIR-RS

Multi-Level Intermediate Representation framework for Rust.

What

Modern programming language design is moving towards multi-level lowering to get from AST to ASM. What does that mean?

For example, the Rust compiler goes through many levels of intermediate representation. These include:

  • AST - Abstract Syntax Tree - an in-memory representation of the Rust source code.
  • HIR - High-level IR - the AST with some high-level constructs desugared: for and while loops become loop, and if let is turned into match.
  • THIR - Typed HIR - HIR with all types checked and inferred. Implicit behaviour such as autoref is made explicit here, and method calls are resolved to plain function calls.
  • MIR - Mid-level IR - THIR with all high-level control flow structures removed and replaced entirely with jumps between basic blocks.
  • LIR - Low-level IR - this doesn't exist yet, but it's a theoretical lowering from MIR into a data-dependency graph.
  • LLVM IR or CLIF - the input IRs of the LLVM and Cranelift codegen backends.
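
To make the HIR step concrete, here is a hand-written sketch of roughly what that desugaring does to a for loop and an if let. The function names are made up for illustration, and the real compiler output differs in detail (it uses internal lang items rather than the trait paths spelled out here).

```rust
/// Surface syntax: a `for` loop containing an `if let`.
fn sum_some_surface(items: &[Option<u32>]) -> u32 {
    let mut total = 0u32;
    for item in items {
        if let Some(n) = item {
            total += n;
        }
    }
    total
}

/// Approximately the same function after HIR-style desugaring:
/// `for` becomes `loop` + `match` on `Iterator::next`,
/// and `if let` becomes a two-armed `match`.
fn sum_some_desugared(items: &[Option<u32>]) -> u32 {
    let mut total = 0u32;
    let mut iter = IntoIterator::into_iter(items);
    loop {
        match Iterator::next(&mut iter) {
            Some(item) => match item {
                Some(n) => total += n,
                _ => {}
            },
            None => break,
        }
    }
    total
}
```

Both functions compute the same result; only the second one is left with a single looping construct and a single branching construct for later stages to handle.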

That's a lot to keep track of, and as a result there are 560,000 lines of code in the Rust compiler.

Other languages perform similar lowerings. It's a shame that every compiler has to write all of this again from scratch.

Prior Art

The LLVM project started working on something called MLIR. As far as I could tell, it was an effort driven mostly by Google. It's a C++ framework for implementing multi-level optimising compilers. They show off some use cases, like TensorFlow computation graphs being modelled in MLIR and lowered down to CUDA.

Unfortunately, MLIR relies on CMake and C++ abstract classes, and I don't particularly want to use C++. This project is an attempt at a Rust rewrite. The current design is taken from the LLVM project, so see there for high-level documentation for now.

Concepts

This is an extremely bare, high-level overview of the concepts. Expect it to change.

Dialects

Each "level" of IR is represented as a dialect. Each dialect is in charge of defining the types and operations it supports. The neat thing about MLIR is that it supports multiple dialects existing at once.
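
As a rough sketch of that idea, a dialect could be modelled as a trait and several dialects registered side by side in one context. The `Dialect` trait, the `Context` type and their methods are hypothetical names chosen for this example and are not this crate's actual API; the operation names are borrowed from upstream MLIR's arith and cf dialects.

```rust
use std::collections::HashMap;

/// Hypothetical trait: a dialect names itself and lists what it defines.
trait Dialect {
    fn namespace(&self) -> &'static str;
    fn operations(&self) -> &'static [&'static str];
    fn types(&self) -> &'static [&'static str];
}

struct Arith;
struct ControlFlow;

impl Dialect for Arith {
    fn namespace(&self) -> &'static str { "arith" }
    fn operations(&self) -> &'static [&'static str] { &["addi", "constant"] }
    fn types(&self) -> &'static [&'static str] { &["i32", "i64"] }
}

impl Dialect for ControlFlow {
    fn namespace(&self) -> &'static str { "cf" }
    fn operations(&self) -> &'static [&'static str] { &["br", "cond_br"] }
    fn types(&self) -> &'static [&'static str] { &[] }
}

/// Hypothetical context holding several dialects at the same time.
struct Context {
    dialects: HashMap<&'static str, Box<dyn Dialect>>,
}

impl Context {
    fn new() -> Self {
        Context { dialects: HashMap::new() }
    }

    fn register(&mut self, dialect: Box<dyn Dialect>) {
        self.dialects.insert(dialect.namespace(), dialect);
    }

    /// Checks whether a qualified name such as "arith.addi" is defined
    /// by one of the registered dialects.
    fn knows_op(&self, qualified: &str) -> bool {
        match qualified.split_once('.') {
            Some((namespace, op)) => self
                .dialects
                .get(namespace)
                .map_or(false, |d| d.operations().iter().any(|name| *name == op)),
            None => false,
        }
    }
}
```

With both dialects registered, `knows_op("arith.addi")` and `knows_op("cf.br")` return true, while `knows_op("arith.br")` returns false because each operation belongs to exactly one dialect.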

Operations

Operations are the core of MLIR. They can represent high-level concepts like "make an RPC call", mid-level concepts like for loops and other built-in functions, and low-level concepts like syscalls, addition, or jump-to-subroutine. Operations are defined by dialects.

Operations have arguments and return values, and everything is typed. This doesn't necessarily mean strongly typed, as your dialects can provide an "any" type. Operations can also carry metadata and source code information.

Operations can also be "terminators": they end the current block and continue into another block (e.g. a switch statement).

Lastly, operations can contain regions. This can be used to represent functions: defining a function is an operation, and the statements inside it live in blocks inside the contained region.
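
The description above could be captured with data structures along these lines. This is a minimal sketch under my own assumptions: all type and field names are illustrative rather than this crate's real types, and the operation names are borrowed from upstream MLIR's func and arith dialects.

```rust
/// A value type. `Any` shows how a dialect could opt out of strict typing.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Type {
    I32,
    Any,
}

/// A typed value produced or consumed by an operation.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Value {
    id: usize,
    ty: Type,
}

#[derive(Debug, Default)]
struct Block {
    ops: Vec<Operation>,
}

#[derive(Debug, Default)]
struct Region {
    blocks: Vec<Block>,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Operation {
    name: &'static str,
    operands: Vec<Value>,
    results: Vec<Value>,
    /// Source information as (file, line).
    location: Option<(&'static str, u32)>,
    is_terminator: bool,
    /// Nested regions, e.g. the body of a function.
    regions: Vec<Region>,
}

/// Builds `fn add(a, b) { return a + b }` as a function operation
/// whose single region holds one block with two operations.
fn build_add_function() -> Operation {
    let a = Value { id: 0, ty: Type::I32 };
    let b = Value { id: 1, ty: Type::I32 };
    let sum = Value { id: 2, ty: Type::I32 };
    let add = Operation {
        name: "arith.addi",
        operands: vec![a, b],
        results: vec![sum.clone()],
        location: Some(("add.rs", 2)),
        is_terminator: false,
        regions: vec![],
    };
    let ret = Operation {
        name: "func.return",
        operands: vec![sum],
        results: vec![],
        location: Some(("add.rs", 3)),
        is_terminator: true,
        regions: vec![],
    };
    Operation {
        name: "func.func",
        operands: vec![],
        results: vec![],
        location: Some(("add.rs", 1)),
        is_terminator: false,
        regions: vec![Region { blocks: vec![Block { ops: vec![add, ret] }] }],
    }
}

/// Counts an operation plus everything nested inside its regions.
fn count_ops(op: &Operation) -> usize {
    1 + op
        .regions
        .iter()
        .flat_map(|region| &region.blocks)
        .flat_map(|block| &block.ops)
        .map(count_ops)
        .sum::<usize>()
}
```

Because operations own regions, regions own blocks, and blocks own operations, the whole program is a single tree that can be walked recursively, as `count_ops` does.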

Regions

Regions contain potentially multiple blocks.

Blocks

A block is a sequence of operations. A block also has a name and a set of arguments.

In Rust terms, the name of a block could be thought of as a label, and the arguments can be interpreted as a set of captures.
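
To illustrate the label analogy, the sketch below writes the same summing loop twice: once as named blocks that jump to each other with arguments, and once as an ordinary labelled Rust loop. The `Block`, `Next` and `run_blocks` names are invented for this example, and a block body is a plain function standing in for a sequence of operations.

```rust
use std::collections::HashMap;

/// How a block ends: jump to a named block with arguments, or return.
enum Next {
    Jump(&'static str, Vec<i64>),
    Return(i64),
}

/// A block has a name, a number of arguments, and a body.
struct Block {
    name: &'static str,
    arity: usize,
    body: fn(&[i64]) -> Next,
}

/// Runs blocks starting at `entry` until one of them returns.
fn run_blocks(blocks: &[Block], entry: &'static str, args: Vec<i64>) -> i64 {
    let by_name: HashMap<&str, &Block> = blocks.iter().map(|b| (b.name, b)).collect();
    let (mut current, mut args) = (entry, args);
    loop {
        let block = by_name[current];
        assert_eq!(args.len(), block.arity, "wrong argument count for block {}", block.name);
        match (block.body)(&args) {
            Next::Jump(target, next_args) => {
                current = target;
                args = next_args;
            }
            Next::Return(value) => return value,
        }
    }
}

/// Sum of 0..n as two blocks: `header(i, n, acc)` and `exit(acc)`.
fn sum_below_blocks() -> Vec<Block> {
    vec![
        Block {
            name: "header",
            arity: 3,
            body: |a: &[i64]| {
                let (i, n, acc) = (a[0], a[1], a[2]);
                if i < n {
                    Next::Jump("header", vec![i + 1, n, acc + i])
                } else {
                    Next::Jump("exit", vec![acc])
                }
            },
        },
        Block { name: "exit", arity: 1, body: |a: &[i64]| Next::Return(a[0]) },
    ]
}

/// The same computation as a labelled Rust loop.
fn sum_below_loop(n: i64) -> i64 {
    let (mut i, mut acc) = (0, 0);
    'header: loop {
        if i < n {
            acc += i;
            i += 1;
            continue 'header;
        }
        break 'header acc;
    }
}
```

The block arguments play the role of the variables that the labelled loop mutates in place: each jump passes their next values explicitly instead of relying on shared mutable state.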

Passes

The end goal of MLIR is to take your high level IR and lower it into either a codegen IR or some other IR that can support an interpreter. To do this, we use the concept of a "pass".

Passes cover 2 core concepts:

  1. Optimisations
  2. Lowering

Both of these follow a similar system of recursively walking the tree of operations, finding patterns, and then converting to some other operation (or set of operations).
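
A lowering pass in this style might look like the sketch below, which walks a small tree of high-level operations and converts each one into a set of lower-level operations. Both toy "dialects" (`High` and `Low`) and the stack-machine target are assumptions made for this example, not part of MLIR or of this crate.

```rust
/// Toy high-level dialect: expression trees with a `Square` convenience op.
enum High {
    Const(i64),
    Add(Box<High>, Box<High>),
    Square(Box<High>),
}

/// Toy low-level dialect: instructions for a stack machine.
#[derive(Debug, PartialEq)]
enum Low {
    Push(i64),
    Dup,
    Add,
    Mul,
}

/// Lowering: recursively walk the tree and emit low-level operations.
/// `Square` has no direct equivalent, so it becomes `Dup` followed by `Mul`.
fn lower(op: &High, out: &mut Vec<Low>) {
    match op {
        High::Const(value) => out.push(Low::Push(*value)),
        High::Add(lhs, rhs) => {
            lower(lhs, out);
            lower(rhs, out);
            out.push(Low::Add);
        }
        High::Square(inner) => {
            lower(inner, out);
            out.push(Low::Dup);
            out.push(Low::Mul);
        }
    }
}

/// A tiny interpreter for the low-level dialect, using wrapping arithmetic.
/// Returns `None` if the code underflows the stack.
fn run_stack(code: &[Low]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    for instr in code {
        match instr {
            Low::Push(value) => stack.push(*value),
            Low::Dup => {
                let top = *stack.last()?;
                stack.push(top);
            }
            Low::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            Low::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
        }
    }
    stack.pop()
}
```

Lowering `Square(Add(Const(1), Const(2)))` produces `Push(1), Push(2), Add, Dup, Mul`, which the interpreter evaluates to 9.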

Passes can be defined on generic operations, or specific ones, or specific patterns of operations.

The way I'm currently thinking about this is that every operation has a weight. Every pass should guarantee that it reduces the total weight of the tree. I'm not sure if this is worth implementing, however.
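
One way to picture the weight idea is a small optimisation pass paired with a cost function. Everything here is a sketch under my own assumptions: the `Op` enum, the specific weights, and the choice of wrapping arithmetic when folding are illustrative, not a design this project has settled on.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Const(i64),
    Arg(usize),
    Add(Box<Op>, Box<Op>),
    Mul(Box<Op>, Box<Op>),
}

/// Hypothetical cost model: leaves are free, arithmetic has a cost.
fn weight(op: &Op) -> u32 {
    match op {
        Op::Const(_) | Op::Arg(_) => 0,
        Op::Add(a, b) => 1 + weight(a) + weight(b),
        Op::Mul(a, b) => 3 + weight(a) + weight(b),
    }
}

/// Optimisation pass: rewrite children first, then match patterns on the
/// result. Every rewrite replaces an operation with something of lower weight.
fn fold_constants(op: Op) -> Op {
    match op {
        Op::Add(a, b) => match (fold_constants(*a), fold_constants(*b)) {
            // Two constants fold into one (wrapping semantics assumed).
            (Op::Const(x), Op::Const(y)) => Op::Const(x.wrapping_add(y)),
            // x + 0 and 0 + x are just x.
            (Op::Const(0), other) | (other, Op::Const(0)) => other,
            (a, b) => Op::Add(Box::new(a), Box::new(b)),
        },
        Op::Mul(a, b) => match (fold_constants(*a), fold_constants(*b)) {
            (Op::Const(x), Op::Const(y)) => Op::Const(x.wrapping_mul(y)),
            // x * 1 and 1 * x are just x.
            (Op::Const(1), other) | (other, Op::Const(1)) => other,
            (a, b) => Op::Mul(Box::new(a), Box::new(b)),
        },
        leaf => leaf,
    }
}
```

For the tree `(arg0 * 1) + (2 + 3)`, the pass yields `arg0 + 5`, and the weight under this cost model drops from 5 to 1. A pass manager could check such a drop after each pass as a sanity test, at the price of having to define a weight for every operation.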

Planned Dialects

I plan to implement some of the basic dialects from MLIR (Affine, ControlFlow, Bufferization, MemRef, etc.) and to use CLIF instead of LLVM IR.

I am going with Cranelift instead of LLVM for 2 reasons:

  1. It's written in Rust.
  2. It's a much simpler codegen backend that performs few optimisation passes. Since MLIR is already intended to be an optimisation framework, we can take on that burden instead.

Owner
Conrad Ludgate