C-like language compiler, the final project of ZJU Compiler Principle course

Overview

cc99

cc99 (not cc98.org) is a C-like language compiler, written as the final project of the ZJU Compiler Principle course. It supports much of the C99 syntax and can compile source code to an executable file, assembly language, an abstract syntax tree (JSON style), and so on.

cc99 can be used on Linux, macOS, and even Windows (unofficial support). Anyone can build it from source or download a binary to try it!

Supported Syntax

Types

  • void
  • char, short, int, long, long long
  • unsigned char, unsigned short, unsigned int, unsigned long, unsigned long long
  • _Bool
  • float, double
  • pointer (any basic type)
  • array (any basic type, any dimension)
  • function
  • struct

Statements

  • compound (which means {})
  • if, else
  • while, do-while, for, break, continue
  • return

Expressions

  • assignment: = += -= *= /= %= &= |= ^= >>= <<=
  • unary: ++a --a a++ a-- +a -a !a ~a *a &a sizeof(a)
  • binary: a+b a-b a*b a/b a%b a|b a^b a&b a>>b a<<b a&&b a||b a==b a!=b a<b a>b a<=b a>=b a,b
  • function call: a(10,20,30)
  • type cast: identifier as T
  • conditional: a>10?1:0
  • sizeof: sizeof(a), sizeof(int), sizeof(int*)
  • member of struct: struct course c; c.name
  • array subscript: int a[10]; a[0]
  • identifier: int a
  • literal (any base type): 123, 123.123, 123l, "123", '1'
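
As a rough illustration of the constructs listed above, here is a small sketch that sticks to features from those lists (basic types, pointers, for and while loops, compound assignment, the conditional operator, array subscripts). The function names sum_array and count_bits are made up for this example, and the code has only been checked as standard C, not against cc99 itself.

```c
/* Sum the first n elements (pointer parameter, for, +=, subscript). */
long sum_array(int *values, int n) {
    long total = 0;
    for (int i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}

/* Count the set bits of x (while, &, !=, conditional, >>=). */
int count_bits(unsigned int x) {
    int bits = 0;
    while (x != 0) {
        bits += (x & 1) != 0 ? 1 : 0;
        x >>= 1;
    }
    return bits;
}
```

Both functions only widen values implicitly (int to long), so they should not need an explicit cast.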

Get cc99

Before you start, make sure gcc or clang is installed, because cc99 needs one of them to link object files. If neither is present, install one of them first.

There are three ways to get cc99:

  1. Download from releases. We provide only Linux (x86_64) and macOS (Intel chip) builds, which cover most development setups.

  2. Build with Docker. We provide a Dockerfile in the root directory that builds cc99. It packages dir, web-frontend, and web-backend into an Ubuntu image. You can build your own image with the following steps:

git clone https://github.com/RalXYZ/cc99.git
cd cc99
docker build . -t cc99:latest
# now this image contains all target files, you can use `docker cp` to copy them out or start a container!

# start a container named cc99_all, bind port and mount volumes
docker run --name cc99_all -p 6001:5001 -v "$(pwd)/data/runtime":/backend/runtime -d cc99:latest

# get executable file
docker cp cc99_all:/backend/cc99  .
# get include files
docker cp cc99_all:/backend/include  .

  3. Compile from source code. Here is a sample (on ubuntu:20.04):

git clone https://github.com/RalXYZ/cc99.git
cd cc99
sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list && \
 apt-get update && apt install build-essential git curl libssl-dev wget libz-dev -y

# install rust toolchains
curl https://sh.rustup.rs -sSf | sh -s -- -y
# make cargo available in the current shell
. "$HOME/.cargo/env"

# use tuna registry to speed up
cat > /root/.cargo/config <<'EOF'
[source.crates-io]
registry = "https://github.com/rust-lang/crates.io-index"
replace-with = "tuna"
[source.tuna]
registry = "https://mirrors.tuna.tsinghua.edu.cn/git/crates.io-index.git"
EOF

# add llvm registry 
echo 'deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-13 main' >> /etc/apt/sources.list

# optional
curl -LO http://archive.ubuntu.com/ubuntu/pool/main/libf/libffi/libffi6_3.2.1-8_amd64.deb
dpkg -i libffi6_3.2.1-8_amd64.deb

# install llvm
wget -O a https://apt.llvm.org/llvm-snapshot.gpg.key && apt-key add a
apt update -y && apt upgrade -y && apt install -y llvm-13 llvm-13-dev libllvm13 llvm-13-runtime libclang-common-13-dev

# a necessary ENV
export LLVM_SYS_130_PREFIX=/usr

# build it!
cargo build --package cc99 --bin cc99 --release

Usage

First, write a source file in cc99 syntax (called a.c in the following description). Then simply run:

cc99 a.c 
# or
cc99 a.c -o a

# you will get an executable file named `a`, just run it!
./a 
# happy coding!

cc99 supports many command-line arguments; run cc99 --help to list them:

  • without any extra options: Produce an executable file, linking with clang (default) or gcc
  • -c or --assemble: Compile and assemble, but do not link
  • -S or --compile: Compile only; do not assemble or link
  • -b or --bitcode: Generate LLVM bitcode only
  • -p or --parse: Preprocess and parse; do not compile, assemble or link
  • -V or --visual: Read source code from stdin and write its AST to stdout
  • -E or --expand: Preprocess only; do not parse, compile, assemble or link

In addition, we provide some useful options:

  • -O or --opt-level <OPT_LEVEL>: Optimization level, from 0 to 3 [default: 0]. The levels work like those of gcc and clang, though the strategy can be more aggressive than theirs.
  • -i or --include: Add the directories <dir>,<dir>,<dir> to the list that cc99 searches for header files during preprocessing; they are searched from left to right. Absolute paths are strongly recommended.

Compile/Run Online, Visualize AST

We provide a demo playground where you can edit source code online and compile and run it on a provided backend. You can also inspect the AST (abstract syntax tree) of your source code.

web-frontend uses react.js and antv; web-backend uses golang and gin. Both modules can be found in the ./web directory.

Compare with C99 Standard

To be honest, cc99 can parse almost all standard C99 syntax, but it cannot yet generate bitcode for some of it. There are also some differences between our language and the C99 standard, so we strongly recommend reading this chapter before trying cc99.

Preprocessing

#include <custom_header.h>
#ifdef _DEF_CC99_1
#define CC99 C99
#endif
typedef double REAL;

Preprocessing runs in four passes:

  • First pass: Join lines that end with a line-continuation character, and append a newline if the file does not end with one.

  • Second pass: Delete all comments.

  • Third pass: Process all preprocessing directives such as #include, #define, #if, etc.

  • Fourth pass: Merge adjacent string literals.

    E.g. char s[] = "\033[0m" "Hello"; => char s[] = "\033[0mHello";
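
The four passes above can be seen at work in one short file. This is a sketch in standard C, whose translation phases follow the same order; the macro GREETING and the helper text_length are names invented for this example, and the code has not been run through cc99.

```c
/* Pass 1: the backslash-newline below is removed, so the macro
 * body continues on the next line. */
#define GREETING "\033[0m" \
                 "Hello"

/* Pass 2 deletes comments like this one before directives run. */

/* Pass 3 expands GREETING; pass 4 merges the two adjacent literals,
 * so s holds ESC [ 0 m H e l l o plus the terminating zero. */
char s[] = GREETING;

/* Length of a zero-terminated string, written out by hand so that
 * no size_t is involved. */
int text_length(char *text) {
    int n = 0;
    while (text[n] != 0) {
        n++;
    }
    return n;
}
```

Here text_length(s) is 9: four characters from the escape sequence and five from Hello.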

We provide three simple header files, which can be found in the /include directory. They cover most situations and you can use them as you like, but don't forget to include them, for example with #include <stdio.h>!

You can also add other C runtime functions to the /include directory; all you need to add is a function signature. Keep the following in mind:

  • Variadic parameters are supported, as in scanf and printf
  • Parameter qualifiers are supported; you can add any standard qualifier such as const, atomic, and so on
  • size_t is not supported; you must change size_t to long (8 bytes)
  • The function must exist in the standard glibc runtime
  • PRs that add more functions are welcome!
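
A header following these rules might look like the sketch below. The file name my_runtime.h is hypothetical, and the choice of functions is only an illustration; each signature matches the glibc prototype so the block also compiles as standard C.

```c
/* my_runtime.h (hypothetical file name): hand-written signatures
 * for functions that already exist in glibc. */

/* Variadic parameters are written with "..." as in standard C. */
int printf(const char *format, ...);

/* Qualifiers such as const are kept as they are. */
int atoi(const char *text);
int puts(const char *text);

/* glibc declares abs with plain int, so nothing needs to change. */
int abs(int value);

/* For a function whose glibc prototype uses size_t, the rule above
 * means writing long instead when the header is meant for cc99, e.g.
 *     void *malloc(long size);
 * That line stays a comment here because a standard C compiler
 * expects size_t in that position. */
```

No function bodies are needed, since the linker (clang or gcc) resolves the symbols against glibc.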

Multidimensional Arrays and Multidimensional Pointers

int a[10][20][30];
a[0][0][1] = 123;

int *p = a[0][0];
p++;
int **pp = &p;  // address of the pointer variable itself
pp = malloc(sizeof(int*) * 10);
pp[0] = malloc(sizeof(int) * 10);
pp[0][0] = 1;

You can use arrays and pointers freely, since most array and pointer operations are already supported. Note the following:

  • Don't convert a multidimensional array directly into a multi-level pointer; it's illegal
int array[10][10];
int **ptr = array; // illegal!!
  • Best practice for multi-level pointers
#include <stdlib.h>
int ***p;
p = malloc(sizeof(int**) * 10);
for(int i = 0; i < 10; i++){
    p[i] = malloc(sizeof(int*) * 10);
}
for(int i = 0; i < 10; i++){
    for(int j = 0; j < 10; j++){
        p[i][j] = malloc(sizeof(int) * 10);
    }
}
p[0][0][1] = 123; // every index must stay below 10
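
The conversion is illegal because a two-dimensional array decays to a pointer to a whole row, not to a pointer to a pointer. When an int** is needed for data that lives in a real two-dimensional array, one option is to build a table of row pointers first. The sketch below is standard C and has not been checked against cc99; grid, rows, link_rows, and cell_at are names invented for this example.

```c
int grid[3][4];   /* a real 2-D array: 12 ints in one block */
int *rows[3];     /* one pointer per row */

/* Point rows[i] at the first element of grid[i]. */
void link_rows() {
    for (int i = 0; i < 3; i++) {
        rows[i] = grid[i];
    }
}

/* Read one cell through an int**; this works because each
 * table[r] is itself a pointer to a row of ints. */
int cell_at(int **table, int r, int c) {
    return table[r][c];
}
```

After link_rows() has run, cell_at(rows, 2, 3) reads the same cell as grid[2][3].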

Struct Support (Partial)

You can define a struct at global scope, but not inside a function (this may be supported later). Here is an example:

#include <stdlib.h>
struct course {
    char *name; 
    int credit;   
    char ***pre_course;
    int pre_status_num; 
    int *pre_course_num; 
    int grade; 
};
int main(){
    struct course c;
    c.name = malloc(sizeof(char) * 10);
    c.credit = 10;
      
    int credit = c.credit;
}

In addition, we support member access through a pointer, which means the following code is legal:

struct course c;
struct course *ptr = &c;
char *t = ptr->name; 

However, struct initializer lists are not supported yet!

struct course c = {"123",123,....} // illegal! 
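
Until initializer lists are available, one workaround is a small helper that assigns the fields one by one. This is only a sketch: the struct is a trimmed version of struct course from above, and course_init is a name invented for this example.

```c
/* Trimmed version of struct course, defined at global scope. */
struct course {
    char *name;
    int credit;
    int grade;
};

/* Assign each field in turn, standing in for
 * struct course c = {name, credit, 0}; */
void course_init(struct course *c, char *name, int credit) {
    c->name = name;
    c->credit = credit;
    c->grade = 0;
}
```

A caller declares struct course c; and then calls course_init(&c, some_name, 4); before reading any field.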

Type Cast

In standard C, an implicit conversion to a narrower type is allowed and typically truncates the value, as in the following:

long long int a = 114514114514;
int b = a; // lose information but allowed

But in cc99, every basic type has a rank, and implicit type casts from a higher rank to a lower rank are rejected. Here is the rank table:

Name                             Rank
void                             0
_Bool                            1
char, unsigned char              2
short, unsigned short            3
int, unsigned int                4
long, unsigned long              5
long long, unsigned long long    6
float                            7
double                           8

You can use an explicit type cast to convert from a higher rank to a lower rank:

double a = 123.123;
float b = a as float; // we use `var as type` syntax

Only the var as type syntax is accepted for explicit type casts; C-style and function-style casts are rejected:

double a = 123.123;
float b = (float)a;   // illegal!
float c = float(a);   // illegal!
float d = a as float; // legal
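
The rank rule can be modelled in a few lines. The sketch below is written in standard C to illustrate the rule as described here; it is not how cc99 implements the check. The enum and function names are invented, and treating two types of equal rank as implicitly convertible is an assumption.

```c
/* Ranks copied from the rank table in this section. */
enum rank {
    RANK_VOID = 0,
    RANK_BOOL = 1,
    RANK_CHAR = 2,
    RANK_SHORT = 3,
    RANK_INT = 4,
    RANK_LONG = 5,
    RANK_LONG_LONG = 6,
    RANK_FLOAT = 7,
    RANK_DOUBLE = 8
};

/* Returns 1 when an implicit conversion is accepted under the rule
 * "never go from a higher rank to a lower one", and 0 otherwise.
 * Accepting equal ranks is an assumption. */
int implicit_cast_allowed(enum rank from, enum rank to) {
    return from <= to;
}

/* Returns 1 when the conversion needs the explicit `as` form. */
int needs_explicit_cast(enum rank from, enum rank to) {
    return from > to;
}
```

Under this model, int to double is implicit, while double to float and long long to int both require as.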

Function Hoisting and Global Variable Promotion

int main(){
    int s = sum(1, 2); // legal, all functions are hoisted to the top
    int e = d + 10;    // legal, all global variables will hoist to top
}
int sum(int a, int b){
    return a + b;
}
int d = 10;
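
For comparison, standard C99 accepts the same layout only with explicit forward declarations. The sketch below shows that equivalent; the function is called compute instead of main purely for this example.

```c
/* Standard C99 needs these two declarations before first use;
 * with cc99's hoisting they would not be required. */
int sum(int a, int b);
extern int d;

int compute() {
    int s = sum(1, 2); /* uses the prototype above */
    int e = d + 10;    /* uses the extern declaration above */
    return s + e;
}

int sum(int a, int b) {
    return a + b;
}

int d = 10;
```

Removing the first two declarations makes this file invalid C99, which is the difference that hoisting hides.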
Releases(v0.2.0)
  • v0.2.0(Jun 8, 2022)

    What's Changed

    • Resolve issue 1 by @CHN-ChenYi in https://github.com/RalXYZ/cc99/pull/4
    • #include support by @CHN-ChenYi in https://github.com/RalXYZ/cc99/pull/5
    • feat(preprocess): support define by @CHN-ChenYi in https://github.com/RalXYZ/cc99/pull/9
    • feat(ci): add drone.yml and dockerfile by @cxz66666 in https://github.com/RalXYZ/cc99/pull/10
    • fix(ast): change declaration node name by @CHN-ChenYi in https://github.com/RalXYZ/cc99/pull/11
    • Add basic code of generator by @RalXYZ in https://github.com/RalXYZ/cc99/pull/12
    • Finish web backend and CD by @cxz66666 in https://github.com/RalXYZ/cc99/pull/13
    • make error output much better by @RalXYZ in https://github.com/RalXYZ/cc99/pull/14
    • chore(deps): bump ejs from 3.1.6 to 3.1.8 in /web/web-frontend by @dependabot in https://github.com/RalXYZ/cc99/pull/15

    New Contributors

    • @CHN-ChenYi made their first contribution in https://github.com/RalXYZ/cc99/pull/4
    • @cxz66666 made their first contribution in https://github.com/RalXYZ/cc99/pull/10
    • @RalXYZ made their first contribution in https://github.com/RalXYZ/cc99/pull/12
    • @dependabot made their first contribution in https://github.com/RalXYZ/cc99/pull/15

    Full Changelog: https://github.com/RalXYZ/cc99/commits/v0.2.0

Owner
Ralph
Stare into the void | @QSCTech | @ZJUSCT | @milkice233