CUP: C(ompiler) U(nder) P(rogress)
A badly named, in-progress programming language just to learn how these things work. Wait, doesn't everyone write a compiler when they're bored?
Currently, the language is comparable to C, with some syntax changes inspired by Rust (that also make it a little easier to parse). The compiler outputs assembly code in yasm
format, so you will need yasm and a linker of your choice to compile it. The included Makefile and scripts use ld
. (Alternatively, you can use nasm
, but you will have to change the command being run in compiler/main.cup
and meta/bootstrap.sh
)
Only linux and macOS (only on x86_64) are supported.
Building
Tools
Make sure you have yasm
and ld
installed, and located on your PATH
.
Compiling
The reference implementation of the compiler is written in CUP, so you'll need to use the pre-compiled YASM files to get the initial executable. You should be able to run the command below to create the ./build/cupcc
compiler:
$ ./meta/bootstrap.sh
Compile a program (and optionally run it) using:
$ ./build/cupcc /path/to/program.cup -o prog
$ ./prog 1 2 3 4
# OR
$ ./build/cupcc /path/to/program.cup -o prog -r 1 2 3 4
Make sure to not have the executable name end in .yasm
or .o
, since there are some temporary files created during compilation.
Code Samples
Hello World
Some common functions you'll want are located in std/common.cup
import "std/common.cup";
fn main(arc: int, argv: char**): int {
putsln("Hello, world!");
return 0;
}
Variables
Variables are strongly typed. You can either declare them with a type, or they can be inferred if there is an initial assignment.
fn main() {
let x: int = 5; // Explicity define the type
let y = 6; // Infer the type
let z = x + y; // Add them, and infer the type
}
Pointers and arrays
fn main() {
let x: int[10]; // An array of 10 ints (initializers not supported)
let y: int* = x; // Automatically decays to a pointer when passed or assigned
let z = y; // type(z) == int* also works
let a = x[0]; // Access the first element (`a` is an int)
let b = *(x+1); // Access the second element (can use pointer arithmetic)
}
Structs / Unions / Enums
// For now, enums just generate constant values with sequential numbers.
// They aren't a "type" on their own.
enum Type {
TypeInt,
TypeFloat,
TypeChar,
}
struct Variable {
typ: int; // Can't use `Type` here, because it's not a type
value: union { // Anonymous nested structures are allowed.
as_int: int;
as_char: char;
as_ptr: Variable*; // Can recursively define types.
};
};
fn main() {
let x: Variable; // No struct initializers yet
x.typ = TypeInt;
x.value.as_int = 5;
}
Methods for Structs/Unions
struct Value {
x: int;
};
method Value::inc(amount: int) {
// self (pointer) is implicitly passed in
self.x = self.x + amount;
}
method Value::print() {
print(self.x);
}
fn main() {
let v: Value;
let v_ptr = &v;
v.x = 0;
// Call methods using `::`
v::inc(10);
v_ptr::print(); // Also works for pointers
}
File I/O
For now, the file I/O is very inspired by C, but it's wrapped using methods for the File
object. Optionally, you can use the raw syscalls (which behave like C), to deal with file descriptors manually. However it's preferred to use the File
object as it's more convenient and also provides buffered writes.
A simple implementation of cat
is:
import "std/file.cup";
fn main(argc: int, argv: char**) {
for (let i = 1; i < argc; ++i) {
let file = fopen(argv[i], 'r');
defer file::close(); // Close the file at the end of the block (in each iteration)
let buf: char[1024];
let n = file::read(buf, 1024); // use file-specific functions
while (n > 0) {
write(0, buf, n); // Use raw system calls
n = file::read(buf, 1024);
}
// file closed here because of defer
}
}
Want some more examples? Check out the examples directory, or the compiler directory, which contains the implementation of the compiler itself.