0. Intro & setup
Motivation
Why study compilers? This question is worth asking, since compilers are non-trivial contraptions
and most of us probably won’t work directly on one of them during our careers.
While curiosity is always a good reason, there are practical reasons too:
- Compilers are a collection of interesting and generalizable algorithms and programming patterns
  that fit together nicely. This makes building a compiler
  an excellent and fun software engineering exercise.
- Understanding compilers gives you deep insight into why programming languages work the way they do,
  and it’s good to know your daily tools well.
- Understanding the problems and design choices faced by programming languages helps you
  understand programming more deeply.
- While creating an entirely new language and the associated tooling is usually not worth the effort,
  sometimes it’s useful to write code analyses or extensions for existing languages. Knowing compilers helps you
  do that effectively.
Course philosophy
This course aims to be straightforward, practical and hands-on.
It typically presents one good way of accomplishing a task,
and then asks you to try it in practice.
This means that, due to limited time, some theoretical breadth and depth has been sacrificed.
If you’re looking for more, there are many great books on compilers.
The text has some green links.
These reveal optional, non-essential content that
expands upon or contextualizes things.
These usually explain:
- alternative approaches
- how real languages have done something
- how we could improve upon whatever we just built
- how things are actually more complicated than they seem
Project setup
While you can complete the course project in any language you like,
our examples and instructions are for Python.
If you’re using Python, you can
download the project template
and follow the instructions in README.md
to set it up.
There are additional notes for Mac and Windows users.
If you’d prefer to use a different language,
see the alternative project setup.
Submitting your project
To submit your completed project, run ./test-gadget.py submit.
This requires a working Docker installation.
On Linux, the version that comes with your distro should be enough (apt install docker.io),
but you can get instructions for installing the latest version here.
On Linux, after installing Docker, remember to add yourself to the docker group (adduser YOUR_USERNAME docker),
and then log out and log in again.
For other operating systems, you can use Docker Desktop.
Click here for more details about Test Gadget.
On the first run, the submit command will prompt you to log in.
You should have received a password if you were registered for the course
when Test Gadget opened.
Please contact martin.partel@helsinki.fi
if you have not received a password by February.
The first run will take a while to build and upload, but subsequent submissions should be much faster,
as only the difference from the previous submission is sent.
You can view the test results and grade on Test Gadget’s website.
You can send multiple submissions to Test Gadget.
Test Gadget may delete some of your older submissions,
but it will always keep your best-scoring submission as well as a few of your latest submissions.
Your grade is determined by the latest best-scoring submission sent before the deadline.
This submission may be manually inspected to ensure the project’s rules were followed.
Test Gadget does not replace your own testing.
Test Gadget runs end-to-end tests, which start to pass only after you’ve completed
most of the project. Local unit testing along the way is highly recommended.
If you have technical issues with submitting to Test Gadget,
see these troubleshooting instructions.
Get the latest version of test-gadget-client by running this command in your project directory:
curl --proto '=https' --tlsv1.2 -sSf https://hy-compilers.github.io/spring-2025/assets/ext/downloads/test-gadget-client-latest.tar.gz | tar xvpz
If you get a Docker error about the line
COPY pyproject.toml poetry.lock .
then edit Dockerfile to change that line to
COPY pyproject.toml poetry.lock ./
If you’re still having trouble after reviewing all the instructions here,
please ask on Discord (likely faster) or email martin.partel@helsinki.fi.
Additional instructions for macOS and Windows
You can get pretty far with just macOS or Windows, but near the end of the project you need to generate
machine code for x86-64 Linux and test it. While in principle you could test only with Test Gadget,
that would be very slow and extremely inflexible.
There are several ways to use an x86-64 Linux environment while developing on macOS or Windows.
Here are instructions for a few of them.
The goal is to have an x86-64 Linux command line that can see and run
the code that you are editing in macOS or Windows.
Method 1: Remote development
If you have access to an x86-64 Linux machine, you can develop on that via SSH.
VSCode has a "Remote - SSH" extension that makes this feel very seamless
(instructions).
Method 2: Docker for automated Linux VM setup
This gives you a repeatable and automated way to build a Linux virtual machine on your Mac or Windows machine.
On Apple Silicon Macs, make sure you have Rosetta: softwareupdate --install-rosetta
Install Docker Desktop (Mac, Windows)
Open Docker Desktop and complete the configuration wizard.
You don't need to sign in.
You can skip the survey.
In your project, create a file called Dockerfile.dev
with the following contents:
FROM --platform=linux/amd64 debian:12
RUN apt-get update \
&& apt-get install -y build-essential python3 curl git zlib1g-dev libssl-dev libbz2-dev libffi-dev libreadline-dev liblzma-dev libsqlite3-dev \
&& apt-get clean
RUN curl https://pyenv.run | bash \
&& echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_python_setup \
&& echo '[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_python_setup \
&& echo 'eval "$(pyenv init -)"' >> ~/.bash_python_setup \
&& echo 'source ~/.bash_python_setup' >> ~/.bashrc \
&& echo 'source ~/.bash_python_setup' >> ~/.profile
COPY .python-version /project/
RUN cd /project && bash -lc 'pyenv install'
RUN curl -sSL https://install.python-poetry.org | python3 - \
&& echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bash_python_setup
COPY pyproject.toml poetry.lock README.md /project/
COPY src/compiler/__init__.py /project/src/compiler/
RUN cd /project && bash -lc 'poetry install' && rm -Rf /project && mkdir -p /project
WORKDIR /project
Run the following command to open a Linux shell in your project (the first run may take a while):
docker build -f Dockerfile.dev -t compilers-dev:latest . && \
docker run -it --rm -v "$(pwd):/project" compilers-dev:latest
Now you have an x86-64 Linux command line where you can see your code in directory /project
and run the project’s testing scripts like ./check.sh and ./compiler.sh ...
Any changes you make in the Linux environment outside of /project
will be lost when you exit the shell, because the container is started with --rm.
To make permanent changes, edit Dockerfile.dev
and run the command above again to rebuild and restart the Linux environment.
Method 3: Full Linux VM
You can manually install a full Linux desktop in a virtual machine using e.g. UTM (for Macs) or VirtualBox (for Mac and Windows).
You can use a shared directory
(UTM, VirtualBox)
to share your project directory with the VM. Then you can use your IDE in Mac/Windows while getting a command line that sees the project in Linux.
Alternatively, you can combine this with method 1 and develop in the VM over SSH. The project files would reside in the VM in this case.
It’s also possible to do all your development in the virtual machine window,
but this may be uncomfortable because the user interface and keyboard shortcuts will be different, especially on macOS.
Alternative project setup
If you want to write your compiler in a language other than Python,
you are very welcome to do so, but you need to replicate parts of the Python project template yourself.
If this interests you, look at these instructions.
You can use the Python project template as a starting point or reference when following these instructions.
Write a compiler server
To make grading significantly faster, Test Gadget communicates with your compiler over TCP.
Your compiler must be able to start a simple TCP server that implements the following protocol.
1. The client sends one JSON request.
2. The client closes its write half of the stream (you get "end of file" while reading on the server side).
3. The server sends a JSON response.
4. The server closes the connection.
There are two types of requests that you need to support: ping requests and compile requests.
// Ping request example
{"command": "ping"}
// Expected response
{} // (empty object)
// Compile request example
{"command": "compile", "code": "source code text"}
// Expected response
{"program": "base64-encoded statically linked x86_64 program"}
// -or-
{"error": "text of compile error"}
Test Gadget may make multiple concurrent requests to the server.
The function run_server in the Python project template’s src/__main__.py
shows a small Python implementation of this kind of server.
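For orientation, the protocol can be sketched with Python's standard socketserver module. This is only a minimal sketch and not the template's actual run_server: compile_source is a hypothetical placeholder for your compiler, the names Handler, Server, serve and send_request are illustrative, and answering unknown commands or malformed JSON with an "error" object is an assumption that the protocol description does not spell out.

```python
import base64
import json
import socket
import socketserver


def compile_source(code: str) -> bytes:
    """Hypothetical stand-in for your compiler: returns the executable's bytes."""
    if not code.strip():
        raise ValueError("empty program")
    return b"placeholder executable for: " + code.encode("utf-8")


def handle_request(request: dict) -> dict:
    """Map one decoded JSON request to the JSON response object."""
    command = request.get("command")
    if command == "ping":
        return {}
    if command == "compile":
        try:
            program = compile_source(request["code"])
        except Exception as e:
            return {"error": str(e)}
        return {"program": base64.b64encode(program).decode("ascii")}
    # Assumption: the protocol does not define a reply for unknown commands.
    return {"error": f"unknown command: {command!r}"}


class Handler(socketserver.StreamRequestHandler):
    def handle(self) -> None:
        # Read until the client closes its sending side ("end of file").
        raw = self.rfile.read()
        try:
            response = handle_request(json.loads(raw))
        except Exception as e:
            response = {"error": str(e)}
        self.wfile.write(json.dumps(response).encode("utf-8"))
        # The connection is closed after handle() returns.


class Server(socketserver.ThreadingTCPServer):
    # One thread per connection, so concurrent requests are served in parallel.
    allow_reuse_address = True
    daemon_threads = True


def serve(host: str = "0.0.0.0", port: int = 3000) -> None:
    """Run the server until interrupted."""
    with Server((host, port), Handler) as server:
        server.serve_forever()


def send_request(host: str, port: int, request: dict) -> dict:
    """Client side of the protocol, handy for local testing."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(json.dumps(request).encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return json.loads(b"".join(chunks))
```

Calling serve() starts the server on port 3000. From another process, send_request("localhost", 3000, {"command": "ping"}) should then return an empty dict.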
While it’s highly recommended to integrate this server into your compiler program,
it’s permitted (but possibly significantly slower) to write a separate server
that invokes your compiler program for each compile request.
In any case, make sure that concurrent requests don’t step on each other’s toes,
e.g. by accidentally writing to the same temporary files!
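One common way to keep concurrent requests apart is to give each request its own scratch directory. The following is a minimal sketch under assumptions: build_in_private_dir and build_many are hypothetical names, the file names are arbitrary, and the copy step stands in for whatever external tools your compiler actually runs.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_in_private_dir(code: str) -> bytes:
    """Hypothetical compile step that needs intermediate files on disk."""
    # TemporaryDirectory creates a uniquely named directory for each call and
    # deletes it on exit, so concurrent requests never share file paths.
    with tempfile.TemporaryDirectory(prefix="compile-") as tmp:
        workdir = Path(tmp)
        input_path = workdir / "input.txt"
        output_path = workdir / "output.bin"
        input_path.write_text(code, encoding="utf-8")
        # A real compiler would invoke its external tools here with
        # cwd=workdir. This stand-in copies the bytes so the sketch runs anywhere.
        output_path.write_bytes(input_path.read_bytes())
        return output_path.read_bytes()


def build_many(sources: list[str]) -> list[bytes]:
    """Run several builds at once, as concurrent requests would."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(build_in_private_dir, sources))
```

Because every call works inside its own directory, fixed file names such as output.bin are safe even when many requests run at the same time.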
Write a Dockerfile
You must package your compiler and its source code with Docker.
The Docker image must start your server on port 3000.
It’s recommended to keep your Docker image small,
so it’s faster to repeatedly upload to Test Gadget,
and because Test Gadget loads it into a ramdisk
that counts towards your memory limit.
The main ways to optimize a Docker image for size and fast turnaround are:
- Use a small base image, preferably one based on alpine or a slim Debian variant.
- Install as few dependencies as possible, and delete unnecessary files afterwards.
  (Use apk add --no-cache ... for alpine-based images;
  follow apt-get best practices for Debian-based images.)
- Do all dependency setup before copying your full source code into the container.
  This allows the dependency setup to be cached.
‼️ Remember: you are required to submit your compiler’s source code
for manual inspection as part of your Docker image,
even when using a compiled language.
Download (or compile) test-gadget-client
The Python project template comes with a small Rust program that builds and uploads
your Docker image to Test Gadget.
You can copy it from the Python project template
(copy test-gadget.py and the subdirectory .test-gadget).
Alternatively, you can compile it from source yourself.
While you technically could use any HTTP client to send your submissions,
the official client does non-trivial diffing between your previous
and your current submission, which reduces upload times and server-side storage
requirements enormously.