Compilers

spring 2024

THIS IS AN OLD VERSION - Find the 2025 course here

Language spec

This is a specification for the base language that we build in chapters 2-7. This page is meant as compact reference, not as an introduction.

This specification does not include any of the additional features specified in the project.

Example

The language we’re going to build looks like this:

var n: Int = read_int();
print_int(n);
while n > 1 do {
    if n % 2 == 0 then {
        n = n / 2;
    } else {
        n = 3*n + 1;
    }
    print_int(n);
}

Syntax

(Most relevant for chapter 3)

This specifies the structure of a valid program in our language.

An expression is defined recursively as follows, where E, E1, E2, … En represent some other arbitrary expression.

Variable declarations (var ...) are allowed only directly inside blocks ({ ... }) and in top-level expressions.

Precedences:

  1. =
  2. or
  3. and,
  4. ==, !=
  5. <, <=, >, >=
  6. +, -
  7. *, /, %
  8. Unary - and not
  9. All other constructs: literals, identifiers, if, while, var, blocks, parentheses, function calls.

All non-operator expressions such as if, while and function calls must be (syntactically) allowed to be part of other expressions, so e.g. 1 + if true then 2 else 3 must be allowed.

A type expression is defined recursively as follows, where T, T1, T2, … Tn represent some other arbitrary type expression.

The program consists of a single expression, called the top-level expression. If multiple expressions are given, they are treated like a block without { and }.

Arbitrary amounts of whitespace are allowed between tokens. One-line comments starting with # or // are supported.

Semantics

(Most relevant for chapter 4)

This specifies specify how a program should work.

This works directly as a specification for an interpreter. A compiler should produce a program that behaves the same way.

We don’t specify type-checking here.

A value may be one of the following:

* You don’t need to support any use of function values other than calling them directly.

A context consists of:

A lookup of identifier ID in context C proceeds as follows:

An expression is evaluated with a given context. The result of expression evaluation is a value and optionally a modification of the context’s locals.

An expression in a context C is evaluated as follows.

When evaluating the top-level expression, the initial context has no parent context and has the built-in functions and operators as its locals.

The built-in functions are:

The output of a program whose evaluation succeeds consists of the outputs of any built-in print functions that were evaluated, followed by the result of its top-level expression, if it was an integer or a boolean.