Peak

Peak will be a high-level dynamic language for the Bedrock virtual computer system.

Motivation

Bedrock is designed to be easy to implement on as many platforms as possible, down to the smallest and least capable microcontrollers. The assembly language is similarly low-level, which imposes a high barrier to entry for people who are unfamiliar with low-level programming.

Wouldn’t it be great if there was a way for people to easily write programs in a high-level language that will then be able to run on any platform ever created? This is the niche that Peak aims to fill.

Constraints

Peak will be a beginner-friendly, high-level, dynamic language that runs on Bedrock. This means that:

  • the syntax must be intuitive and familiar
    This language is aimed first and foremost at beginner programmers, who don’t necessarily have the tenacity to learn a novel syntax.
  • the language must provide high-level constructs
    Programmers shouldn’t have to know anything about the design of the underlying Bedrock system in order to write a program (“a programming language is low level when its programs require attention to the irrelevant”).
  • programs must compile down to regular Bedrock programs
    In order for Peak programs to be able to run on any platform supported by Bedrock, Peak programs must compile down to plain old Bedrock programs.
  • programs must be able to be authored from inside Bedrock
    If Peak programs can be compiled using a Bedrock program, then programs will be able to be developed on any platform supported by Bedrock. This would also allow the implementation of an eval function within the language.
  • programs must be able to link to native Bedrock code
    Programs written with Peak are expected to be a couple of magnitudes slower than if they were written directly in Bedrock assembler. In order to accelerate performance-critical code and to allow users to dip their toes into writing raw Bedrock code, there needs to be a mechanism for assembling Bedrock code from within a Peak program and then dynamically loading and calling that code.

Prior art

The plan

My biggest inspiration for this project is Lil (and by extension Lua). The language is small, focused, and very expressive.

The language will support a small number of primitive types:

  • nil
  • integer
  • list
  • dictionary
  • function

Some miscellaneous notes:

  • I probably also want a special string type, for efficiency and compactness, and a special character type. They’ll be able to convert losslessly between lists and integers, respectively.
  • Peak will have no runtime errors. Any syntactically valid program will also be a semantically valid program.
  • I really like the conforming behaviour of Lil, where variables of different types act as the higher type of the two under a variety of operations. This is cheaper than explicit iteration, because the loop can be implemented as a native Bedrock function.
  • I’m interested in implementing Lua coroutines as a flexible control flow mechanism.
  • I don’t have specific commitments for the syntax at the moment. Programs will assemble down to Peak bytecode, so we aren’t limited to having just the one syntax. I want the syntax to be intuitive for people who have never programmed before in their life.
  • Uninitialised variables have a value of nil, and are initialised when written to.
  • Values are passed by reference and copied on write. Arrays and dictionaries are shallow-copied. Will this be unacceptably expensive?
  • Variables are stored in a global dictionary.
  • Everything is an expression.

Optimisations

  • A large portion of the 16-bit heap pointer address space represents immediate integers (say, integer values -4096 to 4095). If an integer falls within this interval, the value is stored directly in the pointer instead of as a variable-length integer on the heap. The integer should be stored in the lower bits, with the highest bit being 1 to indicate whether the value is a pointer or an immediate integer.
  • The zero address (could be the zero immediate integer) represents the nil value. Writing a nil value to a variable will drop that variable. Writing a zero integer, an empty list, or an empty dictionary to a variable will also drop that variable. I’m not sure if this is a bad idea or not; it wouldn’t necessarily be any better than waiting until a variable is out of scope to drop it, and it adds reinitialisation overhead for variables that are repeatedly set and cleared.

Memory

I don’t have enough experience in memory allocation schemes or garbage collection to be confident here, but this is how I think it’ll probably work.

All values are stored on the heap, and are referenced by a heap pointer. Heap pointers are ideally 16-bits wide, to make them easier to work with in Bedrock. The first byte of a value contains type information and metadata, allowing the value to be read correctly.

The heap is stored using the memory device, in order to free program memory for the runtime and current program. If all available heap memory is exhausted, the program will gracefully halt. Current values are referenced by heap pointers stored in program memory, either on a stack or inside an allocation table. I’m not sure what information the garbage collector will need.

Nil

Represents the absence of a value. Promotes to the number zero, the empty list, or the empty dictionary.

Numbers

Peak will only support signed integer values, there will be no floating point support. This is the biggest difference I can think of from Lua/Lil, but it’s justified by there being no floating point behaviour to build off of in Bedrock.

Integers are stored as signed variable-length values, encoded using LEB128. Equality checks can terminate early if a non-matching byte is found, but ordering checks will have to traverse the entire integer.

Integers are used as boolean values where needed. A value of 0 is false, a value of 1 is true, and all other integer values are truthy. When a non-integer value is treated as a boolean, empty and nil values are falsey, and every other value is truthy.

Lists

Lists are stored as an array of heap pointers. They’ll need to reserve a chunk of space for future items, like the Vec type in Rust. When an item is inserted past the next index, the list is converted to a dictionary with integer indices.

When a non-list value is treated as a list, it will be promoted to a list containing itself.

Dictionaries

Dictionaries are stored as a hashmap, with arbitrary values as keys. I don’t know enough yet about how to implement a hashmap or a good hashing algorithm.

When a non-dictionary value is treated as a dictionary, it will first be promoted to a list, and then that list will be promoted to a dictionary with the keys and values being the indices and values of the list.

Functions

Functions have a list of named arguments, acting as variables within the function. If too few arguments are passed to a function, the remaining arguments evaluate to nil. Argument values are passed using a stack.

The value of the final expression in the function is used as an implicit return value. A value can also be returned explicitly with a return keyword. Exactly one value will be returned by a function, with nil being returned by default.

Functions can access variables in outer scopes, like a closure.

A non-function value will return itself when called.

Control flow

  • if
  • for
  • while
    Loop while a condition is true.
  • until
    Loop until a condition is satisfied.