Introduction to Torque

This article is a gentle introduction to the Torque meta-assembler, explaining the core concepts of the language and showing examples to help with writing programs.

What is Torque?

Torque is a language and assembler for writing low-level computer programs. These programs assemble down to a sequence of processor words, with each word being a sequence of bits of any length (most modern processors address memory in eight-bit bytes, but other systems might use words of any length). These words have no meaning on their own; it’s up to you to provide that meaning by running the assembled program on a processor.

It’s important to note that Torque is a meta-assembler. It doesn’t hard-code behaviours for specific processor architectures like you might expect from a normal assembler: instead, you implement these behaviours yourself as macros, using the datasheet for the target processor as a guide. This means that you won’t be able to write a working program without understanding the processor that you’re writing for, but it also means that you’ll be able to write programs for any processor you can imagine using a single language.

To download a copy of Torque, go to the project page.

Getting started

The base element of every Torque program is the word template, which assembles down to a single word in the final program. A word template starts with a # character, followed by a sequence of 1 and 0 bits (with _ characters to make the template more readable). We’ll use an eight-bit word here for familiarity.

Comments in Torque are wrapped in parentheses (like this), which is a syntax borrowed from Bedrock.

( Assembles to the byte 0x41, or letter A in ASCII. )
#0100_0001
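
To make the bit pattern concrete, here is a small Python sketch of how a constant word template could be read as a number. It is only a model for illustration, not how Torque is implemented, and the parse_word_template name is made up for this example.

```python
def parse_word_template(template: str) -> tuple[int, int]:
    """Return (value, width_in_bits) for a constant template like '#0100_0001'."""
    if not template.startswith("#"):
        raise ValueError("a word template starts with '#'")
    # Underscores are only visual separators, so drop them.
    bits = template[1:].replace("_", "")
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("this sketch only handles constant 0/1 templates")
    return int(bits, 2), len(bits)


value, width = parse_word_template("#0100_0001")
print(hex(value), width, chr(value))  # 0x41 8 A
```

The width comes from counting the bits in the template, which is why the same notation can describe words of any length.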

Macros

To save us from having to write the entire program in binary, we can package the template up into a one-argument macro. Replacing every bit in the template with the name of the macro argument causes that range of bits in the word to be replaced with the argument value when invoked (the argument name will need to be a single letter in order to pull this off). This allows us to inject any integer value directly into the word.

Macro definitions start with a % character, followed by the macro name, then followed by an argument list (using the : character as a separator). After this comes the body of the macro, which is just a word template in our case, and a ; character to finish. Macros are invoked by typing a name followed by an argument list, with the whole invocation being replaced by the macro body when assembled.

( A macro definition, taking the single argument n. )
%BYTE:n  #nnnn_nnnn ;

( Assembles to the bytes 0x41 0x41 0x41. )
BYTE:0x41  ( Invokes the macro with a hexadecimal literal. )
BYTE:'A'   ( Invokes the macro with a character literal.   )
BYTE:65    ( Invokes the macro with a decimal literal.     )

Instructions

Macros are most often used to encode instructions for a target architecture. The PIC mid-range microcontroller family encodes instructions as 14-bit words, with operands packed into each instruction. The BSF instruction is used to set a specific bit inside a register on the chip, and takes as operands the index of the bit and the address of the register.

We can turn this instruction into a macro and then use it just like we would a built-in instruction in a normal assembler.

( Sets bit b of register f. )
%BSF:f:b  #01_01bb_bfff_ffff;

( Using the instruction to set bit 3 of register 0x05. )
BSF:0x05:3

This can be made more readable by using macros to name our constants. If we wanted, we could have also chosen a different, more memorable name for the instruction (such as BSET or SETBIT).

%GPIO   0x05;
%PIN-3     3;

( Set pin 3 of the GPIO pins. )
BSF:GPIO:PIN-3
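
If it helps to see the packing spelled out, the following Python sketch models how argument values could be dropped into the lettered fields of a word template. The fill_template function is hypothetical, and it assumes that a value wider than its field simply loses its high bits; the real assembler may well handle that case differently.

```python
def fill_template(template: str, **fields: int) -> int:
    """Fill the single-letter fields of a word template and return the word."""
    bits = template.lstrip("#").replace("_", "")
    # Number of bit positions still to be filled for each field letter.
    remaining = {name: bits.count(name) for name in fields}
    word = 0
    for ch in bits:
        if ch in "01":
            bit = int(ch)  # constant bit taken straight from the template
        else:
            # Fields are filled from their most significant bit downwards.
            # Assumption: bits of the value that do not fit are dropped.
            remaining[ch] -= 1
            bit = (fields[ch] >> remaining[ch]) & 1
        word = (word << 1) | bit
    return word


# BSF:0x05:3 from the example above, as a 14-bit word.
print(hex(fill_template("#01_01bb_bfff_ffff", f=0x05, b=3)))  # 0x1585
# BYTE:0x41 from the earlier section.
print(hex(fill_template("#nnnn_nnnn", n=0x41)))  # 0x41
```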

Data types

There are three types of value that we can use in a Torque program: integers, lists, and blocks. We’ve already seen integers (decimal 65, hexadecimal 0x41, and character 'A' are all integers) and blocks (word templates are blocks, as is anything that assembles down to a sequence of words).

Lists

The third type of value is a list, which holds a sequence of integers. These can be written either as a list of integers inside square brackets (like [65 66 67]), or as a string of characters inside double-quotes (like "ABC").

We use lists by passing them into macros as arguments. Our BYTE macro from earlier takes a single integer value as an argument, but we can choose to pass a list value instead. This will cause the macro to be invoked once for every integer in the list.

( These will all assemble to the bytes 0x41 0x42 0x43. )
BYTE:[65 66 67]
BYTE:['A' 'B' 'C']
BYTE:"ABC"

We can also create a macro that explicitly takes a list as an argument by wrapping the argument name in square brackets; passing a list will now invoke the macro exactly once, with the argument receiving the entire list.

( Encodes a null-terminated string. )
%STRING:[chars]  BYTE:chars BYTE:0 ;

( Assembles to the bytes 0x41 0x42 0x43 0x00. )
STRING:"ABC"
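
The difference between the two kinds of argument can be modelled in a few lines of Python. This is a sketch of the behaviour described above rather than of Torque's internals, and the function names (byte, invoke, string) are invented for the example.

```python
def byte(n: int) -> bytes:
    """Model of the BYTE macro: one integer in, one 8-bit word out."""
    return bytes([n & 0xFF])


def invoke(macro, arg) -> bytes:
    """Invoke an integer-taking macro once per item when given a list."""
    if isinstance(arg, list):
        return b"".join(macro(item) for item in arg)
    return macro(arg)


def string(chars: list[int]) -> bytes:
    """Model of the STRING macro: it receives the whole list in one invocation."""
    return invoke(byte, chars) + invoke(byte, 0)


print(invoke(byte, [65, 66, 67]))  # b'ABC'
print(string(list(b"ABC")))        # b'ABC\x00'
```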

Integers

Integers are passed around in Torque as signed 64-bit values, and must be passed into a word template to be assembled into a program. Many processor architectures require large integers to be split into multiple 8-bit words, which we can do with the help of expressions.

An expression is a piece of math that evaluates down to a single integer during assembly. Expressions look like lists that also contain operators, and are evaluated left-to-right (this is called postfix notation, also known as reverse Polish notation). Integers are pushed onto a stack, each operator consumes values from the top of that stack and pushes its result back, and the value of the expression is the single value left on the stack at the end.

( Assembles to the byte 0x03. )
BYTE:[1 2 +]

We can break an integer down into bytes with the <shr> (shift-right) and <and> (bitwise-and) operators.

( Encodes an integer as a big-endian 16-bit value. )
%16BE:n
  BYTE:[n    8 <shr>]   ( shift n right by 8 )
  BYTE:[n 0xFF <and>];  ( keep lowest 8 bits )

( Assembles to the bytes 0x12 0x34. )
16BE:0x1234
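
The expressions used so far can be modelled with a small stack evaluator. The Python sketch below supports only the four operators that appear in this article, and the eval_postfix name is made up for illustration; it is not a description of how Torque itself evaluates expressions.

```python
def eval_postfix(tokens: list[str]) -> int:
    """Evaluate a postfix expression given as a list of tokens."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "<shr>": lambda a, b: a >> b,
        "<and>": lambda a, b: a & b,
    }
    stack: list[int] = []
    for token in tokens:
        if token in ops:
            b = stack.pop()  # the right-hand operand was pushed last
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(int(token, 0))  # base 0 accepts both 65 and 0x41
    if len(stack) != 1:
        raise ValueError("expression must leave exactly one value")
    return stack[0]


print(eval_postfix("1 2 +".split()))                   # 3
print(hex(eval_postfix("0x1234 8 <shr>".split())))     # 0x12
print(hex(eval_postfix("0x1234 0xFF <and>".split())))  # 0x34
```

The last two lines are the two halves of the 16BE macro applied to 0x1234, which is where the bytes 0x12 and 0x34 come from.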

Integers are often also used as addresses, referencing pieces of data or code. Addresses can be calculated using a label, which is an @ character followed by an identifier. The identifier can then be used anywhere that you’d normally use an integer, with the value being the address at the label.

( Assembles to the bytes 0x00 0x02. )
@start
BYTE:start
BYTE:end
@end
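
One way to picture how a label can be used before it is defined (as end is above) is a two-pass process: first work out the address of every label, then emit the words. The Python sketch below assumes that addresses count words starting from zero; the assemble function and its tuple format are hypothetical, and Torque's actual strategy isn't described in this article.

```python
def assemble(items: list[tuple[str, object]]) -> bytes:
    """Resolve labels and emit one byte per ("byte", value-or-label) item."""
    # Pass 1: a label takes the address of the next word to be emitted.
    labels: dict[str, int] = {}
    address = 0
    for kind, arg in items:
        if kind == "label":
            labels[arg] = address
        else:
            address += 1
    # Pass 2: emit each byte, replacing label names with their addresses.
    out = []
    for kind, arg in items:
        if kind == "byte":
            value = labels[arg] if isinstance(arg, str) else arg
            out.append(value & 0xFF)
    return bytes(out)


program = [
    ("label", "start"),
    ("byte", "start"),
    ("byte", "end"),
    ("label", "end"),
]
print(assemble(program).hex(" "))  # 00 02
```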

A slightly different form of label (called a local label) is needed inside a macro definition. A local label is defined with an & character instead of an @ character, and is referenced by prefixing its name with a ~ character. This allows the label to be duplicated safely with each invocation of the macro.

( Encodes a length-prefixed string. )
%STRING:[chars]
  BYTE:[~end ~start -]
  &start BYTE:chars &end;

( Assembles to the bytes 0x03 0x41 0x42 0x43. )
STRING:"ABC"

Blocks

We’ve talked mostly about integers so far, but we can pass blocks of the program around as values too. Whole chunks of program can be wrapped in braces to create a single block value, and an argument name can be wrapped in braces so that it accepts a block value instead of an integer.

The following example uses a macro to generate a for loop for a hypothetical instruction set.

( Run inner n times, counting with reg. )
%FOR:reg:n:{inner}
  MOV:reg:n         ( set register to value n     )
  &loop             ( local label to loop back to )
    inner           ( the passed block argument   )
    DEC:reg         ( decrement register value    )
    CMP:reg:0       ( compare register to zero    )
    JNZ:~loop;      ( jump to &loop if not zero   )

( Call an interrupt 8 times, looping with the CX register. )
FOR:CX:8:{ INT:0x03 }

We can also conditionally include a block into a program based on the value of an integer. This can be used with recursion to generate variable-length code from a single invocation. A ? character is used for this purpose, followed by an integer value and a block value: the block value will be included in the program only if the integer is not zero.

( Assembles to n zero bytes. )
%PAD:n
  ?[n 0 >] {
    BYTE:0
    PAD:[n 1 -]
  };

( Assembles to five zero bytes. )
PAD:5
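
The recursion in PAD has the same shape as an ordinary recursive function with a guard. As a rough Python analogy (not a Torque feature, and the pad name is only for this example):

```python
def pad(n: int) -> bytes:
    """Return n zero bytes, mirroring the structure of the PAD macro."""
    if n > 0:  # plays the role of ?[n 0 >]
        # One zero byte, then the same macro again with n - 1.
        return b"\x00" + pad(n - 1)
    return b""  # the condition is zero, so the block is left out


print(pad(5).hex(" "))  # 00 00 00 00 00
```

Each invocation contributes one byte and the condition stops the expansion once n reaches zero, so the recursion always terminates for a non-negative n.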

And more

There’s so much more that Torque can do; this introduction was only a taster. To learn more, head over to the project page.