Core
Values are either a byte (8-bit unsigned integer) or a double (16-bit unsigned integer stored as two bytes, big endian). Values are wrapped on overflow and underflow.
The program memory is an array of 65536 bytes that holds the current program. To load a program, set every byte to zero, set the stack and instruction pointers to zero, then copy the program into memory, starting at address zero.
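The loading procedure can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Machine` class and its field names are hypothetical, and rejecting a program larger than memory is an assumption, since the text does not say what should happen in that case.

```python
MEMORY_SIZE = 65536


class Machine:
    """Hypothetical container for the state that loading resets."""

    def __init__(self):
        self.memory = bytearray(MEMORY_SIZE)
        self.ip = 0        # instruction pointer
        self.wst_ptr = 0   # working stack pointer
        self.rst_ptr = 0   # return stack pointer

    def load(self, program: bytes) -> None:
        # Assumption: an oversized program is an error.
        if len(program) > MEMORY_SIZE:
            raise ValueError("program does not fit in memory")
        # Set every byte to zero, then reset the pointers.
        self.memory = bytearray(MEMORY_SIZE)
        self.ip = 0
        self.wst_ptr = 0
        self.rst_ptr = 0
        # Copy the program into memory, starting at address zero.
        self.memory[:len(program)] = program
```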
The stacks (working stack and return stack) are each an array of 256 bytes with an 8-bit pointer. To push a byte, write the byte to the address referenced by the pointer and then increment the pointer. To pop a byte, decrement the pointer and then read from the referenced address. To push a double, push the high byte first. To pop a double, pop the low byte first.
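The push and pop rules above can be sketched like this. The `Stack` class and its method names are hypothetical. The 8-bit pointer wraps around, so popping from an empty stack moves the pointer to 255 rather than failing.

```python
class Stack:
    """256-byte stack with an 8-bit pointer that wraps on overflow and underflow."""

    def __init__(self):
        self.data = bytearray(256)
        self.ptr = 0

    def push_byte(self, value: int) -> None:
        # Write at the pointer, then increment the pointer.
        self.data[self.ptr] = value & 0xFF
        self.ptr = (self.ptr + 1) & 0xFF

    def pop_byte(self) -> int:
        # Decrement the pointer, then read from the referenced address.
        self.ptr = (self.ptr - 1) & 0xFF
        return self.data[self.ptr]

    def push_double(self, value: int) -> None:
        # High byte first, so the low byte ends up on top.
        self.push_byte(value >> 8)
        self.push_byte(value)

    def pop_double(self) -> int:
        # Low byte first, mirroring push_double.
        low = self.pop_byte()
        high = self.pop_byte()
        return (high << 8) | low
```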
The device bus is an array of 256 ports, grouped into 16 slots. Each slot connects to a different device. Reading a byte from a port will return a byte from the device, and writing a byte will send a byte to the device. The system will pause until the read or write finishes. Reading or writing a double will read or write the high byte at the given port and the low byte at the following port.
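A rough sketch of double-width port access follows. Several details here are assumptions not stated in the text: each port is backed by plain storage instead of a real device, the slot number is taken to be the upper four bits of the port number, the port number wraps from 255 to 0, and the high byte is accessed before the low byte. The `DeviceBus` class and its method names are hypothetical.

```python
class DeviceBus:
    """256 ports backed by plain storage (a stand-in for real devices)."""

    def __init__(self):
        self.ports = bytearray(256)

    @staticmethod
    def slot_of(port: int) -> int:
        # Assumption: 16 consecutive ports per slot, slot = upper four bits.
        return (port & 0xFF) >> 4

    def read_byte(self, port: int) -> int:
        return self.ports[port & 0xFF]

    def write_byte(self, port: int, value: int) -> None:
        self.ports[port & 0xFF] = value & 0xFF

    def read_double(self, port: int) -> int:
        # The low byte comes from the following port.
        high = self.read_byte(port)
        low = self.read_byte(port + 1)  # assumption: wraps past port 255
        return (high << 8) | low

    def write_double(self, port: int, value: int) -> None:
        self.write_byte(port, value >> 8)
        self.write_byte(port + 1, value)
```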
The processor contains a 16-bit instruction pointer. To execute a processor cycle, read an instruction byte from the program memory address referenced by this pointer, increment the pointer, and then execute the instruction.
The upper three bits are mode flags. If 0x80 is set, swap the stacks for this cycle. If 0x40 is set, values of unknown size are doubles, else bytes. If 0x20 is set, the first byte or double to be popped will instead be read from the following bytes of program memory, high byte first, incrementing the instruction pointer with each byte read.
The lower five bits are the operation. WST means “working stack”, RST means “return stack”, IP means “instruction pointer”. Pushing and popping is to the working stack by default.
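Splitting an instruction byte into its mode flags and operation might look like this. The `Decoded` type and its field names are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple


class Decoded(NamedTuple):
    swap_stacks: bool  # 0x80: swap the stacks for this cycle
    wide: bool         # 0x40: values of unknown size are doubles
    immediate: bool    # 0x20: first popped value comes from program memory
    operation: int     # lower five bits


def decode(instruction: int) -> Decoded:
    return Decoded(
        swap_stacks=bool(instruction & 0x80),
        wide=bool(instruction & 0x40),
        immediate=bool(instruction & 0x20),
        operation=instruction & 0x1F,
    )
```

For example, `decode(0x61)` yields operation 0x01 with the 0x40 and 0x20 flags set and the 0x80 flag clear.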
- 0x00 Halt if no mode flag is set.
- 0x01 Pop x from RST, push x to WST.
- 0x02 Pop x from WST.
- 0x03 Pop x from RST, push x to RST, push x to WST.
- 0x04 Pop x, push x, x.
- 0x05 Pop y, x, push x, y, x.
- 0x06 Pop y, x, push y, x.
- 0x07 Pop z, y, x, push y, z, x.
- 0x08 Pop double a, write a to IP.
- 0x09 Pop double a, push IP to RST, write a to IP.
- 0x0A Pop double a, pop t. If t is not zero, write a to IP.
- 0x0B Pop double a, pop t. If t is not zero, push IP to RST, write a to IP.
- 0x0C Pop double a, read v from memory address a, push v.
- 0x0D Pop double a, pop v, write v to memory address a.
- 0x0E Pop byte p, read v from device port p, push v.
- 0x0F Pop byte p, pop v, write v to device port p.
- 0x10 Pop y, x, push y plus x.
- 0x11 Pop y, x, push y minus x.
- 0x12 Pop x, push x plus 1.
- 0x13 Pop x, push x minus 1.
- 0x14 Pop y, x. If x is less than y, push byte 0xFF, else 0x00.
- 0x15 Pop y, x. If x is greater than y, push byte 0xFF, else 0x00.
- 0x16 Pop y, x. If x is equal to y, push byte 0xFF, else 0x00.
- 0x17 Pop y, x, push x, y. If x is not equal to y, push byte 0xFF, else 0x00.
- 0x18 Pop byte y, pop x, push x bit-shifted left by y bits.
- 0x19 Pop byte y, pop x, push x bit-shifted right by y bits.
- 0x1A Pop byte y, pop x, push x bit-rotated left by y bits.
- 0x1B Pop byte y, pop x, push x bit-rotated right by y bits.
- 0x1C Pop y, x, push x bitwise-OR y.
- 0x1D Pop y, x, push x bitwise-XOR y.
- 0x1E Pop y, x, push x bitwise-AND y.
- 0x1F Pop x, push bitwise-NOT x.
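To show how the mode flags interact with the operations, here is a small interpreter sketch covering only operations 0x00, 0x01, 0x02, 0x04, 0x10, and 0x12. All class and method names are hypothetical. Two behaviours are assumptions: operation 0x00 with any mode flag set is treated as doing nothing, and the program is assumed to fit in memory.

```python
class Stack:
    def __init__(self):
        self.data = bytearray(256)
        self.ptr = 0

    def push(self, value, wide):
        # A double is pushed high byte first.
        for byte in ([value >> 8, value] if wide else [value]):
            self.data[self.ptr] = byte & 0xFF
            self.ptr = (self.ptr + 1) & 0xFF

    def pop(self, wide):
        # A double is popped low byte first.
        value = 0
        for shift in ((0, 8) if wide else (0,)):
            self.ptr = (self.ptr - 1) & 0xFF
            value |= self.data[self.ptr] << shift
        return value


class Machine:
    def __init__(self, program):
        self.memory = bytearray(65536)
        self.memory[:len(program)] = program
        self.ip = 0
        self.wst, self.rst = Stack(), Stack()
        self.halted = False
        self.immediate = False

    def fetch(self):
        byte = self.memory[self.ip]
        self.ip = (self.ip + 1) & 0xFFFF
        return byte

    def pop(self, stack, wide):
        # With flag 0x20, the first pop reads from program memory instead.
        if self.immediate:
            self.immediate = False
            value = self.fetch()
            return (value << 8) | self.fetch() if wide else value
        return stack.pop(wide)

    def step(self):
        instruction = self.fetch()
        op = instruction & 0x1F
        wide = bool(instruction & 0x40)
        self.immediate = bool(instruction & 0x20)
        swap = bool(instruction & 0x80)
        wst, rst = (self.rst, self.wst) if swap else (self.wst, self.rst)
        mask = 0xFFFF if wide else 0xFF
        if op == 0x00:
            # Halt only when no mode flag is set (otherwise: assumed no-op).
            self.halted = instruction == 0x00
        elif op == 0x01:
            wst.push(self.pop(rst, wide), wide)
        elif op == 0x02:
            self.pop(wst, wide)
        elif op == 0x04:
            x = self.pop(wst, wide)
            wst.push(x, wide)
            wst.push(x, wide)
        elif op == 0x10:
            y = self.pop(wst, wide)
            x = self.pop(wst, wide)
            wst.push((x + y) & mask, wide)
        elif op == 0x12:
            wst.push((self.pop(wst, wide) + 1) & mask, wide)
        else:
            raise NotImplementedError(f"operation 0x{op:02X} not in this sketch")

    def run(self, max_cycles=10_000):
        for _ in range(max_cycles):
            if self.halted:
                break
            self.step()
```

Running the bytes `21 02 21 03 10 00` pushes 0x02 and 0x03 from program memory, adds them, and halts with 0x05 on the working stack.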
Assembler
A source file is a text file with extension .brc that assembles to a program file. Each character in the source file is a Unicode scalar value. A program file is a binary file with extension .br.
A token is a sequence of characters. To assemble a program, parse the source file as a sequence of tokens, then convert each token to a sequence of bytes, concatenating the sequences to make the program file.
To parse the source file, iterate over each character and collect them into tokens according to the following rules:
- If no token is in progress, ignore characters U+0000 to U+0020. ', ", or ( begin a span token. Collect up to and including the next ', ", or ), respectively.
- All other characters begin a word token. If it is ), [, ], {, }, ;, or :, collect no further characters. Otherwise, collect up to and including the next :, or up to and excluding the next (, ), [, ], {, }, ;, or character in range U+0000 to U+0020, or end of file, whichever comes first.
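The parsing rules above can be sketched as a small tokenizer. The function and constant names are hypothetical. One behaviour is an assumption: a span token whose closing character never appears is collected up to the end of the file, since the rules do not cover that case.

```python
SPAN_CLOSERS = {"'": "'", '"': '"', "(": ")"}
SINGLE_CHAR_WORDS = set(")[]{};:")
WORD_STOPS = set("()[]{};")


def is_ignored(ch: str) -> bool:
    # Characters U+0000 to U+0020 separate tokens.
    return ch <= "\u0020"


def tokenize(source: str) -> list[str]:
    tokens = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if is_ignored(ch):
            i += 1
            continue
        start = i
        i += 1
        if ch in SPAN_CLOSERS:
            # Span token: up to and including the matching closer.
            end = source.find(SPAN_CLOSERS[ch], i)
            i = n if end == -1 else end + 1  # assumption: unterminated runs to EOF
        elif ch not in SINGLE_CHAR_WORDS:
            # Word token: include a trailing ':', exclude any other stop.
            while i < n:
                nxt = source[i]
                if nxt == ":":
                    i += 1
                    break
                if nxt in WORD_STOPS or is_ignored(nxt):
                    break
                i += 1
        tokens.append(source[start:i])
    return tokens
```

For example, `tokenize("%TWO 02; (note) @main PSH:05")` returns `['%TWO', '02', ';', '(note)', '@main', 'PSH:', '05']`.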
The address of a token is equal to the number of bytes in the program before it. Addresses are unsigned 16-bit integers, big-endian. The first character of a token determines how it will be converted to bytes, as follows:
- (, ), [, and ] denote a comment, assembling to nothing.
- { and } denote an opening or closing block delimiter, respectively. A closing delimiter matches the closest previous unmatched opening delimiter. An opening delimiter assembles to the address of the closing delimiter. A closing delimiter assembles to nothing.
- @ and & denote a global or local label definition, respectively. The remaining characters are the identifier. The name of a global label is the identifier. The name of a local label is that of the most recent global label (if any), followed by / and the identifier. A label definition assembles to nothing.
- % and ; denote a macro definition or terminator, respectively. A macro definition token matches the next terminator token, and the sequence of tokens between the two is called the macro body. The remaining characters of the definition token are the name of the macro. The body cannot contain an unmatched block delimiter, or a label or macro definition. The definition, body, and terminator assemble to nothing.
- ' and " denote a raw or terminated string, respectively. The remaining characters are the string content. Both tokens assemble to the string content as a UTF-8 encoded byte sequence. A terminated string is followed by a zero byte.
- # denotes padding. The remaining characters are the pad value, which must be a two or four digit hexadecimal value. Padding assembles to a number of zero bytes equal to the pad value.
- Any other character denotes either a literal or a symbol. A two or four digit hexadecimal number is a literal, and assembles to a byte or double, respectively.
- Any other token is a symbol. If the first character is ~, replace it with the name of the most recent global label (if any), followed by /. If the symbol matches the name of a previous macro definition, replace the symbol with the macro body and assemble the body in its place. If the symbol matches the name of a previous or future label definition, it assembles to the address of that label definition.
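The simplest conversions (comments, strings, padding, and literals) can be sketched as below. The function names are hypothetical, and tokens that need surrounding context (block delimiters, labels, macros, symbols) are left out. Two details are assumptions: the closing ' or " of a string token is not part of the string content, and hexadecimal digits are accepted in either case.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def is_hex_value(text: str) -> bool:
    # A two or four digit hexadecimal value.
    return len(text) in (2, 4) and all(c in HEX_DIGITS for c in text)


def assemble_simple(token: str) -> bytes:
    first, rest = token[0], token[1:]
    if first in "()[]":
        return b""  # comment
    if first in "'\"":
        # Assumption: drop the closing delimiter collected by the parser.
        content = rest[:-1] if rest.endswith(first) else rest
        data = content.encode("utf-8")
        return data + b"\x00" if first == '"' else data
    if first == "#":
        if not is_hex_value(rest):
            raise ValueError(f"invalid pad value: {rest!r}")
        return bytes(int(rest, 16))  # that many zero bytes
    if is_hex_value(token):
        return bytes.fromhex(token)  # byte, or double in big-endian order
    raise NotImplementedError(f"token needs context: {token!r}")
```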
The following macro definitions are built into the assembler, preceding every program:
%HLT 00; %NOP 20; %DB1 40; %DB2 60; %DB3 80; %DB4 A0; %DB5 C0; %DB6 E0;
%PSH 01; %PSH: 21; %PSH* 41; %PSH*: 61; %PSHr 81; %PSHr: A1; %PSHr* C1; %PSHr*: E1;
%: 21; %*: 61; %r: A1; %r*: E1;
%POP 02; %POP: 22; %POP* 42; %POP*: 62; %POPr 82; %POPr: A2; %POPr* C2; %POPr*: E2;
%CPY 03; %CPY: 23; %CPY* 43; %CPY*: 63; %CPYr 83; %CPYr: A3; %CPYr* C3; %CPYr*: E3;
%DUP 04; %DUP: 24; %DUP* 44; %DUP*: 64; %DUPr 84; %DUPr: A4; %DUPr* C4; %DUPr*: E4;
%OVR 05; %OVR: 25; %OVR* 45; %OVR*: 65; %OVRr 85; %OVRr: A5; %OVRr* C5; %OVRr*: E5;
%SWP 06; %SWP: 26; %SWP* 46; %SWP*: 66; %SWPr 86; %SWPr: A6; %SWPr* C6; %SWPr*: E6;
%ROT 07; %ROT: 27; %ROT* 47; %ROT*: 67; %ROTr 87; %ROTr: A7; %ROTr* C7; %ROTr*: E7;
%JMP 08; %JMP: 28; %JMP* 48; %JMP*: 68; %JMPr 88; %JMPr: A8; %JMPr* C8; %JMPr*: E8;
%JMS 09; %JMS: 29; %JMS* 49; %JMS*: 69; %JMSr 89; %JMSr: A9; %JMSr* C9; %JMSr*: E9;
%JCN 0A; %JCN: 2A; %JCN* 4A; %JCN*: 6A; %JCNr 8A; %JCNr: AA; %JCNr* CA; %JCNr*: EA;
%JCS 0B; %JCS: 2B; %JCS* 4B; %JCS*: 6B; %JCSr 8B; %JCSr: AB; %JCSr* CB; %JCSr*: EB;
%LDA 0C; %LDA: 2C; %LDA* 4C; %LDA*: 6C; %LDAr 8C; %LDAr: AC; %LDAr* CC; %LDAr*: EC;
%STA 0D; %STA: 2D; %STA* 4D; %STA*: 6D; %STAr 8D; %STAr: AD; %STAr* CD; %STAr*: ED;
%LDD 0E; %LDD: 2E; %LDD* 4E; %LDD*: 6E; %LDDr 8E; %LDDr: AE; %LDDr* CE; %LDDr*: EE;
%STD 0F; %STD: 2F; %STD* 4F; %STD*: 6F; %STDr 8F; %STDr: AF; %STDr* CF; %STDr*: EF;
%ADD 10; %ADD: 30; %ADD* 50; %ADD*: 70; %ADDr 90; %ADDr: B0; %ADDr* D0; %ADDr*: F0;
%SUB 11; %SUB: 31; %SUB* 51; %SUB*: 71; %SUBr 91; %SUBr: B1; %SUBr* D1; %SUBr*: F1;
%INC 12; %INC: 32; %INC* 52; %INC*: 72; %INCr 92; %INCr: B2; %INCr* D2; %INCr*: F2;
%DEC 13; %DEC: 33; %DEC* 53; %DEC*: 73; %DECr 93; %DECr: B3; %DECr* D3; %DECr*: F3;
%LTH 14; %LTH: 34; %LTH* 54; %LTH*: 74; %LTHr 94; %LTHr: B4; %LTHr* D4; %LTHr*: F4;
%GTH 15; %GTH: 35; %GTH* 55; %GTH*: 75; %GTHr 95; %GTHr: B5; %GTHr* D5; %GTHr*: F5;
%EQU 16; %EQU: 36; %EQU* 56; %EQU*: 76; %EQUr 96; %EQUr: B6; %EQUr* D6; %EQUr*: F6;
%NQK 17; %NQK: 37; %NQK* 57; %NQK*: 77; %NQKr 97; %NQKr: B7; %NQKr* D7; %NQKr*: F7;
%SHL 18; %SHL: 38; %SHL* 58; %SHL*: 78; %SHLr 98; %SHLr: B8; %SHLr* D8; %SHLr*: F8;
%SHR 19; %SHR: 39; %SHR* 59; %SHR*: 79; %SHRr 99; %SHRr: B9; %SHRr* D9; %SHRr*: F9;
%ROL 1A; %ROL: 3A; %ROL* 5A; %ROL*: 7A; %ROLr 9A; %ROLr: BA; %ROLr* DA; %ROLr*: FA;
%ROR 1B; %ROR: 3B; %ROR* 5B; %ROR*: 7B; %RORr 9B; %RORr: BB; %RORr* DB; %RORr*: FB;
%IOR 1C; %IOR: 3C; %IOR* 5C; %IOR*: 7C; %IORr 9C; %IORr: BC; %IORr* DC; %IORr*: FC;
%XOR 1D; %XOR: 3D; %XOR* 5D; %XOR*: 7D; %XORr 9D; %XORr: BD; %XORr* DD; %XORr*: FD;
%AND 1E; %AND: 3E; %AND* 5E; %AND*: 7E; %ANDr 9E; %ANDr: BE; %ANDr* DE; %ANDr*: FE;
%NOT 1F; %NOT: 3F; %NOT* 5F; %NOT*: 7F; %NOTr 9F; %NOTr: BF; %NOTr* DF; %NOTr*: FF;
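The built-in macro names follow a regular pattern: each suffix corresponds to one combination of the three mode flags, with : for 0x20, * for 0x40, and r for 0x80. The sketch below regenerates the table from that pattern. The function name and the dictionary representation are hypothetical, chosen only for illustration.

```python
OPERATION_NAMES = (
    "PSH POP CPY DUP OVR SWP ROT JMP JMS JCN JCS LDA STA LDD STD "
    "ADD SUB INC DEC LTH GTH EQU NQK SHL SHR ROL ROR IOR XOR AND NOT"
).split()

# Index i in this list corresponds to mode bits i << 5.
SUFFIXES = ["", ":", "*", "*:", "r", "r:", "r*", "r*:"]

# Operation 0x00 has a distinct name for each mode combination.
ZERO_NAMES = ["HLT", "NOP", "DB1", "DB2", "DB3", "DB4", "DB5", "DB6"]


def builtin_macros() -> dict[str, int]:
    macros = {}
    for mode, name in enumerate(ZERO_NAMES):
        macros[name] = mode << 5
    for opcode, name in enumerate(OPERATION_NAMES, start=1):
        for mode, suffix in enumerate(SUFFIXES):
            macros[name + suffix] = (mode << 5) | opcode
    # The bare suffixes :, *:, r:, and r*: alias the matching PSH variants.
    for suffix in (":", "*:", "r:", "r*:"):
        macros[suffix] = macros["PSH" + suffix]
    return macros
```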