This weekly debrief is for paying supporters of my work. Please only read if you’ve paid. Thanks!
This week I’ve been working on a style guide for Bedrock, and I’ve finally locked onto a good design for the Bedrock audio devices.
Style guide
The latest piece of documentation completed for the Bedrock computer system is an assembler style guide, which documents a good way of structuring and formatting programs written in the Bedrock assembler language. It covers things like comment style, indentation, names for labels and macros, and how to notate the stack effects of functions.
The hope is that this guide makes it easier for people to write well-structured code that will scale well without having to think too hard about it, especially with how free-form the assembler language is. The guide is more of a ‘nice to have’ than an integral part of the project, but I think it helps to round things out a bit.
I still need to do a second pass over it, though. After writing it I realised that it would be better to group things based on actual usage, but I haven’t gotten around to that yet.
Audio devices
Ever since I started working on Bedrock near the start of 2024, I’ve really had no idea what I wanted out of the audio devices. Uxn handled audio well enough, although barely any programs really took advantage of it (the only programs I can think of are the exceptional piano.tal and the built-in synth in Orca). The Uxn audio devices lean on the ability of devices to freely read from program memory though, which isn’t an option for Bedrock, and they take up four whole device slots, which is more than I’m willing to spend (I’ve budgeted for just two).
Up until two weeks ago I’d relegated the task of designing these devices to an older and wiser Ben. Unfortunately, that Ben is now me, and these devices still need to be designed. In both the original Bedrock specification and the second revision, the audio devices were just a stub that says to come back later. The reason for all of this procrastination is just that audio stuff is hard. Digital samples (which are pretty much all audio these days) have to play back at around 40000 samples a second, with even slight fluctuations in timing being very noticeable, and I just didn’t know how to select the capabilities that should be supported without accidentally cornering myself in the future.
Screens are easy: thwack down 16 colours and some dithering and it’ll look great, no question. If you need it to work in monochrome, easy, just tell programmers to use a particular palette layout and most programs will continue to tick along happily with fewer colours. But what does this kind of graceful fallback behaviour look like in audio? What’s the audio equivalent of fewer colours, or lower resolution? Audio capabilities aren’t standardised to that degree; they often rely on specialised circuitry and interfaces in order to meet strict latency requirements.
Constraints
Every good design starts with a list of constraints to work within. Here’s what I put down for the audio devices:
- The design must be straightforward to implement in hardware. The raison d’être for Bedrock is to be a foundational computer system that can be implemented on anything. This probably means primitive wave synthesis, things like square and triangle waves, which can be built from basic components, and are available in pretty much every console and computer since around 1980.
- The devices need to be capable of playing interesting and varied music. As a creative direction, I want to lean further towards deeper, bassier, textured sounds, like drone or D&B. I’d even take a hurdy-gurdy. I want to avoid classic chiptune sounds, they point to a very specific, very dated period, and they’re just plain hard to listen to for long periods (here is a critically acclaimed soundtrack on the Commodore 64 as an example).
- It must be fully implementable on a Nintendo DS, to give me a ceiling and help rein me in. No particular reason for this exactly, it’s just a console with modest capabilities and a good form factor that I own and would like to use with Bedrock.
I’m specifically not looking to conform to the audio systems of older hardware though. There are just too many ways of doing things, pretty much every old system is incompatible with every other system, and music has to be recomposed for each system that a game is ported to.
No, there has to be a better way.
False starts
The first iteration of the design had a tone device and a sampler device.
The tone device would emit notes using four primitive wave synthesisers (square, saw, etc.) and the sampler device would play back up to a couple of seconds of pre-recorded PCM audio (pulse-code modulation, the standard digital format for audio samples). The waveforms of the tone device would be hard-coded, so channel 1 would always be a square wave, channel 2 a triangle, and so on.
This would allow using the basic audio machinery in old consoles, kind of. The problem is that there is no standard set of waveforms on these consoles; all audio circuitry was bespoke according to the whims of the designer and the economy. At best, you can be pretty sure of the availability of a 50% duty square wave and at least one noise channel, but anything else is a lucky dip. If I could choose, I’d want to go with two noise channels, a triangle wave, and a square wave (because I have to).
The other reason for using primitive waves only on the tone device is for implementability. The analogue circuitry needed to generate a 50% square wave, a triangle wave, or pseudo-random white noise is super simple.
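For the curious, here’s roughly what that square/triangle/noise trio looks like in software — a Python sketch, with all names mine rather than anything from the Bedrock spec. Real hardware would typically use an LFSR for the noise channel; I’ve stood in a plain random generator here for brevity.

```python
import random

def square(phase):
    """50% duty square wave: +1 for the first half of each cycle, -1 after."""
    return 1.0 if (phase % 1.0) < 0.5 else -1.0

def triangle(phase):
    """Triangle wave ramping from -1 up to +1 and back over one cycle."""
    p = phase % 1.0
    return 4.0 * p - 1.0 if p < 0.5 else 3.0 - 4.0 * p

def noise(_phase=None):
    """Pseudo-random noise; hardware would use an LFSR instead."""
    return random.uniform(-1.0, 1.0)

def render(wave, freq_hz, seconds, rate=44100):
    """Sample a waveform function at a fixed rate, returning floats in -1..1."""
    return [wave(freq_hz * n / rate) for n in range(int(seconds * rate))]
```

Each generator is just a function of phase, which is the kind of thing a handful of comparators and counters can produce in circuitry.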
I threw together a couple of small pieces of music using a square/tri/noise/noise combination of channels, using only volume control, pitch control, and an ADSR envelope. I wanted to see what kind of sounds I could squeeze out of it.
Sample 1, active and bouncy:
Sample 2, gloomy and dark:
All in all, not super exciting. There’s just no depth to it, no texture.
The idea behind the sampler device was to have a 64KB tape to load samples onto, and a pair of playheads. For each playhead, you’d set a start point and an end point within the tape and hit play, with pitch and volume controls. This would give a bank of one-shot samples, with the ability to do interesting things like loading in a single long sample and playing individual fragments by moving the end points around. Something like this finger-drumming style.
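The playhead idea is simple enough to sketch in a few lines of Python. This is just my reading of the abandoned design, not anything specified: each playhead holds a start and end point into the shared tape and plays that slice once, and pitch/volume controls are omitted.

```python
class Playhead:
    """One playhead over a shared sample tape, playing a [start, end) slice once."""
    def __init__(self, tape):
        self.tape = tape
        self.pos = None   # None means not currently playing
        self.end = 0

    def play(self, start, end):
        """Point the playhead at a slice of the tape and start playback."""
        self.pos, self.end = start, end

    def next_sample(self):
        """Return the next sample of the slice, or 0 once it has finished."""
        if self.pos is None or self.pos >= self.end:
            self.pos = None
            return 0
        sample = self.tape[self.pos]
        self.pos += 1
        return sample

tape = bytearray(64 * 1024)              # the 64KB shared sample memory
heads = [Playhead(tape), Playhead(tape)] # the pair of playheads
```

Moving the end point mid-playback is what would have enabled the finger-drumming style tricks — both heads read the same mutable tape, which is exactly the simultaneous-access headache mentioned below.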
I never carried out any testing for this idea though. I kept feeling uneasy about the implementability of it — you’d need a whole 64KB of memory for starters, which on a memory-constrained system would be more useful as another 256 pages in the memory device, and I wasn’t sure how hard it would be to implement two playheads on one block of memory, with simultaneous mutable access and all that. The complexity couldn’t possibly be worth it.
I wasn’t really sure how the tone device and sampler device were supposed to work together, either. The tones would sound dreadful, the samples would sound great, but you’d need to use both together because you’ve only got two channels for sample playback and about a second or two of tape. And what would happen when you run your program on a more constrained platform that implements the tone device but not the sampler? You’d end up hearing half a song, which would sound bad.
Second attempt
Okay, so that didn’t work out so great. What if we went in the other direction?
I love the kinds of music that come out of the Teenage Engineering OP-Z, a kind of pocket-sized sampler-sequencer. It looks like what you’d get if you crossed a small ruler with a pocket calculator and some LEGO. Here’s a quick showcase of what it does. And gosh, what a difference a few basic filters make — a low-pass filter, some reverb, a bit of polyphony, it sounds so good.
So the line of thought was that it would be better to have an audio device that is harder to implement but sounds great, rather than one that’s easy to implement but sounds horrible (because no one would ever want to use it anyway).
I didn’t go too far down this path of inquiry though. The port interface would have provided four channels, but instead of separate controls per channel there would be a whole swath of ports shared between the channels to make space. You’d select the active channel with one port, and then all the other ports would reflect that channel (controlling filters, reverb, pitch, etc.). This means that it would take longer to fire off notes on different channels, there’d be a lot of switching channels back and forth.
In the end though, this idea swung too far in the other direction. Implementations would have had to implement a whole bank of audio filters in hardware, which would be a bit of a distraction from the computer side of things. It’d be kind of fun to pack really good sound hardware into every Bedrock system, sort of a stand-out feature that would lend a lot of character to Bedrock programs, but there’s no way that even half of the platforms that I’d like to implement Bedrock on would be capable of providing this.
It gave me a good idea of what I wanted, though. Better sound quality, and really good music.
All together
The question that kept bugging me throughout this whole process was this: how can it be possible to fall back to lower-quality sound on less capable platforms? How can I design a uniform audio interface that is easily implemented on more constrained platforms, but that can support higher-quality audio on more powerful ones? One where the implementation can mediate the audio quality based on available hardware, whether that’s just a square wave generator or enough memory and processor to play back PCM samples.
The answer came to me in the end after playing around with piano.tal on Uxn, keeping all of this thinking in mind. It’s possible to take all of the best parts of the Uxn audio devices, but pack them into just two devices, and have the audio quality automatically fall back to an acceptable baseline on the GameBoy and similar.
So, Bedrock will have a tone device and a waveform device.
The tone device has four channels, each channel being allocated four ports: two for the envelope, one for volume, and one for pitch, similar to Uxn. If the waveform device is not implemented, the waveforms are implementation-defined, preferring noise on channel four and melodic waves (square or similar) on the other three. This ensures that audio is implementable on pretty much any system ever.
- The envelope port pair sets an ADSR envelope, four bits per component, linear interpolation, with the duration of each component measured in sixteenths of a second (compatible with the timers on the clock device).
- The volume port sets the volume of the note in stereo, four bits per channel.
- The pitch port sets the pitch of the note in semitones and fires off a note. I’m not sure what encoding to use for this value yet (I can’t find any documentation for how Uxn handles this either).
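To make the envelope behaviour concrete, here’s a Python sketch of one channel’s amplitude over time. Two assumptions to flag loudly: the spec above says all four ADSR components are durations, so the sustain *level* of 0.5 is borrowed from Uxn rather than anything Bedrock defines, and the semitone encoding is MIDI-style (note 69 = A4 = 440 Hz) purely as a placeholder, since that encoding is explicitly undecided.

```python
def adsr_amplitude(t, a, d, s, r):
    """Amplitude (0..1) at time t seconds after a note fires.
    a, d, s, r are the four 4-bit port values, each a duration in
    sixteenths of a second. Sustain level 0.5 is an assumption from Uxn."""
    a, d, s, r = (x / 16.0 for x in (a, d, s, r))
    if t < a:                       # attack: ramp 0 -> 1
        return t / a if a else 1.0
    t -= a
    if t < d:                       # decay: ramp 1 -> 0.5
        return 1.0 - 0.5 * (t / d) if d else 0.5
    t -= d
    if t < s:                       # sustain: hold at 0.5
        return 0.5
    t -= s
    if t < r:                       # release: ramp 0.5 -> 0
        return 0.5 * (1.0 - t / r) if r else 0.0
    return 0.0                      # note finished

def semitone_to_hz(note):
    """Hypothetical MIDI-style pitch mapping: note 69 = A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)
```

With all four components set to 16 (one second each), the note ramps up over the first second, decays over the second, holds for the third, and fades out over the fourth.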
The waveform device is used to load in a custom 256 byte PCM sample for each of the four tone channels. This sample will repeat rapidly to create the waveform for each channel on the tone device, and will lock in when a note is fired (so that modifying the buffer will not affect the sound of notes in progress). Each channel is allocated four ports on the waveform device: one for length, one for position, and two identical ‘head’ ports for loading bytes into the buffer.
- The length port determines whether to use the full 256-byte buffer or just a slice. The value is the address of the final sample in the loop, so 0xFF will use the full buffer and 0x03 will use just the first four bytes.
- The position port determines the address of the sample to next write to.
- Writing to a head port will overwrite the sample at the current position, and will increment the position value. This is a common pattern for Bedrock devices.
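Here’s a Python sketch of one channel of that write-and-latch behaviour. The port names and the details of wrapping the position at 256 are my own guesses at the semantics, not the actual Bedrock port layout.

```python
class WaveformChannel:
    """One channel's slice of the waveform device: a 256-byte sample
    buffer with a length port, a position port, and a head port."""
    def __init__(self):
        self.buffer = bytearray(256)
        self.length = 0xFF   # address of the final sample in the loop
        self.position = 0    # address the head port will write to next
        self.latched = bytes(self.buffer)  # snapshot used by the playing note

    def write_head(self, value):
        """Overwrite the sample at the current position, then advance.
        Wrapping at 256 is an assumption on my part."""
        self.buffer[self.position] = value & 0xFF
        self.position = (self.position + 1) & 0xFF

    def fire_note(self):
        """Lock in the current loop slice, so that later writes to the
        buffer don't affect the note in progress."""
        self.latched = bytes(self.buffer[: self.length + 1])

    def sample_at(self, i):
        """Read the i-th sample of the playing note, looping over the slice."""
        return self.latched[i % len(self.latched)]
```

So a program streams bytes in through the head port, fires a note, and is then free to rebuild the buffer for the next note while the current one keeps its snapshot.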
With this design, music and other audio will sound different on a platform that doesn’t support custom samples, but it should still sound close enough. The essence of the notes will stay the same (pitch, volume, envelope); it’s just the timbre that will change.
I’m happy with this trade-off, I think it’s the best I’m going to get.
Thanks
Thank you, person reading this, I really appreciate your support. I couldn’t do all of this cool work with programming languages and systems design without you. Have a fantastic week!