takumar's "perma-audio" webpage 🔊

Table o' Contents

  1. What
  2. Why
  3. How to listen
  4. How it works
  5. Projects
  6. Community

What

I am experimenting with primitive software audio synthesis. You might have heard some of my early results on the Elmet Brae 01: The Land Compilation, where it bookends the work of a bunch of actual real musicians!

Why

One motivation is to avoid my audio projects being in any way fundamentally tied to any particular computing platform, and especially not to any audio framework or file format which might be ubiquitous today but obscure in 10 or 20 years. No PulseAudio, no ALSA, no OSS, no MP3, no Ogg Vorbis, nothing like any of that.

Of course you need to use some kind of audio framework to listen to my projects, but the idea is that they all work equally well, switching from one to another is extremely fast and easy, and they are all invisible in the synthesis code itself. They can (and will!) come and go without consequence.

My programs use nothing but the C standard library and output an audio byte stream to stdout, and that's it.
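To give a flavour of what that means, here's a minimal sketch of a program in this style (illustrative only, not one of my actual projects) which plays an endless 440 Hz sine as unsigned 16 bit little endian samples at 16000 samples per second:

#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979323846

int main(void)
{
    double phase = 0.0;
    for (;;) {
        /* Map the sine's [-1, 1] range onto unsigned 16 bit */
        unsigned int sample = (unsigned int)((sin(phase) + 1.0) * 32767.5);
        fputc(sample & 0xFF, stdout);        /* low byte first (little endian) */
        fputc((sample >> 8) & 0xFF, stdout);
        phase += 2.0 * PI * 440.0 / 16000.0; /* 440 Hz at 16000 samples/s */
        if (phase >= 2.0 * PI)
            phase -= 2.0 * PI;
    }
}

Compile it exactly as described under "Compilation" below and pipe it into any of the players listed under "How to listen".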

Not only do my projects avoid transient frameworks, they are also very minimalistic and require very little CPU power or memory. They run fine on 10 year old machines with 32-bit CPUs. I haven't yet had the opportunity to test them on 20 year old machines, but I am supremely confident they would run fine there, too.

In this way my projects are both "future proof" and "past proof". They push back against the ever-increasing pace of needless obsolescence which characterises just about all of modern computing, both software and hardware. This is part of a broader movement known by the term permacomputing.

How to listen

Compilation

You can compile any of my projects like this with gcc:

gcc project.c -lm -o project

This creates a binary named project which emits audio samples to stdout. You won't hear anything if you just run it by itself. It will just spew garbage over your terminal. You need to redirect the output to something which will actually play it. There are a great many options here for various platforms. I've listed some below, but there are certainly a whole lot more. If you manage to find other ways that work, please let me know and I'll update the list.

Playing

SoX

SoX is a cross-platform package of audio tools which includes a play command which can be used to listen to my projects. SoX runs on a lot of different platforms, including Linux, *BSD and MacOS / OS X. At the moment it seems to be the best way to listen to my projects on a Mac.

You can play a project like so:

./project | play --ignore-length -t raw -e unsigned-integer -b 16 -r 16000 -c 1 -S -

Some people have reported needing to use -e signed-integer and -b 32 instead. I don't really understand what could cause this, but if the above does not sound like you expect, give it a try?

Linux

You can use SoX on Linux (see above), but there are other options too, which might already be installed by default.

If your system uses PulseAudio (like most modern distros), you can use the pacat command like so:

./project | pacat --format=s16le --rate=16000 --channels=1

(Note that Pulse bafflingly doesn't support unsigned 16 bit samples, but using signed samples just makes the output quieter and adds a harmless DC offset)

If your system does not use PulseAudio but uses ALSA, you can use the aplay command like so:

./project | aplay -t raw -f U16_LE -r 16000

I think on very old pre-ALSA Linux systems you ought to be able to redirect the output to /dev/dsp and it will work...

BSD

Of course, SoX remains an option here, too.

On OpenBSD and FreeBSD you can use the aucat command to play a project like so:

./project | aucat -h raw -e u16le -r 16000 -i -

(I've tested this on OpenBSD and it works, I assume it will work identically on FreeBSD but haven't tried it yet)

NetBSD doesn't have aucat, but it has an audioplay command which looks like it will do the trick. I haven't personally tested this yet, but the following should work:

./project | audioplay -f -e ulinear_le -P 16 -s 16000 -c 1

How it works

So far I've just played around with sine waves, produced via numerically controlled oscillators. Each oscillator has an unsigned integer "phasor". Once per tick of the audio sample rate, a fixed constant term is added to the phasor. The 11 highest order bits of the result are used as an index into a 2048 element (2^11) look up table containing a single cycle of a sine wave, and the indexed value is that oscillator's output for that audio sample. The fixed constant value which is added to the phasor determines the oscillator frequency. That's it.
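As a sketch (names and details illustrative, not lifted verbatim from any project), one such oscillator might look like this:

#include <stdint.h>
#include <math.h>

#define PI 3.14159265358979323846
#define TABLE_SIZE 2048  /* 2^11 entries */

static float sine_table[TABLE_SIZE];

/* Fill the table with a single cycle of a sine wave */
static void init_sine_table(void)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        sine_table[i] = (float)sin(2.0 * PI * i / TABLE_SIZE);
}

/* One tick: add the fixed step to the phasor, then use the
   11 highest order bits (shift by 32 - 11 = 21) as the index */
static float osc_tick(uint32_t *phasor, uint32_t step)
{
    *phasor += step;
    return sine_table[*phasor >> 21];
}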

There's surprisingly little time-sensitive code involved. The program looks like it just spits samples out to stdout in a tight loop as fast as it possibly can, and the samples will come out faster or slower depending on the CPU speed. What actually happens when you pipe the output to a program which plays the audio data is that the other program's buffer will quickly fill up and blocking IO will effectively put the synthesis program to sleep while the player program "sips" samples from the buffer at the specified rate. As a result, what looks like an infinite tight loop in the synthesis program actually puts only a very low load on the CPU.

The only place that the audio sample rate actually enters into the synthesis code is for converting between actual audio pitches in Hertz and the constant step value which is added to an oscillator's phasor. Note that this means that by telling the player program to render a synthesis program's output at a higher or lower sample rate, you can shift the output up or down by an octave without changing the synthesis program at all. This also means it is easy to write synthesis programs which can be recompiled at any sample rate you like to suit a particular platform - just define the sample rate once as a constant. Be aware, though, that if you set the sample rate too low you will hear aliasing artifacts unless you only play deep bass.
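Concretely, with the 32-bit phasor sketched above, that conversion works out as follows (again just a sketch):

#define SAMPLE_RATE 16000

/* The phasor wraps around (2^32) exactly once per cycle, so:
   step / 2^32 = hz / SAMPLE_RATE */
static uint32_t hz_to_step(double hz)
{
    return (uint32_t)(hz * 4294967296.0 / SAMPLE_RATE);
}

For example, hz_to_step(440.0) at 16000 samples per second comes out to about 118111600. Play the same byte stream back at 32000 samples per second and every pitch doubles, which is exactly the octave shift described above.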

The sine wave values in the oscillator look up tables are stored as floating point values between -1 and 1, and in fact all computations are done in floating point right up until the last moment, when an audio sample is converted to an unsigned integer and written out with fputc(). This has a few advantages. For one, just like you can easily change the sample rate, you can easily change the output format by changing just one small section of code. Depending upon the combination of sample rate and bit depth, you can easily switch the same program between a very clean and pure sounding sine wave output and something noticeably buzzier and crunchier for more of a chiptune vibe. Using floating point oscillator outputs also gives you plenty of fine grained resolution for e.g. multiplying one oscillator's output by another's (shifted and scaled from [-1, 1] to [0, 1]) to use it as a sinusoidal envelope, and then using that amplitude modulated oscillator's output to frequency modulate yet another oscillator by adding it to the tuning constant. All of this could sound quite clunky and discrete if the look up tables just used unsigned 8-bit integers.
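Putting that together, the AM + FM combination just described might look something like this inside the per-sample loop (a sketch reusing osc_tick() and hz_to_step() from the fragments above; all the frequencies are made up for illustration):

/* Assumes init_sine_table() has been called and the helpers above are in scope */
uint32_t env_ph = 0, mod_ph = 0, car_ph = 0;
for (;;) {
    /* Slow sine envelope, shifted and scaled from [-1, 1] to [0, 1] */
    float env = (osc_tick(&env_ph, hz_to_step(0.25)) + 1.0f) * 0.5f;
    /* Amplitude modulate one oscillator with the envelope... */
    float mod = osc_tick(&mod_ph, hz_to_step(110.0)) * env;
    /* ...and use the result to frequency modulate the carrier's tuning */
    float out = osc_tick(&car_ph, hz_to_step(220.0 + 55.0 * mod));
    /* Only now convert to an unsigned 16 bit sample and write it out */
    unsigned int s = (unsigned int)((out + 1.0f) * 32767.5f);
    fputc(s & 0xFF, stdout);
    fputc((s >> 8) & 0xFF, stdout);
}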

Projects

Major Sunrise (October 2023)

My debut piece, hastily prepared for the Elmet Brae 01 release.

C code (2.3 KB)

This file emits unsigned 16 bit little endian audio at 16000 samples per second.

You can run it on Linux after compiling it with the following command:

./major_sunrise | aplay -t raw -f U16_LE -r 16000

Unlike the released version, the program just runs forever until you kill it. How long can you last?

The basic idea behind the code is as described in the "How it works" section above.

Community

"Community" is a bit of a grandiose term at this point, but there is at least one other person playing with these ideas and techniques! Fellow Merveillite Caffeine's Heir has dedicated their December Adventure to a project using my Major Sunrise as a jumping off point. They've even taken things mobile with a USB powerbank a Raspberry Pi Zero!
