Alain Galvan · 2/4/2017 7:30 PM
A tutorial on how to program your own synth with WebAudio.
Tags: blog, synth, audio, programming, tutorial
Shadertoy proved that being a good rendering engineer is a lot like being a good sound engineer: the mathematics and code behind both are, for the most part, the same.

Sound is a series of compressions and rarefactions traveling through the air.
Instantaneous acoustic pressure (the sound you hear right now) is the ambient atmospheric pressure plus the perturbation caused by a sound wave at a given instant in time.
This is similar to lighting models in computer graphics, where a final color is an ambient base term plus the contributions of each light.
Fourier transforms convert a signal between the time domain and the frequency domain; in software we use the discrete version (the DFT) on sampled signals. This is particularly useful for audio, where the signal on an audio jack is simply a modulated voltage, but it has other applications too, such as JPEG compression.
The key is that in software we're working with discrete steps, like pixels in an image, or samples in audio.
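To make "discrete steps" concrete, here's a minimal sketch (plain JavaScript, variable names are my own) that fills an array with discrete samples of a 440 Hz sine wave:

```js
// Sample a 440 Hz sine wave at 44.1 kHz for one second.
const sampleRate = 44100;
const frequency = 440;
const samples = new Float32Array(sampleRate); // one second of mono audio

for (let i = 0; i < samples.length; i++) {
  const t = i / sampleRate; // time of this sample, in seconds
  samples[i] = Math.sin(2 * Math.PI * frequency * t);
}
```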
Sample - One measurement of audio data. For Pulse Code Modulation (PCM) encoding, a sample is an instantaneous representation of the voltage of the analog audio signal. Other encodings exist, like μ-law and A-law, but they're rarely used.
Sampling Rate - The number of samples per second, measured in hertz (Hz) or kilohertz (kHz). The most common sampling rate is 44.1 kHz (CD quality).
Sample Size - The number of bits in one sample. It's typically a multiple of eight because data is stored in 8-bit bytes. The most common sample size is 16 bits, which is CD-quality audio.
As a comparison, 8 bits is the size of a single channel of color in images, so imagine a stereo sound as an image with only red and blue.
Channel - Stereo is the most common form of multi-channel audio: one independent left channel and one independent right channel. Higher-end audio formats include 5.1 and 7.2 surround (6 and 9 channels respectively).
Frame - A cross section of samples across all channels in the audio file. So a 16-bit stereo (two-channel) audio file has 32-bit frames (16 bits per sample × 2 channels per frame = 32 bits per frame). These numbers add up quickly, as the sketch below shows.
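Putting those terms together, here's a quick sanity check (plain JavaScript, my own variable names) of how much raw data one second of CD-quality stereo takes:

```js
// CD-quality stereo: 44.1 kHz, 16-bit samples, 2 channels.
const sampleRate = 44100;       // frames per second
const sampleSizeBytes = 16 / 8; // 16-bit samples = 2 bytes
const channels = 2;             // stereo

const frameSizeBytes = sampleSizeBytes * channels;  // 4 bytes per frame
const bytesPerSecond = sampleRate * frameSizeBytes; // 176,400 bytes per second
console.log(frameSizeBytes, bytesPerSecond);        // 4, 176400
```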
Sound on computers is represented by tiny changes to the speaker's position every 1/44100 of a second. (That's a lot more updates per second than 60 fps!) By themselves these changes just sound like pops, or like nothing at all, but made fast enough and for long enough, they can sound like anything!
Let's take a look at the source code from Shadertoy, which the devs were kind enough to leave unminified for us to check out. :3
Behind the scenes, Shadertoy renders the sound you write into a 512x512 image: each pass of the shader packs 512 × 512 = 262,144 samples (about six seconds of audio at 44.1 kHz), which get decoded into an AudioContext buffer and played.
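The trick is packing each 16-bit sample into two 8-bit color channels. Here's that round trip in plain JavaScript (a sketch with my own helper names), mirroring the math you'll see in the code below:

```js
// Pack a sample in [-1, 1] into two bytes (lo, hi), then unpack it.
function packSample(y) {
  // Map [-1, 1] to a 16-bit int.
  // (y = 1.0 exactly would overflow to 65536; real code should clamp.)
  const v = Math.floor((0.5 + 0.5 * y) * 65536);
  return [v % 256, Math.floor(v / 256)]; // low byte, high byte
}

function unpackSample(lo, hi) {
  return -1 + (2 * (lo + 256 * hi)) / 65535; // back to [-1, 1]
}

const [lo, hi] = packSample(0.25);
console.log(unpackSample(lo, hi)); // ≈ 0.25
```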
```js
// Initialize WebGL / AudioContext
let canvas = document.getElementById("canvas");
let gl = canvas.getContext("webgl");
let wa = new AudioContext();

// Create a 60 second buffer for our audio
let mSampleRate = 44100;
let mPlayTime = 60;
let mPlaySamples = mPlayTime * mSampleRate;
let mBuffer = wa.createBuffer(2, mPlaySamples, mSampleRate);

// Get Float32Array views of each audio channel
let bufL = mBuffer.getChannelData(0);
let bufR = mBuffer.getChannelData(1);

// Create an output buffer to read back from the GPU
let mTextureDimensions = 512;
let mTmpBufferSamples = mTextureDimensions * mTextureDimensions;
let mData = new Uint8Array(mTmpBufferSamples * 4);
let numBlocks = mPlaySamples / mTmpBufferSamples;

// For ~10 runs of the shader (60 s * 44,100 Hz / 262,144 samples per block)
for (let j = 0; j < numBlocks; j++) {
  // Tell the shader where this block starts, in seconds
  // (iBufferOffset is the uniform location, fetched earlier via gl.getUniformLocation)
  let off = j * mTmpBufferSamples;
  gl.uniform1f(iBufferOffset, off / mSampleRate);

  // Draw a fullscreen quad (two triangles)
  gl.drawArrays(gl.TRIANGLES, 0, 6);

  // Read back the encoded samples
  gl.readPixels(
    0,
    0,
    mTextureDimensions,
    mTextureDimensions,
    gl.RGBA,
    gl.UNSIGNED_BYTE,
    mData
  );

  // Decode each pixel into one stereo frame:
  // 16 bits per channel, split across two bytes
  for (let i = 0; i < mTmpBufferSamples; i++) {
    bufL[off + i] =
      -1.0 + (2.0 * (mData[4 * i + 0] + 256.0 * mData[4 * i + 1])) / 65535.0;
    bufR[off + i] =
      -1.0 + (2.0 * (mData[4 * i + 2] + 256.0 * mData[4 * i + 3])) / 65535.0;
  }
}
```

And in our shader, we simply encode a function mainSound(float time) onto the output image:
```glsl
void main()
{
    // Which moment in time does this pixel represent, in seconds?
    float t = iBufferOffset + (gl_FragCoord.x + gl_FragCoord.y * 512.0) / 44100.0;

    // Evaluate the song function
    vec2 y = mainSound(t);

    // Encode the output: map [-1, 1] to a 16-bit int per channel...
    vec2 v = floor((0.5 + 0.5 * y) * 65536.0);
    vec2 vl = mod(v, 256.0) / 255.0;    // ...then take the low byte
    vec2 vh = floor(v / 256.0) / 255.0; // ...and the high byte
    gl_FragColor = vec4(vl.x, vh.x, vl.y, vh.y);
}
```

The key takeaway here is that you're rendering sine waves to play frequencies like 440 Hz, or A4 on the keyboard.
To play multiple sounds at once, you simply mix them together, taking the average of the tones!
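One thing the JavaScript above leaves out is actually playing mBuffer once it's filled. A minimal sketch of that missing step, using the standard Web Audio API and the wa and mBuffer variables from the snippet:

```js
// Play the decoded buffer through the speakers.
let source = wa.createBufferSource();
source.buffer = mBuffer;        // the AudioBuffer we filled from the GPU
source.connect(wa.destination); // route to the default audio output
source.start();                 // begin playback immediately
```

With playback wired up, here's a small synth shader that mixes three square-wave tones: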
```glsl
/*************************************************************************
 * Constants
 *************************************************************************/

#define TAU 6.283185

/*************************************************************************
 * Utilities
 *************************************************************************/

// MIDI note number to frequency (note 69 = A4 = 440 Hz)
float ntof(float n)
{
    return 440.0 * pow(2.0, (n - 69.0) / 12.0);
}

/*************************************************************************
 * Synths
 *************************************************************************/

// Sine wave at frequency f (the mod keeps the argument small for float precision)
float synthSin(float f, float x)
{
    return sin(mod(f * x * TAU, TAU));
}

// Square wave at frequency f (alternates between 0 and 1)
float synthSquare(float f, float x)
{
    return floor(2.0 * floor(f * x) - floor(2.0 * f * x) + 1.0);
}

// Hash-based noise
float synthNoise(float x)
{
    return fract(sin(123523.9898 * x) * 43758.5453);
}

/*************************************************************************
 * Main
 *************************************************************************/

vec2 mainSound(float time)
{
    // Three square-wave tones: G4, D5, and G5
    float tone = synthSquare(ntof(67.0), time);
    float tone2 = synthSquare(ntof(74.0), time);
    float tone3 = synthSquare(ntof(79.0), time);

    // Average them together into both stereo channels
    return vec2(mix(tone3, mix(tone, tone2, 0.5), 0.5));
}
```

You can take this further by bringing WebMIDI into the equation. For that, I built a simple library that hooks WebMIDI up to Redux. From there, I send the MIDI information in the Redux store to my synth as a uniform.
```bash
# Download it for your projects!
npm i redux-webmidi -S
```

The result is that you're able to programmatically create a synth powered by your GPU!
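If you'd rather skip the library, here's a rough sketch of the same idea using the raw Web MIDI API: listen for note-on messages and forward the note number to the shader as a uniform. (uNote is a hypothetical uniform name, and program is assumed to be your compiled WebGL program.)

```js
// Listen for MIDI note-on messages and forward them to the shader.
navigator.requestMIDIAccess().then((midi) => {
  for (let input of midi.inputs.values()) {
    input.onmidimessage = (msg) => {
      let [status, note, velocity] = msg.data;
      // 0x90-0x9F = note-on; velocity 0 is conventionally a note-off
      if ((status & 0xf0) === 0x90 && velocity > 0) {
        let uNote = gl.getUniformLocation(program, "uNote");
        gl.uniform1f(uNote, note); // the shader can call ntof(uNote) for the frequency
      }
    };
  }
});
```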