Around the wwworld:
Web MIDI, Web Audio and what the web does best
So, you don't use the Web Audio and Web MIDI APIs in your day job - does that mean they have nothing important to tell us about the nature of the Web? Think again! Katie Fenn takes us on a tour of the Web Audio and Web MIDI APIs by creatively coding Daft Punk's 1997 classic, "Around the World". The talk will reflect on what the Web is good at, and the enduring value of unlicensed standards.
Transcript
Hey everyone, my name's Katie.
I'm a senior developer at the Financial Times.
And a quick content warning, which I'll echo again: there's going to be loud music and loud noises throughout this talk, from beginning to end.
So I really like Daft Punk, but you don't have to like Daft Punk to like this talk or appreciate this talk, I promise.
This talk isn't really about Daft Punk.
This is more about the technology of electronic music, a whirlwind tour of the web audio and web MIDI APIs and what they're capable of, and a reflection on what the web is for.
So if you're okay with that, let's go.
So just in case you don't know who Daft Punk are, they are a Grammy award-winning electronic music duo from Paris, and they're famous for dressing up as robots while they're performing.
This is the image that most people are familiar with, but what a lot of people don't know is that their career spans nearly 30 years.
They're pioneers of French house music, and this is the image that they selected for their 1997 album.
This is probably the first time you've ever seen them like this, but yeah, they're very good at hiding their identity.
Early on in their career, they used synthesizers that were kind of already old in the '90s, including the Roland TR-909 drum machine, which was released in 1983, and the Roland Juno 106, which was released in 1984.
And these machines could be found for under a hundred pounds second hand at the time, because they were considered old hat, they were old technology.
But one thing that they did have, one new technology that would change the music world forever, was MIDI.
This is what MIDI cables and ports look like.
Nearly all synthesizers have them now, and MIDI cables are very cheap and they're very easy to use.
MIDI lets musicians synchronize different machines of different types and different brands, and you can prepare performances at home on a MIDI enabled sequencer, and then mix them live on the synthesizer and a drum machine in the live performance.
Synthesizers were a common sight in prog rock bands in the '70s like Emerson, Lake & Palmer, and slowly they went from being a smaller part in a bigger band to being the band.
Bands like Kraftwerk, Pet Shop Boys, Erasure, Orbital, The Chemical Brothers, The Prodigy, Daft Punk, all of Dave's favorite bands, used synthesizers to revolutionize the way that music was made.
Now the Web Audio API and the Web MIDI API integrate analog synthesis and MIDI into the web.
These APIs are mere footnotes compared to our day-to-day work, but they are very powerful, they're very creative, and they're a lot of fun to use.
So let's explore these technologies by making a complete song.
Let's make Daft Punk's 1997 single "Around the World."
For those in the audience, this is what it sounds like.
♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ ♪ Around the world ♪ - So where do we start?
Let's start with this.
This is my Arturia KeyStep Pro.
This is a MIDI controller and a sequencer.
It looks a bit like a keyboard that you'd make music on in your bedroom in the '90s, but it's not.
This makes no sound of its own.
It has no speakers.
It can't make any music on its own.
This sends MIDI data to my computer, which is here.
MIDI data tells computers, synthesizers, and drum machines which notes to play.
MIDI data can even be used to control lighting and visual effects on stage as well.
And this really goes to show that MIDI is about message passing and nothing else.
So let's hook this up to our web browser using the Web MIDI API.
And then we can see what that MIDI data looks like.
So let's go down here.
And I should have some demos lined up in the Sources tab.
So here we go.
Right, so first thing that we're going to do is we're going to use the MIDI access.
We're going to request it from the navigator: navigator.requestMIDIAccess().
And then we're going to iterate through the values for device of access.
Close that and we can make that a bit bigger.
Is that big enough for people to see?
Cool.
Inputs.values.
You're almost certainly going to see some typos today and things are going to go wrong, but we'll try and get through it.
Right, next, if the device name equals 'KeyStep Pro', then we're going to set device.onmidimessage to a callback.
And the message data that we get passed in is of type Uint8Array, and we want just a normal array.
So we're going to create a new object called data, equals Array.from(message.data).
And that should give us all the nice array functions that we want access to.
And then we're going to put this into my slide.
So we're going to use document.querySelector, and we want to find that element there.
Okay.
And then we're going to set innerText equal to, come on, data.slice(0, 3).
So that's an array, and we just want the first three items.
So hopefully if I run that, I get an error, excellent.
Iterate through device of access.inputs.values.
Yep, that's a common one that came up, well done.
Thank you very much.
We're going to get through this together.
This is hideously complicated.
Inputs access.inputs.values, there we go.
Let's see, there we go, excellent.
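Put together, the code from this first demo looks roughly like this (the element selector is a placeholder; the device name and API calls follow what was typed on stage):

    const access = await navigator.requestMIDIAccess();

    for (const device of access.inputs.values()) {
      if (device.name === 'KeyStep Pro') {            // match the controller by name
        device.onmidimessage = (message) => {
          const data = Array.from(message.data);      // Uint8Array -> plain array
          document.querySelector('#midi-data').innerText = data.slice(0, 3);
        };
      }
    }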
Right, so MIDI sends a lot of data, including the device name, which we used to pick out the right device, but at its core it breaks down to three values.
The first value is the status byte.
So if I push down one of the keys, then we get a status for key down, which is 153.
If we lift up, then we get a different one, which is 137.
The next, oh, let me see.
So yes, the second byte is the data byte, which carries information about the note being pressed.
So if I press bottom C, then we get one five, sorry, we get 36.
If we go up a semitone, then we get 37, 38, 39.
And then additional parameters may follow.
So this could be the velocity.
So if I press one of these keys down very slowly, then the third byte should be a very low number.
So it's nine or one.
And if I press it down very quickly, then we get high number, like 120.
And you can write your code to play louder notes if you want to, or you can extend the length of them, and you can be really creative like that.
And if you use some of these knob controls up here, then you can get the range of that knob as well.
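A minimal sketch of reading those three bytes, continuing from the callback above (153 and 137 are note-on and note-off on MIDI channel 10, which matches the numbers we just saw from the KeyStep Pro):

    device.onmidimessage = (message) => {
      const [status, note, velocity] = message.data;            // the three bytes
      if (status === 153) {
        console.log(`note ${note} down, velocity ${velocity}`); // e.g. 36 is bottom C
      } else if (status === 137) {
        console.log(`note ${note} up`);
      }
    };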
So that's what MIDI data looks like.
You don't need any API documentation to learn how this works.
It seems really counterintuitive 'cause there's so many MIDI devices out there.
But the brilliant thing about these three values is that as long as you do the same actions, if you do the same action twice on a MIDI controller, you will get the same values.
And counterintuitively, this makes MIDI easier to work with 'cause you don't need any API documentation.
MIDI is brilliant.
I love it to bits.
So now that we know how to tell the computer what noise to make using MIDI, we need to figure out how to make it make that noise.
And this is what the Web Audio API is for.
The Web Audio workflow, you're gonna see me write a lot of code today, but this is what it boils down to.
You take an input, you shape it using effects, and then you send it to a destination, which is usually your system audio out, such as your laptop speakers or your headphones.
So I'm gonna write a lot of code, but it boils down to this: you connect nodes together.
And this workflow models the way that a real life analog hardware synthesizer works, where you take an oscillator, you patch between an oscillator and an output, and then you can make some noise.
This is how modular synthesizers work.
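That whole workflow, in miniature and assuming nothing beyond what's described here, looks something like this:

    const context = new AudioContext();
    const oscillator = new OscillatorNode(context);  // an input that generates a signal
    oscillator.connect(context.destination);         // patch it straight to the system audio out
    oscillator.start();                              // and it makes some noise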
So I mentioned oscillators.
Let's start our journey with making an oscillator.
Oscillators create an electronic signal that makes noise when you connect them to a loudspeaker.
And I've hooked up my real life synthesizer to an oscilloscope so that you can see the shape of the wave that it's making, the electronic signal, and then you'll be able to hear the sound that it's making.
So that's a round sine wave.
It's very soft.
Sawtooth wave is very sharp.
Square wave is like a mixture of the two.
And a pulse wave is like a sharp square wave.
And you can vary the pulse width to make it sound sharper or softer.
You can increase the frequency to play higher notes.
Or lower notes. (loud beeping) And by putting this all together, (loud beeping) you have the foundations of all electronic music.
Was everyone able to hear that?
Is it good at the back?
Is that good volume?
Fantastic, thank you very much.
So let's see if we can make an oscillator.
Let's go back to my demos.
Sources, go to oscillators, right.
Okay, so just like we requested MIDI access, which was the gateway for the Web MIDI API, we have another gateway for the Web Audio API, and that is the audio context.
So we're going to create an object called context, equals new AudioContext().
And this will coordinate all the nodes, all the code we're about to write.
So if we want to create an oscillator, then we create a new object called oscillator.
New oscillator node.
And the first thing that you do when you create a new node in the web audio API is that you pass the context in.
And that is pretty much the same for every single node, I think.
And it coordinates all the nodes together.
And then the next argument that gets passed in is a options object.
And the oscillator has an option for the type, and we're going to create a sawtooth wave.
By default, oscillators start making a noise when you start them, and then they continuously make a noise.
And this is the exact same for hardware synthesizers as well.
What happens in a hardware synthesizer is that they have gates, electronic gate circuits, which open and close that signal so that when you push a key down, that gate is open, the signal comes through and you can hear the tone.
And we're going to do the same thing.
The way we're going to do that is we're going to create a gain node.
New gain node, pass in the context, and we're going to set the gain to zero by default, because we don't want to hear it until we push a key down.
We'll figure that bit out in a minute.
So the next bit is we're going to connect the oscillator to the gain, not connect, to the gain. - It really handles an oscillator. - Thank you very much.
That was a test.
And then we're going to connect the gain to the context destination, which is by default, your system configured sound out.
And then we're going to start the oscillator, oscillator.start.
So next we want to tune the oscillator up and down, depending on the key that I've pressed.
And we're going to set the oscillator.frequency.value equal to, and we want to set that to a value in hertz.
So what we want to do is, I've got a helper function called midiToFrequency that I've made myself.
And that takes a MIDI note number and turns it into a frequency in hertz.
And we're going to pass in the note number from the MIDI data.
Next we want to open and close that gate when I push a key down.
So we're going to set gain.gain.linearRampToValueAtTime.
That's a really long name of a function.
It basically means ramp that value up in a linear motion, just a straight line from one value to the other.
And we want to set it to one when we push a key down.
And I've got a nice little time that I've configured up here, which is the end time, current time plus 0.005 seconds.
So we're going to put end there, and then we're going to copy that.
And conversely, when we lift the key up, we want to set it to zero.
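Put together, the oscillator demo looks roughly like this (the equal-temperament formula inside midiToFrequency is assumed, and the wiring of noteOn and noteOff to the MIDI callback is left out):

    const context = new AudioContext();

    const oscillator = new OscillatorNode(context, { type: 'sawtooth' });
    const gain = new GainNode(context, { gain: 0 });  // the "gate": silent until a key goes down

    oscillator.connect(gain);
    gain.connect(context.destination);
    oscillator.start();                               // runs continuously behind the closed gate

    // Assumed helper: standard equal temperament, where MIDI note 69 is A at 440 Hz.
    const midiToFrequency = (note) => 440 * 2 ** ((note - 69) / 12);

    function noteOn(note) {
      const end = context.currentTime + 0.005;
      oscillator.frequency.value = midiToFrequency(note);
      gain.gain.linearRampToValueAtTime(1, end);      // open the gate
    }

    function noteOff() {
      const end = context.currentTime + 0.005;
      gain.gain.linearRampToValueAtTime(0, end);      // close the gate
    }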
And let's see if I've got this right.
No errors, I think.
So if it works, we're just about to have a sudden sound.
It went first time, how about that?
So hopefully if I go up in scale, higher notes and lower notes.
So that's oscillators.
That's what oscillators do.
Next, let's talk bass lines.
So we've got a sound, the basic sound.
How do we craft this and turn it into something which is recognizable from the song?
Let's see if we can make a bass line.
So I've got a different demo, so go to inspect.
Oh, hang on, I've got one more thing to show you.
Filters, we need to talk about filters.
So filters are used to quieten, remove and accentuate certain frequencies of sound.
And they're often used in dance music bass lines.
This video shows me patching an oscillator into a filter and then into the output.
So patch from the output of the oscillator into the filter, then patch the output of the filter into the output.
That should take that sharp sound.
You should hear it getting softer and eventually it goes away altogether. (loud buzzing) What I'm doing there is I'm modulating the cutoff frequency and that's the frequency of the sounds that it's cutting out.
That's what a filter does.
So let's see if we can create a filter.
Go to sources, bass filter, there we go.
So this is mostly the code that we've just written in the demo before, and we're going to create a new filter node, filter equals new BiquadFilterNode.
We're gonna pass in the context and then we're gonna pass in the frequency that we want to cut off.
We want to cut off at about 144 hertz.
And I think 'biquad' is quite a complicated name.
It's short for 'biquadratic', which describes the maths inside the filter, but for our purposes it's just a general-purpose filter node that we can use to cut frequencies off.
That's what that means.
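The addition to the previous demo is roughly this (the 'lowpass' type is the node's default, spelled out here for clarity, and exactly where the filter sits in the chain is assumed):

    // The patch becomes oscillator -> filter -> gain -> destination,
    // replacing the direct oscillator -> gain connection from before.
    const filter = new BiquadFilterNode(context, {
      type: 'lowpass',
      frequency: 144,   // cutoff frequency in hertz
    });

    oscillator.connect(filter);
    filter.connect(gain);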
So let's see if that works.
So that's a lot softer.
And if I turn down the octave, turn down frequency. (loud buzzing) Get a lovely creamy bass line there. (loud buzzing) That's our bass line. (audience clapping) That's our bass line.
You might be able to hear audio clicking there.
It's because, out of the box, when you ramp the oscillators up you get a sudden cut-in of the sound.
And I've managed to fix that in the complete demo, but out of the box it can be a little bit fiddly.
So that's why you can hear that little clicking.
Next lead.
Let's see if we can make the lead part of the melody.
So the lead part of the melody in "Around the World" has something called a low frequency oscillator.
And low frequency oscillators aren't designed to make sound like oscillators do.
They are designed to automate parameters of other sounds, because there are some things that you'd like to do but you literally cannot move the knobs yourself quickly enough.
So low frequency oscillators are used to do that fast and used to automate things.
And this video shows me patching the output of a low frequency oscillator into the pitch parameter of the oscillator.
And you'll be able to hear it go up and down. (high pitched whirring) And as I speed it up, increase the frequency, (high pitched whirring) you should hear it go up and down even faster. (high pitched whirring) (high pitched whirring) So let's see if we can make a low frequency oscillator.
Lead LFO, right, here we go.
Right, so again, very similar to what we had before.
We've got a filter already in place and we want to create a low frequency oscillator.
So we're gonna create a new node called lfo, equals new OscillatorNode.
One of the great things in the Web Audio API is that low frequency oscillators are just like any other oscillator.
They're just an oscillator with a very low frequency.
And that's really, really nice.
That's just something that's really elegant that I really like about it.
So next, low frequency oscillators oscillate between values of minus one and one.
And what we want to do is that we want to use the LFO to automate the opening of that filter and move the cutoff frequency.
But if we only vary it by one Hertz, you're not going to be able to hear its effect.
So what we want to do is we want to amplify that low frequency oscillator by about 5,000 Hertz.
So we're going to create a new object called LFO gain, new gain node, pass in the context.
And then we want to set the gain to about 5,000.
And we're gonna connect the LFO to LFO gain.
Then we're going to connect LFO gain, connect to the filter's frequency property.
Then we're going to start the LFO oscillator, LFO.start.
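That part of the demo, roughly sketched (the LFO's own rate is assumed; the talk varies it live between about one hertz and a few hertz):

    const lfo = new OscillatorNode(context, { frequency: 2 });  // just a very slow oscillator
    const lfoGain = new GainNode(context, { gain: 5000 });      // scale its ±1 output up to ±5000 Hz

    lfo.connect(lfoGain);
    lfoGain.connect(filter.frequency);  // modulate a parameter, not the audio path
    lfo.start();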
And then let's see.
So you can hear that sounds quite low.
Let's go back up a couple of octaves.
And that's our lead.
If I change the frequency of the low frequency oscillator, you should be able to hear it change a little bit.
So let's change that down to one Hertz.
You should be able to hear the filter cutoff moving slowly.
Speed up again.
You can hear it's going.
That LFO is modulating the cutoff frequency.
So yeah, that is our lead part.
Excellent, right.
What's next?
Samples.
Daft Punk didn't use any samples in "Around the World", but the drum machines that they used were themselves based on samples.
And one of the drum machines that they used was the Roland TR-909, which was made famous by the acid house scene.
It's particularly prized for the sound of its cymbals, which you can hear throughout house music and throughout acid house.
And they were sampled from real-life hi-hat and crash cymbals.
So let's see if we can introduce some samples.
So go back to some demos, go to sources, sample.
Okay, so let's see if we can get some samples.
So buffer equals await, and we're going to fetch that file there.
There we go.
Then we're going to take that response, and we're going to turn that response into an array buffer: response.arrayBuffer(). (keyboard clacking) And then we're going to take the buffer and use the audio context to decode the audio data from it.
What that does is it takes that WAV file, and it turns it into an array of numbers which describe the shape of that wave.
And that's brilliant, because it puts it into a nice little API which you can code against in JavaScript.
You've just got an array of values which you can change stuff with.
So if you want to monkey around with all those numbers, then you can if you want to.
We're going to leave it as it is.
We're not going to risk breaking anything else in this talk.
So we've got that, we've got the buffer, and this function will play whenever it gets a MIDI event from the sequencer.
So what we're going to do is we're going to create a new, let's see, source equals new AudioBufferSourceNode.
We're going to pass in the context again, and then we're going to pass in the buffer.
And then we're going to connect the source to the destination.
And then we're going to start the source node.
One pitfall that I fell into when I started learning this stuff is that I thought you'd be able to reuse the audio buffer source node, just rewind it back again and then play it again.
But you can't do that - an AudioBufferSourceNode can only be started once.
So in my code, and in all the examples that I could find online, you create a new AudioBufferSourceNode, play it, and then you throw it away.
And then you start again with a new one each time that you need a new sample.
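The sample demo, roughly (the file name is a stand-in; the decode-and-play flow is as described):

    const response = await fetch('909-clap.wav');               // hypothetical file name
    const arrayBuffer = await response.arrayBuffer();
    const buffer = await context.decodeAudioData(arrayBuffer);  // WAV -> AudioBuffer of sample values

    // AudioBufferSourceNodes are one-shot: make a fresh one for every hit.
    function playSample() {
      const source = new AudioBufferSourceNode(context, { buffer });
      source.connect(context.destination);
      source.start();
    }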
So let's see if we can hear a classic 909 clap. (clapping) There we go.
And then, (clapping) use a hi-hat.
So there's those famous 909 hi-hats.
So that is samples.
Vocoders.
The most characteristic sound in the song is that robotic voice that you hear.
And it's perhaps the instrument that Daft Punk are most associated with.
And that is the vocoder.
A vocoder is a voice controller that allows you to control the amplification of a synthesizer using the sound of your voice.
And it measures the sound of your voice at certain frequencies.
And then it uses those frequencies to control a filter or a gate to let those sounds through.
Vocoders can either be dedicated instruments, or they can be a feature of a more complete synthesizer like this microKORG.
And you can see the microphone on top.
According to fans who have tried replicating this sound, Daft Punk didn't actually use a vocoder at all.
They think that they used a guitar pedal called a talk box.
And this is the talk box that I've got plugged in here.
And with a flick of a switch, I should be able to route my system audio through this talk box.
And we should hopefully be able to hear a vocoder effect.
And about an hour ago, this didn't work.
So we'll give it a go.
Right, so this is, oh, I need to open the demo.
Let's go to that.
Let's go to inspect.
Let's go to sources.
Let's go to vocoder.
Okay, so by default, it hopefully shouldn't route it through the talk box and we should be able to hear it just as it is. (vocalizing) Very, very sharp.
Right, we'll see if this works.
Okay.
♪ Human ♪ ♪ Robot ♪ ♪ Human ♪ ♪ Robots ♪ Okay, I'm just gonna check that I've got my system audio back (vocalizing) Cool, excellent.
Right, so that is the vocoder.
So I think we've got pretty much everything, everything that we need.
So let's put everything together and we see if we can do a live performance.
Reminder: while I'm using this MIDI controller to program everything, this thing is not making any noise at all.
Everything you're about to hear is going to come from the browser.
So we'll give this a go.
It might take me a few tries to get right, but I'll try my best.
Okay. (sighs) (buzzing) Oops, sorry.
There we go. (beeping) Oops, sorry.
I've missed out.
I knew I'd get something wrong.
Right, okay, let's go. (vocalizing) No, that's not the right one.
Inspect.
Let's go to sources.
Let's go to the complete demo.
We'll try one more time.
I'm gonna leave open the console so you should be able to see the MIDI data coming through.
Right, let's try it one more time. (beeping) Oh, no.
Okay.
That demo has been keeping me awake at night for the past two years. (laughing) Thank you very much.
Right, so let's get rid of that demo.
Cool, okay.
Right, so that's the Web Audio API.
How would I sum up the Web Audio API?
It requires a little bit of domain knowledge to get started.
It's designed for people who already know how electronic music is made, and that's a bit of a drawback for people if you're just starting out.
With that said, it's not that hard to take your first steps.
Easy things are easy-ish, and hard things are possible.
The drawback for beginners is it's hard to know which is which.
It's also not a digital audio workstation.
You're not going to get the same kind of production tools, mixing power and effects that Logic and Ableton have, but it is excellent for educational and experimental roles, and it's well supported in Chrome, Edge, Safari, and Firefox.
MIDI is very well supported in musical hardware, but Web MIDI is utterly let down by its lack of support in Safari, unfortunately. It's also excellent for hobbyist development.
So we're coming to the last bit of this talk.
What does the web do best?
Should we be thinking of integrating the Web Audio API into our sites?
Well, no, don't do that.
Don't do that.
Is it a replacement for serious audio production tools?
Again, no, it's not Ableton, and it's not Logic.
But that doesn't mean that the Web Audio API and the Web MIDI API are worthless.
There's a huge community of hobbyist and experimental musicians out there, including people like this.
These are circuit benders.
Circuit benders make electronic music using scrap, toys, and broken synthesizers.
If they can make music out of rubbish, you can bet someone can make something cool out of a web browser.
What makes MIDI so great is that it's used by practically everyone, by amateurs and professionals alike.
I knew nothing about electronic music when I started writing this talk, and that just goes to show how good an educational tool Web Audio and Web MIDI can be.
The web should be the platform of the amateur, of the hobbyist, of the auteur doing something that nobody has ever conceived of before.
The web's compatibility, its ubiquity, its accessibility, and its approachability make it the ideal creative platform.
And this is a call to Apple to find a way to support Web MIDI on iPads and iPhones.
The web and MIDI belong together.
This is Dave Smith.
Dave Smith was an engineer and founder of Sequential Circuits, and he's also the creator of the Prophet 5, a very famous polyphonic synthesizer.
Together with the president of Roland, Ikutaro Kakehashi, he created MIDI.
To this day, MIDI remains completely unlicensed.
No fees have ever been collected for its creation.
This has led to its widespread adoption and its near universal compatibility.
Whole industries and art forms exist because of MIDI, and it remains easy to use for beginners and experts alike.
And if that sounds familiar to you, then it should do.
If you're thinking about the World Wide Web, then you'd be right.
Since its creation, Tim Berners-Lee has not collected any fees for creating the World Wide Web.
Free and open standards create wealth for all of us.
The web has changed all our lives for the better.
It's critical that the web remains free, open, and endures as the software platform for the common good.
MIDI and the web are naturally allied to each other and deserve to endure together.
Thank you very much.
About Katie Fenn
Katie is an engineer from Sheffield in the North of England. She helps organise the Front End Sheffield user group, and has worked for companies like npm, Monzo and Canonical.