Audio Playback, Recording and Processing on the iPhone Part 2
This is a continuation of a previous iPhone audio post.
I’ve had success playing back synthesized audio out the headphone jack and/or speakers of Apple’s iPhone without killing the sound daemon. I’ve also had modest success getting audio in to work but we’re not 100% there yet. Very close.
- iPhone sample-level synthesis w/ CoreAudio & AudioToolbox
- File playback with AudioQueue
- Audio processing and iPhone audio input
Please let me know if you can help — brian.whitman at variogr.am.
Summary
As of Tuesday September 11
- We can send arbitrary samples out either the headphone jack (stereo) or the speakerphone out (mono) at 44.1 kHz without having to kill mediaserverd
- We cannot yet receive samples coming in from the microphone (without going through the very high level file recording API) — but we are closer than ever before & I’m working on it
- We can do all the stuff in the previous post, like access the iPod library, play media files and URLs at different rates, seek, etc.
Note about source code for these examples
Unfortunately, I cannot post all of the code of all my examples for these experiments like I did for the previous ones. I hope to give enough information and source code in this post to let a developer with legal access to the right headers write their own audio apps. Some of the posted code below requires headers that I cannot provide — you will have to obtain them on your own.
Playback without killing mediaserverd
I’ve finally figured out how to play audio samples out the phone without killing mediaserverd. We can play any audio (synthesized or files) on top of other sounds on the phone without stopping any servers or killing the phone’s audio.
Download this CLI iPhone app and source and try it out. You should hear two sine waves if the headphones are in, or a mono sine wave through the speakers. You can change the volume in the music player (iPod) portion of the phone. All your other sounds should still be fine.
There is source included in that bundle but it won’t compile. We’re using the AudioQueue API which we discovered while looking through the disassembly of Celestial. I spent some time researching AudioQueue and eventually discovered that it does not have any public headers and won’t until October. I was able to reverse it enough to get sound working without using any NDA code (the binary reflects this), but I can’t release it. If you have legitimate access to these headers you’ll be able to compile the example by including AudioQueue.h.
What’s great about AudioQueue is that it somehow attaches itself to the audio subsystem so softvolume (like the iPod portion’s volume bar) affects whatever audio you’re playing. So now you can synthesize audio and not have to worry about the state of the system — headphones or not, volume, calls coming in, etc.
The AudioQueue seems to want to be on a specific run loop, which may cause issues with multi-threaded apps. When I sent this over to NerveGas for the NES emulator, he had some trouble keeping it from underrunning. There’s a call that looks like this:
    err = AudioQueueNewOutput(
        &in.mDataFormat,        // your ASBD (AudioStreamBasicDescription)
        AQBufferCallback,       // synth callback
        &in,                    // callback data
        CFRunLoopGetCurrent(),  // run loop to attach the queue callback to
        kCFRunLoopCommonModes,  // modes to run it in
        0,                      // don't know
        &in.queue);             // queue
If you are doing your own threading, you need to replace the CFRunLoopGetCurrent() there with the right run loop. If you pass NULL instead, apparently the queue will “use its own loop,” which I don’t fully understand, but it seemed to make NES.app better. I am still confused about what exactly the AudioQueue is talking to. Is it mediaserverd? Does AudioToolbox create a mach port with the AudioQueueServer? If so, I never connected to it. But it works, and we can’t ask any questions about it until October.
Another issue with the AudioQueue is that sample rate changes elsewhere in the phone (ringtones, UI sounds, etc. going to 22050) seem to modify the sample rate of the entire queue. You can install a listener to notify you when this happens, but I would guess there’s an even better way to keep it from happening at all. What appears to be happening is that we are connecting to a queue that is owned by the system and we can’t really control it.
File playback with AudioQueue
You can also play files pretty easily with AudioQueue, a nice effect. “Why would I want to play files with AudioQueue when I can use Celestial?” You’re probably right: unless you are me, you probably don’t need lower-level control of file playback than what Celestial gives you, and Celestial’s much easier. But if you want to ape ScheduledSoundPlayer on the iPhone, I think you’ll need to join me in investigating the timestamp scheduling functions of the AQ. I don’t know much yet other than that I see they are there. Stay tuned for code posts…
Audio processing and iPhone audio input
AudioQueue is also what lets Celestial’s AVRecorder record audio (voice notes). It’s pretty much the inverse of the output form. I can get the phone to receive buffers of PCM input with an AudioQueue, but (a) the buffers are always silent and (b) if you have the headphones in, it hangs. Something is off. The clock (sample rate) appears to be set by the audio output queue, so my input callbacks are coming in at 44K even though I have an 8K sample rate on the input queue. This is a problem we’re also having with NES.app: if another blessed AudioQueue starts running at 22050 while NES.app is playing at 44K, all of a sudden the buffers start doubling in NES.app and it gets all confused.
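To put numbers on the rate mismatch, here is a small worked sketch in plain C. The function names are made up for illustration, and it assumes a buffer is consumed at whatever rate the queue is actually clocked at, which is only one reading of the behavior described above.

```c
#include <assert.h>
#include <stddef.h>

/* Seconds of audio a buffer of `frames` frames lasts when consumed at rate_hz. */
double buffer_seconds(size_t frames, double rate_hz)
{
    assert(rate_hz > 0.0);
    return (double)frames / rate_hz;
}

/* How many times more often callbacks fire when a queue that was asked for
 * requested_hz is actually clocked at actual_hz. Values below 1.0 mean the
 * callbacks fire less often than expected. */
double callback_rate_ratio(double requested_hz, double actual_hz)
{
    assert(requested_hz > 0.0 && actual_hz > 0.0);
    return actual_hz / requested_hz;
}
```

With an 8000 Hz request and a 44100 Hz clock, callback_rate_ratio comes out at about 5.5. A queue that drops from 44100 to 22050 makes every buffer last twice as long, which would be consistent with the doubling seen in NES.app.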
I’ve gone over the disassembly for [AVRecorder activate] dozens of times. Here’s what it does:
- some setup..
- Calls AudioFileGetGlobalInfo with kAudioFileGlobalInfo_AvailableStreamDescriptionsForFormat, probably on a randomly named .amr file
- If that comes back ok, it gets the current runloop and calls AudioQueueNewInput. It passes along the amr asbd from the previous step, a callback proc & user data, the current runloop, and an unknown run mode (I think it’s the default one but I can’t confirm.)
- If that was OK, it calls AudioQueueGetProperty with kAudioConverterCurrentOutputStreamDescription (’acod’ at least; this checks that the queue took the ASBD you set it to and updates it slightly if necessary) and then another AudioQueueGetProperty with kAudioQueueDeviceProperty_NumberChannels. (I guess if the pre-existing queue supported stereo the iPhone could record that? Are there Bluetooth stereo mics yet?)
- It then calls AudioQueueAllocateBuffer three times (three buffers) with a 0x1000 buffer byte size. (That’s 4096 bytes of AMR packed data. The AMR data rate is roughly 8 kbit/s, or about 1000 bytes/sec, so each buffer holds roughly four seconds of audio; these buffers are relatively long.)
- Then we’ve got the AudioQueueStart … the timestamp param is set to 0.
- [AVRecorder start] creates the amr file and enqueues the first buffer, which calls the AQ record callback (the AVRecord callback strangely is an ObjC function in AVRecorder_Private.) The end of the AQ record callback enqueues the next buffer, and so on and so on.
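The enqueue/callback cycle in the last step can be modeled without any Apple headers. The sketch below is a toy simulation in plain C, not the AudioQueue API: ToyQueue, toy_enqueue, toy_record_callback and toy_deliver are all invented names, and the simulated hardware simply fills each pending buffer completely.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BUFFERS  3       /* three buffers, as in the disassembly */
#define BUFFER_BYTES 0x1000  /* 4096 bytes each */

/* Toy stand-in for the queue plus the file being written. */
typedef struct {
    int    pending[NUM_BUFFERS];  /* FIFO of buffer indices waiting for data */
    int    head;                  /* position of the oldest pending entry */
    int    count;                 /* number of pending entries */
    size_t bytes_written;         /* stand-in for the .amr file */
    int    callbacks;             /* how many times the callback has run */
} ToyQueue;

/* Hand buffer `index` to the queue so it can be filled. */
void toy_enqueue(ToyQueue *q, int index)
{
    assert(q != NULL && q->count < NUM_BUFFERS);
    q->pending[(q->head + q->count) % NUM_BUFFERS] = index;
    q->count++;
}

/* Record callback: "write" the filled buffer, then enqueue the next one. */
void toy_record_callback(ToyQueue *q, int index, size_t bytes_filled)
{
    q->bytes_written += bytes_filled;
    q->callbacks++;
    toy_enqueue(q, (index + 1) % NUM_BUFFERS);
}

/* Simulate the hardware filling the oldest pending buffer and firing the
 * callback. Returns the buffer index that was filled, or -1 if none waited. */
int toy_deliver(ToyQueue *q)
{
    if (q->count == 0) return -1;
    int index = q->pending[q->head];
    q->head = (q->head + 1) % NUM_BUFFERS;
    q->count--;
    toy_record_callback(q, index, BUFFER_BYTES);
    return index;
}
```

Starting from a zeroed ToyQueue, enqueueing buffer 0 once and then calling toy_deliver repeatedly walks through buffers 0, 1, 2, 0 and so on. The point of the model is that the callback is the only thing keeping the queue fed after the first enqueue.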
Current work is focused on getting something other than silent buffers in our own callback.
(NB: I have never seen my battery drain so fast as when I forgot this code was running and it went for an hour)
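One small debugging aid for the silent-buffer problem is to measure the peak level of each incoming buffer rather than eyeballing the data. This sketch assumes the input queue delivers 16-bit signed linear PCM; peak_abs_s16 is a hypothetical helper, not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest absolute sample value in a buffer of 16-bit PCM.
 * A result of 0 means the buffer is pure digital silence. */
int32_t peak_abs_s16(const int16_t *samples, size_t count)
{
    assert(samples != NULL || count == 0);

    int32_t peak = 0;
    for (size_t i = 0; i < count; i++) {
        int32_t v = samples[i];
        if (v < 0) v = -v;  /* widened to 32 bits, so -(-32768) still fits */
        if (v > peak) peak = v;
    }
    return peak;
}
```

Logging this value from the record callback would show whether the buffers are exactly zero or just very quiet, which are probably two different failures.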