The presentation by Adobe’s Jim Corbett on Alchemy at the Adobe MAX 2008 conference in Milan last Tuesday (2nd of December), got me thinking about something I used to spend quite some time on back in the early 90’s, when I studied Sonology at the Kon. Conservatorium in Den Haag: spectral analysis & (re)synthesis of sound. This is usually done by means of the Fourier Transform, which transforms time-based data such as sound to the frequency domain. The most used algorithm for this is the Fast Fourier Transform, or FFT, which requires the number of samples per transformation to be a power of 2. Usually a block of 1024 samples is used, which equals roughly 50ms of sound at 44.1kHz.

Now the FFT algorithm is quite CPU-intensive, and as Alchemy is put forward as useful for large number-crunching operations, I thought this could be a very interesting case. In order to allow for proper comparison, I decided to first port the FFT code from C to Actionscript, and do some testing. It took some refactoring of the C-code in Actionscript to allow for specific bottlenecks popping up, such as multiple lookup of the same value from a Vector, which turns out to be an expensive operation. Also the calculation of sine & cosine was done beforehand, to allow for a lookup table to be used in the actual FFT algorithm. All in all I managed to gain some 50% speed increase by these optimizations.

To my surprise, a single transformation of one block of 1024 samples takes only 5 ms in Actionscript. Even when performing the necessary after-calculations (normalizing & getting the lenght of vectors, which requires Math.sqrt), a back-and-forth transformation shouldn’t take more than 10 ms, leaving 40ms for doing fun stuff with the data. So even without a C-port it’s possible to do real time audio transformation in the frequency domain.

The first thing to do is plot the data in an image. Right now I’m only posting a screenshot, as I’d like to polish the example a bit more before putting out the actual code, but this is what a frequency plot of a short piece of music looks like:

More about this (including source code) soon!

4 Dec, 2008, 14:01 o'clock

Add your own comment or set a trackback

Currently 9 comments

  1. Comment by stephan

    Sources are available, see this blog post. I haven’t had time to look further into Alchemy yet, but it still is on my wanted list. Pixel Bender might be an option, but the lack of loops might make its use very limited, as that would make it only suitable for short samples. But it’s worth looking at, sure!

  2. Comment by jkozniewski

    Hi,

    You’ve mentioned Alchemy as an option of
    computing FFT – have you made any tests if
    it’s much faster than pure as3 ?
    I’m wondering if PixelBender may be helpful here
    i know that lack of loops support etc. would be a problem
    but hm…. any thoughts on this πŸ™‚ ?
    Are you planning to release sources any time soon ?
    Regards !

  3. Comment by stephan

    Hamming window is next on the todo list πŸ™‚ To do 2-way FFT (with fun stuff in the middle) you have to do overlapping Hamming windows, otherwise you get an extra sound that can be quite nasty. The frequency of that added sound is the same as that of the base frequency of the FFT, so usually around 50Hz, and shaped like a square or a pulse. So basically you get a low sharp buzz extra. Problem is only that this requires the double number of transforms, so it’s twice as hard on the CPU. With my current findings that still leaves 20ms to do fun stuff. The Hamming window itself isn’t hard on the processor, as the curve can be precalculated and applied in a single run over one buffer.

  4. Comment by vFragosop

    Hmm.. that’s interesting..

    I was talking to Ben Houston, the responsable for FFT algorithms in Math.NET library, and he suggested me the Hamming Function.

    I don’t if Adobe took one of those Window functions in care. It’s an interesting topic anyway.

    [Wiki has a great article about Hamming].

  5. Comment by stephan

    Actually that’s not a mistake, but the standard way a Fourier Transform works. My code does exactly the same thing, except that it’s more flexible in choice of resolution over time vs. frequency. The linear nature of the Fourier Transform (each ‘slot’ takes up 1 frequency that is a multiple of the base frequency) is what makes it less suitable for accurately analysing audible content, as the human ear functions logarithmically. However it’s not very difficult to change the code such that it groups frequencies, with larger groups towards the higher frequencies. It’s basically a kind of post processing that is not very CPU-intensive. It does reduce the resolution in high frequencies, but as you say, that is already very high anyway. Unfortunately it doesn’t increase the resolution in low frequencies.
    One way to increase the resolution in low frequencies, is by increasing the sample size. As the base frequency is a factor of the sample size, it will get half for each doubling of the sample size. However, that implies also halving the time resolution. So the more we wish to know about which frequencies occur in a sound, the less we know about when those frequencies occur. It’s a direct result of Heisenberg’s Uncertainty Principle.
    In mathematics there’s another transform, called Wavelet Transform, that is a lot less well known than the Fourier Transform, but that does much better justice to the logarithmic nature of sound. There’s a number of explanations about that on the net, but they’re mostly quite mathematical, and haven’t found much application outside of science yet. So it seems like we’re stuck with Fourier for the time being.

  6. Comment by vFragosop

    That’s nice news! Thanks for reply.

    One big mistake Adobe made, at least in my opinion, is dividing all the frequencies in 256 equal groups of N frequencies.
    Normally, spectrum analyzers grows exponentially the N number of frequencies inside the groups.

    0-20, 20-60, 60-100, 100-180, 180-320, 320-640… and so on..

    This way, the graph looks more natural, since there’s just a few low frequencies and thousands of high frequencies.

  7. Comment by stephan

    I haven’t been adding much to it lately, but I’ve cleaned up the code, so I can upload the sources. Stay tuned, coming up soon!

  8. Comment by vFragosop

    Well, it’s just what i’m looking for.
    Since computeSpectrum curves aren’t precise, having a decent fft code would be really great.
    Hoping you’re still working on that. I could do it for myself, but my math skills are the same as an eight year old boy.
    Please, let me know about any progress.

  9. Comment by Shui

    This is exactly what I was planning to do.
    I’m looking forward to your results.

Add your own comment



Follow comments according to this article through a RSS 2.0 feed

Twitter

Flickr

www.flickr.com
This is a Flickr badge showing public photos from AcidCats. Make your own badge here.