Sunday, June 15, 2014

An Internet music player

Circuit Cellar is running a design challenge using the Wiz550io module. I applied mainly to see if I could get one of the modules to try out. The deadline is 3rd August. I really haven't left enough time, but I am trying to enhance a plucked string synthesiser app I wrote for the GA144 so that it outputs streamed MP3. This stream is then fed to the Wiz550io and voila! I have an Internet music player streaming beautiful meditation music.

It seemed quite simple at first. Then I looked at how one encodes MP3 frames and I'm starting to baulk.

The idea is to read 1152 16-bit, 44.1 kHz PCM samples, split the samples into 32 frequency bands, run an FFT over the samples to work out which bands are dominant and use a psycho-acoustic model to select how much of each band to put into the output frame. Add in a Huffman encoder and we are left with a lot of code and a lot of RAM usage.
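To get a feel for the sizes involved, here is a rough back-of-envelope sketch of the frame arithmetic. The 128 kbit/s bitrate and the mono assumption are mine, chosen only for illustration; the split into two granules per frame and the 144 * bitrate / sample rate frame-size rule are the standard MPEG-1 Layer III figures.

```python
# Frame arithmetic for one MPEG-1 Layer III frame (illustrative only).
SAMPLE_RATE_HZ = 44_100
SAMPLES_PER_FRAME = 1152
BYTES_PER_SAMPLE = 2        # 16-bit PCM, mono assumed
SUBBANDS = 32
GRANULES_PER_FRAME = 2      # Layer III splits each frame into two granules
BITRATE_BPS = 128_000       # assumption: not specified in this post

samples_per_granule = SAMPLES_PER_FRAME // GRANULES_PER_FRAME
samples_per_subband = samples_per_granule // SUBBANDS
pcm_bytes_per_frame = SAMPLES_PER_FRAME * BYTES_PER_SAMPLE
# Encoded frame size in bytes, before the optional padding byte.
mp3_bytes_per_frame = 144 * BITRATE_BPS // SAMPLE_RATE_HZ

print("samples per granule:", samples_per_granule)            # 576
print("samples per subband per granule:", samples_per_subband)  # 18
print("PCM bytes in per frame:", pcm_bytes_per_frame)          # 2304
print("MP3 bytes out per frame:", mp3_bytes_per_frame)         # 417
```

So at that assumed bitrate roughly 2.3 KB of PCM goes in and a little over 400 bytes comes out per frame.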

The question I can't answer yet is whether it is possible to fit it all into a GA144 alongside the synthesiser.

So far I've worked out that I need two sets of samples, 1152 for the current frame and 1152 for the previous frame. This allows better prediction of band energies. As it happens 1152 is 18 * 64, so one frame fits precisely into the RAM of one row of the GA144 (18 nodes of 64 words each).

So I have the synth taking 18 nodes (2 x 1/2 rows) and the samples taking 2 rows, leaving 5 rows of nodes to encode the huge number of calculation constants and transforms. I need the SPI interface in node 705 to initially load the code from flash and then to output the generated MP3 frames to the W5500. At this stage I don't need any additional RAM/ROM but I could throw the samples and the constants into external RAM if necessary.
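The node budget can be checked with a few lines of arithmetic. This is only a sketch: it treats the GA144 as 8 rows of 18 nodes with 64 words of RAM each, and it counts every RAM word in a sample node as sample storage, which is optimistic because any code running in those nodes would compete for the same 64 words.

```python
# Rough GA144 node budget for the layout described above.
ROWS, COLS = 8, 18          # GA144: 8 rows of 18 F18 nodes
WORDS_PER_NODE = 64         # RAM words per node
SAMPLES_PER_FRAME = 1152    # one 16-bit sample per 18-bit word
FRAMES_BUFFERED = 2         # current frame plus previous frame
SYNTH_NODES = 18            # two half rows: 609-617 and 709-717

total_nodes = ROWS * COLS
nodes_per_frame = SAMPLES_PER_FRAME // WORDS_PER_NODE
sample_nodes = nodes_per_frame * FRAMES_BUFFERED
free_nodes = total_nodes - SYNTH_NODES - sample_nodes
free_rows = free_nodes / COLS

print("nodes per frame of samples:", nodes_per_frame)  # 18, i.e. one row
print("nodes holding samples:", sample_nodes)          # 36, i.e. two rows
print("nodes left for the encoder:", free_nodes)       # 90
print("equivalent rows left:", free_rows)              # 5.0
```

The 90 free nodes include node 705, so the encoder really has a little less than 5 full rows once the SPI interface is accounted for.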

I think the F18 nodes will be fast enough. This is only for audio, not video or radio.

I am attempting to port the LAME encoder, which is written in C, to ArrayForth.

Some subprojects I will need to implement on the GA144:

  1. Move synth nodes to top of chip (609-617, 709-717) and verify it works.
  2. Load 1152 samples out of synth into RAM of 18 nodes. Easy enough to verify in Sim. Model the transport on the MD5 hash encoder example.
  3. Translate all the required constants in LAME code into floating point equivalents and load them into nodes (how many?). Easy enough to use Perl for the calcs.
  4. Implement fft_long and fft_short and test. Will also need to tool up the LAME code for single-step debugging so I can verify results.
  5. Implement window type selection based on FFT analysis.
  6. Implement mdct*. This might need cosine calc/lookup as used in (co)sine synth from previous project.
  7. Implement Huffman encoder. Needs a table of lookup vals.
  8. Create output frame from encoded data plus side info.
  9. Output frame to W5500 and verify Internet transmission. A stand-alone test could be to output the same frame repeatedly, which presumably would play the same note(s) over and over. Each frame represents about 26ms of sound (1152 samples at 44.1 kHz) so it would be very short.
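As a rough sketch of how items 3 and 6 might fit together, the snippet below precomputes the cosine table for the textbook 36-point long-block MDCT and packs each value into an 18-bit word. The fixed-point layout (sign bit, one integer bit, 16 fraction bits) and the helper names `to_fixed` and `from_fixed` are assumptions made for illustration, on the grounds that the F18 has no floating-point hardware; LAME's own mdct code is organised differently, and this uses Python rather than Perl.

```python
import math

WORD_BITS = 18   # F18 word width
FRAC_BITS = 16   # assumed layout: sign bit, 1 integer bit, 16 fraction bits

def to_fixed(x, frac_bits=FRAC_BITS, word_bits=WORD_BITS):
    """Round x to fixed point; return it as an unsigned two's-complement word."""
    scaled = round(x * (1 << frac_bits))
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    if not lo <= scaled <= hi:
        raise ValueError(f"{x} does not fit in {word_bits} bits")
    return scaled & ((1 << word_bits) - 1)

def from_fixed(word, frac_bits=FRAC_BITS, word_bits=WORD_BITS):
    """Inverse of to_fixed, used here only to check the rounding error."""
    if word >= 1 << (word_bits - 1):
        word -= 1 << word_bits
    return word / (1 << frac_bits)

# Textbook long-block MDCT: 36 inputs (n), 18 outputs (k).
N = 36
mdct_cos = [math.cos(math.pi / (2 * N) * (2 * n + 1 + N // 2) * (2 * k + 1))
            for k in range(N // 2) for n in range(N)]
mdct_table = [to_fixed(c) for c in mdct_cos]

# How many 64-word nodes an unoptimised table would occupy.
nodes_needed = math.ceil(len(mdct_table) / 64)
print(len(mdct_table), "entries,", nodes_needed, "nodes")
```

A full table like this is 648 words, so in practice symmetry in the cosines, or the cosine calc from the earlier synth project, would be needed to shrink it.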
