Small XMOS-based Guitar Effects Board

Started by markseel, March 15, 2015, 10:00:55 PM

markseel

I'm making some small DSP boards and writing support software for audio related projects.

First project is a small board for digital guitar effects.  I plan on using KickStarter to pay for having the boards built by a contract manufacturer (so I don't have to build them by hand).  

Hardware:

1) Very small PCB - 1.6" x 0.8" - requires clean 5V for analog supply, 3.3V for digital supply
2) Stereo inputs and stereo outputs, 24bit sampling at 192 kHz (uses the AK4556 stereo CODEC)
3) Inputs are boosted (about x8 gain) - soft clipped to prevent exceeding allowable peak-to-peak CODEC input voltage.
4) 32/64-bit fixed-point DSP via the XMOS MCU (500 MIPS, up to 1000 MIPS in dual-issue mode, single-cycle multiply/accumulate, 256 KB RAM)
5) UART interface for controlling effects parameters
6) JTAG interface - you can use the free XMOS toolchain (xTIMEcomposer) to write your own programs and download/debug/flash



Software:

1) Source code for signal processing (FIR's, IIR's, BiQuads)
2) Source code for I/O peripherals (I2C, SPI, I2S/TDM, UART, ADAT, SPDIF)
3) Source code for example effects (overdrive, graphic EQ, chorus, etc)
4) Source code for USB audio class 2.0
5) http://xminimalisms.appspot.com/documentation.html

The software collection contains the effects application that's preloaded on this board.  The preloaded software, which supports multi-effects (overdrive, EQ, chorus, etc), can be controlled via the serial interface - either UART or I2C.  This way you can control the effects from a PC (using a USB to serial adapter), from an Arduino or Raspberry Pi, or from a Bluetooth Low Energy module (such as the Bluegiga BLE112/113, which supports application development and UART/I2C outputs).  With BLE one could write a PC or iPad/iPhone/Android application and use the device's Bluetooth LE capabilities to control the effects.

There's also a JTAG header (6-pin 50 mil) that you can use to attach the XMOS JTAG adapter (called the XTAG-2) to download/debug your own programs.  XMOS tools and support libraries are free to download from www.xmos.com.

More updates coming soon  :)

Digital Larry

Hey Mark,

Looks awesome!

Looking forward to updates.

DL
Digital Larry
Want to quickly design your own effects patches for the Spin FV-1 DSP chip?
https://github.com/HolyCityAudio/SpinCAD-Designer

Ice-9

Excellent Mark, this looks really interesting and I will follow the progress. If it's allowed, maybe put the Kickstarter link in the thread here.  :icon_cool:
www.stanleyfx.co.uk

Sanity: doing the same thing over and over again and expecting the same result. Mick Taylor

Please at least have 1 forum post before sending me a PM demanding something.

yeraym

I'm very interested in this. Count me in  ;D

micromegas

I dig the idea. I already subscribed to this topic  ;).

Don't know a lot about XMOS (shame on me), but it seems to be an interesting platform.

I was about to spend some money on an OWL Programmable Pedal, but I'll hold it a little to see how the kickstarter goes.

Cheers.
Adán
Software Developer @ bela.io

markseel

Here's a short explanation of how the effects framework is organized.  This framework is for adding your own effects if you decide not to use the multi-effects that are preloaded and if you don't fancy programming the XMOS board from scratch.

The effects framework is a multi-threaded application that handles moving data over the I2S bus, responding to external parameter control (via the UART/I2C interface), and marshaling data to each of the data processing loops.

There are eight threads created when the framework application boots up (note that applications are stored in FLASH and transferred to the MCU's RAM by the MCU's boot loader):

Thread 1 - Application thread: Receives data from the UART thread and sends parameter id/values to each data processing thread.
Thread 2 - I2S thread: Implements the I2S interface for the CODEC (ADC/DAC).  Sends data to processing loop 1 and receives data from processing loop 5.
Thread 3 - Processing loop 1: Contains your own DSP algorithms. Receives data from the I2S thread and sends data to processing loop 2.
Thread 4 - Processing loop 2: Contains your own DSP algorithms. Receives data from processing loop 1 and sends data to processing loop 3.
Thread 5 - Processing loop 3: Contains your own DSP algorithms. Receives data from processing loop 2 and sends data to processing loop 4.
Thread 6 - Processing loop 4: Contains your own DSP algorithms. Receives data from processing loop 3 and sends data to processing loop 5.
Thread 7 - Processing loop 5: Contains your own DSP algorithms. Receives data from processing loop 4 and sends data to the I2S thread.
Thread 8 - UART/I2C thread: Implements the UART/I2C interface.  Received data is sent to the application thread.

The effects framework defines these functions that the effects programmer must implement:

 void processing_loop1( int* sample_L, int* sample_R, int param_id, int param_val );
 void processing_loop2( int* sample_L, int* sample_R, int param_id, int param_val );
 void processing_loop3( int* sample_L, int* sample_R, int param_id, int param_val );
 void processing_loop4( int* sample_L, int* sample_R, int param_id, int param_val );
 void processing_loop5( int* sample_L, int* sample_R, int param_id, int param_val );

These functions get called once for each audio cycle (192 kHz).  The framework moves audio data from the I2S thread to each processing loop and back to the I2S thread automatically.  The audio samples arrive at and exit from the processing loop via the pointer variables sample_L and sample_R.  If a parameter arrives from the UART/I2C thread then the application thread will distribute the parameter ID/value to each processing loop on the next audio cycle after the UART/I2C data input completes.  From the perspective of each processing loop (1 through 5) the param_id will be non-zero if a new parameter has arrived, or zero if no parameter activity has occurred.
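To make the per-cycle data flow concrete, here is a minimal host-side sketch of one audio cycle. It is only a model, not the framework's code: on the board the loops run on separate cores and pass samples between threads, while this version simply calls five stages in order. The names `run_audio_cycle`, `stage_halve_left` and `stage_passthrough` are made up for the illustration; only the function signature comes from the post.

```c
#include <stddef.h>

/* Signature the framework expects from each processing loop (from the post). */
typedef void (*processing_loop_fn)(int *sample_L, int *sample_R,
                                   int param_id, int param_val);

/* Hypothetical stage: halves the left channel, leaves the right alone. */
void stage_halve_left(int *sample_L, int *sample_R, int param_id, int param_val)
{
    (void)sample_R; (void)param_id; (void)param_val;
    *sample_L /= 2;
}

/* Hypothetical stage: samples pass through unaffected. */
void stage_passthrough(int *sample_L, int *sample_R, int param_id, int param_val)
{
    (void)sample_L; (void)sample_R; (void)param_id; (void)param_val;
}

/* Sequential model of one audio cycle: the sample pair visits every stage
   in order, and every stage sees the same param_id/param_val
   (param_id == 0 means "no new parameter this cycle"). */
void run_audio_cycle(const processing_loop_fn stages[], size_t count,
                     int *sample_L, int *sample_R,
                     int param_id, int param_val)
{
    for (size_t i = 0; i < count; ++i) {
        stages[i](sample_L, sample_R, param_id, param_val);
    }
}
```

Filling a five-entry array with `stage_halve_left` followed by four `stage_passthrough` entries and calling `run_audio_cycle` once per sample pair mimics the ADC to DAC path described above, which can be handy for checking an algorithm on a PC before moving it to the board.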

Here's an example of a simple application organized as follows:

1) Processing loop 1 - blocks the DC component in the audio signal for the left channel, leaves the right channel unaffected.
2) Processing loop 2 - Does nothing.  Samples pass through this loop unaffected.
3) Processing loop 3 - Does nothing.  Samples pass through this loop unaffected.
4) Processing loop 4 - Does nothing.  Samples pass through this loop unaffected.
5) Processing loop 5 - Performs a low-pass FIR filter on the left channel.


static int loop1_dcblock_q25( int sample_q25 )
{
   {   // Differentiator: y[n] = x[n] - x[n-1]
       int temp = sample_q25; static int prev_sample = 0;
       sample_q25 = sample_q25 - prev_sample; prev_sample = temp;
   }  
   {   // Leaky Integrator: y[n] = pole * y[n-1] + x[n], pole = 0.998
       static int prev_result = 0;
       sample_q25 = prev_result = xsignalproc_mult_r25x25y31(prev_result,Q31(0.998)) + sample_q25;
   }
   return sample_q25;
}

void processing_loop1( int* sample_L, int* sample_R, int param_id, int param_val )
{
   // Left channel only; the right channel passes through untouched.
   *sample_L /= 64; // Q31 --> Q25
   *sample_L = loop1_dcblock_q25( *sample_L );
}

void processing_loop2( int* sample_L, int* sample_R, int param_id, int param_val )
{
}

void processing_loop3( int* sample_L, int* sample_R, int param_id, int param_val )
{
}

void processing_loop4( int* sample_L, int* sample_R, int param_id, int param_val )
{
}

static int loop5_antialias( int sample_q25 )
{
   static int coef[] = { Q31(+0.000063008965),Q31(+0.002800992631),Q31(+0.026655281619),
                         Q31(+0.104794330580),Q31(+0.223294720691),Q31(+0.284783331029),
                         Q31(+0.223294720691),Q31(+0.104794330580),Q31(+0.026655281619),
                         Q31(+0.002800992631),Q31(+0.000063008965) };
   static int hist[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0};
   return xsignalproc_fir_n10_r25x25c31( sample_q25, coef, hist );
}

void processing_loop5( int* sample_L, int* sample_R, int param_id, int param_val )
{
   *sample_L = loop5_antialias( *sample_L );
   *sample_L *= 64; // Q25 --> Q31
}
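The example relies on `Q31()` and `xsignalproc_mult_r25x25y31()` from the author's library, which aren't shown in the post. The sketch below is a guess at what such helpers could look like in portable C, so the DC blocker can be tried on a PC. `Q31_SKETCH`, `mult_q25_q31`, `dcblock_state` and `dcblock_q25` are hypothetical names, and the rounding behaviour of the real library may differ.

```c
#include <stdint.h>

/* Assumed meaning of Q31(x): a real number in [-1, 1) scaled to a 32-bit
   integer with 31 fractional bits. Not valid for x == 1.0 (overflow). */
#define Q31_SKETCH(x) ((int32_t)((x) * 2147483648.0))

/* Assumed behaviour of a Q25 x Q31 -> Q25 multiply: take the 64-bit
   product and drop the 31 fractional bits contributed by the Q31 operand.
   Relies on arithmetic right shift for negative values, which mainstream
   compilers provide. */
int32_t mult_q25_q31(int32_t a_q25, int32_t b_q31)
{
    return (int32_t)(((int64_t)a_q25 * (int64_t)b_q31) >> 31);
}

/* Filter state kept in a struct instead of function-local statics. */
typedef struct {
    int32_t prev_in;   /* x[n-1] for the differentiator */
    int32_t prev_out;  /* y[n-1] for the leaky integrator */
} dcblock_state;

/* DC blocker: differentiator followed by a leaky integrator (pole 0.998). */
int32_t dcblock_q25(dcblock_state *s, int32_t x_q25)
{
    int32_t diff = x_q25 - s->prev_in;   /* d[n] = x[n] - x[n-1] */
    s->prev_in = x_q25;
    s->prev_out = mult_q25_q31(s->prev_out, Q31_SKETCH(0.998)) + diff;
    return s->prev_out;                  /* y[n] = 0.998 * y[n-1] + d[n] */
}
```

Feeding a constant (pure DC) input into `dcblock_q25` gives a first output equal to the input, after which the output decays toward zero, which is the behaviour you'd expect from a DC blocker.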


Well, hope this makes some sense!  Let me know if it's confusing or if you have questions :-)

Digital Larry

Hi Mark,

As you might imagine, the first thing that pops into my mind is how I can adapt my FX CAD program (you know the one) to this system.  Wouldn't that be rad?

When targeting the FV-1, SpinCAD takes your patch model (a bunch of blocks connected together) and sorts it into a topological order so that everything that is upstream gets calculated before everything that is downstream.  For the XMOS CPU, beyond that there's the idea of having to lay the code out into one of five slots, similarly putting them into order.  I suppose I could start at the end and put block code into a slot until the next one wouldn't fit, then move on up the chain until I run out of code, or space, which would trigger an error.

Are all these functions typically written in C code?  Can code in any slot access arbitrary locations in RAM or CPU registers?

It also looks like the code in each slot has to have only one pair of inputs and one pair of outputs.  Is that true?  That might complicate things a bit.

OK one more question - the parameters come in from a queue, which is fine as sampling rate is way faster than you need to track control changes.  So then I presume that each slot that needs to use a particular control parameter keeps a local copy of it, so that it can be used as part of the current calculation pass?  Think I'm getting confused here.  Explain what would happen if I have 5 control knobs spewing out bytes on a UART as fast as I can - which is still way slower than each sample time.

Thx,

DL

free electron

Very nice project Mark! Definitely interested in it!
I imagine it could be a nice platform to build a DSP based cabinet simulator with preloadable/editable filter responses.
Spotted something on your PCB:

Looks like these decoupling caps (?) are not connected to anything but GND.
Making it a single side loaded board would significantly reduce the manufacturing cost. The boards would have to pass the production line only once. Of course assuming the board size is not restricted to what it is now.
Anyway, looking forward to updates! I really need to get into these XMOS chips! Little versatile beasts! :)

/Piotr

markseel

Hi DigitalLarry - Great questions!

It's unfortunate that the MIPS have to be divided between threads but that's the only way to utilize all the MIPS of the XMOS part.  The xCORE architecture requires processing power to be split up across cores.  So I see your point - it will be a bit of a challenge to efficiently partition the signal processing nodes across the five threads.  Your idea of mapping SpinCAD's processing blocks (which seem to be organized as a single data flow path) to processing threads (which are also organized as a single data flow path), by filling up successive threads with processing block algorithms/functions until a thread's computational capacity is maxed out, seems like the general approach to take.  A challenge is knowing when a processing thread is 'maxed out' so that mapping can proceed on to the next thread.  The XMOS compiler has the capability of performing cycle-accurate static analysis - perhaps this could be used to signal when a collection of code exceeds a thread's MIPS allocation (XMOS cores/threads share the total 500 MIPS, are guaranteed 62.5 MIPS each, and cannot exceed 125 MIPS).
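As a rough illustration of that mapping idea: at 62.5 MIPS and a 192 kHz sample rate, a thread gets about 325 instructions per sample (62,500,000 / 192,000), or about 651 at the 125 MIPS ceiling. The sketch below packs an ordered chain of blocks into threads greedily against such a budget. The per-block instruction counts would have to come from static analysis or measurement; `pack_blocks` and the constants are made up for this example, and the budget ignores whatever the framework itself spends moving samples between threads.

```c
#include <stddef.h>

#define SAMPLE_RATE_HZ   192000L    /* audio cycle rate from the post */
#define GUARANTEED_IPS   62500000L  /* 62.5 MIPS guaranteed per thread */
#define NUM_PROC_THREADS 5          /* processing loops 1 through 5 */

/* Whole instructions available to one thread per audio sample. */
long instructions_per_sample(long thread_ips, long sample_rate_hz)
{
    return thread_ips / sample_rate_hz;
}

/* Greedy in-order packing. block_cost[i] is the (hypothetical) instruction
   count per sample of block i, listed in signal-flow order. Each block is
   placed in the current thread if it still fits, otherwise in the next one.
   Writes the chosen thread index into assignment[i]. Returns the number of
   threads used, or -1 if the chain does not fit. */
int pack_blocks(const long block_cost[], size_t num_blocks, long budget,
                int max_threads, int assignment[])
{
    int thread = 0;
    long used = 0;

    for (size_t i = 0; i < num_blocks; ++i) {
        if (block_cost[i] > budget) {
            return -1;              /* a single block is too big for any thread */
        }
        if (used + block_cost[i] > budget) {
            ++thread;               /* current thread is full, move on */
            used = 0;
            if (thread >= max_threads) {
                return -1;          /* ran out of processing threads */
            }
        }
        assignment[i] = thread;
        used += block_cost[i];
    }
    return (num_blocks > 0) ? thread + 1 : 0;
}
```

With block costs of 200, 100, 50, 300 and 325 instructions and a 325-instruction budget, the blocks land in threads 0, 0, 1, 2 and 3. Greedy packing keeps the chain order intact, which matches the single data flow path, but it can leave slack in a thread that a smarter partitioner would use.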

Yes, all of the functions are written in 'C' code (or maybe XC, which may produce tighter code because 'C' is harder to optimize).  Each processing thread is simply a while(1) loop that runs on an XMOS core (there are 8 cores on this MCU).  Each core has access to all RAM and MCU control registers.  If two cores attempt to access the same resource a hardware exception likely occurs.  Programming in XC prevents this for the most part - it's intended as a safer language for concurrent programming than 'C'.  So to your question: yes, each thread has full/arbitrary access to RAM and registers.

You're correct; each processing thread receives a pair of samples and returns a pair of samples:
ADC ---(left/right sample) --> Thread1 --(left/right sample)-->... Thread5  --(left/right sample)--> DAC
Are you thinking we should support more channels or support a more complex data flow arrangement?

Right, the parameters come in from a queue that buffers received UART/I2C data.  Good observation about rates - the parameter rate can't exceed the sample rate.  So I'll have to make sure that the host that feeds parameters is throttled or flow-controlled properly (different ways to do this for UART vs I2C).  I intentionally kept the parameters model simple.  There's a mechanism for moving the parameters from a host to all threads but that's about it.  Each thread can observe and keep copies of parameter data if it makes sense for it to do so.
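One way a processing loop could "keep a copy" of a parameter is sketched below. The loop latches the value whenever its own ID shows up and keeps using it on every later audio cycle, so several knobs sending data quickly just mean the latched values get refreshed more often. `PARAM_ID_GAIN`, the percent scaling and the function name are all invented for the illustration; the framework doesn't define them.

```c
/* Hypothetical parameter ID; real IDs would be whatever the host sends. */
#define PARAM_ID_GAIN 1

/* Local copy of the last gain value seen by this loop, in percent. */
static int loop_gain_percent = 100;

/* Same signature as the framework's processing loops. */
void processing_loop_gain(int *sample_L, int *sample_R,
                          int param_id, int param_val)
{
    /* param_id is zero on most cycles; latch only our own parameter and
       ignore IDs meant for other loops. */
    if (param_id == PARAM_ID_GAIN) {
        loop_gain_percent = param_val;
    }

    /* Apply the latched gain every cycle, using a 64-bit intermediate so
       the multiply cannot overflow a 32-bit sample. */
    *sample_L = (int)(((long long)*sample_L * loop_gain_percent) / 100);
    *sample_R = (int)(((long long)*sample_R * loop_gain_percent) / 100);
}
```

A cycle with `param_id == 0` leaves the gain where it was, a cycle carrying `PARAM_ID_GAIN` changes it from that sample onward, and the new value stays in effect until the next update.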

markseel

Hi free electron - nice catch!  Those two parts form the RC low-pass filter for the MCU's PLL - I forgot to hook them up  :icon_redface:  I'll fix that!!

I agree - this part should not have a problem doing CAB sims.  How does CAB simulation work?  Is it fundamentally an FIR?  If so, how big are the FIR's?  I think I've seen cab simulation also refer to non-linear modeling and modeling of reflections.  Not sure if the latter is a bit like modeling echoes and/or reverb but if so there should be enough MIPS/RAM for this, agree?

To your point about XMOS micros being flexible little beasts;  agreed - they really are quite powerful.  The current architecture has been around for many years and still offers a competitive 500 MIPs and single-cycle 32/64 bit MAC on a single tile.  Some parts have two tiles in one package so these figures then double.  A refresh of the part is coming out this summer that adds dual issue instruction dispatching and a few more instructions.  The I/O's on this part are completely unique as far as MCU's go - no I2S/I2C/SPI/UART peripherals.  Just C/XC code that manages flexible I/O ports that support serialization, timed input/output, time stamping, etc.  It takes some time to learn how to make them go but it's worth it  :)

Digital Larry

Hi Mark,

SpinCAD represents functional blocks with attributes and er, "whatnot" that are very specific to the FV-1 architecture.  Given that I built it on top of Andrew Kilpatrick's ElmGen, an FV-1 simulation and abstraction of the ASM into Java, that's not a big surprise.  But the visual model is just a way to hook "things" together, and of course those things and what becomes of the whole enchilada, can be adapted to a different chip underneath.  I didn't say easily, mind you.

One thing I like about the FV-1 and SpinCAD is that it only constrains the model in terms of size (# of instructions, registers and available RAM) but the structure is pretty arbitrary.  So I don't spend much time making sounds that are like overdrive into chorus into delay.  I like splitting things into three frequency bands and doing weird things in each band and then mixing them all together in parallel right before the output.  And just to be completely honest, what I don't like about the FV-1 is just the constraints of the chip - I'd like more memory, more instructions, more registers, more controls.  A gen 2 that had twice as much of everything would be amazing, but I really don't see that ever happening.

It's nice to not have any structural constraints and I'm pretty sure something could be done along those lines that would still be able to map arbitrary signal flow structures to the multi-core XMOS thing.  I have to start from a place of optimism.

There's no reason that the core I/O has to be limited to a single set of inputs and outputs.  Of course I'm just popping off here, but I think that more flexible structures, maybe linked lists, could be developed that have the ability to support multiple I/Os for each core.  Yes, there'd be some overhead associated with that.  And then there's the challenge of sizing things per core by MIPS... I thought it would just be code length but that sounds like it's overly simplistic.  A static analysis of MIPS requirements per block could be done ahead of time though, so it could be known at compile time.

The other thing I wonder about is the ability to simulate things on the PC, the way we can do pretty well (barring pitch shift, for the time being) with SpinCAD.  I lay all the praise with Andrew K. there.  I've looked at the simulator code and saw how he did it.  With a buttload of work is how he did it!   Maybe I missed something you already wrote (I realized we had nearly this same conversation a year or two ago!) - is there a way to simulate stuff on the PC?

Thanks,

DL

PS send a message next time you're in town for a trade show.

free electron

Quote from: markseel on March 17, 2015, 11:17:30 PM
I agree - this part should not have a problem doing CAB sims.  How does CAB simulation work?  Is it fundamentally an FIR?  If so, how big are the FIR's?  I think I've seen cab simulation also refer to non-linear modeling and modeling of reflections.  Not sure if the latter is a bit like modeling echoes and/or reverb but if so there should be enough MIPS/RAM for this, agree?
In most cases it's just a filter with a more complex frequency response. It can be implemented in a few ways. A very popular approach on the PC/DSP platform is to use a convolution with the impulse response of the cabinet.
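For reference, convolution with an impulse response is the same operation as a direct-form FIR whose coefficients are the IR samples. The sketch below is a generic fixed-point version with a circular history buffer, not the XMOS library routine; `fir_state`, `fir_process` and `fir_macs_per_second` are names chosen for the example, and it assumes the coefficient magnitudes sum to less than 1 so the 64-bit accumulator can't overflow.

```c
#include <stdint.h>
#include <stddef.h>

/* Direct-form FIR: y[n] = sum over k of h[k] * x[n-k]. */
typedef struct {
    const int32_t *coef_q31; /* impulse response samples, Q31 */
    int32_t *history;        /* num_taps past inputs, zero-initialised */
    size_t num_taps;
    size_t pos;              /* index of the most recent input */
} fir_state;

int32_t fir_process(fir_state *f, int32_t x)
{
    int64_t acc = 0;
    size_t idx;

    /* Store the new input over the oldest one. */
    f->pos = (f->pos + 1) % f->num_taps;
    f->history[f->pos] = x;

    /* Walk backwards through time: h[0] meets the newest sample. */
    idx = f->pos;
    for (size_t k = 0; k < f->num_taps; ++k) {
        acc += (int64_t)f->coef_q31[k] * f->history[idx];
        idx = (idx == 0) ? f->num_taps - 1 : idx - 1;
    }
    /* Drop the 31 fractional bits of the coefficients (assumes arithmetic
       right shift for negative sums). */
    return (int32_t)(acc >> 31);
}

/* Multiply-accumulates needed per second for one channel. */
long long fir_macs_per_second(long long num_taps, long long sample_rate_hz)
{
    return num_taps * sample_rate_hz;
}
```

The cost grows linearly with IR length: a hypothetical 1024-tap response run at 192 kHz needs about 197 million MACs per second per channel, versus about 49 million at 48 kHz, so the sample rate chosen for the cab block matters a lot against a 500 MIPS budget.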

Quote
To your point about XMOS micros being flexible little beasts;  agreed - they really are quite powerful.  The current architecture has been around for many years and still offers a competitive 500 MIPs and single-cycle 32/64 bit MAC on a single tile.  Some parts have two tiles in one package so these figures then double.  A refresh of the part is coming out this summer that adds dual issue instruction dispatching and a few more instructions.  The I/O's on this part are completely unique as far as MCU's go - no I2S/I2C/SPI/UART peripherals.  Just C/XC code that manages flexible I/O ports that support serialization, timed input/output, time stamping, etc.  It takes some time to learn how to make them go but it's worth it  :)
I guess it's time to get that startKIT and play around!
Btw, a non-fixed peripheral approach is not that unique. Cypress has been doing it for years, although in a slightly different way (8-bit/ARM core + configurable PLD).

markseel

The website has been updated - still a lot of work to do though.  :icon_eek:

http://xminimalisms.appspot.com/documentation.html

The first rev of the effects boards, with just the analog conditioning and audio CODEC, arrived.
I plan to have one board hand-built and up and running soon to verify good audio data flow.

scintillation

I came across these chips the other day and wondered if anyone had done any audio effects with them. I came here and realised the answer is yes.

The first thing that came to mind, and that I couldn't see in your post: what is the power consumption of your board?

markseel

Power Consumption (no USB, just I2S in/out and effects processing):  The XMOS MCU core, running at 1.0V, should consume less than 300mA under heavy loads (static + dynamic) - so let's say 280mA at 1.0V.  On the PCB is a FAN5358 (1.0V switcher with max input voltage of 5.5V) that should operate at about 90% efficiency with a decent inductor.  So the core would draw 280mA x 1.0V / 3.3V / 0.9 = about 94mA - call it 100mA at 3.3V to be safe.  Probably should add some mA for the 3.3V MCU I/O's to the CODEC.  So say 130mA total at 3.3V for the MCU portion of the PCB?  The CODEC runs at 3.3V and draws around 30mA.  Maybe another 5mA or 10mA for the analog signal conditioning circuits (op amps).  That's about 170mA at 3.3V for the whole system.  Assuming a 9V supply with a 3.3V switching regulator providing the 3.3V power at 80% efficiency, the current draw at 9V would be a bit less than 80mA.
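The estimate above boils down to one relationship applied twice: a switching regulator's input current is its output power divided by (input voltage x efficiency). A small sketch, using only the figures quoted in the post (`regulator_input_ma` is a name made up here):

```c
/* Input current of a switching regulator in mA:
   I_in = (V_out * I_out) / (V_in * efficiency). */
double regulator_input_ma(double v_out, double i_out_ma,
                          double v_in, double efficiency)
{
    return (v_out * i_out_ma) / (v_in * efficiency);
}
```

`regulator_input_ma(1.0, 280.0, 3.3, 0.9)` gives roughly 94 mA for the core as seen on the 3.3V rail, which the estimate rounds up to 100 mA. Adding the I/O, CODEC and op-amp allowances brings the 3.3V total to about 170 mA, and `regulator_input_ma(3.3, 170.0, 9.0, 0.8)` then gives roughly 78 mA at 9V, in line with the "a bit less than 80mA" figure.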

markseel

Version 2 of the board should be back in a few weeks.  Made changes: it now uses the next generation of the XMOS CPU (XUF208), adds a USB jack (for streaming audio to/from a USB host, like Windows or OS X), upgrades the ADC/DAC for better input dynamic range, removes the I2C interface (replaced by USB MIDI), and changes the analog sections for lower noise performance while maintaining full audio range response (-1 dB at 20 kHz).

I decided to remove the connections for I2C (for external control) and use USB MIDI instead.  The current firmware enumerates as both a USB audio class 2.0 device as well as a MIDI device (for effects control).  It also supports USB class compliant device firmware upgrade (DFU).

The stereo analog inputs are still high-impedance (for guitar) but incorporate a Sallen-Key low-pass followed by a first-order low-pass - input signal attenuation at 6.144 MHz (sample-rate of the ADC) is now around -120 dB.  Gain is two.  The stereo analog outputs also flow through a Sallen-Key buffer to remove any remaining digital noise.  Gain is 0.5.  The AK4556 was swapped out for an AK5386 ADC and an AK4384 DAC - better input dynamic range and digital control of the output level.

Note that although the inputs and outputs were designed for guitar effects use (with high impedance buffering and 2x gain at the inputs, 2x attenuation at the outputs) this board could be used for other stuff (line-level USB audio input/output with DSP, USB headphone driver, etc) by changing a few component values and using a high current output op-amp (to drive headphones).

The 8-pin SIP on the left side is for analog input and output, and for analog power supply (if not powered by the USB, which can be noisy).  The 16-pin DIP in the middle is for JTAG.  I'll update the website with the latest firmware next.  It supports DSP functions for effects and USB audio class 2.0 (2 input and 2 output at 192 kHz).



Digital Larry

Mark,

What's the likelihood of being able to generate code for this using Faust or PureData?

Thx,

DL

markseel

The board works OK.  Was able to route audio from PC over USB to CODEC, looped CODEC DAC outputs to CODEC ADC inputs, and back to PC over USB.  Asynchronous USB Audio Class 2.0 24-bit stereo at 192 kHz :-)
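For a sense of scale, the stream described above can be sized with a couple of lines of arithmetic. This sketch assumes 24-bit samples packed into 3 bytes (an implementation may instead pad each sample to 4 bytes) and a high-speed USB link, where isochronous data is scheduled in 125 us microframes, i.e. 8000 per second. The function names are made up for the example.

```c
/* Raw PCM payload in one direction, ignoring USB protocol overhead. */
long audio_bytes_per_second(long sample_rate_hz, int channels,
                            int bytes_per_sample)
{
    return sample_rate_hz * channels * bytes_per_sample;
}

/* Nominal sample frames carried in each 125 us high-speed microframe. */
long sample_frames_per_microframe(long sample_rate_hz)
{
    return sample_rate_hz / 8000;
}
```

For stereo at 192 kHz this comes to 1,152,000 bytes per second (about 9.2 Mbit/s) in each direction, and a nominal 24 sample frames per microframe. In asynchronous mode the actual count per microframe can vary slightly around that nominal value, since the device's clock sets the pace.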

Needs some changes though.  Don't like the analog section - the new one has -120 dB attenuation at 6.144 MHz (the sampling rate of the ADC) and is very low noise/distortion.  Added four GPIO pins.  Removed the USB connector in favor of a few header pins so the board can be used with or without USB, and with any of the variety of USB connectors (have to add it yourself).

Board v2 is 1.5" by 0.8" - going to send Gerbers out ASAP and should have boards back in a few weeks.  This next round of boards should be near final, though there may be a few last schematic tweaks - including changing out the DC-DC for the XMOS 1.0V core supply to a better part (the one I'm using is EOL).

Will update the software soon and start putting effects code out on the website.  Even without effects this would work as a nice guitar PC interface - the firmware already works for that.


mhelin

Mark,
Good to see you didn't stop this project after all. Also the new xCORE-200 part is nice, it's got 256 kB RAM which is enough even for a reverb.

Btw. AMI technologies (India) is listed as one of the XMOS partners (http://www.xmos.com/support/partners ).
Do you happen to know if the AppEngine SDK (http://www.amitekh.com/sdk.html) supports the xCORE processors?
Would be an easy way to design and build the software.

Regarding the PCB layout I'm no expert, but I think most (audio converter) EVMs I've seen have quite a large footprint to help avoid EMI from the digital to the analog parts by having large ground planes surrounding the noisy digital components, and the PCBs are usually four layer (btw, do three layer PCBs exist? I haven't seen any). If the target is a small board size then the converter could be a tiny one like the Cirrus CS4265. Likewise there are smaller XMOS parts but then the boards should be assembled at the factory. For guitar effects the PCB could even have stereo input and output jacks on the PCB, with the PCB size matching one of the most common Hammond enclosures like the 1590BB, which is quite large. Well, maybe separate jack connectors are a better solution, don't know.

markseel

Thanks for the feedback mhelin.

The board is near final and firmware runs.  PCB is four layers now with ground and power planes and is 2.4" x 2.4".  Audio streams fine over USB and via I2S (to/from CODEC).  Added jacks for two guitar inputs and for two (stereo) outputs.  Depending on passive component selection the outputs can be either 6.0 Vpp balanced/differential line-level for hookup to a mixer or powered speakers, or low-voltage instrument level for hookup to pedals or a guitar amp.  So the board could be used for guitar to USB/PC input for recording, as a USB audio adapter for playing music or re-amping, for low latency guitar effects, and for mixing guitar with computer audio - all at the same time.  I left off the high impedance input buffers for now so this board rev can be used to test noise and distortion levels via differential/balanced loop-back.  If noise and distortion tests look good I'll add the input buffers, add the USB ESD protection diodes, and call this version done ... except for one change down the road - I'll likely switch out the aging AK4556 for the AK4558.