Pitch Shifter Idea

nexekho · July 31, 2011, 05:28:42 PM

I have an idea of how to implement a pitch shifter. I don't know how exactly the standard ones work, but I think this should be perfectly implementable on most DSPs. It involves splitting the input and output into different algorithms with the input giving the output "chunks" to build a waveform from.

You begin with a number of buffers, each probably something like 4x the length of a single cycle of a bass's low E.

Input:
Copy the input to a scratch buffer as it goes.
Listen out for the waveform crossing the zero line heading up, then down.
When it finishes a wave, if the recorded data is short enough to fit in a buffer, overwrite a randomly chosen buffer with it, normalising the amplitude.

Output:
The output runs in a circular buffer at its own rate.
Keep the volume consistent with the peak input.
Every time the output runs into the previously placed chunk, another chunk is copied from a randomly chosen buffer.

Because the input and output are very loosely linked, we can play back our buffered waves as fast as we want. The random picking of full waveforms should make a fairly convincing pitch-shifted representation of the input, at least more so than repeating a single buffer of the last full wave. If the pitch is lower, some of the buffers will never be played because the output is using them more slowly than the input is overwriting them. If the pitch is higher, some of the buffers will be used more than once, but combined with the randomization this should not be too intrusive.
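A minimal sketch of the output side, assuming the buffers have already been filled by the input side. The `ChunkPlayer` struct and function names are hypothetical, and the read position simply drops the fractional part rather than interpolating.

```c
#include <stdlib.h>

#define NUM_CHUNKS 8      /* assumed number of wave buffers */
#define CHUNK_MAX  4096   /* assumed capacity of each buffer */

typedef struct {
    float data[NUM_CHUNKS][CHUNK_MAX];
    int   length[NUM_CHUNKS];  /* 0 means the slot is still empty */
    int   current;             /* slot being played, or -1 for none */
    float pos;                 /* fractional read position in the slot */
} ChunkPlayer;

/* Pick a random non-empty slot, or -1 if nothing has been recorded yet. */
static int pick_chunk(const ChunkPlayer *p)
{
    int filled[NUM_CHUNKS], n = 0;
    for (int i = 0; i < NUM_CHUNKS; i++)
        if (p->length[i] > 0) filled[n++] = i;
    return n ? filled[rand() % n] : -1;
}

/* Produce one output sample and advance the read position by `speed`
   recorded samples (speed 2.0 plays the wave an octave up). */
static float player_next(ChunkPlayer *p, float speed)
{
    if (p->current < 0) p->current = pick_chunk(p);
    if (p->current < 0) return 0.0f;            /* silence until input arrives */

    /* When the current wave is used up, carry the remainder into a new one. */
    while (p->pos >= (float)p->length[p->current]) {
        p->pos -= (float)p->length[p->current];
        p->current = pick_chunk(p);             /* never -1: a slot is filled */
    }
    float s = p->data[p->current][(int)p->pos];
    p->pos += speed;
    return s;
}
```

Because the read position advances independently of the input, any speed works; the price is that waves get skipped or repeated exactly as described above.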

It's unlikely to be perfect, and I don't have an implementation, but I was just wondering what you people think? Perhaps some kind of heuristic determining noise level, crest/fall bias and so on might be useful to stop pick noise displacing notes, etc.

nexekho · July 31, 2011, 07:58:17 PM

I've had a go at implementing it in C using PortAudio. It's a hacked-around version of the fuzz example. My wave picking is very crude at the moment, and without sufficient input the effect loops whatever it previously had going in. I also hand-hacked the latency values to be usable. You can get some weird Gameboyish stuff from this. I would record it, but Stereo Mix won't work while PortAudio is running.


/** @file pa_fuzz.c
	@ingroup test_src
    @brief Distort input like a fuzz box.
	@author Phil Burk  http://www.softsynth.com
*/
/*
 * $Id: pa_fuzz.c 1368 2008-03-01 00:38:27Z rossb $
 *
 * This program uses the PortAudio Portable Audio Library.
 * For more information see: http://www.portaudio.com
 * Copyright (c) 1999-2000 Ross Bencina and Phil Burk
 *
 * Permission is hereby granted, free of charge, to any person obtaining
 * a copy of this software and associated documentation files
 * (the "Software"), to deal in the Software without restriction,
 * including without limitation the rights to use, copy, modify, merge,
 * publish, distribute, sublicense, and/or sell copies of the Software,
 * and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be
 * included in all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
 * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
 * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
 * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 */

/*
 * The text above constitutes the entire PortAudio license; however,
 * the PortAudio community also makes the following non-binding requests:
 *
 * Any person wishing to distribute modifications to the Software is
 * requested to send the modifications to the original developer so that
 * they can be incorporated into the canonical version. It is also
 * requested that these non-binding requests be included along with the
 * license above.
 */

#include <stdio.h>
#include <stdlib.h>   /* rand() */
#include <string.h>   /* memcpy() */
#include <math.h>
#include "portaudio.h"
/*
** Note that many of the older ISA sound cards on PCs do NOT support
** full duplex audio (simultaneous record and playback).
** And some only support full duplex at lower sample rates.
*/
#define SAMPLE_RATE         (44100)
#define PA_SAMPLE_TYPE      paFloat32
#define FRAMES_PER_BUFFER   (64)

typedef float SAMPLE;

float CubicAmplifier( float input );
static int fuzzCallback( const void *inputBuffer, void *outputBuffer,
                         unsigned long framesPerBuffer,
                         const PaStreamCallbackTimeInfo* timeInfo,
                         PaStreamCallbackFlags statusFlags,
                         void *userData );

/* Non-linear amplifier with soft distortion curve. */
float CubicAmplifier( float input )
{
    float output, temp;
    if( input < 0.0 )
    {
        temp = input + 1.0f;
        output = (temp * temp * temp) - 1.0f;
    }
    else
    {
        temp = input - 1.0f;
        output = (temp * temp * temp) + 1.0f;
    }

    return output;
}
#define FUZZ(x) CubicAmplifier(CubicAmplifier(CubicAmplifier(CubicAmplifier(x))))
int iLFO = 0;
static int gNumNoInputs = 0;
/* This routine will be called by the PortAudio engine when audio is needed.
** It may be called at interrupt level on some machines so don't do anything
** that could mess up the system like calling malloc() or free().
*/

int iSince = 0;
int ui8Above =0;

#define BUFFERS 8

#define SCRATCH_LENGTH 4096
float fScratch[ SCRATCH_LENGTH ];

float fBuffers[ BUFFERS ][ SCRATCH_LENGTH ];
int iBuffers[ BUFFERS ];
int iReady = 0, iCurrent = BUFFERS;

float fSpeed = 4.0f, fProgress = 0.0f;

static int fuzzCallback( const void *inputBuffer, void *outputBuffer,
                         unsigned long framesPerBuffer,
                         const PaStreamCallbackTimeInfo* timeInfo,
                         PaStreamCallbackFlags statusFlags,
                         void *userData )
{
    SAMPLE *out = (SAMPLE*)outputBuffer;
    const SAMPLE *in = (const SAMPLE*)inputBuffer;
    unsigned int i;
    (void) timeInfo; /* Prevent unused variable warnings. */
    (void) statusFlags;
    (void) userData;

    if( inputBuffer == NULL )
    {
        for( i=0; i<framesPerBuffer; i++ )
        {
            *out++ = 0;  /* left - silent */
            *out++ = 0;  /* right - silent */
        }
        gNumNoInputs += 1;
    }
    else
    {
        for( i=0; i<framesPerBuffer; i++ )
        {

            switch( ui8Above )
            {

                case 0:
                {

                    if( ( * in ) > 0.02f ) ui8Above = 1;


                }
                break;

                case 1:
                {

                    if( ( * in ) < -0.02f )
                    {

                        if( iSince <= SCRATCH_LENGTH )
                        {

                            if( iReady < BUFFERS )
                            {

                                int iFind = 0;

                                while( iBuffers[ iFind ] ){ iFind++; }

                                iBuffers[ iFind ] = iSince;

                                memcpy( fBuffers[ iFind ], fScratch, sizeof( float ) * iSince );

                                iReady++;

                            }
                            else
                            {

                                int iFind = rand() % BUFFERS;

                                if( iFind == iCurrent ) iFind = ( iFind + 1 ) % BUFFERS;

                                iBuffers[ iFind ] = iSince;

                                memcpy( fBuffers[ iFind ], fScratch, sizeof( float ) * iSince );

                            }

                        }

                        ui8Above = 0;

                        iSince = 0;

                    }

                }
                break;

            }

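            /* Record the left channel while there is room in the scratch
               buffer. iSince counts one past SCRATCH_LENGTH so that an
               over-long wave is rejected by the length check above. */
            if( iSince < SCRATCH_LENGTH ) fScratch[ iSince ] = * in;

            if( iSince <= SCRATCH_LENGTH ) iSince++;

            in += 2; /* always consume one stereo frame, even when the scratch buffer is full */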

            fSpeed = sin( 4.0f * ( float ) ( iLFO++ ) / ( float ) SAMPLE_RATE ) + 2.0f;

            if( iReady )
            {

                int iDrop;

                fProgress += fSpeed;

                while( 1 )
                {

                    if( iCurrent != BUFFERS )
                    {

                        iDrop = ( int ) floor( fProgress );

                        if( iDrop >= iBuffers[ iCurrent ] ) /* index iBuffers[ iCurrent ] is already past the recorded wave */
                        {

                            fProgress -= iBuffers[ iCurrent ];

                            iCurrent = BUFFERS;

                        } else break;

                    }

                    if( iCurrent == BUFFERS )
                    {

                        int iChoice = rand() % iReady;

                        iCurrent = 0;

                        while( 1 )
                        {

                            if( iBuffers[ iCurrent ] )
                            {

                                if( iChoice == 0 ) break;

                                iChoice--;

                            }

                            iCurrent++;

                        }

                    }

                }

                *out++ = fBuffers[ iCurrent ][ iDrop ];
                *out++ = fBuffers[ iCurrent ][ iDrop ];

            }
            else
            {

                *out++ = 0;
                *out++ = 0;

            }

        }
    }

    return paContinue;
}

/*******************************************************************/
int main(void);
int main(void)
{
    PaStreamParameters inputParameters, outputParameters;
    PaStream *stream;
    PaError err;

    int iBuffer = 0;

    while( iBuffer < BUFFERS )
    {

        iBuffers[ iBuffer ] = 0;

        iBuffer++;

    }

    err = Pa_Initialize();
    if( err != paNoError ) goto error;

    inputParameters.device = Pa_GetDefaultInputDevice(); /* default input device */
    if (inputParameters.device == paNoDevice) {
      fprintf(stderr,"Error: No default input device.\n");
      goto error;
    }
    inputParameters.channelCount = 2;       /* stereo input */
    inputParameters.sampleFormat = PA_SAMPLE_TYPE;
    inputParameters.suggestedLatency = 0.0;
    inputParameters.hostApiSpecificStreamInfo = NULL;

    outputParameters.device = Pa_GetDefaultOutputDevice(); /* default output device */
    if (outputParameters.device == paNoDevice) {
      fprintf(stderr,"Error: No default output device.\n");
      goto error;
    }
    outputParameters.channelCount = 2;       /* stereo output */
    outputParameters.sampleFormat = PA_SAMPLE_TYPE;
    outputParameters.suggestedLatency = 0.02;
    outputParameters.hostApiSpecificStreamInfo = NULL;

    err = Pa_OpenStream(
              &stream,
              &inputParameters,
              &outputParameters,
              SAMPLE_RATE,
              FRAMES_PER_BUFFER,
              0, /* paClipOff, */  /* we won't output out of range samples so don't bother clipping them */
              fuzzCallback,
              NULL );
    if( err != paNoError ) goto error;

    err = Pa_StartStream( stream );
    if( err != paNoError ) goto error;

    printf("Hit ENTER to stop program.\n");
    getchar();
    err = Pa_CloseStream( stream );
    if( err != paNoError ) goto error;

    printf("Finished. gNumNoInputs = %d\n", gNumNoInputs );
    Pa_Terminate();
    return 0;

error:
    Pa_Terminate();
    fprintf( stderr, "An error occurred while using the portaudio stream\n" );
    fprintf( stderr, "Error number: %d\n", err );
    fprintf( stderr, "Error message: %s\n", Pa_GetErrorText( err ) );
    return -1;
}

Current behaviour is to sweep the playback speed with a sine-wave LFO between 1x and 3x pitch.

Hides-His-Eyes · July 31, 2011, 08:13:35 PM

Anything using zero crossing is immediately suspect because the harmonics of a plucked guitar string cause it to cross zero many more times than the fundamental would cause, especially during plucking.

nexekho · July 31, 2011, 08:18:41 PM

Indeed. Playing with it though and getting some wacky stuff. Certain LFO speeds and buffer counts combined with string scratching make the same noise Borg do when they die :p

potul · August 01, 2011, 12:17:12 PM

What's the purpose of the zero-crossing check? Is the intention to record just a single period of the signal in each buffer? I don't think this will produce a smooth output signal, and in addition zero-crossing detection in the real world is usually not very reliable.

I've been investigating some pitch shifting algorithms in the last few months, and I've typically seen two different approaches:

-Circular Buffer: Similar to what you describe, but the typical approach is to have a couple of buffers with a fixed length and mix them with some windowing (fade-in/fade-out) so that the transition (when the read pointer reaches the write pointer) is not heard as a "click".

-Vocoder approach. That's a much more sophisticated approach... Using an STFT to go to the frequency domain and then go back to the time domain with a different overlap (and taking care of phase corrections)

If you are interested in the topic I can post some links... I have them stored somewhere in my Favorites...

Regards,

Mat

Hides-His-Eyes · August 01, 2011, 12:26:48 PM

I would like to read about the vocoder approach if you have anything; equations are quite acceptable!

potul · August 01, 2011, 04:53:14 PM

Here are two links with good information about the phase vocoder:

http://www.guitarpitchshifter.com/algorithm.html

http://www.dspdimension.com/admin/pitch-shifting-using-the-ft/

It took me some time to digest all the details, especially the relationship between the phase difference and the real frequency, which is key for the whole thing to work. But there is a lot of interesting info there.

In these other links there is also some explanation about phase vocoder, although it is targeted for MAX/MSP, but the background idea is the same and you can get good info as well.

http://cycling74.com/2006/11/02/the-phase-vocoder-%E2%80%93-part-i/
http://cycling74.com/2007/07/02/the-phase-vocoder-part-ii/

I'm trying to implement something like this on a dsPIC, although it isn't clear to me whether it will have enough horsepower for it. I have nothing coded so far, though; I'm just learning how the FFT/IFFT works in C30.

Mat

Hides-His-Eyes · August 01, 2011, 05:58:36 PM

That's a fair bit to get my teeth into! Thanks very much.

bilwit · March 02, 2012, 05:59:11 PM

Any progress with this? I'm trying to implement the phase vocoder on my dsPIC33F but I'm still a noob with the environment. I figure I could squeeze it into the ADC/DAC loopback driver found on the Microchip website. I also have working code in MATLAB that I'm trying to translate to C30 in MPLAB atm.

I don't know if any of you would find this helpful, but here's the MATLAB code. It's a lot shorter and cleaner than some other phase vocoder examples, as it eliminates the need for a lot of FOR loops by working with vectors.


    fs = 44.1e3;                            % sampling rate
    rtime = 3;                              % recording duration (seconds)
    alpha = 24/12;                          % pitch-shift factor
    N = 2^12;                               % length of FFT frame
    overlap = .75;                          % overlap fraction
    window = hanning(N)';                   % apodization interval for FFT
    
% *************************** INPUT DEFINITIONS *************************** 
 
    input = wavread('tuning_fork_A4.wav')'; % input WAV
   
% **************************** LOCAL VARIABLES **************************** 
 
    input_length = length(input);           % length of input signal
    frame_count = floor((input_length-2*N)/(N*(1-overlap)));
                                            % total frame number of input
    Ra = floor(N*(1-overlap));              % analysis time hop
    Rs = floor(alpha*Ra);                   % synthesis time hop
    Wk = (0:(N-1))*2*pi/N;                  % center bin frequencies
    output = zeros(1, input_length*alpha);  % preallocate for output
 
% ************************** ANALYSIS/SYNTHESIS *************************** 
 
                                      % start runtime 
    
    Xu_current = fft(window.*input(1:N));   % calculate FFT of first frame
    PhiY_current = angle(Xu_current);       % store phases of first frame
 
    for u=1:frame_count
        tic;
        Xu_prev = Xu_current;               % store previous STFT frame
        PhiY_prev = PhiY_current;           % store phase of previous frame
        Xu_current = fft(window.*input(u*Ra:u*Ra+N-1));
                                            % calculate FFT of current frame
        DPhi = angle(Xu_current) - angle(Xu_prev) - Ra*Wk;
                                            % est. phase propagation
        DPhip = mod(DPhi+pi, 2*pi) - pi;    % principal determination (+/- pi)
        w_hatk = Wk + (1/Ra)*DPhip;         % est. actual frequency
        PhiY_current = PhiY_prev + Rs*w_hatk;   
                                            % adjusted phase propagation
        Yu = abs(Xu_current).*exp(1i*PhiY_current);
                                            % output STFT
        output(u*Rs:u*Rs+N-1) = output(u*Rs:u*Rs+N-1) + real(ifft(Yu));
                                            % add current frame to output
        toc;
    end
 
    norm_output = output./max(abs(output)); % normalize the output amplitude
    [t,d]=rat(alpha);                       % integer shift ratio
    shifted = resample(norm_output,d,t);    % resample for pitch shift
