|
|
TITLE:: FluidBufPitch
|
|
|
SUMMARY:: A Selection of Pitch Descriptors on a Buffer
|
|
|
CATEGORIES:: Libraries>FluidDecomposition
|
|
|
RELATED:: Guides/FluCoMa, Guides/FluidDecomposition, Classes/SpecCentroid, Classes/SpecFlatness, Classes/SpecPcile
|
|
|
|
|
|
|
|
|
DESCRIPTION::
|
|
|
This class implements three popular pitch descriptors, each computed as a frequency and a confidence in that value. It is part of the Fluid Decomposition Toolkit of the FluCoMa project.FOOTNOTE:: This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899).::
|
|
|
|
|
|
The process will return a multichannel buffer with two channels per input channel, one for pitch and one for the pitch tracking confidence. Each frame of this buffer holds the analysis of one window, and consecutive frames are hopSize samples apart in the source. Its sampling rate is therefore sourceSR / hopSize.
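As a small sketch of the layout described above, the following computes the sampling rate and channel count of the features buffer. The numbers are illustrative assumptions only (a stereo source at 44100 Hz and a hop of 512 samples), not values read from an actual buffer.

code::
(
// Illustrative values only: a stereo source at 44100 Hz, analysed with a hop of 512 samples.
var sourceSR = 44100, hopSize = 512, numInputChans = 2;
var featureSR = sourceSR / hopSize;      // analysis frames per second of source
var numFeatureChans = numInputChans * 2; // [pitch, confidence] for each input channel
("feature sampling rate: " ++ featureSR).postln;
("feature channels: " ++ numFeatureChans).postln;
)
::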
|
|
|
|
|
|
CLASSMETHODS::
|
|
|
|
|
|
METHOD:: process
|
|
|
This is the method that calls for the pitch descriptor to be calculated on a given source buffer.
|
|
|
|
|
|
ARGUMENT:: server
|
|
|
The server on which the buffers to be processed are allocated.
|
|
|
|
|
|
ARGUMENT:: source
|
|
|
The index of the buffer to use as the source material to be pitch-tracked. The different channels of multichannel buffers will be processed sequentially.
|
|
|
|
|
|
ARGUMENT:: startFrame
|
|
|
Where in the source buffer the process should start, in samples.
|
|
|
|
|
|
ARGUMENT:: numFrames
|
|
|
How many frames should be processed.
|
|
|
|
|
|
ARGUMENT:: startChan
|
|
|
For a multichannel source buffer, which channel should be processed first.
|
|
|
|
|
|
ARGUMENT:: numChans
|
|
|
For a multichannel source buffer, how many channels should be processed.
|
|
|
|
|
|
ARGUMENT:: features
|
|
|
The destination buffer for the pitch descriptors.
|
|
|
|
|
|
ARGUMENT:: algorithm
|
|
|
The algorithm to estimate the pitch. The options are:
|
|
|
TABLE::
|
|
|
## 0 || Cepstrum: Returns a pitch estimate as the location of the second highest peak in the Cepstrum of the signal (after DC).
|
|
|
## 1 || Harmonic Product Spectrum: Implements the Harmonic Product Spectrum algorithm for pitch detection. See e.g. FOOTNOTE:: A. Lerch, "An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics." John Wiley & Sons, 2012. https://onlinelibrary.wiley.com/doi/book/10.1002/9781118393550 ::
|
|
|
## 2 || YinFFT: Implements the frequency domain version of the YIN algorithm, as described in FOOTNOTE:: P. M. Brossier, "Automatic Annotation of Musical Audio for Interactive Applications." QMUL, London, UK, 2007. :: See also https://essentia.upf.edu/documentation/reference/streaming_PitchYinFFT.html
|
|
|
::
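As a rough sketch, the three options can be compared on the same source by passing a different value to algorithm. This assumes that s is a booted server and that b is an already loaded source buffer. The name ~pitchByAlgo is purely illustrative and not part of the toolkit.

code::
(
Routine{
	// one destination buffer per algorithm (0: Cepstrum, 1: Harmonic Product Spectrum, 2: YinFFT)
	~pitchByAlgo = 3.collect{ Buffer.new(s) };
	~pitchByAlgo.do{ |buf, i|
		FluidBufPitch.process(s, b, features: buf, algorithm: i);
	};
}.play
)

// once the processing has finished, compare the three analyses
~pitchByAlgo.do{ |buf| buf.plot(separately: true) };
::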
|
|
|
|
|
|
ARGUMENT:: windowSize
|
|
|
The window size. As pitch estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with the Gabor uncertainty principle. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
|
|
|
|
|
|
ARGUMENT:: hopSize
|
|
|
The window hop size. As pitch estimation relies on spectral frames, we need to move the window forward. It can be any size, but a larger hop (lower overlap) yields fewer analysis frames and therefore a coarser time resolution.
|
|
|
|
|
|
ARGUMENT:: fftSize
|
|
|
The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision.
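To make the trade-off between windowSize, hopSize and fftSize more concrete, here is a minimal sketch of the arithmetic. The values are illustrative assumptions (a 44100 Hz source), and the final call assumes that b (source) and c (destination) already exist.

code::
(
// Illustrative values only
var sourceSR = 44100, windowSize = 1024, hopSize = 512, fftSize = 4096;
("window duration (ms): " ++ (windowSize / sourceSR * 1000)).postln;
("hop duration (ms): " ++ (hopSize / sourceSR * 1000)).postln;
("FFT bin spacing (Hz): " ++ (sourceSR / fftSize)).postln;
)

// the same settings passed to the process
Routine{ FluidBufPitch.process(s, b, features: c, windowSize: 1024, hopSize: 512, fftSize: 4096) }.play;
::

Here the fftSize is four times the windowSize, which oversamples the spectrum without changing how much of the signal each frame covers.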
|
|
|
|
|
|
ARGUMENT:: action
|
|
|
A Function to be evaluated once the offline process has finished and all Buffer's instance variables have been updated on the client side. The function will be passed [features] as an argument.
|
|
|
|
|
|
RETURNS::
|
|
|
Nothing, as the destination buffer is declared in the function call.
|
|
|
|
|
|
EXAMPLES::
|
|
|
|
|
|
code::
|
|
|
// create a buffer with a short clicking sinusoidal burst (220Hz) starting at frame 8192 for 1024 frames
|
|
|
(
|
|
|
b = Buffer.sendCollection(s, (Array.fill(8192,{0}) ++ (Signal.sineFill(1203,[0,0,0,0,0,1],[0,0,0,0,0,0.5pi]).takeThese({|x,i|i>1023})) ++ Array.fill(8192,{0})));
|
|
|
c = Buffer.new(s);
|
|
|
)
|
|
|
|
|
|
// listen to the source and look at the buffer
|
|
|
b.play; b.plot;
|
|
|
|
|
|
// run the process with basic parameters
|
|
|
(
|
|
|
Routine{
|
|
|
t = Main.elapsedTime;
|
|
|
FluidBufPitch.process(s, b, features: c);
|
|
|
(Main.elapsedTime - t).postln;
|
|
|
}.play
|
|
|
)
|
|
|
|
|
|
// look at the analysis
|
|
|
c.plot(separately:true)
|
|
|
|
|
|
// The values are interleaved [pitch,confidence] in the buffer as they are on 2 channels: to get to the right index, divide the sample position in the input by the hopSize to get the frame, then multiply by 2 because of the channel interleaving
|
|
|
// here we are querying from one frame before: the signal starts at sample 8192, which is frame 16 (8192/512), so we start the query at frame 15, which is index 30.
|
|
|
c.getn(30,10,{|x|x.postln})
|
|
|
|
|
|
// observe that the first frame is silent, as expected. The next frame's confidence is low-ish, because the window is half full (window of 1024, overlap of 512). Then a full window is analysed, with strong confidence. Then another half full window, then silence, as expected.
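
// A minimal sketch of the index arithmetic described above, as a helper function.
// ~sampleToIndex is a hypothetical name, not part of FluCoMa; the default hopSize of 512
// and the 2 channels match the values used in this example.
(
~sampleToIndex = {|samplePos, hopSize = 512, numChans = 2|
	// frame number in the features buffer, then offset into the interleaved data
	(samplePos / hopSize).floor.asInteger * numChans
};
~sampleToIndex.(8192).postln; // 32: the pitch value of frame 16, one frame after the query start above
)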
|
|
|
::
|
|
|
|
|
|
STRONG::A stereo buffer example.::
|
|
|
CODE::
|
|
|
|
|
|
// load two very different files
|
|
|
(
|
|
|
b = Buffer.read(s,File.realpath(FluidBufPitch.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-SA-UprightPianoPedalWide.wav");
|
|
|
c = Buffer.read(s,File.realpath(FluidBufPitch.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-AaS-AcousticStrums-M.wav");
|
|
|
)
|
|
|
|
|
|
// composite one on left one on right as test signals
|
|
|
FluidBufCompose.process(s, c, numFrames:b.numFrames, startFrame:555000,destStartChan:1, destination:b)
|
|
|
b.play
|
|
|
|
|
|
// create a buffer as destination
|
|
|
c = Buffer.new(s);
|
|
|
|
|
|
//run the process on them
|
|
|
(
|
|
|
Routine{
|
|
|
t = Main.elapsedTime;
|
|
|
FluidBufPitch.process(s, b, features: c);
|
|
|
(Main.elapsedTime - t).postln;
|
|
|
}.play
|
|
|
)
|
|
|
|
|
|
// look at the buffer: [pitch,confidence] for left then [pitch,confidence] for right
|
|
|
c.plot(separately:true)
|
|
|
::
|
|
|
|
|
|
STRONG::A musical example.::
|
|
|
code::
|
|
|
// create some buffers
|
|
|
(
|
|
|
b = Buffer.read(s,File.realpath(FluidBufPitch.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-ASWINE-ScratchySynth-M.wav");
|
|
|
c = Buffer.new(s);
|
|
|
)
|
|
|
|
|
|
// run the process with basic parameters and retrieve the array on the language side
|
|
|
(
|
|
|
Routine{
|
|
|
t = Main.elapsedTime;
|
|
|
FluidBufPitch.process(s, b, features: c,action:{c.loadToFloatArray(action: {|x| d = x.reshape((x.size()/2).asInt, 2)})});
|
|
|
(Main.elapsedTime - t).postln;
|
|
|
}.play
|
|
|
)
|
|
|
|
|
|
//look at the retrieved formatted array of [pitch,confidence] values
|
|
|
d.postln
|
|
|
|
|
|
//iterate and make an array of the indices that fit the conditions
|
|
|
(
|
|
|
e = Array.new;
|
|
|
d.do({
|
|
|
arg val, i;
|
|
|
if ((val[0] > 500) && (val[1] > 0.666)) {e = e.add(i)}; // if pitch is greater than 500Hz and confidence higher than 0.666, keep the index
|
|
|
});
|
|
|
)
|
|
|
e.postln;
|
|
|
|
|
|
//granulate only the frames that are in our buffer
|
|
|
// We need to convert our indices to frame start. Their position was (index * hopSize) - (windowSize) in FluidBufPitch
|
|
|
f = e.collect({arg i; (i * 512) - 1024});
|
|
|
|
|
|
// define a basic grain synth
|
|
|
(
|
|
|
SynthDef(\grain,
|
|
|
{ arg out=0, buf =0 , ind = 0, pan = 0;
|
|
|
var env;
|
|
|
env = EnvGen.kr(Env.new([0,1,0],[512/s.sampleRate].dup,\sine), doneAction: Done.freeSelf);
|
|
|
Out.ar(out, Pan2.ar(PlayBuf.ar(1,buf,startPos:ind) * env,pan)); // apply the envelope so that each grain fades in and out
|
|
|
}).add;
|
|
|
)
|
|
|
|
|
|
// start the sequence
|
|
|
(
|
|
|
a = Pxrand(f, inf).asStream;
|
|
|
Routine({
|
|
|
loop({
|
|
|
Synth(\grain, [\buf, b, \ind, a.next, \pan, (2.0.rand - 1)]);
|
|
|
(256/s.sampleRate).wait;
|
|
|
})
|
|
|
}).play;
|
|
|
)
|
|
|
::
|