TITLE:: FluidBufPitch
|
|
SUMMARY:: A Selection of Pitch Descriptors on a Buffer
|
|
CATEGORIES:: Libraries>FluidCorpusManipulation
|
|
RELATED:: Guides/FluidCorpusManipulation, Guides/FluidBufMultiThreading, Classes/SpecCentroid, Classes/SpecFlatness, Classes/SpecPcile
|
|
|
|
|
|
DESCRIPTION::
|
|
This class implements three popular pitch descriptors, computed as frequency and the confidence in its value. It is part of the LINK:: Guides/FluidCorpusManipulation##Fluid Corpus Manipulation Toolkit::. For more explanations, learning material, and discussions on its musicianly uses, visit http://www.flucoma.org/
|
|
|
|
The process will return a multichannel buffer with two channels per input channel, one for pitch and one for the pitch tracking confidence. A pitch of 0 Hz (or -999.0 when the unit is MIDI note) is returned when the algorithm cannot find a fundamental at all. Each frame of the output holds one analysis value, computed every hopSize samples, so the sampling rate of the output buffer is sourceSR / hopSize.
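As a small sketch of the layout described above, the arithmetic can be checked in the language. The stereo source, the 44100 Hz sampling rate and the hop size of 512 are assumed values chosen for illustration only.

code::
(
var sourceSR = 44100;   // assumed sampling rate of the source buffer
var hopSize = 512;      // assumed hop size
var numInputChans = 2;  // assumed stereo source
(numInputChans * 2).postln;  // channels in the features buffer: [pitch, confidence] per input channel
(sourceSR / hopSize).postln; // sampling rate of the features buffer, a little over 86 Hz here
)
::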
|
|
|
|
STRONG::Threading::
|
|
|
|
By default, this UGen spawns a new thread to avoid blocking the server command queue, so the server remains free to carry on with other work. For a more detailed discussion of the available threading and monitoring options, including the two undocumented Class Methods below (.processBlocking and .kr), please read the guide LINK::Guides/FluidBufMultiThreading::.
|
|
|
|
CLASSMETHODS::
|
|
|
|
METHOD:: process, processBlocking
|
|
This is the method that calls for the pitch descriptor to be calculated on a given source buffer.
|
|
|
|
ARGUMENT:: server
|
|
The server on which the buffers to be processed are allocated.
|
|
|
|
ARGUMENT:: source
|
|
The index of the buffer to use as the source material to be pitch-tracked. The different channels of multichannel buffers will be processed sequentially.
|
|
|
|
ARGUMENT:: startFrame
|
|
Where in the source buffer the process should start, in samples.
|
|
|
|
ARGUMENT:: numFrames
|
|
How many frames should be processed.
|
|
|
|
ARGUMENT:: startChan
|
|
For a multichannel source buffer, which channel should be processed first.
|
|
|
|
ARGUMENT:: numChans
|
|
For a multichannel source buffer, how many channels should be processed.
|
|
|
|
ARGUMENT:: features
|
|
The destination buffer for the pitch descriptors.
|
|
|
|
ARGUMENT:: algorithm
|
|
The algorithm to estimate the pitch. The options are:
|
|
TABLE::
|
|
## 0 || Cepstrum: Returns a pitch estimate as the location of the second highest peak in the Cepstrum of the signal (after DC).
|
|
## 1 || Harmonic Product Spectrum: Implements the Harmonic Product Spectrum algorithm for pitch detection. See e.g. FOOTNOTE:: A. Lerch, "An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics." John Wiley & Sons, 2012. https://onlinelibrary.wiley.com/doi/book/10.1002/9781118393550 ::
|
|
## 2 || YinFFT: Implements the frequency domain version of the YIN algorithm, as described in FOOTNOTE::P. M. Brossier, "Automatic Annotation of Musical Audio for Interactive Applications." QMUL, London, UK, 2007. :: See also https://essentia.upf.edu/documentation/reference/streaming_PitchYinFFT.html
|
|
::
|
|
|
|
ARGUMENT:: minFreq
|
|
The minimum frequency at which the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated.
|
|
|
|
ARGUMENT:: maxFreq
|
|
The maximum frequency at which the algorithm will search for an estimated fundamental. This sets the highest value that will be generated.
|
|
|
|
ARGUMENT:: unit
|
|
The unit of the estimated value. The default of 0 is in Hz. A value of 1 will convert to MIDI note values.
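For orientation, sclang's own code::cpsmidi:: and code::midicps:: methods convert between the two units. This only illustrates how Hz and MIDI note values relate; it is not a statement about how the object performs the conversion internally.

code::
(
440.cpsmidi.postln; // A4, approximately MIDI note 69
220.cpsmidi.postln; // one octave lower, approximately MIDI note 57
69.midicps.postln;  // back to Hz, approximately 440
)
::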
|
|
|
|
ARGUMENT:: windowSize
|
|
The window size. As pitch estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with the Gabor uncertainty principle. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
|
|
|
|
ARGUMENT:: hopSize
|
|
The window hop size. As pitch estimation relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts. The default of -1 sets the hop size to half of windowSize (overlap of 2).
|
|
|
|
ARGUMENT:: fftSize
|
|
The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The default of -1 will use the next power of 2 equal to or above the windowSize.
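The "next power of 2 equal to or above" rule can be explored with sclang's code::nextPowerOfTwo:: method. This is a sketch of the arithmetic only, using a hypothetical window size of 1000; it does not call the object itself.

code::
(
var windowSize = 1000;            // hypothetical window size, not a power of 2
windowSize.nextPowerOfTwo.postln; // 1024
1024.nextPowerOfTwo.postln;       // 1024, as it is already a power of 2
)
::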
|
|
|
|
ARGUMENT:: padding
|
|
Controls the zero-padding added to either end of the source buffer or segment. Possible values are 0 (no padding), 1 (default, half the window size), or 2 (window size - hop size). Padding ensures that all input samples are completely analysed: with no padding, the first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function. Mode 1 has the effect of centering the first sample in the analysis window and ensuring that the very start and end of the segment are accounted for in the analysis. Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.
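As a sketch of the padding amounts named above, the two non-zero modes can be worked out for an assumed window size of 1024 and hop size of 256 (an overlap factor of 4). These values are illustrative and the snippet only reproduces the arithmetic stated in the text.

code::
(
var windowSize = 1024, hopSize = 256; // assumed values, overlap factor of 4
var mode1 = windowSize.div(2);        // mode 1: half the window size
var mode2 = windowSize - hopSize;     // mode 2: window size - hop size
[mode1, mode2].postln;                // [ 512, 768 ]
)
::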
|
|
|
|
ARGUMENT:: freeWhenDone
|
|
Free the server instance when processing is complete. Defaults to true.
|
|
|
|
ARGUMENT:: action
|
|
A Function to be evaluated once the offline process has finished and all Buffer's instance variables have been updated on the client side. The function will be passed [features] as an argument.
|
|
|
|
returns:: an instance of the processor
|
|
|
|
EXAMPLES::
|
|
|
|
code::
|
|
// create a buffer with a short clicking sinusoidal burst (220Hz) starting at frame 8192 for 1024 frames
|
|
(
|
|
b = Buffer.sendCollection(s, (Array.fill(8192,{0}) ++ (Signal.sineFill(1203,[0,0,0,0,0,1],[0,0,0,0,0,0.5pi]).takeThese({|x,i|i>1023})) ++ Array.fill(8192,{0})));
|
|
c = Buffer.new(s);
|
|
)
|
|
|
|
// listen to the source and look at the buffer
|
|
b.play; b.plot;
|
|
|
|
// run the process with basic parameters
|
|
(
|
|
Routine{
|
|
t = Main.elapsedTime;
|
|
FluidBufPitch.process(s, b, features: c).wait;
|
|
(Main.elapsedTime - t).postln;
|
|
}.play
|
|
)
|
|
|
|
// look at the analysis
|
|
c.plot(separately:true)
|
|
|
|
// The values are interleaved [pitch,confidence] in the buffer as they are on 2 channels: to get to the right index, divide the sample position in the input by the hopSize to get the analysis frame, then multiply by 2 because of the channel interleaving
|
|
// here we are querying from one frame before: the signal starts at 8192, which is frame 16 (8192/512), so we start the query at frame 15, which is index 30.
|
|
c.getn(30,10,{|x|x.postln})
|
|
|
|
// observe that the first frame is silent, as expected. The next frame's confidence is low-ish, because the window is half full (window of 1024, overlap of 512). Then a full window is analysed, with strong confidence. Then another half full window, then silence, as expected.
|
|
::
|
|
|
|
STRONG::A stereo buffer example.::
|
|
CODE::
|
|
|
|
// load two very different files
|
|
(
|
|
b = Buffer.read(s,FluidFilesPath("Tremblay-SA-UprightPianoPedalWide.wav"));
|
|
c = Buffer.read(s,FluidFilesPath("Tremblay-AaS-AcousticStrums-M.wav"));
|
|
)
|
|
|
|
// composite one on left one on right as test signals
|
|
FluidBufCompose.process(s, c, numFrames:b.numFrames, startFrame:555000,destStartChan:1, destination:b)
|
|
b.play
|
|
|
|
// create a destination buffer
|
|
c = Buffer.new(s);
|
|
|
|
//run the process on them, with limited bandwidth and units in MIDI notes
|
|
(
|
|
Routine{
|
|
t = Main.elapsedTime;
|
|
FluidBufPitch.process(s, b, features: c, minFreq:200, maxFreq:2000, unit:1).wait;
|
|
(Main.elapsedTime - t).postln;
|
|
}.play
|
|
)
|
|
|
|
// look at the buffer: [pitch,confidence] for left then [pitch,confidence] for right
|
|
c.plot(separately:true)
|
|
::
|
|
|
|
STRONG::A musical example.::
|
|
code::
|
|
// create some buffers
|
|
(
|
|
b = Buffer.read(s,FluidFilesPath("Tremblay-ASWINE-ScratchySynth-M.wav"));
|
|
c = Buffer.new(s);
|
|
)
|
|
|
|
// run the process with basic parameters and retrieve the array on the language side
|
|
(
|
|
Routine{
|
|
t = Main.elapsedTime;
|
|
FluidBufPitch.process(s, b, features: c,action:{c.loadToFloatArray(action: {|x| d = x.reshape((x.size()/2).asInteger, 2)})}).wait;
|
|
(Main.elapsedTime - t).postln;
|
|
}.play
|
|
)
|
|
|
|
//look at the retrieved formatted array of [pitch,confidence] values
|
|
d.postln
|
|
|
|
//iterate and make an array of the indices that fit the conditions
|
|
(
|
|
e = Array.new;
|
|
d.do({
|
|
arg val, i;
|
|
if ((val[0] > 500) && (val[1] > 0.98)) {e = e.add(i)}; // if pitch is greater than 500Hz and confidence higher than 0.98, keep the index
|
|
});
|
|
)
|
|
e.postln;
|
|
|
|
//granulate only the frames that are in our buffer
|
|
// We need to convert our indices to frame start. Their position was (index * hopSize) - (hopSize) in FluidBufPitch
|
|
f = e.collect({arg i; (i * 512) - 512});
|
|
|
|
// define a basic grain synth
|
|
(
|
|
SynthDef(\grain,
|
|
{ arg out=0, buf =0 , ind = 0, pan = 0;
|
|
var env;
|
|
env = EnvGen.kr(Env.new([0,1,0],[512/s.sampleRate].dup,\sine), doneAction: Done.freeSelf);
|
|
Out.ar(out, Pan2.ar(PlayBuf.ar(1,buf,startPos:ind) * env,pan)); // apply the grain envelope, which was otherwise computed but unused
|
|
}).add;
|
|
)
|
|
|
|
// start the sequence
|
|
(
|
|
a = Pxrand(f, inf).asStream;
|
|
Routine({
|
|
loop({
|
|
Synth(\grain, [\buf, b, \ind, a.next, \pan, (2.0.rand - 1)]);
|
|
(256/s.sampleRate).wait;
|
|
})
|
|
}).play;
|
|
)
|
|
::
|