You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

177 lines
7.2 KiB
Plaintext

TITLE:: FluidBufNoveltySlice
SUMMARY:: Buffer-Based Novelty-Based Slicer
CATEGORIES:: Libraries>FluidDecomposition, UGens>Buffer
RELATED:: Guides/FluCoMa, Guides/FluidDecomposition, Guides/FluidBufMultiThreading
DESCRIPTION::
This class implements a non-real-time slicer using an algorithm assessing novelty in the signal to estimate the slicing points. A novelty curve is being derived from running a kernel across the diagonal of the similarity matrix, and looking for peak of changes. It implements the seminal results published in 'Automatic Audio Segmentation Using a Measure of Audio Novelty' by J Foote. It is part of the LINK:: Guides/FluidDecomposition:: of LINK:: Guides/FluCoMa::. For more explanations, learning material, and discussions on its musicianly uses, visit http://www.flucoma.org/
The process will return a buffer which contains indices (in sample) of estimated starting points of different slices.
STRONG::Threading::
By default, this UGen spawns a new thread to avoid blocking the server command queue, so it is free to go about with its business. For a more detailed discussion of the available threading and monitoring options, including the two undocumented Class Methods below (.processBlocking and .kr) please read the guide LINK::Guides/FluidBufMultiThreading::.
CLASSMETHODS::
METHOD:: process
This is the method that calls for the slicing to be calculated on a given source buffer.
ARGUMENT:: server
The server on which the buffers to be processed are allocated.
ARGUMENT:: source
The index of the buffer to use as the source material to be sliced through novelty identification. The different channels of multichannel buffers will be summed.
ARGUMENT:: startFrame
Where in the srcBuf should the slicing process start, in sample.
ARGUMENT:: numFrames
How many frames should be processed.
ARGUMENT:: startChan
For multichannel srcBuf, which channel should be processed.
ARGUMENT:: numChans
For multichannel srcBuf, how many channel should be summed.
ARGUMENT:: indices
The index of the buffer where the indices (in sample) of the estimated starting points of slices will be written. The first and last points are always the boundary points of the analysis.
ARGUMENT:: feature
The feature on which novelty is computed.
table::
##0 || Spectrum || The magnitude of the full spectrum.
##1 || MFCC || 13 Mel-Frequency Cepstrum Coefficients.
##2 || Pitch || The pitch and its confidence.
##3 || Loudness || The TruePeak and Loudness.
::
ARGUMENT:: kernelSize
The granularity of the window in which the algorithm looks for change, in samples. A small number will be sensitive to short term changes, and a large number should look for long term changes.
ARGUMENT:: threshold
The normalised threshold, between 0 an 1, on the novelty curve to consider it a segmentation point.
ARGUMENT:: filterSize
The size of a smoothing filter that is applied on the novelty curve. A larger filter filter size allows for cleaner cuts on very sharp changes.
ARGUMENT:: minSliceLength
The minimum duration of a slice in number of hopSize.
ARGUMENT:: windowSize
The window size. As novelty estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
ARGUMENT:: hopSize
The window hop size. As novelty estimation relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts. The -1 default value will default to half of windowSize (overlap of 2).
ARGUMENT:: fftSize
The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The -1 default value will use the next power of 2 equal or above the windowSize.
ARGUMENT:: action
A Function to be evaluated once the offline process has finished and indices instance variables have been updated on the client side. The function will be passed indices as an argument.
RETURNS::
Nothing, as the various destination buffers are declared in the function call.
EXAMPLES::
code::
// load some buffers
(
b = Buffer.read(s,File.realpath(FluidBufNoveltySlice.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-AaS-AcousticStrums-M.wav");
c = Buffer.new(s);
)
(
// with basic params, with a minimum slight length to avoid over
Routine{
t = Main.elapsedTime;
FluidBufNoveltySlice.process(s,b, indices: c, threshold:0.4,filterSize: 4, minSliceLength: 8).wait;
(Main.elapsedTime - t).postln;
}.play
)
//check the number of slices: it is the number of frames in the transBuf minus the boundary index.
c.query;
//loops over a splice with the MouseX
(
{
BufRd.ar(1, b,
Phasor.ar(0,1,
BufRd.kr(1, c,
MouseX.kr(0, BufFrames.kr(c) - 1), 0, 1),
BufRd.kr(1, c,
MouseX.kr(1, BufFrames.kr(c)), 0, 1),
BufRd.kr(1,c,
MouseX.kr(0, BufFrames.kr(c) - 1), 0, 1)), 0, 1);
}.play;
)
::
STRONG::Examples of the impact of the filterSize::
CODE::
// load some buffers
(
b = Buffer.read(s,File.realpath(FluidBufNoveltySlice.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-AaS-AcousticStrums-M.wav");
c = Buffer.new(s);
)
// process with a given filterSize
(
Routine{
FluidBufNoveltySlice.process(s,b, indices: c, kernelSize:31, threshold:0.1, filterSize:1).wait;
//check the number of slices: it is the number of frames in the transBuf minus the boundary index.
c.query;
}.play;
)
//play slice number 3
(
{
BufRd.ar(1, b,
Line.ar(
BufRd.kr(1, c, DC.kr(3), 0, 1),
BufRd.kr(1, c, DC.kr(4), 0, 1),
(BufRd.kr(1, c, DC.kr(4)) - BufRd.kr(1, c, DC.kr(3), 0, 1) + 1) / s.sampleRate),
0,1);
}.play;
)
// change the filterSize in the code above to 4. Then to 12. Listen in between to the differences.
// What's happening? In the first instance (filterSize = 1), the novelty line is jittery and therefore overtriggers on the arpegiated guitar. We also can hear attacks at the end of the segment. Setting the threshold higher (like in the 'Basic Example' pane) misses some more subtle variations.
// So in the second settings (filterSize = 4), we smooth the novelty line a little, which allows us to catch small differences that are not jittery. It also corrects the ending cutting by the same trick: the averaging of the sharp pick is sliding up, crossing the threshold slightly earlier.
// If we smooth too much, like the third settings (filterSize = 12), we start to loose precision and miss attacks. Have fun with different values of theshold then will allow you to find the perfect segment for your signal.
::
STRONG::A stereo buffer example.::
CODE::
// make a stereo buffer
b = Buffer.alloc(s,88200,2);
// add some stereo clicks and listen to them
((0..3)*22050+11025).do({|item,index| b.set(item+(index%2), 1.0)});
b.play
// create a new buffer as destinations
c = Buffer.new(s);
//run the process on them
(
// with basic params
Routine{
t = Main.elapsedTime;
FluidBufNoveltySlice.process(s,b, indices: c, threshold:0.3).wait;
(Main.elapsedTime - t).postln;
}.play
)
// list the indicies of detected attacks - the two input channels have been summed
c.getn(0,c.numFrames,{|item|(item * 2).postln;})
::