@ -31,10 +31,7 @@ The whole process can be related to a channel vocoder where, instead of fixed ba
More information on possible musicianly uses of NMF are availabe in LINK::Guides/FluCoMa:: overview file.
FluidBufNMF is part of the Fluid Decomposition Toolkit of the FluCoMa project. footnote::
This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899).
::
This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899). ::
CLASSMETHODS::
@ -113,6 +110,58 @@ RETURNS::
EXAMPLES::
STRONG::A didactic example::
CODE::
(
// create buffers
b = Buffer.alloc(s,44100);
c = Buffer.alloc(s, 44100);
d = Buffer.new(s);
e = Buffer.new(s);
f = Buffer.new(s);
g = Buffer.new(s);
)
(
// fill them with 2 clearly segregated sine waves and composite a buffer where they are consecutive
The granularity of the window in which the algorithm looks for change, in samples. A small number will be sensitive to short term changes, and a large number should look for long term changes.
ARGUMENT:: thresh
The normalised threshold, between 0 an 1, to consider a peak as a sinusoidal component from the in the novelty curve.
The normalised threshold, between 0 an 1, on the novelty curve to consider it a segmentation point.
ARGUMENT:: filterSize
The size of a smoothing filter that is applied on the novelty curve. A larger filter filter size allows for cleaner cuts on very sharp changes.
ARGUMENT:: winSize
The window size. As novelty estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
@ -91,3 +94,39 @@ c.query;
}.play;
)
::
STRONG::Examples of the impact of the filterSize::
CODE::
// load some buffers
(
b = Buffer.read(s,File.realpath(FluidBufNoveltySlice.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-AaS-AcousticStrums-M.wav");
// change the filterSize in the code above to 4. Then to 8. Listen in between to the differences.
// What's happening? In the first instance (filterSize = 0), the novelty line is jittery and therefore overtriggers on the arpegiated guitar. We also can hear attacks at the end of the segment. Setting the threshold higher (like in the 'Basic Example' pane) misses some more subtle variations.
// So in the second settings (filterSize = 4), we smooth the novelty line a little, which allows us to catch small differences that are not jittery. It also corrects the ending cutting by the same trick: the averaging of the sharp pick is sliding up, crossing the threshold slightly earlier.
// If we smooth too much, like the third settings (filterSize = 8), we start to loose precision. Have fun with different values of theshold then will allow you to find the perfect segment for your signal.
The averaging window of the error detection function. It needs smoothing as it is very jittery. The longer the window, the less precise, but the less false positives.
ARGUMENT:: debounce
The window size in sample within which positive detections will be clumped together to avoid overdetecting in time. No slice will be shorter than this duration.
The window size in sample within which positive detections will be clumped together to avoid overdetecting in time.
ARGUMENT:: minSlice
The minimum duration of a slice in samples.
RETURNS::
Nothing, as the destination buffer is declared in the function call.
The FluidBufNMF object provides the activation (linked to amplitude) for each pre-defined dictionaries (similar to spectra) predefined in a buffer. These dictionaries would have usually be computed through an offline Non-Negative Matrix Factorisation (NMF) footnote:: Lee, Daniel D., and H. Sebastian Seung. 1999. ‘Learning the Parts of Objects by Non-Negative Matrix Factorization’. Nature 401 (6755): 788–91. https://doi.org/10.1038/44565 :: with the link::Classes/FluidBufNMF:: UGen. NMF has been a popular technique in signal processing research for things like source separation and transcription footnote:: Smaragdis and Brown, Non-Negative Matrix Factorization for Polyphonic Music Transcription.::, although its creative potential is so far relatively unexplored.
The FluidNMFMatch object matches an incoming audio signal against a set of spectral templates using an slimmed-down version of Nonnegative Matrix Factorisation (NMF) footnote:: Lee, Daniel D., and H. Sebastian Seung. 1999. ‘Learning the Parts of Objects by Non-Negative Matrix Factorization’. Nature 401 (6755): 788–91. https://doi.org/10.1038/44565. ::
The algorithm takes a buffer in which provides a spectral definition of a number of components, determined by the rank argument and the dictionary buffer channel count. It works iteratively, by trying to find a combination of amplitudes ('activations') that yield the original magnitude spectrogram of the audio input when added together. By and large, there is no unique answer to this question (i.e. there are different ways of accounting for an evolving spectrum in terms of some set of templates and envelopes). In its basic form, NMF is a form of unsupervised learning: it starts with some random data and then converges towards something that minimizes the distance between its generated data and the original:it tends to converge very quickly at first and then level out. Fewer iterations mean less processing, but also less predictable results.
It outputs at kr the degree of detected match for each template (the activation amount, in NMF-terms). The spectral templates are presumed to have been produced by the offline NMF process (link::Classes/FluidBufNMF::), and must be the correct size with respect to the FFT settings being used (FFT size / 2 + 1 frames long). The rank of the decomposition is determined by the number of channels in the supplied buffer of templates, up to a maximum set by the STRONG::maxRank:: parameter.
NMF has been a popular technique in signal processing research for things like source separation and transcription footnote:: Smaragdis and Brown, Non-Negative Matrix Factorization for Polyphonic Music Transcription.::, although its creative potential is so far relatively unexplored. It works iteratively, by trying to find a combination of amplitudes ('activations') that yield the original magnitude spectrogram of the audio input when added together. By and large, there is no unique answer to this question (i.e. there are different ways of accounting for an evolving spectrum in terms of some set of templates and envelopes). In its basic form, NMF is a form of unsupervised learning: it starts with some random data and then converges towards something that minimizes the distance between its generated data and the original:it tends to converge very quickly at first and then level out. Fewer iterations mean less processing, but also less predictable results.
The whole process can be related to a channel vocoder where, instead of fixed bandpass filters, we get more complex filter shapes and the activations correspond to channel envelopes.
More information on possible musicianly uses of NMF are availabe in LINK::Guides/FluCoMa:: overview file.
FluidBufNMF is part of the Fluid Decomposition Toolkit of the FluCoMa project. footnote::
This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899).
::
FluidBufNMF is part of the Fluid Decomposition Toolkit of the FluCoMa project. footnote::This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899). ::
CLASSMETHODS::
METHOD:: kr
The real-time processing method. It takes an audio or control input, and will yield a control stream in the form of a multichannel array of size STRONG::rank::.
The real-time processing method. It takes an audio or control input, and will yield a control stream in the form of a multichannel array of size STRONG::maxRank:: . If the dictionary buffer has fewer than maxRank channels, the remaining outputs will be zeroed.
ARGUMENT:: in
The input to the factorisation process.
The signal input to the factorisation process.
ARGUMENT:: dictBufNum
The index of the buffer where the different dictionaries will be matched against. Dictionaries must be STRONG::(fft size / 2) + 1:: frames and STRONG::rank:: channels
The server index of the buffer containing the different dictionaries that the input signal will be matched against. Dictionaries must be STRONG::(fft size / 2) + 1:: frames. If the buffer has more than STRONG::maxRank:: channels, the excess will be ignored.
ARGUMENT:: rank
The number of elements the NMF algorithm will try to divide the spectrogram of the source in. This should match the number of channels of the dictBuf defined above.
ARGUMENT::maxRank
The maximum number of elements the NMF algorithm will try to divide the spectrogram of the source in. This dictates the number of output channelsfor the ugen.
ARGUMENT:: nIter
The NMF process is iterative, trying to converge to the smallest error in its factorisation. The number of iterations will decide how many times it tries to adjust its estimates. Higher numbers here will be more CPU expensive, lower numbers will be more unpredictable in quality.
The NMF process is iterative, trying to converge to the smallest error in its factorisation. The number of iterations will decide how many times it tries to adjust its estimates. Higher numbers here will be more CPU intensive, lower numbers will be more unpredictable in quality.
ARGUMENT:: winSize
The window size. As NMF relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
The number of samples that are analysed at a time. A lower number yields greater temporal resolution, at the expense of spectral resoultion, and vice-versa.
ARGUMENT:: hopSize
The window hope size. As NMF relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts.
The window hope size. As NMF relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts. Default = winSize / 2
ARGUMENT:: fftSize
The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision.
The FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. Default = winSize
returns::
A multichannel array, giving for each dictionary the activation value.
RETURNS::
A multichannel kr output, giving for each dictionary component the activation amount.
EXAMPLES::
yes
STRONG::A didactic example::
CODE::
(
// create buffers
b= Buffer.alloc(s,44100);
c = Buffer.alloc(s, 44100);
d = Buffer.new(s);
e= Buffer.new(s);
)
(
// fill them with 2 clearly segregated sine waves and composite a buffer where they are consecutive
b = Buffer.read(s,File.realpath(FluidNMFMatch.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-BaB-SoundscapeGolcarWithDog.wav");
//load in the sound in and a pretrained dictionary
(
b = Buffer.read(s,File.realpath(FluidNMFMatch.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-SA-UprightPianoPedalWide.wav");
c = Buffer.read(s,File.realpath(FluidNMFMatch.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/filters/piano-dicts.wav");
)
b.play
c.query
//use the pretrained dictionary to compute activations of each notes to drive the amplitude of a resynth
The averaging window of the error detection function. It needs smoothing as it is very jittery. The longer the window, the less precise, but the less false positives.
ARGUMENT:: debounce
The window size in sample within with positive detections will be clumped together to avoid overdetecting in time. No slice will be shorter than this duration.
The window size in sample within with positive detections will be clumped together to avoid overdetecting in time.
ARGUMENT:: minSlice
The minimum duration of a slice in samples.
RETURNS::
An audio stream with impulses at detected transients. The latency between the input and the output is (blockSize + padSize - order) samples.
@ -54,18 +57,17 @@ b = Buffer.read(s,File.realpath(FluidTransientSlice.class.filenameSymbol).dirnam
// A tool from the FluCoMa project, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899)
// A tool from the FluCoMa project, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899)
// A tool from the FluCoMa project, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899)
// A tool from the FluCoMa project, funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899)