|
|
TITLE:: FluidNMFMatch
|
|
|
SUMMARY:: Real-Time Non-Negative Matrix Factorisation with Fixed Dictionaries
|
|
|
CATEGORIES:: Libraries>FluidDecomposition
|
|
|
RELATED:: Guides/FluCoMa, Guides/FluidDecomposition, Classes/FluidBufNMF
|
|
|
|
|
|
DESCRIPTION::
|
|
|
The FluidNMFMatch object matches an incoming audio signal against a set of spectral templates using an slimmed-down version of Nonnegative Matrix Factorisation (NMF)footnote:: Lee, Daniel D., and H. Sebastian Seung. 1999. ‘Learning the Parts of Objects by Non-Negative Matrix Factorization’. Nature 401 (6755): 788–91. https://doi.org/10.1038/44565.
|
|
|
|
|
|
It outputs at kr the degree of detected match for each template (the activation amount, in NMF-terms). The spectral templates are presumed to have been produced by the offline NMF process (link::Classes/FluidBufNMF::), and must be the correct size with respect to the FFT settings being used (FFT size / 2 + 1 frames long). The rank of the decomposition is determined by the number of channels in the supplied buffer of templates, up to a maximum set by the ::maxrank:: parameter.
|
|
|
|
|
|
NMF has been a popular technique in signal processing research for things like source separation and transcription footnote:: Smaragdis and Brown, Non-Negative Matrix Factorization for Polyphonic Music Transcription.::, although its creative potential is so far relatively unexplored. It works iteratively, by trying to find a combination of amplitudes ('activations') that yield the original magnitude spectrogram of the audio input when added together. By and large, there is no unique answer to this question (i.e. there are different ways of accounting for an evolving spectrum in terms of some set of templates and envelopes). In its basic form, NMF is a form of unsupervised learning: it starts with some random data and then converges towards something that minimizes the distance between its generated data and the original:it tends to converge very quickly at first and then level out. Fewer iterations mean less processing, but also less predictable results.
|
|
|
|
|
|
The whole process can be related to a channel vocoder where, instead of fixed bandpass filters, we get more complex filter shapes and the activations correspond to channel envelopes.
|
|
|
|
|
|
More information on possible musicianly uses of NMF are availabe in LINK::Guides/FluCoMa:: overview file.
|
|
|
|
|
|
FluidBufNMF is part of the Fluid Decomposition Toolkit of the FluCoMa project. footnote::
|
|
|
This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899).
|
|
|
::
|
|
|
|
|
|
|
|
|
CLASSMETHODS::
|
|
|
|
|
|
METHOD:: kr
|
|
|
The real-time processing method. It takes an audio or control input, and will yield a control stream in the form of a multichannel array of size STRONG::rank::.
|
|
|
|
|
|
ARGUMENT:: in
|
|
|
The input to the factorisation process.
|
|
|
|
|
|
ARGUMENT:: dictBufNum
|
|
|
The index of the buffer where the different dictionaries will be matched against. Dictionaries must be STRONG::(fft size / 2) + 1:: frames and STRONG::rank:: channels
|
|
|
|
|
|
ARGUMENT:: rank
|
|
|
The number of elements the NMF algorithm will try to divide the spectrogram of the source in. This should match the number of channels of the dictBuf defined above.
|
|
|
|
|
|
ARGUMENT:: nIter
|
|
|
The NMF process is iterative, trying to converge to the smallest error in its factorisation. The number of iterations will decide how many times it tries to adjust its estimates. Higher numbers here will be more CPU expensive, lower numbers will be more unpredictable in quality.
|
|
|
|
|
|
ARGUMENT:: winSize
|
|
|
The window size. As NMF relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
|
|
|
|
|
|
ARGUMENT:: hopSize
|
|
|
The window hope size. As NMF relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts.
|
|
|
|
|
|
ARGUMENT:: fftSize
|
|
|
The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision.
|
|
|
|
|
|
returns::
|
|
|
A multichannel array, giving for each dictionary the activation value.
|
|
|
|
|
|
|
|
|
EXAMPLES::
|
|
|
yes
|
|
|
CODE::
|
|
|
//indeed
|
|
|
::
|
|
|
|