diff --git a/release-packaging/Classes/FluidNMFMatch.sc b/release-packaging/Classes/FluidNMFMatch.sc
index 230f946..c34b730 100644
--- a/release-packaging/Classes/FluidNMFMatch.sc
+++ b/release-packaging/Classes/FluidNMFMatch.sc
@@ -1,31 +1,20 @@
 FluidNMFMatch : MultiOutUGen {
- var FluidDecomposition
+RELATED:: Guides/FluCoMa, Guides/FluidDecomposition, Classes/FluidBufNMF
+
+DESCRIPTION::
+The FluidNMFMatch object provides the activation (linked to amplitude) for each of the dictionaries (similar to spectra) predefined in a buffer. These dictionaries would usually have been computed through an offline Non-Negative Matrix Factorisation (NMF) footnote:: Lee, Daniel D., and H. Sebastian Seung. 1999. ‘Learning the Parts of Objects by Non-Negative Matrix Factorization’. Nature 401 (6755): 788–91. https://doi.org/10.1038/44565 :: with the link::Classes/FluidBufNMF:: UGen. NMF has been a popular technique in signal processing research for things like source separation and transcription footnote:: Smaragdis and Brown, Non-Negative Matrix Factorization for Polyphonic Music Transcription.::, although its creative potential is so far relatively unexplored.
+
+The algorithm takes a buffer which provides a spectral definition of a number of components, determined by the rank argument and the dictionary buffer channel count. It works iteratively, by trying to find a combination of amplitudes ('activations') that yields the original magnitude spectrogram of the audio input when added together. By and large, there is no unique answer to this question (i.e. there are different ways of accounting for an evolving spectrum in terms of some set of templates and envelopes). In its basic form, NMF is a form of unsupervised learning: it starts with some random data and then converges towards something that minimizes the distance between its generated data and the original: it tends to converge very quickly at first and then level out.
Fewer iterations mean less processing, but also less predictable results.
+
+The whole process can be related to a channel vocoder where, instead of fixed bandpass filters, we get more complex filter shapes and the activations correspond to channel envelopes.
+
+More information on possible musicianly uses of NMF is available in the LINK::Guides/FluCoMa:: overview file.
+
+FluidNMFMatch is part of the Fluid Decomposition Toolkit of the FluCoMa project. footnote::
+This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725899).
+::
+
+
+CLASSMETHODS::
+
+METHOD:: kr
+The real-time processing method. It takes an audio or control input, and will yield a control stream in the form of a multichannel array of size STRONG::rank::.
+
+ARGUMENT:: in
+The input to the factorisation process.
+
+ARGUMENT:: dictBufNum
+	The index of the buffer containing the dictionaries that the input will be matched against. Dictionaries must be STRONG::(fft size / 2) + 1:: frames and STRONG::rank:: channels.
+
+ARGUMENT:: rank
+	The number of elements into which the NMF algorithm will try to divide the spectrogram of the source. This should match the number of channels of the dictBuf defined above.
+
+ARGUMENT:: nIter
+	The NMF process is iterative, trying to converge to the smallest error in its factorisation. The number of iterations will decide how many times it tries to adjust its estimates. Higher numbers here will be more CPU expensive, lower numbers will be more unpredictable in quality.
+
+ARGUMENT:: winSize
+	The window size. As NMF relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
+
+ARGUMENT:: hopSize
+	The window hop size.
As NMF relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts.
+
+ARGUMENT:: fftSize
+	The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision.
+
+returns::
+	A multichannel array, giving for each dictionary the activation value.
+
+
+EXAMPLES::
+yes
+	CODE::
+	//indeed
+::
+
\ No newline at end of file
diff --git a/src/FluidNMFMatch/test.scd b/src/FluidNMFMatch/test.scd
index 6145ad1..7ec2850 100644
--- a/src/FluidNMFMatch/test.scd
+++ b/src/FluidNMFMatch/test.scd
@@ -1,21 +1,41 @@
 s.reboot;
-
+
 //from Fixed NMF example:
 (
-b = Buffer.read(s,"../../release-packaging/AudioFiles/Tremblay-AaS-AcousticStrums-M.wav".resolveRelative);
+b = Buffer.read(s,File.realpath(FluidBufNMF.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Tremblay-AaS-AcousticStrums-M.wav");
 c = Buffer.new(s);
 x = Buffer.new(s);
+e = Buffer.alloc(s,1,1);
+y = Buffer.alloc(s,1,1);
 )
-
+
 // train only 2 seconds
 (
 Routine {
-	FluidBufNMF.process(s,b.bufnum,0,88200,0,1, c.bufnum, x.bufnum, rank:10);
-	s.sync;
-	c.query;
+	FluidBufNMF.process(s,b.bufnum,0,88200,0,1, c.bufnum, x.bufnum, rank:10);
+	s.sync;
+	c.query;
 }.play;
 )
+// find the rank that has the picking sound by changing which channel to listen to
+(
+	~element = 9;
+	{PlayBuf.ar(10,c.bufnum)[~element]}.play
+)
+
+// sum all the other components into one channel, and place the picking dictionary as the sole component of the 1st channel
+(
+Routine{
+	(0..9).reject({|i| i == ~element}).do({|chan|FluidBufCompose.process(s,srcBufNumA: x.bufnum, startChanA:chan, nChansA: 1, srcBufNumB: e.bufnum, dstBufNum: e.bufnum)});
+	s.sync;
+	e.query;
+	s.sync;
+	FluidBufCompose.process(s,srcBufNumA: x.bufnum, startChanA: ~element, nChansA: 1, srcBufNumB: e.bufnum, dstStartChanB: 1, dstBufNum: e.bufnum);
+	s.sync;
+	e.query;
+}.play;
+)
-{FluidNMFMatch.kr(PlayBuf.ar(1,b.bufnum),x.bufnum,10)}.play
+{DelayN.ar(PlayBuf.ar(1,b.bufnum),0.1,1024/44100, FluidNMFMatch.kr(PlayBuf.ar(1,b.bufnum),e.bufnum,2))}.play
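
For readers of the new help text, the matching step it describes (fixed dictionaries, iteratively adjusted activations) can be sketched on the language side without booting the server. This is only an illustration under stated assumptions: it uses the multiplicative update of Lee and Seung for a KL-style divergence, which the help text does not confirm as the rule used inside FluidNMFMatch, and the toy `templates`, `frame` and `eps` values are hypothetical.

```supercollider
(
// Hypothetical toy data: 2 templates ("dictionaries") over 4 spectral bins.
var templates = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]];
// One magnitude frame, built here as 2 * template 0 + 3 * template 1.
var frame = [2.0, 2.0, 3.0, 3.0];
// Start from arbitrary positive activations, one per template.
var activations = Array.fill(templates.size, { 1.0 });
var nIter = 10;
var eps = 1e-9; // guards against division by zero

nIter.do {
	// Current reconstruction: templates weighted by their activations, summed per bin.
	var estimate = templates.collect({ |t, k| t * activations[k] }).sum + eps;
	// Per-bin ratio between the target frame and the reconstruction.
	var ratio = frame / estimate;
	// Multiplicative update (assumed rule): templates stay fixed, only activations move.
	activations = templates.collect({ |t, k|
		activations[k] * (t * ratio).sum / (t.sum + eps)
	});
};

activations.postln; // should settle close to [2, 3] for this toy frame
)
```

Because the templates never change, each frame only needs a handful of such updates, which is consistent with the trade-off the `nIter` argument describes: more iterations cost more CPU, fewer give less predictable activations.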