You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

86 lines
4.7 KiB
Plaintext

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

TITLE:: FluidOnsetSlice
SUMMARY:: Spectral Difference-Based Real-Time Audio Slicer
CATEGORIES:: Libraries>FluidDecomposition
RELATED:: Guides/FluCoMa, Guides/FluidDecomposition
DESCRIPTION::
This class implements many spectral based onset detection metrics, most of them taken from the literature. (http://www.dafx.ca/proceedings/papers/p_133.pdf) Some are already available in SuperCollider's LINK::Classes/Onsets:: object. It is part of the Fluid Decomposition Toolkit of the FluCoMa project.footnote::This was made possible thanks to the FluCoMa project ( http://www.flucoma.org/ ) funded by the European Research Council ( https://erc.europa.eu/ ) under the European Unions Horizon 2020 research and innovation programme (grant agreement No 725899).::
The process will return an audio steam with sample-long impulses at estimated starting points of the different slices.
CLASSMETHODS::
METHOD:: ar
The audio rate version of the object.
ARGUMENT:: in
The audio to be processed.
ARGUMENT:: metric
The metric used to derive a difference curve between spectral frames. It can be any of the following:
TABLE::
##0 || Energy || thresholds on (sum of squares of magnitudes / nBins) (like Onsets \power)
##1 || HFC || thresholds on (sum of (squared magnitudes * binNum) / nBins)
##2 || SpectralFlux || thresholds on (diffence in magnitude between consecutive frames, half rectified)
##3 || MKL || thresholds on (sum of log of magnitude ratio per bin) (or equivalent: sum of difference of the log magnitude per bin) (like Onsets \mkl)
##4 || IS || (WILL PROBABLY BE REMOVED) Itakura - Saito divergence (see literature)
##5 || Cosine || thresholds on (cosine distance between comparison frames)
##6 || PhaseDev || takes the past 2 frames, projects to the current, as anticipated if it was a steady state, then compute the sum of the differences, on which it thresholds (like Onsets \phase)
##7 || WPhaseDev || same as PhaseDev, but weighted by the magnitude in order to remove chaos noise floor (like Onsets \wphase)
##8 || ComplexDev || same as PhaseDev, but in the complex domain - the anticipated amp is considered steady, and the phase is projected, then a complex subtraction is done with the actual present frame. The sum of magnitudes is used to threshold (like Onsets \complex)
##9 || RComplexDev || same as above, but rectified (like Onsets \rcomplex)
::
ARGUMENT:: threshold
The thresholding of a new slice. Value ranges are different for each metric, from 0 upwards.
ARGUMENT:: minSliceLength
The minimum duration of a slice in number of hopSize.
ARGUMENT:: filterSize
The size of a smoothing filter that is applied on the novelty curve. A larger filter filter size allows for cleaner cuts on very sharp changes.
ARGUMENT:: frameDelta
For certain metrics (HFC, SpectralFlux, MKL, Cosine), the distance does not have to be computed between consecutive frames. By default (0) it is, otherwise this sets the distane between the comparison window in samples.
ARGUMENT:: windowSize
The window size. As sinusoidal estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally, in line with Gabor Uncertainty principles. http://www.subsurfwiki.org/wiki/Gabor_uncertainty
ARGUMENT:: hopSize
The window hop size. As sinusoidal estimation relies on spectral frames, we need to move the window forward. It can be any size but low overlap will create audible artefacts. The -1 default value will default to half of windowSize (overlap of 2).
ARGUMENT:: fftSize
The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The -1 default value will default to windowSize.
ARGUMENT:: maxFFTSize
How large can the FFT be, by allocating memory at instantiation time. This cannot be modulated.
RETURNS::
An audio stream with impulses at detected transients. The latency between the input and the output is windowSize at maximum.
EXAMPLES::
code::
//load some sounds
b = Buffer.read(s,File.realpath(FluidOnsetSlice.class.filenameSymbol).dirname.withTrailingSlash ++ "../AudioFiles/Nicol-LoopE-M.wav");
// basic param (the process add a latency of windowSize samples
{var sig = PlayBuf.ar(1,b,loop:1); [FluidOnsetSlice.ar(sig) * 0.5, DelayN.ar(sig, 1, 1024/ s.sampleRate)]}.play
// other parameters
{var sig = PlayBuf.ar(1,b,loop:1); [FluidOnsetSlice.ar(sig, 2, 0.06, 55, 7, 0, 128, 64) * 0.5, DelayN.ar(sig, 1, (128)/ s.sampleRate)]}.play
// more musical trans-trigged autopan
(
{
var sig, trig, syncd, pan;
sig = PlayBuf.ar(1,b,loop:1);
trig = FluidOnsetSlice.ar(sig, 1, 1.8, 100, 8, 0, 128);
syncd = DelayN.ar(sig, 1, ( 128 / s.sampleRate));
pan = TRand.ar(-1,1,trig);
Pan2.ar(syncd,pan);
}.play
)
::