Improving help files: Dataset, Labelset, KDTree and KMeans

nix
Owen Green 6 years ago
parent 00a238dae7
commit ddce5473a8

@ -4,137 +4,138 @@ categories:: UGens>FluidManipulation
related:: Classes/FluidLabelSet, Classes/FluidKDTree, Classes/FluidKNN, Classes/FluidKMeans related:: Classes/FluidLabelSet, Classes/FluidKDTree, Classes/FluidKNN, Classes/FluidKMeans
DESCRIPTION:: DESCRIPTION::
A server-side container associating labels with multi-dimensional data. FluidDataSet is identified by its name, and multiple instances of the object with the same name point to the same instance on the server. A server-side container associating labels with multi-dimensional data. FluidDataSet is identified by its name.
CLASSMETHODS:: CLASSMETHODS::
PRIVATE::kr PRIVATE:: asUGenInput
METHOD:: new METHOD:: new
Create a new instance of the dataset, with the given name and dimensionality. If an instance already exists on the server, then the existing dimensionality takes precedence. Create a new instance of the dataset, with the given name. If a Dataset with this name already exists, an exception will be thrown (see link::Classes/FluidDataSet#*at:: to access an extant Dataset).
ARGUMENT:: server ARGUMENT:: server
The link::Classes/Server:: on which to create the data set The link::Classes/Server:: on which to create the data set
ARGUMENT:: name ARGUMENT:: name
A symbol or string with the name of the dataset. A symbol or string with the name of the dataset.
ARGUMENT:: dims
An integer number of dimensions
returns:: The new instance returns:: The new instance
METHOD:: at
Retrieves a cached instance of a FluidDataSet with the given name, or returns nil if no such object exists.
ARGUMENT:: server
The server associated with this dataset instance
ARGUMENT:: id
The name of the Dataset to retrieve from the cache
INSTANCEMETHODS:: INSTANCEMETHODS::
PRIVATE::init,id PRIVATE:: init,id,cache
METHOD:: synth METHOD:: addPoint
The internal synth the object uses to communicate with the server Add a new point to the data set. The dimensionality of the dataset is governed by the size of the first point added.
Will report an error if the label already exists, or if the size of the data does not match the dimensionality of the dataset.
returns:: A link::Classes/Synth:: ARGUMENT:: label
A symbol or string with the label for the new point
METHOD:: server ARGUMENT:: buffer
The server instance the object uses A link::Classes/Buffer:: with the new data point
ARGUMENT:: action
returns:: A link::Classes/Server:: A function to run when the point has been added
METHOD:: updatePoint METHOD:: updatePoint
Update an existing label's data. Will report an error if the label doesn't exist, or if the size of the data does not match the given dimensionality of the dataset. Update an existing label's data. Will report an error if the label doesn't exist, or if the size of the data does not match the given dimensionality of the dataset.
ARGUMENT:: label ARGUMENT:: label
symbol or string with the label symbol or string with the label
ARGUMENT:: buffer ARGUMENT:: buffer
A link::Classes/Buffer:: containing the updated data A link::Classes/Buffer:: containing the updated data
ARGUMENT:: action ARGUMENT:: action
A function to run when the server has updated A function to run when the server has updated
METHOD:: size METHOD:: size
Report the number of items currently in the data set Report the number of items currently in the data set
ARGUMENT:: action METHOD:: getPoint
A function to run when the server responds, whose argument is the data set size Retrieve a point from the data set into a link::Classes/Buffer::. Will report an error if the label or buffer doesn't exist
METHOD:: addPoint
Add a new point to the data set. Will report an error if the label already exists, or if the size of the data does not match the given dimensionality of the dataset.
ARGUMENT:: label ARGUMENT:: label
A symbol or string with the label for the new point symbol or string with the label to retrieve
ARGUMENT:: buffer ARGUMENT:: buffer
A link::Classes/Buffer:: with the new data point link::Classes/Buffer:: to fill
ARGUMENT:: action
A function to run when the point has been added
METHOD:: write
Write the data set to disk as a JSON file. Will not overwrite existing files
ARGUMENT:: filename
Absolute path for the new file
ARGUMENT:: action ARGUMENT:: action
A function to run when the file has been written function to run when the point has been retrieved
METHOD:: asString
returns:: The name of the data set as a string
METHOD:: deletePoint METHOD:: deletePoint
Remove a point from the data set. Will report an error if the label doesn't exist. Remove a point from the data set. Will report an error if the label doesn't exist.
ARGUMENT:: label ARGUMENT:: label
symbol or string with the label to remove symbol or string with the label to remove
ARGUMENT:: action ARGUMENT:: action
Function to run when the point has been deleted Function to run when the point has been deleted
METHOD:: clear METHOD:: clear
Empty the data set Empty the data set
ARGUMENT:: action ARGUMENT:: action
Function to run when the data set has been emptied Function to run when the data set has been emptied
METHOD:: getPoint METHOD:: free
Retrieve a point from the data set into a link::Classes/Buffer::. Will report an error if the label or buffer doesn't exist Destroy the object on the server
ARGUMENT:: label METHOD:: cols
symbol or string with the label to retrieve Report the dimensionality of the data set. If action is nil, will default to posting result.
ARGUMENT:: buffer
link::Classes/Buffer:: to fill
ARGUMENT:: action ARGUMENT:: action
function to run when the point has been retrieved A function to run when the server responds, whose argument is the data set dimensionality. By default, the method will print the response to the post window.
METHOD:: size
Report the number of points in the data set. If action is nil, will default to posting result.
ARGUMENT:: action
A function to run when the server responds, whose argument is the data set size. By default, the method will print the response to the post window.
METHOD:: read METHOD:: read
Read a data set from a JSON file on disk Read a data set from a JSON file on disk
ARGUMENT:: filename ARGUMENT:: filename
The absolute path of the JSON file to read The absolute path of the JSON file to read
ARGUMENT:: action ARGUMENT:: action
A function to run when the file has been read A function to run when the file has been read
METHOD:: write
Write the data set to disk as a JSON file.
ARGUMENT:: filename
Absolute path for the new file
ARGUMENT:: action
A function to run when the file has been written
METHOD:: cols METHOD:: asString
Report the dimensionality of the data set Responds with the name of the data set as a pretty(ish) string
METHOD:: asSymbol
Responds with the name of the data set as a symbol
METHOD:: synth
The internal synth the object uses to communicate with the server
returns:: A link::Classes/Synth::
METHOD:: server
The server instance the object uses
returns:: A link::Classes/Server::
EXAMPLES:: EXAMPLES::
CODE:: CODE::
(
// Make a one-dimensional data set called 'simple1data' // Make a one-dimensional data set called 'simple1data'
~ds = FluidDataSet.new(s,\simple1data,1) ~ds = FluidDataSet.new(s,\simple1data,1)
// Make a buffer to use for adding points // Make a buffer to use for adding points
~point = Buffer.alloc(s,1,1) ~point = Buffer.alloc(s,1,1)
//Add 10 points, using the index as a label. //Add 10 points, using the index as a label.
(
Routine{ Routine{
10.do{|i| 10.do{|i|
~point.set(0,i); ~point.set(0,i);
s.sync; ~ds.addPoint(i.asString,~point,{("addPoint"+i).postln});
~ds.addPoint(i.asString,~point,{("addPoint"+i).postln}) s.sync;
} }
}.play }.play
) )

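The following is not part of the commit. It is a minimal sketch of how the FluidDataSet methods documented in the hunk above might be combined, assuming a booted server in CODE::s:: and that the hypothetical name CODE::\dataset_sketch:: is not already in use. It relies only on the signatures given above (CODE::new(server, name)::, CODE::addPoint(label, buffer, action)::, CODE::cols(action)::, CODE::getPoint(label, buffer, action)::), and assumes, as the help file's own examples do, that CODE::s.sync:: is enough to wait for each step.

```supercollider
(
// Hypothetical name; creating a second dataset with the same name would throw
~ds = FluidDataSet(s, \dataset_sketch);
~in = Buffer.alloc(s, 2);   // two frames, so the first point fixes the dataset at 2 dimensions
~out = Buffer.alloc(s, 2);  // destination buffer for getPoint
fork{
	s.sync;                 // wait for the buffers to be allocated
	~in.setn(0, [0.25, 0.75]);
	s.sync;
	~ds.addPoint(\first, ~in, { "point added".postln });
	s.sync;
	// Dimensionality is taken from the size of the first point added
	~ds.cols({ |n| ("dimensions:" + n).postln });
	// Copy the stored point into ~out, then read it back to the language
	~ds.getPoint(\first, ~out, {
		~out.getn(0, 2, { |values| values.postln });
	});
}
)
```

Adding a second point from a buffer of a different size should, according to the description of CODE::addPoint:: above, report an error rather than change the dimensionality.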
@ -4,20 +4,41 @@ categories:: FluidManipulation
related:: Classes/FluidDataSet related:: Classes/FluidDataSet
DESCRIPTION:: DESCRIPTION::
A server-side K-Dimensional tree for efficient neighbourhood searches of multi-dimensional data. See https://scikit-learn.org/stable/modules/neighbors.html#nearest-neighbor-algorithms for more on KD Trees A server-side K-Dimensional tree for efficient neighbourhood searches of multi-dimensional data.
See https://scikit-learn.org/stable/modules/neighbors.html#nearest-neighbor-algorithms for more on KD Trees
CLASSMETHODS:: CLASSMETHODS::
METHOD:: new
Make a new KDTree model for the given server
ARGUMENT:: server
The server on which to make the model
INSTANCEMETHODS:: INSTANCEMETHODS::
METHOD:: read METHOD:: fit
Set the object's state from a JSON file Build the tree by scanning the points of a LINK::Classes/FluidDataSet::
ARGUMENT:: filename ARGUMENT:: dataset
The location of a JSON file on disk The LINK::Classes/FluidDataSet:: of interest. This can either be a data set object itself, or the name of one.
ARGUMENT:: action ARGUMENT:: action
function to run when the data is loaded A function to run when indexing is complete
METHOD:: kNearest
Returns the IDs of the CODE::k:: points nearest to the one passed
ARGUMENT:: buffer
A LINK::Classes/Buffer:: containing a data point to match against. The number of frames in the buffer must match the dimensionality of the LINK::Classes/FluidDataSet:: the tree was fitted to.
ARGUMENT:: k
The number of neighbours to return
ARGUMENT:: action
A function that will run when the query returns, whose argument is an array of point IDs from the tree's LINK::Classes/FluidDataSet::
METHOD:: kNearestDist METHOD:: kNearestDist
Get the distances of the K nearest neighbours to a point Get the distances of the K nearest neighbours to a point
@ -31,16 +52,22 @@ The number of neighbours to search
ARGUMENT:: action ARGUMENT:: action
A function that will run when the query returns, whose argument is an array of distances A function that will run when the query returns, whose argument is an array of distances
returns:: nothing, but could return an array if you like METHOD:: cols
Get the dimensionality of the data that the tree is indexed against
METHOD:: fit ARGUMENT:: action
Build the tree by scanning the points of a LINK::Classes/FluidDataSet:: A function that runs when the query returns, whose argument is the dimensionality
ARGUMENT:: dataset
The LINK::Classes/FluidDataSet:: of interest. This can either be a data set object itself, or the name of one. METHOD:: read
Set the object's state from a JSON file
ARGUMENT:: filename
The location of a JSON file on disk
ARGUMENT:: action ARGUMENT:: action
A function to run when indexing is complete function to run when the data is loaded
METHOD:: write METHOD:: write
Write the index of the tree to disk. Currently this will not overwrite extant files. Write the index of the tree to disk. Currently this will not overwrite extant files.
@ -51,28 +78,50 @@ The path of a JSON file to write
ARGUMENT:: action ARGUMENT:: action
A function to run when writing is complete A function to run when writing is complete
METHOD:: kNearest
Returns the IDs of the CODE::k:: points nearest to the one passed
ARGUMENT:: buffer
A LINK::Classes/Buffer:: containing a data point to match against. The number of frames in the buffer must match the dimensionality of the LINK::Classes/FluidDataSet:: the tree was fitted to.
ARGUMENT:: k
The number of neighbours to return
ARGUMENT:: action
A function that will run when the query returns, whose argument is an array of point IDs from the tree's LINK::Classes/FluidDataSet::
returns:: Nothing, but could return an array of IDs if you like
METHOD:: cols
Get the dimensionality of the data that the tree is indexed against
ARGUMENT:: action
A function that runs when the query returns, whose argument is the dimensionality
EXAMPLES:: EXAMPLES::
code:: code::
(some example code) //Make some 2D points and place into a dataset
(
~points = 100.collect{ [ 1.0.linrand,1.0.linrand] };
~dataset= FluidDataSet(s,\kdtree_help_rand2d);
~dataset.free
~tmpbuf = Buffer.alloc(s,2) ;
fork{
~dataset.ready.wait;
~points.do{|x,i|
(""++(i+1)++"/100").postln;
~tmpbuf.setn(0,x);
~dataset.addPoint(i,~tmpbuf);
s.sync
}
}
)
//Make a new tree, and fit it to the dataset
(
fork{
~tree = FluidKDTree(s);
~tree.ready.wait;
s.sync;
~tree.fit(~dataset);
}
)
//Dims of tree should match dataset
~tree.cols
//Return labels of k nearest points to new data
(
~tmpbuf.setn(0,[ 1.0.linrand,1.0.linrand ]);
~tree.kNearest(~tmpbuf,5, { |a| a.postln });
)
//or the distances
~tree.kNearestDist(~tmpbuf,5, { |a| a.postln });
:: ::

@ -10,87 +10,168 @@ https://scikit-learn.org/stable/tutorial/statistical_inference/unsupervised_lear
CLASSMETHODS:: CLASSMETHODS::
METHOD:: new
Construct a new K Means model on the passed server
ARGUMENT:: server
If nil will use Server.default
INSTANCEMETHODS:: INSTANCEMETHODS::
PRIVATE::k PRIVATE::k
METHOD:: predictPoint
Given a trained object, return the cluster ID for a data point in a link::Classes/Buffer::
ARGUMENT:: buffer
a link::Classes/Buffer:: containing a data point
ARGUMENT:: action
A function to run when the server responds, taking the ID of the cluster as its argument
METHOD:: fit METHOD:: fit
Identify code::k:: clusters in a link::Classes/FluidDataSet:: Identify code::k:: clusters in a link::Classes/FluidDataSet::
ARGUMENT:: dataset ARGUMENT:: dataset
A link::Classes/FluidDataSet:: of data points A link::Classes/FluidDataSet:: of data points
ARGUMENT:: k ARGUMENT:: k
The number of clusters to identify in the data set The number of clusters to identify in the data set
ARGUMENT:: maxIter ARGUMENT:: maxIter
Maximum number of iterations to use partitioning the data Maximum number of iterations to use partitioning the data
ARGUMENT:: buffer ARGUMENT:: buffer
Seed centroids for clusters WARNING:: Not yet implemented :: Seed centroids for clusters WARNING:: Not yet implemented ::
ARGUMENT:: action ARGUMENT:: action
A function to run when fitting is complete, taking as its argument an array with the number of data points for each cluster A function to run when fitting is complete, taking as its argument an array with the number of data points for each cluster
METHOD:: write METHOD:: predict
write learned clusters to disk as a JSON file. Will not overwrite existing files Given a trained object, return the cluster ID for each data point in a dataset to a label set.
ARGUMENT:: dataset
ARGUMENT:: filename a link::Classes/FluidDataSet:: containing the data to predict
Absolute path for file ARGUMENT:: labelset
a link::Classes/FluidLabelSet:: to receive the predicted clusters
ARGUMENT:: action
A function to run when the file is written
METHOD:: read
Read a learned clustering of a data set from a JSON file
ARGUMENT:: filename
Absolute path of the JSON file
ARGUMENT:: action ARGUMENT:: action
Function to run when the file has been read A function to run when the server responds
METHOD:: getClusters
Fill a link::Classes/FluidLabelSet:: with the assignments for each point in the passed link::Classes/FluidDataSet:: that was used to train this instance
METHOD:: fitPredict
Run link::Classes/FluidKMeans#-fit:: and link::Classes/FluidKMeans#-predict:: in a single pass: i.e. train the model on the incoming link::Classes/FluidDataSet:: and then return the learned clustering to the passed link::Classes/FluidLabelSet::
ARGUMENT:: dataset ARGUMENT:: dataset
The link::Classes/FluidDataSet:: used to train this instance a link::Classes/FluidDataSet:: containing the data to fit and predict
ARGUMENT:: labelset ARGUMENT:: labelset
A link::Classes/FluidLabelSet:: to fill with assignments a link::Classes/FluidLabelSet:: to receive the predicted clusters
ARGUMENT:: k
The number of clusters to identify in the data set
ARGUMENT:: maxIter
Maximum number of iterations to use partitioning the data
ARGUMENT:: action
A function to run when the server responds
METHOD:: predictPoint
Given a trained object, return the cluster ID for a data point in a link::Classes/Buffer::
ARGUMENT:: buffer
a link::Classes/Buffer:: containing a data point
ARGUMENT:: action ARGUMENT:: action
A function to run when the operation is complete A function to run when the server responds, taking the ID of the cluster as its argument
METHOD:: cols METHOD:: cols
Retrieve the dimensionality of the dataset this instance is trained on Retrieve the dimensionality of the dataset this instance is trained on
ARGUMENT:: action ARGUMENT:: action
A function to run when the server responds, taking the dimensionality as its argument A function to run when the server responds, taking the dimensionality as its argument
METHOD:: predict METHOD:: predict
Report cluster assignments for previously unseen data Report cluster assignments for previously unseen data
ARGUMENT:: dataset ARGUMENT:: dataset
A link::Classes/FluidDataSet:: of data points A link::Classes/FluidDataSet:: of data points
ARGUMENT:: labelset ARGUMENT:: labelset
A link::Classes/FluidLabelSet:: to contain assignments A link::Classes/FluidLabelSet:: to contain assignments
ARGUMENT:: action ARGUMENT:: action
A function to run when complete, taking an array of the counts for each category as its argument A function to run when complete, taking an array of the counts for each category as its argument
EXAMPLES::
METHOD:: write
write learned clusters to disk as a JSON file. Will not overwrite existing files
ARGUMENT:: filename
Absolute path for file
ARGUMENT:: action
A function to run when the file is written
METHOD:: read
Read a learned clustering of a data set from a JSON file
ARGUMENT:: filename
Absolute path of the JSON file
ARGUMENT:: action
Function to run when the file has been read
EXAMPLES::
Server.default.options.outDevice = "Built-in Output"
code:: code::
(some example code)
//A dataset for our points, a labelset for cluster labels
(
~dataset= FluidDataSet(s,\kdtree_help_rand2d);
~clusters = FluidLabelSet(s,\kmeans_help_clusters);
)
//Make some clumped 2D points and place into a dataset
(
~points = (4.collect{64.collect{(1.sum3rand) + [1,-1].choose}.clump(2)}).flatten(1) * 0.5;
~dataset.clear;
~tmpbuf = Buffer.alloc(s,2);
fork{
s.sync;
~points.do{|x,i|
(""++(i+1)++"/128").postln;
~tmpbuf.setn(0,x);
~dataset.addPoint(i,~tmpbuf);
s.sync
}
}
)
//Make a new k means model, fit it to the dataset and return the discovered clusters to a labelset
(
fork{
~clusters.clear;
~kmeans = FluidKMeans(s);
s.sync;
~kmeans.fitPredict(~dataset,~clusters, 4,action: {|c|
"Fitted.\n # Points in each cluster:".postln;
c.do{|x,i|
("Cluster" + i + "->" + x.asInteger + "points").postln;
}
});
}
)
//Dims of kmeans should match dataset
~kmeans.cols
//Return labels of clustered points
(
~assignments = Array.new(128);
fork{
128.do{ |i|
~clusters.getLabel(i,{|clusterID|
(i.asString+clusterID).postln;
~assignments.add(clusterID)
});
s.sync;
}
}
)
//Visualise: we're hoping to see colours neatly mapped to quadrants...
(
d = ((~points + 1) * 0.5).flatten(1).unlace;
// d = [20.collect{1.0.rand}, 20.collect{1.0.rand}];
w = Window("scatter", Rect(128, 64, 200, 200));
~colours = [Color.blue,Color.red,Color.green,Color.magenta];
w.drawFunc = {
Pen.use {
d[0].size.do{|i|
var x = (d[0][i]*200);
var y = (d[1][i]*200);
var r = Rect(x,y,5,5);
Pen.fillColor = ~colours[~assignments[i].asInteger];
Pen.fillOval(r);
}
}
};
w.refresh;
w.front;
)
:: ::

@ -12,84 +12,85 @@ CLASSMETHODS::
PRIVATE:: kr PRIVATE:: kr
METHOD:: new METHOD:: new
Make a new instance of a label set, uniquely identified by its name. Multiple instances to of this class with the same name refer to the same server-side entity. Make a new instance of a label set, uniquely identified by its name. Creating an instance with a name already in use will throw an exception. Use link::Classes/FluidLabelSet#*at:: or free the existing instance.
ARGUMENT:: server ARGUMENT:: server
The link::Classes/Server:: on which to create the label set The link::Classes/Server:: on which to create the label set
ARGUMENT:: name ARGUMENT:: name
symbol or string with the label set's name symbol or string with the label set's name
METHOD:: at
Retrieve a label set from the cache
ARGUMENT:: server
The link::Classes/Server:: on which to create the label set
ARGUMENT:: id
symbol or string with the label set's name
INSTANCEMETHODS:: INSTANCEMETHODS::
PRIVATE:: init, id, server, synth PRIVATE:: init, id
METHOD:: addLabel
Add a label to the label set
ARGUMENT:: id
symbol or string with the ID for this label
ARGUMENT:: label
symbol or string with the label to add
ARGUMENT:: action
function to run when the operation completes
METHOD:: updateLabel
Change a label in the label set
ARGUMENT:: id
symbol or string with the ID for this label
ARGUMENT:: label
symbol or string with the label to add
ARGUMENT:: action
function to run when the operation completes
METHOD:: getLabel METHOD:: getLabel
Retrieve the label associated with an ID. Will report an error if the ID isn't present in the set Retrieve the label associated with an ID. Will report an error if the ID isn't present in the set
ARGUMENT:: id ARGUMENT:: id
symbol or string with the ID to retrieve. symbol or string with the ID to retrieve.
ARGUMENT:: action ARGUMENT:: action
A function to run when the server responds, with the label as its argument A function to run when the server responds, with the label as its argument
METHOD:: deleteLabel
Remove an id-label pair from the label set
ARGUMENT:: id
symbol or string with the ID to remove
ARGUMENT:: action
A function to run when the label has been removed
METHOD:: clear
Empty the label set
ARGUMENT:: action
Function to run when the action completes
METHOD:: size
Report the number of items in the label set
ARGUMENT:: action
A function to run when the server responds, taking the size as its argument
METHOD:: cols METHOD:: cols
Returns the dimensionality of the link::Classes/FluidDataSet:: associated with this label set Returns the dimensionality of the link::Classes/FluidDataSet:: associated with this label set
ARGUMENT:: action ARGUMENT:: action
A function to run when the server responds, with the dimensionality as its argument A function to run when the server responds, with the dimensionality as its argument
METHOD:: write METHOD:: write
Write this label set to disk as a JSON file. Will not overwrite existing files. Write this label set to disk as a JSON file.
ARGUMENT:: filename ARGUMENT:: filename
Absolute path of file to write Absolute path of file to write
ARGUMENT:: action ARGUMENT:: action
A function to run when the file is written A function to run when the file is written
METHOD:: read METHOD:: read
Read a label set from a JSON file on disk Read a label set from a JSON file on disk
ARGUMENT:: filename ARGUMENT:: filename
Absolute path of the file to read Absolute path of the file to read
ARGUMENT:: action ARGUMENT:: action
A function to run when the file is read A function to run when the file is read
METHOD:: deleteLabel
Remove an id-label pair from the label set
ARGUMENT:: id
symbol or string with the ID to remove
ARGUMENT:: action
A function to run when the label has been removed
METHOD:: size
Report the num er of items in the label set
ARGUMENT:: action
A function to run when the server responds, taking the size as its argument
METHOD:: addLabel
Add a label to the label set
ARGUMENT:: id
symbol or string with the ID for this label
ARGUMENT:: label
symbol or string with the label to add
ARGUMENT:: action
function to run when the operation completes
METHOD:: clear
Empty the label set
ARGUMENT:: action
Function to run when the action completes
EXAMPLES:: EXAMPLES::
code:: code::

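The diff view ends before the FluidLabelSet example code. As a stand-in that is not part of the commit, here is a minimal sketch using only the methods documented in the hunk above (CODE::addLabel(id, label, action)::, CODE::getLabel(id, action)::, CODE::size(action)::). It assumes a booted server in CODE::s::; the name CODE::\labelset_sketch:: and the ID and label values are hypothetical.

```supercollider
(
// Hypothetical name; per the docs above, reusing a name already in use throws an exception
~labels = FluidLabelSet(s, \labelset_sketch);
fork{
	s.sync;
	// Associate the ID \point0 with the label \bright
	~labels.addLabel(\point0, \bright, { "label added".postln });
	s.sync;  // assumed sufficient to wait for the server, as in the help file's own examples
	// Look the label up again by its ID
	~labels.getLabel(\point0, { |label| ("label:" + label).postln });
	s.sync;
	// Report how many id-label pairs are stored
	~labels.size({ |n| ("size:" + n).postln });
}
)
```

Calling CODE::getLabel:: with an ID that was never added should, per the method description above, report an error instead of running the action with a label.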