title:: The Fluid Corpus Manipulation Data Tools
summary:: Tools for organising, exploring and querying corpora
categories:: Libraries>FluidDecomposition,Guides>FluCoMa
related:: Guides/FluCoMa, Guides/FluidDecomposition, Classes/FluidDataSet,Classes/FluidLabelSet
The suite of Fluid Corpus Manipulation data tools offers facilities for building, exploring, transforming and playing with corpora. The tools are built around two container classes, link::Classes/FluidDataSet:: and link::Classes/FluidLabelSet::, which provide a way to build up and store collections of labelled data, and a suite of objects that act on these containers.
The design and interface of many of these objects are heavily based on the Python library link::https://scikit-learn.org/stable/##scikit-learn::, a mature and well-developed machine learning toolkit that is comparatively quick to get going with. As our documentation continues to develop, we will also lean quite heavily on scikit-learn's documentation.
section:: Containers
Map id labels to data points, or to other labels
link::Classes/FluidDataSet::
link::Classes/FluidLabelSet::
section:: DataSet Filtering
Select and filter items from a FluidDataSet by building queries
link::Classes/FluidDataSetQuery::
section:: Data Structure
Perform nearest neighbour searches
link::Classes/FluidKDTree::
section:: Data Conditioning
Pre-process data
link::Classes/FluidNormalize::
link::Classes/FluidStandardize::
link::Classes/FluidRobustScale::
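
The three conditioning objects differ in which statistics they use to rescale each dimension. The following is a minimal sketch in plain Python, not the FluCoMa API: the function names are illustrative, and the formulas assume the usual scikit-learn definitions (min-max scaling, zero mean and unit variance, and median and interquartile range).

```python
import statistics


def normalize(column, lo=0.0, hi=1.0):
    """Min-max scaling: map the column's minimum to lo and its maximum to hi."""
    cmin, cmax = min(column), max(column)
    span = cmax - cmin
    if span == 0:
        # A constant column carries no range information.
        return [lo for _ in column]
    return [lo + (x - cmin) * (hi - lo) / span for x in column]


def standardize(column):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def robust_scale(column):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        return [0.0 for _ in column]
    return [(x - median) / iqr for x in column]


if __name__ == "__main__":
    # One dimension of a hypothetical dataset, with a single outlier.
    column = [1.0, 2.0, 3.0, 4.0, 100.0]
    print(normalize(column))
    print(standardize(column))
    print(robust_scale(column))
```

With an outlier in the data, min-max scaling squeezes the remaining values into a narrow band, whereas the median and interquartile range are largely unaffected by it. That is the usual reason to prefer robust scaling for noisy descriptors.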
section:: Dimension Reduction
Compress data to fewer dimensions for visualisation, efficiency or preprocessing
link::Classes/FluidPCA::
link::Classes/FluidMDS::
section:: Supervised Learning
Train supervised learning models using either k-nearest neighbours or a simple neural network
subsection:: Classification
Map input data points to categories
link::Classes/FluidKNNClassifier::
link::Classes/FluidMLPClassifier::
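
To illustrate how the containers and a classifier fit together, here is a minimal sketch of k-nearest-neighbour classification in plain Python. It is not the FluCoMa API: the two dictionaries only stand in for a FluidDataSet (identifier to data point) and a FluidLabelSet (identifier to label), and the function name, identifiers and labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(points, labels, query, k=3):
    """Return the most common label among the k points nearest to query.

    points: dict mapping an identifier to a list of floats
    labels: dict mapping the same identifiers to a category label
    """
    # Rank identifiers by Euclidean distance from the query point.
    ranked = sorted(points, key=lambda pid: math.dist(points[pid], query))
    # Let the k nearest neighbours vote with their labels.
    votes = Counter(labels[pid] for pid in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-dimensional descriptors for four sounds.
    points = {
        "a": [0.0, 0.0],
        "b": [0.0, 1.0],
        "c": [5.0, 5.0],
        "d": [6.0, 5.0],
    }
    labels = {"a": "low", "b": "low", "c": "high", "d": "high"}
    print(knn_classify(points, labels, [0.2, 0.1], k=3))
```

This sketch sorts every point by distance, which is fine for small collections. A spatial index such as a k-d tree serves the same purpose more efficiently for larger ones.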
subsection:: Regression
Map input data points to continuous output
link::Classes/FluidKNNRegressor::
link::Classes/FluidMLPRegressor::