title:: The Fluid Corpus Manipulation Data Tools
summary:: Tools for organising, exploring and querying corpora
categories:: Libraries>FluidDecomposition,Guides>FluCoMa
related:: Guides/FluCoMa, Guides/FluidDecomposition, Classes/FluidDataSet,Classes/FluidLabelSet

The suite of Fluid Corpus Manipulation data tools offers facilities for building, exploring, transforming and playing with corpora. The tools are built around two container classes, link::Classes/FluidDataSet:: and link::Classes/FluidLabelSet::, which provide a way to build up and store collections of labelled data, and a suite of objects that act on these containers.

The design and interface of many of these objects are heavily based on the Python library link::https://scikit-learn.org/stable/##scikit-learn::, a mature and well-developed machine learning toolkit that is comparatively quick to get going with. As our documentation continues to develop, we will also lean quite heavily on scikit-learn's.

section:: Containers

Map id labels to data points, or to other labels

link::Classes/FluidDataSet::

link::Classes/FluidLabelSet::

section:: DataSet Filtering

Select and filter items from a FluidDataSet by building queries

link::Classes/FluidDataSetQuery::

section:: Data Structure

Perform nearest neighbour searches

link::Classes/FluidKDTree::

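To make the idea of a nearest neighbour search concrete, here is a minimal sketch in plain Python, in the spirit of the scikit-learn library mentioned above. It is not the FluidKDTree interface: the function name code::k_nearest:: and the example points are hypothetical, and it compares the query against every point in turn. A k-d tree is a data structure for answering the same kind of question without visiting every point.

```python
import math

def k_nearest(points, query, k=1):
    """Return the ids of the k points closest to `query` (Euclidean distance).

    `points` maps an id to a list of coordinates, much like a data set
    that maps id labels to data points.
    """
    ranked = sorted(points, key=lambda pid: math.dist(points[pid], query))
    return ranked[:k]

# Hypothetical two-dimensional data, keyed by id.
points = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
}

# Ids of the two points nearest to the query, closest first.
nearest = k_nearest(points, [0.9, 1.2], k=2)
```

The result is a list of ids rather than coordinates, so it can be used to look up the matching data points or labels afterwards.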
section:: Data Conditioning

Pre-process data

link::Classes/FluidNormalize::

link::Classes/FluidStandardize::

link::Classes/FluidRobustScale::

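The three conditioning objects differ in which statistics they use to rescale a column of data. The sketch below, in plain Python, assumes the conventional definitions used by scikit-learn's MinMaxScaler, StandardScaler and RobustScaler (min and max, mean and standard deviation, median and interquartile range). The function names are hypothetical and this is not the FluCoMa interface; see the individual help files for the actual parameters and defaults.

```python
import statistics

def normalize(column):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

def standardize(column):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std for x in column]

def robust_scale(column):
    """Centre on the median and scale by the interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in column]

# One column with a single large outlier.
column = [1.0, 2.0, 3.0, 4.0, 100.0]

normalized = normalize(column)      # the outlier squeezes the other values towards 0
standardized = standardize(column)  # mean 0, standard deviation 1
robust = robust_scale(column)       # the first four values keep an even spacing
```

With an outlier present, the median and interquartile range are barely affected, which is why the robust version keeps the bulk of the data evenly spread while the other two compress it.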
section:: Dimension Reduction

Compress data to fewer dimensions for visualisation, efficiency or preprocessing

link::Classes/FluidPCA::

link::Classes/FluidMDS::

section:: Supervised Learning

Train supervised learning models using either k-nearest neighbours or a simple neural network

subsection:: Classification

Map input data points to categories

link::Classes/FluidKNNClassifier::

link::Classes/FluidMLPClassifier::

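As an illustration of how a set of data points and a set of labels that share the same ids can be combined for classification, here is a minimal k-nearest-neighbours sketch in plain Python. It is an assumption-laden toy, not the FluidKNNClassifier interface: the function name code::knn_classify::, the ids and the labels are all hypothetical, and the vote is a simple unweighted majority.

```python
import math
from collections import Counter

def knn_classify(data, labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest points.

    `data` maps ids to coordinates and `labels` maps the same ids to categories.
    """
    nearest = sorted(data, key=lambda pid: math.dist(data[pid], query))[:k]
    votes = Counter(labels[pid] for pid in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: two points per category.
data = {
    "p1": [0.0, 0.1],
    "p2": [0.2, 0.0],
    "p3": [0.9, 1.0],
    "p4": [1.0, 0.8],
}
labels = {"p1": "quiet", "p2": "quiet", "p3": "loud", "p4": "loud"}

# The query sits next to p1 and p2, so two of its three neighbours vote "quiet".
prediction = knn_classify(data, labels, [0.1, 0.2], k=3)
```

Replacing the majority vote with an average of numeric targets turns the same procedure into a regressor, which is the distinction drawn between classification and regression here.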
subsection:: Regression

Map input data points to continuous output

link::Classes/FluidKNNRegressor::

link::Classes/FluidMLPRegressor::