Jack Dermody

Data

Text Clustering Four Ways

The data that we want to cluster is a list of papers accepted to an AI conference.

First, download the data set to your computer, create a new console application and add a reference to Bright Wire.

The data is parsed from the CSV file into a data table and then converted to a list of strongly typed AAAIDocuments.
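To make the parsing step concrete, here is a minimal, library-neutral sketch in Python of reading CSV rows into strongly typed records. It does not use Bright Wire's data table API. The `Document` class, the column names (`title`, `keywords`, `abstract`), the semicolon keyword separator and the sample rows are all assumptions made up for illustration, so adjust them to match the real file.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    """Hypothetical stand-in for a strongly typed document record."""
    title: str
    keywords: List[str]
    abstract: str


def parse_documents(csv_text: str) -> List[Document]:
    """Parse CSV text (with a header row) into a list of Document records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        # Assumption: keywords are separated by semicolons within one cell.
        keywords = [k.strip() for k in row["keywords"].split(";") if k.strip()]
        docs.append(Document(row["title"].strip(), keywords, row["abstract"].strip()))
    return docs


# Placeholder rows, not taken from the real data set.
SAMPLE = """title,keywords,abstract
Example Paper A,clustering; text mining,A placeholder abstract.
Example Paper B,matrix factorisation,Another placeholder abstract.
"""

documents = parse_documents(SAMPLE)
for doc in documents:
    print(doc.title, doc.keywords)
```

Keeping the parsing separate from the typed record means the clustering code only ever sees clean fields, whatever the layout of the source file.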

Non-negative Matrix Factorisation (NNMF) is a technique that is only applicable to non-negative data (every value zero or greater), such as our counts of the document strings.
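As a rough illustration of the idea, the sketch below factorises a small count matrix V into two non-negative matrices W and H so that V is approximately W times H. It uses the standard multiplicative update rules, which is an assumption on my part and not necessarily how Bright Wire implements NNMF. The function names, the toy `counts` matrix and the parameter defaults are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Sum of squared differences between v and the product w*h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nnmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorise non-negative v (n x m) into w (n x k) and h (k x m)."""
    if any(x < 0 for row in v for x in row):
        raise ValueError("NNMF requires non-negative input")
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Start from strictly positive random factors.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # Update h: h *= (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Update w: w *= (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


# Toy document-by-term counts (rows are documents); made up for illustration.
counts = [[3, 2, 0, 0],
          [2, 3, 0, 0],
          [0, 0, 4, 1],
          [0, 0, 1, 3]]

w, h = nnmf(counts, k=2)
# Treat the column of w with the largest weight as each document's cluster.
clusters = [max(range(2), key=lambda a: row[a]) for row in w]
print(clusters, frobenius_error(counts, w, h))
```

Because every update multiplies by a ratio of non-negative quantities, W and H can never become negative, which is why the input itself must contain no negative values. Zero counts are fine.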