Tuesday, April 30, 2013

Simplifying clustering visualization with mlboost


Are you looking for a simple way to visualize your supervised or semi-supervised data clusters with different dimension reduction algorithms such as PCA, LDA, Isomap, LLE, MDS, random trees, spectral embedding, etc.?
Here is an example output on the 4 newsgroups dataset.

If you follow the sklearn loading standard, mlboost lets you do it by changing 2 lines of code (lines #5 and #6 below) or by modifying this example, then running python yourvisu.py -m y.

import sys
from mlboost.clustering import visu

# add your data loading function, which returns data_train and data_test
from X import LOAD_DATASET_Y
visu.add_loading_dataset_fct('y', LOAD_DATASET_Y)
visu.main(sys.argv[1:])
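To make the loading function more concrete, here is a minimal sketch of what one could look like. It is only an illustration: the name load_toy_dataset, the toy data, and the use of SimpleNamespace to imitate the data, target and target_names attributes of an sklearn dataset object are all my assumptions, and I have not checked exactly which attributes mlboost reads.

```python
import random
from types import SimpleNamespace


def load_toy_dataset(test_fraction=0.25, seed=0):
    """Hypothetical loader that returns (data_train, data_test)."""
    rng = random.Random(seed)

    # Toy 2-class data: 20 points around (0, 0) and 20 points around (3, 3).
    samples = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(20):
            samples.append(([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label))
    rng.shuffle(samples)

    n_test = int(len(samples) * test_fraction)

    def as_dataset(rows):
        # Assumption: imitate the attributes of an sklearn dataset object.
        return SimpleNamespace(
            data=[point for point, _ in rows],
            target=[label for _, label in rows],
            target_names=["class 0", "class 1"],
        )

    return as_dataset(samples[n_test:]), as_dataset(samples[:n_test])


data_train, data_test = load_toy_dataset()
print(len(data_train.data), len(data_test.data))
```

A function like this would take the place of LOAD_DATASET_Y in the snippet above. With real data you would more likely return the objects produced by an sklearn loader directly.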
By the way, clicking a class in the legend hides it, as you can see here where I removed the green class 2. For semi-supervised data, simply set the class of each unlabeled sample to "?" (dataset.target[i]).
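As a rough sketch of the "?" convention, the helper below hides a fraction of the labels in a plain Python list. The function mask_labels and its parameters are hypothetical and not part of mlboost; it assumes the targets are stored in a list.

```python
import random


def mask_labels(targets, unlabeled_fraction=0.5, seed=0):
    """Return a copy of targets with a random fraction replaced by "?"."""
    rng = random.Random(seed)
    n_unlabeled = int(len(targets) * unlabeled_fraction)
    # Pick which sample indices lose their label.
    unlabeled = set(rng.sample(range(len(targets)), n_unlabeled))
    return ["?" if i in unlabeled else t for i, t in enumerate(targets)]


targets = [0, 1, 2, 0, 1, 2, 0, 1]
masked = mask_labels(targets, unlabeled_fraction=0.25)
print(masked)
```

If dataset.target is a numeric NumPy array, it cannot hold the string "?" directly, so it would first need to be converted to a list or an object array.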
 

Without scikit-learn and matplotlib, experimenting with visualization would not be this easy.
