Using Manifolds for Visualisation

Visualisation of data which has high dimensionality is challenging, and one solution is to provide visualisations in low-dimension representations of the space actually spanned by the data. Here we provide an example of visualisation of classification predictions and relative prediction certainty (using entropy across predicted probability for each individual class) for a toy sklearn dataset.

import matplotlib.pyplot as plt
import numpy as np
import sklearn.datasets

from pyrolite.util.plot import DEFAULT_DISC_COLORMAP
from pyrolite.util.skl.pipeline import SVC_pipeline
from pyrolite.util.skl.vis import plot_mapping

np.random.seed(82)
wine = sklearn.datasets.load_wine()
data, target = wine["data"], wine["target"]
# data = data[:, np.random.random(data.shape[1]) > 0.4]  # randomly remove fraction of dimensionality
svc = SVC_pipeline(probability=True)
gs = svc.fit(data, target)
Fitting 10 folds for each of 1 candidates, totalling 10 fits
fig, ax = plt.subplots(1, 2, figsize=(8, 4))

a, tfm, mapped = plot_mapping(data, gs.best_estimator_, ax=ax[1], s=50, init="pca")
ax[0].scatter(*mapped.T, c=DEFAULT_DISC_COLORMAP(gs.predict(data)), s=50)

ax[0].set_title("Predicted Classes")
ax[1].set_title("With Relative Certainty")

for a in ax:
    a.set_xticks([])
    a.set_yticks([])
manifold vis

Total running time of the script: (0 minutes 3.162 seconds)