I’m working on color clustering (aka color palette, common colors, dominant colors) of images. One very useful technique is k-means clustering, which is implemented in Scholar.
After k-means clustering I’d like to be able to get the silhouette score, which is implemented as Scholar.Metrics.Clustering.silhouette_score/3. That would seem to be a very natural step to validate or refute the choice of the number of clusters that was used (num_clusters: 16 below).
However, the output generated by fit/2 doesn’t seem to play well with silhouette_score/3, despite the two being in a similar domain.
iex> kmeans = Image.Nx.kmeans(i, num_clusters: 16)
%Scholar.Cluster.KMeans{
  num_iterations: #Nx.Tensor<
    s64
    EXLA.Backend<host:0, 0.2302565275.2944008210.1119>
    24
  >,
  clusters: #Nx.Tensor<
    f32[16][3]
    EXLA.Backend<host:0, 0.2302565275.2944008210.1120>
    [
      [29.555273056030273, 22.40241813659668, 18.988496780395508],
      [221.6188507080078, 186.17681884765625, 115.16548919677734],
      [158.05177307128906, 123.88009643554688, 53.33770751953125],
      [127.1727294921875, 95.57713317871094, 41.13895034790039],
      [62.95863342285156, 47.129539489746094, 33.71294021606445],
      [97.46968841552734, 79.16098022460938, 61.60565185546875],
      [204.48558044433594, 216.12306213378906, 231.07981872558594],
      [161.0527801513672, 182.42922973632812, 208.91531372070312],
      [131.0211944580078, 107.85687255859375, 78.05144500732422],
      [190.57359313964844, 153.97003173828125, 62.70090103149414],
      [193.35650634765625, 161.45425415039062, 100.00911712646484],
      [162.95822143554688, 135.37939453125, 89.99324035644531],
      [222.13687133789062, 185.55056762695312, 75.35253143310547],
      [95.11575317382812, 67.52027893066406, 28.772855758666992],
      [250.12782287597656, 223.23638916015625, 149.09202575683594],
      [244.928466796875, 210.7081298828125, 104.9043960571289]
    ]
  >,
  inertia: #Nx.Tensor<
    f32
    EXLA.Backend<host:0, 0.2302565275.2944008210.1121>
    33850068.0
  >,
  labels: #Nx.Tensor<
    s64[91767]
    EXLA.Backend<host:0, 0.2302565275.2944008210.1122>
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]
  >
}
The function Image.Nx.kmeans/2 is just:
def kmeans(%Vimage{} = image, options \\ []) do
  {_count, colors} = unique_colors(image)
  Scholar.Cluster.KMeans.fit(colors, options)
end
It feels like it would be reasonable to call Scholar.Metrics.Clustering.silhouette_score(colors, kmeans.clusters), but that’s very definitely not the signature.
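For what it’s worth, my untested reading of the silhouette_score/3 docs is that it wants the original data, the per-sample labels, and a num_clusters option, rather than the centroids. Here is a minimal sketch of that guess, assuming Scholar and Nx are already available in the session. The six-row colors tensor and num_clusters of 2 are made up for illustration and stand in for the real unique_colors output:

```elixir
# Hypothetical stand-in for the unique colors of an image: six RGB rows
# forming two obvious groups (dark and light).
colors =
  Nx.tensor([
    [10.0, 12.0, 11.0],
    [14.0, 9.0, 13.0],
    [12.0, 11.0, 10.0],
    [240.0, 238.0, 245.0],
    [236.0, 241.0, 239.0],
    [242.0, 240.0, 237.0]
  ])

num_clusters = 2

# Same call that Image.Nx.kmeans/2 makes internally.
model = Scholar.Cluster.KMeans.fit(colors, num_clusters: num_clusters)

# Pass the per-row labels (shape {6}), not the centroids in model.clusters
# (shape {2, 3}). Assumption: :num_clusters is a required option here.
score =
  Scholar.Metrics.Clustering.silhouette_score(colors, model.labels,
    num_clusters: num_clusters
  )

IO.inspect(score, label: "silhouette score")
```

If that reading is right, the real call would need the same colors tensor that was passed to fit/2, which Image.Nx.kmeans/2 currently does not return.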
Any advice or guidance on how to apply the results from fit/2 to silhouette_score/3 would be greatly appreciated.