About
How to run K-means clustering in R with the kmeans function.
Steps
Generate Data
K-means works in any number of dimensions, but in two dimensions we can plot the data.
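The generation code itself is not shown here, so the following is only a minimal sketch of one way to build such data. The names xclustered and which are the ones used in the rest of this page, and the 100 observations match the length of the clustering vector shown later; the seed, the spread of the centres (sd = 4) and the intermediate names x and xmean are assumptions made for illustration.

```r
set.seed(101)                                # arbitrary seed (assumption), for reproducibility only
n <- 100                                     # 100 observations, as in the clustering vector shown later

x <- matrix(rnorm(n * 2), n, 2)              # standard-normal noise in two dimensions
xmean <- matrix(rnorm(3 * 2, sd = 4), 3, 2)  # one random centre per group (sd = 4 is an assumption)
which <- sample(1:3, n, replace = TRUE)      # true group of each observation
xclustered <- x + xmean[which, ]             # shift every point by the centre of its group

plot(xclustered, col = which, pch = 19)      # filled points, coloured by true group
```

Because the centres are drawn at random, a different seed gives groups that are more or less well separated.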
KMeans
The kmeans function is in the stats package, which R loads by default.
km.out=kmeans(xclustered,3,nstart=15)
km.out
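where:
- xclustered is the data matrix to cluster
- 3 is the number of clusters to search for
- nstart=15 runs the algorithm from 15 random sets of starting centres and keeps the best result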
K-means clustering with 3 clusters of sizes 33, 28, 39
Cluster means:
[,1] [,2]
1 -1.107234 -6.7087012
2 1.803277 -0.0341333
3 -3.900942 -2.1654215
Clustering vector:
[1] 1 2 3 3 2 1 3 1 2 2 2 3 3 1 3 3 1 1 2 1 1 2 3 3 3 3 3 3 3 3 1 2 3 3 1 2 1 1 3 2 3 1 3 1 3 3 2 2 2 1 1 3 1 1
[55] 1 1 3 1 1 2 3 1 3 1 2 1 3 3 2 3 1 3 2 2 1 2 3 3 1 3 1 3 2 2 3 2 2 2 2 2 2 3 3 3 1 3 1 2 1 1
Within cluster sum of squares by cluster:
[1] 63.30585 54.25311 114.05169
(between_SS / total_SS = 84.5 %)
Available components:
[1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss" "size"
[8] "iter" "ifault"
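The components listed above are ordinary list elements and can be read with the $ operator. The sketch below rebuilds a small toy data set so that it runs on its own; the seed, the spread of the centres and the name ratio are assumptions, so the numbers it prints will differ from the output shown above.

```r
# Toy data: seed, spread and sizes are assumptions made for illustration only
set.seed(101)
which <- sample(1:3, 100, replace = TRUE)
xmean <- matrix(rnorm(3 * 2, sd = 4), 3, 2)
xclustered <- matrix(rnorm(100 * 2), 100, 2) + xmean[which, ]

km.out <- kmeans(xclustered, 3, nstart = 15)

km.out$centers        # one row per cluster: the coordinates of its centre
km.out$size           # number of observations in each cluster
km.out$cluster[1:10]  # cluster label of the first ten observations
km.out$withinss       # within-cluster sum of squares, one value per cluster

# Share of the total sum of squares explained by the clustering,
# the same ratio the printed summary reports as between_SS / total_SS
ratio <- km.out$betweenss / km.out$totss
ratio
```

A ratio close to 1 suggests compact, well-separated clusters, which is how the 84.5 % in the printed output can be read.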
Plot
Plot the data:
- coloured by the cluster assignment returned by kmeans (km.out$cluster)
- as empty circles (pch=1)
- magnified two times (cex=2)
so that the points of the original generated grouping can be drawn inside the circles.
plot(xclustered,col=km.out$cluster,pch=1,cex=2)
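Add the points, coloured by their original grouping:
points(xclustered,col=which,pch=19)
The cluster numbers returned by kmeans are arbitrary, so the inner and outer colours may not match for the same group. Reorder the colours to line them up:
points(xclustered,col=c(1,3,2)[which],pch=19)
where
col=c(1,3,2)[which]
maps original group 1 to colour 1, group 2 to colour 3 and group 3 to colour 2.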
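The permutation c(1,3,2) was read off the plot by eye. As a hedged alternative, the sketch below derives it from a cross-tabulation of the k-means labels against the original groups. The toy data, the seed and the names tab and mapping are assumptions made for illustration, so the permutation it finds need not be c(1,3,2).

```r
# Toy data: seed, spread and sizes are assumptions made for illustration only
set.seed(101)
which <- sample(1:3, 100, replace = TRUE)
xmean <- matrix(rnorm(3 * 2, sd = 4), 3, 2)
xclustered <- matrix(rnorm(100 * 2), 100, 2) + xmean[which, ]
km.out <- kmeans(xclustered, 3, nstart = 15)

# Rows are k-means clusters, columns are the original groups
tab <- table(cluster = km.out$cluster, group = which)
tab

# For each original group, take the cluster that holds most of its points
mapping <- as.integer(apply(tab, 2, which.max))
mapping

plot(xclustered, col = km.out$cluster, pch = 1, cex = 2)  # empty circles: k-means clusters
points(xclustered, col = mapping[which], pch = 19)        # filled points: original groups, recoloured
```

A filled point whose colour differs from its surrounding circle is an observation that k-means assigned to a different cluster than its original group. If two groups overlap heavily, both may map to the same cluster, in which case this simple majority rule is not a true permutation.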