# R - K-means clustering

This page shows how to run K-means clustering in R.

## Steps

### Generate Data

K-means works in any number of dimensions, but with two-dimensional data we can plot the points.
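The later steps use a matrix xclustered and a label vector which. A minimal sketch of how such data could be generated, assuming three groups of two-dimensional Gaussian points (the seed, the spread sd=4 and the helper names n, x and xmean are assumptions, so the numbers will not match the output shown further down):

```r
set.seed(101)                                 # assumed seed, only for reproducibility
n <- 100                                      # 100 points, as in the clustering vector below
x <- matrix(rnorm(n * 2), n, 2)               # standard normal noise in 2 dimensions
xmean <- matrix(rnorm(3 * 2, sd = 4), 3, 2)   # one random centre per group (3 groups assumed)
which <- sample(1:3, n, replace = TRUE)       # true group label of each point
xclustered <- x + xmean[which, ]              # shift every point by the centre of its group
plot(xclustered, col = which, pch = 19)       # filled dots coloured by true group
```

Naming the label vector which shadows nothing at call time: R still finds the base function which() when it is called as a function.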

### KMeans

The kmeans function is in the stats package, which is loaded by default.

km.out=kmeans(xclustered,3,nstart=15)
km.out

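where:

• 3 means that we search for 3 clusters
• nstart=15 means that 15 random sets of starting centres are tried and the best result is kept

This prints:
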
K-means clustering with 3 clusters of sizes 33, 28, 39

Cluster means:
[,1]       [,2]
1 -1.107234 -6.7087012
2  1.803277 -0.0341333
3 -3.900942 -2.1654215

Clustering vector:
[1] 1 2 3 3 2 1 3 1 2 2 2 3 3 1 3 3 1 1 2 1 1 2 3 3 3 3 3 3 3 3 1 2 3 3 1 2 1 1 3 2 3 1 3 1 3 3 2 2 2 1 1 3 1 1
[55] 1 1 3 1 1 2 3 1 3 1 2 1 3 3 2 3 1 3 2 2 1 2 3 3 1 3 1 3 2 2 3 2 2 2 2 2 2 3 3 3 1 3 1 2 1 1

Within cluster sum of squares by cluster:
[1]  63.30585  54.25311 114.05169
(between_SS / total_SS =  84.5 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss" "betweenss"    "size"
[8] "iter"         "ifault"
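The names listed under "Available components" can be read from the result with the $ operator. A short sketch on a small synthetic data set (the data and the seed are assumptions made only to keep the example self-contained):

```r
set.seed(1)                                       # assumed seed
# Three well-separated blobs of 20 points each (illustrative data)
xclustered <- rbind(matrix(rnorm(40, mean = 0), 20, 2),
                    matrix(rnorm(40, mean = 5), 20, 2),
                    matrix(rnorm(40, mean = 10), 20, 2))
km.out <- kmeans(xclustered, 3, nstart = 15)

km.out$centers                     # one row per cluster: the cluster means
km.out$size                        # number of points in each cluster
km.out$withinss                    # within-cluster sum of squares, per cluster
km.out$tot.withinss                # sum of the per-cluster values
km.out$betweenss / km.out$totss    # the ratio printed as between_SS / total_SS
```

The total sum of squares splits into the within and between parts, so totss equals tot.withinss plus betweenss.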

### Plot

Plot the data:

• with the colours of the k-means cluster output (km.out$cluster)
• as empty circles (pch=1)
• magnified two times (cex=2)

so that the points of the originally generated grouping can be drawn inside the circles.

plot(xclustered,col=km.out$cluster,pch=1,cex=2)

Add the points coloured by their original grouping:

points(xclustered,col=which,pch=19)
points(xclustered,col=c(1,3,2)[which],pch=19)

where

col=c(1,3,2)[which]

remaps the colours of the original groups so that they match the cluster colours, because k-means numbers its clusters arbitrarily.
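The permutation c(1,3,2) depends on the particular run. One way to find it, sketched here under the assumption that the clusters are well separated, is to cross-tabulate the true labels against the cluster labels and pick the most frequent cluster for each group (the data, the seed and the names centres, tab and map are illustrative, not part of the original example):

```r
set.seed(2)                                           # assumed seed
which <- sample(1:3, 90, replace = TRUE)              # true group labels
centres <- rbind(c(0, 0), c(10, 0), c(0, 10))         # illustrative, well-separated centres
xclustered <- matrix(rnorm(90 * 2), 90, 2) + centres[which, ]
km.out <- kmeans(xclustered, 3, nstart = 15)

# Rows are true groups, columns are k-means clusters
tab <- table(truth = which, cluster = km.out$cluster)
tab

# For each true group, the cluster number that holds most of its points
map <- unname(apply(tab, 1, which.max))

plot(xclustered, col = km.out$cluster, pch = 1, cex = 2)   # empty circles: cluster colours
points(xclustered, col = map[which], pch = 19)             # filled dots: remapped true groups
```

When a dot and its surrounding circle have different colours, that point was assigned to a cluster other than the one matched to its true group.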