How to generate clustered data.
To generate clustered data, draw one set of random points, assign each point to a group, and shift each group by its own mean.
set.seed(101)                   # fix the seed so the random draws are reproducible
x=matrix(rnorm(100*2),100,2)    # 100 points in 2 dimensions, standard normal
where x[1:100,] is (first 8 rows shown):
            [,1]        [,2]
[1,] -0.56843578  0.24912228
[2,]  0.77859810 -0.16461954
[3,] -0.15684682  0.37593032
[4,] -1.81059190 -0.79511759
[5,] -1.90281490 -0.13780093
[6,]  2.33700231  1.88560945
[7,] -0.46189692 -0.93481448
[8,]  0.54721322  1.26122751
...
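plot(x,pch=19)                        # scatter plot of the unshifted points
which=sample(1:3,100,replace=TRUE)    # assign each point a random group label: 1, 2, or 3
where the vector which is: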
[1] 1 3 3 3 1 3 2 1 2 3 3 2 1 1 2 3 2 3 3 1 2 3 2 2 1 3 2 2 1 1 3 3 3 1 3 1 1 1 1 2 3 3 1 2 1 2 1 2 2 3 2 3 3 1
[55] 1 2 1 1 2 2 3 2 2 1 1 3 2 3 3 2 1 3 3 1 3 3 3 3 1 2 2 3 1 3 3 3 1 2 3 3 2 1 2 1 1 3 2 1 3 3
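A quick way to check how the random labels are spread over the three groups is to tabulate them. The following is a minimal sketch, not part of the original steps. The names cluster_id and counts are chosen for this example only (cluster_id plays the role of which above), and because the sketch draws the labels right after setting the seed, it is not expected to reproduce the exact labels printed above.

```r
# Sketch: draw random group labels and count how many points land in each group.
# `cluster_id` and `counts` are names invented for this example.
set.seed(101)
cluster_id = sample(1:3, 100, replace=TRUE)     # one label in {1, 2, 3} per point
counts = table(factor(cluster_id, levels=1:3))  # points per group; empty groups would show 0
print(counts)
```

Using a name other than which is a readability choice. Calling a vector which does not break the base function which(), because R looks for a function when a name is called, but a distinct name avoids confusion.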
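plot(x,col=which,pch=19)              # same points, colored by group label
xmean=matrix(rnorm(3*2,sd=4),3,2)     # one random mean per group (one row each), drawn with sd=4
where xmean is: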
          [,1]        [,2]
[1,] -4.235016 -1.84473873
[2,]  1.632360 -0.03466352
[3,] -1.100477 -7.02588458
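The shift in the next step relies on indexing the rows of xmean with the label vector: each label selects one row, so a vector of 100 labels yields a 100 x 2 matrix holding one mean per point. A small sketch with made-up numbers may make this easier to see. The names centers, id, and expanded, and all values, are invented for illustration.

```r
# Three hypothetical group means, one per row (values invented for this example).
centers = matrix(c(10, 20, 30,     # first column of the three means
                   -1, -2, -3),    # second column of the three means
                 3, 2)
id = c(1, 3, 3, 2)                 # group labels of four hypothetical points

# Indexing rows by `id` repeats rows of `centers`: the result has one row per point.
expanded = centers[id, ]
print(expanded)
#      [,1] [,2]
# [1,]   10   -1
# [2,]   30   -3
# [3,]   30   -3
# [4,]   20   -2
```

Because the expanded matrix has the same shape as the data, it can be added to the data element by element.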
xclusterd=x+xmean[which,]             # shift every point by the mean of its assigned group
plot(xclusterd,col=which,pch=19)      # plot the shifted points, colored by group
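One way to sanity-check the result is to average the shifted points within each group and compare those averages with the rows of xmean. The sketch below reruns the steps in one piece under that assumption. The names id (called which above) and emp_mean are introduced here for illustration, and no particular output is claimed.

```r
# Sketch: rebuild the clustered data and compare empirical group means with the shifts.
# `id` and `emp_mean` are names invented for this example; the rest follow the text.
set.seed(101)
x = matrix(rnorm(100*2), 100, 2)           # unshifted standard-normal points
id = sample(1:3, 100, replace=TRUE)        # random group labels
xmean = matrix(rnorm(3*2, sd=4), 3, 2)     # one random mean per group
xclusterd = x + xmean[id, ]                # shift each point by its group's mean

# Sum the shifted points per group, then divide each row by that group's size.
emp_mean = rowsum(xclusterd, id) / as.vector(table(id))
print(round(emp_mean - xmean, 2))          # differences between empirical and generating means
```

Since the unshifted points have mean 0 and standard deviation 1, the differences should be small compared with the spread of the group means themselves (sd=4).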