====== Cluster analysis ======

Get the stamped bricks' XRD dataset by Shawn Graham [[http://electricarchaeologist.wordpress.com/2008/04/21/xrd-results-of-british-school-at-rome-stamped-bricks/|here]] and save it in a plain text file.

  > xrd <- read.delim("bricks_xrd.txt", sep="\t")
  > summary(xrd)
      Sample        Quartz           Augite        Haematite       Gehlenite        Calcite        Analcime     Muscovite
    se 14  : 2  Min.   : 11.00  Min.   : 10.0  Min.   : 3.00  Min.   : 2.00  Min.   :  2.0  Min.   : 5  Min.   : 3.00
    fal 1  : 1  1st Qu.: 36.75  1st Qu.: 29.5  1st Qu.: 8.00  1st Qu.:11.00  1st Qu.: 40.0  1st Qu.:11  1st Qu.: 9.00
    fal 2  : 1  Median : 52.00  Median : 54.0  Median :11.00  Median :30.00  Median : 66.0  Median :15  Median :12.00
    fal 3  : 1  Mean   : 59.24  Mean   : 60.1  Mean   :12.98  Mean   :30.46  Mean   : 64.9  Mean   :17  Mean   :14.24
    fnv13  : 1  3rd Qu.: 86.00  3rd Qu.: 91.5  3rd Qu.:17.00  3rd Qu.:47.00  3rd Qu.: 95.5  3rd Qu.:20  3rd Qu.:17.00
    fnv14  : 1  Max.   :117.00  Max.   :120.0  Max.   :34.00  Max.   :90.00  Max.   :127.0  Max.   :50  Max.   :42.00
    (Other):89                  NA's   :  9.0  NA's   : 8.00  NA's   :37.00  NA's   : 33.0  NA's   :51  NA's   :34.00
      Dolomite      Anorthoclase      Sanidine        Albite
    Min.   : 3.00  Min.   : 20.00  Min.   :  5.0  Min.   :  3.0
    1st Qu.:17.00  1st Qu.: 42.00  1st Qu.: 30.0  1st Qu.: 50.0
    Median :23.50  Median : 66.00  Median : 50.0  Median : 65.5
    Mean   :24.43  Mean   : 65.33  Mean   : 49.5  Mean   : 65.2
    3rd Qu.:32.00  3rd Qu.: 90.00  3rd Qu.: 65.0  3rd Qu.: 86.5
    Max.   :48.00  Max.   :115.00  Max.   :114.0  Max.   :117.0
    NA's   :26.00  NA's   : 47.00  NA's   : 70.0  NA's   : 50.0

===== hclust =====

Before creating the actual cluster dendrogram, we have to calculate the distance matrix from our data frame. For this task we use the ''dist()'' function, which computes Euclidean distances by default:

  > dist_xrd <- dist(xrd[-1])

//(Note that the first column, which holds the sample labels, is intentionally left out with the ''xrd[-1]'' syntax, i.e. all columns but the first.)//

We are ready to create the dendrogram. The syntax is quite plain, even though the console output is not very informative. The cluster object is saved to another variable because we are going to plot it.

  > clust_xrd <- hclust(dist_xrd)
  > clust_xrd
  Call:
  hclust(d = dist_xrd)
  Cluster method   : complete
  Distance         : euclidean
  Number of objects: 96

And now plot it:

  > plot(clust_xrd)

You may want to add the right label to each leaf:

  > plot(clust_xrd, labels = xrd$Sample)

And here's the result:

{{ :archaeometry:cluster.png?500px |}}

Once you get acquainted with these functions, you can also get the plot with a single line. Here ''hang = -1'' makes all the labels hang down from the same baseline and ''cex = 0.7'' shrinks the label text:

  > plot(hclust(dist(xrd[-1])), xrd$Sample, hang = -1, cex = 0.7)
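
If you want to try the same steps without downloading the dataset first, the following is a minimal sketch on a tiny made-up table. The data frame ''toy'', its values and the names ''toy_dist'' and ''toy_clust'' are all invented for illustration and have nothing to do with the real XRD measurements. One value is deliberately set to ''NA'', since the real table contains many missing values: ''dist()'' leaves missing entries out of each pairwise computation and scales the remaining sum up accordingly.

```r
# Toy stand-in for the XRD table: one label column plus numeric columns.
# All values are invented for illustration only.
toy <- data.frame(
  Sample  = c("a1", "a2", "b1", "b2", "c1"),
  Quartz  = c(10, 11, 50, 52, 100),
  Calcite = c(20, 22, 60, NA, 110)
)

# Drop the label column before computing the (Euclidean) distance matrix.
# Missing values are skipped pair by pair rather than causing an error.
toy_dist <- dist(toy[-1])

# Hierarchical clustering; "complete" linkage is the default method.
toy_clust <- hclust(toy_dist)

# Leaf order of the dendrogram, expressed as sample labels.
print(toy$Sample[toy_clust$order])

# Draw the dendrogram with the sample names on the leaves.
plot(toy_clust, labels = toy$Sample, hang = -1, cex = 0.7)
```

The structure is the same as in the session above, only with five rows instead of 96, so it is easier to compare the printed leaf order with the tree that gets drawn.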