====== Contingency Tables ======

A contingency table is obtained by crossing two qualitative (nominal) variables, typically artifact type and archaeological assemblage.

Let's assume we have a data.frame object named ''MyData''. Each row corresponds to an artifact. The first column contains a label, the second the assemblage and the last the artifact type:

^ label ^ assemblage ^ type ^
| CLXIV-001 | CL XIV 5+6 | t |
| CLXIV-002 | CL XIV 3+4 | plh |
| ... | ... | ... |

**''R''** provides two functions for computing a contingency table, here assemblage against type:

  MyCrossTable <- table(MyData[,2], MyData[,3])

An alternative:

  MyCrossTable2 <- xtabs(~ ., MyData[, c(2,3)])

The latter can be used as an argument to the ''corresp()'' function from the MASS package, which computes correspondence analysis, a factorial data reduction method suitable for contingency tables.

==== Frequency Tables ====

The ''prop.table()'' function calculates the table of relative frequencies (proportions between 0 and 1; multiply by 100 to get percentages). Its first argument is an object of class table. The second is the margin: 1 for rows, 2 for columns.

  MyCrosFreq <- prop.table(MyCrossTable, 1)

Here is an example, in which each row sums to 1:

       TYPE_REC_
  PHASE     type 1    type 2    type 3    type 4
      1 0.29242348 0.2506487 0.1904153 0.2665125
      2 0.44952914 0.2752219 0.1045417 0.1707073
      3 0.00000000 0.5199755 0.1823170 0.2977075
      4 0.13439854 0.5759938 0.1875329 0.1020748
      5 0.07930212 0.1942093 0.1844235 0.5420651
      6 0.14209591 0.2609925 0.1652278 0.4316838

==== Plotting ====

Now we can plot our frequency table. A graphical representation gives a feel for the trends, even when there are many artifact types and assemblages. A common and popular way to represent a frequency table is Ford's Battleship diagram. It is derived from the barplot.
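
Before moving on, the steps described so far can be tried on a few made-up rows. This is only a minimal sketch: the data frame ''MyToyData'', its values, and the names ''ToyCross'' and ''ToyFreq'' are invented for illustration and do not come from any real assemblage. It builds the contingency table, converts it to row proportions, and draws a plain stacked barplot as a first look.

```r
# Toy data (hypothetical): one row per artifact, laid out as
# label / assemblage / type.
MyToyData <- data.frame(
  label      = sprintf("A-%03d", 1:8),
  assemblage = c("S1", "S1", "S1", "S2", "S2", "S2", "S2", "S3"),
  type       = c("t", "plh", "t", "t", "plh", "plh", "plh", "t"),
  stringsAsFactors = TRUE
)

# Contingency table: assemblages in rows, types in columns.
ToyCross <- table(MyToyData$assemblage, MyToyData$type)

# Row-wise relative frequencies (margin = 1): each row sums to 1.
ToyFreq <- prop.table(ToyCross, 1)
print(round(ToyFreq, 2))

# Quick look with a stacked barplot, one bar per assemblage.
# barplot() draws one bar per column, hence the transpose.
barplot(t(ToyFreq), legend.text = TRUE, ylab = "Proportion")
```

A row-proportion table of this form is the kind of input expected by Ford's battleship diagram.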

Here is code that implements Ford's diagram (Jammet-Reynal, 2006):

  ford <- function(x, cex.row.labels = 1) {
    #################################################
    ##  FORD'S "BATTLESHIP" DIAGRAM                ##
    ##  Loic JAMMET-REYNAL, May 2006               ##
    ##  Departement d'Anthropologie et d'Ecologie  ##
    ##  University of Geneva                       ##
    ##  jammetr1[at]etu.unige.ch                   ##
    #################################################
    jmax <- ncol(x)  # number of columns (index j)
    imax <- nrow(x)  # number of rows (index i)
  
    # Set up an empty coordinate system without axes
    set.up <- function(xlim, ylim) {
      plot(xlim, ylim,
           type = "n",  # no plotting
           axes = FALSE, asp = NA,
           xlab = "", ylab = "")
    }
  
    # Initialise the device: one panel per column of x,
    # plus a first panel for the row labels
    op <- par(mfrow = c(1, jmax + 1), mar = c(5, 0, 2, 0))
    on.exit(par(op))  # restore the previous graphical parameters
  
    # Row labels (first panel)
    set.up(c(0, 1), c(0.9, imax + 1.10))
    for (i in 1:imax) {
      text(0.5, i + 0.5, rownames(x)[i],
           font = 2,  # boldface
           cex = cex.row.labels)
    }
  
    for (j in 1:jmax) {  # columns j
      set.up(xlim = c(-60, 60) * max(x),
             ylim = c(0.9, imax + 1.10))
      title(sub = colnames(x)[j],
            font.sub = 2,  # boldface
            cex.sub = 1.5)
      for (i in 1:imax) {  # rows i
        # The key step: a box centred on 0 whose width
        # is proportional to the frequency x[i, j]
        X <- c(-50, +50, +50, -50, -50) * x[i, j]
        Y <- c(i, i, i + 1, i + 1, i)
        polygon(X, Y, xpd = FALSE, col = "black")
      }
    }
  }

You first have to run the above code. A new function called ''ford()'' will then be available. Its argument is a frequency table.

In order to represent a chronological hypothesis, you have to rearrange the order of rows and columns. One way to do this is to give two vectors of indices (rows first, then columns) between brackets right after the frequency table object:

  ford(MyCrosFreq[c(1,2,4,3), c(2,4,5,3,1,6)])

You can obtain an optimal ordering by using [[seriation]] techniques.

This is an example of the output:

{{ford.png?600|Ford diagram}}

==== Reference ====

**Jammet-Reynal, L. (2006).-** //La céramique de Clairvaux VII (Jura, France) : typologie, étude quantitative et sériation.// Genève : Département d'anthropologie et d'écologie de l'Université.
Unpublished Master's thesis.