# Quantitative Archaeology Wiki

### Site Tools

contingency_tables

# Contingency Tables

A contingency is obtained when by crossing two qualitatives nominal variables, typically artifacts type and archaeological assemblage.

Let's assume we have a data.frame object named `MyData`. Each line corresponds to an artifact. The first column contains a label, the second the assemblage and the last the artifact type :

```label		assemblage	type
CLXIV-001	CL XIV 5+6	t
CLXIV-002	CL XIV 3+4	plh
...		...		... ```

`R` provides two functions for computing contingency table, here assemblage against type:

`MyCrossTable <- table(MyData[,2], MyData[,3])`

an alternative :

`MyCrossTable2 <- xtabs(~., NMB_2006[,c(1,6)] )`

The latter can be used as argument to the `corresp()` function from the MASS package which computes correspondence analysis, a factorial data reduction method suitable for contingency tables.

### Frequency Tables

The `prop.table()` function calculates the frequency table (percentages). Its first argument is an objet of class table. The second is the margin : 1 for row, 2 for columns.

`MyCrosFreq <- prop.table(MyCrossTable, 1)`

Here is an example :

```     TYPE_REC_
PHASE     type 1    type 2    type 3    type 4
1 0.29242348 0.2506487 0.1904153 0.2665125
2 0.44952914 0.2752219 0.1045417 0.1707073
3 0.00000000 0.5199755 0.1823170 0.2977075
4 0.13439854 0.5759938 0.1875329 0.1020748
5 0.07930212 0.1942093 0.1844235 0.5420651
6 0.14209591 0.2609925 0.1652278 0.4316838```

### Plotting

Now we can plot our frequency table. A graphical representation allows us to have a feel of the trends, even if there are many artifacts types and assemblages.

A common and popular way to represent a frequency table is Ford's Battleship diagram. Is is derived from the barplot.

Here is a code that implements it (Jammet-Reynal, 2006):

```ford <- function(x, cex.row.labels=1) {
#################################################
##  FORD'S "BATTLESHIP" DIAGRAM                ##
##  Loic JAMMET-REYNAL, may 2006               ##
##  Departement d'Anthropologie et d'Ecologie  ##
##  University of Geneva                       ##
##  jammetr1[at]etu.unige.ch                   ##
#################################################

dim(x)[2] -> jmax # colonnes j
dim(x)[1] -> imax # lignes i

set.up <- function(xlim, ylim) {
# setting up coord. system
plot(    xlim,    # x
ylim,     # y
type="n", # no plotting
axes = FALSE,
asp = NA,
xlab = "",
ylab = "")
}

## initialisation du device
## on divise par le nombre de colonnes + 1
## 1ere colonne : labels
op <- par(mfrow=c(1, jmax+1), mar=c(5,0,2,0))

# labels des lignes (colonne 1)
set.up(c(0,1),             # x
c(0.9, imax+1.10) ) # y

for (i in 1:imax) {
text(0.5,
i+0.5,
row.names(x)[i],
font = 2, # boldface
cex = cex.row.labels)
}

for (j in 1:jmax) { # colonnes j
set.up(xlim = c(-60,60)*max(x),   # x
ylim = c(0.9, imax+1.10) ) # y

title(sub=colnames(x)[j],
font.sub=2, # boldface
cex.sub = 1.5)

for (i in 1: imax) { # lignes i
# le plus important. boite multipliee
# par les parametres
X <- c(-50,+50,+50,-50,-50)*x[i,j]
Y <- c(i,i,i+1,i+1,i)
polygon(X,
Y,
xpd=FALSE,
col="black",
mar=c(0,0,0,0) )
}
}
}```

You first have to run the above code. A new function called `ford()` will be available. Its argument is a frequency table.

In order to represent a chronological hypothesis, you have to rearrange the order of rows and columns. A way to do it is giving two a vector of indices between brackets right after the frequency table object:

`ford(MyCrosFreq[c(1,2,4,3), c(2,4,5,3,1,6)])`

You can obtain optimal ordering by use of seriation techniques.

This is an output example :

### Reference

Jammet-Reynal, L. (2006).- La céramique de Clairvaux VII (Jura, France) : typologie, étude quantitative et sériation. Genève : Département d'anthropologie et d'écologie de l'Université. Unpublished Master thesis.

contingency_tables.txt · Last modified: 2012/11/30 09:59 (external edit)