contingency_tables

contingency_tables [2009/09/07 19:48]
steko link to seriation in the proper place
====== Contingency Tables ======

A contingency table is obtained by crossing two qualitative (nominal) variables, typically artifact type and archaeological assemblage.

Let's assume we have a data.frame object named ''MyData''. Each row corresponds to an artifact. The first column contains a label, the second the assemblage and the last the artifact type:

<file>
label      assemblage  type
CLXIV-001  CL XIV 5+6  t
CLXIV-002  CL XIV 3+4  plh
...        ...         ...
</file>
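
As a minimal sketch, a data frame with this layout could be built by hand as shown here. Only the first two rows come from the listing above; the other two rows and the column names are hypothetical and serve only as an illustration.

<code rsplus>
# Toy data with the three-column layout described above.
# Rows 3 and 4 are made up for illustration.
MyData <- data.frame(
  label      = c("CLXIV-001", "CLXIV-002", "CLXIV-003", "CLXIV-004"),
  assemblage = c("CL XIV 5+6", "CL XIV 3+4", "CL XIV 3+4", "CL XIV 5+6"),
  type       = c("t", "plh", "t", "t"),
  stringsAsFactors = TRUE  # store assemblage and type as factors
)

# Inspect the structure: one row per artifact, three columns
str(MyData)
</code>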
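
**''R''** provides two functions for computing a contingency table; here, assemblage is crossed with type:

<code rsplus>
# Cross column 2 (assemblage) with column 3 (type)
MyCrossTable <- table(MyData[, 2], MyData[, 3])
</code>

An alternative:
<code rsplus>
# Formula interface: "~ ." crosses all columns of the data frame passed as data.
# Here NMB_2006 is a data frame whose columns 1 and 6 hold the two variables.
MyCrossTable2 <- xtabs(~ ., data = NMB_2006[, c(1, 6)])
</code>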
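
The following sketch checks, on a small hypothetical data frame, that both functions produce the same counts. The data values and column names are assumptions made for illustration only.

<code rsplus>
# Hypothetical toy data: label, assemblage, type
MyData <- data.frame(
  label      = c("CLXIV-001", "CLXIV-002", "CLXIV-003", "CLXIV-004"),
  assemblage = c("CL XIV 5+6", "CL XIV 3+4", "CL XIV 3+4", "CL XIV 5+6"),
  type       = c("t", "plh", "t", "t"),
  stringsAsFactors = TRUE
)

# Same cross-tabulation, computed in two ways
MyCrossTable  <- table(MyData[, 2], MyData[, 3])
MyCrossTable2 <- xtabs(~ ., data = MyData[, c(2, 3)])

# The counts agree cell by cell
all(as.vector(MyCrossTable) == as.vector(MyCrossTable2))

# xtabs() additionally keeps the variable names as dimension names
names(dimnames(MyCrossTable2))
</code>

The formula interface is convenient when the variables are selected by name, since the resulting table carries those names along.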

The latter can be passed as an argument to the ''corresp()'' function from the MASS package, which computes a correspondence analysis, a factorial data reduction method suitable for contingency tables.
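
A minimal sketch of such a call, assuming the MASS package is installed and using made-up counts for three hypothetical assemblages and three hypothetical types:

<code rsplus>
library(MASS)  # provides corresp(); assumed to be installed

# Hypothetical data: 18 artifacts, 3 assemblages, 3 types
MyData <- data.frame(
  assemblage = rep(c("A", "B", "C"), each = 6),
  type = c("t1", "t1", "t1", "t1", "t2", "t2",
           "t1", "t2", "t2", "t2", "t2", "t3",
           "t2", "t2", "t3", "t3", "t3", "t3"),
  stringsAsFactors = TRUE
)

# Contingency table built with the formula interface
MyCrossTable2 <- xtabs(~ ., data = MyData)

# Correspondence analysis, keeping only the first axis
MyCA <- corresp(MyCrossTable2, nf = 1)

MyCA$rscore  # scores of the assemblages (rows) on the first axis
MyCA$cscore  # scores of the types (columns) on the first axis
</code>

The row and column scores place assemblages and types along a common axis, which is one way of looking for a gradient in the table.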

==== Frequency Tables ====

The ''prop.table()'' function computes the table of relative frequencies (proportions). Its first argument is an object of class ''table''. The second is the margin: 1 for rows, 2 for columns.

<code rsplus>
# Margin 1: each row (assemblage) sums to 1
MyCrosFreq <- prop.table(MyCrossTable, 1)
</code>

Here is an example:
<file>
     TYPE_REC_
PHASE     type 1    type 2    type 3    type 4
    1 0.29242348 0.2506487 0.1904153 0.2665125
    2 0.44952914 0.2752219 0.1045417 0.1707073
    3 0.00000000 0.5199755 0.1823170 0.2977075
    4 0.13439854 0.5759938 0.1875329 0.1020748
    5 0.07930212 0.1942093 0.1844235 0.5420651
    6 0.14209591 0.2609925 0.1652278 0.4316838
</file>
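
As a quick check of what the margin argument does, the sketch below builds a small table from hypothetical data and verifies that, with margin 1, every row sums to 1. Multiplying by 100 gives percentages.

<code rsplus>
# Hypothetical toy data: label, assemblage, type
MyData <- data.frame(
  label      = c("CLXIV-001", "CLXIV-002", "CLXIV-003", "CLXIV-004"),
  assemblage = c("CL XIV 5+6", "CL XIV 3+4", "CL XIV 3+4", "CL XIV 5+6"),
  type       = c("t", "plh", "t", "t"),
  stringsAsFactors = TRUE
)

MyCrossTable <- table(MyData[, 2], MyData[, 3])

# Row-wise proportions: each assemblage is divided by its own total
MyCrosFreq <- prop.table(MyCrossTable, 1)

rowSums(MyCrosFreq)         # every row sums to 1
round(100 * MyCrosFreq, 1)  # the same table expressed as percentages
</code>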
==== Plotting ====

Now we can plot our frequency table. A graphical representation gives a feel for the trends, even when there are many artifact types and assemblages.

A common and popular way to represent a frequency table is Ford's Battleship diagram. It is derived from the bar plot.

Here is code that implements it (Jammet-Reynal, 2006):
<code rsplus>
ford <- function(x, cex.row.labels = 1) {
#################################################
##  FORD'S "BATTLESHIP" DIAGRAM                ##
##  Loic JAMMET-REYNAL, May 2006               ##
##  Departement d'Anthropologie et d'Ecologie  ##
##  University of Geneva                       ##
##  jammetr1[at]etu.unige.ch                   ##
#################################################

    jmax <- dim(x)[2] # number of columns (j)
    imax <- dim(x)[1] # number of rows (i)

    set.up <- function(xlim, ylim) {
        # set up an empty coordinate system
        plot(xlim,        # x range
             ylim,        # y range
             type = "n",  # draw nothing
             axes = FALSE,
             asp = NA,
             xlab = "",
             ylab = "")
    }

    ## set up the device: one panel per column of x,
    ## plus a first panel for the row labels
    op <- par(mfrow = c(1, jmax + 1), mar = c(5, 0, 2, 0))

    # row labels (panel 1)
    set.up(c(0, 1),             # x
           c(0.9, imax + 1.10)) # y

    for (i in 1:imax) {
        text(0.5,
             i + 0.5,
             row.names(x)[i],
             font = 2, # boldface
             cex = cex.row.labels)
    }

    for (j in 1:jmax) { # columns j
        set.up(xlim = c(-60, 60) * max(x), # x
               ylim = c(0.9, imax + 1.10)) # y

        title(sub = colnames(x)[j],
              font.sub = 2, # boldface
              cex.sub = 1.5)

        for (i in 1:imax) { # rows i
            # the key step: a box whose width is
            # proportional to the frequency x[i, j]
            X <- c(-50, +50, +50, -50, -50) * x[i, j]
            Y <- c(i, i, i + 1, i + 1, i)
            polygon(X,
                    Y,
                    xpd = FALSE,
                    col = "black")
        }
    }

    par(op) # restore the previous graphical parameters
}
</code>

You first have to run the above code. A new function called ''ford()'' will then be available. Its argument is a frequency table.

In order to represent a chronological hypothesis, you have to rearrange the order of rows and columns. One way to do this is to give two vectors of indices (rows first, then columns) in brackets right after the frequency table object:
<code rsplus>
ford(MyCrosFreq[c(1, 2, 4, 3), c(2, 4, 5, 3, 1, 6)])
</code>
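
To illustrate how such index vectors work, here is a small sketch on a hypothetical 3 x 3 frequency table. The values, the dimension names and the variable names ''row.order'' and ''col.order'' are made up for the example.

<code rsplus>
# Hypothetical frequency table: rows are phases, columns are types
freq <- matrix(c(0.6, 0.3, 0.1,
                 0.1, 0.2, 0.7,
                 0.3, 0.5, 0.2),
               nrow = 3, byrow = TRUE,
               dimnames = list(PHASE = c("1", "2", "3"),
                               TYPE  = c("a", "b", "c")))

# Proposed order: phase 1, then phase 3, then phase 2; columns unchanged
row.order <- c(1, 3, 2)
col.order <- c(1, 2, 3)

reordered <- freq[row.order, col.order]

rownames(reordered)  # labels follow their rows
rowSums(reordered)   # proportions are untouched, only the order changes

# Once ford() has been defined, the rearranged table can be drawn with:
# ford(reordered)
</code>

Indexing only permutes the rows and columns, so several candidate orderings can be compared without recomputing the frequencies.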

You can obtain an optimal ordering using [[seriation]] techniques.

Here is an example of the output:

{{ford.png?600|Ford diagram}}

==== Reference ====
**Jammet-Reynal, L. (2006).** //La céramique de Clairvaux VII (Jura, France) : typologie, étude quantitative et sériation.// Genève: Département d'anthropologie et d'écologie de l'Université. Unpublished Master's thesis.
contingency_tables.txt · Last modified: 2018/08/04 00:01 (external edit)