# Quantitative Archaeology Wiki

# Tests of association

## Chi-squared test

The chi-squared test is a common test of association between nominal or ordinal variables. More powerful tests exist (see below), but chi-squared is by far the simplest.

The test statistic is calculated as: $\chi^2 = \sum \frac{(O - E)^2}{E}$

where O is the observed count and E the expected count in each cell of the table. See the book for a detailed explanation.
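For a contingency table, the expected counts are usually derived from the row and column totals under the hypothesis of no association. The block below is a sketch in the usual textbook notation; the symbols $R_i$ (row total), $C_j$ (column total), $n$ (grand total), $r$ and $c$ (number of rows and columns) are introduced here for illustration and are not taken from the book:

$$
E_{ij} = \frac{R_i \, C_j}{n}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
$$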

### Contingency table

Produce a contingency table of `Mat` by `Period`, where `Period` is a new three-category variable derived from `Date`, then calculate chi-squared.

Creating the `Period` variable from `Date` has already been covered in “Transforming variables”:

```
> Period <- Date
> Period[(Date>650)&(Date<=1200)] <- 1
> Period[(Date>100)&(Date<=650)] <- 2
> Period[(Date<=100)] <- 3
```
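The same recoding can be sketched in a single step with base R's `cut()`. The `Date` values below are made up for illustration and are not the Digging Numbers data. One difference from the code above: dates later than 1200 become `NA` here instead of keeping their original value.

```r
# Hypothetical dates, for illustration only
Date <- c(50, 100, 101, 650, 651, 1200)

# cut() uses right-closed intervals by default:
# (-Inf,100] -> 3, (100,650] -> 2, (650,1200] -> 1
Period <- cut(Date,
              breaks = c(-Inf, 100, 650, 1200),
              labels = c("3", "2", "1"))

table(Period)
```

The result is already a factor, so no separate call to `factor()` is needed; its levels are stored in the order given in `labels`.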

You should tell R that these values are categories rather than numbers by converting them to factors. Notice the difference in the `summary()` output:

```
> summary(Period)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    1.00    1.50    1.55    2.00    3.00
> Period <- factor(Period)
> Mat <- factor(Mat)
> summary(Period)
 1  2  3
20 18  2
```

This has no real effect on the following operations, but it helps you keep a clean working environment.

```
> table(Mat,Period)
   Period
Mat  1  2  3
  1 20  0  0
  2  0 18  2
```

See http://finzi.psych.upenn.edu/R/Rhelp02a/archive/2847.html for another method using `xtabs()`.
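As a rough illustration of the `xtabs()` route, the sketch below rebuilds two stand-in vectors whose counts match the table printed above (they are reconstructed from those counts, not loaded from the original data set) and checks that both functions produce the same cross-tabulation.

```r
# Stand-in data reconstructed from the printed counts (an assumption,
# not the original Digging Numbers data)
Mat    <- factor(rep(c(1, 2), times = c(20, 20)))
Period <- factor(rep(c(1, 2, 3), times = c(20, 18, 2)))

by_table <- table(Mat, Period)
by_xtabs <- xtabs(~ Mat + Period)   # formula interface

# Compare the two results cell by cell
all(by_table == by_xtabs)
```

The formula interface is mostly a matter of taste here; it becomes more convenient when the variables live in a data frame passed through the `data` argument.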

### Chi-squared test

We are now ready to perform the Chi-squared test:

```
> crosstab <- table(Mat,Period)
> # xtabs(~ Mat + Period) is a formula-based alternative to table()
> chisq.test(crosstab)

	Pearson's Chi-squared test

data:  crosstab
X-squared = 40, df = 2, p-value = 2.061e-09

Warning message:
In chisq.test(crosstab) :
  Chi-squared approximation may be incorrect
```

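This result is correct, but it differs in two ways from what you would get doing all the operations by hand:

• the `p-value` is not read from a printed table of critical values. R computes it directly and reports it as a floating point number in scientific notation. Here it is very small, so the association is highly significant.
• R warns that the chi-squared approximation may be incorrect. This warning appears when some expected counts are small (conventionally below 5), which is the case here because `Period` 3 contains only two observations.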
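The by-hand calculation can be sketched in R as follows. The counts are copied from the `Mat` by `Period` table printed above; the variable names (`observed`, `expected`, `chi_sq`, `p_value`) are chosen for this example only.

```r
# Counts copied from the Mat by Period table on this page
observed <- matrix(c(20,  0, 0,
                      0, 18, 2),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Mat = c("1", "2"),
                                   Period = c("1", "2", "3")))

# Expected count = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-squared statistic, degrees of freedom and upper-tail p-value
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

chi_sq     # 40
df         # 2
p_value    # about 2.061e-09
expected   # the Period 3 column holds expected counts of 1
```

The `expected` matrix shows where the warning comes from: both cells in the `Period` 3 column have an expected count of 1. The same matrix is available from the test object itself as `chisq.test(crosstab)$expected`.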

## Other tests

The other tests of association mentioned in Digging Numbers seem to be less widely used, which is probably why they are not part of the standard R distribution.

### Cramer's V

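Cramér's V is reported by `assocstats()` in the contributed package `vcd`. Note that the contributed package `cramer` implements the Cramér two-sample test, which is a different procedure.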
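Cramér's V can also be derived directly from the chi-squared statistic as $V = \sqrt{\chi^2 / (n (k - 1))}$, where $n$ is the total count and $k$ the smaller of the number of rows and columns. The helper below is a minimal sketch under that definition; the name `cramers_v` is hypothetical and does not come from any package.

```r
# Hypothetical helper; the name cramers_v is not from any package
cramers_v <- function(tab) {
  # Uncorrected chi-squared statistic; the small-count warning is silenced
  chi_sq <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
  n <- sum(tab)
  k <- min(nrow(tab), ncol(tab))
  unname(sqrt(chi_sq / (n * (k - 1))))
}

# Counts copied from the Mat by Period table on this page
crosstab <- matrix(c(20,  0, 0,
                      0, 18, 2), nrow = 2, byrow = TRUE)
cramers_v(crosstab)
```

For this table the statistic equals the total count (40) and `k - 1` is 1, so V reaches its maximum of 1, reflecting the perfect separation of the two materials across periods.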

### Kendall's tau-b

This test is included in the contributed package `Kendall`. See

### Kendall's tau-c

Kendall's tau-c is not included in any package, but it can be defined as a custom function. See https://stat.ethz.ch/pipermail/r-help/2006-September/112806.html 
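A minimal sketch of such a custom function is shown below. It is not the code from the linked post. It assumes the common definition of Stuart's tau-c, $\tau_c = 2m(P - Q) / (n^2 (m - 1))$, where $P$ and $Q$ are the numbers of concordant and discordant pairs, $n$ is the total count and $m$ the smaller of the number of rows and columns. The function name `kendall_tau_c` and the example counts are hypothetical, and both variables are assumed to be ordinal with categories in increasing order.

```r
# Hypothetical helper, assuming the definition of tau-c given above
kendall_tau_c <- function(tab) {
  tab <- as.matrix(tab)
  r <- nrow(tab)
  k <- ncol(tab)
  concordant <- 0
  discordant <- 0
  for (i in seq_len(r)) {
    for (j in seq_len(k)) {
      below <- seq_len(r) > i
      # Pairs with a cell lower and to the right are concordant,
      # pairs with a cell lower and to the left are discordant
      concordant <- concordant + tab[i, j] * sum(tab[below, seq_len(k) > j])
      discordant <- discordant + tab[i, j] * sum(tab[below, seq_len(k) < j])
    }
  }
  n <- sum(tab)
  m <- min(r, k)
  2 * m * (concordant - discordant) / (n^2 * (m - 1))
}

# Made-up 3 x 3 table of two ordinal variables, for illustration only
ordinal_tab <- matrix(c(10, 5,  2,
                         3, 8,  4,
                         1, 4, 12), nrow = 3, byrow = TRUE)
kendall_tau_c(ordinal_tab)
```

The double loop favours readability over speed, which is adequate for the small tables typical of this kind of analysis.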