Tests of association

Chi-squared test

The chi-squared test is a common test of association between nominal or ordinal variables. More powerful tests exist (see below), but chi-squared is by far the simplest.

The test statistic is calculated as: $\chi^2 = \sum \frac{(O - E)^2}{E}$

Where O means Observed values and E means Expected values. See the book for a detailed explanation.
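
As a minimal sketch of the formula, the statistic can be computed by hand in R. The counts are those of the Mat by Period table that appears later on this page; the names observed, expected, chi_sq and df are chosen here for illustration only.

```r
# Observed counts: rows are Mat (1, 2), columns are Period (1, 2, 3)
observed <- matrix(c(20,  0, 0,
                      0, 18, 2),
                   nrow = 2, byrow = TRUE)

# Expected count of each cell = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Sum of (O - E)^2 / E over all cells
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom = (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

chi_sq   # 40
df       # 2
```

Both values agree with what chisq.test() reports for the same table.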

Contingency table

Produce a contingency table of Mat by Period, where Period is a new three-category variable derived from Date, and calculate chi-squared.

Creating the Period variable from Date has already been covered in “Transforming variables”:

> Period <- Date
> Period[(Date>650)&(Date<=1200)] <- 1
> Period[(Date>100)&(Date<=650)] <- 2
> Period[(Date<=100)] <- 3
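
The same recoding could probably also be done in one step with cut(). This is only a sketch: the Date values below are hypothetical, made up to show the interval boundaries, and are not the book's data.

```r
# Hypothetical dates, for illustration only
Date <- c(50, 100, 101, 650, 651, 1200)

# Intervals are closed on the right by default:
# (-Inf,100] -> 3, (100,650] -> 2, (650,1200] -> 1
Period <- cut(Date,
              breaks = c(-Inf, 100, 650, 1200),
              labels = c("3", "2", "1"))

table(Period)
```

Two differences from the step-by-step version are worth noting: cut() returns a factor directly (with levels in the order 3, 2, 1), and any Date above 1200 becomes NA instead of keeping its original value.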

You should probably tell R these values aren't numbers but categories. Notice the difference:

> summary(Period)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    1.00    1.50    1.55    2.00    3.00 
> Period <- factor(Period)
> Mat <- factor(Mat)
> summary(Period)
 1  2  3 
20 18  2 

This has no real effect on the following operations, but it helps you keep a clean working environment.

> table(Mat,Period)
   Period
Mat  1  2  3
  1 20  0  0
  2  0 18  2

Another method for building the same table uses xtabs().
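
A minimal sketch of the xtabs() approach follows. The data frame finds and its five rows are hypothetical, used only to show the formula interface.

```r
# Hypothetical data frame, for illustration only
finds <- data.frame(Mat    = factor(c(1, 1, 2, 2, 2)),
                    Period = factor(c(1, 1, 2, 2, 3)))

# An empty left-hand side in the formula means "count the rows"
by_xtabs <- xtabs(~ Mat + Period, data = finds)

# The equivalent call with table()
by_table <- table(finds$Mat, finds$Period)

by_xtabs
```

Both objects hold the same counts; xtabs() is convenient when the variables live in a data frame.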

Chi-squared test

We are now ready to perform the Chi-squared test:

> crosstab <- table(Mat,Period)
> # xtabs(~ Mat + Period) is similar to table(), but takes a formula
> chisq.test(crosstab)
	Pearson's Chi-squared test
data:  crosstab
X-squared = 40, df = 2, p-value = 2.061e-09
Warning message:
In chisq.test(crosstab) :
  Chi-squared approximation may be incorrect

This result is OK, but has some differences from the one you would get doing all the operations by hand:

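  • the p-value is not read from a table of critical values, as it would be by hand, but computed directly as a floating point number, shown here in scientific notation. It is very low, so the association is highly significant.
  • there's a warning that the chi-squared approximation may be incorrect. R issues it when any expected count is below 5; here both expected counts for Period 3 are 1.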
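
To see where the warning comes from, one can inspect the expected counts stored in the test result. The sketch below rebuilds the table from the counts shown above, so it runs without the original data; using Fisher's exact test as a cross-check is a suggestion added here, not something taken from the book.

```r
# Rebuild the Mat by Period table from the counts shown above
crosstab <- as.table(matrix(c(20,  0, 0,
                               0, 18, 2),
                            nrow = 2, byrow = TRUE,
                            dimnames = list(Mat = c("1", "2"),
                                            Period = c("1", "2", "3"))))

# suppressWarnings() only hides the warning discussed above
result <- suppressWarnings(chisq.test(crosstab))

result$expected            # the Period 3 column holds expected counts of 1
any(result$expected < 5)   # TRUE, which is what triggers the warning

# Fisher's exact test does not rely on the chi-squared approximation
fisher.test(crosstab)$p.value
```

With expected counts this small, an exact test is generally considered the safer choice; here it also returns a very low p-value.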

Other tests

Other tests of association mentioned in Digging Numbers do not seem to be as widely used, which is probably why they are not part of the standard R distribution.

Cramer's V

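Cramer's V is reported by assocstats() in the contributed package vcd. Note that the contributed package cramer implements a different procedure, the Cramer two-sample test.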
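
Cramer's V can also be derived from the chi-squared statistic with base R alone, as V = sqrt(chi-squared / (n * (k - 1))), where n is the total count and k is the smaller of the number of rows and columns. The function name cramers_v below is hypothetical, not part of any package.

```r
# cramers_v is a hypothetical helper name, not a function from any package
cramers_v <- function(tab) {
  # correct = FALSE so that 2 x 2 tables are not continuity-corrected
  chi_sq <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
  n <- sum(tab)              # total number of observations
  k <- min(dim(tab)) - 1     # smaller table dimension, minus one
  unname(sqrt(chi_sq / (n * k)))
}

# Counts of the Mat by Period table from this page
crosstab <- matrix(c(20,  0, 0,
                      0, 18, 2),
                   nrow = 2, byrow = TRUE)

cramers_v(crosstab)   # 1
```

A value of 1 is the maximum: in this table, knowing Period 1 versus the other periods determines Mat completely.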

Guttman's lambda

Kendall's tau-b

This test is included in the contributed package Kendall.

Kendall's tau-c

Kendall's tau-c is not part of the standard R distribution, but it can be defined as a custom function.
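
One possible custom function is sketched below. It assumes the usual definition tau-c = 2m(C - D) / (n^2 (m - 1)), where C and D are the numbers of concordant and discordant pairs, n is the total count and m is the smaller of the number of rows and columns. The name kendall_tau_c is hypothetical, and both variables are assumed to be ordinal with categories in ascending order.

```r
# kendall_tau_c is a hypothetical name; input is a two-way contingency table
kendall_tau_c <- function(tab) {
  tab <- as.matrix(tab)
  nr <- nrow(tab)
  nc <- ncol(tab)
  concordant <- 0
  discordant <- 0
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      # Cells below and to the right are concordant with cell (i, j)
      if (i < nr && j < nc)
        concordant <- concordant + tab[i, j] * sum(tab[(i + 1):nr, (j + 1):nc])
      # Cells below and to the left are discordant with cell (i, j)
      if (i < nr && j > 1)
        discordant <- discordant + tab[i, j] * sum(tab[(i + 1):nr, 1:(j - 1)])
    }
  }
  n <- sum(tab)        # total number of observations
  m <- min(nr, nc)     # smaller table dimension
  2 * m * (concordant - discordant) / (n^2 * (m - 1))
}

# Counts of the Mat by Period table from this page
crosstab <- matrix(c(20,  0, 0,
                      0, 18, 2),
                   nrow = 2, byrow = TRUE)

kendall_tau_c(crosstab)   # 1
```

The sign of the result depends on how the categories are coded, so it is worth checking that both factors are ordered the way you intend before interpreting it.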


diggingnumbers/tests_of_association.txt · Last modified: 2012/11/30 09:58 (external edit)