
====== Tests of association ======

===== Chi-squared test =====

The chi-squared test is a common test of association between two nominal or ordinal variables. More powerful tests exist (see below), but chi-squared is by far the simplest.
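As a minimal sketch of what the test measures, the statistic can be computed by hand from a table of counts: each cell's expected count under independence is its row total times its column total, divided by the grand total, and the statistic sums (O - E)^2 / E over all cells. The table ''obs'' below is made-up illustration data, not the //Digging Numbers// data set; ''chisq.test()'' and ''pchisq()'' are standard functions from R's ''stats'' package.

```r
# Hypothetical 2 x 3 table of counts (illustration only, not the book's data)
obs <- matrix(c(12,  8, 10,
                 6, 14, 10),
              nrow = 2, byrow = TRUE,
              dimnames = list(Group = c("a", "b"),
                              Category = c("x", "y", "z")))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Chi-squared statistic: sum over all cells of (O - E)^2 / E
chi2 <- sum((obs - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(obs) - 1) * (ncol(obs) - 1)

# Upper-tail probability of the chi-squared distribution
p <- pchisq(chi2, df = df, lower.tail = FALSE)

# The built-in test should agree with the hand computation
res <- chisq.test(obs)
print(c(by_hand = chi2, built_in = unname(res$statistic)))
print(c(by_hand = p, built_in = res$p.value))
```

All expected counts in this made-up table are at least 9, so ''chisq.test()'' runs without warnings here.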

The test statistic is calculated as:

<m>chi^2 = sum{}{}{{(O - E)^2}/E}</m>

where **O** denotes the //observed values// and **E** the //expected values//. See the book for a detailed explanation.

==== Contingency table ====

Produce a contingency table of ''Mat'' by ''Period'', a new variable derived from ''Date'' with three categories, and calculate chi-squared.

Creating the ''Period'' variable from ''Date'' has already been covered in “[[Transforming variables]]”:

<code C>
> Period <- Date
> Period[(Date>650)&(Date<=1200)] <- 1
> Period[(Date>100)&(Date<=650)] <- 2
> Period[(Date<=100)] <- 3
</code>

You should tell R that these values are categories, not numbers. Notice the difference:

<code C>
> summary(Period)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    1.00    1.50    1.55    2.00    3.00
> Period <- factor(Period)
> Mat <- factor(Mat)
> summary(Period)
 1  2  3
20 18  2
</code>

This has no real effect on the following operations, but it helps you keep a clean working environment.

<code C>
> table(Mat,Period)
   Period
Mat  1  2  3
  1 20  0  0
  2  0 18  2
</code>

See http://finzi.psych.upenn.edu/R/Rhelp02a/archive/2847.html for another method using ''xtabs()''.

==== Chi-squared test ====

We are now ready to perform the chi-squared test:

<code C>
> crosstab <- table(Mat,Period)
> # xtabs(~ Mat + Period) is similar to table(), but uses a formula interface
> chisq.test(crosstab)

	Pearson's Chi-squared test

data:  crosstab
X-squared = 40, df = 2, p-value = 2.061e-09

Warning message:
In chisq.test(crosstab) :
  Chi-squared approximation may be incorrect
</code>

This result is fine, but it differs in two ways from what you would get by doing all the operations by hand:

  * the //''p-value''// is not looked up in a printed table of critical values but computed directly, so it appears as a floating-point number in scientific notation. It is very low in any case.
  * there is a warning that the chi-squared approximation may be incorrect. R issues it when any expected count is below 5; here the expected counts for period 3 are only 1.

===== Other tests =====

Other tests of association mentioned in //Digging Numbers// do not seem to be as widely used, which is probably why they are not part of the standard R distribution.

==== Cramer's V ====

Despite its name, the ''[[http://cran.r-project.org/web/packages/cramer/index.html|cramer]]'' contributed package implements the Cramér two-sample test rather than Cramér's V. The ''assocstats()'' function in the contributed package ''vcd'' reports Cramér's V.

==== Guttman's lambda ====

==== Kendall's tau-b ====

This test is included in the contributed package ''[[http://cran.r-project.org/web/packages/Kendall/index.html|Kendall]]''. See

==== Kendall's tau-c ====

Kendall's tau-c is not included in any package, but it can be defined as a custom function. See https://stat.ethz.ch/pipermail/r-help/2006-September/112806.html

----

[[Start]] · [[Data description]] · [[Transforming variables]] · [[Tables]] · [[Pictorial displays]] · [[Measures of position and variability]] · [[Sampling]] · [[Tests of difference]] · [[Tests of distribution]] · [[Correlation]] · **Tests of association**
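As a supplement to the Cramér's V section above, the measure can also be sketched directly in base R, since it is a rescaling of the chi-squared statistic: V = sqrt(chi-squared / (n * (k - 1))), where n is the total count and k is the smaller of the number of rows and columns. The helper ''cramers_v()'' below is a hypothetical name introduced for illustration and is not part of any package. The counts are retyped from the ''Mat'' by ''Period'' table shown on this page.

```r
# Counts retyped from the Mat by Period table shown on this page
crosstab <- matrix(c(20,  0, 0,
                      0, 18, 2),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Mat = c("1", "2"),
                                   Period = c("1", "2", "3")))

# Hypothetical helper (not from any package): Cramér's V from a table of counts
cramers_v <- function(tab) {
  # correct = FALSE keeps the plain Pearson statistic, also for 2 x 2 tables;
  # the small-expected-count warning is suppressed because only the statistic is used
  chi2 <- unname(suppressWarnings(chisq.test(tab, correct = FALSE))$statistic)
  n <- sum(tab)                       # total number of observations
  k <- min(nrow(tab), ncol(tab))      # smaller table dimension
  sqrt(chi2 / (n * (k - 1)))
}

v <- cramers_v(crosstab)
print(v)
```

For this table the statistic is 40 with n = 40 and k = 2, so V equals 1: every period falls into exactly one ''Mat'' category.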

diggingnumbers/tests_of_association.txt · Last modified: 2018/08/04 00:01 (external edit)