# Chi-Square Test: When and How to Use It
When you have two survey questions with predefined answer options and you want to know whether responses to one are related to responses to the other, the chi-square test is your tool. It is one of the oldest and most frequently used statistical tests, and its elegance lies in its simplicity: it compares what you observed with what you would expect if there were no relationship at all.
## What Is the Chi-Square Test?
The chi-square test (also written as the chi-squared test) is a nonparametric statistical test used to analyze categorical (nominal or ordinal) variables. Its fundamental logic is the comparison of observed frequencies with expected frequencies.
There are two main types.
### 1. Test of Independence (Test of Association)
This is the most common use. It tests whether there is a statistically significant association between two categorical variables.
Examples:
- Is there a relationship between gender and preference for online vs. in-person instruction?
- Is the choice of academic program associated with place of origin (urban vs. rural)?
- Is type of high school (academic vs. vocational) associated with the intention to enroll in university?
### 2. Goodness of Fit Test
This tests whether the distribution of a single categorical variable differs from some theoretical distribution.
Examples:
- Are students evenly distributed across years of study?
- Does the distribution of responses to a question match a hypothesized distribution (e.g., known population proportions)?
- Are responses on a Likert scale evenly distributed?
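The goodness-of-fit logic behind the first example can be sketched in a few lines. The counts below are hypothetical (invented purely for illustration), and the p-value uses a closed form that holds only when df = 2, so this is a minimal sketch rather than a general implementation.

```python
import math

# Hypothetical counts of students in years 1-3 (invented for illustration).
observed = [50, 40, 30]

n = sum(observed)
k = len(observed)

# Under H0 (even distribution) every category has the same expected count.
expected = [n / k] * k

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

# For df = 2 the chi-square survival function has the closed form exp(-x / 2).
# This shortcut does NOT hold for other degrees of freedom.
p_value = math.exp(-chi_square / 2)

print(f"chi-square({df}, N = {n}) = {chi_square:.2f}, p = {p_value:.3f}")
```

With these made-up counts the statistic is 5.00 with p of about .082, so an even distribution would not be rejected at the .05 level.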
For the remainder of this article, we will focus on the test of independence, as it is used far more frequently in research practice.
## The Contingency Table: Foundation of the Chi-Square Test
A contingency table (cross-tabulation) displays frequencies for each combination of categories across two variables.
Example: We are examining whether there is a relationship between gender (male, female) and preference for type of instruction (online, in-person).
### Observed Frequencies (O)
| | Online | In-person | Total |
|---|---|---|---|
| Male | 60 | 40 | 100 |
| Female | 45 | 55 | 100 |
| **Total** | **105** | **95** | **200** |
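If the data arrive as one record per respondent, the table above can be tallied with a counter. This sketch rebuilds respondent-level records from the published counts; the variable names are assumptions made for illustration.

```python
from collections import Counter

# Rebuild one (gender, preference) record per respondent from the table counts.
responses = (
    [("Male", "Online")] * 60
    + [("Male", "In-person")] * 40
    + [("Female", "Online")] * 45
    + [("Female", "In-person")] * 55
)

rows = ["Male", "Female"]
cols = ["Online", "In-person"]

# Count how often each combination of categories occurs.
counts = Counter(responses)
observed = [[counts[(r, c)] for c in cols] for r in rows]

# Marginal totals: sum across each row, each column, and the whole table.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

print(observed)     # [[60, 40], [45, 55]]
print(row_totals)   # [100, 100]
print(col_totals)   # [105, 95]
print(grand_total)  # 200
```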
### How Are Expected Frequencies (E) Calculated?
Expected frequencies show what we would expect if there were no relationship between the variables. They are calculated using the formula:
E = (row total × column total) / N, where N is the grand total
For the cell "Male, Online": E = (100 × 105) / 200 = 52.5
### Expected Frequencies (E)
| | Online | In-person | Total |
|---|---|---|---|
| Male | 52.5 | 47.5 | 100 |
| Female | 52.5 | 47.5 | 100 |
| **Total** | **105** | **95** | **200** |
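The expected-frequency formula can be applied to every cell at once. A minimal sketch using the observed counts from the example:

```python
# Rows: Male, Female. Columns: Online, In-person.
observed = [[60, 40], [45, 55]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# E = (row total * column total) / N for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

print(expected)  # [[52.5, 47.5], [52.5, 47.5]]
```

Both rows have identical expected values here only because the two gender groups happen to be the same size.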
## The Chi-Square Formula
chi-square = Σ (O - E)² / E, summed over all cells
For each cell, you calculate how much the observed frequency deviates from the expected, square that difference, divide by the expected frequency, and sum across all cells.
For our example:
- Cell (Male, Online): (60 - 52.5)² / 52.5 = 1.07
- Cell (Male, In-person): (40 - 47.5)² / 47.5 = 1.18
- Cell (Female, Online): (45 - 52.5)² / 52.5 = 1.07
- Cell (Female, In-person): (55 - 47.5)² / 47.5 = 1.18
chi-square = 1.07 + 1.18 + 1.07 + 1.18 = 4.50 (summing the unrounded cell values gives 4.51; the rounded figure 4.50 is used in the rest of this article)
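The same cell-by-cell calculation, sketched in code. Because nothing is rounded before summing, the total comes out as 4.51 rather than the hand-calculated 4.50.

```python
observed = [[60, 40], [45, 55]]
expected = [[52.5, 47.5], [52.5, 47.5]]

# Contribution of each cell: (O - E)^2 / E
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The test statistic is the sum of all cell contributions.
chi_square = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(c, 2) for c in row])  # [1.07, 1.18] for both rows
print(round(chi_square, 2))            # 4.51
```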
### Degrees of Freedom
df = (number of rows - 1) × (number of columns - 1) = (2 - 1) × (2 - 1) = 1
With df = 1, the critical value at α = .05 is 3.84. Our chi-square = 4.50 > 3.84, so the result is statistically significant (p < .05).
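Instead of looking up a critical value, the p-value can be computed directly. The sketch below relies on a shortcut that is valid only for df = 1, where the chi-square variable is a squared standard normal; for other degrees of freedom a statistics library would be the usual route.

```python
import math

chi_square = 4.50       # value from the worked example
n_rows, n_cols = 2, 2

df = (n_rows - 1) * (n_cols - 1)

# For df = 1: P(X > x) = erfc(sqrt(x / 2)). Not valid for other df.
p_value = math.erfc(math.sqrt(chi_square / 2))

critical_value = 3.84   # df = 1, alpha = .05 (taken from the text)

print(df)                           # 1
print(round(p_value, 3))            # 0.034
print(chi_square > critical_value)  # True
```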
## Assumptions of the Chi-Square Test
Before applying the chi-square test, verify these assumptions.
### 1. Categorical Variables
Both variables must be categorical (nominal or ordinal). The chi-square test cannot be applied directly to continuous variables. If you have a continuous variable, you need to either categorize it (e.g., "low, medium, high score"), which discards information, or use a different test.
If you are working with continuous variables and want to test differences between groups, see the articles on the t-test or ANOVA.
### 2. Expected Frequencies of at Least 5
In every cell of the contingency table, the expected frequency must be at least 5. If it is not, the chi-square test is unreliable because the test statistic no longer follows the chi-square distribution closely enough.
What to do if this assumption is violated:
- Collapse categories (e.g., instead of 5 categories, create 3)
- Use Fisher's exact test (see below)
- Increase your sample size
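Checking this assumption is easy to automate. The helper below is a hypothetical function written for this article (not part of any library), and the second table contains invented counts that deliberately violate the rule.

```python
def low_expected_cells(observed, threshold=5):
    """Return (row, column, expected) for each cell with expected count below threshold."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [
        (i, j, r * c / n)
        for i, r in enumerate(row_totals)
        for j, c in enumerate(col_totals)
        if r * c / n < threshold
    ]


article_table = [[60, 40], [45, 55]]
small_table = [[3, 7], [6, 4]]  # hypothetical counts, for illustration only

print(low_expected_cells(article_table))  # [] -> assumption met
print(low_expected_cells(small_table))    # [(0, 0, 4.5), (1, 0, 4.5)]
```

An empty list means every expected frequency meets the threshold; any returned cell signals that one of the remedies listed above should be considered.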
### 3. Independence of Observations
Each participant may appear only once in the table. This means you cannot use the chi-square test for repeated measures (the same participants tested twice). For such data, use the McNemar test.
## Cramer's V: Effect Size
Statistical significance tells you that a relationship exists, but not how strong it is. That is what Cramer's V is for.
| Cramer's V | Effect Size (for df* = 1) |
|---|---|
| .10 | Small effect |
| .30 | Medium effect |
| .50 | Large effect |
*Here df* = min(number of rows - 1, number of columns - 1). For tables larger than 2x2, the thresholds depend on df*.
For our example: V = sqrt(chi-square / (N × df*)) = sqrt(4.50 / (200 × 1)) = sqrt(0.0225) = .15
This is a small effect: the relationship between gender and instruction preference is statistically significant but weak.
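The effect-size calculation, sketched as a small function. The function name and signature are assumptions made for illustration.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi-square / (N * min(rows - 1, cols - 1)))."""
    df_star = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_square / (n * df_star))


v = cramers_v(chi_square=4.50, n=200, n_rows=2, n_cols=2)
print(round(v, 2))  # 0.15
```

For a 2x2 table the divisor reduces to N, which is why the worked example simplifies to sqrt(4.50 / 200).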
## Fisher's Exact Test: Alternative for Small Samples
When you have small samples and expected frequencies below 5, Fisher's exact test is the better choice. Instead of using an approximation (like the chi-square), it calculates the exact probability of obtaining the observed table or one more extreme.
When to use Fisher's test:
- When total N is less than 20
- When any expected frequency falls below 5
- For 2x2 tables (for larger tables, the Fisher-Freeman-Halton test is used)
Fisher's test is more conservative than the chi-square test, meaning you are less likely to obtain a false positive result.
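To make "the exact probability of the observed table or one more extreme" concrete, here is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table. It assumes the common definition of "more extreme" as any table with the same margins that is no more probable than the observed one, and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


small_table = [[3, 1], [1, 3]]  # hypothetical counts, N = 8
print(round(fisher_exact_two_sided(small_table), 3))  # 0.486
```

In practice a statistics package would normally be used; the point of the sketch is only to show that no chi-square approximation is involved.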
## Practical Example: Gender and Instruction Preference
Let us return to our example in more detail. A researcher wants to examine whether there is a relationship between student gender and preference for online vs. in-person instruction. Data are collected via a survey of 200 psychology students.
Hypotheses:
- H0: There is no relationship between gender and instruction preference (the variables are independent).
- H1: There is a relationship between gender and instruction preference (the variables are not independent).
Checking assumptions:
- Both variables are categorical (gender: male/female; instruction type: online/in-person). Check.
- All expected frequencies are greater than 5 (minimum is 47.5). Check.
- Each student was surveyed once; observations are independent. Check.
Results:
chi-square(1, N = 200) = 4.50, p = .034, V = .15
Interpretation:
Results of the chi-square test of independence indicate a statistically significant association between gender and instruction preference, chi-square(1, N = 200) = 4.50, p = .034. Cramer's V = .15 indicates a small effect. Inspection of observed frequencies reveals that male students more frequently chose online instruction (60% vs. 45%), while female students more frequently preferred in-person instruction (55% vs. 40%).
Limitations:
This finding does not tell us why this difference exists. Possible explanations include differences in learning styles, different social needs, or confounding variables such as academic program or year of study.
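## APA Format for Reporting
The chi-square test has a specific APA reporting format. In a manuscript the statistic is typeset with the Greek symbol χ²; it is spelled out as "chi-square" in the templates here.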
For the test of independence:
chi-square(df, N = total) = value, p = value, V = value
Examples:
- chi-square(1, N = 200) = 5.83, p = .016, V = .17
- chi-square(2, N = 350) = 12.45, p = .002, V = .19
- chi-square(4, N = 500) = 8.21, p = .084 (not significant; V is not reported)
For the goodness of fit test:
chi-square(df, N = total) = value, p = value
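A small formatter can keep reported results consistent with these templates. Both function names below are hypothetical helpers, and the formatting rules (three decimals for p, no leading zero, "p < .001" for very small values) are the usual APA conventions as assumed here.

```python
def format_p(p):
    """APA-style p-value: three decimals, no leading zero, '< .001' for tiny values."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_chi_square(chi_square, df, n, p, v=None):
    """Build a result string; V is appended only when it is supplied."""
    text = f"chi-square({df}, N = {n}) = {chi_square:.2f}, {format_p(p)}"
    if v is not None:
        text += ", V = " + f"{v:.2f}".lstrip("0")
    return text


print(format_chi_square(4.50, 1, 200, 0.034, 0.15))
# chi-square(1, N = 200) = 4.50, p = .034, V = .15
print(format_chi_square(8.21, 4, 500, 0.084))
# chi-square(4, N = 500) = 8.21, p = .084
```

Leaving `v` unset produces the goodness-of-fit form, which has no effect-size term.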
Important: Always report effect size (Cramer's V) alongside significant results. Statistical significance without effect size does not tell you much. This is a general rule for all statistical tests, not just the chi-square.
## Special Cases and Variants
### Yates' Continuity Correction
For 2x2 tables, some software packages automatically apply Yates' correction, which reduces the chi-square value. This is more conservative and reduces the risk of false positives, but some statisticians consider it overly conservative. Check whether your software applies the correction and be consistent.
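The effect of the correction on the article's example can be sketched as follows. Yates' correction subtracts 0.5 from each absolute difference |O - E| before squaring; the function name is an assumption, and the p-value shortcut applies only to df = 1.

```python
import math

observed = [[60, 40], [45, 55]]
expected = [[52.5, 47.5], [52.5, 47.5]]


def chi_square_2x2(observed, expected, yates=False):
    """Pearson chi-square for a 2x2 table, optionally with Yates' correction."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            diff = abs(o - e)
            if yates:
                # Shrink each |O - E| by 0.5, but never below zero.
                diff = max(0.0, diff - 0.5)
            total += diff ** 2 / e
    return total


uncorrected = chi_square_2x2(observed, expected)
corrected = chi_square_2x2(observed, expected, yates=True)

for label, stat in (("uncorrected", uncorrected), ("Yates", corrected)):
    p = math.erfc(math.sqrt(stat / 2))  # p-value shortcut valid for df = 1 only
    print(label, round(stat, 2), round(p, 3))
```

Here the statistic drops from about 4.51 to about 3.93, which still exceeds the critical value of 3.84, but the margin is much narrower. This is why it matters to know whether your software applied the correction.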
### Chi-Square with More Than 2 Categories
When you have a table larger than 2x2 (e.g., 3x4), a significant chi-square test tells you that a relationship exists, but not where. For a more detailed analysis, use:
- Adjusted standardized residuals (cells with an absolute value above about 2 deviate significantly from the expected frequency)
- Post-hoc analysis (compare each pair of categories separately, applying a Bonferroni correction)
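Adjusted standardized residuals can be computed cell by cell as (O - E) / sqrt(E × (1 - row proportion) × (1 - column proportion)). The sketch below applies this to a hypothetical 3x2 table with invented counts (three programs by two instruction preferences).

```python
import math


def adjusted_residuals(observed):
    """Adjusted standardized residual for every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            # Standard error adjusts for the row and column proportions.
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out_row.append((o - e) / se)
        result.append(out_row)
    return result


# Hypothetical 3x2 table (invented counts, for illustration only).
table = [[30, 10], [20, 20], [10, 30]]
residuals = adjusted_residuals(table)

for row in residuals:
    print([round(r, 2) for r in row])
```

With these made-up counts, the first and third rows have residuals of about ±3.87 while the middle row sits at 0, so the association would be driven by the first and third programs.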
## Common Mistake
Using the chi-square test on continuous data.
Students sometimes make the mistake of taking a continuous variable (e.g., an intelligence test score), splitting it into categories (low, medium, high IQ), and then using the chi-square test instead of a t-test or ANOVA.
Why is this problematic? Categorizing continuous variables leads to a loss of information and reduced statistical power. Instead of using the full range of scores (from 80 to 140), you reduce everything to three categories and lose the nuances in your data.
Example of the mistake: A researcher measures anxiety on a 0-to-60 scale and divides participants into "low anxiety" (0-20), "moderate" (21-40), and "high anxiety" (41-60). Then a chi-square test is used to see whether the distribution differs by gender. This is unjustified because anxiety is a continuous variable. The correct approach would be a t-test to compare mean anxiety between genders, or ANOVA if you have more than two groups.
When is categorization justified? Only when the categories have inherent meaning (e.g., "pass/fail," "satisfied/unsatisfied/neutral" as a response to a multiple-choice question). If the categories are artificial divisions of a continuous variable, use the appropriate parametric test.
## Alternative Tests
Here is an overview of when to use which test for categorical data:
| Situation | Test |
|---|---|
| 2 categorical variables, all expected frequencies at least 5 | Chi-square test of independence |
| 2 categorical variables, N < 20 or any expected frequency below 5 | Fisher's exact test |
| Repeated measures, 2x2 table | McNemar test |
| Repeated measures, binary outcome, 3 or more measurements | Cochran's Q test |
| Ordinal variables | Gamma coefficient, Kendall's tau |
| One variable, theoretical distribution | Chi-square goodness of fit test |
## Key Takeaways
- The chi-square test is used for categorical variables.
- Its core logic: comparing observed and expected frequencies.
- Assumptions: categorical variables, expected frequencies of at least 5, independence of observations.
- Cramer's V is the measure of effect size.
- For small samples, use Fisher's exact test.
- APA format: chi-square(df, N = ...) = ..., p = ..., V = ...
- Do not use the chi-square test on artificially categorized continuous variables.
## Try the Istrazimo Platform
Istrazimo automatically detects categorical variables and offers the chi-square test with Cramer's V and contingency table visualization. When your survey includes multiple-choice questions, the platform lets you test the association between any two questions with a single click, complete with an APA-formatted report and graphical display.