March 20, 2026 · 10 min read · Istražimo Team

# Chi-Square Test: When and How to Use It

When you have two survey questions with predefined answer options and you want to know whether responses to one are related to responses to the other, the chi-square test is your tool. It is one of the oldest and most frequently used statistical tests, and its elegance lies in its simplicity: it compares what you observed with what you would expect if there were no relationship at all.

What Is the Chi-Square Test?

The chi-square test (also written as the chi-squared test) is a nonparametric statistical test used to analyze categorical (nominal or ordinal) variables. Its fundamental logic is the comparison of observed frequencies with expected frequencies.

There are two main types.

1. Test of Independence (Test of Association)

This is the most common use. It tests whether there is a statistically significant association between two categorical variables.

Examples:

•Is there a relationship between gender and preference for online vs. in-person instruction?
•Is the choice of academic program associated with place of origin (urban vs. rural)?
•Is type of high school (academic vs. vocational) associated with the intention to enroll in university?

2. Goodness of Fit Test

This tests whether the distribution of a single categorical variable differs from some theoretical distribution.

Examples:

•Are students evenly distributed across years of study?
•Does the distribution of responses to a question match a normal distribution?
•Are responses on a Likert scale evenly distributed?
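As a minimal sketch of the goodness of fit logic, the snippet below checks whether students are evenly spread across four years of study. The counts are hypothetical and invented only for illustration, and the function name is our own. The critical value 7.815 is the standard tabulated value for df = 3 at the .05 level.

```python
def goodness_of_fit_statistic(observed, expected_proportions):
    """Chi-square goodness of fit statistic for one categorical variable."""
    n = sum(observed)
    stat = 0.0
    for o, p in zip(observed, expected_proportions):
        e = n * p  # expected count under the theoretical distribution
        stat += (o - e) ** 2 / e
    return stat


# Hypothetical counts of students in years 1 to 4 (illustration only).
observed_counts = [30, 25, 25, 20]
uniform = [0.25, 0.25, 0.25, 0.25]

gof_stat = goodness_of_fit_statistic(observed_counts, uniform)
gof_df = len(observed_counts) - 1
CRITICAL_VALUE_DF3 = 7.815  # tabulated critical value for df = 3, alpha = .05

print(gof_stat, gof_df)               # 2.0 3
print(gof_stat > CRITICAL_VALUE_DF3)  # False
```

With these made-up counts the statistic stays below the critical value, so there would be no evidence against an even distribution.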

For the remainder of this article, we will focus on the test of independence, as it is used far more frequently in research practice.

The Contingency Table: Foundation of the Chi-Square Test

A contingency table (cross-tabulation) displays frequencies for each combination of categories across two variables.

Example: We are examining whether there is a relationship between gender (male, female) and preference for type of instruction (online, in-person).

Observed Frequencies (O)

| Gender | Online | In-person | Total |
|---|---|---|---|
| Male | 60 | 40 | 100 |
| Female | 45 | 55 | 100 |
| Total | 105 | 95 | 200 |

How Are Expected Frequencies (E) Calculated?

Expected frequencies show what we would expect if there were no relationship between the variables. They are calculated using the formula:

$E = \dfrac{\text{row total} \times \text{column total}}{N}$, where N is the grand total.

For the cell "Male, Online": $E = (100 \times 105) / 200 = 52.5$

Expected Frequencies (E)

| Gender | Online | In-person | Total |
|---|---|---|---|
| Male | 52.5 | 47.5 | 100 |
| Female | 52.5 | 47.5 | 100 |
| Total | 105 | 95 | 200 |
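The expected-frequency calculation can be sketched in a few lines of plain Python. The function name `expected_frequencies` is our own choice for illustration. The table is passed as a list of rows without the totals.

```python
def expected_frequencies(observed):
    """Return the table of expected counts under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # E = (row total x column total) / grand total, for every cell
    return [[r * c / n for c in col_totals] for r in row_totals]


observed = [[60, 40],   # Male: online, in-person
            [45, 55]]   # Female: online, in-person
expected = expected_frequencies(observed)
print(expected)  # [[52.5, 47.5], [52.5, 47.5]]
```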

The Chi-Square Formula

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

For each cell, you calculate how much the observed frequency deviates from the expected, square that difference, divide by the expected frequency, and sum across all cells.

For our example:

•Cell (Male, Online): $(60 - 52.5)^2 / 52.5 = 1.07$
•Cell (Male, In-person): $(40 - 47.5)^2 / 47.5 = 1.18$
•Cell (Female, Online): $(45 - 52.5)^2 / 52.5 = 1.07$
•Cell (Female, In-person): $(55 - 47.5)^2 / 47.5 = 1.18$

$\chi^2 = 1.07 + 1.18 + 1.07 + 1.18 = 4.50$

Summing the unrounded cell values gives 4.51; the rounded value 4.50 is used in the rest of this article.
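The cell-by-cell summation can be reproduced with a short sketch in plain Python (the function name is illustrative). Because the code works with unrounded cell contributions, it returns about 4.51, slightly more than the hand calculation that adds rounded values.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            stat += (o - e) ** 2 / e               # this cell's contribution
    return stat


stat = chi_square_statistic([[60, 40], [45, 55]])
print(round(stat, 2))  # 4.51
```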

Degrees of Freedom

$df = (\text{number of rows} - 1) \times (\text{number of columns} - 1) = (2 - 1) \times (2 - 1) = 1$

With df = 1, the critical value at the .05 significance level is 3.84. Our chi-square of 4.50 exceeds 3.84, so the result is statistically significant (p < .05).
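Instead of looking up a critical value, you can compute the p-value directly. The sketch below covers only the df = 1 case, where the upper tail of the chi-square distribution can be written with the complementary error function from Python's standard library. The function names are our own. For other degrees of freedom, a statistics library is the practical choice.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a test of independence."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(chi_square):
    """Upper-tail p-value of the chi-square distribution with 1 df."""
    # With 1 df the statistic is a squared standard normal variable,
    # so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2))


df = degrees_of_freedom(2, 2)
p = p_value_df1(4.50)
print(df, round(p, 3))  # 1 0.034
```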

Assumptions of the Chi-Square Test

Before applying the chi-square test, verify these assumptions.

1. Categorical Variables

Both variables must be categorical (nominal or ordinal). You cannot use the chi-square test on continuous variables. If you have a continuous variable, you need to either categorize it (e.g., "low, medium, high score") or use a different test.

If you are working with continuous variables and want to test differences between groups, see the articles on the t-test or ANOVA.

2. Expected Frequencies Greater Than 5

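As a rule of thumb, the expected frequency in every cell of the contingency table should be at least 5. With smaller expected frequencies the chi-square test is unreliable, because the sampling distribution of the test statistic is no longer well approximated by the chi-square distribution. For tables larger than 2x2, a commonly used relaxation (Cochran's rule) allows up to 20% of cells to have expected frequencies below 5, provided none is below 1.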

What to do if this assumption is violated:

•Collapse categories (e.g., instead of 5 categories, create 3)
•Use Fisher's exact test (see below)
•Increase your sample size
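A quick way to check this assumption before running the test is to list the cells whose expected frequency falls below the threshold. This is a minimal sketch. The helper name and the small example table are hypothetical.

```python
def cells_below_threshold(observed, threshold=5):
    """Return (row, column, expected) for cells with expected count below threshold."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            e = r * c / n
            if e < threshold:
                flagged.append((i, j, e))
    return flagged


# The article's example: no cell is flagged.
print(cells_below_threshold([[60, 40], [45, 55]]))  # []

# Hypothetical small sample: two cells have an expected count of 2.5.
small_table = [[3, 7], [2, 8]]
print(cells_below_threshold(small_table))  # [(0, 0, 2.5), (1, 0, 2.5)]
```

An empty list means the assumption holds; otherwise consider one of the remedies listed above.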

3. Independence of Observations

Each participant may appear only once in the table. This means you cannot use the chi-square test for repeated measures (the same participants tested twice). For such data, use the McNemar test.
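For orientation, here is a minimal sketch of the McNemar statistic in its uncorrected form. It uses only the two discordant counts of a paired 2x2 table, that is, the participants whose answer changed between the two measurements. The counts below are hypothetical and the function name is our own.

```python
import math


def mcnemar_statistic(b, c):
    """McNemar chi-square statistic (no continuity correction) from the discordant counts."""
    return (b - c) ** 2 / (b + c)


# Hypothetical paired data: b participants changed from "no" to "yes",
# c participants changed from "yes" to "no".
b, c = 15, 5
mcnemar_stat = mcnemar_statistic(b, c)
mcnemar_p = math.erfc(math.sqrt(mcnemar_stat / 2))  # upper tail, df = 1

print(mcnemar_stat)      # 5.0
print(mcnemar_p < 0.05)  # True
```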

Cramer's V: Effect Size

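Statistical significance tells you that a relationship exists, but not how strong it is. That is what Cramer's V is for: it measures the strength of the association on a scale from 0 (no association) to 1 (perfect association).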

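| Cramer's V | Effect size (for df* = 1) |
|---|---|
| .10 | Small effect |
| .30 | Medium effect |
| .50 | Large effect |

Note: df* is the smaller of (number of rows - 1) and (number of columns - 1), which is not the same as the test's degrees of freedom. For tables larger than 2x2, the thresholds depend on df*.

For our example: $V = \sqrt{\chi^2 / (N \times df^*)} = \sqrt{4.50 / (200 \times 1)} = \sqrt{0.0225} = .15$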

This is a small effect (V lies between the .10 and .30 thresholds), meaning there is a statistically significant but not particularly strong relationship between gender and instruction preference.
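The effect size calculation is short enough to sketch directly. The function below is illustrative and takes df* as the smaller of (rows - 1) and (columns - 1).

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic, sample size, and table shape."""
    df_star = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_square / (n * df_star))


v = cramers_v(4.50, 200, 2, 2)
print(round(v, 2))  # 0.15
```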

Fisher's Exact Test: Alternative for Small Samples

When you have small samples and expected frequencies below 5, Fisher's exact test is the better choice. Instead of using an approximation (like the chi-square), it calculates the exact probability of obtaining the observed table or one more extreme.

When to use Fisher's test:

•When total N is less than 20
•When any expected frequency falls below 5
•For 2x2 tables (for larger tables, the Fisher-Freeman-Halton test is used)

Fisher's test is more conservative than the chi-square test, meaning you are less likely to obtain a false positive result.
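To make "exact probability" concrete, here is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table, written with the standard library only. It follows one common convention for the two-sided p-value: sum the probabilities of all tables (with the same margins) that are no more probable than the observed one. The example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Number of ways to get x in the top-left cell with the margins fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    observed_weight = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Integer weights avoid floating-point ties when comparing probabilities.
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed_weight)
    return extreme / comb(n, col1)


small_sample = [[8, 2], [1, 5]]  # hypothetical counts, N = 16
p_fisher = fisher_exact_two_sided(small_sample)
print(round(p_fisher, 3))  # 0.035
```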

Practical Example: Gender and Instruction Preference

Let us return to our example in more detail. A researcher wants to examine whether there is a relationship between student gender and preference for online vs. in-person instruction. Data are collected via a survey of 200 psychology students.

Hypotheses:

•H0: There is no relationship between gender and instruction preference (the variables are independent).
•H1: There is a relationship between gender and instruction preference (the variables are not independent).

Checking assumptions:

Both variables are categorical (gender: male/female; instruction type: online/in-person). Check.
All expected frequencies are greater than 5 (minimum is 47.5). Check.
Each student was surveyed once; observations are independent. Check.

Results:

chi-square(1, N = 200) = 4.50, p = .034, V = .15

Interpretation:

Results of the chi-square test of independence indicate a statistically significant association between gender and instruction preference, chi-square(1, N = 200) = 4.50, p = .034. Cramer's V = .15 indicates a small effect. Inspection of observed frequencies reveals that male students more frequently chose online instruction (60% vs. 45%), while female students more frequently preferred in-person instruction (55% vs. 40%).

Limitations:

This finding does not tell us why this difference exists. Possible explanations include differences in learning styles, different social needs, or confounding variables such as academic program or year of study.

APA Format for Reporting

The chi-square test has a specific APA reporting format.

For the test of independence:

chi-square(df, N = total) = value, p = value, V = value

Examples:

•chi-square(1, N = 200) = 5.83, p = .016, V = .17
•chi-square(2, N = 350) = 12.45, p = .002, V = .19
•chi-square(4, N = 500) = 8.21, p = .084 (not significant; V is not reported)

For the goodness of fit test:

chi-square(df, N = total) = value, p = value

Important: Always report effect size (Cramer's V) alongside significant results. Statistical significance without effect size does not tell you much. This is a general rule for all statistical tests, not just the chi-square.
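If you report many tests, a small helper keeps the formatting consistent. The sketch below is a hypothetical convenience function, not part of any library. It drops the leading zero from p and V, as APA style does for values that cannot exceed 1, and prints very small p-values as p < .001.

```python
def format_apa(chi_square, df, n, p, v=None):
    """Format a chi-square result in the reporting pattern shown above."""
    def no_leading_zero(value, digits):
        # ".034" instead of "0.034"
        return f"{value:.{digits}f}".replace("0.", ".", 1)

    p_text = "p < .001" if p < 0.001 else f"p = {no_leading_zero(p, 3)}"
    text = f"chi-square({df}, N = {n}) = {chi_square:.2f}, {p_text}"
    if v is not None:
        text += f", V = {no_leading_zero(v, 2)}"
    return text


print(format_apa(4.50, 1, 200, 0.034, 0.15))
# chi-square(1, N = 200) = 4.50, p = .034, V = .15
```

Omit the `v` argument for a goodness of fit test, where Cramer's V does not apply.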

Special Cases and Variants

Yates' Continuity Correction

For 2x2 tables, some software packages automatically apply Yates' correction, which reduces the chi-square value. This is more conservative and reduces the risk of false positives, but some statisticians consider it overly conservative. Check whether your software applies the correction and be consistent.
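To see how much the correction changes the result, the sketch below applies it to the gender and instruction preference table used in this article. Each absolute deviation |O - E| is reduced by 0.5 (never past zero) before squaring. The function name is illustrative.

```python
def chi_square_yates(observed):
    """Chi-square statistic for a 2x2 table with Yates' continuity correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            diff = abs(o - e)
            diff -= min(0.5, diff)  # shrink the deviation by 0.5, but not below zero
            stat += diff ** 2 / e
    return stat


corrected = chi_square_yates([[60, 40], [45, 55]])
print(round(corrected, 2))  # 3.93
```

Here the corrected statistic (about 3.93) is noticeably smaller than the uncorrected one but still exceeds the 3.84 critical value for df = 1.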

Chi-Square with More Than 2 Categories

When you have a table larger than 2x2 (e.g., 3x4), a significant chi-square test tells you that a relationship exists, but not where. For a more detailed analysis, use:

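•Adjusted standardized residuals (cells with an absolute value greater than about 2, or 1.96 at the .05 level, deviate significantly from what independence predicts)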
•Post-hoc analysis (compare each pair of categories separately with Bonferroni correction)
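Adjusted standardized residuals can be sketched as follows, assuming the usual formula (O - E) / sqrt(E x (1 - row total / N) x (1 - column total / N)). The function name is our own. For brevity the example reuses the 2x2 table from this article, but the same code works for larger tables.

```python
import math


def adjusted_residuals(observed):
    """Adjusted standardized residual for every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(observed):
        result_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            result_row.append((o - e) / se)
        result.append(result_row)
    return result


residuals = adjusted_residuals([[60, 40], [45, 55]])
for row in residuals:
    print([round(r, 2) for r in row])
# [2.12, -2.12]
# [-2.12, 2.12]
```

Positive values mark cells with more observations than expected under independence, negative values mark cells with fewer.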

Common Mistake

Using the chi-square test on continuous data.

Students sometimes make the mistake of taking a continuous variable (e.g., an intelligence test score), splitting it into categories (low, medium, high IQ), and then using the chi-square test instead of a t-test or ANOVA.

Why is this problematic? Categorizing continuous variables leads to a loss of information and reduced statistical power. Instead of using the full range of scores (from 80 to 140), you reduce everything to three categories and lose the nuances in your data.

Example of the mistake: A researcher measures anxiety on a 0-to-60 scale and divides participants into "low anxiety" (0-20), "moderate" (21-40), and "high anxiety" (41-60). Then a chi-square test is used to see whether the distribution differs by gender. This is unjustified because anxiety is a continuous variable. The correct approach would be a t-test to compare mean anxiety between genders, or ANOVA if you have more than two groups.

When is categorization justified? Only when the categories have inherent meaning (e.g., "pass/fail," "satisfied/unsatisfied/neutral" as a response to a multiple-choice question). If the categories are artificial divisions of a continuous variable, use the appropriate parametric test.

Alternative Tests

Here is an overview of when to use which test for categorical data:

| Situation | Test |
|---|---|
| 2 categorical variables, N of at least 20 and expected frequencies of at least 5 | Chi-square test of independence |
| 2 categorical variables, N < 20 or expected frequencies below 5 | Fisher's exact test |
| Repeated measures, 2x2 table | McNemar test |
| Repeated measures, three or more related binary measurements | Cochran's Q test |
| Ordinal variables | Gamma coefficient, Kendall's tau |
| One variable, theoretical distribution | Chi-square goodness of fit test |

Key Takeaways

•The chi-square test is used for categorical variables.
•Its core logic: comparing observed and expected frequencies.
•Assumptions: expected frequencies greater than 5, independence of observations.
•Cramer's V is the measure of effect size.
•For small samples, use Fisher's exact test.
•APA format: chi-square(df, N = ...) = ..., p = ..., V = ...
•Do not use the chi-square test on categorized continuous variables.

Try the Istražimo Platform

Istražimo automatically detects categorical variables and offers the chi-square test with Cramer's V and contingency table visualization. When your survey includes multiple-choice questions, the platform lets you test the association between any two questions with a single click, complete with an APA-formatted report and graphical display. Get started.
