# Factor Analysis: A Guide to Exploratory Factor Analysis (EFA)
You have a questionnaire with 30 items and you are wondering: do these 30 items actually measure three or four different things? Do some items cluster together? Can you reduce 30 variables to a smaller number of meaningful dimensions? That is what factor analysis is for.
Factor analysis is one of the most powerful psychometric techniques, but also one of the most frequently misapplied. Understanding its principles and prerequisites is the difference between a quality paper and statistical chaos.
## What Is Factor Analysis and Why Use It?
Factor analysis is a statistical method that identifies latent (hidden) constructs based on correlations among observed variables. In other words, it looks for structure in your data.
Imagine you have a questionnaire about attitudes toward technology in education with 24 items. Respondents rate each item on a scale from 1 to 5. When you look at the correlation matrix, you notice that some items cluster: items about the usefulness of technology correlate with each other, items about fear of technology correlate with each other, and items about ease of use correlate with each other.
Factor analysis formally identifies these groups and tells you: "Your 24 items reduce to 3 factors: perceived usefulness, technology anxiety, and perceived ease of use."
Two fundamental goals:
- Variable reduction (reduce 24 items to 3 factor scores)
- Identification of latent constructs (discover what your questionnaire actually measures)
## EFA vs CFA: When to Use Which?
These are two fundamentally different approaches to factor analysis, and it is important to know the distinction.
Exploratory factor analysis (EFA) is used when you do not know in advance how many factors exist or which items belong to which factor. You let the data "speak" and discover the structure based on statistical criteria.
Use EFA when:
- You are developing a new questionnaire and testing its structure
- You are translating a questionnaire into another language and want to check whether the structure replicates
- You do not have a clear theoretical basis for the expected structure
Confirmatory factor analysis (CFA) is used when you have a precise hypothesis about how many factors exist and which items belong to which factor. You test whether the data fit your model.
Use CFA when:
- You are validating an existing questionnaire on a new sample
- You are testing a theoretical model
- You are comparing two or more alternative structural models
In practice, a good strategy is: EFA on one sample, then CFA on another (or split your sample in half).
## Prerequisites for Factor Analysis
Before running EFA, you need to check three things.
### 1. Sample Size
There are multiple rules of thumb for minimum sample size:
- The 5:1 rule means you need at least 5 participants per item. For a 24-item questionnaire, that is a minimum of 120 participants.
- The 10:1 rule is more conservative and is recommended when communalities are low.
- An absolute minimum of 100 participants applies regardless of the number of items.
For more detailed guidance on planning your sample size, see the article on determining how many participants you need.
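As a quick sanity check, these rules of thumb are easy to encode. A minimal sketch (the function name `min_sample` is mine, not a standard API):

```python
def min_sample(n_items: int, ratio: int = 5, floor: int = 100) -> int:
    """Larger of the participants-per-item rule and the absolute floor of 100."""
    return max(ratio * n_items, floor)

print(min_sample(24))            # 5:1 rule for 24 items -> 120
print(min_sample(24, ratio=10))  # stricter 10:1 rule -> 240
print(min_sample(12))            # the floor of 100 kicks in -> 100
```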
### 2. The KMO Test (Kaiser-Meyer-Olkin)
The KMO measure of sampling adequacy indicates how suitable your variables are for factor analysis. KMO ranges from 0 to 1.
| KMO Value | Interpretation |
|---|---|
| < 0.50 | Unacceptable |
| 0.50 - 0.59 | Poor |
| 0.60 - 0.69 | Mediocre |
| 0.70 - 0.79 | Good |
| 0.80 - 0.89 | Very good |
| >= 0.90 | Excellent |
Minimum for factor analysis: KMO > 0.60. If KMO falls below this threshold, your variables do not share enough common variance and factor analysis is not justified.
### 3. Bartlett's Test of Sphericity
This test checks whether the correlation matrix is sufficiently different from an identity matrix (a matrix in which all correlations are zero). If Bartlett's test is not significant (p > .05), it means your variables are not sufficiently correlated for factor analysis.
Requirement: p < .05. In practice, this test is almost always significant with a reasonable sample size, so KMO is generally considered the more informative indicator.
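Both prerequisites can be computed directly from the correlation matrix. The sketch below is an illustrative numpy implementation of the standard formulas (the toy correlation matrix and function names are mine); statistical packages report the same quantities:

```python
import numpy as np

def kmo(R: np.ndarray) -> float:
    """Overall KMO measure of sampling adequacy from a correlation matrix R."""
    Rinv = np.linalg.inv(R)
    # Partial correlations: -Rinv[i, j] / sqrt(Rinv[i, i] * Rinv[j, j])
    d = np.sqrt(np.outer(np.diag(Rinv), np.diag(Rinv)))
    P = -Rinv / d
    off = ~np.eye(len(R), dtype=bool)            # off-diagonal mask
    r2, p2 = np.sum(R[off] ** 2), np.sum(P[off] ** 2)
    return r2 / (r2 + p2)

def bartlett(R: np.ndarray, n: int):
    """Bartlett's sphericity statistic and degrees of freedom for sample size n."""
    p = len(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    return chi2, p * (p - 1) // 2  # refer chi2 to the chi-square distribution

# Toy correlation matrix: two blocks of 3 items (r = .5 within, .2 between)
R = np.full((6, 6), 0.2)
R[:3, :3] = R[3:, 3:] = 0.5
np.fill_diagonal(R, 1.0)
print(round(kmo(R), 2), bartlett(R, n=200))
```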
## Extraction Method: PCA vs PAF
This is where the confusion that has puzzled students and researchers for years begins.
### Principal Components Analysis (PCA)
PCA is a data reduction technique. It transforms variables into linear combinations (components) that explain maximum total variance. PCA does not assume latent constructs and does not differentiate between shared and unique variance.
### Principal Axis Factoring (PAF)
PAF is a true factor analysis method. It assumes that observed variables are indicators of latent constructs and attempts to identify the shared variance among variables.
The key difference: PCA analyzes total variance (including measurement error), while PAF analyzes only shared variance.
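The difference is easy to see in code: PCA eigendecomposes the correlation matrix as-is (diagonal of 1s, i.e. total variance), while PAF first replaces the diagonal with communality estimates and iterates. A minimal numpy sketch, with a toy matrix and helper names of my own choosing:

```python
import numpy as np

def pca_loadings(R, k):
    """PCA: eigendecompose R as-is; the 1s on the diagonal mean total variance."""
    vals, vecs = np.linalg.eigh(R)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(vals[order])

def paf_loadings(R, k, iters=50):
    """PAF: put communality estimates on the diagonal, so only shared variance is analyzed."""
    h = 1.0 - 1.0 / np.diag(np.linalg.inv(R))  # initial estimates: squared multiple correlations
    Rh = R.copy()
    for _ in range(iters):
        np.fill_diagonal(Rh, h)
        vals, vecs = np.linalg.eigh(Rh)
        order = np.argsort(vals)[::-1][:k]
        L = vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))
        h = np.sum(L ** 2, axis=1)             # update communalities from the solution
    return L

# Toy correlation matrix: two blocks of 3 items (r = .6 within, .1 between)
R = np.full((6, 6), 0.1)
R[:3, :3] = R[3:, 3:] = 0.6
np.fill_diagonal(R, 1.0)
print(np.round(np.sum(pca_loadings(R, 2) ** 2, axis=1), 2))  # PCA communalities
print(np.round(np.sum(paf_loadings(R, 2) ** 2, axis=1), 2))  # PAF communalities (smaller)
```

On the same data, the PCA communalities come out larger because the unique variance is folded in, which is exactly why PCA loadings tend to look more flattering.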
### Which Method to Choose?
- If you want data reduction without theoretical assumptions: PCA
- If you are looking for latent constructs and developing theory: PAF
- For psychometric purposes (questionnaire development): PAF is the better choice
## Rotation: Varimax vs Oblimin
After extraction, factors are rotated to make them more interpretable. There are two categories of rotation.
### Orthogonal Rotation (Varimax)
Varimax assumes that factors are independent (uncorrelated). It rotates toward "simple structure": ideally, each variable ends up with a high loading on one factor and near-zero loadings on the others.
Use Varimax when:
- You theoretically expect the constructs to be independent
- You want a simpler structure for interpretation
- You are running an exploratory analysis without clear hypotheses
### Oblique Rotation (Oblimin)
Oblimin allows factors to be correlated. This is a more realistic approach because in psychology and the social sciences, constructs are rarely completely independent (e.g., anxiety and depression are correlated).
Use Oblimin when:
- You expect factors to be correlated
- You are working with psychological constructs that naturally overlap
- You want a more realistic picture of the data structure
Practical tip: Run both rotations. If Oblimin shows inter-factor correlations below .32, the results will be nearly identical to Varimax, and you can use the simpler Varimax solution.
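For the curious, the varimax criterion itself fits in a few lines of numpy. This is an illustrative sketch of the classic SVD-based algorithm, not the exact routine any particular package uses:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonally rotate a loading matrix L toward simple structure (varimax)."""
    p, k = L.shape
    R = np.eye(k)        # accumulated rotation matrix (stays orthogonal)
    d = 0.0
    for _ in range(max_iter):
        LR = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (LR ** 3 - (gamma / p) * LR @ np.diag(np.sum(LR ** 2, axis=0)))
        )
        R = u @ vt
        d_new = np.sum(s)
        if d != 0.0 and d_new / d < 1 + tol:  # criterion stopped improving
            break
        d = d_new
    return L @ R

# Unrotated two-factor loadings for four hypothetical items
L = np.array([[0.7, 0.5], [0.6, 0.5], [0.6, -0.5], [0.7, -0.6]])
L_rot = varimax(L)
# An orthogonal rotation leaves each item's communality (row sum of squares) unchanged
print(np.allclose(np.sum(L ** 2, axis=1), np.sum(L_rot ** 2, axis=1)))
```

The invariance of the communalities is a handy check: rotation only redistributes loadings across factors, it never changes how much of an item's variance is explained.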
## How to Determine the Number of Factors
This is one of the most important decisions in factor analysis, and unfortunately, there is no single correct answer. Three criteria are commonly used.
### 1. Kaiser's Criterion (Eigenvalue > 1)
Retain only factors with an eigenvalue greater than 1. This is the most commonly used criterion, but also the most criticized because it tends to overestimate the number of factors.
### 2. Scree Plot (Cattell's Scree Test)
Eigenvalues are plotted on a graph, and you look for the "elbow" (the point where the curve sharply changes slope). Factors above the elbow are retained. The problem is that identifying the elbow is subjective.
### 3. Parallel Analysis (Horn)
This is the most objective criterion. A large number of random matrices of the same dimensions as your data are generated, eigenvalues are computed for each, and only factors whose eigenvalues exceed the average eigenvalues from the random matrices are retained.
Recommendation: Use all three criteria and look for convergence. If Kaiser says 4, the scree plot says 3 or 4, and parallel analysis says 3, you probably have 3 factors.
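Parallel analysis is straightforward to sketch in numpy; the simulated two-factor dataset and the function name below are mine, for illustration only:

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Count factors whose observed eigenvalues exceed the random-data average."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    random_mean = np.zeros(p)
    for _ in range(n_sims):
        noise = rng.standard_normal((n, p))  # random data of the same dimensions
        random_mean += np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    random_mean /= n_sims
    return int(np.sum(observed > random_mean))

# Simulated responses with a clear two-factor structure (300 'participants', 8 items)
rng = np.random.default_rng(42)
latent = rng.standard_normal((300, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = loadings[4:, 1] = 0.8
data = latent @ loadings.T + 0.4 * rng.standard_normal((300, 8))
print(parallel_analysis(data))
```

With this strong simulated structure the procedure retains two factors, matching the way the data were generated.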
## How to Interpret Factor Loadings
A factor loading is the correlation between an item and a factor (strictly speaking, this holds for orthogonal solutions; with oblique rotation, the pattern matrix contains regression-like weights instead). The higher the loading, the better the item serves as an indicator of that factor.
Guidelines:
- \> 0.70 = excellent loading
- 0.55 - 0.70 = good loading
- 0.40 - 0.55 = acceptable loading
- < 0.40 = the item does not belong to the factor (consider removing it)
Cross-loadings: If an item has a loading above 0.40 on two or more factors, that is problematic. Such an item is "ambiguous" and is typically removed from the questionnaire.
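A simple loop over the pattern matrix can flag both problem types: items with no salient loading and items with cross-loadings. A hypothetical example, using the .40 cutoff from the guidelines above:

```python
import numpy as np

def flag_items(loadings, item_names, cutoff=0.40):
    """Flag items with no salient loading or with loadings >= cutoff on 2+ factors."""
    flags = {}
    for name, row in zip(item_names, np.abs(loadings)):
        salient = np.sum(row >= cutoff)
        if salient == 0:
            flags[name] = "no salient loading"
        elif salient > 1:
            flags[name] = "cross-loading"
    return flags

# Hypothetical pattern matrix for four items on two factors
L = np.array([[0.78, 0.05],
              [0.45, 0.48],   # loads on both factors
              [0.10, 0.72],
              [0.25, 0.31]])  # loads on neither
print(flag_items(L, ["item1", "item2", "item3", "item4"]))
# -> {'item2': 'cross-loading', 'item4': 'no salient loading'}
```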
## Practical Example: Attitudes Toward Technology in Education
Suppose you are developing a questionnaire about attitudes toward technology use in education. You started with 24 items and collected data from 250 students.
### Step 1: Prerequisites
- KMO = 0.87 (very good)
- Bartlett's test: chi-square(276) = 2341.5, p < .001 (significant)
- Sample of 250 for 24 items = 10.4:1 ratio (excellent)
### Step 2: Extraction (PAF) and Determining the Number of Factors
- Kaiser: 4 factors with eigenvalue > 1
- Scree plot: elbow at 3 or 4 factors
- Parallel analysis: 3 factors
- Decision: 3 factors
### Step 3: Rotation (Oblimin)
| Item | F1: Usefulness | F2: Anxiety | F3: Ease |
|---|---|---|---|
| Technology improves the quality of teaching | .78 | .05 | .12 |
| Students learn better with digital tools | .73 | -.08 | .15 |
| I feel nervous when using new applications | .02 | .81 | -.10 |
| I am afraid of making mistakes on a computer | -.05 | .76 | -.14 |
| I easily master new technologies | .11 | -.12 | .72 |
| I intuitively understand how software works | .08 | -.06 | .69 |
### Step 4: Interpretation
- Factor 1 (Perceived Usefulness): items about how useful technology is for learning
- Factor 2 (Technology Anxiety): items about fear and discomfort when using technology
- Factor 3 (Perceived Ease of Use): items about how easy it is to use technology
The three factors together explain 58.3% of total variance, which is acceptable for the social sciences (50-60% is considered good).
Once you establish the factor structure, the next step is checking the reliability of each subscale. For that, see the guide on Cronbach's alpha coefficient, which is the standard measure of internal consistency.
## Common Mistake
Using PCA and calling it "factor analysis."
This is so widespread that many researchers do not realize they are making an error. PCA (Principal Components Analysis) and FA (Factor Analysis) are mathematically different procedures with different assumptions.
PCA looks for linear combinations of variables that explain maximum total variance. FA looks for latent constructs that explain shared variance.
Why does this matter? If you write "factor analysis was conducted" in your paper but actually used PCA, a knowledgeable reviewer will flag it. More importantly, PCA typically produces higher factor loadings and can create a false impression of item quality.
How to avoid this mistake:
- In SPSS: when choosing extraction, explicitly select "Principal Axis Factoring" instead of "Principal Components"
- In your paper: clearly state the extraction method, rotation type, and criterion for the number of factors
- If you use PCA, be honest and write "principal components analysis," not "factor analysis"
## Reporting in APA Format
When reporting factor analysis, be sure to include:
- Extraction method (PAF, PCA, ML...)
- Type of rotation (Varimax, Oblimin...)
- Criterion for number of factors
- KMO and Bartlett's test
- Percentage of variance explained
- Table of factor loadings (with cross-loadings)
- Correlations among factors (if using oblique rotation)
Example: "Exploratory factor analysis was conducted using principal axis factoring (PAF) with Oblimin rotation. The KMO measure of sampling adequacy was .87, and Bartlett's test of sphericity was statistically significant (chi-square(276) = 2341.5, p < .001). Based on parallel analysis and the scree plot, three factors were extracted, accounting for 58.3% of total variance."
## Try the Istrazimo Platform
Istrazimo includes exploratory factor analysis with the KMO test, scree plot, and automatic factor determination. Instead of configuring everything manually in SPSS or R, you can run a complete EFA in just a few clicks, with a clear loadings table and a scree plot visualization that you can export directly for your paper. Get started.