# Principal Components Analysis

After this transformation, we only really have one relevant dimension and thus we can discard the second axis.

Comparing the original observations with our new projections, we can see that the projections are not an exact representation of our data.

However, one could argue it does capture the essence of our data — not exact, but enough to be meaningful.

Let’s take another look at the data.

Specifically, look at the spread of the data along the green and orange directions.

Notice that there is much more deviation along the green direction than along the orange direction.

Projecting our observations onto the orange vector requires us to move much further than we would need to for projecting onto the green vector.

It turns out that the vector that captures the maximum variance of the data is also the one that minimizes the total squared distance we must move our observations when projecting them onto it.
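To make this equivalence concrete, here is a minimal NumPy sketch. The data set, the helper names (`projected_variance`, `mean_squared_distance`), and the grid of candidate directions are all assumptions made for illustration; they are not part of the original figures. The sketch scans unit vectors over a half circle and checks that the direction with the largest projected variance is also the one with the smallest mean squared projection distance.

```python
import numpy as np

# Hypothetical 2-D data with most of its spread along one diagonal direction.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + 0.1 * rng.normal(size=500)
X = np.column_stack([x, y])
Xc = X - X.mean(axis=0)  # center the data before projecting


def projected_variance(Xc, u):
    """Variance of the observations after projecting onto direction u."""
    u = u / np.linalg.norm(u)
    return np.var(Xc @ u)


def mean_squared_distance(Xc, u):
    """Mean squared distance each observation moves when projected onto u."""
    u = u / np.linalg.norm(u)
    projections = np.outer(Xc @ u, u)
    return np.mean(np.sum((Xc - projections) ** 2, axis=1))


# Candidate directions: unit vectors at 1-degree steps over a half circle.
angles = np.linspace(0, np.pi, 181)
directions = np.column_stack([np.cos(angles), np.sin(angles)])
variances = np.array([projected_variance(Xc, u) for u in directions])
distances = np.array([mean_squared_distance(Xc, u) for u in directions])

# First principal component: eigenvector of the covariance matrix
# belonging to the largest eigenvalue (eigh sorts eigenvalues ascending).
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1 = eigvecs[:, -1]

# Total variance of the centered data; it splits into the two quantities above.
total = np.mean(np.sum(Xc ** 2, axis=1))

print("max-variance direction:", directions[np.argmax(variances)])
print("min-distance direction:", directions[np.argmin(distances)])
print("first principal component (sign is arbitrary):", pc1)
```

For centered data, the projected variance and the mean squared distance always add up to the same total, which is why maximizing one is the same as minimizing the other.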

# Factor Analysis

Factor analysis is a statistical procedure to identify interrelationships that exist among a large number of variables, i.e., to identify how suites of variables are related.

Factor analysis can be used for exploratory or confirmatory purposes.

As an exploratory procedure, factor analysis is used to search for a possible underlying structure in the variables.

In confirmatory research, the researcher evaluates how similar the actual structure of the data, as indicated by factor analysis, is to the expected structure.

The major difference between exploratory and confirmatory factor analysis is that the researcher has formulated hypotheses about the underlying structure of the variables when using factor analysis for confirmatory purposes.

As an exploratory tool, factor analysis doesn’t have many statistical assumptions.

The only real assumption is the presence of relatedness between the variables, as represented by the correlation coefficient.

If there are no correlations, then there is no underlying structure.
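One quick way to check this assumption is to inspect the off-diagonal entries of the correlation matrix. The sketch below uses two hypothetical data sets, one driven by a shared latent factor and one made of independent noise. The variable names and the helper `max_abs_offdiag` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical related variables: three noisy measurements of one latent factor.
factor = rng.normal(size=n)
related = np.column_stack([factor + 0.5 * rng.normal(size=n) for _ in range(3)])

# Hypothetical unrelated variables: three independent noise columns.
unrelated = rng.normal(size=(n, 3))


def max_abs_offdiag(data):
    """Largest absolute correlation between any two distinct variables."""
    corr = np.corrcoef(data, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.max(np.abs(off_diagonal))


print("related:  ", max_abs_offdiag(related))
print("unrelated:", max_abs_offdiag(unrelated))
```

If the largest off-diagonal correlation is close to zero, as in the unrelated case, there is little shared structure for factor analysis to find.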

Steps in conducting a factor analysis:

There are five basic factor analysis steps:

1. Data collection and generation of the correlation matrix
2. Partition of variance into common and unique components (unique may include random error variability)
3. Extraction of initial factor solution
4. Rotation and interpretation
5. Construction of scales or factor scores to use in further analyses
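The first three steps can be sketched with NumPy alone. This is a simplified illustration under stated assumptions, not a full factor analysis. The data are simulated from two hypothetical latent factors, and the extraction uses the principal component method on the correlation matrix, which is only one of several possible extraction methods. The number of factors is fixed at two by assumption. Rotation and factor scores are left out.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000


def noisy(signal):
    """Hypothetical measurement: the latent signal plus independent noise."""
    return signal + 0.5 * rng.normal(size=n)


# Simulated data: variables 0-2 follow latent factor f1, variables 3-5 follow f2.
f1, f2 = rng.normal(size=n), rng.normal(size=n)
data = np.column_stack([noisy(f1), noisy(f1), noisy(f1),
                        noisy(f2), noisy(f2), noisy(f2)])

# Step 1: correlation matrix of the observed variables.
R = np.corrcoef(data, rowvar=False)

# Step 3: initial factor solution via principal component extraction
# (an assumption of this sketch; other extraction methods exist).
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]  # sort eigenvalues in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
n_factors = 2  # assumed known here; in practice this must be chosen
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])

# Step 2: partition each variable's variance into common and unique parts.
communalities = np.sum(loadings ** 2, axis=1)  # variance shared with the factors
uniqueness = 1.0 - communalities               # remainder, including random error

print("communalities:", np.round(communalities, 2))
print("uniqueness:   ", np.round(uniqueness, 2))
```

Because the variables are standardized in a correlation matrix, each one has unit variance, so communality and uniqueness sum to one for every variable.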