The Mathematics Behind Principal Component Analysis

The whole process of obtaining principal components from a raw dataset can be simplified into six parts:

1. Take the whole dataset consisting of d+1 dimensions and ignore the labels such that our new dataset becomes d dimensional.
2. Compute the mean for every dimension of the whole dataset.
3. Compute the covariance matrix of the whole dataset.
4. Compute eigenvectors and the corresponding eigenvalues.
5. Sort the eigenvectors by decreasing eigenvalues and choose k eigenvectors with the largest eigenvalues to form a d × k dimensional matrix W.
6. Use this d × k eigenvector matrix to transform the samples onto the new subspace.

So, let's unfurl the maths behind each of these one by one.

1. Take the whole dataset consisting of d+1 dimensions and ignore the labels such that our new dataset becomes d dimensional.

Let's say we have a dataset which is d+1 dimensional, where the d dimensions can be thought of as the features (X) and the remaining 1 as the labels (y) in the modern machine learning paradigm. So, X and y together make up our complete dataset.

After we drop the labels we are left with a d dimensional dataset, and this is the dataset we will use to find the principal components. Also, let's assume we are left with a three-dimensional dataset after ignoring the labels, i.e. d = 3. We will assume that the samples stem from two different classes, where one half of the samples in our dataset are labeled class 1 and the other half class 2.

Let our data matrix X be the scores of three students:

[Table: student test scores]

2. Compute the mean of every dimension of the whole dataset.

The data from the above table can be represented in matrix A, where each column in the matrix shows the scores on a test and each row shows the scores of a student.

[Figure: Matrix A]

So, the mean of matrix A would be:

[Figure: Mean of Matrix A]

3. Compute the covariance matrix of the whole dataset (sometimes also called the variance-covariance matrix).

We can compute the covariance of two variables X and Y using the following formula:

[Formula: covariance of X and Y]

Using the above formula, we can find the covariance matrix of
A. The result is a square matrix of d × d dimensions.

Let's rewrite our original matrix like this:

[Figure: Matrix A]

Its covariance matrix would be:

[Figure: Covariance Matrix of A]

A few points can be noted here:

Shown in blue along the diagonal, we see the variance of scores for each test. The art test has the biggest variance (720), and the English test the smallest (360). So we can say that art test scores have more variability than English test scores.

The covariances are displayed in black in the off-diagonal elements of the covariance matrix of A.

a) The covariance between math and English is positive (360), and the covariance between math and art is positive (180). This means the scores tend to covary in a positive way. As scores on math go up, scores on art and English also tend to go up, and vice versa.

b) The covariance between English and art, however, is zero. This means there tends to be no predictable linear relationship between the movement of English and art scores.

4. Compute eigenvectors and the corresponding eigenvalues.

Intuitively, an eigenvector is a vector whose direction remains unchanged when a linear transformation is applied to it; the transformation only scales it.

Now, we can easily compute the eigenvalues and eigenvectors from the covariance matrix that we have above.

Let A be a square matrix, ν a nonzero vector and λ a scalar that satisfy Aν = λν; then λ is called the eigenvalue associated with the eigenvector ν of A.

The eigenvalues of A are the roots of the characteristic equation:

det(A - λI) = 0

We calculate det(A - λI) first, where I is the identity matrix. Simplifying the matrix A - λI first, we can then find its determinant, which is an expression in λ. Equating that expression to zero and solving for λ gives the eigenvalues of the matrix:

[Figure: Eigenvalues]

Now, we can calculate the eigenvectors corresponding to the above eigenvalues. I will not show how to calculate the eigenvectors here; visit this link to understand how to calculate eigenvectors. So, after solving for the eigenvectors, we would get one solution for each of the corresponding eigenvalues.
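The six steps listed at the top of this article can be sketched in a few lines of NumPy. This is only a minimal sketch under stated assumptions, not the article's own working. The score table is not reproduced in this text, so the five-row matrix `A` below is an assumed stand-in, chosen because it reproduces the variances and covariances quoted above (720, 360, 360, 180 and 0). The covariance is assumed to be divided by n rather than n - 1, the choice of `k = 2` is arbitrary, and the samples are centered before projection; all of these are illustrative choices.

```python
import numpy as np

# Assumed stand-in for the score table (not the article's own data):
# rows are students, columns are the math, English and art tests.
A = np.array([
    [90, 60, 90],
    [90, 90, 30],
    [60, 60, 60],
    [60, 60, 90],
    [30, 30, 30],
], dtype=float)

# Step 2: mean of every dimension (one mean per column).
mean = A.mean(axis=0)

# Step 3: covariance matrix of the centered data.
# Dividing by n (population covariance) is an assumption made here.
centered = A - mean
cov = centered.T @ centered / A.shape[0]

# Step 4: eigenvalues and eigenvectors of the covariance matrix.
# eigh is meant for symmetric matrices and returns eigenvalues in
# ascending order, with eigenvectors as the columns of the second result.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 5: sort by decreasing eigenvalue and keep the top k eigenvectors
# as the d x k matrix W (k = 2 is an arbitrary illustrative choice).
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]
k = 2
W = eigenvectors[:, :k]

# Step 6: project the centered samples onto the new k-dimensional subspace.
projected = centered @ W

print("mean:", mean)
print("covariance matrix:\n", cov)
print("eigenvalues (descending):", eigenvalues)
print("projected shape:", projected.shape)
```

A quick sanity check on any such run is that the eigenvalues sum to the trace of the covariance matrix, since both equal the total variance across the three tests.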
