Blaise Labriola
April 14, 2018
Managing Partner Zoonova.com.

Using Principal Component Analysis (PCA) on a Stock/ETF Portfolio Correlation Matrix.

The following calculations and definitions are taken from ZOONOVA.

This post walks through a Principal Component Analysis (PCA) of a stock portfolio, using the portfolio's correlation matrix as input. First, calculate the stock correlation matrix.

Then take the correlation matrix as input and calculate the principal components for the portfolio.
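The two steps above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not ZOONOVA's implementation: the tickers and the daily returns are made-up numbers, and using `numpy.linalg.eigh` for the eigen-decomposition is my choice because a correlation matrix is symmetric.

```python
import numpy as np

# Hypothetical daily returns (rows = days, columns = tickers); made-up numbers.
tickers = ["AAA", "BBB", "CCC"]
returns = np.array([
    [ 0.010,  0.012, -0.004],
    [-0.006, -0.003,  0.002],
    [ 0.004,  0.007, -0.001],
    [ 0.015,  0.010, -0.008],
    [-0.011, -0.009,  0.006],
    [ 0.002,  0.001,  0.003],
])

# Step 1: correlation matrix (rowvar=False treats each column as a variable).
corr = np.corrcoef(returns, rowvar=False)

# Step 2: eigen-decomposition of the symmetric correlation matrix.
# eigh returns eigenvalues in ascending order, one eigenvector per column.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# Re-sort so the component with the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(np.round(corr, 3))
print(np.round(eigenvalues, 3))
```

Each column of `eigenvectors` holds the coefficients of one principal component, and the matching entry of `eigenvalues` is that component's variance.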

Below is an explanation of PCA and of the output fields.

PC Analysis

Principal component analysis is a statistical technique used to analyze the interrelationships between a large number of variables and to explain them in terms of a smaller set of variables (i.e., "principal components"). Using the correlation matrix for the stock watch portfolio, eigenvalues and eigenvectors are calculated, and these define the corresponding principal components. The output is as follows:

Principal component coefficients (Reduced Model)

Principal component coefficients displays the eigenvectors for the principal components. (NOTE: components with an eigenvalue of less than 1 are omitted, hence "Reduced".) Values with a high correlation between the principal components and the (standardized) original variables, relative to the user-defined threshold, are highlighted.
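Here is a rough sketch of how the reduced model and the highlighting could be reproduced. The 4-stock correlation matrix and the 0.7 threshold are assumptions for illustration only. The correlation between a standardized stock and a component is the eigenvector entry multiplied by the square root of the eigenvalue (often called a loading), so that is what the sketch compares against the threshold; whether ZOONOVA highlights on exactly this quantity is my assumption.

```python
import numpy as np

# Hypothetical 4-stock correlation matrix (illustrative values only).
corr = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
])

eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# "Reduced" model: omit components with an eigenvalue of less than 1.
keep = eigenvalues >= 1.0
reduced_vectors = eigenvectors[:, keep]

# Loadings: correlation between each standardized stock and each kept component.
loadings = reduced_vectors * np.sqrt(eigenvalues[keep])

# Assumed user-defined threshold; flag loadings whose magnitude reaches it.
threshold = 0.7
highlighted = np.abs(loadings) >= threshold

print(np.round(reduced_vectors, 3))
print(highlighted)
```

The absolute value is used because the sign of an eigenvector is arbitrary: flipping a whole column gives an equally valid component.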

Total Variance Explained

The number of components matches the number of unique stocks in the stock portfolio. Because PCA uses the correlation matrix, the variables are standardized: each has a variance of 1, so the values along the correlation diagonal are equal to 1 and the total variance is equal to the number of unique stocks. The columns are:

eValue

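This column displays the eigenvalue (variance) of each principal component. Components are listed from the largest eigenvalue to the smallest, so each successive component accounts for a decreasing amount of the total variance. For comparison, each standardized stock contributes a variance of exactly 1.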

% Var

This column displays the percentage of the total variance explained by each principal component, i.e., its eigenvalue divided by the total variance.

Cum %

This column displays the cumulative percentage of variance explained by all components up to and including each principal component. The last row is always 100%.
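The three columns are simple arithmetic on the eigenvalues. A minimal sketch, assuming a hypothetical 4-stock portfolio whose eigenvalues are the made-up values below (they sum to 4, the number of stocks):

```python
from itertools import accumulate

# Hypothetical eigenvalues, already sorted from largest to smallest.
eigenvalues = [2.0, 1.2, 0.5, 0.3]

# For a correlation matrix the total variance equals the number of stocks.
total_variance = sum(eigenvalues)

# % Var: share of the total variance carried by each component.
pct_var = [100.0 * ev / total_variance for ev in eigenvalues]

# Cum %: running total of the % Var column.
cum_pct = list(accumulate(pct_var))

print(f"{'PC':>2} {'eValue':>7} {'% Var':>7} {'Cum %':>7}")
for i, (ev, pct, cum) in enumerate(zip(eigenvalues, pct_var, cum_pct), start=1):
    print(f"{i:>2} {ev:7.2f} {pct:7.1f} {cum:7.1f}")
```

With these numbers the first two components, which are the only ones with an eigenvalue of at least 1, together account for about 80% of the total variance.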

Scree Plot

The Scree Plot displays the eigenvalue for each principal component (by number). Later components cause the line to become almost flat, showing that their contribution is increasingly negligible.
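To get a feel for the shape without a charting library, here is a small text-only stand-in for a scree plot. It is a sketch, not the ZOONOVA chart: `text_scree` is a hypothetical helper name, and the eigenvalues are the same made-up values for a 4-stock portfolio.

```python
def text_scree(eigenvalues, width=40):
    """Return one text bar per component, scaled to the largest eigenvalue."""
    largest = max(eigenvalues)
    lines = []
    for number, value in enumerate(eigenvalues, start=1):
        bar = "#" * round(width * value / largest)
        lines.append(f"PC{number:<2} {value:5.2f} {bar}")
    return lines


# Hypothetical eigenvalues, sorted from largest to smallest.
eigenvalues = [2.0, 1.2, 0.5, 0.3]
for line in text_scree(eigenvalues):
    print(line)
```

The bars shrink quickly and then level off, which is the flattening the Scree Plot is meant to reveal.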

Principal Component Analysis (PCA) allows you to reduce a large number of correlated variables to far fewer synthetic variables. These synthetic variables, which retain most of the variance in the original data, are referred to as principal components.

Cheers.
