Principal Component Analysis (PCA) Visualization

[Interactive visualization with two panels: "Original Space" shows the high-dimensional data with principal components drawn as vectors; "PCA Projection" shows the data projected onto the principal components.]


How Principal Component Analysis Works

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms data to a new coordinate system where the axes (principal components) are ordered by the amount of variance they explain.

The PCA Algorithm:

  1. Standardization: Center the data by subtracting the mean, and optionally scale it to unit variance
  2. Covariance Matrix: Compute the covariance matrix of the standardized data
  3. Eigendecomposition: Find the eigenvectors and eigenvalues of the covariance matrix
  4. Sort Components: Order the eigenvectors by their corresponding eigenvalues (highest to lowest)
  5. Project Data: Transform the original data onto the new coordinate system defined by the principal components
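The five steps above can be sketched in a few lines of NumPy. The dataset below is a hypothetical correlated 2-D sample, standing in for whatever data the visualization generates:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated 2-D toy data (not the page's actual dataset)
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])

# 1. Standardization: center the data (scaling to unit variance is optional)
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix of the centered data
cov = np.cov(X_centered, rowvar=False)

# 3. Eigendecomposition (eigh, since covariance matrices are symmetric)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort components by eigenvalue, highest to lowest
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 5. Project the data onto the principal components
X_projected = X_centered @ eigenvectors

print(eigenvalues)  # variance along each principal component
```

In the projected coordinates the covariance matrix is diagonal: the components carry all the variance, with none shared between axes.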

Key Properties:

  • Principal Components: Directions in which the data varies the most
  • Eigenvalues: Represent the amount of variance explained by each principal component
  • Variance Explained: Each component's eigenvalue as a fraction of the eigenvalue sum, i.e. the percentage of total variance that component captures
  • Orthogonality: Principal components are perpendicular to each other
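These properties are easy to check numerically. The sketch below (using an arbitrary anisotropic toy dataset, not the demo's data) confirms that the eigenvector matrix is orthogonal and that the explained-variance fractions sum to one:

```python
import numpy as np

rng = np.random.default_rng(1)
# Anisotropic toy data: three features with very different variances
X = rng.normal(size=(300, 3)) * np.array([3.0, 1.0, 0.2])

cov = np.cov(X - X.mean(axis=0), rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Orthogonality: V^T V is the identity matrix
print(np.allclose(eigenvectors.T @ eigenvectors, np.eye(3)))

# Variance explained: each eigenvalue over the eigenvalue sum
variance_explained = eigenvalues / eigenvalues.sum()
print(variance_explained)  # non-negative, sorted, sums to 1
```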

Applications:

  • Visualization of high-dimensional data
  • Noise reduction and data preprocessing
  • Feature extraction and selection
  • Compression of high-dimensional data
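As an illustration of the compression use case, here is a hedged sketch: a hypothetical 50-dimensional dataset that really lives near a 2-D subspace is compressed to its top two components and then reconstructed, with small error:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical 50-D data generated from a 2-D latent space plus small noise
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 50))

X_mean = X.mean(axis=0)
Xc = X - X_mean
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Xc, rowvar=False))
V = eigenvectors[:, np.argsort(eigenvalues)[::-1]]

k = 2                                  # keep only the top-2 components
codes = Xc @ V[:, :k]                  # compressed representation (500 x 2)
X_rec = codes @ V[:, :k].T + X_mean    # reconstruction back in 50-D

rel_error = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
print(rel_error)  # small, because almost all variance is in the first two components
```

Storing 500 × 2 codes plus a 50 × 2 basis in place of the full 500 × 50 matrix is the compression; the reconstruction error equals the variance in the discarded components.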

Limitations:

  • Only captures linear relationships in the data
  • Sensitive to the relative scaling of the original variables
  • May not work well if the data has non-linear structures
  • Interpretation of principal components can be difficult
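The scaling sensitivity in the second limitation can be demonstrated directly. In this assumed example, the data has no preferred direction, yet expressing one feature in different units (say, metres vs. millimetres) makes that feature dominate the first component:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 2))  # isotropic toy data: no dominant direction

def first_pc(data):
    """Return the eigenvector of the sample covariance with the largest eigenvalue."""
    cov = np.cov(data - data.mean(axis=0), rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    return eigenvectors[:, np.argmax(eigenvalues)]

# Rescale feature 0 by 1000 (e.g. a unit change from metres to millimetres)
X_rescaled = X * np.array([1000.0, 1.0])

print(first_pc(X))           # direction is arbitrary for isotropic data
print(first_pc(X_rescaled))  # first PC now aligns almost exactly with feature 0
```

This is why step 1 of the algorithm optionally scales each variable to unit variance: without it, PCA ranks features by their units rather than by structure in the data.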