# 14 Dimensionality Reduction

## Motivation I: Data Compression

1. Compress the data so it takes up less computer memory or disk space.
2. Speed up our learning algorithms.

## Motivation II: Visualization

If you have 50 features, it’s very difficult to plot 50-dimensional data.

But if you reduce the dimensions, the problem is figuring out what these new features mean.

## Principal Component Analysis Problem Formulation

By far the most commonly used algorithm is something called principal components analysis or PCA.

What PCA does is try to find a lower-dimensional surface onto which to project the data so as to minimize the projection error, i.e., the sum of the squared distances between each point and its projection.

Before applying PCA it’s standard practice to first perform mean normalization and feature scaling.

## Principal Component Analysis Algorithm

To reduce the data from $n$ dimensions to $k$ dimensions:

1. Mean normalization, maybe perform feature scaling as well
2. Compute the covariance matrix, $$\Sigma = \frac {1}{m} \sum _{i=1}^{m}(x^{(i)})(x^{(i)})^T$$
3. Compute the eigenvectors of the matrix $\Sigma$, typically via singular value decomposition; take the first $k$ columns of $U$ as $U_{reduce}$ and project each example: $Z^{(i)} = U_{reduce}^T x^{(i)}$
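The steps above can be sketched in NumPy. This is a minimal illustration, not code from the original notes; the function name `pca` and the choice of `np.linalg.svd` to obtain the eigenvectors are my assumptions.

```python
import numpy as np

def pca(X, k):
    """Minimal PCA sketch: X is an (m, n) data matrix, one example per row."""
    # Step 1: mean normalization (feature scaling could be added when
    # features have very different ranges).
    mu = X.mean(axis=0)
    X_norm = X - mu
    # Step 2: covariance matrix, Sigma = (1/m) * sum_i x_i x_i^T = X^T X / m.
    m = X.shape[0]
    Sigma = (X_norm.T @ X_norm) / m
    # Step 3: eigenvectors of Sigma via SVD; for a symmetric positive
    # semi-definite matrix, the left singular vectors are its eigenvectors.
    U, S, _ = np.linalg.svd(Sigma)
    U_reduce = U[:, :k]        # first k principal components, shape (n, k)
    Z = X_norm @ U_reduce      # compressed representation, shape (m, k)
    return Z, U_reduce, S, mu
```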

## Choosing The Number Of Principal Components

The variation of the training sets : $$\frac {1}{m} \sum_{i=1}^{m} \left \| x^{(i)} \right \|^2$$

To choose $k$, a pretty common rule of thumb is to pick the smallest value of $k$ so that the ratio of the average squared projection error to the total variation in the data is less than 0.01. In terms of the diagonal entries $S_{ii}$ of the matrix $S$ from the SVD of $\Sigma$:

$$\frac {\frac {1}{m} \sum_{i=1}^{m} \left \| x^{(i)} - x^{(i)}_{approx} \right \|^2}{\frac {1}{m} \sum_{i=1}^{m} \left \| x^{(i)} \right \| ^2} = 1 - \frac {\sum_{i=1}^{k}S_{ii}}{\sum_{i=1}^{n}S_{ii}} \leq 1 \%$$

$$\frac {\sum_{i=1}^{k}S_{ii}}{\sum_{i=1}^{n}S_{ii}} \geq 99 \%$$
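The rule above can be implemented directly from the singular values of $\Sigma$. A hedged sketch; the function name `choose_k` and the `retain` parameter are my own, not from the original notes.

```python
import numpy as np

def choose_k(S, retain=0.99):
    """Smallest k whose top-k singular values of Sigma retain at least
    the requested fraction of the total variance (rule of thumb above)."""
    fractions = np.cumsum(S) / S.sum()   # retained variance for k = 1..n
    # index of the first k with sum_{i<=k} S_ii / sum_{i=1}^n S_ii >= retain
    return int(np.searchsorted(fractions, retain) + 1)
```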

To reconstruct an approximation of the original data from the compressed representation:

$$x^{(i)}_{approx} = U_{reduce}Z^{(i)}$$
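Given `U_reduce` and the compressed `Z` from the projection step, the reconstruction formula above is a single matrix product. A sketch only; the helper name `reconstruct` and the `mu` argument for undoing mean normalization are my assumptions.

```python
import numpy as np

def reconstruct(Z, U_reduce, mu):
    """Approximate the original data: x_approx = U_reduce * z for each
    example, plus the mean that was subtracted during normalization."""
    return Z @ U_reduce.T + mu
```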

- When designing a machine learning system, first consider running it on your original raw data $$x^{(i)}$$; only if that doesn't do what you want should you implement PCA and use $$Z^{(i)}$$ instead.