# 4 Linear Regression with Multiple Variables

## Linear Regression with Multiple Variables

$$x^{(i)}_j$$ : the value of feature $$j$$ in the $$i$$-th training example.

## Gradient Descent for Multiple Variables

```python
import numpy as np

def computeCost(X, y, theta):
    # vectorized cost J(theta) = (1/2m) * sum((X @ theta - y)^2);
    # X is the m-by-(n+1) design matrix with a leading column of ones
    inner = np.power((X @ theta) - y, 2)
    return np.sum(inner) / (2 * len(X))
```
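To pair with the cost function, here is a minimal sketch of batch gradient descent for multiple variables (the function name and signature are my own, not from the notes; it assumes `X` already includes the bias column of ones):

```python
import numpy as np

def gradientDescent(X, y, theta, alpha, iters):
    """Batch gradient descent for linear regression.

    Assumes X has a leading column of ones for the intercept term."""
    m = len(X)
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta - y
        # simultaneous update of every theta_j
        theta = theta - (alpha / m) * (X.T @ error)
        # record the cost after this update to monitor convergence
        cost[i] = np.sum((X @ theta - y) ** 2) / (2 * m)
    return theta, cost
```

The recorded cost history is what you would plot against the iteration number to check convergence, as discussed below.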

## Gradient Descent in Practice I – Feature Scaling

If the different features take on similar ranges of values, gradient descent can converge more quickly.

$$x_n := \frac{x_n - \mu_n}{s_n}$$
$$\mu_n$$ : the average value of feature $$n$$ in the training set
$$s_n$$ : the range of values of feature $$n$$ (max minus min)
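Mean normalization can be sketched as follows (a minimal example; the function name is my own):

```python
import numpy as np

def scale_features(X):
    """Mean-normalize each column: (x_n - mu_n) / s_n,
    where s_n is the range (max - min) of that feature."""
    mu = X.mean(axis=0)
    s = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / s, mu, s
```

Keep `mu` and `s` around: any new example must be scaled with the same values before making a prediction.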

## Gradient Descent in Practice II – Learning Rate

Plot the cost function $$J(\theta)$$ as gradient descent runs, with the number of iterations on the x-axis. If gradient descent is working correctly, $$J(\theta)$$ should decrease after every iteration; if it increases or oscillates, the learning rate $$\alpha$$ is probably too large.

Try a range of values, e.g. $$\alpha = 0.01, 0.03, 0.1, 0.3, 1, 3, 10$$ (roughly threefold steps), and pick the largest value that still converges.
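One way to compare candidate learning rates is to run gradient descent once per value and look at the final cost (a sketch; the helper name is my own):

```python
import numpy as np

def run_rates(X, y, alphas, iters=100):
    """Run gradient descent once per candidate alpha and return the
    final cost for each, so the J(theta) curves can be compared."""
    results = {}
    m = len(X)
    for alpha in alphas:
        theta = np.zeros(X.shape[1])
        for _ in range(iters):
            theta -= (alpha / m) * (X.T @ (X @ theta - y))
        error = X @ theta - y
        results[alpha] = np.sum(error ** 2) / (2 * m)
    return results
```

A rate that is too small will leave the cost nearly unchanged after the same number of iterations; one that is too large will blow up to very large values.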

## Features and Polynomial Regression

Look at the data and choose features.

You can also define new features as polynomial functions of existing ones; with appropriate insight into the features, this can sometimes give a much better model for your data.

Feature scaling becomes especially important when you use polynomial features, because $$x$$, $$x^2$$, and $$x^3$$ take on very different ranges of values.
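Building polynomial features from a single feature can be sketched like this (the function name is my own):

```python
import numpy as np

def polynomial_features(x, degree):
    """Build the columns [x, x^2, ..., x^degree] from one feature x."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])
```

For example, if $$x$$ ranges up to 100, then $$x^3$$ ranges up to $$10^6$$, which is exactly why these columns should be scaled before running gradient descent.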

## Normal Equation

The normal equation gives, for some linear regression problems, a better way to solve for the optimal value of the parameters $$\theta$$: it computes them analytically in one step rather than iteratively.

$$\theta = (X^{T}X)^{-1}X^{T}y$$
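The formula translates directly into code (a sketch; `np.linalg.pinv` is used so a non-invertible $$X^{T}X$$ is still handled):

```python
import numpy as np

def normal_equation(X, y):
    """Closed-form solution: theta = (X^T X)^{-1} X^T y.
    pinv (pseudo-inverse) also works when X^T X is singular,
    e.g. with redundant or too many features."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y
```

Unlike gradient descent, this needs no learning rate and no feature scaling, but inverting the $$n \times n$$ matrix becomes expensive when the number of features $$n$$ is very large.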