# Water's Home


## Classification

Classification: $y \in \{0, 1\}$, yet linear regression's hypothesis $h_\theta (x)$ can be $> 1$ or $< 0$.

Logistic Regression: $0 \leq h_\theta (x) \leq 1$. Logistic regression has the property that its output, the prediction, is always between zero and one. Despite the name, logistic regression is actually a classification algorithm.

## Hypothesis Representation

Sigmoid (logistic) function: $g(z) = \frac {1}{1+e^{-z}}$, giving the hypothesis $h_\theta(x) = g(\theta^T x)$.

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z})
    return 1 / (1 + np.exp(-z))
```

## Decision Boundary

With much higher-order polynomial features, it is possible to get even more complex decision boundaries, and logistic regression can be used to fit them. The decision boundary is the set of points where $\theta^T x = 0$, i.e. where $h_\theta(x) = 0.5$.
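As a sketch of a linear decision boundary (the parameter values below are made up for illustration, not learned):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(theta, x):
    # Predict y = 1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0.
    return (sigmoid(x @ theta) >= 0.5).astype(int)

# Hypothetical parameters: boundary is -3 + x1 + x2 = 0, i.e. x1 + x2 = 3.
theta = np.array([-3.0, 1.0, 1.0])
X = np.array([[1.0, 1.0, 1.0],   # x1 + x2 = 2 < 3 -> predict 0
              [1.0, 2.5, 2.5]])  # x1 + x2 = 5 > 3 -> predict 1
print(predict(theta, X))  # [0 1]
```

Here the boundary is the line $x_1 + x_2 = 3$: points below it are predicted $0$, points above it $1$.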

## Cost Function

How do we fit the parameters $\theta$ for logistic regression? In particular, we need to define the optimization objective, or cost function, that we will use to fit the parameters. Here is the supervised learning problem of fitting a logistic regression model.

#### Linear regression cost function

$J(\theta ) = \frac {1}{m} \sum_{i = 1}^{m} \frac {1}{2}(h_\theta(x^{(i)}) - y^{(i)})^{2}$

#### Logistic regression cost function

$J(\theta ) = \frac {1}{m} \sum_{i = 1}^{m} Cost(h_\theta(x^{(i)}), y^{(i)})$

$Cost(h_\theta (x), y) = \begin{cases} -\log(h_\theta(x)) & \text{ if } y=1 \\ -\log(1 - h_\theta(x)) & \text{ if } y=0 \end{cases}$
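The two cases above can be computed in one vectorized expression, since one of the two terms vanishes for each value of $y$. A minimal NumPy sketch (the tiny dataset is made up):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = (1/m) * sum of Cost(h_theta(x), y), where
    # Cost = -log(h) if y = 1, and -log(1 - h) if y = 0.
    m = len(y)
    h = sigmoid(X @ theta)
    return -(y * np.log(h) + (1 - y) * np.log(1 - h)).sum() / m

# Made-up example: with theta = 0, h = 0.5 everywhere, so J = -log(0.5).
theta = np.zeros(2)
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
print(cost(theta, X, y))  # -log(0.5) ~= 0.693
```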

## Simplified Cost Function and Gradient Descent

How to implement a fully working version of logistic regression. The full details are too long to reproduce in these notes; if you are interested, see coursera.org. Because $y$ is always $0$ or $1$, the two cases of the cost can be compressed into a single equation:

$Cost(h_\theta(x), y) = -y \log(h_\theta(x)) - (1-y) \log(1 - h_\theta(x))$

A vectorized implementation can update all $n + 1$ parameters in one fell swoop. Feature scaling helps gradient descent converge faster for linear regression, and the same idea applies to gradient descent for logistic regression.
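The vectorized parameter update can be sketched as follows (the learning rate and the tiny dataset are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_step(theta, X, y, alpha):
    # Vectorized update of all n + 1 parameters at once:
    # theta := theta - (alpha / m) * X^T (h_theta(X) - y)
    m = len(y)
    grad = X.T @ (sigmoid(X @ theta) - y) / m
    return theta - alpha * grad

# Made-up dataset: a column of ones (intercept) plus one feature.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.zeros(2)
for _ in range(1000):
    theta = gradient_step(theta, X, y, alpha=0.5)

print((sigmoid(X @ theta) >= 0.5).astype(int))  # [0 0 1 1]
```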

For gradient descent, technically you don't need code to compute the cost function $J(\theta)$ itself; you only need code to compute the derivative terms. Conjugate gradient, BFGS, and L-BFGS are examples of more sophisticated optimization algorithms. These algorithms have a number of advantages:

* no need to manually pick the learning rate $\alpha$
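These advanced optimizers are usually called through a library rather than implemented by hand. A sketch using SciPy's `minimize` with `method="BFGS"` (assuming SciPy is available; the dataset is made up):

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(theta, X, y):
    m = len(y)
    h = sigmoid(X @ theta)
    return -(y * np.log(h) + (1 - y) * np.log(1 - h)).sum() / m

def grad(theta, X, y):
    # The derivative terms the optimizer needs.
    m = len(y)
    return X.T @ (sigmoid(X @ theta) - y) / m

# Made-up, non-separable dataset so the minimum is finite.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0])

# BFGS chooses its own step sizes, so there is no alpha to tune.
res = minimize(cost, np.zeros(2), args=(X, y), jac=grad, method="BFGS")
print(res.x, res.fun)
```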