Logistic regression is used for classification problems, where the predicted value is one of a fixed set of discrete values, such as 1 for positive or 0 for negative. The essentials of logistic regression are:
- the hypothesis function is the sigmoid function
- the cost function J(theta)
- gradient descent and related algorithms
- advanced optimization, with regularization to solve the overfitting problem.
Basics about logistic regression
hypothesis function: htheta(x) = g(theta' * x) = 1 / (1 + exp(-theta' * x)),
where g(z) = 1 / (1 + exp(-z)) is the sigmoid function and theta' is the transpose of theta.
htheta(x) is the probability that y = 1 given x, parameterized by theta, written P(y=1 | x; theta):
if htheta(x) >= 0.5, then y = 1
if htheta(x) < 0.5, then y = 0
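The hypothesis and the 0.5 threshold above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the Octave used in the course; the names `sigmoid`, `hypothesis`, and `predict` are assumptions made for this sketch, not functions from the assignment.

```python
import math

def sigmoid(z):
    """Sigmoid g(z) = 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hypothesis(theta, x):
    """htheta(x) = g(theta' * x): the estimated probability that y = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

def predict(theta, x):
    """Return 1 if htheta(x) >= 0.5, otherwise 0."""
    return 1 if hypothesis(theta, x) >= 0.5 else 0

# Hypothetical parameters; x[0] is the intercept term and is always 1.
theta = [-3.0, 1.0, 1.0]
print(predict(theta, [1.0, 2.0, 2.0]))
print(predict(theta, [1.0, 1.0, 1.0]))
```

The first call has theta' * x = 1, so the probability is above 0.5; the second has theta' * x = -1, so it falls below 0.5.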
Our goal is to calculate theta so that the decision boundary classifies our training data.
In the example, the training data can be separated into 2 categories by a straight line.
if (theta'x) >= 0, then htheta(x) >= 0.5, then y = 1
if (theta'x) < 0, then htheta(x) < 0.5, then y = 0
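The two rules above follow directly from the shape of the sigmoid function. A short derivation, using only the definition of the hypothesis:

```latex
\begin{aligned}
h_\theta(x) \ge 0.5
&\iff \frac{1}{1 + e^{-\theta^{T} x}} \ge \frac{1}{2} \\
&\iff 1 + e^{-\theta^{T} x} \le 2 \\
&\iff e^{-\theta^{T} x} \le 1 \\
&\iff \theta^{T} x \ge 0
\end{aligned}
```

So the decision boundary is the set of points where theta' * x = 0, which is a straight line when x holds two features plus the intercept term.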
Cost function implementation
The week 3 assignment predicts university admission from the grades of 2 exams.
I optimized the implementation with vectorization:
function [J, grad] = costFunction(theta, X, y)
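The body of `costFunction` is not shown here. As a hedged sketch of what such a function typically computes, here is the standard unregularized logistic regression cost, J(theta) = -(1/m) * sum(y * log(h) + (1 - y) * log(1 - h)), and its gradient, (1/m) * sum((h - y) * x_j). It is written in plain Python with loops instead of vectorized Octave; the name `cost_function` and the toy data are assumptions for illustration only.

```python
import math

def sigmoid(z):
    """Sigmoid g(z) = 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cost_function(theta, X, y):
    """Return (J, grad) for logistic regression without regularization.

    Note: math.log fails if a prediction is exactly 0 or 1; this sketch
    does not guard against that case.
    """
    m = len(y)
    n = len(theta)
    J = 0.0
    grad = [0.0] * n
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        J += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
        for j in range(n):
            grad[j] += (h - yi) * xi[j]
    return J / m, [g / m for g in grad]

# Toy data (not the assignment's data); the first column is the intercept term.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 0, 1, 1]
J, grad = cost_function([0.0, 0.0], X, y)
print(J, grad)
```

With theta at zero every prediction is 0.5, so the cost is log(2), about 0.693, regardless of the data.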
Cost function with regularization
Regularization addresses the overfitting problem.
- Underfit: does not fit the training data well, with high bias between predictions and actual values
- Just right: great fit
- Overfit: often caused by too many features and not enough training data; fits the training data well, but has high variance and predicts new data poorly
function [J, grad] = costFunctionReg(theta, X, y, lambda)
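The body of `costFunctionReg` is not shown here either. A minimal sketch of the usual regularized version adds lambda / (2m) * sum(theta_j^2) to the cost and (lambda / m) * theta_j to each gradient component, skipping the intercept term theta_0. Leaving the intercept unregularized is the common convention and is assumed here; the name `cost_function_reg` and the toy data are illustrative.

```python
import math

def sigmoid(z):
    """Sigmoid g(z) = 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cost_function_reg(theta, X, y, lam):
    """Return (J, grad) with an L2 penalty on every theta except theta[0]."""
    m = len(y)
    n = len(theta)
    J = 0.0
    grad = [0.0] * n
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        J += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
        for j in range(n):
            grad[j] += (h - yi) * xi[j]
    # Average over the training set, then add the penalty terms.
    J = J / m + lam / (2.0 * m) * sum(t * t for t in theta[1:])
    grad = [g / m for g in grad]
    for j in range(1, n):
        grad[j] += lam / m * theta[j]
    return J, grad

# Toy data (not the assignment's data); the first column is the intercept term.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 0, 1, 1]
theta = [1.0, 2.0]
J0, g0 = cost_function_reg(theta, X, y, 0.0)  # no regularization
J1, g1 = cost_function_reg(theta, X, y, 1.0)  # lambda = 1
print(J1 - J0, g1[1] - g0[1])
```

With m = 4, theta_1 = 2 and lambda = 1, the penalty adds 1 / 8 * 4 = 0.5 to the cost and 1 / 4 * 2 = 0.5 to the second gradient component, while the intercept gradient is unchanged.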
The lambda for regularization must be neither too large nor too small:
- a large lambda gives very small theta values, and the model underfits.
- a small lambda gives large theta values, and the model overfits.
- the lambda for the exercise is 1
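The shrinking effect of lambda on theta can be seen with a small experiment. The sketch below runs plain batch gradient descent on a toy data set for three values of lambda. The data, the learning rate, the iteration count, and the lambda values 0, 1, and 10 are all assumptions chosen for illustration. It only shows that theta gets smaller as lambda grows; it does not measure underfitting or overfitting.

```python
import math

def sigmoid(z):
    """Sigmoid g(z) = 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gradient(theta, X, y, lam):
    """Gradient of the regularized cost; theta[0] (intercept) is not penalized."""
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        for j in range(len(theta)):
            grad[j] += (h - yi) * xi[j] / m
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return grad

def gradient_descent(X, y, lam, alpha=0.1, iterations=5000):
    """Batch gradient descent starting from theta = 0."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        grad = gradient(theta, X, y, lam)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

# Toy, non-separable data; the first column is the intercept term.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]

theta_small = gradient_descent(X, y, lam=0.0)
theta_mid = gradient_descent(X, y, lam=1.0)
theta_large = gradient_descent(X, y, lam=10.0)
print(theta_small[1], theta_mid[1], theta_large[1])
```

The printed slope theta[1] is largest for lambda = 0 and smallest for lambda = 10, which matches the first two bullets above.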
Final words
After one year, I am learning logistic regression again. Last week, Andrew Ng left Baidu. Maybe these great people thought Baidu was not worth fighting for. I am still dedicated to a Spark project, with a focus on Spark Streaming. As team leader, I bear a great burden and it is stressful, but it is a great chance to train my leadership. I am also wondering about my next opportunity. Learning machine learning is the right thing to do and worth the effort. Anyway, even though there is mist on the path, just go forward and fight~