Machine Learning - Andrew Ng on Coursera

Machine learning
What is machine learning


  • Arthur Samuel (1959)
    • Field of study that gives computers the ability to learn without being explicitly programmed.
  • Tom Mitchell (1998)
    • A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Machine learning algorithms

  • Supervised learning
  • Unsupervised learning
  • Others: Reinforcement learning, recommender systems.

Supervised learning

  • The “right answers” (labels) are given for each training example
  • Regression: predict a continuous-valued output (e.g. price)
  • Classification: predict a discrete-valued output (e.g. 0 or 1)

Unsupervised learning

  • Only unlabeled data is given; the algorithm has to find the structure in the data set, e.g. by clustering
  • Cocktail party problem
  • [W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');

Linear Regression

Notation: m = Number of training examples

x’s = “input” variable / features

y’s = “output” variable / “target” variable

Cost function

Idea: choose θ0, θ1 so that the hypothesis hθ(x) = θ0 + θ1·x is close to y for our training examples (x, y)

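The cost J(θ0, θ1) = 1/(2m) · Σ (hθ(x(i)) − y(i))², summed over the m training examples i = 1..m, is sometimes called the squared error cost function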
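As a minimal sketch of the squared error cost, assuming the one-variable hypothesis hθ(x) = θ0 + θ1·x from the lectures, the function below averages the squared prediction errors and halves the result. The names hypothesis and squared_error_cost and the toy data are illustrative assumptions, not taken from the course materials.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def squared_error_cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum of squared prediction errors."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data lying exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

perfect_fit = squared_error_cost(1.0, 2.0, xs, ys)  # every error is zero
poor_fit = squared_error_cost(0.0, 0.0, xs, ys)     # predicts 0 for every x

print(perfect_fit, poor_fit)
```

The better the line fits the training examples, the smaller the cost; parameters that reproduce every y exactly give a cost of zero.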

For a fixed θ1, hθ(x) is a function of x, whereas J(θ1) is a function of the parameter θ1 (here with the simplification θ0 = 0).
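To make the two views concrete, the sketch below fixes θ0 = 0 and evaluates J(θ1) for a few candidate values of θ1. Each θ1 defines one line hθ(x) = θ1·x, and each line yields a single number J(θ1). The function name cost_theta1 and the three-point data set are assumptions chosen for illustration.

```python
def cost_theta1(theta1, xs, ys):
    """J(theta1) for the simplified hypothesis h(x) = theta1 * x (theta0 = 0)."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical training set lying on the line y = x
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Each theta1 picks one line; J maps that line to one cost value.
costs = {theta1: cost_theta1(theta1, xs, ys) for theta1 in [0.0, 0.5, 1.0, 1.5]}

for theta1, cost in costs.items():
    print(f"theta1 = {theta1}: J = {cost:.4f}")
```

On this data the cost is zero at θ1 = 1 and grows symmetrically on either side, which is the bowl shape of J(θ1).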

Gradient descent

Update rule (repeat until convergence): θj := θj − ⍺ · ∂J(θ0, θ1)/∂θj, for j = 0 and j = 1.

⍺ is the learning rate. If ⍺ is too small, gradient descent can be slow. If it is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
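The effect of the learning rate can be seen on a deliberately simple cost. The sketch below assumes J(θ) = θ², whose derivative is 2θ and whose minimum is at θ = 0. This cost function, the helper name run_gradient_descent, and the three values of ⍺ are illustrative assumptions.

```python
def run_gradient_descent(alpha, theta=1.0, steps=20):
    """Minimize the toy cost J(theta) = theta**2 (derivative: 2 * theta)."""
    for _ in range(steps):
        theta = theta - alpha * (2 * theta)
    return theta


slow = run_gradient_descent(alpha=0.01)     # tiny steps: still far from 0
good = run_gradient_descent(alpha=0.4)      # quickly approaches the minimum
diverged = run_gradient_descent(alpha=1.1)  # each step overshoots further

print(slow, good, diverged)
```

With ⍺ = 1.1 every update lands on the opposite side of the minimum and further away than before, so the magnitude of θ grows instead of shrinking.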

The update must be simultaneous: compute the new values of θ0 and θ1 from the current parameters first, then assign them both together.
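The difference between a simultaneous and a sequential update can be sketched as follows. The cost J(θ0, θ1) = θ0² + θ0·θ1 + θ1² is a hypothetical example, chosen only because each partial derivative depends on both parameters; the function names are likewise assumptions.

```python
def dJ_dtheta0(theta0, theta1):
    """Partial derivative of J = theta0**2 + theta0*theta1 + theta1**2 w.r.t. theta0."""
    return 2 * theta0 + theta1


def dJ_dtheta1(theta0, theta1):
    """Partial derivative of the same J w.r.t. theta1."""
    return theta0 + 2 * theta1


def simultaneous_step(theta0, theta1, alpha):
    # Correct: both temporaries are computed from the old parameters.
    temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    return temp0, temp1


def sequential_step(theta0, theta1, alpha):
    # Incorrect: theta1's update sees the already-updated theta0.
    theta0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    theta1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    return theta0, theta1


correct = simultaneous_step(1.0, 2.0, alpha=0.5)
incorrect = sequential_step(1.0, 2.0, alpha=0.5)
print(correct, incorrect)
```

The two versions agree on θ0 but produce different values for θ1, so the sequential version is no longer the gradient descent update described here.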

The derivative term after ⍺ is the slope of the tangent to J at the current point. When the slope is positive, the update θ1 := θ1 − ⍺ · (positive number) decreases θ1, so it moves to the left, which is the right direction to get closer to the minimum. When the slope is negative, θ1 increases and moves to the right, again toward the minimum.

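Gradient descent can converge to a local minimum even with the learning rate ⍺ fixed. As it approaches a local minimum, the derivative gets smaller, so the steps automatically get smaller. There is therefore no need to decrease ⍺ over time.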
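A short sketch of the shrinking steps, again assuming the toy cost J(θ) = θ² with derivative 2θ (an illustrative choice, not from the course): ⍺ stays fixed at 0.1, yet each step is smaller than the last because the slope flattens near the minimum.

```python
alpha = 0.1   # fixed learning rate, never decreased
theta = 1.0   # hypothetical starting point
step_sizes = []

for _ in range(10):
    step = alpha * (2 * theta)  # alpha times the derivative of theta**2
    theta = theta - step
    step_sizes.append(abs(step))

print(step_sizes)
print(theta)
```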

Batch gradient descent: each step of gradient descent uses all the training examples.
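Putting the pieces together, batch gradient descent for one-variable linear regression can be sketched as below. It assumes the squared error cost, whose partial derivatives are (1/m)·Σ(hθ(x) − y) for θ0 and (1/m)·Σ(hθ(x) − y)·x for θ1. The function name, the default ⍺ and iteration count, and the toy data are assumptions for illustration.

```python
def batch_gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # "Batch": the errors over ALL m training examples feed every step.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data lying exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = batch_gradient_descent(xs, ys)
print(theta0, theta1)
```

On this data the parameters approach θ0 = 1 and θ1 = 2. On a real data set ⍺ and the number of iterations would need tuning.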

Next Step

Linear algebra: notation and the set of operations you can perform with matrices and vectors.