Optimization for machine learning

Estimator choice:


First, write out the objective function to be optimized.
To minimize or maximize a function $F$, there are several algorithms to choose from:

Gradient descent:
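A minimal sketch of the basic update, $x \leftarrow x - \eta \nabla F(x)$, for a differentiable $F$. The objective, the fixed step size `step`, and the stopping tolerance below are illustrative assumptions, not part of these notes.

```python
def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Minimize a differentiable function given its gradient (illustrative sketch)."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        # Stop once the gradient norm is numerically zero.
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        # Move against the gradient with a fixed step size.
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical objective: F(x, y) = (x - 3)^2 + 2 * (y + 1)^2, minimized at (3, -1).
def grad_F(p):
    x, y = p
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]


x_min = gradient_descent(grad_F, [0.0, 0.0])
```

To maximize $F$ instead, the same routine can be applied to $-F$. A fixed step only converges if it is small enough relative to the curvature of $F$; in practice a line search or a decaying schedule is often used.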

Subgradient descent:
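When $F$ is convex but not differentiable everywhere, the gradient can be replaced by any subgradient. The sketch below assumes a one-dimensional objective $F(x) = |x - 2|$ and a diminishing step size $1/k$; both are illustrative choices. Because a subgradient step is not guaranteed to decrease $F$, the best iterate seen so far is tracked.

```python
def subgradient_descent(f, subgrad, x0, n_iter=2000):
    """Subgradient method with step size 1/k, returning the best iterate found."""
    x = x0
    best_x, best_f = x, f(x)
    for k in range(1, n_iter + 1):
        # Step along a negative subgradient with a diminishing step size.
        x = x - (1.0 / k) * subgrad(x)
        fx = f(x)
        # The objective may go up on a given step, so remember the best point.
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


# Hypothetical non-smooth objective with its minimum at x = 2.
def f(x):
    return abs(x - 2.0)


def subgrad(x):
    # Any value in [-1, 1] is a valid subgradient at the kink; 0 is used here.
    if x > 2.0:
        return 1.0
    if x < 2.0:
        return -1.0
    return 0.0


best_x, best_f = subgradient_descent(f, subgrad, 0.0)
```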

Conjugate gradient:
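A minimal sketch of the linear conjugate gradient method, assuming a quadratic objective $F(x) = \tfrac{1}{2} x^\top A x - b^\top x$ with $A$ symmetric positive definite, so that minimizing $F$ is equivalent to solving $Ax = b$. The matrix and vector below are made-up illustrative values.

```python
def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Minimize 0.5 * x^T A x - b^T x for symmetric positive definite A."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = list(x0)
    # The residual b - A x is the negative gradient of F at x.
    r = [bi - ai for bi, ai in zip(b, matvec(x))]
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)  # exact line search along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New search direction, conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Hypothetical 2x2 system; the exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_cg = conjugate_gradient(A, b, [0.0, 0.0])
```

In exact arithmetic the method terminates in at most $n$ iterations for an $n$-dimensional quadratic. Nonlinear variants exist for general $F$ and are not shown here.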


Newton's method:
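A minimal one-dimensional sketch of the Newton update $x \leftarrow x - F'(x) / F''(x)$. The objective $F(x) = e^x - 2x$ is an illustrative assumption; it is convex with its minimum at $x = \ln 2$. In higher dimensions the division becomes a linear solve with the Hessian, $\nabla^2 F(x)\, d = -\nabla F(x)$.

```python
import math


def newton_minimize(dF, d2F, x0, tol=1e-10, max_iter=50):
    """One-dimensional Newton's method applied to the stationarity condition F'(x) = 0."""
    x = x0
    for _ in range(max_iter):
        g = dF(x)
        # Stop when the first derivative is numerically zero.
        if abs(g) < tol:
            break
        # Newton step: rescale the gradient by the inverse second derivative.
        x = x - g / d2F(x)
    return x


# Hypothetical objective F(x) = exp(x) - 2x, with F'(x) = exp(x) - 2 and F''(x) = exp(x).
x_star = newton_minimize(lambda x: math.exp(x) - 2.0, lambda x: math.exp(x), 0.0)
```

Without safeguards, Newton's method can diverge or converge to a maximum or saddle point when $F''$ is not positive, so practical implementations usually add a line search or a trust region.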


Limited-memory BFGS:
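A rough sketch of the idea: instead of storing a full inverse-Hessian approximation, keep only the last `m` pairs of steps $s_k = x_{k+1} - x_k$ and gradient changes $y_k = \nabla F(x_{k+1}) - \nabla F(x_k)$, and apply the approximation to the gradient with the two-loop recursion. The backtracking (Armijo) line search, the memory size `m`, and the test objective are simplifying assumptions for illustration; production implementations typically use a Wolfe line search.

```python
from collections import deque


def lbfgs(f, grad, x0, m=5, tol=1e-8, max_iter=200):
    """Illustrative L-BFGS with a backtracking line search (not a tuned implementation)."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = list(x0)
    g = grad(x)
    history = deque(maxlen=m)  # most recent (s, y) pairs, oldest first
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        # Two-loop recursion: compute q ~ H * g without forming H.
        q = list(g)
        alphas = []
        for s, y in reversed(history):  # newest pair first
            a = dot(s, q) / dot(y, s)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if history:
            s, y = history[-1]
            gamma = dot(s, y) / dot(y, y)  # scaling of the initial approximation
            q = [gamma * qi for qi in q]
        for (s, y), a in zip(history, reversed(alphas)):  # oldest pair first
            beta = dot(y, q) / dot(y, s)
            q = [qi + (a - beta) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]  # search direction

        # Backtracking line search enforcing the Armijo sufficient-decrease condition.
        t, fx, slope = 1.0, f(x), dot(g, d)
        while f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x_new = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        # Store the pair only if the curvature condition s^T y > 0 holds.
        if dot(s, y) > 1e-12:
            history.append((s, y))
        x, g = x_new, g_new
    return x


# Hypothetical objective: F(x, y) = (x - 1)^2 + 10 * (y + 2)^2, minimized at (1, -2).
def F(p):
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2


def grad_F(p):
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]


x_lbfgs = lbfgs(F, grad_F, [0.0, 0.0])
```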

Reading