CS 451 Quiz 14
Large scale machine learning; SVM cost function
What is a typical training set size for a modern "large dataset"?
m = 200,000
m = 5,000,000
m = 100,000,000
How can you tell that training with a large dataset will give better performance than training with just a small subset (m = 1000) of the data?
If the learning curves (training and validation costs as a function of m) indicate high bias for small m
If the learning curves (training and validation costs as a function of m) indicate high variance for small m
Batch gradient descent means to make a single gradient descent step after looking at
one training example
several/many training examples
all training examples
For large training sets, stochastic gradient descent can be much faster than batch gradient descent
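As a study sketch of the distinction above, here is what stochastic gradient descent can look like for linear regression (an illustrative model choice; the quiz does not fix one) — note that the parameters are updated after every single example, rather than after a full pass over the data:

```python
import numpy as np

def sgd_epoch(X, y, theta, alpha):
    """One epoch of stochastic gradient descent for linear regression:
    a parameter update after each individual training example."""
    m = X.shape[0]
    for i in np.random.permutation(m):
        error = X[i] @ theta - y[i]           # prediction error on one example
        theta = theta - alpha * error * X[i]  # step using only this example
    return theta

# Tiny synthetic dataset with y = 2*x (intercept column included)
X = np.c_[np.ones(50), np.linspace(0.0, 1.0, 50)]
y = 2.0 * X[:, 1]
theta = np.zeros(2)
for _ in range(200):
    theta = sgd_epoch(X, y, theta, alpha=0.1)
```

Batch gradient descent would instead compute the gradient over all m examples before taking one step; for large m, the many cheap SGD steps usually make progress much sooner.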
In mini-batch gradient descent, a typical choice of the mini-batch-size b is
b = 10
b = sqrt(m)
b = m/10
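As a sketch of the mini-batch variant (again using linear regression as an illustrative model), each update averages the gradient over b examples — here b = 10, the typical fixed small batch size:

```python
import numpy as np

def minibatch_epoch(X, y, theta, alpha=0.1, b=10):
    """One epoch of mini-batch gradient descent:
    one parameter update per batch of b examples."""
    m = X.shape[0]
    idx = np.random.permutation(m)
    for start in range(0, m, b):
        batch = idx[start:start + b]
        # Average gradient over just this mini-batch
        grad = X[batch].T @ (X[batch] @ theta - y[batch]) / len(batch)
        theta = theta - alpha * grad
    return theta

# Synthetic data with y = 3*x + 1 (intercept column included)
X = np.c_[np.ones(100), np.linspace(-1.0, 1.0, 100)]
y = 3.0 * X[:, 1] + 1.0
theta = np.zeros(2)
for _ in range(300):
    theta = minibatch_epoch(X, y, theta)
```

With b = 1 this reduces to stochastic gradient descent, and with b = m to batch gradient descent.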
In order to check stochastic gradient descent for convergence, we can compute the average of, say, the last 1000 cost values. For each training example, the cost value should be computed
before making the gradient descent step
after making the gradient descent step
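The monitoring scheme the question describes can be sketched as follows (linear regression again as an illustrative model): record the cost on each example before updating on it, and average the last 1000 recorded costs to track convergence:

```python
import numpy as np

def sgd_with_monitoring(X, y, theta, alpha=0.05, window=1000):
    """SGD where each example's cost is recorded BEFORE the update on
    that example, then averaged over the last `window` costs."""
    costs, averages = [], []
    for i in np.random.permutation(X.shape[0]):
        error = X[i] @ theta - y[i]
        costs.append(0.5 * error ** 2)        # cost computed before the step
        theta = theta - alpha * error * X[i]  # then the gradient descent step
        if len(costs) % window == 0:
            averages.append(float(np.mean(costs[-window:])))
    return theta, averages

rng = np.random.default_rng(0)
X = np.c_[np.ones(3000), rng.uniform(-1.0, 1.0, 3000)]
y = 2.0 * X[:, 1]
theta, averages = sgd_with_monitoring(X, y, np.zeros(2))
```

If SGD is converging, the sequence of windowed averages should trend downward; computing the cost after the update would bias the estimate, since the step was taken specifically to reduce that example's cost.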
In order for stochastic gradient descent to converge, it can be a good idea to decrease the learning rate with the number of iterations.
The cost function used in SVMs is similar to the cost function in logistic regression, but instead of using the log function
it is piecewise linear
it is quadratic
it uses the tan function
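In the usual formulation of the SVM cost, the log terms of logistic regression are replaced by piecewise-linear "hinge" pieces, commonly written cost1(z) for y = 1 and cost0(z) for y = 0:

```python
def cost1(z):
    """SVM cost for a positive example (y = 1):
    linear for z < 1, exactly zero once z >= 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """SVM cost for a negative example (y = 0):
    linear for z > -1, exactly zero once z <= -1."""
    return max(0.0, 1.0 + z)
```

Unlike the log-based cost, which only approaches zero asymptotically, these pieces are exactly zero beyond the margin (z >= 1 or z <= -1), which is what makes the SVM a large-margin classifier.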
In logistic regression, we have a parameter lambda controlling the amount of regularization. In SVMs, what do we have instead?
A parameter C that acts like lambda
A parameter C that acts like 1 / lambda
SVMs don't require regularization
SVM stands for
System Validation Model
Singular Value Manipulation
Special Value Machine
Support Vector Manipulation
Support Vector Machine
Student Volunteer Movement
Space Vector Modulation