CS 451 Quiz 16
Large scale machine learning
In mini-batch gradient descent, a typical choice of the mini-batch-size b is *
What is a typical training set size for a modern "large dataset"? *
Non-linearly-separable data can be handled with a linear classifier if it is first mapped to a higher-dimensional feature space *
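As a reminder of the idea behind this statement, here is a minimal sketch (using made-up toy data) of the classic 1-D example: points labelled by whether |x| > 1 are not linearly separable on the line, but after the feature map x → (x, x²) a single linear threshold separates them.

```python
import numpy as np

# Toy 1-D data set (assumed for illustration): label is 1 when |x| > 1.
# No single threshold on x alone separates the two classes.
x = np.array([-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0])
y = (np.abs(x) > 1).astype(int)

# Map each point into a higher-dimensional feature space: x -> (x, x^2).
phi = np.column_stack([x, x**2])

# In the mapped space the LINEAR rule "x^2 > 1" classifies perfectly.
pred = (phi[:, 1] > 1).astype(int)
```

Here `pred` matches `y` exactly, even though no linear classifier on the original 1-D data could.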
In order to check stochastic gradient descent for convergence, we can compute the average of, say, the last 1000 cost values. For each training example, the cost value should be computed *
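The procedure behind this question can be sketched as follows (a minimal linear-regression example; the function and parameter names are assumptions, not part of the quiz). The key point is that each example's cost is logged *before* the parameter update that uses it:

```python
import numpy as np

def sgd_with_convergence_check(X, y, theta, alpha=0.1, window=1000):
    """Stochastic gradient descent for linear regression, recording each
    per-example cost BEFORE updating theta with that example, so the
    running average of the last `window` costs can be plotted to check
    convergence. (Illustrative sketch; names are assumptions.)"""
    costs, avg_costs = [], []
    for i in range(len(y)):
        x_i, y_i = X[i], y[i]
        err = x_i @ theta - y_i
        costs.append(0.5 * err**2)       # cost computed pre-update
        theta = theta - alpha * err * x_i  # single-example gradient step
        if (i + 1) % window == 0:
            avg_costs.append(np.mean(costs[-window:]))
    return theta, avg_costs
```

Plotting `avg_costs` against the number of examples seen gives the convergence check described in the question: the averaged curve should trend downward as training proceeds.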
How can you tell that training with a large data set will give better performance than when training with just a small subset (m = 1000) of the data? *
Batch gradient descent means to make a single gradient descent step after looking at *
In order for stochastic gradient descent to converge, it can be a good idea to decrease the learning rate with the number of iterations. *
For large training sets, stochastic gradient descent can be much faster than batch gradient descent *
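The contrast behind these two statements can be sketched as follows (a linear-regression toy; the decay schedule α = α₀·t₀/(t₀ + t) and all names are assumptions). Batch gradient descent sums over all m examples before moving once; stochastic gradient descent moves after every single example, with a slowly decaying learning rate to help it settle rather than oscillate around the minimum:

```python
import numpy as np

def batch_gd_step(X, y, theta, alpha):
    """One batch gradient descent step: the gradient is averaged over
    ALL m examples before theta moves once."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m
    return theta - alpha * grad

def sgd_epoch(X, y, theta, alpha0=0.1, t0=100.0):
    """One pass of stochastic gradient descent: theta moves after every
    single example. The learning rate decays with the iteration count,
    alpha = alpha0 * t0 / (t0 + t), which helps SGD converge instead of
    wandering around the minimum. (Decay schedule is an assumption.)"""
    for t in range(len(y)):
        alpha = alpha0 * t0 / (t0 + t)
        err = X[t] @ theta - y[t]
        theta = theta - alpha * err * X[t]
    return theta
```

For large m, one `batch_gd_step` costs a full pass over the data for a single update, while `sgd_epoch` makes m updates in the same pass, which is why SGD can be much faster on large training sets.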
K nearest neighbors is an algorithm for *
The "Kernel trick" refers to *