CS 451 Quiz 10
Neural network implementation and application
Where does the concept of "symmetry breaking" appear in neural network learning?
When initializing the theta values
In parameter unrolling
In gradient checking
When computing training set accuracy
Suppose you train two identical neural nets using the same training set. Each net has 4 layers with 10 nodes each. The only difference is that network 1 starts with theta initialized to all zeros and network 2 starts with a random theta. What will happen to the accuracy after, say, 1000 iterations?
Both will end up with the same accuracy
Net 1 will have higher accuracy
Net 2 will have higher accuracy
There is not enough information to tell
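For reference, the symmetry problem behind this question can be sketched in Python/NumPy (the course uses Octave; this is an illustrative translation with hypothetical layer sizes). With all-zero initialization, every hidden unit receives identical gradients, so their weights stay identical no matter how long you train:

```python
import numpy as np

# Tiny 2-layer net: 3 inputs -> 2 hidden (sigmoid) -> 1 output.
# Dimensions are hypothetical, chosen only to illustrate the effect.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_step(Theta1, Theta2, x, y, lr=0.5):
    # Forward pass
    a1 = np.append(1.0, x)              # add bias unit
    z2 = Theta1 @ a1
    a2 = np.append(1.0, sigmoid(z2))    # add bias unit
    z3 = Theta2 @ a2
    a3 = sigmoid(z3)
    # Backward pass (cross-entropy loss with sigmoid output)
    d3 = a3 - y
    d2 = (Theta2[:, 1:].T @ d3) * sigmoid(z2) * (1 - sigmoid(z2))
    Theta2 -= lr * np.outer(d3, a2)
    Theta1 -= lr * np.outer(d2, a1)
    return Theta1, Theta2

x, y = np.array([1.0, -2.0, 0.5]), np.array([1.0])

# Net 1: all-zero initialization
T1, T2 = np.zeros((2, 4)), np.zeros((1, 3))
for _ in range(100):
    T1, T2 = grad_step(T1, T2, x, y)

print(np.allclose(T1[0], T1[1]))  # True: the two hidden units never differ
```

Random initialization breaks this symmetry, which is why the randomly initialized net can learn distinct hidden features.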
What's a good value for epsilon in gradient checking?
Should gradient checking be turned off before training your classifier?
Yes, because it will affect numerical stability
Yes, because it slows things down too much
No, otherwise you cannot see whether gradient descent converges
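A quick sketch of why gradient checking is too slow to leave on (Python/NumPy stand-in for the Octave version; the quadratic cost is hypothetical): it needs two full cost evaluations per parameter, and in a real net each cost evaluation is a forward pass over the whole training set.

```python
import numpy as np

calls = 0

def J(theta):
    # Hypothetical quadratic cost, just to count evaluations;
    # in a real net this would be a full forward pass over the data.
    global calls
    calls += 1
    return np.sum(theta ** 2)

theta = np.random.randn(50)
eps = 1e-4
# Centered difference for each parameter: (J(theta+e) - J(theta-e)) / (2*eps)
grad = np.array([(J(theta + e) - J(theta - e)) / (2 * eps)
                 for e in np.eye(50) * eps])

print(calls)  # 100 cost evaluations: 2 per parameter
```

The numerical gradient matches the analytic gradient (here 2*theta), which is the whole point of the check; once backprop is verified, the loop above is switched off.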
If you have two Theta matrices of size 10x11 and 2x11, unrolling parameters yields
an 11x2 matrix
a 110x22 matrix
a 34x1 vector
a 110x1 vector
a 132x1 vector
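The arithmetic behind this question can be checked directly: 10*11 + 2*11 = 110 + 22 = 132 entries in the unrolled vector. A NumPy sketch (the course does this in Octave; the matrix contents here are arbitrary placeholders):

```python
import numpy as np

# Theta1 is 10x11 and Theta2 is 2x11; unrolling stacks all entries
# into a single long vector of 110 + 22 = 132 elements.
Theta1 = np.arange(110).reshape(10, 11)
Theta2 = np.arange(22).reshape(2, 11)

params = np.concatenate([Theta1.ravel(), Theta2.ravel()])
print(params.shape)  # (132,)

# "Rolling back up" recovers the original matrices for backprop:
T1 = params[:110].reshape(10, 11)
T2 = params[110:].reshape(2, 11)
print(np.array_equal(T1, Theta1) and np.array_equal(T2, Theta2))  # True
```

(Octave's unrolling is column-major while ravel() here is row-major; the round trip is consistent either way as long as unroll and reshape agree.)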
In the autonomous driving example shown in the video, a neural net is given input images and learns to
control the speed of the car
analyze traffic signs
output a steering angle
Unlike for logistic regression, the cost function J for multi-layer neural nets is non-convex and thus gradient descent can end up in a local minimum
When deciding on a network architecture, (check all that apply)
The number of input units should match the number of features
The number of output units should match the number of classes
The total number of hidden units (in all layers combined) should not exceed the number of input units
You compute (J(x+e) - J(x-e))/(2*e) if you want to do
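The centered difference in this question is preferred over the one-sided version (J(x+e) - J(x))/e because its error shrinks like e^2 rather than e. A small check on a hypothetical function f(x) = x^3 at x = 2, where the true derivative is 12:

```python
# One-sided vs centered difference on f(x) = x^3 at x = 2 (f'(2) = 12).
f = lambda x: x ** 3
x, eps = 2.0, 1e-4

one_sided = (f(x + eps) - f(x)) / eps
centered = (f(x + eps) - f(x - eps)) / (2 * eps)

print(abs(one_sided - 12.0))  # error on the order of eps
print(abs(centered - 12.0))   # error on the order of eps**2, far smaller
```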
Which Octave operation unrolls a 5x5 matrix M into a vector v?
v = M(:)
v = reshape(M, 5, 5)
v = vstack(M)
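For readers working in Python rather than Octave: the NumPy analogue of Octave's column-major `M(:)` is `ravel(order='F')` (a plain `ravel()` would stack rows instead of columns). A minimal sketch:

```python
import numpy as np

# Octave's v = M(:) stacks the columns of M into one long vector.
# The NumPy equivalent uses Fortran (column-major) order.
M = np.arange(25).reshape(5, 5)
v = M.ravel(order='F')

print(v.shape)  # (25,)
print(v[:5])    # the first column of M: [ 0  5 10 15 20]
```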