DL in NLP 2020. Spring. Quiz 6
Some questions may not be covered explicitly in the lecture, but you can still use logic and Google.
Credit for some questions: cs231n.stanford.edu
Email address *
GitHub account *
Why is the vanishing gradient a problem?
Give an example of a sentence with a long-distance dependency where relying on short-term dependencies leads to a wrong answer. You may use English or Russian.
How can the exploding gradient problem be detected?
What is the optimal choice for h in the gradient clipping formula? http://proceedings.mlr.press/v28/pascanu13.pdf
Suppose the reset and update gate values are scalars, not vectors. Select the situations in which the previous hidden state of the GRU cannot affect the current one, whatever its value.
Consider the LSTM and GRU formulas given in lecture. What is the possible range for the values of the hidden state components?
When is an LSTM a good default choice?
Select potential solutions to the vanishing/exploding gradient problem in other networks.
A direct connection without additional weights between the first and the last layers of the network is called
Your questions about the lecture (if any)