CS 451 Quiz 23
Anomaly detection and multivariate Gaussians
When evaluating an anomaly detection system, we have no positive examples in the training set, but a small number of positive examples in each of the cross-validation (CV) and test sets.
Is classification accuracy a good way to measure the performance of an anomaly detection system?
Yes, because we have labels in CV / test sets
No, because we do not have labels in CV / test sets
No, because of skewed classes
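As an illustration of how accuracy behaves when positives are rare, here is a minimal sketch. The CV set (990 normal examples, 10 anomalies) and the "never flag anything" detector are hypothetical, made up for this example only.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of true anomalies (label 1) that were flagged."""
    positives = sum(1 for t in y_true if t == 1)
    caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return caught / positives if positives else 0.0

# Hypothetical CV set: 990 normal examples (0) and 10 anomalies (1).
y_cv = [0] * 990 + [1] * 10
# A useless detector that never flags anything.
y_pred = [0] * len(y_cv)

print("accuracy:", accuracy(y_cv, y_pred))
print("recall:", recall(y_cv, y_pred))
```

The detector looks excellent by accuracy even though it catches none of the anomalies, which is why precision, recall, or F1 are usually reported instead.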
In anomaly detection, the decision boundary depends on the parameter epsilon, which we can set using the cross-validation set.
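One way this choice of epsilon could be made is sketched below: scan candidate thresholds and keep the one with the best F1 score on the CV set. The function names (`f1_score`, `select_epsilon`), the number of scan steps, and the CV values are assumptions for illustration; an example is flagged as anomalous when p(x) < epsilon.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive (anomalous) class; 0.0 when nothing is caught."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(p_cv, y_cv, num_steps=1000):
    """Scan thresholds between min(p) and max(p); return (best_eps, best_f1)."""
    lo, hi = min(p_cv), max(p_cv)
    step = (hi - lo) / num_steps
    best_eps, best_f1 = lo, 0.0
    for i in range(1, num_steps + 1):
        eps = lo + i * step
        preds = [1 if p < eps else 0 for p in p_cv]  # flag low-density points
        score = f1_score(y_cv, preds)
        if score > best_f1:  # strict '>' keeps the smallest epsilon on ties
            best_eps, best_f1 = eps, score
    return best_eps, best_f1

# Hypothetical densities p(x) and labels for a tiny CV set.
p_cv = [0.30, 0.25, 0.28, 0.33, 0.27, 0.001, 0.002, 0.31]
y_cv = [0, 0, 0, 0, 0, 1, 1, 0]
best_eps, best_f1 = select_epsilon(p_cv, y_cv)
print("epsilon:", best_eps, "F1:", best_f1)
```

F1 is used here rather than accuracy because the CV set contains very few positives.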
In which of the following situations is it better to use anomaly detection as opposed to supervised learning?
When we have similar numbers of positive and negative training examples
When there are few positive examples
When the positive examples have little in common
In the video, Andrew Ng proposes replacing a feature x with functions like log(x + c) or x^c in order to make it "more Gaussian". What does he suggest computing to guide this process?
The mean and variance of the transformed feature
The histogram of the transformed feature
The PCA of the transformed feature
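To make the transformation concrete, the sketch below applies log(x + c) to a synthetic right-skewed feature and prints a rough text histogram before and after. The data (log-normal samples), the constant c = 0, the bin count, and the helper names are all assumptions for illustration; skewness is reported only as a rough numeric check of symmetry.

```python
import math
import random

def histogram(values, num_bins=10):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width for constant data
    counts = [0] * num_bins
    for v in values:
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts

def skewness(values):
    """Sample skewness; close to 0 for a symmetric, Gaussian-like shape."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5

def show(name, values):
    """Print a crude bar chart, one row per bin."""
    print(name, "skewness = %.2f" % skewness(values))
    for count in histogram(values):
        print("  " + "#" * (count // 20))

random.seed(0)
raw = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]  # synthetic feature
c = 0.0  # assumed constant; the synthetic data is strictly positive
transformed = [math.log(x + c) for x in raw]

show("raw x:", raw)
show("log(x + c):", transformed)
```

In practice one would try several values of c or several exponents and compare the resulting shapes by eye.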
If an anomaly detection system fails to assign a low value of p for an anomalous event, how could this be addressed? Check all that apply.
Adding a new feature that captures a novel aspect of the training data
Adding a new feature that is the ratio of two existing features
The multivariate Gaussian distribution models the overall probability as the product of the individual distributions p(x1)*p(x2)*...*p(xn).
The covariance matrix models correlations between the features
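For comparison, the sketch below evaluates two densities on a two-feature example: a product of independent univariate Gaussians, and a multivariate Gaussian with a full covariance matrix. The mean, the covariance values (correlation 0.8), the test points, and the function names are assumptions chosen for illustration; the 2x2 inverse is written out by hand to keep the example dependency-free.

```python
import math

def univariate_pdf(x, mu, var):
    """Density of a one-dimensional Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def product_model(x, mu, var):
    """p(x1) * p(x2) * ... * p(xn), treating the features as independent."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= univariate_pdf(xj, mj, vj)
    return p

def multivariate_pdf_2d(x, mu, sigma):
    """Density of a 2-D Gaussian with a full 2x2 covariance matrix sigma."""
    (s00, s01), (s10, s11) = sigma
    det = s00 * s11 - s01 * s10
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form d^T sigma^{-1} d using the closed-form 2x2 inverse.
    quad = (s11 * d0 * d0 - (s01 + s10) * d0 * d1 + s00 * d1 * d1) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

mu = [0.0, 0.0]
variances = [1.0, 1.0]
sigma = [[1.0, 0.8], [0.8, 1.0]]  # assumed: strongly positively correlated

for point in ([1.0, 1.0], [1.0, -1.0]):
    print(point,
          "product:", product_model(point, mu, variances),
          "multivariate:", multivariate_pdf_2d(point, mu, sigma))
```

The point (1, -1) goes against the assumed positive correlation, so the full-covariance density for it is much lower than the product of the per-feature densities, while (1, 1) follows the correlation and scores higher. With a diagonal covariance matrix the two functions agree.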
Aside from anomaly detection, where else did we encounter the covariance matrix?
When is it better to use the original model instead of the multivariate Gaussian model?
When there are correlations between different features
When n is very large
When m is very large