1 of 15

ML and AI

A quick intro

2 of 15

Types of problems in traditional ML/AI

  • Regression
    • Assigning a continuous value to an object
  • Classification
    • Assigning a category or class to an object
  • Clustering
    • Grouping of objects based on some similarity measure.
  • Dimensionality reduction
    • Reducing the number of random variables to use for modelling

3 of 15

Key terms

  • Supervised Learning
    • The training is done on labelled data (ground truth). Typically used for classification/regression
  • Unsupervised Learning
    • Learning the inherent structure of the data without any labelling. Typically used for clustering
  • Model
    • A mathematical representation that takes the input values and produces an output (a value, a category or a cluster).
  • Training (a model)
    • Enabling the model to “learn” its parameters by providing a set of inputs and the corresponding set of expected outputs.
  • Testing (a model)
    • Determining the accuracy of the model by providing a set of “unseen” inputs and obtaining the output
  • Overfitting
    • When the model fits the training data too closely, capturing its noise rather than the underlying pattern. An overfit model scores well on the training data but loses accuracy on unseen data
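The train/test and overfitting ideas can be sketched with a toy polynomial fit in NumPy (the data, noise level, and degrees are illustrative assumptions): a high-degree polynomial passes through every training point but does worse than a straight line on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = 2 * x + rng.normal(0, 0.1, size=10)   # noisy linear ground truth

x_train, y_train = x[:7], y[:7]           # seen during training
x_test, y_test = x[7:], y[7:]             # "unseen" inputs for testing

errors = {}
for degree in (1, 6):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    errors[degree] = (train_err, test_err)

# The degree-6 fit interpolates the 7 training points (near-zero training
# error) but generalizes far worse than the degree-1 line on the test set
```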

4 of 15

Linear Regression

  • Simplest implementation is Ordinary Least Squares

  • Assumes the input variables are independent (uncorrelated)
  • Usually employed in price-prediction type tasks
  • Regularization can be applied to handle correlated variables and to improve numerical stability
  • Example: Smoothing of audio, noise reduction, etc.
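A minimal Ordinary Least Squares sketch with NumPy's least-squares solver (the toy data are an illustrative assumption): stack a bias column with the feature and solve for the weights minimizing the squared residuals.

```python
import numpy as np

# Noise-free toy data generated by y = 1 + 2x
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])          # bias column + one feature
y = np.array([3.0, 5.0, 7.0])

# Ordinary Least Squares: w minimizes ||Xw - y||^2
w, *_ = np.linalg.lstsq(X, y, rcond=None)
# w recovers [intercept, slope] = [1.0, 2.0]
```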

5 of 15

K Nearest Neighbors

  • Not a ‘real’ model but a stored collection of data points
  • When classifying, it takes the set of nearest neighbors of the given data point and uses their classes to determine the result
  • Different definitions of distance are used depending on the task at hand (some weight all features equally while others don’t)
  • Quite a slow algorithm, since no model exists and the calculations are done at classification time
  • Example: Genre classification
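The lookup-and-vote idea can be sketched in a few lines of NumPy (the points, labels, and `k` are illustrative assumptions) — note that all the work happens at prediction time, as the bullets say.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from x to every stored point (nothing is "trained")
    dists = np.linalg.norm(X_train - x, axis=1)
    # Classes of the k nearest neighbors, then a majority vote
    nearest = y_train[np.argsort(dists)[:k]]
    return Counter(nearest.tolist()).most_common(1)[0][0]

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
knn_predict(X, y, np.array([0.95, 1.0]))  # → 1
```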

6 of 15

Support Vector Machines

  • The model is a set of separating hyperplanes
  • The choice of kernel determines the shape of the decision boundary
  • Fitting large datasets exactly is complex and time consuming
  • The regularization parameter controls the tolerance for misclassified points (a margin-vs-error tradeoff)
  • Example: Genre/Artist classification
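A minimal sketch with scikit-learn's `SVC` (assuming scikit-learn is available; the toy points are illustrative): `kernel` picks the boundary shape and `C` is the regularization parameter from the bullet above.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes
X = np.array([[0, 0], [0, 1], [2, 2], [2, 3]], dtype=float)
y = [0, 0, 1, 1]

# C controls error tolerance: small C permits more margin violations;
# a non-linear kernel (e.g. 'rbf') would curve the separating surface
clf = SVC(kernel="linear", C=1.0).fit(X, y)
clf.predict([[2.0, 2.5], [0.0, 0.5]])  # → [1, 0]
```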

7 of 15

Decision Trees

  • Non-parametric
  • Reduces the training data to a simple set of decision rules
  • Easy to visualize
  • The deeper the tree, the more complex it is
  • May create biased trees on unbalanced training data
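A sketch with scikit-learn's decision tree (assuming scikit-learn is available; the one-feature toy data are illustrative) — `export_text` prints the learned rules, which is what makes trees easy to inspect.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: one feature, class flips between values 1 and 2
X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

# max_depth caps tree complexity: deeper trees mean more rules
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
rules = export_text(tree)   # the learned split rules as readable text
```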

8 of 15

K means Clustering

  • Divides the training data into K groups by trying to minimize the “inertia” (the within-cluster sum of squared distances)

  • The number of clusters has to be specified in advance
  • Usually used in conjunction with PCA (Principal Component Analysis) to reduce dimensionality
  • Works well when the clusters are convex and isotropic
  • Example: Instrument classification, structural segmentation
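The PCA-then-cluster pipeline can be sketched with scikit-learn (assuming it is available; the synthetic 5-D blobs and cluster count are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two well-separated blobs in 5 dimensions
X = np.vstack([rng.normal(0, 0.3, (20, 5)),
               rng.normal(3, 0.3, (20, 5))])

# Reduce dimensionality first, then cluster; K must be given up front
X2 = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X2)
```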

9 of 15

Gaussian Mixture Models

  • Probabilistic model
  • Output is a mixture of a finite number of Gaussian (normal) distributions
  • Fastest algorithm for mixture models
  • Uses expectation maximization
  • Insufficient data per component may lead to singularities
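A minimal sketch with scikit-learn's `GaussianMixture` (assuming scikit-learn is available; the two synthetic components are illustrative): EM fits the component means, covariances, and weights.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# 1-D data drawn from two Gaussians centred at -3 and +3
X = np.concatenate([rng.normal(-3, 0.5, 200),
                    rng.normal(3, 0.5, 200)]).reshape(-1, 1)

# Expectation maximization recovers the two component means
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
sorted(gmm.means_.ravel())  # ≈ [-3, 3]
```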

10 of 15

Neural networks

  • Basic unit of computation is a neuron
  • It receives inputs and each input may be weighted
  • The node applies an activation function to the weighted inputs and produces an output
  • Activation functions are usually non-linear
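A single neuron is just a weighted sum followed by a non-linearity; a minimal NumPy sketch (the sigmoid activation and toy weights are illustrative choices):

```python
import numpy as np

def neuron(x, w, b):
    # Weighted sum of inputs plus bias, then a non-linear activation
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation

# w·x + b = 0.5*1 - 0.25*2 + 0 = 0, and sigmoid(0) = 0.5
neuron(np.array([1.0, 2.0]), np.array([0.5, -0.25]), 0.0)  # → 0.5
```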

11 of 15

Convolutional Neural networks

  1. Convolution
  2. Non-linearity (ReLU)
  3. Pooling or sub-sampling
  4. Classification (Fully Connected Layer)
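Steps 1–3 can be sketched in plain NumPy (the 4×4 image and edge-detector kernel are illustrative assumptions; a real CNN learns its kernels):

```python
import numpy as np

def conv2d(img, kernel):
    # 1. Convolution: slide the kernel over the image (valid padding)
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    # 2. Non-linearity: clamp negatives to zero
    return np.maximum(x, 0)

def max_pool(x, size=2):
    # 3. Pooling: keep the maximum of each size x size patch
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])    # responds to left-to-right gradients
feat = max_pool(relu(conv2d(img, kernel)))
# Step 4 would flatten `feat` into a fully connected classification layer
```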

12 of 15

Recurrent neural networks

  • RNNs allow information to persist
  • Multiple copies of the same network, each passing a message to its successor
  • Example: Time-series classification
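The "same network, message passed to its successor" idea is a loop over timesteps reusing one set of weights; a minimal NumPy sketch (shapes and random weights are illustrative):

```python
import numpy as np

def rnn(inputs, Wx, Wh, b):
    # The same weights are applied at every timestep; the hidden state h
    # is the "message" passed forward, letting information persist
    h = np.zeros(Wh.shape[0])
    for x in inputs:
        h = np.tanh(Wx @ x + Wh @ h + b)
    return h

rng = np.random.default_rng(0)
Wx = rng.normal(size=(3, 2))        # input-to-hidden weights
Wh = rng.normal(size=(3, 3))        # hidden-to-hidden (recurrent) weights
h = rnn(rng.normal(size=(5, 2)), Wx, Wh, np.zeros(3))
# h is the final hidden state after a 5-step sequence, shape (3,)
```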

13 of 15

Long short term memory (LSTM)

  • Special RNNs capable of learning long-term dependencies
  • A cell state is maintained that learns what to retain and what to output
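One LSTM step can be sketched in NumPy to show the gating around the cell state (the stacked-weight layout, sizes, and random parameters are illustrative assumptions, not a particular library's convention):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    # One timestep. The gates decide what the cell state c forgets (f),
    # what new information it stores (i, g), and what it outputs (o)
    z = W @ np.concatenate([h, x]) + b
    f, i, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # update cell state
    h = sigmoid(o) * np.tanh(c)                    # expose part of it
    return h, c

rng = np.random.default_rng(0)
H, D = 4, 3                          # hidden size, input size
W = rng.normal(size=(4 * H, H + D))  # all four gates stacked in one matrix
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(6, D)):    # run the same cell over a sequence
    h, c = lstm_step(x, h, c, W, b)
```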

14 of 15

Generative Adversarial networks (GAN)

  • Generative models - able to produce new content
  • Involves a generator and a discriminator
  • The generator and discriminator are adversaries and hence the name Adversarial Network

15 of 15

Other topics

  • Reinforcement learning
  • Genetic algorithms