1 of 45

A Brief Introduction to AI

2 of 45

Who am I?

Simon Bernhard

  • Teaching myself AI for the past 5+ years

  • Currently completing a Master's in Computer Science at Schaffhausen Institute of Technology

  • Working as a PLC developer at Bibliotheca

3 of 45

Some of my recent AI projects

  • Predicting the stock market
  • Style transfer
  • Image generation using GANs and Diffusion
  • Music production using symbolic AI

4 of 45

What can AI even do?

5 of 45

6 of 45

7 of 45

8 of 45

The history of AI

  • The pursuit of AI developed concurrently with Computer Science

  • Connectionism vs. Symbolic AI
    • Symbolic AI
      • Build AI systems based on rule-based manipulation of symbols
      • Heuristic searching, Expert systems
    • Connectionism
      • Build AI systems based on networks of simple interconnected nodes
      • Artificial neural nets

9 of 45

The history of AI

  • First AI winter 1967–1977
  • Rise of Expert Systems
    • Concurrently, research continued on Connectionist approaches
  • Second AI winter 1988–1993
  • 2012 onwards - Machine Learning Golden Age

10 of 45

Symbolic Artificial Intelligence

Did it fail?

It did lead to two AI winters, but NO.

It gave us amazingly useful things like:

  • Garbage collection
  • Dynamic typing
  • Higher-order functions
  • Recursion
  • Conditionals

11 of 45

AI besides Machine Learning

12 of 45

Support Vector Machines

  • Draws a hyperplane which divides the data
  • Chooses the dividing hyperplane which maximizes the margin: the distance to the nearest points
  • Can be used for both classification and regression
  • There are both linear kernels and non-linear kernels
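The decision rule of a trained linear SVM is just the sign of w·x + b. A minimal sketch, with hand-picked weights standing in for a trained model (a real SVM solver would learn w and b from data):

```python
def svm_predict(w, b, x):
    """Classify a point by which side of the hyperplane it falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# The hyperplane x0 + x1 - 3 = 0 separates the two toy points below.
w, b = [1.0, 1.0], -3.0
print(svm_predict(w, b, [2.5, 2.5]))  # 1  (above the line)
print(svm_predict(w, b, [0.5, 0.5]))  # -1 (below the line)
```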

13 of 45

Clustering

  • Clustering algorithms group data based on certain criteria
  • Used to discover structure in unlabeled data
  • There are a lot of different algorithms and different implementations
    • Which implementation is best depends on what the data looks like and what the end goal is
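Centroid-based clustering is the easiest to sketch. A minimal 1-D K-means in plain Python, alternating the assignment and centroid-update steps (real implementations handle multiple dimensions and smarter initialization):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal 1-D K-means: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups, around 1 and around 10.
data = [0.9, 1.0, 1.1, 9.9, 10.0, 10.1]
print(kmeans(data, 2))
```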

14 of 45

Clustering

Centroid-based Clustering (K-means)

Density-based Clustering

Distribution-based Clustering

Hierarchical Clustering

15 of 45

Decision trees

  • Decision trees progressively split the data into smaller and smaller groups until it can’t usefully be split any further

  • There are several criteria for choosing the splits:
    • Positive Correctness
    • Gini impurity
    • Information gain

  • Trees are easy to interpret (white box)
  • Trees can work on diverse types of data
  • Trees are also great candidates for ensemble learning
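Gini impurity, one of the split criteria above, is simple to compute: the probability of mislabeling a randomly drawn item if it were labeled according to the node's class distribution. A sketch:

```python
def gini(labels):
    """Gini impurity of a node: 1 - sum of squared class probabilities."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# A pure node has impurity 0; a 50/50 split has the two-class maximum.
print(gini(["a", "a", "a", "a"]))  # 0.0
print(gini(["a", "a", "b", "b"]))  # 0.5
```

A tree builder would pick the split whose child nodes have the lowest weighted impurity.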

16 of 45

Ensemble learning (Bagging, Boosting)

  • Ensemble learning is essentially running multiple instances of an algorithm and taking the combined answer.
  • Bagging
    • Each instance trains on its own bootstrap sample: a random draw from the original dataset, with replacement, of the same size as the original
    • The answers of the instances are averaged (or majority-voted)
  • Boosting
    • Similar to Bagging, but instead of sampling randomly, later instances weight the data toward examples misclassified by earlier instances
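The two ingredients of bagging, bootstrap sampling and vote combining, can be sketched in a few lines. The three "models" here are simple thresholds standing in for trained tree instances:

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Bagging's resampling step: draw len(data) items with replacement."""
    return [rng.choice(data) for _ in range(len(data))]

def bagged_predict(models, x):
    """Combine an ensemble by majority vote over each model's prediction."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
print(bootstrap([1, 2, 3, 4], rng))  # same size, duplicates allowed

# Toy "models": thresholds standing in for trained instances.
models = [lambda x: x > 4, lambda x: x > 5, lambda x: x > 6]
print(bagged_predict(models, 5.5))  # True: two of three vote True
```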

17 of 45

Ensemble learning (Boosting)

18 of 45

Machine Learning

19 of 45

What is a Neural Net?

  • An input layer
  • A hidden layer
  • An output layer
  • Between each pair of layers is a matrix of weights
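The layers above amount to a forward pass: each layer is a weighted sum of the previous one, passed through an activation function. A sketch with arbitrary example weights (2 inputs, 2 hidden units, 1 output):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """One forward pass: input -> hidden -> output, with a weight
    matrix between each pair of layers."""
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)))
              for row in w_hidden]
    return sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)))

w_hidden = [[0.5, -0.2], [0.3, 0.8]]  # arbitrary example weights
w_out = [1.0, -1.0]
print(forward([1.0, 2.0], w_hidden, w_out))  # a value between 0 and 1
```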

20 of 45

Deep Neural Nets

  • A neural net with more hidden layers

21 of 45

How does a Neural Net learn?

Backpropagation!

  1. The neural net makes a prediction on the data
  2. A loss function converts the error in the prediction into a loss value
  3. This loss is sent to an optimizer, which calculates the changes to the weights needed to minimize the loss
  4. These changes are then propagated backwards along the gradients of the neural net, correcting the weights

  • The optimizer will not completely correct the loss, but only take a small step towards improving it
  • The end goal is to slowly move towards the optimal set of weights
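The "small step" idea is easiest to see with a single weight. For the toy loss L(w) = (w - 3)², the gradient is dL/dw = 2(w - 3), and repeated small steps converge toward the optimum:

```python
# Gradient descent on one weight: the optimizer takes small steps
# down the loss gradient rather than jumping straight to the answer.
def train(w=0.0, lr=0.1, steps=50):
    for _ in range(steps):
        grad = 2 * (w - 3)  # gradient of the loss L(w) = (w - 3)^2
        w -= lr * grad      # small optimizer step, not a full correction
    return w

print(train())  # converges close to the optimal weight, w = 3
```

Real optimizers (SGD, Adam, ...) do the same thing simultaneously for millions of weights, with the gradients supplied by backpropagation.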

22 of 45

What are the problems with DNNs?

  • Training is difficult
    • Lots of connections means lots of GPU memory and processing power is necessary
    • Lots of data is needed
    • Could overfit or underfit
    • Lots of weights may not be used properly
  • There are solutions to some of the problems
    • Tuning hyperparameters
    • Dropout
    • Different types of networks
    • Creating train, test, validation sets of the data

23 of 45

Convolutional Neural Nets (CNNs)

  • Very popular for networks dealing with images
  • A fully connected DNN would connect every pixel of an image to every unit of the next layer
    • This leads to a massive and cumbersome network
  • A CNN instead learns small convolutional filters that slide over the image
    • Smaller network
    • Easier to train
  • CNNs were among the first neural networks to outperform humans on certain vision tasks
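The sliding-filter idea in one dimension: instead of a weight per input, a small kernel is reused at every position. A sketch of a valid (no padding) 1-D convolution:

```python
def conv1d(signal, kernel):
    """Slide a small kernel over the input instead of connecting
    every input to every output (valid convolution, no padding)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A difference kernel responds strongly where the signal jumps:
# an edge detector in miniature.
print(conv1d([0, 0, 0, 1, 1, 1], [-1, 1]))  # [0, 0, 1, 0, 0]
```

A CNN learns the kernel values during training; 2-D image convolutions work the same way with small patches instead of pairs.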

24 of 45

Convolutional Neural Nets (CNNs)

25 of 45

Convolutional Neural Nets (CNNs)

26 of 45

Recurrent Neural Nets (RNNs)

  • In sequential data, the sequence itself is often more important than the data at each point
  • RNNs address this by providing the network with a kind of memory
  • This produced excellent results for things like text processing and numeric regression
  • Training can be difficult because of vanishing gradients
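The "memory" is a hidden state that each step mixes with the current input. A scalar sketch with arbitrary example weights, showing that the final state depends on the order of the inputs, not just their values:

```python
import math

def rnn_step(x, h, w_in, w_rec):
    """One RNN step: the new hidden state mixes the current input
    with the previous hidden state -- the network's 'memory'."""
    return math.tanh(w_in * x + w_rec * h)

def run(sequence, w_in=0.5, w_rec=0.9):
    h = 0.0  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec)
    return h

# Same inputs in a different order give a different final state.
print(run([1.0, 0.0, 0.0]))
print(run([0.0, 0.0, 1.0]))
```

Repeated multiplication by the recurrent weight is also where the vanishing gradient comes from: with |w_rec| < 1, the influence of early inputs shrinks at every step.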

27 of 45

Long Short Term Memory Networks (LSTMs)

  • To address the shortcomings of RNNs, LSTMs were introduced
  • LSTMs use a series of gates to decide what to store
  • This allows for much longer memory
  • The downside is added complexity

28 of 45

Attention Networks

  • Attention networks aim to simulate how human attention works
  • The network looks at the entire input at once and focuses on areas of interest
  • The areas of interest are weighted more heavily by the network
  • Attention networks can be used for all sorts of applications, from images to understanding text
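The weighting is usually scaled dot-product attention: a query is scored against every key, the scores become weights via softmax, and the output is the weighted average of the values. A 1-D toy sketch:

```python
import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention over scalar keys/values: the
    scores decide how much to 'focus' on each position."""
    scale = math.sqrt(1)  # key dimension is 1 in this toy example
    weights = softmax([query * k / scale for k in keys])
    return sum(w * v for w, v in zip(weights, values))

# The query matches the second key most strongly, so the output
# is pulled toward the second value (20.0).
print(attention(2.0, keys=[0.0, 2.0, -2.0], values=[10.0, 20.0, 30.0]))
```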

29 of 45

Attention Networks

30 of 45

Transformers

  • Transformers are a type of neural network architecture that can be made up of different kinds of layers
  • They are defined by the idea that there is an input (encoder) network and an output (decoder) network
  • The input network transforms the data into some form that the output network can use

31 of 45

32 of 45

Autoencoder

33 of 45

Generative Adversarial Networks (GANs)

  • A GAN is really two networks
    • A generator
    • A discriminator
  • The generator generates content
  • The discriminator judges whether the content is real or was generated by the generator
  • Can be difficult to train because you are training two networks and they have to be balanced

34 of 45

Diffusion Learning

  • The most recent AI hotshot
  • The network is trained to generate content from noise
  • To train, noise is gradually added to an image
  • A neural net is then trained to convert the noise back into the image
  • Once trained, we can feed it pure noise or an existing image
  • Most popular for generating images and upscaling images
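The training-side ("forward") process is just repeated noising. A sketch where the "image" is a short list of pixel values; a real diffusion model would then be trained to undo each of these steps:

```python
import random

def add_noise(image, steps, sigma=0.1, seed=0):
    """Forward diffusion: gradually corrupt an 'image' (a list of
    pixel values) with Gaussian noise, keeping every intermediate."""
    rng = random.Random(seed)
    noisy = list(image)
    trajectory = [list(noisy)]
    for _ in range(steps):
        noisy = [p + rng.gauss(0.0, sigma) for p in noisy]
        trajectory.append(list(noisy))
    return trajectory

clean = [0.0, 0.5, 1.0]
traj = add_noise(clean, steps=10)
print(traj[0])   # the original image
print(traj[-1])  # the heavily noised version the model learns to denoise
```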

35 of 45

Diffusion Learning

36 of 45

Diffusion Learning

37 of 45

How can you apply Machine Learning?

38 of 45

When does Machine Learning make sense?

  • Machine Learning is just a really complicated statistical algorithm
  • Does it make sense to solve the problem with statistics?
    • Image Recognition
      • Yes
    • Translation
      • Yes
    • Image Generation
      • Maybe
    • Music Generation
      • Maybe
    • Raising children
      • No
    • Organizing a kitchen
      • No

39 of 45

Get the data

  • This is the most important step
  • Most time consuming step
    • Collect data
    • Clean data
    • Preprocess data
    • Feature engineering
  • Mistakes here can be disastrous
    • e.g. leaking test data into the training set
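The most basic guard against leakage is a clean split: shuffle once, then keep the test portion strictly separate from everything used in training. A sketch:

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle, then split -- keeping the test set strictly separate
    is the first guard against leaking it into training."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

tr, te = train_test_split(list(range(10)))
print(tr, te)
assert not set(tr) & set(te)  # no overlap => no leakage
```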

40 of 45

Decide on an architecture

  • Many factors affect this
    • Training data
    • Desired output
    • Available training time
    • Available evaluation time
    • Available hardware resources

41 of 45

Training

  • Data is split into training, test, and often validation data sets
  • Data is preprocessed and split into batches
    • Batches make training smoother and help reduce GPU memory requirements
  • The network is trained on the training data
  • Training progress is checked against the validation data set
  • Finally, the test data is used to test the real-world usefulness of the network
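The batching step is simple chunking: fixed-size slices, with the last batch possibly smaller. A sketch:

```python
def batches(data, batch_size):
    """Split preprocessed data into fixed-size batches; the last batch
    may be smaller. Smaller batches need less GPU memory per step."""
    return [data[i:i + batch_size]
            for i in range(0, len(data), batch_size)]

print(batches(list(range(7)), 3))  # [[0, 1, 2], [3, 4, 5], [6]]
```

In practice the data is also reshuffled between epochs so the network doesn't see the batches in the same order every time.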

42 of 45

Tune hyperparameters

  • Re-evaluate the architecture
  • Tune the hyperparameters
    • Grid search
    • Random search
    • Gradient based
    • Bayesian
    • etc.
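Grid search, the simplest of these, exhaustively tries every combination. A sketch where a made-up scoring function stands in for a full train-and-validate run:

```python
import itertools

def grid_search(param_grid, score_fn):
    """Try every hyperparameter combination, keep the best-scoring one.
    score_fn stands in for a full train + validate run."""
    best, best_score = None, float("-inf")
    for combo in itertools.product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), combo))
        score = score_fn(params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score

# A made-up scoring function with its optimum at lr=0.1, layers=3.
def score(p):
    return -abs(p["lr"] - 0.1) - abs(p["layers"] - 3)

grid = {"lr": [0.01, 0.1, 1.0], "layers": [1, 2, 3]}
print(grid_search(grid, score))  # best combo: lr=0.1, layers=3
```

Random search and Bayesian methods replace the exhaustive loop with cheaper sampling strategies, which matters once the grid gets large.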

43 of 45

Pitfalls

  • Overfitting/Underfitting
    • The network is too specific or too general
    • Mitigations: more data, early stopping
  • Exploding/vanishing gradients
    • Often addressed by changing the activation function
  • Training data is not representative of the test data
  • Hardware
    • Long training times, not enough GPU memory

44 of 45

Transfer Learning - The shortcut to success

  • Training neural networks is really expensive
  • Fortunately, lots of companies release pre-trained networks for free
  • These networks can be retrained for your needs
    • Most often only some layers are trained
    • Saves lots of time and money

45 of 45

Where should you get started?

  • Google Colab
    • An online Python notebook with free GPUs
    • A great way to get started without much investment
  • 🤗 Hugging Face is a great company which is trying to make AI accessible to everyone
  • Lots of online resources