1 of 22

Object Detection

With Adam, Shreyas, Rohan, Joon, and Sky

2 of 22

Table of Contents

01

Background

What is Object Detection, and what are its possible applications?

02

Data

To train/test our model, what data was used, and how did we use it?

03

Detection Models

What types of models are at our disposal, and which ones should we use?

04

Classification Models

How can we use more advanced models to make even better predictions?

3 of 22

4 of 22

01

Object Detection

(Background)

5 of 22

What is Object Detection?

6 of 22

7 of 22

8 of 22

02

Data

9 of 22

Our Data

Inputs

  • Images/Pixels
  • Normalization

Outputs

  • Probability of image classification
  • [B, C, T]
  • [0.1, 0.1, 0.8] TRUCK
  • [0.1, 0.7, 0.2] CAR

Labels/Index
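
As a rough illustration of the inputs and outputs listed on this slide, the sketch below normalizes pixel values and turns a probability vector into a label. It assumes 8-bit pixels (0 to 255) scaled by dividing by 255, which is one common choice and may not be the exact normalization we used. The function names `normalize` and `predicted_label` are made up for this example, and the label order [B, C, T] is taken from the slide.

```python
import numpy as np

# Index order taken from the slide: [B, C, T], where C = CAR and T = TRUCK.
LABELS = ["B", "C", "T"]

def normalize(pixels):
    """Scale 8-bit pixel values into the range [0, 1] (assumed normalization)."""
    return np.asarray(pixels, dtype=np.float32) / 255.0

def predicted_label(probabilities):
    """Return the label whose probability is highest."""
    return LABELS[int(np.argmax(probabilities))]

print(predicted_label([0.1, 0.1, 0.8]))  # highest probability is at index 2 (T)
print(predicted_label([0.1, 0.7, 0.2]))  # highest probability is at index 1 (C)
```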

10 of 22

03

Sliding Windows

11 of 22

Sliding Window Algorithm

The sliding window algorithm crops fixed-size patches from the image, one position at a time, and runs each patch through the image classifier. However, it has some problems…
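
The idea can be sketched in a few lines of Python. This is a minimal sketch, not our exact implementation: the window size, stride, and threshold are illustrative values, and `classify` is a hypothetical stand-in for any image classifier that returns one probability per class.

```python
import numpy as np

def sliding_windows(image, window=32, stride=16):
    """Yield (top, left, crop) for every fixed-size window that fits in the image."""
    height, width = image.shape[:2]
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left, image[top:top + window, left:left + window]

def detect(image, classify, window=32, stride=16, threshold=0.5):
    """Classify every window and keep the confident predictions.

    `classify` is a hypothetical function: crop -> array of class probabilities.
    """
    detections = []
    for top, left, crop in sliding_windows(image, window, stride):
        probs = np.asarray(classify(crop))
        label = int(np.argmax(probs))
        if probs[label] >= threshold:
            detections.append((top, left, label, float(probs[label])))
    return detections
```

Even in this small sketch, the classifier runs once per window, so a large image with a small stride means a very large number of classifier calls.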

12 of 22

04

Classification Models

13 of 22

Our Methods of Classification

Neural Networks

Convolutional NN

Transfer Learning

14 of 22

Neural Networks

Terms:

  • Weights: Values on the connections between neurons that signify the importance of each input
  • Activation function (non-linear):
    • ReLU (Rectified Linear Unit)
    • Softmax

Goal:

  • Adjust weights to get optimal accuracy
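
To make the terms above concrete, here is a small NumPy sketch of the two activation functions and of one layer that applies weights to its inputs. The helper name `dense_layer` and the bias term are assumptions added for illustration; they are not taken from our model.

```python
import numpy as np

def relu(x):
    """ReLU: keep positive values, replace negative values with 0."""
    return np.maximum(0.0, x)

def softmax(x):
    """Softmax: turn raw scores into probabilities that sum to 1."""
    shifted = x - np.max(x)  # subtracting the max avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def dense_layer(inputs, weights, bias, activation):
    """One layer: weighted sum of the inputs, then a non-linear activation."""
    return activation(inputs @ weights + bias)
```

Training then means adjusting `weights` (and `bias`) so that the softmax output of the last layer matches the true labels as often as possible.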

15 of 22

Our Model

Results

16 of 22

Convolutional Neural Networks

How it works:

  • A kernel (filter) slides over the input data to produce a feature map
    • Highlights features
    • Repeated multiple times
  • Activation function
  • Pooling: Shrinks the size of the data
    • Helps reduce overfitting (learning from unimportant data)
    • Faster calculations
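
The kernel and pooling steps above can be sketched with plain NumPy. This is a simplified single-channel version written for illustration (no padding, stride of 1, one kernel); deep learning libraries implement the same idea far more efficiently. The function names are made up for this example.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and build a feature map.

    Like most deep learning libraries, this does not flip the kernel
    (strictly speaking it is a cross-correlation).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

def max_pool(feature_map, size=2):
    """Shrink the feature map by keeping the largest value in each size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]  # drop rows/columns that do not fill a block
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))
```

With a 6 x 6 input and a 2 x 2 kernel, the feature map is 5 x 5, and 2 x 2 max pooling shrinks it to 2 x 2.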

17 of 22

Our Model

Results

18 of 22

Transfer Learning

  • Transfer learning is the use of “expert” models, already trained on other tasks, to make predictions on a new task.
  • Some pretrained models we can use are:
    • VGG16
    • VGG19
    • ResNet50
    • DenseNet121
  • For our predictions, we used the VGG16 model (pictured below)
  • The VGG16 model had an accuracy of about 95%, almost 10 percentage points higher than the CNN model!
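
The core idea of transfer learning, keeping the pretrained part frozen and training only a small new part on top, can be shown with a toy NumPy sketch. Everything here is a stand-in: `BASE_WEIGHTS` plays the role of a pretrained base such as VGG16 (it is just fixed random numbers), and `train_head` is a hypothetical helper, not the code we ran.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained base such as VGG16: fixed weights, never updated.
BASE_WEIGHTS = rng.normal(scale=0.1, size=(8, 16))

def frozen_features(x):
    """Run inputs through the frozen base (one ReLU layer in this toy version)."""
    return np.maximum(0.0, x @ BASE_WEIGHTS)

def softmax_rows(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, onehot):
    return float(-np.mean(np.sum(onehot * np.log(probs + 1e-12), axis=1)))

def train_head(x, labels, num_classes=2, lr=0.5, steps=200):
    """Train only the new classification head on top of the frozen features."""
    feats = frozen_features(x)
    onehot = np.eye(num_classes)[labels]
    head = np.zeros((feats.shape[1], num_classes))
    losses = []
    for _ in range(steps):
        probs = softmax_rows(feats @ head)
        losses.append(cross_entropy(probs, onehot))
        # Gradient step on the head only; BASE_WEIGHTS is left untouched.
        head -= lr * feats.T @ (probs - onehot) / len(x)
    return head, losses
```

In a real setup the frozen base would be a network like VGG16 with its pretrained weights loaded from a library, and the head would be one or more new layers trained on our own images.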

19 of 22

05

YOLO

20 of 22

YOLO

“You Only Look Once”

How it works: Divide the image into a grid of cells -> For each cell, predict bounding boxes, the object inside each box, and the probability that an object is present -> Label the boxes and the objects inside them
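
The grid step can be sketched as follows. This assumes the conventions of the original YOLO design: a 7 x 7 grid, the cell containing an object's center is responsible for it, box centers are predicted as offsets inside that cell, and box sizes are predicted as fractions of the whole image. The function names are made up for this example, and other YOLO versions use different conventions.

```python
def responsible_cell(center_x, center_y, image_w, image_h, grid_size=7):
    """Return (row, col) of the grid cell that contains the object's center."""
    col = min(int(center_x / image_w * grid_size), grid_size - 1)
    row = min(int(center_y / image_h * grid_size), grid_size - 1)
    return row, col

def decode_box(row, col, x, y, w, h, image_w, image_h, grid_size=7):
    """Convert a cell-relative prediction into pixel corners (x1, y1, x2, y2).

    x, y: center offsets inside the cell, each in [0, 1].
    w, h: box width and height as fractions of the whole image.
    """
    center_x = (col + x) / grid_size * image_w
    center_y = (row + y) / grid_size * image_h
    box_w = w * image_w
    box_h = h * image_h
    return (center_x - box_w / 2, center_y - box_h / 2,
            center_x + box_w / 2, center_y + box_h / 2)
```

For a 448 x 448 image, an object centered in the middle of the image falls in cell (3, 3), the middle cell of the 7 x 7 grid.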

21 of 22

YOLO

Pros

  • Unlike earlier object detection models, which run separate classifiers over many parts of the image, YOLO uses a single neural network that looks at the whole image only once.
  • This makes YOLO extremely fast: its fastest version can process images at up to 155 frames per second.

Cons

  • YOLO struggles to detect small objects that are clustered in a group, since each grid cell is limited to detecting a single object.
  • This also means it struggles to detect objects that are close together.
  • YOLO also has lower recall and less precise localization than some other models.

22 of 22

THANKS

Do you have any questions?
