1 of 19

Autonomous Driving Simulation

By Zezhou (Joe) Sun, Siqi Zhang

2 of 19

Introduction

3 of 19

Problem Definition

  • Problem: Develop an agent that accepts only the game image(s) and the current score as input and returns the action (keypress: UP, DOWN, LEFT, RIGHT, N) to take under the current conditions.
  • Goal: Achieve as high a score as possible.

We want an agent that achieves the same goal but faster

4 of 19

System Architecture

Universe Library

Pytorch

5 of 19

Image Pre-processing

Grey-scale conversion

Cropping

Gaussian downsizing
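These three steps can be sketched in plain NumPy; the crop window, kernel size, and downsizing factor below are illustrative choices (the slides don't give exact parameters), picked so the output matches the 64 x 100 frame used later:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # 2-D Gaussian kernel, normalized to sum to 1
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def preprocess(frame, crop=(slice(100, 356), slice(0, 400)), factor=4):
    # 1. grey-scale conversion (standard luminance weights)
    grey = frame[..., :3] @ np.array([0.299, 0.587, 0.114])
    # 2. crop to the region of interest (illustrative window)
    grey = grey[crop]
    # 3. Gaussian blur, then subsample ("Gaussian downsizing")
    k = gaussian_kernel()
    pad = k.shape[0] // 2
    padded = np.pad(grey, pad, mode="edge")
    blurred = np.zeros_like(grey)
    for i in range(grey.shape[0]):
        for j in range(grey.shape[1]):
            blurred[i, j] = (padded[i:i + k.shape[0], j:j + k.shape[1]] * k).sum()
    return blurred[::factor, ::factor]

small = preprocess(np.random.rand(512, 400, 3))
print(small.shape)  # (64, 100)
```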

6 of 19

Image Pre-processing

Multiple illumination sources

Random fog

Can’t use optical flow :(

7 of 19

Image Pre-processing

Background Noise

8 of 19

Image Pre-processing

Feature extraction

Adaptive threshold + morphology

Canny edge detector

Scale-invariant Feature Transform
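The first of these, adaptive thresholding plus a morphological clean-up, can be sketched in plain NumPy (Canny and SIFT need a library such as OpenCV, e.g. `cv2.Canny` and `cv2.SIFT_create`); window sizes here are illustrative:

```python
import numpy as np

def adaptive_threshold(img, win=15, c=2):
    # threshold each pixel against the mean of its win x win neighbourhood
    pad = win // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    # box mean via a summed-area table (integral image)
    ii = p.cumsum(0).cumsum(1)
    ii = np.pad(ii, ((1, 0), (1, 0)))
    h, w = img.shape
    mean = (ii[win:win + h, win:win + w] - ii[:h, win:win + w]
            - ii[win:win + h, :w] + ii[:h, :w]) / win**2
    return (img > mean - c).astype(np.uint8)

def erode(mask, k=3):
    # binary erosion: keep a pixel only if its whole k x k window is 1
    pad = k // 2
    p = np.pad(mask, pad)
    out = np.ones_like(mask)
    for di in range(k):
        for dj in range(k):
            out &= p[di:di + mask.shape[0], dj:dj + mask.shape[1]]
    return out
```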

9 of 19

Image Pre-processing

Segmentation Algorithm

Driving uphill

Driving downhill

10 of 19

Segmentation Algorithm

Image Pre-processing

Linear regression on the RGB values of the pixels:

    d = |(p - x_0) x (p - x_1)| / |x_1 - x_0|

where p is the pixel's RGB value and x_0 and x_1 are points on the regression line. A pixel is a road pixel if d is small (less than 15).

[Figure: two examples, "Car off road" and "Car on road", each showing the hand label next to the algorithm's output.]
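The road test can be sketched as follows: fit a line through the RGB values of known road pixels, then measure each pixel's distance d to that line. The threshold of 15 follows the slide; the sample colours and the SVD-based line fit are assumptions for illustration:

```python
import numpy as np

def fit_rgb_line(road_pixels):
    # principal direction of the road pixels' RGB values gives the line
    mean = road_pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(road_pixels - mean)
    return mean, mean + vt[0]          # two points x_0, x_1 on the line

def distance_to_line(p, x0, x1):
    # 3-D point-to-line distance: |(p - x0) x (p - x1)| / |x1 - x0|
    num = np.linalg.norm(np.cross(p - x0, p - x1), axis=-1)
    return num / np.linalg.norm(x1 - x0)

# illustrative grey-ish road samples and one bright green off-road pixel
road = np.array([[100, 100, 100], [110, 108, 109], [95, 97, 96]], float)
x0, x1 = fit_rgb_line(road)
print(distance_to_line(np.array([102.0, 101.0, 100.0]), x0, x1) < 15)  # road pixel
print(distance_to_line(np.array([30.0, 200.0, 40.0]), x0, x1) < 15)    # not road
```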

11 of 19

Segmentation Algorithm

Image Pre-processing

A pixel-based road segmentation

12 of 19

Image Pre-processing

layer | content
  1   | Red    (from original image)
  2   | Green  (from original image)
  3   | Blue   (from original image)
  4   | Canny(t-3)  (Canny edges on 4 frames of history)
  5   | Canny(t-2)
  6   | Canny(t-1)
  7   | Canny(t)
  8   | Segmentation

Image after preprocessing: 64 x 100 x 8
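Assembling the 8-channel input could look like this (the colour planes, Canny maps, and segmentation mask are stand-in arrays here):

```python
import numpy as np

H, W = 64, 100
r, g, b = (np.zeros((H, W)) for _ in range(3))      # colour planes from the frame
canny_hist = [np.zeros((H, W)) for _ in range(4)]   # Canny(t-3) ... Canny(t)
seg = np.zeros((H, W))                              # road segmentation mask

state = np.stack([r, g, b, *canny_hist, seg], axis=-1)
print(state.shape)  # (64, 100, 8)
```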

13 of 19

Neural Network Architecture

Type   | Layer shape      | Kernel/filter size | Stride | Activation
Conv-1 | 8*16             | 3 x 3              | 2      | ReLU
Conv-2 | 16*32            | 4 x 4              | 2      | ReLU
Conv-3 | 32*64            | 5 x 5              | 2      | ReLU
FC-1   | 3200*300         | N/A                | N/A    | ReLU
FC-2   | 300*# of Actions | N/A                | N/A    | Linear

Each layer is also followed by a normalization layer.
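The table corresponds to the following PyTorch sketch. With no padding, a 64 x 100 x 8 input flattens to exactly 3200 after the conv stack, matching FC-1; reading "normalization layer" as batch normalization is our assumption:

```python
import torch
import torch.nn as nn

class DrivingNet(nn.Module):
    """Q-network matching the table; 'normalization layer' read as batch norm."""
    def __init__(self, n_actions):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(8, 16, kernel_size=3, stride=2), nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.BatchNorm2d(64), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(3200, 300), nn.BatchNorm1d(300), nn.ReLU(),
            nn.Linear(300, n_actions),  # linear activation: raw Q-values
        )

    def forward(self, x):               # x: (N, 8, 64, 100)
        h = self.conv(x)                # -> (N, 64, 5, 10), and 64*5*10 = 3200
        return self.fc(h.flatten(1))

net = DrivingNet(n_actions=5)
q = net(torch.zeros(2, 8, 64, 100))
print(q.shape)  # torch.Size([2, 5])
```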

14 of 19

Reinforcement Learning

a_c is defined as the change of the score change between frames, which is the acceleration: the per-frame score change acts as a velocity, and a_c is its change.
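One way to read this definition in code, if s(t) is the score at frame t (a sketch of our reading, not the exact reward implementation):

```python
def acceleration_reward(scores):
    """scores: raw game scores for the last three frames, most recent last."""
    v_now = scores[-1] - scores[-2]    # current score change ("velocity")
    v_prev = scores[-2] - scores[-3]   # previous score change
    return v_now - v_prev              # change of the score change ("acceleration")

print(acceleration_reward([0, 10, 25]))  # 5: speeding up earns positive reward
```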

15 of 19

Reinforcement Learning

16 of 19

Result Analysis

17 of 19

Result Analysis

18 of 19

Result Analysis

Problems:

  • The agent doesn’t know how to respond to sharp turns.
  • In the second half of the race there is a downhill followed by a right turn. The agent runs out of the lane almost every time: the high velocity gained from gravity on the downhill automatically yields a high reward, which encourages the agent to keep moving forward instead of turning, and the resulting inertia makes the turn even harder.
  • Training the model is very slow; it takes days.
  • Our exploration method is too simple, so the model trained well on the first half of the race but saw little training on the rest.
  • The way we input history frames is very simple: we push them into the CNN directly without any processing. A better design would feed the CNN features into an LSTM network to handle this time-sequence data.
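A rough sketch of the CNN + LSTM idea from the last point: encode each frame with a shared CNN, then let an LSTM consume the resulting feature sequence (all layer sizes here are illustrative, not the project's):

```python
import torch
import torch.nn as nn

class CNNLSTMPolicy(nn.Module):
    """Shared per-frame CNN encoder followed by an LSTM over the frame sequence."""
    def __init__(self, n_actions, feat_dim=300, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, frames):                    # frames: (N, T, 1, H, W)
        n, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1))    # encode every frame with one CNN
        out, _ = self.lstm(feats.view(n, t, -1))  # integrate the time sequence
        return self.head(out[:, -1])              # decision from the last time step

q = CNNLSTMPolicy(n_actions=5)(torch.zeros(2, 4, 1, 64, 100))
print(q.shape)  # torch.Size([2, 5])
```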

19 of 19

Question