
Literature Review

The key paper for this project was "Neural Network Vision for Robot Driving" by Dean A. Pomerleau [1], which describes ALVINN (Autonomous Land Vehicle In a Neural Network). ALVINN controlled the steering of a modified van driving on public roads by analysing the image of the road ahead, captured by a camera mounted at the front of the vehicle. The image was resampled down to only 30 x 32 pixels, and each pixel value was fed into a neural network. The network had a single hidden layer with four nodes, and an output layer with 30 nodes. The output nodes formed a linear representation of the steering amount, from hard left to hard right. When an image was received from the camera, the pixel values were presented to the network and propagated forward to the output layer. The steering direction was determined by finding the node with the highest output value and using the values of the surrounding nodes to "fine-tune" the steering amount.
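This peak-plus-neighbours decoding can be sketched as follows. The window size and the centre-of-mass averaging used here are illustrative assumptions for the "fine-tune" step, not the exact procedure from the paper:

```python
import numpy as np

def decode_steering(outputs, window=2):
    """Map a 30-node ALVINN-style output vector to a steering value in [-1, 1].

    Node 0 is taken to mean hard left and the last node hard right. The
    peak node gives a coarse direction; a centre-of-mass average over the
    nodes within `window` of the peak fine-tunes it. Both the window size
    and the averaging scheme are assumptions made for this sketch.
    """
    n = len(outputs)
    peak = int(np.argmax(outputs))
    lo, hi = max(0, peak - window), min(n, peak + window + 1)
    weights = outputs[lo:hi]
    position = np.dot(np.arange(lo, hi), weights) / weights.sum()
    # Map node index 0..n-1 linearly onto hard left (-1) .. hard right (+1)
    return 2.0 * position / (n - 1) - 1.0

# Example: a peak at node 20 with symmetric neighbours decodes to a
# moderate right turn.
activations = np.zeros(30)
activations[19:22] = [0.5, 1.0, 0.5]
print(decode_steering(activations))
```

Because the peak's neighbours contribute to the decoded value, the steering output varies smoothly rather than jumping between 30 discrete directions.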

Using this technique, the neural network could successfully steer the vehicle along the road after only around five minutes of training. It is worth keeping in mind that the images came from the real world, and so were always noisy and always changing. Trying to hand-code an algorithm to perform the same task would be extremely difficult, and the fact that such a simple neural network was able to learn the task so quickly suggested that neural networks may have a very important role to play in visual perception systems.

Subsequent papers using the ALVINN system include "Vision Guided Lane Transition" [3], which examines the issues involved in allowing an autonomous vehicle to change lanes, and "DAMN: A Distributed Architecture for Mobile Navigation" [4], which looks at integrating perception-driven navigation with higher-level goals.

The only other related research found was entitled "AI Driven Vehicle" [5], a university project in which a vehicle needs to steer its way around a world containing obstacles in various positions. At each time step, the world was partitioned into three areas: the area to the left and in front of the vehicle, the area directly in front of the vehicle, and the area to the right and in front of the vehicle, with everywhere else ignored. The distance to the closest obstacle in each partition was used as an input to the neural network, which used this information to avoid the obstacles. This was therefore only a simulation of robot vision, and a huge simplification of vision-guided navigation; however, it did manage to avoid the obstacles with only a few neurons.
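The three-sector input scheme described above can be sketched as follows. The sector boundary angles (here ±15° for "directly in front") and the maximum sensing range are illustrative assumptions, since the project report does not state them:

```python
import math

def sector_distances(vehicle_xy, heading_deg, obstacles, max_range=100.0):
    """Distance to the nearest obstacle in each of three forward sectors.

    Returns [left-front, front, right-front] distances, the three network
    inputs described above. Obstacles behind the vehicle are ignored, as
    in the project; sectors default to max_range when empty. The +/-15
    degree front sector and max_range are assumptions for this sketch.
    """
    dists = [max_range, max_range, max_range]
    vx, vy = vehicle_xy
    for ox, oy in obstacles:
        dx, dy = ox - vx, oy - vy
        # Bearing relative to the heading, normalised into (-180, 180];
        # positive bearings are taken to be on the vehicle's left.
        rel = math.degrees(math.atan2(dy, dx)) - heading_deg
        rel = (rel + 180.0) % 360.0 - 180.0
        if abs(rel) > 90.0:
            continue  # behind the vehicle: everywhere else is ignored
        d = math.hypot(dx, dy)
        if rel > 15.0:
            sector = 0  # left and in front
        elif rel < -15.0:
            sector = 2  # right and in front
        else:
            sector = 1  # directly in front
        dists[sector] = min(dists[sector], d)
    return dists

# Vehicle at the origin heading along the x-axis, with one obstacle in
# each forward sector and one behind it (which is ignored).
print(sector_distances((0.0, 0.0), 0.0, [(10, 0), (5, 5), (3, -3), (-5, 0)]))
```

Reducing the whole scene to three scalar distances is what makes such a small network sufficient: the hard perceptual work is done before the inputs ever reach it, which is exactly why this counts as simulated rather than real robot vision.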