Progress Report

License Plate Recognition

Project idea

In this project, we apply deep learning techniques to the problem of license plate recognition (LPR): given an image of a license plate, we want the machine learning system to return the string of letters and digits on the plate.

Based on this goal, the main tasks of the project are the following steps:

Step 1: data collection and data pre-processing;
Step 2: character and digit segmentation;
Step 3: building the neural network;
Step 4: training and testing on our collected data;
Step 5: report writing, including the progress report and the final report.
Optional task: if we have time, we will build a MySQL database to match results from our recognition system.

Data collection and data pre-processing

We use the dataset from License Plate Detection, Recognition and Automated Storage.

The image database contains over 500 images of the rear views of various vehicles (cars, trucks, busses), taken under various lighting conditions (sunny, cloudy, rainy, twilight, night light).

Sample images from the dataset.

In our project, we mainly use two ways to prepare our data, which gives us two data sets for training and testing our program.

We also want to examine how different data sets affect performance.

In addition, we have an optional third way to prepare the data.

1) Use OpenCV (cv2) to detect candidate license plates.

In the first way, we use OpenCV to write a program (Detection.py) that finds the contours in the image and then crops the detected plate region.

Here is a sample:

Original image

After detecting the license plate in the image

After cropping the license plate and removing the background

Remaining work: we still need to resize the cropped images to a common size and pre-process them to generate a numerical representation of the data set as pairs of feature data and label data.
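The contour-and-crop step could look roughly like the sketch below. It is not the actual Detection.py, whose contents are not shown in this report: the function names, the fixed threshold of 127, and the rule that the largest bounding box is the plate are all assumptions made for illustration. The OpenCV calls used are cv2.threshold, cv2.findContours, and cv2.boundingRect.

```python
import numpy as np


def find_candidate_boxes(gray):
    """Return bounding boxes (x, y, w, h) of the external contours in a grayscale image."""
    # OpenCV is imported here so that the pure helpers below
    # remain usable where OpenCV is not installed.
    import cv2

    gray = np.asarray(gray, dtype=np.uint8)
    # Binarize with a fixed threshold (127 is an assumed value, not from the report).
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    # [-2] selects the contour list under both return conventions
    # (two values or three values) used by different OpenCV versions.
    contours = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    return [cv2.boundingRect(c) for c in contours]


def pick_largest_box(boxes):
    """Pick the box with the largest area (an assumed heuristic for the plate)."""
    return max(boxes, key=lambda b: b[2] * b[3])


def crop(image, box):
    """Crop a NumPy image array to the box (x, y, w, h)."""
    x, y, w, h = box
    return image[y:y + h, x:x + w]
```

A typical call chain would be crop(image, pick_largest_box(find_candidate_boxes(gray))). On real photographs the largest contour is often not the plate, so a filter on aspect ratio or size would probably be needed in practice.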

2) Manually segment car license numbers in an image.

We manually crop the license plate from the image and pre-process the cropped image, including character and digit segmentation.

This takes much more time, but it yields cleaner and more accurate plate crops than the automatic program.

Here is a sample:

Original image

Crop the license plate and perform the segmentation

After collection, we use the program (image_to_data.m) to generate a numerical representation of the data set as pairs of feature data and label data.

This part is almost done. To fit the scope of a course project, we manually prepared a dataset of only 100 plate samples.

Since each plate sample has 7 characters, this gives us 700 images of letters or digits. We will use 80% of the samples for training and the remaining 20% for testing.
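One way to realize the 80/20 split is sketched below. It assumes the split is made per plate rather than per character, so that the 7 character images of a plate always stay together; the report does not state this, and the function name, the seed, and the 0-based row numbering are illustrative choices.

```python
import random


def split_plates(num_plates=100, train_fraction=0.8, chars_per_plate=7, seed=0):
    """Split plates into train/test sets and return the character-row indices of each."""
    plate_ids = list(range(num_plates))
    random.Random(seed).shuffle(plate_ids)  # fixed seed gives a reproducible split
    num_train = int(round(num_plates * train_fraction))
    train_plates, test_plates = plate_ids[:num_train], plate_ids[num_train:]

    def rows(plates):
        # With 0-based numbering, plate p owns rows p*7 .. p*7+6.
        return [p * chars_per_plate + k
                for p in sorted(plates)
                for k in range(chars_per_plate)]

    return rows(train_plates), rows(test_plates)


train_rows, test_rows = split_plates()
print(len(train_rows), len(test_rows))  # 560 140
```

With 100 plates this gives 80 training plates (560 character images) and 20 test plates (140 character images).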

For feature data:

a. Each plate image is segmented into 7 images, one per character (letter or digit);
b. All 7 images are converted to grayscale to reduce the memory needed for computation, and they are resized to the same size, i.e., 40 rows and 20 columns;
c. Each of the 7 images is reshaped into a vector with 1 row and 800 columns;
d. Finally, we have 700 such vectors (100 plates, 7 characters per plate), which we store in a matrix with 700 rows (characters) and 800 columns (pixels);
e. Starting from the first row of the matrix, every 7 consecutive rows form the features of one plate.
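The feature steps above can be sketched with NumPy as follows. This is not the image_to_data.m program used in the project; the function names are hypothetical, the grayscale weights are the common BT.601 luma coefficients, and the nearest-neighbour resize is a simple stand-in for a library routine such as cv2.resize.

```python
import numpy as np

ROWS, COLS = 40, 20  # target character size stated in the report


def to_gray(rgb):
    """Convert an H x W x 3 RGB array to grayscale (BT.601 luma weights, an assumption)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])


def resize_nearest(img, rows=ROWS, cols=COLS):
    """Nearest-neighbour resize of a 2-D array to rows x cols."""
    r_idx = np.arange(rows) * img.shape[0] // rows
    c_idx = np.arange(cols) * img.shape[1] // cols
    return img[r_idx][:, c_idx]


def plate_features(char_images):
    """Turn the 7 character images of one plate into a 7 x 800 block of row vectors."""
    return np.stack([resize_nearest(to_gray(c)).reshape(-1) for c in char_images])


def build_feature_matrix(plates):
    """Stack per-plate blocks so that every 7 consecutive rows belong to one plate."""
    return np.vstack([plate_features(p) for p in plates])
```

For 100 plates with 7 character images each, build_feature_matrix returns an array of shape (700, 800).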

For label data:

a. For each plate image, we consider its 7 characters one by one and create a one-hot vector for each character;
b. Since there are 10 possible digits and 26 possible letters, we use a one-hot vector with 1 row and 36 columns to represent the label of a character. The correspondence (character <-> position of ‘1’ in the one-hot vector) is: 0 <-> 1, 1 <-> 2, …, 9 <-> 10, A <-> 11, B <-> 12, …, Y <-> 35, Z <-> 36;
c. As with the feature data, we finally get a label matrix with 700 rows and 36 columns. Each row is the label of one character from a plate;
d. Starting from the first row, every 7 consecutive rows form the label of one plate.
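The label encoding above can be sketched in plain Python as follows. The function names and the example plate string are made up for illustration; the symbol order (digits first, then letters) follows the correspondence given in step b.

```python
import string

# Digits first, then uppercase letters: 36 symbols in the order used by the report.
ALPHABET = string.digits + string.ascii_uppercase


def one_hot(char):
    """Return a 36-element one-hot list for one plate character."""
    vec = [0] * len(ALPHABET)
    # The report counts positions from 1; Python indices start at 0,
    # so 'A' (position 11) is stored at index 10.
    vec[ALPHABET.index(char)] = 1
    return vec


def plate_labels(plate):
    """Return the 7 x 36 label block for one plate string, one row per character."""
    return [one_hot(c) for c in plate]


def decode(label_rows):
    """Invert plate_labels by reading the position of the 1 in each row."""
    return "".join(ALPHABET[row.index(1)] for row in label_rows)


labels = plate_labels("ABC1234")  # "ABC1234" is a made-up plate
print(len(labels), len(labels[0]))  # 7 36
print(decode(labels))               # ABC1234
```

Concatenating the blocks of all 100 plates in order gives the 700 x 36 label matrix, and decode can later be reused to turn a predicted one-hot block back into a plate string.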

Samples for feature data

Samples for label data

Samples for whole_image_data

Sample for whole_plate_sequence

3) Generate own car license number.

We can generate training and test images using the program (gen.py).

The process for generating the images is illustrated below:

This lets us collect thousands of license plate images in little time; however, the plate numbers are randomly generated and not real.

Remaining work: we still need to pre-process the generated license plate images to build a numerical representation of the data set as pairs of feature data and label data.

Recognition system design, training, and testing

The network.

We studied two online projects, Number plate recognition with Tensorflow and Vehicle Plate Number Identification Available Models. Following them, we consider two models for our network.

1) Convolutional Neural Networks

Convolutional neural networks (CNNs) are among the most popular deep learning models and can be trained with the backpropagation algorithm.

We therefore select a CNN as one of our models for the detection task.

2) Recurrent Neural Networks and LSTM

Unlike feedforward neural networks, RNNs can use their internal state to process sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. Since a car license number can also be viewed as a sequence of 7 characters, an RNN can be applied to our task.
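To feed plates to a recurrent model as sequences, the 700 x 800 feature matrix and the 700 x 36 label matrix can be regrouped so that each plate becomes one sequence of 7 time steps. The sketch below assumes the row layout described earlier (every 7 consecutive rows belong to one plate); the function name is hypothetical and the zero arrays are placeholders for the real data. The resulting (samples, time steps, features) layout is the one Keras recurrent layers such as LSTM expect.

```python
import numpy as np

CHARS_PER_PLATE = 7


def to_sequences(matrix, chars_per_plate=CHARS_PER_PLATE):
    """Group every 7 consecutive rows into one sequence: (N*7, D) -> (N, 7, D)."""
    num_rows, dim = matrix.shape
    if num_rows % chars_per_plate != 0:
        raise ValueError("row count must be a multiple of chars_per_plate")
    return matrix.reshape(num_rows // chars_per_plate, chars_per_plate, dim)


features = np.zeros((700, 800))  # placeholder for the 700 x 800 feature matrix
labels = np.zeros((700, 36))     # placeholder for the 700 x 36 label matrix
print(to_sequences(features).shape)  # (100, 7, 800)
print(to_sequences(labels).shape)    # (100, 7, 36)
```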

Remaining work: we have already built the basic structure of our neural networks, but we still need time to modify them.

Furthermore, we have finished collecting and preparing our data. We will finish pre-processing the collected data by Nov 9, 2018, so that we can begin training and testing on our network by the Thanksgiving break.

Software

TensorFlow
OpenCV
Keras
Theano
MySQL: we want to build a database of license plates to match against the results of our recognition system.
Plate_Number_Generation.py: we will use it to create random license plate numbers and insert them into our database.
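The database idea can be sketched as follows. This is not the actual Plate_Number_Generation.py. To stay self-contained it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column name, and function names are hypothetical.

```python
import random
import sqlite3
import string

ALPHABET = string.digits + string.ascii_uppercase  # 36 possible plate symbols


def random_plate(rng, length=7):
    """Return a random plate string of 7 letters or digits."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def build_database(plates):
    """Create an in-memory database and insert the plate numbers (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plates (number TEXT PRIMARY KEY)")
    # INSERT OR IGNORE skips duplicates; this is SQLite syntax.
    conn.executemany("INSERT OR IGNORE INTO plates (number) VALUES (?)",
                     [(p,) for p in plates])
    conn.commit()
    return conn


def is_registered(conn, recognized):
    """Check whether a recognized plate string exists in the database."""
    row = conn.execute("SELECT 1 FROM plates WHERE number = ?",
                       (recognized,)).fetchone()
    return row is not None
```

With MySQL the same flow should apply through a MySQL client library, but some details differ: MySQL writes the duplicate-skipping insert as INSERT IGNORE, and common Python MySQL drivers use %s rather than ? as the parameter placeholder.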

Papers to read

We plan to read the following papers for this project.

Algorithmic and mathematical principles of automatic number plate recognition systems

D-PNR: Deep License Plate Number Recognition

Reading car license plates using deep neural networks. Image and Vision Computing

Tensorflow: a system for large-scale machine learning

Team members

The assignments for each member are as follows:

Jirong Yi: data preparation and collection (the second way), designing the structure of the program.
Ke Ma: data preparation and collection (the first and third ways), writing the progress report.
Qi Qi: data preparation and collection (the second way), designing the neural network.

Progress Milestones

Even though each of us will focus on a certain part of the project, we will all be involved in the whole process to make sure that the work we present is up to everyone’s standard. To keep the project on track, we set the following milestones:

Before 11/09/2018, we should have finished pre-processing the data;

Before 11/18/2018, we should have finished building the neural network and the initial training and testing;

Before 11/24/2018, we should have finished the modifications to our program;

Before 12/04/2018, we should have finished the final report.

If time permits, we will try to give some theoretical justifications for the whole system and even design new algorithms for segmentation and training.
