License Plate Recognition
Project idea
In this project, we apply deep learning techniques to the problem of license plate recognition (LPR): given an image of a license plate,
we want the machine learning system to return the string of letters and digits on it.
Based on this goal, the project consists of the following steps:
- Step 1: data collection and data pre-processing;
- Step 2: character and digit segmentation;
- Step 3: building the neural network;
- Step 4: training and testing on our collected data;
- Step 5: report writing, including the progress report and the final report;
- Optional task: if we have time, we will build a MySQL database against which results from our recognition system can be matched.
Data collection and data pre-processing
We use the dataset from License Plate Detection, Recognition and Automated Storage.
The image database contains over 500 images of the rear views of various vehicles (cars, trucks, buses),
taken under various lighting conditions (sunny, cloudy, rainy, twilight, night light).
Sample images from the dataset.
In our project, we mainly use two ways to prepare our data, which gives us two data sets for training and testing.
This also lets us examine how different data sets affect performance.
There is also an optional third way to prepare the data.
1) Use OpenCV to detect potential license plates.
In the first way, we use OpenCV to write a program (Detection.py) that
finds the contours in the image and then crops the plate region.
Here is a sample:
Original image
After detecting the license plate in the image
Cropped license plate with the background removed
Remaining work: we need to resize the cropped images to the same size and pre-process the collected license plate images
to generate a numerical representation of the data set as pairs of feature data and label data.
2) Manually segment license plates in an image.
We manually crop the license plate from the image and pre-process the cropped image, including character and digit segmentation.
Preparing the data set this way takes much more time, but it yields more standardized and accurate license plates than the program does.
Here is a sample:
Original image
Cropped license plate after segmentation
After collection, we use the program (image_to_data.m) to generate a numerical representation of the data set as
pairs of feature data and label data.
This part is almost done. To fit the scope of a course project, we manually prepared a dataset of only 100 plate samples.
Since each plate sample has 7 characters, this gives us 700 images of letters or digits. We will use 80% of the samples for training and the remaining 20% for testing.
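A minimal sketch of the 80/20 split follows. It assumes the split is made per plate rather than per character, so that the 7 character rows of a plate (stored consecutively) stay together; the function names and the fixed random seed are illustrative and not taken from the project code.

```python
import random


def split_plate_indices(num_plates=100, train_fraction=0.8, seed=0):
    """Shuffle plate indices and cut them into a training and a test list."""
    indices = list(range(num_plates))
    random.Random(seed).shuffle(indices)
    cut = int(round(num_plates * train_fraction))
    return sorted(indices[:cut]), sorted(indices[cut:])


def plate_to_rows(plate_index, chars_per_plate=7):
    """Row numbers in the 700-row data matrices that belong to one plate."""
    start = plate_index * chars_per_plate
    return list(range(start, start + chars_per_plate))


train_plates, test_plates = split_plate_indices()
train_rows = [r for p in train_plates for r in plate_to_rows(p)]
test_rows = [r for p in test_plates for r in plate_to_rows(p)]
print(len(train_plates), len(test_plates), len(train_rows), len(test_rows))  # 80 20 560 140
```

Splitting by plate keeps whole plates out of the training set, so plate-level results on the test set are not influenced by characters seen during training.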
For feature data:
- a. Each plate image is segmented into 7 images, one per character (letter or digit);
- b. All 7 images are converted to grayscale to reduce the memory
needed for computation, and they are resized to the same size, i.e., 40 rows and 20 columns;
- c. Each of the 7 images is reshaped into a vector with 1 row and 800 columns;
- d. In the end we have 700 such vectors (100 plates, 7 characters per plate).
We store them in a matrix with 700 rows (characters) and 800 columns (pixels);
- e. Starting from the first row of the matrix, every 7 consecutive rows form the features of one plate.
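The feature steps above can be sketched in NumPy as follows. This is not the project's image_to_data.m. The helper names, the grayscale weights, and the nearest-neighbour resize (a simple stand-in for a library resize call) are assumptions, and the flattening here is row-major, whereas MATLAB's reshape is column-major, so the pixel order may differ from the project's data.

```python
import numpy as np

CHAR_ROWS, CHAR_COLS = 40, 20
CHARS_PER_PLATE = 7


def to_gray(image):
    """Convert an H x W x 3 RGB image to grayscale; pass 2-D images through."""
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 3:
        image = image[..., :3] @ np.array([0.299, 0.587, 0.114])
    return image


def resize_nearest(image, rows=CHAR_ROWS, cols=CHAR_COLS):
    """Nearest-neighbour resize to rows x cols."""
    row_idx = np.arange(rows) * image.shape[0] // rows
    col_idx = np.arange(cols) * image.shape[1] // cols
    return image[row_idx[:, None], col_idx]


def plate_features(char_images):
    """Turn the 7 character images of one plate into a 7 x 800 block."""
    assert len(char_images) == CHARS_PER_PLATE
    return np.stack([resize_nearest(to_gray(img)).reshape(-1) for img in char_images])


def build_feature_matrix(plates):
    """Stack per-plate blocks so rows 7k .. 7k+6 belong to plate k."""
    return np.vstack([plate_features(chars) for chars in plates])


# Usage with placeholder data: 100 plates of 7 random RGB character crops each.
rng = np.random.default_rng(0)
fake_plates = [[rng.random((50, 30, 3)) for _ in range(7)] for _ in range(100)]
features = build_feature_matrix(fake_plates)
print(features.shape)  # (700, 800)
```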
For label data:
- a. For each plate image, we consider its 7 characters one by one and create a one-hot vector for each character;
- b. Since there are 10 possible digits and 26 possible letters, we use a one-hot vector with 1 row and 36 columns to represent the label of a character. The correspondence is as follows:
Character <-> position of ‘1’ in one-hot vector: 0 <-> 1, 1 <-> 2, …, 9 <-> 10, A <-> 11, B <-> 12, …, Y <-> 35, Z <-> 36
- c. As with the feature data, we end up with a label matrix with 700 rows and 36 columns. Each row is the label of one character from a plate;
- d. Starting from the first row, every 7 consecutive rows form the label of one plate.
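The label encoding above can be sketched in plain Python as below. The function names are illustrative. Note that Python lists are 0-based, while the correspondence in the text counts positions from 1, so `position_of_one` adds 1 to the list index.

```python
import string

# "0".."9" followed by "A".."Z": 36 symbols in the order given in the text.
ALPHABET = string.digits + string.ascii_uppercase


def one_hot(char):
    """Return a 36-element list with a single 1 for the given character."""
    vector = [0] * len(ALPHABET)
    vector[ALPHABET.index(char.upper())] = 1
    return vector


def position_of_one(char):
    """1-based position of the '1', matching the correspondence in the text."""
    return ALPHABET.index(char.upper()) + 1


def plate_labels(plate_text):
    """One-hot rows for one plate, in reading order."""
    return [one_hot(c) for c in plate_text]


def decode_rows(rows):
    """Invert the encoding: one-hot rows back to a plate string."""
    return "".join(ALPHABET[row.index(1)] for row in rows)


print(position_of_one("0"), position_of_one("A"), position_of_one("Z"))  # 1 11 36
print(decode_rows(plate_labels("AB12CD3")))  # AB12CD3
```

Stacking `plate_labels` for 100 plates in order gives the 700 x 36 label matrix, with the same 7-rows-per-plate layout as the feature matrix.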
Samples for feature data
Samples for label data
Samples for whole_image_data
Sample for whole_plate_sequence
3) Generate our own license plates.
We can generate training and test images using the program (gen.py).
The process for generating the images is illustrated below:
This lets us collect thousands of license plates in much less time; however, the plate numbers are randomly generated rather than real.
Remaining work: we need to pre-process the generated license plate images
to build a numerical representation of the data set as pairs of feature data and label data.
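As a rough illustration of the random generation step only, the sketch below draws plate strings of 7 characters from the 36 digits and uppercase letters. It is not the project's gen.py; the plate format, the function names, and the seeding are assumptions, and rendering the strings into images is left out.

```python
import random
import string

ALPHABET = string.digits + string.ascii_uppercase


def random_plate(rng, length=7):
    """Draw one plate string uniformly from digits and uppercase letters."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def generate_plates(count, seed=0):
    """Generate a reproducible list of random plate strings."""
    rng = random.Random(seed)
    return [random_plate(rng) for _ in range(count)]


plates = generate_plates(1000)
print(len(plates), len(plates[0]))  # 1000 7
```

Because each string is known at generation time, the label data for a generated image comes for free, which is the main time saving of this approach.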
Recognition system design, training, and testing
The network.
We studied two online projects, Number plate recognition with Tensorflow and
Vehicle Plate Number Identification Available Models, and we use two models in our network.
1) Convolutional Neural Networks
Convolutional neural networks (CNNs) are among the most popular deep learning models and can be trained with the backpropagation algorithm.
For this reason, a CNN is one of the models we selected for our detection and recognition task.
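To make the core CNN operation concrete, the sketch below applies a single 3 x 3 filter to one 40 x 20 character image without padding and then a ReLU. It only illustrates the forward pass of one convolution. The filter values, sizes, and function names are assumptions, and the project's actual network is not shown in this document.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kr, kc = kernel.shape
    out_rows = image.shape[0] - kr + 1
    out_cols = image.shape[1] - kc + 1
    out = np.empty((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            out[r, c] = np.sum(image[r:r + kr, c:c + kc] * kernel)
    return out


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


image = np.arange(800, dtype=np.float64).reshape(40, 20)  # placeholder character image
kernel = np.ones((3, 3))                                  # placeholder filter
feature_map = relu(conv2d_valid(image, kernel))
print(feature_map.shape)  # (38, 18)
```

In a trained network the filter values are learned by backpropagation rather than fixed by hand as they are here.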
2) Recurrent Neural Networks and LSTM
Unlike feedforward neural networks, RNNs can use their internal state to process sequences of inputs. This makes them applicable to tasks such as unsegmented,
connected handwriting recognition or speech recognition. Since a license plate number can also be viewed as a sequence of 7 characters, an RNN can be applied to our task.
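The idea of carrying an internal state across the 7 characters of a plate can be sketched with a plain (Elman) RNN forward pass in NumPy. This is a simplification, not an LSTM and not the project's network: the weights are random and untrained, and the hidden size of 16 and all names are assumptions. The input is one plate's 7 feature rows of 800 pixels, and the output is one 36-way probability vector per character.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def rnn_forward(sequence, w_in, w_rec, w_out):
    """Run a plain RNN over a sequence and return per-step class probabilities."""
    hidden = np.zeros(w_rec.shape[0])
    outputs = []
    for x in sequence:
        # The hidden state carries information from earlier characters forward.
        hidden = np.tanh(w_in @ x + w_rec @ hidden)
        outputs.append(softmax(w_out @ hidden))
    return np.stack(outputs)


rng = np.random.default_rng(0)
hidden_size = 16  # assumed value, not from the project
w_in = rng.normal(scale=0.01, size=(hidden_size, 800))
w_rec = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
w_out = rng.normal(scale=0.1, size=(36, hidden_size))

plate = rng.random((7, 800))  # placeholder for the 7 feature rows of one plate
probs = rnn_forward(plate, w_in, w_rec, w_out)
print(probs.shape)  # (7, 36)
```

An LSTM replaces the single tanh update with gated updates of a separate cell state, but the input and output shapes stay the same.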
Remaining work: we have already built the basic structure of our neural networks, but we still need time to modify them.
Furthermore, we have finished data collection and preparation. We will finish pre-processing the collected data by Nov 9, 2018,
so that we can begin training and testing on it by the Thanksgiving break.
Software
Papers to read
These papers are included as reading for our project.
Team members
The assignments for each member are as follows:
- Jirong Yi: data preparation and collection (the second way), designing the structure of the program.
- Ke Ma: data preparation and collection (the first way and the third way), writing the progress report.
- Qi Qi: data preparation and collection (the second way), designing the neural network.
Progress Milestones
Even though each of us will focus on a certain part of the project, we will all be involved in the whole process to make sure that the work we present is up to everyone’s standard. To keep the project on track, we set the following milestones:
Before 11/09/2018, we should have finished pre-processing the data;
Before 11/18/2018, we should have finished building the neural network and the initial training and testing;
Before 11/24/2018, we should have finished modifying our program;
Before 12/04/2018, we should have finished the final report.
If time permits, we will try to give some theoretical justifications for the whole system and even design new algorithms for segmentation and training.