If you drive a car to work every day, you have almost certainly thought about how nice it would be to have a chauffeur. You could kick back and read the paper on the way to work and on the way home. Of course, to make this happen, a winning lotto ticket is necessary. The other way to do it is to make your car drive itself. The basis of a car that can drive itself is computer vision. In this article I will show the fundamentals of computer vision, and with a little further thought you might be able to build your very own car that can see.
A computer sees by analyzing several still images taken in succession. A computer picture is made up of pixels. A pixel is basically a dot drawn on the screen, and it has a color associated with it. By drawing a lot of dots with different colors, a picture appears. For example, a picture with 128 different shades of gray assigns a number between 0 and 127 to each pixel it draws. 0 means black, and 127 means white. The rest of the shades fall somewhere in between.
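To make the idea of a picture as a grid of numbers concrete, here is a minimal sketch in Python. The tiny 4x4 picture, the `render` function, and the five text characters used as stand-ins for shades are all made up for illustration; a real camera image would be far larger.

```python
# A hypothetical 4x4 picture: each number is one pixel,
# from 0 (black) to 127 (white), as described above.
picture = [
    [0,   0,   0, 0],
    [0, 127, 127, 0],
    [0, 127,  64, 0],
    [0,   0,   0, 0],
]

def render(picture):
    """Draw the picture as text, one character per pixel.

    The five characters run from darkest (space) to brightest (#);
    this character set is an arbitrary choice for the example.
    """
    shades = " .:*#"
    lines = []
    for row in picture:
        # Scale each 0..127 value down to an index 0..4 into `shades`.
        lines.append("".join(shades[value * len(shades) // 128] for value in row))
    return "\n".join(lines)

print(render(picture))
```

Printing the result shows a small bright square with one dimmer corner, which is all a picture really is to the computer: numbers laid out in rows.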
To see something, a computer takes two pictures and quickly examines them. Without a lot of programming, a computer does not know what a mailbox is and cannot pick one out of a picture. If a computer vision system is to be used in a moving car, it should only be concerned with objects that are coming closer to the car. The most basic of navigation systems should not care what the objects are, just where they are and whether they are getting closer. By looking only at what changed between the last two pictures taken, the navigation system can make quick decisions.
To figure out what has changed between the last two pictures, the computerized vision system looks at which pixels have changed. Pixels that are the same from one picture to the next are ignored. If the car is moving, the vision system is concerned with objects that are getting closer to the car; objects that do not change from one picture to the next are of no concern. For example, if you are approaching a stopped vehicle in the road, take one picture, wait 2 seconds, take another, and compare the two, you will notice that the stopped car is bigger in the second picture. The mountains in the background are unchanged, but the size of the car has changed. Since the car is bigger in the second picture, the vision system assumes it is getting closer to the object and decides whether to stop or turn to avoid hitting it.
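The stopped-car example can be sketched with two toy pictures. This is only an illustration under a strong assumption: the car's pixels all share one known brightness (100 here) that nothing else in the scene uses. The pictures and the names `apparent_size` and `is_getting_closer` are hypothetical.

```python
# Top row: distant "mountains" (value 40) that look the same in both pictures.
# The "car" (value 100) covers one pixel at first and four pixels 2 seconds later.
first_picture = [
    [40, 40, 40, 40],
    [0,   0,  0,  0],
    [0, 100,  0,  0],
    [0,   0,  0,  0],
]
second_picture = [
    [40,  40,  40, 40],
    [0,  100, 100,  0],
    [0,  100, 100,  0],
    [0,    0,   0,  0],
]

def apparent_size(picture, object_value):
    """Count the pixels that match the assumed brightness of the object."""
    return sum(row.count(object_value) for row in picture)

def is_getting_closer(first, second, object_value):
    """An object that covers more pixels in the later picture is assumed to be closer."""
    return apparent_size(second, object_value) > apparent_size(first, object_value)

print(apparent_size(first_picture, 100), apparent_size(second_picture, 100))
print(is_getting_closer(first_picture, second_picture, 100))
```

The mountains contribute nothing to the comparison because they cover the same pixels in both pictures; only the growing car matters.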
The easiest way for the computer vision system to know what has changed between the two pictures is to subtract the value of each pixel in one picture from the value of the pixel at the same position in the other. If a pixel has not changed, the two values are the same, and subtracting them leaves a zero, or a black pixel, in the resulting image. If the pixel has changed from one picture to the other, the pixel in the resulting picture is a non-zero number. After every pixel has been examined, we are left with another image that has black pixels where there is no change between the pictures and non-black pixels where there is change. That is all the computer vision system needs. It can now determine where there is motion. By looking at several sets of pictures in a row, the computerized vision system can recognize trends, and the navigation system can then decide to keep moving, slow down, stop, or turn to avoid the object.
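The pixel subtraction described above can be sketched in a few lines of Python. Some choices here are assumptions rather than part of the method as stated: the difference is taken as an absolute value so the result stays a valid brightness, the optional `threshold` for ignoring tiny changes is an addition, and the three toy frames and all function names are invented for the example.

```python
def subtract_frames(previous, current):
    """Return a new image holding the absolute difference of each pixel pair."""
    return [
        [abs(cur - prev) for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]

def count_changed(difference, threshold=0):
    """Count the pixels in a difference image that are above the threshold."""
    return sum(1 for row in difference for value in row if value > threshold)

def change_trend(frames):
    """Number of changed pixels between each pair of consecutive frames."""
    return [
        count_changed(subtract_frames(earlier, later))
        for earlier, later in zip(frames, frames[1:])
    ]

# Three hypothetical 4x4 frames: the top row never changes,
# while a bright object (value 100) grows from frame to frame.
frames = [
    [[40, 40, 40, 40], [0, 0, 0, 0],       [0, 100, 0, 0],     [0, 0, 0, 0]],
    [[40, 40, 40, 40], [0, 100, 100, 0],   [0, 100, 100, 0],   [0, 0, 0, 0]],
    [[40, 40, 40, 40], [100, 100, 100, 0], [100, 100, 100, 0], [100, 100, 100, 0]],
]

difference = subtract_frames(frames[0], frames[1])
for row in difference:
    print(row)                 # zeros (black) wherever nothing changed
print(change_trend(frames))    # changed-pixel counts for each pair of frames
```

In this toy sequence the number of changed pixels rises from one pair of frames to the next. A navigation system might treat a rising count like that as a hint that something is approaching, though a real system would need a more careful rule than a raw pixel count.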
With an understanding of this simple pixel subtraction method, a great deal in computerized vision becomes possible. Knowing what is changing in the field of view is more important than knowing what the objects in the field of view actually are. If you do decide to add a computerized vision system to your car, I would highly recommend testing out your new ride in the middle of the desert before you let it drive you around downtown New York.