Thursday, May 14, 2009

Object Tracking using Particle Filters

During my first year as a computer science grad student I took a course in advanced computer vision. The class was divided into teams, and each team could choose one of two projects to work on. The first project involved identifying students by their facial features as they walked into a classroom. The second involved tracking students as they walked into the classroom among other students. My team chose the second project because my partner and I were both interested in motion tracking at the time.

We developed three systems with varying degrees of motion-tracking success:
  1. Kalman filter. Actually the particle filter developed by Cuevas, Zaldivar, and Rojas, which is based on the extended Kalman filter (EKF). Some success.
  2. Kalman filter with spring forces. An attempt to use multiple particle systems linked by spring forces to track multiple parts of a subject's body. Limited success.
  3. Hierarchical particle filter. Based on the work of Viola and Jones and of Yang et al. It uses rectangular windows for feature extraction and is quite different from the two Kalman-based systems described above. Most successful.
We spent a lot of time adjusting the particle set properties used by the first two systems for each student test video. The third system is more robust and doesn't require as much custom tailoring to the video it is applied to.

The particle filter presented by Cuevas, Zaldivar, and Rojas attempts to track a small color distribution within a circular window centered around a target pixel.
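I no longer have our project code, so here is a minimal sketch in Python (with NumPy) of the kind of predict/weight/resample cycle such a color-based filter performs. The random-walk motion model, the window handling, and all parameter values are my own illustrative assumptions, not the exact method from the Cuevas, Zaldivar, and Rojas paper:

```python
import numpy as np

rng = np.random.default_rng(42)

def color_weights(frame, particles, target_color, radius=5, sigma=40.0):
    """Weight each particle by how well the mean color inside a circular
    window around it matches the target color."""
    h, w, _ = frame.shape
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    circle = (xs**2 + ys**2) <= radius**2
    weights = np.empty(len(particles))
    for i, (px, py) in enumerate(particles):
        # Clamp the center so the window stays inside the frame.
        cx = int(np.clip(px, radius, w - radius - 1))
        cy = int(np.clip(py, radius, h - radius - 1))
        window = frame[cy - radius:cy + radius + 1, cx - radius:cx + radius + 1]
        mean = window[circle].mean(axis=0)
        dist = np.linalg.norm(mean - target_color)
        weights[i] = np.exp(-dist**2 / (2 * sigma**2))
    return weights / weights.sum()

def track_step(frame, particles, target_color, motion_std=3.0):
    """One predict / update / resample cycle of the filter."""
    # Predict: simple random-walk motion model (an assumption on my part).
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update and resample: draw particles in proportion to their weights.
    w = color_weights(frame, particles, target_color)
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```

Particles that drift onto the background get tiny weights and are unlikely to survive resampling, which is what keeps the cloud concentrated on the tracked color blob.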

By contrast, the approach presented by Viola and Jones employs a rectangular target window and compares the average intensities of adjacent rectangular regions within an area of the video frame.

Viola and Jones had great success detecting faces using the target window to capture the average color intensity around a person’s eyes and the average color intensity of that person’s upper cheeks. It's a very simple concept but it works well in practice because of the intensity difference between those two regions of a person's face. In general a person's upper cheeks are much brighter than the inset region around their eyes.
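What makes these rectangle features practical is that they can be evaluated in constant time using an integral image (summed-area table). Here's a small Python sketch of that idea; the coordinates and the specific eyes-vs-cheeks layout are illustrative, not taken from the paper:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[y, x] = sum of img[:y, :x]."""
    return np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y),
    computed with just four lookups into the integral image."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def eyes_vs_cheeks_feature(ii, x, y, w, h):
    """Two-rectangle feature: lower half (cheeks) minus upper half (eyes).
    A large positive value suggests the bright-cheeks / dark-eye-region
    pattern described above."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return bottom - top
```

Because every rectangle sum costs four array lookups regardless of its size, thousands of such features can be evaluated per window, which is what makes the Viola-Jones cascade fast enough for real-time detection.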


Sunday, May 10, 2009

Mobile Robots Programming: Retriever

During my third quarter as a computer science grad student I took a course in mobile robot programming. Students were required to use the Player network server to control a Pioneer 2 DX (differential drive) robot with an 8-point sonar array and a bumper array built into its front.

Using the Player interface, programs can be written on a laptop attached to the robot via an Ethernet cable. In such a setup the laptop provides most of the processing power while communicating with the robot's sensors and actuators over an IP connection. Programs can be tested in the Stage simulator before being run against the hardware; Stage simulates sensor readings and objects in a 2-D bitmapped environment.

Teams of students were asked to write control software capable of guiding the robot to a number of predetermined locations on the third floor of the GCCIS building, avoiding other robots and innocent human bystanders along the way and maneuvering along walls and through door frames when required. The idea was that there were items at each intermediate location that needed to be retrieved before traveling to a final destination.

To accomplish this the robot needed to perform three major tasks: localization, path planning, and path execution. The robot was guaranteed to start in one of eight known poses (location and orientation); however, it did not know which one it started in. We had the robot determine its starting pose by detecting known landmarks in the environment near its starting location, such as pillars and walls, and narrowing down the set of possible poses until it was left with the most probable one.
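The elimination idea can be sketched in a few lines of Python. The candidate poses, landmark coordinates, and tolerance below are hypothetical stand-ins, not the actual map of the GCCIS third floor:

```python
import math

# Hypothetical candidate poses (x, y, heading) and landmark positions.
# The real assignment used eight known starting poses; four suffice here.
CANDIDATE_POSES = [(0, 0, 0.0), (5, 0, math.pi), (0, 4, 0.0), (5, 4, math.pi / 2)]
LANDMARKS = [(2, 0), (5, 2)]  # e.g. pillars and wall corners

def expected_range(pose, landmark):
    """Distance a range sensor should report from this pose to the landmark."""
    return math.dist(pose[:2], landmark)

def localize(observed_ranges, tolerance=0.5):
    """Keep only the candidate poses whose predicted landmark ranges agree
    with the measured ones; the survivors are the plausible starting poses."""
    survivors = []
    for pose in CANDIDATE_POSES:
        predicted = [expected_range(pose, lm) for lm in LANDMARKS]
        if all(abs(p - o) <= tolerance
               for p, o in zip(predicted, observed_ranges)):
            survivors.append(pose)
    return survivors
```

In practice the measurements came from the sonar array, so the tolerance has to absorb quite a bit of sensor noise.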

After localization the robot planned a path to the destinations specified in a user-provided input file. Path planning was aided by the fact that the environment was known a priori. The robot was provided with a black and white raw image file of the layout of the third floor of GCCIS that it converted to a rudimentary obstacle map. A probabilistic road map (PRM) approach was employed to plan a path from one location to another using the obstacle map.
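A bare-bones PRM looks roughly like the following Python sketch: sample collision-free points, connect nearby pairs into a roadmap, then run a graph search over it. For brevity this version assumes the straight edge between two nearby free points is itself collision-free; a full PRM would also check each edge against the obstacle map. All names and parameters are illustrative:

```python
import heapq
import math
import random

random.seed(1)

def build_prm(is_free, n_samples=200, connect_radius=2.0, extent=10.0):
    """Sample collision-free points in an extent-by-extent workspace and
    link every pair closer than connect_radius with a weighted edge."""
    nodes = []
    while len(nodes) < n_samples:
        p = (random.uniform(0, extent), random.uniform(0, extent))
        if is_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= connect_radius:
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges

def shortest_path(nodes, edges, start, goal):
    """Dijkstra over the roadmap; returns a list of node indices or None."""
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v, w in edges[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    return None
```

The `is_free` predicate is where the obstacle map derived from the floor-plan image would plug in: a point is free if the corresponding pixel isn't an obstacle.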

Each planned path consisted of a series of waypoints. The path as a whole could be complex, but the segment between any two consecutive waypoints was just a straight line. The robot navigated each segment using potential field motion, which also addressed obstacle avoidance: the robot was repelled by obstacles in its environment while being attracted to its current goal.
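A single step of that attract/repel scheme can be sketched in Python as below. The gain constants, the influence radius, and the fixed step size are illustrative choices, not our actual tuning:

```python
import math

def potential_field_step(pos, goal, obstacles,
                         k_att=1.0, k_rep=4.0, influence=2.0, step=0.2):
    """One motion step: an attractive pull toward the goal plus a repulsive
    push from every obstacle inside its radius of influence."""
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < influence:
            # Repulsion grows sharply as the robot nears the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / influence) / d**2
            fx += mag * dx / d
            fy += mag * dy / d
    norm = math.hypot(fx, fy) or 1.0  # avoid dividing by zero at the goal
    return (pos[0] + step * fx / norm, pos[1] + step * fy / norm)
```

Summing the two force fields gives obstacle avoidance essentially for free, though plain potential fields can get stuck in local minima; the waypoints from the planner kept each goal close enough that this was rarely a problem for us.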
