Monocular Pallet Modeling

The goal of the project is to develop monocular 3-D vision sensor technology for warehouse robots. The sensor locates parcels and sub-pallets on pallets and transmits its interpretations to the warehouse management system. The search for parcels and pallets uses digital models stored in the product catalog of the warehouse management system.

With an intelligent vision sensor, the need for human intervention can be reduced. For example, when a parcel is picked, an adjacent one may move, causing a failure later on. In addition, current warehouse automation solutions require every manual modification of the pallets to be entered into the management system. A vision sensor that captures the 3-D structure of the pallets has the potential to solve both problems.

The experimental laboratory-scale system (shown below) consists of an industrial robot with a camera mounted on its hand. The 3-D structure of the pallet is captured with no a priori knowledge of the camera motion.

Picture

The experimental, non-optimized, software-only implementation runs at 6-7 image frames per second on a Sun Sparcstation IPX. The first three frames of an image sequence are shown here.

Picture

Modeling the scene with no a priori knowledge of the structure of the pallet takes 1.5-2.5 minutes. Checking the structure of the pallet against an a priori model takes 1.5-2 seconds, and correcting minor errors, such as missing parcels, takes 2-20 seconds. The 3-D accuracy of the sensor is better than 1 cm at a distance of 2 m.
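The idea of checking a measured pallet against an a priori model can be illustrated with a minimal Python sketch. It is not the project's implementation: the function name check_pallet, the representation of parcels as 3-D center points in metres, and the greedy nearest-neighbor matching are all assumptions made for illustration. The default tolerance of 0.01 m is taken from the 1 cm accuracy figure stated above.

```python
import math

def check_pallet(model, measured, tol=0.01):
    """Compare measured parcel centers against an a priori model.

    model:    dict mapping a parcel name to its expected (x, y, z) center in metres
    measured: list of measured (x, y, z) centers in metres
    tol:      maximum accepted distance in metres (assumed: 1 cm)

    Returns (matches, missing, unexpected), where matches maps each
    parcel name to the index of the measurement assigned to it.
    """
    unused = list(range(len(measured)))
    matches = {}
    missing = []
    for name, expected in model.items():
        best_index = None
        best_dist = tol
        # Greedy choice: nearest still-unassigned measurement within tol.
        for i in unused:
            d = math.dist(expected, measured[i])
            if d <= best_dist:
                best_index, best_dist = i, d
        if best_index is None:
            missing.append(name)
        else:
            matches[name] = best_index
            unused.remove(best_index)
    unexpected = [measured[i] for i in unused]
    return matches, missing, unexpected


# Hypothetical example: three parcels in the model, two found by the sensor.
model = {
    "A": (0.0, 0.0, 0.0),
    "B": (0.4, 0.0, 0.0),
    "C": (0.0, 0.3, 0.0),
}
measured = [(0.402, 0.001, 0.0), (0.001, -0.003, 0.002)]
matches, missing, unexpected = check_pallet(model, measured)
print("matched:", matches)
print("missing:", missing)
print("unexpected:", unexpected)
```

A parcel reported as missing here corresponds to the kind of minor error mentioned above. A greedy match is enough for a sketch; parcels spaced closer than the tolerance would call for a proper assignment method.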

The interface of the pallet sensor is shown below.

Picture

A visualized result of the modeled pallet shows the 3-D interpretation for an image sequence.


Olli Silven (olli.silven@ee.oulu.fi)
Tapio Repo (tapio.repo@ee.oulu.fi)