
More about PCL-based point cloud analysis

CISC 829 -- Prof. Christopher Rasmussen

October 2, 2012


HW #2

  • All of this uses the simulated Kinect (still with ground-truth poses) and should be written as separate nodes that run while nav runs. Do not integrate your solutions into the nav code!
  • Tasks:
    • Obstacle detection. Fit a ground plane to the point cloud, segment out the obstacle points, and publish them in a form that nav's costmapper will see (with the laser turned off). Warning: some obstacles may be quite short! (A publishing sketch follows this list.)
    • Object recognition. Explore a room filled with similarly-sized objects and label them in rviz once you have gathered enough information.
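
For the publishing step, here is a minimal sketch, assuming the obstacle points have already been segmented out of the cloud; the frame and the suggested topic name are my assumptions, not the names nav's costmapper actually expects (the toROSMsg header location also varies by PCL/ROS release):

    #include <ros/ros.h>
    #include <sensor_msgs/PointCloud2.h>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>
    #include <pcl/ros/conversions.h>  // pcl::toROSMsg (header location varies by release)

    // Publish segmented obstacle points as a sensor_msgs/PointCloud2 so a
    // costmapper can consume them. Topic and frame are assumptions.
    void publishObstacles(ros::Publisher &pub,
                          const pcl::PointCloud<pcl::PointXYZ> &obstacles)
    {
      sensor_msgs::PointCloud2 msg;
      pcl::toROSMsg(obstacles, msg);
      msg.header.frame_id = "map";           // assumed fixed frame
      msg.header.stamp = ros::Time::now();
      pub.publish(msg);
    }

The publisher would be advertised once in your node, e.g. nh.advertise<sensor_msgs::PointCloud2>("obstacle_points", 1), where "obstacle_points" is a hypothetical topic name.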


Object Recognition Details

  • Using nav or teleop to manually guide your robot around the hector indoor world, your program should send one and only one visualization marker to rviz for each discrete object encountered (a marker-publishing sketch follows at the end of this slide)
  • Each marker should consist of
    • A map-axis-aligned bounding cube enclosing the object
    • A label string for the object from the set {"sphere", "box", "cylinder", "rock", "duck"}
  • Use only the PointCloud2 info, not the Image
  • I will use the objects node of the spawner package to create the object set (you may modify it for testing)
    • I may change the scale of each object by 20%, but only uniformly
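
As a sketch of the marker side (the topic, colors, and the TEXT_VIEW_FACING suggestion for the label are my assumptions, not part of the assignment spec):

    #include <string>
    #include <ros/ros.h>
    #include <visualization_msgs/Marker.h>

    // Publish one CUBE marker per recognized object; the label string could
    // be a second marker of type TEXT_VIEW_FACING with m.text = label.
    void publishObjectMarker(ros::Publisher &pub, int id, const std::string &label,
                             double cx, double cy, double cz,  // cube center in map frame
                             double sx, double sy, double sz)  // cube side lengths
    {
      visualization_msgs::Marker m;
      m.header.frame_id = "map";     // map-axis-aligned, so no rotation needed
      m.header.stamp = ros::Time::now();
      m.ns = label;                  // "sphere", "box", "cylinder", "rock", or "duck"
      m.id = id;                     // unique per discrete object: send it once
      m.type = visualization_msgs::Marker::CUBE;
      m.action = visualization_msgs::Marker::ADD;
      m.pose.position.x = cx;
      m.pose.position.y = cy;
      m.pose.position.z = cz;
      m.pose.orientation.w = 1.0;    // identity quaternion: axis-aligned
      m.scale.x = sx; m.scale.y = sy; m.scale.z = sz;
      m.color.g = 1.0; m.color.a = 0.5;  // semi-transparent green (arbitrary)
      pub.publish(m);
    }

Advertise with nh.advertise<visualization_msgs::Marker>("visualization_marker", 10); rviz's Marker display listens on visualization_marker by default.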


ROS "bag" files for logging

  • rosbag is a tool for recording and playing back ROS topics
    • rosbag record <topic names>
    • rosbag play <bag file name>
  • Why?
    • No need to run Gazebo (but make sure roscore is running first)
    • Lets you focus on the perception algorithm instead of driving around
  • Try hw2_test1.bag (160 seconds, 83 MB compressed); example commands follow this list
    • Contains the topics tf, base_pose_ground_truth, scan, and camera/depth/points
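
For example (the leading slashes are my addition; --clock republishes the bag's recorded time as simulated time):

    rosbag record /tf /base_pose_ground_truth /scan /camera/depth/points
    rosbag play --clock hw2_test1.bag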


Some HW #2 Thoughts

  • You know what the possible objects are and where they can be (size, yaw, and label are the variables)
    • Imagine an array of 6 slots, each initially "???", that you are trying to fill with "sphere", "rock", etc. (recall that 0, 1, or more instances of each object may be spawned)
    • Pre-learn/measure everything you can about each object
    • Once you've identified an object, don't treat it as still unknown!
  • How do you tell the walls/table apart from one of the objects we're interested in (particularly the cube)?
    • Location! You know how big the room is and where the objects can be, so use that information!


What are we looking at?

  • Start by segmenting an object, whatever it is, from the floor (try RANSAC code in kinect.cpp here)
    • You may want to circle each object and do a vertical "sweep" (move the Kinect up and down) to capture as much of the object as possible
  • How do we merge multiple object point clouds taken from different viewpoints into one? Assuming we have perfect poses, put everything into the same frame (i.e., map), then:
    • Easy: simply concatenate the point clouds
    • But there may be overlaps... if duplicated points are a problem, voxelize to enforce a maximum density in any given region of space (sketch below)
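
A minimal PCL sketch of the segment-then-voxelize step, assuming a PointXYZ cloud; the 2 cm and 1 cm thresholds are guesses to be tuned:

    #include <pcl/ModelCoefficients.h>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>
    #include <pcl/segmentation/sac_segmentation.h>
    #include <pcl/filters/extract_indices.h>
    #include <pcl/filters/voxel_grid.h>

    // Remove the dominant (floor) plane with RANSAC, then voxelize what
    // remains so overlapping views cannot inflate the point density.
    pcl::PointCloud<pcl::PointXYZ>::Ptr
    segmentAndVoxelize(const pcl::PointCloud<pcl::PointXYZ>::Ptr &cloud)
    {
      pcl::ModelCoefficients::Ptr coeffs(new pcl::ModelCoefficients);
      pcl::PointIndices::Ptr inliers(new pcl::PointIndices);

      pcl::SACSegmentation<pcl::PointXYZ> seg;
      seg.setModelType(pcl::SACMODEL_PLANE);
      seg.setMethodType(pcl::SAC_RANSAC);
      seg.setDistanceThreshold(0.02);          // 2 cm plane tolerance (assumed)
      seg.setInputCloud(cloud);
      seg.segment(*inliers, *coeffs);          // inliers = ground-plane points

      // Keep everything NOT on the plane: the object/obstacle points
      pcl::PointCloud<pcl::PointXYZ>::Ptr objects(new pcl::PointCloud<pcl::PointXYZ>);
      pcl::ExtractIndices<pcl::PointXYZ> extract;
      extract.setInputCloud(cloud);
      extract.setIndices(inliers);
      extract.setNegative(true);
      extract.filter(*objects);

      // Cap the density in any region of space with a voxel grid
      pcl::PointCloud<pcl::PointXYZ>::Ptr voxelized(new pcl::PointCloud<pcl::PointXYZ>);
      pcl::VoxelGrid<pcl::PointXYZ> grid;
      grid.setInputCloud(objects);
      grid.setLeafSize(0.01f, 0.01f, 0.01f);   // 1 cm voxels (assumed)
      grid.filter(*voxelized);
      return voxelized;
    }

The concatenation itself is just operator+= on pcl::PointCloud (merged += view), once both clouds are in the map frame.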


Telling Objects Apart: Ideas...

  • Try to fit specific shapes (box, sphere, cylinder) using RANSAC-like approach
  • Compute "description" of each object offline, then find one most similar to current online description
    • Simple 3-D measurements on merged point cloud: Height, width/depth/radius, aspect ratio, etc.
    • Fit convex/concave hull polygon to 2-D "footprint" of object and work from that
    • Pure recognition using ...
  • Is your approach scale/rotation/translation invariant, or at least somewhat robust to changes in these?
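
One hedged sketch of the "simple 3-D measurements" idea, using PCL's axis-aligned bounding box; the function name and the comparison step are illustrative:

    #include <algorithm>
    #include <pcl/common/common.h>   // pcl::getMinMax3D
    #include <pcl/point_types.h>

    // Crude description: extents and aspect ratio of the merged cloud's
    // axis-aligned bounding box, to compare against pre-learned values.
    void describeObject(const pcl::PointCloud<pcl::PointXYZ> &cloud)
    {
      pcl::PointXYZ min_pt, max_pt;
      pcl::getMinMax3D(cloud, min_pt, max_pt);

      float width  = max_pt.x - min_pt.x;
      float depth  = max_pt.y - min_pt.y;
      float height = max_pt.z - min_pt.z;
      float aspect = height / std::max(width, depth);
      // Match (width, depth, height, aspect) to the closest offline
      // description for "sphere", "box", "cylinder", "rock", or "duck".
    }

Note that width and depth of an axis-aligned box change with object yaw, so height and aspect ratio are the more rotation-robust of these measurements.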


Registration

  • Registration is the task of computing a minimum-error transformation between nominally overlapping point clouds A and B
    • What kind of transformation? 3-D rigid transform (3 translation, 3 rotation parameters) for static scenes or objects...
      • Could also include scale and other variables
    • Do the point clouds match at all? That is, are they similar enough or should we say there is no such registration?
    • If we can assume a match, how far out of alignment are the point clouds?
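
Written out (my notation, not the slides'): with corresponding points a_i in A and b_i in B, rigid registration seeks

    \min_{R,\,t} \; \sum_i \left\| R\,a_i + t - b_i \right\|^2

where R is a rotation (3 parameters) and t a translation (3 parameters); "do the clouds match at all" then becomes a threshold on this residual error.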


Pairwise Registration Pipeline


Registration Notes

  • Keypoints should be distinctive, stable, and relatively rare
    • Makes matching easier, more efficient
    • See slides and code on keypoints & features here
  • Feature descriptors can be simple (e.g., just the keypoint's 3-D coordinates) or complicated (a description of the neighborhood around each point)
    • Normal vector, tangent vector, curvature/flatness, etc.
  • Correspondence
    • Which keypoints in the source scene match which in the destination scene?
    • Generally this means finding the feature whose descriptor is most similar; we can require the match to be mutual (i.e., both points pick each other; see the sketch after this list)
  • 3 point correspondences define a rigid transform... does this remind you of RANSAC?
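
A minimal sketch of the mutual-matching test using plain nearest neighbors (a real pipeline would search in descriptor space, but the reciprocity check is the same idea; the function name is mine):

    #include <utility>
    #include <vector>
    #include <pcl/kdtree/kdtree_flann.h>
    #include <pcl/point_types.h>

    // Keep a correspondence (i, j) only if i's nearest neighbor in dst is j
    // AND j's nearest neighbor in src is i.
    std::vector<std::pair<int, int> >
    mutualMatches(const pcl::PointCloud<pcl::PointXYZ>::Ptr &src,
                  const pcl::PointCloud<pcl::PointXYZ>::Ptr &dst)
    {
      pcl::KdTreeFLANN<pcl::PointXYZ> src_tree, dst_tree;
      src_tree.setInputCloud(src);
      dst_tree.setInputCloud(dst);

      std::vector<std::pair<int, int> > matches;
      std::vector<int> idx(1);
      std::vector<float> sq_dist(1);
      for (size_t i = 0; i < src->size(); ++i)
      {
        dst_tree.nearestKSearch(src->points[i], 1, idx, sq_dist);
        int j = idx[0];
        src_tree.nearestKSearch(dst->points[j], 1, idx, sq_dist);
        if (idx[0] == static_cast<int>(i))   // both points pick each other
          matches.push_back(std::make_pair(static_cast<int>(i), j));
      }
      return matches;
    }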


More about registration

  • Iterative Closest Points (ICP) is one registration algorithm; it assumes the point clouds are already quite close and there is very little noise
    • Matching just picks the nearest point in the other point cloud (see the ICP sketch at the end of this slide)
  • RANSAC-like approach with feature descriptors is more robust to wider misalignments and to mismatches
    • Too many outliers means there's no match!
  • Registration is useful for a lot of things!
    • Object recognition
    • Mapping
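
A minimal ICP sketch with PCL, assuming PointXYZ clouds (method names follow recent PCL releases; older versions use setInputCloud instead of setInputSource):

    #include <pcl/point_types.h>
    #include <pcl/registration/icp.h>

    // Align source onto target; returns false if ICP failed to converge
    // (e.g., the clouds were too far apart to begin with).
    bool alignWithICP(const pcl::PointCloud<pcl::PointXYZ>::Ptr &source,
                      const pcl::PointCloud<pcl::PointXYZ>::Ptr &target,
                      Eigen::Matrix4f &transform)
    {
      pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
      icp.setInputSource(source);   // cloud to be moved
      icp.setInputTarget(target);   // cloud to align against

      pcl::PointCloud<pcl::PointXYZ> aligned;
      icp.align(aligned);           // iterate: match nearest points, solve, apply

      if (!icp.hasConverged())
        return false;
      transform = icp.getFinalTransformation();  // 4x4 rigid transform
      return true;                  // also see icp.getFitnessScore()
    }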