We present a method of semi-autonomous teleoperation that allows a vehicle to accurately traverse hilly terrain while communicating with the operator across a very low-bandwidth link. The operator plots the vehicle's trajectory on a single 2-D image, and the transformation of 2-D image points to 3-D world points is done in real time on the vehicle. Traditional flat-earth geometry models do not work well on real-world terrain. In contrast, STRIPE models the world as a collection of polygons rather than as a single plane. As the vehicle moves through the world, STRIPE continually adds new polygons to its internal world model; each new polygon is derived by sensing the orientation of the patch of ground beneath the vehicle's wheels as it moves. The projection onto the ground of a path designated in a 2-D image becomes increasingly accurate as the world model is incrementally refined. While points far ahead of the vehicle will still be imprecisely projected, the incremental polygonal representation almost always gives an adequate 3-D projection for the next few points used in steering the vehicle. The STRIPE method is equally applicable to on-road and off-road terrain, and can also be used for on-line mapping of the local terrain.
In STRIPE, a single image is transmitted from the vehicle to the base station. The teleoperator uses a mouse or a joystick to pick a sequence of points in the image that the vehicle should follow. This sequence of 2-D points is transmitted back to the vehicle, and the vehicle uses the points together with the incremental polygonal earth technique (described below) to project the points onto the 3-D terrain and follow the desired path. When the vehicle has moved a certain distance or has reached the end of the path of points, it transmits another image back to the base station and the process is repeated.
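The cycle described above can be sketched as a small simulation. The `Vehicle` and `BaseStation` classes and their methods below are invented placeholders for illustration only; the real system involves a camera, a radio link, and a human operator.

```python
class BaseStation:
    def pick_points(self, image):
        # Stand-in for the operator clicking a sequence of 2-D waypoints
        # in the transmitted image with a mouse or joystick.
        return [(320, 400), (330, 350), (345, 300)]

class Vehicle:
    def __init__(self):
        self.distance_driven = 0.0

    def digitize_image(self):
        return "frame"                      # placeholder for a camera image

    def follow(self, points_2d):
        # In STRIPE this would project each 2-D point onto the terrain
        # model and steer toward it; here we just log simulated progress,
        # pretending each waypoint is about one meter apart.
        self.distance_driven += 1.0 * len(points_2d)

def stripe_cycle(vehicle, station):
    image = vehicle.digitize_image()        # one image over the low-bandwidth link
    points = station.pick_points(image)     # operator designates the path
    vehicle.follow(points)                  # on-board 2-D -> 3-D projection + steering

v = Vehicle()
s = BaseStation()
for _ in range(3):                          # repeat as the vehicle advances
    stripe_cycle(v, s)
```

In the real system the next image can be transmitted while the vehicle is still driving out the previous set of points, so the cycles overlap rather than running strictly in sequence.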
The STRIPE technique is particularly useful for a teleoperated vehicle controlled over a low-bandwidth link. Because images of the terrain typically contain a large amount of data, they can be transmitted to the base station only infrequently, and the system must make reasonable progress despite this. With the STRIPE system, a single image from the vehicle allows considerable progress along an operator-defined path before the next image is needed. In addition, once the first path has been defined, image transmission can occur in parallel with path following: a new image can be transmitted while the vehicle is only part of the way along the old path, so the vehicle continues to make forward progress during the transmission.
Given a stereo pair of images, it is mathematically possible to compute a 3-D description of the world, although computationally this is still a hard problem. One system similar to STRIPE [Lescoe 91] uses the human visual system to compute the 3-D geometry. The operator visually fuses the image pair, perhaps by wearing shuttered glasses, and then picks 3-D points using a 3-D mouse. Of course, if we are concerned about transmission time because we are using a limited-bandwidth link, then transmission of an image pair would take about twice as long as the transmission of a single image, and this delay may be unacceptably long.
Given only a single image, the "flat-earth" assumption is a simple way to constrain the problem enough to make the transformation from 2-D to 3-D points well defined. This method assumes that all points in the world lie on a single plane, known from the robot and camera geometry, so the transformation from 2-D to 3-D is straightforward. Unfortunately, using the flat-earth assumption to follow roads in a world that is not actually flat generally gives poor results [DeMenthon 86].
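The flat-earth transformation amounts to casting a ray through the chosen pixel and intersecting it with an assumed ground plane. The sketch below shows this with the plane z = 0; the camera parameters (focal length, principal point, mounting height, pitch) are invented example values, not those of any particular vehicle.

```python
import numpy as np

def flat_earth_project(u, v, f=500.0, cx=320.0, cy=240.0,
                       cam_height=2.0, pitch=0.2):
    """Map pixel (u, v) to a 3-D point on the assumed ground plane z = 0.

    Camera frame: x right, y down, z forward (optical axis).
    World frame: x forward, y left, z up; camera at height cam_height,
    pitched down by `pitch` radians.
    """
    # Viewing ray through the pixel, in the camera frame.
    ray_cam = np.array([(u - cx) / f, (v - cy) / f, 1.0])
    # Axis relabeling from the camera frame to the world frame.
    R0 = np.array([[0.0, 0.0, 1.0],
                   [-1.0, 0.0, 0.0],
                   [0.0, -1.0, 0.0]])
    # Pitch the optical axis downward (rotation about the world y axis).
    c, s = np.cos(pitch), np.sin(pitch)
    Ry = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    ray_w = Ry @ R0 @ ray_cam
    if ray_w[2] >= 0:
        raise ValueError("ray does not hit the ground plane")
    t = cam_height / -ray_w[2]              # distance along the ray to z = 0
    origin = np.array([0.0, 0.0, cam_height])
    return origin + t * ray_w
```

Note that pixels lower in the image map to nearer ground points, which is why an uphill or downhill road breaks the assumption: the true 3-D point is no longer where the ray meets the single assumed plane.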
Figure 1 contains an example showing a particularly hazardous road. The road travels straight ahead, down a steep hill, and then bends off to the right. Figure 1a is a view of the road from off to the side, and Figure 1b is a view from directly above the road. If the flat-earth assumption were used, the length of the road before the turn would be underestimated, and a vehicle using that method would take the path indicated by the gray line, with potentially disastrous results.
As we have demonstrated, it is unreasonable to assume that the entire road lies on a single plane. However, it is usually safe to assume that the stretch of road immediately in front of the vehicle lies on approximately the same ground plane as the vehicle itself.
This leads to an obvious algorithm:
This algorithm was shown to work well in simulation, but it is not very practical: step 2 is slow, especially with a human operator in the loop, so the algorithm cannot run in real time.
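The local-ground-plane idea above can be sketched as two small steps: fit a plane to the vehicle's wheel contact points, then intersect the viewing ray for an image point with that plane instead of a global flat earth. This is a toy illustration, not the STRIPE implementation; the wheel positions and ray below are made-up example values.

```python
import numpy as np

def plane_from_wheels(contacts):
    """Least-squares plane n.x = d through the wheel contact points."""
    pts = np.asarray(contacts, dtype=float)
    centroid = pts.mean(axis=0)
    # The plane normal is the smallest singular vector of the centered points.
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]
    if n[2] < 0:
        n = -n                              # orient the normal upward
    return n, float(n @ centroid)

def intersect_ray(origin, direction, n, d):
    """Point where the ray origin + t * direction meets the plane n.x = d."""
    t = (d - n @ origin) / (n @ direction)
    return origin + t * direction

# Four wheel contacts on a gentle slope rising 0.1 m per meter of travel.
wheels = [(0, 0, 0.0), (0, 1, 0.0), (1, 0, 0.1), (1, 1, 0.1)]
n, d = plane_from_wheels(wheels)
# A viewing ray pointing straight down from above the vehicle.
hit = intersect_ray(np.array([0.5, 0.5, 2.0]), np.array([0.0, 0.0, -1.0]), n, d)
```

As the vehicle drives, each newly sensed plane patch becomes another polygon in the world model, so points just ahead of the vehicle are always projected onto a locally accurate surface.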
STRIPE uses a similar, but much faster, three-part algorithm:
With this approach, the points are chosen in the B module, while the path following is done in the C module. Road following can continue at high speed so long as there are some points remaining in the C module that are ahead of the vehicle. As soon as new points are chosen, they are transmitted from B to C, and C can immediately begin to follow these new points instead.
The transformation between any vehicle coordinate frame and its corresponding camera coordinate frame is constant, since the camera is fixed on the vehicle; we will refer to it as C.
Assume that we know the transformation between some world coordinate frame, w, and the vehicle position, a, where the image was taken (T_wa), and the transformation between the same world coordinate frame and the current vehicle location, b (T_wb).
So first we use C to put the point in a's coordinates. Next we use T_wa to put the point in world coordinates, and finally the inverse of T_wb transforms the point out of world coordinates and into b's coordinate frame. Summarizing, if p is a point in the camera's coordinate frame, and q is the point in b's coordinates, we have:
q = T_wb^(-1) * T_wa * C * p
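The chain of transformations just described can be sketched with 4x4 homogeneous matrices. The symbol names (C for the fixed camera-to-vehicle transform, T_wa and T_wb for the vehicle-to-world transforms at poses a and b) and all numeric values below are illustrative, not taken from the STRIPE implementation.

```python
import numpy as np

def make_transform(yaw, tx, ty, tz):
    """Homogeneous transform: rotate about z by `yaw`, then translate."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = [tx, ty, tz]
    return T

C = make_transform(0.0, 1.0, 0.0, 1.5)       # camera -> vehicle (fixed mount)
T_wa = make_transform(0.0, 10.0, 5.0, 0.0)   # pose a (image taken) -> world
T_wb = make_transform(0.3, 12.0, 6.0, 0.0)   # pose b (current) -> world

p = np.array([0.0, 2.0, 0.0, 1.0])           # a point in the camera frame
q = np.linalg.inv(T_wb) @ T_wa @ C @ p       # the same point in b's frame
```

Because only the composition of relative transforms matters, any choice of world origin gives the same q, which is why relative position data is sufficient.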
Note that relative position information, such as data from an INS, is sufficient for the STRIPE system. No global positioning information is necessary; the actual location of the origin of the world coordinate frame is irrelevant.
Notice how the road weaves and banks as the ground plane changes. The vehicle remains well centered on the road, and does a good job of mapping the terrain already covered. Although STRIPE's initial estimate, in Figure 2, of the position of the last rung shown in Figure 3 may not be very accurate, the estimate improves as the vehicle approaches that spot, and the path is followed accurately, as can be seen in STRIPE's mapping of the road.
Figure 4 is an example of an image that is not very useful: because the view of the road is so limited, only a very short path can be picked. While a human driver can adjust his gaze to gain a better view of the road, STRIPE currently has to make do with the occasional image like that of Figure 4. Because of the bandwidth constraint, the vehicle itself would have to recognize that it should pan the camera before digitizing an image. We intend to investigate this problem in the future.
STRIPE is currently being tested on the Carnegie Mellon Navlab vehicle [Thorpe 91]. Arguably the hardest problem for the current version of STRIPE is that the camera is mounted in a fixed position relative to the vehicle. Computationally, adding transformations to cope with a new camera position is trivial; deciding how to move the camera is less so.
[Lescoe 91] P. Lescoe, D. Lavery, and R. Bedard. Navigation of Military and Space Unmanned Ground Vehicles in Unstructured Terrains. Third Conference on Military Robotic Applications, September 1991.
[Thorpe 91] C. Thorpe, M. Hebert, T. Kanade, and S. Shafer. Toward Autonomous Driving: The CMU Navlab. IEEE Expert, Vol. 6, No. 4, August 1991.
Figure 2. At the start of the road.
Figure 3. About 8 meters along.
Figure 4. Not much road visible.
Figure 5. Further along the road.