Major Changes
Most of our major changes have resulted from our meetings with Intel labs and from becoming more familiar with the challenges that parallelizing ODE poses. First, we are now focusing more on multi-threading than on multi-processor systems. Second, on the recommendation of Intel labs, we have shifted toward profiling the code, identifying ways of parallelizing each section, and examining the inherent difficulties, rather than attempting to parallelize several different sections.
Our parallelization focus will probably remain on the collision detection portions of ODE.
Accomplished So Far
We acquired more interesting test cases from Intel, which showed that most of the running time goes to one of two places: collision detection (collide2() and collide()) or a solver for the linear complementarity problem (LCP). In particular, the DPR-related test cases spend almost all of their time in collision detection, while sparser models for more general robots spend a good chunk of time in each. This reaffirmed our plan to focus on parallelizing collision detection, since it is one of the major slowdowns in ODE. We have made an attempt at parallelizing collide2() and have identified how to parallelize collide() in a similar fashion. We have also found a parallel algorithm that may speed up the solver, which ODE has to run frequently, though it appears a full parallel implementation of LCP is outside the scope of our project.
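The general idea of splitting collision tests across threads can be sketched as follows. This is a minimal illustration under stated assumptions, not the code inside collide2() or collide(): the Box type, the overlaps() test, and find_contacts() are hypothetical names, and the candidate pairs are assumed to be independent of one another.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical axis-aligned bounding box; ODE's own geometry types differ.
struct Box {
    double min[3];
    double max[3];
};

// Two boxes overlap only if their intervals overlap on every axis.
bool overlaps(const Box& a, const Box& b) {
    for (int k = 0; k < 3; ++k) {
        if (a.max[k] < b.min[k] || b.max[k] < a.min[k]) return false;
    }
    return true;
}

using Pair = std::pair<std::size_t, std::size_t>;

// Tests every candidate pair, splitting the pair list into contiguous
// chunks, one per thread. Each thread appends only to its own vector,
// so no locking is needed.
std::vector<Pair> find_contacts(const std::vector<Box>& boxes,
                                const std::vector<Pair>& candidates,
                                unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<std::vector<Pair>> local(num_threads);
    std::vector<std::thread> workers;
    const std::size_t chunk =
        (candidates.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(candidates.size(), t * chunk);
        const std::size_t end = std::min(candidates.size(), begin + chunk);
        workers.emplace_back([&boxes, &candidates, &local, t, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                const Pair& p = candidates[i];
                if (overlaps(boxes[p.first], boxes[p.second])) {
                    local[t].push_back(p);
                }
            }
        });
    }
    for (std::thread& w : workers) w.join();

    // Merge in thread order so the output order matches a serial loop.
    std::vector<Pair> contacts;
    for (const auto& part : local) {
        contacts.insert(contacts.end(), part.begin(), part.end());
    }
    return contacts;
}
```

Keeping a separate result vector per thread avoids contention on a shared list, at the cost of a short serial merge at the end.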
Meeting Our Milestone
We have almost met our milestone of having one version parallelized and another ready to go: we have an implementation of one of the two collision functions. The profiling took more time than we expected.
Surprises
ODE is a large system, not very well documented, and the programs available for profiling do not produce easy-to-understand output. Profiling and identifying areas strongly suited for parallelization has been more time-consuming than expected. Also, some data structures used in ODE are strongly serial (for instance, walking a linked list), which makes parallelization more difficult.
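One common workaround for a serial linked-list walk, shown here only as a sketch, is to copy the node pointers into an array in one cheap serial pass and then divide the array among threads. The Node type and the function names below are hypothetical and do not correspond to ODE's actual structures; the sketch also assumes the per-node work is independent and that the callable is safe to invoke from several threads at once.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical list node; not ODE's actual data structure.
struct Node {
    double value;
    Node* next;
};

// Serial pass: walking the list is inherently sequential, but it only
// copies pointers, so it is cheap relative to the per-node work.
std::vector<Node*> flatten(Node* head) {
    std::vector<Node*> nodes;
    for (Node* n = head; n != nullptr; n = n->next) nodes.push_back(n);
    return nodes;
}

// Parallel pass: thread t handles indices t, t + T, t + 2T, ...
// Each node is visited by exactly one thread, so no locks are needed
// as long as work() touches only the node it is given.
template <typename Fn>
void parallel_for_each_node(Node* head, unsigned num_threads, Fn work) {
    assert(num_threads > 0);
    const std::vector<Node*> nodes = flatten(head);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&nodes, &work, t, num_threads] {
            for (std::size_t i = t; i < nodes.size(); i += num_threads) {
                work(*nodes[i]);
            }
        });
    }
    for (std::thread& w : workers) w.join();
}
```

This only pays off when the work per node outweighs the cost of the extra serial pass and of starting the threads.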
Revised Schedule
Today: November 20--project milestone
Obtain preliminary results from at least one completed parallelization method. Also, another method of parallelization should be at least one week into being coded; we have identified this method but not yet coded it.
The rest of our schedule:
1 week--November 27
Complete the second method of parallelization and modify it as necessary. Iterate: run on dual-core and 8-way machines, test new parallelization attempts, and repeat.
2 weeks--December 4
Writeup is due. All results have been obtained and analyzed.
Resources Needed
Our first pass at double-checking our parallelized collision detection implementations will probably run on our laptops (dual-core machines) and desktops (quad-core machines). After that, we think access to 8-way machines will be sufficient to test the effects of parallelization.
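Comparing runs across machines with different core counts could be driven by a small timing harness like the sketch below. The helper names (time_seconds, speedup, thread_counts) are hypothetical and are not part of ODE or of our current code; the harness measures wall-clock time for a single call and leaves repetition and averaging to the caller.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Wall-clock time, in seconds, of one call to fn.
template <typename Fn>
double time_seconds(Fn fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Speedup of a parallel run relative to the serial baseline.
double speedup(double serial_seconds, double parallel_seconds) {
    assert(parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}

// Thread counts to try: 1, 2, 4, ... up to max_threads inclusive.
std::vector<unsigned> thread_counts(unsigned max_threads) {
    std::vector<unsigned> counts;
    for (unsigned n = 1; n <= max_threads; n *= 2) {
        counts.push_back(n);
        if (n > max_threads / 2) break;  // stop before n * 2 could overflow
    }
    return counts;
}
```

On an 8-way machine, thread_counts(8) yields 1, 2, 4, and 8, so each run can be timed with time_seconds and compared against the single-threaded baseline with speedup.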