Speculation in a Shared Cache
Project Web Page
Proposal: http://www.cs.cmu.edu/afs/cs/user/faulring/15740/project/proposal.html
Project home page: http://www.cs.cmu.edu/afs/cs/user/faulring/15740/project/index.html
Project Description
As processors become smaller, computer architects designing uniprocessors face diminishing returns due to electrical and physical limits. One way around these performance limits is the use of more parallel architectures, perhaps in the form of multiprocessors. The operating system can schedule multiple processes or threads to run concurrently on the different processors. This model requires the programmer to explicitly split the task into separate subtasks. Beyond the inherent difficulty of doing so, existing programs must be rewritten before they can benefit from the multiprocessor architecture.
In a few limited cases, multiprocessors have offered significant speedup: existing compiler and language techniques have successfully parallelized large, regular numeric problems. Unfortunately, most users need solutions to problems that are not of a regular numeric nature.
Thread-Level Data Speculation (TLDS) is a technique that takes traditional sequential programs (often of the irregular, non-numeric kind) and extracts parallel threads from them. Separate loop iterations are scheduled on the processors of a tightly coupled multiprocessor. A modified cache allows the speculative iterations to proceed without affecting the permanent state until all previous iterations have completed.
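To make the idea concrete, the following is a minimal software sketch of speculative loop execution, not the hardware mechanism described in the papers. Every name in it (`Epoch`, `run_speculatively`, `count_key`) is hypothetical. It assumes each iteration buffers its stores privately, records the addresses it loads, and commits in program order; an iteration that loaded an address later written by an earlier iteration is squashed and re-executed.

```python
from typing import Callable, Dict, Hashable, List, Set

Memory = Dict[Hashable, int]


class Epoch:
    """One speculative loop iteration with private read/write tracking."""

    def __init__(self, memory: Memory) -> None:
        self.memory = memory                    # committed (permanent) state
        self.read_set: Set[Hashable] = set()    # addresses loaded from memory
        self.write_buffer: Memory = {}          # stores not yet committed

    def load(self, addr: Hashable) -> int:
        if addr in self.write_buffer:           # forward our own pending store
            return self.write_buffer[addr]
        self.read_set.add(addr)
        return self.memory.get(addr, 0)

    def store(self, addr: Hashable, value: int) -> None:
        self.write_buffer[addr] = value         # permanent state is untouched


def run_speculatively(body: Callable[[Epoch, int], None],
                      iterations: int, memory: Memory) -> int:
    """Run body(epoch, i) for every i and return the number of squashes."""
    # Phase 1: every iteration runs against the same committed state,
    # standing in for iterations that execute in parallel.
    epochs: List[Epoch] = []
    for i in range(iterations):
        epoch = Epoch(memory)
        body(epoch, i)
        epochs.append(epoch)

    # Phase 2: commit in program order, re-executing violated iterations.
    squashes = 0
    committed_writes: Set[Hashable] = set()
    for i, epoch in enumerate(epochs):
        if epoch.read_set & committed_writes:   # it read a stale value
            squashes += 1
            epoch = Epoch(memory)               # now the oldest, so safe
            body(epoch, i)
        memory.update(epoch.write_buffer)
        committed_writes.update(epoch.write_buffer)
    return squashes


# An irregular loop: counts[keys[i]] += 1. Iterations 0 and 2 touch the same key.
keys = [0, 1, 0, 2]


def count_key(epoch: Epoch, i: int) -> None:
    addr = ("count", keys[i])
    epoch.store(addr, epoch.load(addr) + 1)


memory: Memory = {}
squashes = run_speculatively(count_key, len(keys), memory)
```

In this toy run only iteration 2 conflicts with an earlier iteration, so it is the only one re-executed, and the final counts match what a sequential execution of the loop would produce.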
For our project we will investigate performance issues that arise when separate processors share a single cache. Since the simulator almost supports this functionality, we intend to spend the bulk of our time running experiments with the common benchmarks and then analyzing the results.
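One issue a shared cache raises is that several processors may hold different speculative versions of the same line at once. The schedule's phrase "replicating speculatively modified lines within a cache set" suggests keeping those versions in separate ways of one set. The sketch below is only one possible reading of that phrase, not the simulator's actual design: `Line`, `SharedCacheSet`, and all method names are hypothetical, and it does not model read tracking or violation detection.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    tag: int
    data: int
    owner: Optional[int] = None   # None = committed; otherwise the speculating CPU


class SharedCacheSet:
    """One set of a shared cache that may hold several versions of a tag."""

    def __init__(self, ways: int) -> None:
        self.ways = ways
        self.lines: List[Line] = []

    def _find(self, tag: int, owner: Optional[int]) -> Optional[Line]:
        for line in self.lines:
            if line.tag == tag and line.owner == owner:
                return line
        return None

    def fill(self, tag: int, data: int) -> bool:
        """Install or update the committed copy; False if the set is full."""
        line = self._find(tag, None)
        if line is not None:
            line.data = data
            return True
        if len(self.lines) >= self.ways:
            return False
        self.lines.append(Line(tag, data))
        return True

    def read(self, cpu: int, tag: int) -> Optional[int]:
        """Prefer this CPU's speculative version, else the committed one."""
        line = self._find(tag, cpu)
        if line is None:
            line = self._find(tag, None)
        return None if line is None else line.data

    def speculative_write(self, cpu: int, tag: int, data: int) -> bool:
        """Write a private replica; False if no way is free for it."""
        line = self._find(tag, cpu)
        if line is not None:
            line.data = data
            return True
        if len(self.lines) >= self.ways:
            return False          # caller would have to stall or squash
        self.lines.append(Line(tag, data, owner=cpu))
        return True

    def commit(self, cpu: int) -> None:
        """Make this CPU's replicas the committed versions."""
        for line in [l for l in self.lines if l.owner == cpu]:
            old = self._find(line.tag, None)
            if old is not None:
                self.lines = [l for l in self.lines if l is not old]
            line.owner = None

    def squash(self, cpu: int) -> None:
        """Discard this CPU's speculative replicas."""
        self.lines = [l for l in self.lines if l.owner != cpu]


cache_set = SharedCacheSet(ways=4)
cache_set.fill(tag=7, data=10)
cache_set.speculative_write(cpu=0, tag=7, data=11)
cache_set.speculative_write(cpu=1, tag=7, data=12)
views_during = tuple(cache_set.read(cpu, 7) for cpu in (0, 1, 2))

cache_set.squash(1)    # CPU 1's iteration was violated
cache_set.commit(0)    # CPU 0's iteration becomes permanent
views_after = tuple(cache_set.read(cpu, 7) for cpu in (0, 1, 2))
```

The sketch also hints at a cost worth measuring: each replica occupies a way, so heavy speculation on the same lines shrinks the effective associativity available to everything else in that set.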
Logistics
Schedule
Week | Task | Notes
22 Oct | Obtain simulator code; code walk-through with Greg |
29 Oct | Complete code modifications (replicating speculatively modified lines within a cache set) |
05 Nov | Run the standard four benchmarks (buk, compress95, equake, ijpeg) |
12 Nov | Analyze results of benchmarks to determine possible areas of improvement |
19 Nov | Implement any improvements suggested by the earlier experiments | Milestone on 20 Nov
26 Nov | |
03 Dec | | Project due on 04 Dec
Milestone
By this point we plan to have enhanced the simulator code to support a shared cache and to have run the four standard benchmark programs (buk, compress95, equake, and ijpeg) on this simulator.
Literature Search
- A Scalable Approach to Thread-Level Speculation
[PS]
J. Gregory Steffan, Christopher B. Colohan, Antonia Zhai, and Todd Mowry
Proceedings of the 27th International Symposium on Computer Architecture,
June 12-14, 2000, Vancouver, British Columbia, Canada
- Extending Cache Coherence to Support Thread-Level Data Speculation on
a Single Chip and Beyond
[PS]
J. Gregory Steffan, Christopher B. Colohan, and Todd Mowry
Technical Report CMU-CS-98-171, School of Computer Science,
Carnegie Mellon University, December 1998.
- A Low-Overhead Software Approach to Thread-Level Data Dependence Speculation on Multiprocessors [PS]
Peter Rundberg. Technical Report No. 00-13, Department of Computer Engineering, Chalmers University of Technology, July 2000.
- Architectural Support for Scalable Speculative Parallelization in Shared-Memory Multiprocessors [PS]
Marcelo Cintra, José F. Martínez, and Josep Torrellas. ACM Intl. Symp. on Comp. Arch. 2000.
Resources Needed
- Stampede project simulator
Getting Started
So far we have read the two papers cited above to begin familiarizing ourselves
with the work. We have also contacted Todd Mowry and Greg Steffan to schedule a
meeting to discuss the details of the project.