Project Proposal for 15-740/18-740: Computer Architecture

Exploiting Thread Motion on a CMP with Private L1 Caches

Group Members: Athula Balachandran (abalacha@cs.cmu.edu)
               Lavanya Subramanian (lsubrama@andrew.cmu.edu)
Project Home Page: http://www.cs.cmu.edu/~abalacha/15740/15740_project.html

Introduction:
Dynamic Voltage and Frequency Scaling (DVFS) is a traditional technique for exploiting run-time variability to conserve power with minimal performance degradation. DVFS is typically applied at OS scheduler intervals. However, recent research shows that application behaviour varies at a finer granularity, which DVFS at OS scheduler intervals cannot exploit effectively. Applying DVFS at fine-grained intervals imposes a huge delay overhead for regulator voltage-level transitions and is practically impossible with off-chip regulators. In this light, [1] proposes a scheme in which different cores are assigned different voltage/performance levels and threads are moved between cores according to the applications' performance requirements. The authors call this mechanism Thread Motion. Our project looks into the challenges and bottlenecks in applying this mechanism to a generic chip multiprocessor.
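The overhead argument above can be illustrated with a small back-of-the-envelope sketch. The function name and the numbers in the usage note are assumptions chosen for illustration, not measurements from [1]; the model also assumes one voltage transition per interval and no useful work while the regulator settles.

```python
def dvfs_overhead_fraction(transition_us: float, interval_us: float) -> float:
    """Fraction of each DVFS interval lost to a voltage transition.

    Assumes (hypothetically) one transition per interval and that the
    core does no useful work while the voltage settles.
    """
    if interval_us <= 0:
        raise ValueError("interval_us must be positive")
    if transition_us < 0:
        raise ValueError("transition_us must be non-negative")
    # The loss cannot exceed the whole interval.
    return min(1.0, transition_us / interval_us)
```

With an assumed 10 us transition, a 10 ms scheduler interval loses about 0.1% of its time to the transition, while a 1 us interval would be consumed entirely by it.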

Project Description:
In [1], the authors employ an architecture similar to the Sun ROCK processor. This architecture groups processors into clusters, and the processors in a cluster share an L1 cache. Migrations performed within a cluster therefore do not suffer from missing L1 cache data. However, in most chip multiprocessor systems, each processor has a private L1 cache. We therefore aim to explore the effectiveness of the Thread Motion scheme in this scenario. Specifically, we would like to quantify the performance degradation that results from L1 misses after a migration. We observe that the distinction between intra-cluster and inter-cluster migration does not apply in this scenario.
Migration to a far-off core would also result in increased L2 access latency. We also plan to fine-tune the migration algorithm/strategy to minimize this.
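The two costs described above (cold L1 misses after a migration, and extra L2 latency for distant cores) can be sketched as a simple first-order cost model. This is only an illustrative sketch, not the migration algorithm of [1] or our final design. All names (Core, hop_distance, migration_penalty_cycles, should_migrate) and all default latencies are hypothetical. It assumes a 2D mesh layout, that every line of the working set misses once in the destination L1 and is served from an L2 slice near the source core, and that the work in an interval takes the same number of cycles on either core.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    x: int           # column in an assumed 2D mesh
    y: int           # row in an assumed 2D mesh
    freq_ghz: float  # fixed voltage/frequency level of this core


def hop_distance(a: Core, b: Core) -> int:
    """Manhattan distance between two cores on the assumed mesh."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def migration_penalty_cycles(working_set_lines: int,
                             l2_latency_cycles: int,
                             hops: int,
                             per_hop_cycles: int) -> int:
    """Cold-L1 refill cost: each working-set line is assumed to miss once
    in the destination L1 and to be fetched from L2 over `hops` hops."""
    return working_set_lines * (l2_latency_cycles + hops * per_hop_cycles)


def should_migrate(src: Core, dst: Core, work_cycles: int,
                   working_set_lines: int,
                   l2_latency_cycles: int = 20,  # assumed value
                   per_hop_cycles: int = 2       # assumed value
                   ) -> bool:
    """Migrate only if the interval finishes sooner on dst, after paying
    the refill penalty, than it would by staying on src."""
    hops = hop_distance(src, dst)
    penalty = migration_penalty_cycles(
        working_set_lines, l2_latency_cycles, hops, per_hop_cycles)
    time_stay_ns = work_cycles / src.freq_ghz
    time_move_ns = (work_cycles + penalty) / dst.freq_ghz
    return time_move_ns < time_stay_ns
```

Under these assumptions, a thread on a 1 GHz core with a 100-line working set and 3000 cycles of work benefits from moving to an adjacent 2 GHz core (1 hop), but not to a 2 GHz core 6 hops away, because the larger refill penalty outweighs the frequency gain.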

Related Work:
[1] looks at migration at fine-grained intervals, as described above. Previous work performs migration either at OS intervals ([2], for process variation-aware application mapping combined with DVFS) or during thermal hotspots/emergencies ([3]). Apart from [1], there is no work, to our knowledge, that looks at migration at intervals finer than the OS scheduling interval.

Resources:

Schedule:

We will both work on the design and implementation of the thread motion manager in the simulator and on the evaluation process. Once the design is complete, we may modularize the implementation and divide the work between the two of us.

Milestone: Preliminary Evaluation with Thread Motion Manager implemented

Getting Started: We have gone through the existing literature in this area. We have also collected some of the resources required for the project.

References: