Virtual Views

1 Group Info

Kevin Chang <kevincha@andrew.cmu.edu>
Vivek Seshadri <vseshadr@cs.cmu.edu>

2 Project Web page

3 Problem

Many applications access their data structures using multiple access patterns. However, since data is stored in main memory in only one format, some of these access patterns may have poor locality characteristics due to the way data is laid out in memory. This leads to poor cache performance and DRAM performance for such access patterns.

For example, in a database system, a database is generally stored as a set of tables. Each table is a collection of records, each having values for multiple fields. Databases are commonly stored as a "row-store", where all fields of a record are stored together. Although this is efficient for a query that accesses most of the fields of a record, it leads to poor performance for queries that touch only a few fields. [C-store]
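The difference between the two layouts can be made concrete with a small sketch. The record fields (`id`, `age`, `salary`), the table size, and the function names are illustrative assumptions and do not come from the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RECORDS 4

/* Hypothetical record with three fields (names are illustrative only). */
struct record {
    int32_t id;
    int32_t age;
    int32_t salary;
};

/* Row-store: all fields of one record are adjacent in memory. */
struct row_store {
    struct record rows[NUM_RECORDS];
};

/* Column-store: all values of one field are adjacent in memory. */
struct column_store {
    int32_t id[NUM_RECORDS];
    int32_t age[NUM_RECORDS];
    int32_t salary[NUM_RECORDS];
};

/* Single-field query on the row-store: consecutive accesses are
 * sizeof(struct record) bytes apart, so unused fields are fetched too. */
int64_t sum_salary_rows(const struct row_store *t, size_t n) {
    assert(n <= NUM_RECORDS);
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += t->rows[i].salary;
    return sum;
}

/* Same query on the column-store: consecutive accesses are
 * sizeof(int32_t) bytes apart, so only the needed field is fetched. */
int64_t sum_salary_columns(const struct column_store *t, size_t n) {
    assert(n <= NUM_RECORDS);
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += t->salary[i];
    return sum;
}
```

Both functions compute the same result; they differ only in the stride of the memory accesses, which is the property that affects cache and DRAM behavior.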

4 High-level Solution Approach

In this project, we propose Virtual Views, a new mechanism to address this problem. The key idea is to store multiple "views" of the data structure, each having a different data layout (e.g., multiple organizations of the database). Different views are kept consistent with each other, and accesses to the data structure are directed to the view that is best suited for the access pattern. For example, our mechanism can choose to have a column-based layout of a database, and queries that access only a single column can be sent to the column-organized view of the database. This ensures that the query is serviced quickly and does not transfer unwanted data across the memory hierarchy. As a result, Virtual Views improves the overall system performance and may also improve energy efficiency by avoiding unwanted data movement.
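As a rough illustration of the idea, the sketch below keeps a row-organized and a column-organized view of one small table and serves each kind of read from the view that suits it. The `dual_view_table` type and the `table_*` functions are hypothetical names invented for this sketch. In the proposed system the compiler, not hand-written code, would generate the second view and choose between views.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RECORDS 4
#define NUM_FIELDS 3

/* Hypothetical container holding two views of the same table. */
struct dual_view_table {
    int32_t row_view[NUM_RECORDS][NUM_FIELDS]; /* record-major layout */
    int32_t col_view[NUM_FIELDS][NUM_RECORDS]; /* field-major layout */
};

/* Fill both views from one record-major source array so they start consistent. */
void table_load(struct dual_view_table *t, const int32_t *src) {
    for (size_t r = 0; r < NUM_RECORDS; r++) {
        for (size_t f = 0; f < NUM_FIELDS; f++) {
            int32_t v = src[r * NUM_FIELDS + f];
            t->row_view[r][f] = v;
            t->col_view[f][r] = v;
        }
    }
}

/* Whole-record access: directed to the row view (one contiguous record). */
const int32_t *table_get_record(const struct dual_view_table *t, size_t r) {
    assert(r < NUM_RECORDS);
    return t->row_view[r];
}

/* Single-field scan: directed to the column view (one contiguous column). */
int64_t table_sum_field(const struct dual_view_table *t, size_t f) {
    assert(f < NUM_FIELDS);
    int64_t sum = 0;
    for (size_t r = 0; r < NUM_RECORDS; r++)
        sum += t->col_view[f][r];
    return sum;
}
```

The cost of this design is the extra memory for the second view and the extra work on every write needed to keep the views consistent.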

5 Key Challenges

There are several key challenges in realizing Virtual Views.

5.1 Identifying Opportunity for Virtual Views

This can be accomplished in two ways: 1) The application itself can explicitly specify the existence of multiple access patterns for a data structure. The compiler can use this information and generate code that creates multiple views that are kept consistent. 2) The compiler can perform static analysis of the program (possibly including profiling) and automatically identify opportunities for creating Virtual Views.
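One possible shape for the first option is a source-level annotation that a compiler pass could later look for. The sketch below uses Clang's existing `annotate` attribute, which stores an arbitrary string in the emitted LLVM IR. The `VIRTUAL_VIEW` macro, the tag string `"virtual_view:column"`, and the `employees` table are assumptions made for illustration. The proposal does not define an interface, and no pass that interprets this tag is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker macro. Under Clang it expands to the `annotate`
 * attribute; under other compilers it expands to nothing. */
#if defined(__clang__)
#define VIRTUAL_VIEW(tag) __attribute__((annotate(tag)))
#else
#define VIRTUAL_VIEW(tag)
#endif

#define NUM_RECORDS 4

struct record {
    int32_t id;
    int32_t age;
    int32_t salary;
};

/* The application declares that this table is also accessed column-wise.
 * The tag string is an invented convention, not an existing LLVM feature. */
VIRTUAL_VIEW("virtual_view:column")
struct record employees[NUM_RECORDS];

/* Ordinary application code stays unchanged; a compiler pass would be
 * responsible for creating the extra view and keeping it consistent. */
void set_salary(size_t i, int32_t value) {
    assert(i < NUM_RECORDS);
    employees[i].salary = value;
}
```

Without a pass that recognizes the tag, the annotation has no effect on program behavior, so the annotated program still compiles and runs as plain C.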

5.2 Maintaining Consistency Between Different Views

This part requires the compiler to analyze the program and identify stores to data locations that have multiple copies (views). The compiler should then issue multiple stores so as to keep the data consistent across views (store amplification). This also requires the compiler to be able to compute the view-space location of the data being modified.
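A minimal sketch of store amplification follows, assuming a table with a fixed number of fixed-size fields so that the view-space location is a simple index transformation. The array names and the functions `view_space_location` and `amplified_store` are hypothetical. The sketch writes by hand the code that the compiler would be expected to emit in place of a single store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RECORDS 4
#define NUM_FIELDS 3

/* Two views of the same table, stored as flat arrays. */
static int32_t row_view[NUM_RECORDS * NUM_FIELDS]; /* index = record * NUM_FIELDS + field */
static int32_t col_view[NUM_FIELDS * NUM_RECORDS]; /* index = field * NUM_RECORDS + record */

/* Map an address in the row view to the matching address in the column view.
 * Assumes row_addr points at an element of row_view. */
int32_t *view_space_location(int32_t *row_addr) {
    ptrdiff_t offset = row_addr - row_view;
    assert(offset >= 0 && offset < NUM_RECORDS * NUM_FIELDS);
    size_t record = (size_t)offset / NUM_FIELDS;
    size_t field = (size_t)offset % NUM_FIELDS;
    return &col_view[field * NUM_RECORDS + record];
}

/* Amplified store: one logical store becomes one store per view. */
void amplified_store(int32_t *row_addr, int32_t value) {
    *row_addr = value;
    *view_space_location(row_addr) = value;
}
```

For example, `amplified_store(&row_view[2 * NUM_FIELDS + 1], 77)` updates field 1 of record 2 in both views. The static analysis described above is needed to decide which stores in the program may target a multi-view data structure and therefore require this treatment.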

6 Goals

Target | Description
75% Goal | The application explicitly specifies the opportunities for views (Point 1, Challenge 1). The goal is to have the compiler do the rest.
100% Goal | Compiler automatically identifies the opportunities for Virtual Views.
125% Goal | How much performance improvement can be obtained if the hardware is made aware of the existence of Virtual Views.

7 Plan of attack

Timeline | Mini goals
Week 1 | Create micro-benchmarks that can benefit from Virtual Views. Also identify real benchmarks which can possibly benefit from Virtual Views. Ensure that the LLVM framework has sufficient features to accomplish the goals of this project.
Week 2 | Develop the infrastructure through which an application can communicate the Virtual Views to the compiler.
Week 3 | Compiler static analysis to determine stores that need to be amplified.
Week 4 | Code generation and testing (75% goal) (Milestone)
Week 5 | Work on static analysis to automatically identify opportunities for Virtual Views.
Week 6 | Test and compare performance of the two different approaches.

8 Resources Needed

We hope that the LLVM framework has sufficient features to accomplish our goals. We will ensure this during the first week of our project.