Project Proposal for CS740: Computer Architecture
Fast Block Operation in DRAM
Group Members: Ningning Hu (hnn@cs.cmu.edu)
Jichuan Chang (cjc@cs.cmu.edu)
Project Home Page: http://www.cs.cmu.edu/~hnn/cs740-project.html
Project Description: In a traditional DRAM chip, an entire row of bits is
read into a latch upon a RAS signal (subsequent CAS signals are used to access
individual bits within this row). The latched row values must be written back to
the DRAM row after each access, since reading the row is a destructive
operation. We plan to modify the DRAM chip so that we can specify that the
current contents of the latch be written back to an arbitrary row in the DRAM
cell array. This would be a variation on a RAS signal that causes a write rather
than a read of a row. If we can make all of the DRAM chips in the system do this
simultaneously, we can potentially copy a whole block of bits in just two DRAM
cycles, thus improving system performance greatly. We will also consider other
kinds of operations executed directly in DRAM and try to improve their
performance. Fast block operations in DRAM can move large blocks of data quickly
from one region of memory to another, as well as quickly clear a large block of
memory (e.g., when allocating a new page). We will study how to implement these
functions in DRAM and evaluate the resulting performance improvement.
Plan of Attack: We will first add a special instruction to SimpleScalar to
implement the fast memory copy operation. Our idea is to use this instruction to
replace the inner copy loops of the most frequently used memory routines in the
C library (glibc), such as memcpy() and memset(), so as to take full
advantage of the special block operations in DRAM. When copying consecutive
memory blocks, we will not read the data into registers or the cache, but
instead write it directly back to the destination address using the DRAM
read-write mechanism. We hope this will improve the performance of typical
memory operations and thereby the performance of the whole system (the operating
system also performs memory copy operations frequently).
Schedule:
- Week 1 (Oct. 20 - Oct. 26): Read papers. Install SimpleScalar. Design the
hardware block diagram to implement the fast block DRAM operation.
- Week 2 (Oct. 27 - Nov. 2): Use a typical benchmark to measure the
percentage of block copy operations among all memory accesses.
- Week 3 (Nov. 3 - Nov. 9): Add a special instruction (for fast block copy)
into SimpleScalar's ISA.
- Week 4 (Nov. 10 - Nov. 16): Modify the assembly code of the memcpy
function in SimpleScalar's glibc.
- Week 5 (Nov. 17 - Nov. 23): Evaluate the performance of new memory
system using the same benchmark we used before.
- Week 6 (Nov. 24 - Dec. 4): Write the final project report.
The above tasks will be carried out jointly by Jichuan Chang and Ningning Hu.
Milestone: By Nov. 20, we should have finished implementing the new memory
instruction in the simulator and begun the evaluation.
Literature Search:
- David Patterson, Thomas Anderson, et al. A Case for Intelligent RAM: IRAM.
IEEE Micro, April 1997.
- Rosenblum, M., et al. The Impact of Architectural Trends on Operating
System Performance. In Proceedings of the 15th ACM Symposium on Operating
Systems Principles, Dec. 1995.
- Tulika Mitra. Dynamic Random Access Memory: A Survey. Research Proficiency
Examination Report. SUNY Stony Brook, March 1999.
- J. Carter, W. Hsieh, L. Stoller, et al. Impulse: Building a smarter memory
controller. In Proceedings of the 5th IEEE International Symposium on High
Performance Computer Architecture, Jan. 1999.
- IBM Corp. Synchronous DRAMs: The DRAM of the Future.
- Ars Technica. RAM Guide.
Resources Needed:
- SimpleScalar on Linux.
- Machines: Office PC (PIII 700 MHz)
- Benchmarks: Spec'95
Getting Started: We have already read the related papers on advanced
memory systems and have partly finished installing SimpleScalar (we ran into
some trouble installing SimpleScalar's gcc and glibc on Linux).