Project Proposal for CS740: Computer Architecture
Fast Block Operation in DRAM
Group Members: Ningning Hu (hnn@cs.cmu.edu), Jichuan Chang (cjc@cs.cmu.edu)
Project Home Page: http://www.cs.cmu.edu/~hnn/cs740-project.html
Project Description: In a traditional DRAM chip,
an entire row of bits is read into a latch upon a RAS signal (subsequent
CAS signals are used to access individual bits within this row). The latched
row values must be written back to the DRAM row after each access, since
reading the row is a destructive operation. We plan to modify the DRAM
chip so that we can specify that the current contents of the latch be written
back to an arbitrary row in the DRAM array. This would be a variation on
the RAS signal that causes a write, rather than a read, of a row. If we can
make all of the DRAM chips in the system do this simultaneously, we can
potentially copy a whole block of bits in just two DRAM cycles, thus improving
system performance greatly. We will also consider other kinds of operations
executed directly in DRAM and try to improve their performance. Fast block
operations in DRAM can move large blocks of data quickly from one region
of memory to another, as well as quickly clear a large block of memory
(e.g., when allocating a new page). We will study how to implement these
functions in DRAM and evaluate the performance improvement.
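To make this mechanism concrete, below is a minimal C sketch of the proposed
primitive, modeled at the level of a single DRAM bank. All names here
(dram_bank_t, ras_read, ras_write, and the row geometry) are hypothetical
illustrations, not an actual chip interface: a RAS read latches the source
row, and the proposed RAS-write variant drives the latch back into an
arbitrary destination row, so a whole-row copy costs two DRAM cycles and
no data crosses the memory bus.

#include <stdint.h>
#include <string.h>

#define ROWS      4096            /* rows per bank (assumed geometry) */
#define ROW_BYTES 1024            /* bytes latched per RAS (assumed) */

typedef struct {
    uint8_t cells[ROWS][ROW_BYTES];
    uint8_t latch[ROW_BYTES];     /* sense-amplifier row buffer */
} dram_bank_t;

/* RAS read: destructively reads a row into the latch (cycle 1). */
static void ras_read(dram_bank_t *b, unsigned row) {
    memcpy(b->latch, b->cells[row], ROW_BYTES);
}

/* Proposed RAS-write variant: writes the latch back to any row (cycle 2). */
static void ras_write(dram_bank_t *b, unsigned row) {
    memcpy(b->cells[row], b->latch, ROW_BYTES);
}

/* Row copy: two DRAM cycles, independent of how wide the row is. */
static void dram_row_copy(dram_bank_t *b, unsigned src, unsigned dst) {
    ras_read(b, src);
    ras_write(b, dst);
}

/* Block clear: fill the latch once with zeros, then write it to each
 * target row (e.g., when the OS zeroes a newly allocated page). */
static void dram_rows_clear(dram_bank_t *b, unsigned first, unsigned count) {
    memset(b->latch, 0, ROW_BYTES);
    for (unsigned r = first; r < first + count; r++)
        ras_write(b, r);
}

With all chips in a rank performing the copy in lockstep, the effective
block size is the row size multiplied by the number of chips.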
Plan of Attack: We will first add a special instruction to SimpleScalar
to implement the fast memory copy operation. Our idea is to use this
instruction inside the most heavily used memory routines in the C library
(glibc), such as memcpy() and memset(), so as to take full advantage of the
special block operations in DRAM. When copying consecutive memory blocks,
we will not read the data into registers or the cache; instead, the data
will be written directly to the destination address using the DRAM
read-write mechanism. We hope this will improve the performance of typical
memory operations and thereby the performance of the whole system (the
operating system also performs memory copy operations frequently).
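As a rough sketch of how the library side might look, the routine below
dispatches row-aligned, row-sized spans to the new instruction and falls
back to an ordinary byte copy otherwise. The mnemonic blkcpy, the
SIMPLESCALAR_BLKCPY guard, and the row size are hypothetical placeholders
for whatever we finally add to SimpleScalar's ISA:

#include <stddef.h>
#include <stdint.h>

#define DRAM_ROW_BYTES 1024   /* assumed row size; fixed by the target chip */

void *fast_memcpy(void *dst, const void *src, size_t n) {
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Use the DRAM block copy only while both pointers are row-aligned
     * and at least one full row remains to be copied. */
    while (n >= DRAM_ROW_BYTES &&
           (uintptr_t)d % DRAM_ROW_BYTES == 0 &&
           (uintptr_t)s % DRAM_ROW_BYTES == 0) {
#ifdef SIMPLESCALAR_BLKCPY
        /* Hypothetical new instruction: copies one DRAM row from *s to
         * *d in two DRAM cycles, bypassing registers and the cache. */
        __asm__ volatile ("blkcpy %0, %1" : : "r"(d), "r"(s) : "memory");
#else
        for (size_t i = 0; i < DRAM_ROW_BYTES; i++)  /* portable fallback */
            d[i] = s[i];
#endif
        d += DRAM_ROW_BYTES;
        s += DRAM_ROW_BYTES;
        n -= DRAM_ROW_BYTES;
    }
    while (n--)               /* remaining bytes: ordinary software copy */
        *d++ = *s++;
    return dst;
}

Restricting the fast path to row-aligned, row-sized spans keeps the
fallback simple while still capturing large copies such as page moves.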
Schedule:
- Week 1 (Oct. 20 - Oct. 26): Read papers. Install SimpleScalar. Design the hardware block diagram to implement the fast block DRAM operation.
- Week 2 (Oct. 27 - Nov. 2): Use a typical benchmark to measure the percentage of block copy operations among all memory access operations.
- Week 3 (Nov. 3 - Nov. 9): Add a special instruction (for fast block copy) to SimpleScalar's ISA.
- Week 4 (Nov. 10 - Nov. 16): Modify the assembly code of the memcpy function in SimpleScalar's glibc.
- Week 5 (Nov. 17 - Nov. 23): Evaluate the performance of the new memory system using the same benchmark as before.
- Week 6 (Nov. 24 - Dec. 4): Write the final project report.
All of the above tasks will be done jointly by Jichuan Chang and Ningning Hu.
Milestone: By Nov. 20, we should have finished implementing the new memory
instruction in the simulator and should be well into the evaluation.
Literature Search:
- David Patterson, Thomas Anderson, et al. A Case for Intelligent RAM: IRAM. IEEE Micro, April 1997.
- M. Rosenblum, et al. The Impact of Architectural Trends on Operating System Performance. 15th ACM Symposium on Operating Systems Principles, Dec. 1995.
- Tulika Mitra. Dynamic Random Access Memory: A Survey. Research Proficiency Examination Report, SUNY Stony Brook, March 1999.
- J. Carter, W. Hsieh, L. Stoller, et al. Impulse: Building a Smarter Memory Controller. Proceedings of the 5th IEEE International Symposium on High Performance Computer Architecture, Jan. 1999.
- IBM Corp. Synchronous DRAMs: The DRAM of the Future.
- Ars Technica. RAM Guide.
Resources Needed:
- SimpleScalar on Linux.
- Machines: office PC, PIII 700 MHz.
- Benchmarks: SPEC'95.
Getting Started: We have already read the related papers on advanced
memory systems and have finished part of the SimpleScalar installation
(we ran into some trouble installing SimpleScalar's gcc and glibc on Linux).