Abstract
Communication latency is a central concern in multiprocessor design. This study presents the design principles of the EM-X distributed-memory multiprocessor for tolerating communication latency. The EM-X tolerates latency by using multithreading to overlap computation with communication. In particular, we present two types of hardware support for remote memory access: (1) priority-based packet scheduling for thread invocation, and (2) direct remote memory access. The priority-based scheduling policy extends a FIFO-ordered thread invocation policy to adapt to different computational needs. Direct remote memory access is designed to overlap remote memory operations with thread execution. An 80-processor prototype of the EM-X has been developed and has been operational since December 1995. We execute several programs on the machine and evaluate how effectively the EM-X overlaps computation with communication to tolerate communication latency in high-performance parallel computing.
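The idea of extending FIFO-ordered thread invocation with priorities can be illustrated in software. The following is only a minimal sketch, not the EM-X hardware design: it assumes two priority levels, and the names `PacketScheduler`, `enqueue`, and `next_packet` are hypothetical. Packets are served in FIFO order within a level, and a waiting high-priority packet is always served before any low-priority one.

```python
from collections import deque


class PacketScheduler:
    """Hypothetical two-level packet queue (an assumption for illustration,
    not the EM-X implementation): FIFO order within each priority level."""

    def __init__(self):
        self.high = deque()  # packets that should invoke threads first
        self.low = deque()   # ordinary packets, served in arrival order

    def enqueue(self, packet, high_priority=False):
        # Append to the tail of the chosen queue, preserving arrival order.
        (self.high if high_priority else self.low).append(packet)

    def next_packet(self):
        # Serve the oldest high-priority packet if one is waiting;
        # otherwise fall back to plain FIFO over the low-priority queue.
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None  # nothing to schedule
```

With every packet enqueued at the default priority, the sketch behaves as a single FIFO queue, which is the sense in which a priority policy extends the FIFO one.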
Original language | English |
---|---|
Pages (from-to) | 1065-1071 |
Number of pages | 7 |
Journal | IEICE Transactions on Information and Systems |
Volume | E79-D |
Issue number | 8 |
Publication status | Published - 1996 |
Externally published | Yes |
Keywords
- Distributed shared memory
- Fine grain communication
- Multithread architecture
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering
- Artificial Intelligence