Abstract
This paper proposes a data-localization scheme for macro-dataflow computation, in which coarse-grain tasks such as loops, subroutines, and basic blocks in a Fortran program are dynamically scheduled onto processors and executed in parallel. The proposed scheme reduces the overhead of transferring data through centralized shared memory by using local memory effectively to pass shared data among coarse-grain tasks, especially loops. The compilation scheme first decomposes multiple loops with data dependences using the loop-aligned-decomposition method so that their data can be localized. It then fuses decomposed loops that would require a large amount of data transfer among them into a single macrotask, which is assigned to a processor at run time. The scheme has been implemented on OSCAR, an actual multiprocessor system that has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that the proposed data-localization scheme can reduce execution time by 21%.
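The decompose-then-fuse idea described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or its Fortran compiler output: the function names `run_unfused` and `run_localized`, the parameter `num_blocks`, and the two example loops are all hypothetical, and Python lists merely stand in for shared and local memory. The sketch assumes two loops with a simple element-wise dependence, splits both over the same index blocks, and runs each pair of aligned pieces as one fused unit so that the intermediate block never has to leave the unit.

```python
def run_unfused(a):
    """Two separate loops: the whole intermediate array b is handed
    from the first loop to the second (standing in for shared memory)."""
    b = [2 * x for x in a]   # loop 1 produces b
    c = [y + 1 for y in b]   # loop 2 consumes b
    return c


def run_localized(a, num_blocks):
    """Split both loops over the same index blocks and fuse each aligned
    pair into one unit of work. num_blocks is assumed to be positive."""
    n = len(a)
    block = max(1, -(-n // num_blocks))  # ceiling division, at least 1
    c = [0] * n
    for start in range(0, n, block):
        end = min(start + block, n)
        # Piece of loop 1: the intermediate block stays in a local buffer.
        local_b = [2 * a[i] for i in range(start, end)]
        # Aligned piece of loop 2: reads only the local buffer.
        for i in range(start, end):
            c[i] = local_b[i - start] + 1
    return c


if __name__ == "__main__":
    data = list(range(10))
    assert run_localized(data, 4) == run_unfused(data)
```

In this sketch each iteration of the outer `for start` loop plays the role of one fused macrotask. On a real machine those units would be scheduled onto different processors, which the sequential loop here does not attempt to model.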
Original language | English |
---|---|
Pages | 135-140 |
Number of pages | 6 |
Publication status | Published - 1995 |
Event | Proceedings of the 1995 IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Victoria, BC, Canada. Duration: 17 May 1995 → 19 May 1995 |
ASJC Scopus subject areas
- Signal Processing
- Computer Networks and Communications