Data-localization for Fortran macro-dataflow computation using partial static task assignment

Akimasa Yoshida*, Kenichi Koshizuka, Hironori Kasahara

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

12 Citations (Scopus)


This paper proposes a data-localization compilation scheme for macro-dataflow computation, in which coarse-grain tasks such as loops, subroutines and basic blocks in a Fortran program are automatically processed in parallel on a multiprocessor system. The data-localization scheme uses local memory effectively to reduce the data transfer overhead of passing shared data among coarse-grain tasks composed of Doall loops and sequential loops. In this scheme, a compiler partitions coarse-grain tasks, or loops, that have data dependences among them into multiple groups by a loop-aligned decomposition so that data transfer among groups is minimized, generates a dynamic scheduling routine with partial static task assignment that assigns the decomposed tasks in a group to the same processor at run time, and generates parallel machine code that passes shared data inside the group through local memory. A compiler has been implemented for OSCAR, an actual multiprocessor system that has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that macro-dataflow computation with the proposed data-localization scheme can reduce execution time by 10% to 20% on average compared with ordinary macro-dataflow computation using centralized shared memory.
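The abstract does not give the compiler's algorithm, so the following is only a toy Python sketch of the idea it describes, under stated assumptions. It takes two hypothetical dependent loops (A(i) = 2*B(i), then C(i) = A(i) + 1), splits both over the same iteration ranges (a simple stand-in for loop-aligned decomposition), and runs both pieces of each group on one processor so that A stays in that processor's local memory. The function names, the round-robin choice of processor, and the dictionary used as "local memory" are illustrative assumptions, not part of the paper or of OSCAR.

```python
from collections import deque


def aligned_decompose(n_iters, n_groups):
    """Split iterations 0..n_iters-1 into n_groups contiguous ranges.

    The same ranges are applied to every loop in the dependent set,
    so piece g of loop 2 only reads data written by piece g of loop 1.
    """
    base, extra = divmod(n_iters, n_groups)
    ranges, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def run_localized(n_iters=8, n_groups=4, n_procs=2):
    """Simulate scheduling with groups pinned to one processor each."""
    b = list(range(n_iters))      # input array B (shared memory)
    c = [0] * n_iters             # output array C (shared memory)
    ranges = aligned_decompose(n_iters, n_groups)

    ready = deque(range(n_groups))  # groups waiting to be scheduled
    owner = {}                      # group -> processor that ran it
    next_proc = 0                   # assumption: round-robin stands in
                                    # for "whichever processor is idle"
    while ready:
        g = ready.popleft()
        p = next_proc % n_procs
        next_proc += 1
        owner[g] = p                # every task of group g stays on p

        local_a = {}                # stands in for p's local memory
        for i in ranges[g]:         # piece g of loop 1: A(i) = 2*B(i)
            local_a[i] = 2 * b[i]
        for i in ranges[g]:         # piece g of loop 2: C(i) = A(i) + 1
            c[i] = local_a[i] + 1   # A is read locally, never shared
    return c, owner


if __name__ == "__main__":
    result, owner = run_localized()
    print(result)
    print(owner)
```

In this sketch the group is chosen for a processor at run time, while the tasks inside a group are fixed to that processor in advance, which is the sense in which the assignment is only partially static.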

Original language: English
Number of pages: 8
Publication status: Published - 1996
Event: Proceedings of the 1996 International Conference on Supercomputing - Philadelphia, PA, USA
Duration: 1996 May 25 → 1996 May 28


Other: Proceedings of the 1996 International Conference on Supercomputing
City: Philadelphia, PA, USA

ASJC Scopus subject areas

  • Computer Science(all)

