TY - GEN
T1 - OSCAR Parallelizing and Power Reducing Compiler and API for Heterogeneous Multicores
T2 - 2021 IEEE/ACM Workshop on Programming Environments for Heterogeneous Computing, PEHC 2021
AU - Kasahara, Hironori
AU - Kimura, Keiji
AU - Kitamura, Toshiaki
AU - Mikami, Hiroki
AU - Morita, Kazutaka
AU - Fujita, Kazuki
AU - Yamamoto, Kazuki
AU - Kawasumi, Tohma
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Heterogeneous computing systems, connecting general-purpose processor cores with accelerators and/or different kinds of general-purpose processor cores, have been widely used for HPC, cloud servers, self-driving vehicles, AI robots, and so on. They are used to obtain high performance and/or low power consumption. This paper introduces the OSCAR (Optimally Scheduled Advanced Multiprocessor) parallelizing compiler and OSCAR API. They allow users to automatically parallelize and power-reduce a C or Fortran program for various heterogeneous computing systems. OSCAR compiler has been developed since 1983, aiming at co-design of multiprocessor architecture and compiler. Currently, it can generate parallel machine codes for any shared memory homogeneous and heterogeneous multicores with or without hardware cache-coherent mechanism if a sequential C or Fortran compiler exists for the target multicore. OSCAR compiler translates a sequential user program written in C or Fortran into a parallelized C or Fortran program with OSCAR API compatible with frequency-voltage control, clock-gating, and power gating directives for each core, memory module, and interconnect defined in OSCAR API. The generated parallel program consists of threads specified by OpenMP "section"directives. The threads can be compiled into machine codes by an OpenMP compiler or a sequential C or Fortran compiler for a target general-purpose processor cores or accelerator cores. The compilation flow and execution and power-reduce performance for scientific and embedded applications and Deep Learning are shown on several heterogeneous systems, such as a heterogeneous multicore processor having eight general-purpose cores and 4 DRPs, or Dynamically Reconfigurable Processors, a heterogeneous multicore on FPGA using NIOS cores, and a new vector accelerator based on the past Japanese supercomputers and a personal vector supercomputer NEC Aurora Tsubasa.
AB - Heterogeneous computing systems, connecting general-purpose processor cores with accelerators and/or different kinds of general-purpose processor cores, have been widely used for HPC, cloud servers, self-driving vehicles, AI robots, and so on. They are used to obtain high performance and/or low power consumption. This paper introduces the OSCAR (Optimally Scheduled Advanced Multiprocessor) parallelizing compiler and OSCAR API. They allow users to automatically parallelize and power-reduce a C or Fortran program for various heterogeneous computing systems. OSCAR compiler has been developed since 1983, aiming at co-design of multiprocessor architecture and compiler. Currently, it can generate parallel machine codes for any shared memory homogeneous and heterogeneous multicores with or without hardware cache-coherent mechanism if a sequential C or Fortran compiler exists for the target multicore. OSCAR compiler translates a sequential user program written in C or Fortran into a parallelized C or Fortran program with OSCAR API compatible with frequency-voltage control, clock-gating, and power gating directives for each core, memory module, and interconnect defined in OSCAR API. The generated parallel program consists of threads specified by OpenMP "section"directives. The threads can be compiled into machine codes by an OpenMP compiler or a sequential C or Fortran compiler for a target general-purpose processor cores or accelerator cores. The compilation flow and execution and power-reduce performance for scientific and embedded applications and Deep Learning are shown on several heterogeneous systems, such as a heterogeneous multicore processor having eight general-purpose cores and 4 DRPs, or Dynamically Reconfigurable Processors, a heterogeneous multicore on FPGA using NIOS cores, and a new vector accelerator based on the past Japanese supercomputers and a personal vector supercomputer NEC Aurora Tsubasa.
KW - accelerator
KW - heterogeneous multicore
KW - paralelizing compiler
UR - http://www.scopus.com/inward/record.url?scp=85124249102&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124249102&partnerID=8YFLogxK
U2 - 10.1109/PEHC54839.2021.00007
DO - 10.1109/PEHC54839.2021.00007
M3 - Conference contribution
AN - SCOPUS:85124249102
T3 - Proceedings of PEHC 2021: Workshop on Programming Environments for Heterogeneous Computing, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 10
EP - 19
BT - Proceedings of PEHC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 November 2021
ER -