TY - GEN
T1 - Green multicore-SoC software-execution framework with timely-power-gating scheme
AU - Onouchi, Masafumi
AU - Toyama, Keisuke
AU - Nojiri, Toru
AU - Sato, Makoto
AU - Mase, Masayoshi
AU - Shirako, Jun
AU - Sato, Mikiko
AU - Takada, Masashi
AU - Ito, Masayuki
AU - Mizuno, Hiroyuki
AU - Namiki, Mitaro
AU - Kimura, Keiji
AU - Kasahara, Hironori
PY - 2009
Y1 - 2009
N2 - We are developing a software-execution framework based on an octo-core chip multiprocessor named RP2 and an automatic multigrain-parallelizing compiler named OSCAR. The main purpose of this framework is to maintain good speed scalability and power efficiency over the number of processor cores under severe hardware restrictions for embedded use. Key to the speed scalability is reduction of a communication overhead with parallelized tasks. A data-categorization scheme enables small-overhead cache-coherency maintenance by using directives and instructions from the compiler. In this scheme, the number of cache-flushing time is minimized and parallelized tasks are quickly synchronized by using flags in local memory. As regards power efficiency, to reduce power consumption, power supply to processor cores waiting for other cores is timely and frequently cut off, even in the middle of an application, by using a timelypower- gating scheme. In this scheme, to achieve quick mode transition between "NORMAL" mode and "RESUME POWEROFF" mode, register values of the processor core are stored in core-local memory, which is active even in "RESUME POWEROFF" mode and can be accessed in one or two clock cycles. Measured speed and power of an application show good speed scalability in execution time and high power efficiency, simultaneously. In the case of a secure AAC-LC encoding program, execution speed when eight processor cores are used can be increased by 4.85 times compared to that of sequential execution. Moreover, power consumption under the same condition can be reduced by 51.0% by parallelizing and timely-power gating. The time for mode transition is less than 20 μsec, which is only 2.5% of the "RESUME POWER-OFF" period.
AB - We are developing a software-execution framework based on an octo-core chip multiprocessor named RP2 and an automatic multigrain-parallelizing compiler named OSCAR. The main purpose of this framework is to maintain good speed scalability and power efficiency over the number of processor cores under severe hardware restrictions for embedded use. Key to the speed scalability is reduction of a communication overhead with parallelized tasks. A data-categorization scheme enables small-overhead cache-coherency maintenance by using directives and instructions from the compiler. In this scheme, the number of cache-flushing time is minimized and parallelized tasks are quickly synchronized by using flags in local memory. As regards power efficiency, to reduce power consumption, power supply to processor cores waiting for other cores is timely and frequently cut off, even in the middle of an application, by using a timelypower- gating scheme. In this scheme, to achieve quick mode transition between "NORMAL" mode and "RESUME POWEROFF" mode, register values of the processor core are stored in core-local memory, which is active even in "RESUME POWEROFF" mode and can be accessed in one or two clock cycles. Measured speed and power of an application show good speed scalability in execution time and high power efficiency, simultaneously. In the case of a secure AAC-LC encoding program, execution speed when eight processor cores are used can be increased by 4.85 times compared to that of sequential execution. Moreover, power consumption under the same condition can be reduced by 51.0% by parallelizing and timely-power gating. The time for mode transition is less than 20 μsec, which is only 2.5% of the "RESUME POWER-OFF" period.
UR - http://www.scopus.com/inward/record.url?scp=77951486753&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77951486753&partnerID=8YFLogxK
U2 - 10.1109/ICPP.2009.6
DO - 10.1109/ICPP.2009.6
M3 - Conference contribution
AN - SCOPUS:77951486753
SN - 9780769538020
T3 - Proceedings of the International Conference on Parallel Processing
SP - 510
EP - 517
BT - ICPP-2009 - The 38th International Conference on Parallel Processing
T2 - 38th International Conference on Parallel Processing, ICPP-2009
Y2 - 22 September 2009 through 25 September 2009
ER -