TY - JOUR
T1 - Sub-operation parallelism optimization in SIMD processor core synthesis
AU - Kawazu, Hideki
AU - Uchida, Jumpei
AU - Miyaoka, Yuichiro
AU - Togawa, Nozomu
AU - Yanagisawa, Masao
AU - Ohtsuki, Tatsuo
PY - 2005
Y1 - 2005
N2 - A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k × n. It can execute n-parallel k-bit operations. However, all the b-bit functional units in a processor core do not necessarily execute n-parallel operations. Depending on an application program, some of them just execute n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces sub-operation parallelism of a SIMD functional unit while the timing constraint of execution time satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect that we can obtain processor core configurations of smaller area in the same timing constraint rather than a conventional system. The promising experimental results are also shown.
AB - A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k × n. It can execute n-parallel k-bit operations. However, all the b-bit functional units in a processor core do not necessarily execute n-parallel operations. Depending on an application program, some of them just execute n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces sub-operation parallelism of a SIMD functional unit while the timing constraint of execution time satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect that we can obtain processor core configurations of smaller area in the same timing constraint rather than a conventional system. The promising experimental results are also shown.
KW - Hardware/software cosynthesis
KW - Hardware/software partitioning
KW - Packed SIMD type operation
KW - Processor synthesis
KW - Sub-operation parallelism
UR - http://www.scopus.com/inward/record.url?scp=24144453118&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=24144453118&partnerID=8YFLogxK
U2 - 10.1093/ietfec/e88-a.4.876
DO - 10.1093/ietfec/e88-a.4.876
M3 - Article
AN - SCOPUS:24144453118
SN - 0916-8508
VL - E88-A
SP - 876
EP - 883
JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
IS - 4
ER -