计算机学报
CHINESE JOURNAL OF COMPUTERS
2009
Issue 11
2168-2177
10 pages in total
张为华%朱嘉华%张宏江%臧斌宇
effective bitwidth control%overflow handling%saturation arithmetic%compiler optimization%parallelism
bitwidth analysis%overflow analysis%saturation operation%compiler optimization%parallelism
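With SIMD functional units now widely used as multimedia acceleration components, how to exploit this architecture effectively when optimizing applications has become a hot topic in compiler optimization research. Typical current SIMD architectures provide different instruction versions of the same operation for different data bitwidths; as the operand bitwidth increases, the number of operations that the corresponding SIMD instruction can complete simultaneously decreases. Identifying the effective bitwidth of operands is therefore critical to raising the parallelism of operations within SIMD instructions during optimization. To address the parallelism problem faced by SIMD optimization, this paper proposes an optimization algorithm that analyzes the effective bits of operands and, on that basis, performs overflow control, thereby reducing the operands' dependence on wide data types. Experimental data show that the algorithm effectively improves the parallelism achieved when optimizing multimedia programs and yields good speedups for them.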
Although SIMD units have been widely adopted in different architecture designs, automatic optimizations for such architectures are not yet well developed. Because most optimizations for SIMD architectures are transplanted from traditional vectorization techniques, many special features of SIMD architectures, such as packed operations, have not been thoroughly considered. When operands are tightly packed within a register, there is no spare space to indicate overflow. To maintain the accuracy of automatically SIMDized programs, operands must be unpacked to reserve enough space for interim overflow. However, such a strategy incurs significant overhead. Moreover, the additional instructions for handling overflows can sometimes prevent other optimizations. In this paper, a new technique, BCSA (Bitwidth controlled SIMD arithmetic), is proposed to reduce the negative effects of interim overflow handling and eliminate the interference of interim overflows. The algorithm is applied to the Berkeley multimedia benchmarks. The experimental results show that the algorithm can significantly improve the performance of multimedia applications.
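The interim-overflow problem described in the abstract can be made concrete with a minimal sketch in Python that models packed lanes as plain integer lists. This is not the paper's BCSA algorithm. The 128-bit register width and every function name below are assumptions chosen only for illustration. The sketch shows that narrower elements give more lanes per instruction, and that an 8-bit interim sum loses information unless it is widened first.

```python
# Illustrative model only: packed SIMD lanes are simulated with Python lists.
# REGISTER_BITS and all function names are assumptions, not taken from the paper.

REGISTER_BITS = 128  # assumed register width


def lanes(element_bits, register_bits=REGISTER_BITS):
    """Number of elements a single packed instruction processes at once."""
    return register_bits // element_bits


def wrapping_add_u8(a, b):
    """Lane-wise unsigned 8-bit add; the carry out of each lane is lost."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def saturating_add_u8(a, b):
    """Lane-wise unsigned 8-bit add that clamps each result at 255."""
    return [min(x + y, 0xFF) for x, y in zip(a, b)]


def widened_add_u16(a, b):
    """Unpack to 16-bit lanes first so the interim sum keeps its carry bit."""
    return [(x + y) & 0xFFFF for x, y in zip(a, b)]


def average(a, b, add):
    """Compute (a + b) >> 1 per lane with the given lane-wise add."""
    return [s >> 1 for s in add(a, b)]


if __name__ == "__main__":
    a, b = [200], [100]  # exact average is 150
    print(lanes(8), lanes(16))             # lanes per register at 8 and 16 bits
    print(average(a, b, wrapping_add_u8))    # interim sum wrapped
    print(average(a, b, saturating_add_u8))  # interim sum clamped
    print(average(a, b, widened_add_u16))    # interim sum preserved
```

In this model, widening to 16-bit lanes preserves the correct result but halves the number of lanes per instruction, which is the trade-off between accuracy and parallelism that the abstract refers to.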