电子与信息学报
JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY
2014
Issue 11
pp. 2768-2774
7 pages
刘斌%赵银亮%韩博%李玉祥%吉烁%冯博琴%武万杰
Parallel processing%Thread-Level Speculation (TLS)%Loop selection%Performance prediction
Thread-Level Speculation (TLS) is a thread-level automatic parallelization technique for accelerating sequential programs on multi-core processors. Loops have a regular structure and account for a large share of execution time, which makes them ideal candidates for exploiting parallelism. However, it is difficult to decide which loops should be parallelized to improve the overall speedup of a program. To address this problem, this paper proposes a loop selection approach based on performance prediction. Profiling information is gathered by pre-executing the program on a training input set. This profiling information is combined with various speculative execution factors to build a performance prediction model for loops. The prediction quantitatively estimates the speedup of each loop under speculative parallelization and determines whether the loop should be parallelized at runtime. Experimental results show that the proposed approach effectively predicts the parallelism available in loops under speculative execution and, guided by the predictions, accurately selects the loops that benefit from speculative parallelization. On the Olden benchmarks, speedup improves by 12.34% on average.
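The idea of selecting loops by predicted speedup can be illustrated with a minimal sketch. The abstract does not state the paper's actual prediction model, so everything below is an assumption made for illustration only: the `LoopProfile` fields, the cost formula in `predicted_speedup`, and the selection threshold are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class LoopProfile:
    """Hypothetical per-loop profile gathered during a training run."""
    name: str
    iterations: int        # average iterations per loop invocation
    body_cycles: float     # average cycles per iteration when run sequentially
    squash_prob: float     # assumed fraction of speculative iterations that are squashed
    spawn_overhead: float  # assumed cycles to spawn and commit one speculative thread


def predicted_speedup(p: LoopProfile, cores: int) -> float:
    """Toy estimate: sequential time divided by estimated speculative time.

    Assumption (not the paper's model): iterations run in rounds of `cores`
    threads, every iteration pays the spawn overhead, and a squashed
    iteration re-executes its body once.
    """
    if p.iterations <= 0 or cores <= 0:
        return 1.0  # nothing to parallelize
    sequential = p.iterations * p.body_cycles
    cost_per_iteration = p.body_cycles * (1.0 + p.squash_prob) + p.spawn_overhead
    rounds = math.ceil(p.iterations / cores)
    speculative = rounds * cost_per_iteration
    return sequential / speculative


def select_loops(profiles, cores: int, threshold: float = 1.0):
    """Keep only loops whose predicted speedup exceeds the threshold."""
    return [p.name for p in profiles if predicted_speedup(p, cores) > threshold]


# Illustrative, made-up profiles: one long-running loop with few squashes,
# one short loop dominated by overhead and misspeculation.
hot = LoopProfile("tree_walk", iterations=1000, body_cycles=500.0,
                  squash_prob=0.05, spawn_overhead=50.0)
cold = LoopProfile("init_small", iterations=8, body_cycles=20.0,
                   squash_prob=0.6, spawn_overhead=100.0)

selected = select_loops([hot, cold], cores=4)
print(selected)
```

In this sketch the short loop is rejected because thread-spawn overhead and re-execution outweigh its small body, which is the kind of trade-off a prediction-driven selector is meant to capture.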