计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2014, Issue 21, pp. 79-84 (6 pages)
network coding%Graphics Processing Unit (GPU)%parallelization%Compute Unified Device Architecture (CUDA)%optimization
Network coding allows intermediate nodes to process incoming packets rather than merely store and forward them, and has emerged as a promising technique for improving network throughput, balancing network load, and making better use of available bandwidth. Its practical adoption remains a challenge, however, because its high computational complexity severely degrades system performance. A system accelerated by a many-core GPU can speed up network coding by exploiting the GPU's computing power and its memory hierarchy. This paper proposes a fragment-based parallel encoding technique and a texture-cache-based parallel decoding method on the CUDA architecture. Random linear coding is implemented with the proposed methods and further optimized for the architecture. Experimental results show that parallelizing network coding on a many-core GPU is effective and yields a significant performance improvement.
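For readers unfamiliar with random linear coding, the following is a minimal CPU-side sketch of why the encoding step splits naturally into independent fragments. It is not the paper's implementation: the abstract does not state the finite field, the fragment size, or the kernel layout, so the sketch assumes GF(2^8) with reduction polynomial 0x11B, and a plain sequential loop stands in for the GPU threads. All function names are hypothetical.

```python
import random

# Assumption: arithmetic in GF(2^8) with the reduction polynomial
# x^8 + x^4 + x^3 + x + 1 (0x11B). The abstract does not name the field.
GF_POLY = 0x11B


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return result


def random_coefficients(n, rng):
    """Draw one random coding coefficient per source block."""
    return [rng.randrange(256) for _ in range(n)]


def encode_fragment(coeffs, blocks, start, stop):
    """Compute bytes [start, stop) of one coded block.

    Each output byte depends only on the bytes at the same offset in the
    source blocks, so fragments can be computed independently of each other.
    """
    out = bytearray(stop - start)
    for c, block in zip(coeffs, blocks):
        for j in range(start, stop):
            out[j - start] ^= gf_mul(c, block[j])
    return bytes(out)


def encode(coeffs, blocks, fragment_size):
    """Build a coded block fragment by fragment.

    This loop is a sequential stand-in: on a GPU, each fragment would
    be handled by a separate thread (hypothetical mapping).
    """
    block_len = len(blocks[0])
    pieces = [
        encode_fragment(coeffs, blocks, s, min(s + fragment_size, block_len))
        for s in range(0, block_len, fragment_size)
    ]
    return b"".join(pieces)
```

Because the fragments share no state, the result does not depend on the fragment size chosen, which is the property that makes a per-fragment thread assignment possible. Decoding (solving the linear system formed by the received coefficient vectors) is not shown here.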