计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2013, No. 23, pp. 31-34, 51 (5 pages in total)
Back-Propagation (BP) algorithm; parallelization; Compute Unified Device Architecture (CUDA); handwritten digit training
CUDA is a widely used model for general-purpose computing on GPUs (GPGPU), and the Back-Propagation (BP) algorithm is one of the most widely used neural network models. This paper proposes a method for parallelizing the BP algorithm with CUDA. When a BP neural network is trained with this method, the data are transferred to the GPU before training begins; computing the inputs, outputs, and errors of the hidden and output layers, as well as updating the weights and biases, is then performed entirely on the GPU. Applied to training on handwritten digit images, the method achieves a speed-up of 6.12 to 8.17 compared with training on a four-core CPU. When the models trained on the CPU and on the GPU are used to recognize the same test set, the recognition rate of the GPU-trained model is 0.05% to 0.22% higher than that of the CPU-trained model.
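The abstract only outlines the GPU mapping at a high level. The following is a minimal CUDA sketch, not taken from the paper, of how one fully connected layer's forward pass and weight/bias update can each be parallelized with one thread per neuron or per weight, with the data kept on the GPU throughout. The layer sizes (784 inputs, 100 hidden units), the sigmoid activation, the learning rate, and all identifiers are illustrative assumptions.

```cuda
// Minimal sketch (assumptions only, not the authors' implementation):
// one fully connected layer's forward pass and weight update on the GPU.
#include <cuda_runtime.h>
#include <cstdio>
#include <cmath>

__device__ float sigmoidf(float x) { return 1.0f / (1.0f + expf(-x)); }

// Forward pass: each thread computes the net input and activation of one neuron.
__global__ void layer_forward(const float* in, const float* w, const float* b,
                              float* out, int n_in, int n_out)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= n_out) return;
    float net = b[j];
    for (int i = 0; i < n_in; ++i)
        net += w[j * n_in + i] * in[i];          // row-major weight matrix
    out[j] = sigmoidf(net);
}

// Weight update: each thread updates one weight; delta[j] is the error term
// of output neuron j computed by a separate backward kernel (omitted here).
__global__ void update_weights(float* w, float* b, const float* in,
                               const float* delta, int n_in, int n_out, float lr)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n_in * n_out) return;
    int j = idx / n_in;                          // output neuron index
    int i = idx % n_in;                          // input neuron index
    w[idx] += lr * delta[j] * in[i];
    if (i == 0) b[j] += lr * delta[j];           // update each bias exactly once
}

int main()
{
    const int n_in = 784, n_out = 100;           // e.g. 28x28 digit image -> hidden layer
    float *in, *w, *b, *out, *delta;
    cudaMallocManaged(&in,    n_in * sizeof(float));
    cudaMallocManaged(&w,     n_in * n_out * sizeof(float));
    cudaMallocManaged(&b,     n_out * sizeof(float));
    cudaMallocManaged(&out,   n_out * sizeof(float));
    cudaMallocManaged(&delta, n_out * sizeof(float));

    // Buffers are left uninitialized for brevity; a real trainer would upload
    // the training images and randomly initialized weights once, before the
    // training loop, as described in the abstract.
    layer_forward<<<(n_out + 255) / 256, 256>>>(in, w, b, out, n_in, n_out);
    update_weights<<<(n_in * n_out + 255) / 256, 256>>>(w, b, in, delta, n_in, n_out, 0.1f);
    cudaDeviceSynchronize();
    printf("out[0] = %f\n", out[0]);
    return 0;
}
```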