Technology for implementing a neural network algorithm in the CUDA environment, using handwritten digit recognition as an example
P.Y. Izotov, S.V. Sukhanov, D.L. Golovashkin

Image Processing Systems Institute of the RAS,
Samara State Aerospace University

Full text of article: Russian language.

Using a convolutional neural network as an example, the paper shows the features of implementing a pattern recognition algorithm on a graphics processing unit (GPU) with NVIDIA CUDA. Compared with an optimized algorithm that uses only the central processing unit (CPU) for calculations, training the network on the video adapter is 5.96 times faster, and recognition of the test sample set is 8.76 times faster. The results show that implementing such neural network algorithms on graphics processors is promising.

Key words:
convolutional neural network, pattern recognition, neural network training, error backpropagation, parallel computing, GPGPU, NVIDIA, CUDA, matrix multiplication, CUBLAS.
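
The key words pair convolutional networks with matrix multiplication and CUBLAS, but the abstract does not say how the two are connected. One common way, assumed here and not confirmed by the abstract, is to unfold the image patches into the rows of a matrix so that applying all kernels of a layer becomes a single matrix product, which is the kind of operation a GPU GEMM routine accelerates. The sketch below is plain Python for illustration only; `im2col`, `matmul`, `conv_as_matmul` and `conv_direct` are hypothetical helper names, not the authors' code.

```python
def im2col(image, k):
    """Unfold every k x k patch of a 2-D image (list of rows) into one row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(k) for dj in range(k)])
    return rows


def matmul(a, b):
    """Plain matrix product; this is the step a GPU GEMM call would replace."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv_as_matmul(image, kernels):
    """Apply several k x k kernels (valid cross-correlation) in one product."""
    k = len(kernels[0])
    patches = im2col(image, k)                      # (num_patches, k*k)
    weights = [[kern[di][dj] for kern in kernels]   # (k*k, num_kernels)
               for di in range(k) for dj in range(k)]
    return matmul(patches, weights)                 # (num_patches, num_kernels)


def conv_direct(image, kernel):
    """Reference: sliding-window cross-correlation for a single kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for i in range(h - k + 1) for j in range(w - k + 1)]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernels = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
    out = conv_as_matmul(image, kernels)
    # Column 0 of the product matches the direct sliding-window result.
    print([row[0] for row in out] == conv_direct(image, kernels[0]))
```

Each column of the result holds the feature map of one kernel, flattened row by row. The point of the reformulation is that the whole layer maps onto one dense product instead of many small loops.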



© 2009