Test-object recognition in thermal images
Mingalev A.V., Belov A.V., Gabdullin I.M., Agafonova R.R., Shusharin S.N.


JSC “Scientific and Production Association “State Institute of Applied Optics”, Kazan, Russia


The paper presents a comparative analysis of several methods for recognizing the position of a test object in a thermal image when setting up and testing the characteristics of thermal imaging channels in an automated mode. We consider image recognition methods based on correlation image comparison, the Viola-Jones method, the LeNet classification convolutional neural network, the GoogLeNet (Inception v1) classification convolutional neural network, and a deep-learning-based convolutional neural network of the Single-Shot Multibox Detector (SSD) VGG16 type. The best performance is achieved with the deep-learning-based convolutional neural network of the SSD VGG16 type. The main advantages of this method are robustness to variations in test-object size, high accuracy and recall, and no need for additional methods of region-of-interest (RoI) localization.
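As an illustration of the simplest method in the comparison, correlation image comparison can be sketched as an exhaustive search for the window with the highest normalized cross-correlation against a reference pattern. The sketch below is a generic textbook formulation, not the authors' implementation; the function name `ncc_locate`, the toy cross-shaped template, and the synthetic image are assumptions made only for this example.

```python
import math


def ncc_locate(image, template):
    """Return ((row, col), score) of the best normalized cross-correlation match.

    `image` and `template` are 2-D lists of numbers (hypothetical toy inputs).
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])

    # Zero-mean template and its norm are computed once.
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best_pos, best_score = None, -2.0
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # Zero-mean window at the current offset.
            w_vals = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Toy example: a cross-shaped pattern pasted onto a uniform warm background.
template = [[0, 9, 0], [9, 9, 9], [0, 9, 0]]
image = [[20] * 7 for _ in range(6)]
for i in range(3):
    for j in range(3):
        image[2 + i][3 + j] += template[i][j]

pos, score = ncc_locate(image, template)
```

Because the score is normalized, a uniform brightness offset does not affect the match. The template has a fixed size, however, so a search of this kind is sensitive to changes in the apparent size of the test object, which is the property the abstract contrasts with the SSD-based detector.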

image classification, object detection in images, image recognition, deep-learning-based convolutional neural network, thermal image, thermal imaging device

Mingalev AV, Belov AV, Gabdullin IM, Agafonova RR, Shusharin SN. Test-object recognition in thermal images. Computer Optics 2019; 43(3): 402-411. DOI: 10.18287/2412-6179-2019-43-3-402-411.


