Object Recognition with Hybrid Deep Learning Methods and Testing on Embedded Systems

Yavuz Selim Taspinar; Murat Selek

doi:10.18201/ijisae.2020261587

Authors

Yavuz Selim Taspinar Selcuk University http://orcid.org/0000-0002-7278-4241
Murat Selek http://orcid.org/0000-0001-8642-1823

Keywords:

Deep learning, Image classification, Object detection, Hybrid methods

Abstract

Object recognition applications can be built with deep neural networks, but they can impose a heavy processing load. This study therefore examines hybrid object recognition algorithms for recognizing an object in an image and compares the running times of these algorithms on several embedded systems. Haar Cascade, Local Binary Pattern (LBP), and Histogram of Oriented Gradients (HOG) algorithms are used for object detection, while Convolutional Neural Network (CNN) and Deep Neural Network (DNN) algorithms are used for classification. Combining them yields six hybrid structures: Haar Cascade+CNN, LBP+CNN, HOG+CNN, Haar Cascade+DNN, LBP+DNN, and HOG+DNN. These six hybrid algorithms were analyzed in terms of success rate and time and then compared with one another. The Microsoft COCO dataset was used to train and test all of them. The object recognition success rate of CNN was 76.33%. Haar Cascade+CNN, one of the hybrid methods we recommend, achieved a success rate of 78.6%, higher than CNN and the other hybrid methods. The LBP+CNN method recognized objects in 0.487 seconds, faster than any other hybrid method. Nvidia Jetson TX2, Asus Tinker Board, and Raspberry Pi 3 B+ were used as the embedded systems. In these tests, the Haar Cascade+CNN method on the Nvidia Jetson TX2 performed detection in 0.1303 seconds, the LBP+DNN and Haar Cascade+DNN methods on the Asus Tinker Board in 0.2459 seconds, and the HOG+DNN method on the Raspberry Pi 3 B+ in 0.7153 seconds.
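The two-stage structure described in the abstract (a detector proposes regions, a classifier labels them, and the whole pass is timed) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_detect` and `toy_classify` are hypothetical stand-ins for a real Haar Cascade/LBP/HOG detector and a trained CNN/DNN, and the image is a plain list of rows, so the timing it reports says nothing about the figures above.

```python
import time
from itertools import product

# Stage names taken from the abstract.
DETECTORS = ["Haar Cascade", "LBP", "HOG"]
CLASSIFIERS = ["CNN", "DNN"]


def hybrid_names():
    """Enumerate the six detector+classifier pairings in the abstract's order."""
    return [f"{d}+{c}" for c, d in product(CLASSIFIERS, DETECTORS)]


def crop(image, box):
    """Cut the (x, y, w, h) region out of an image stored as a list of rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def recognize(image, detect, classify):
    """Detect regions, classify each cropped region, and time the whole pass."""
    start = time.perf_counter()
    results = [(box, classify(crop(image, box))) for box in detect(image)]
    elapsed = time.perf_counter() - start
    return results, elapsed


def toy_detect(image):
    """Hypothetical detector: bounding box of all non-zero pixels, if any."""
    points = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    if not points:
        return []
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return [(min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)]


def toy_classify(patch):
    """Hypothetical classifier: label a patch by its share of non-zero pixels."""
    total = sum(len(row) for row in patch)
    filled = sum(1 for row in patch for value in row if value)
    return "solid" if filled / total > 0.5 else "sparse"


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(hybrid_names())
    results, elapsed = recognize(image, toy_detect, toy_classify)
    print(results, f"{elapsed:.6f} s")
```

Because `recognize` accepts any detector and classifier as callables, the same timing harness could in principle wrap each of the six pairings, which is one way a like-for-like comparison of success rate and running time across devices might be organized.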
