Geometric Deep Learning for Computer Vision and Image Analysis: A Survey of Recent Advances and Future Directions

Authors

  • Suresh A. J.

Keywords

Geometric Deep Learning, Computer Vision, Image Analysis, Graph Neural Networks, 3D Shape Analysis, Spectral Methods, Message-Passing Algorithms.

Abstract

Geometric Deep Learning (GDL) has emerged as a powerful framework for addressing complex computer vision and image analysis tasks by extending traditional deep learning techniques to non-Euclidean data structures such as graphs, manifolds, and meshes. This survey provides a comprehensive overview of recent advances in GDL for computer vision, highlighting its applications in 3D shape analysis, medical imaging, scene understanding, and object recognition. We discuss key architectural innovations, including graph neural networks, spectral methods, and message-passing algorithms, that enable the effective representation and processing of geometric data. We also examine open challenges such as computational complexity and generalization across diverse domains. Finally, we outline potential future research directions, including the integration of GDL with multimodal learning, improved scalability, and the development of more robust and interpretable models. This survey emphasizes GDL’s growing significance in advancing state-of-the-art computer vision techniques and its potential to solve increasingly complex tasks.

Published

25.12.2023

How to Cite

Suresh A. J. (2023). Geometric Deep Learning for Computer Vision and Image Analysis: A Survey of Recent Advances and Future Directions. International Journal of Intelligent Systems and Applications in Engineering, 12(1), 847 –. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/7035

Section

Research Article