Geometric Deep Learning for Computer Vision and Image Analysis: A Survey of Recent Advances and Future Directions
Keywords:
Geometric Deep Learning, Computer Vision, Image Analysis, Graph Neural Networks, 3D Shape Analysis, Spectral Methods, Message-Passing Algorithms.
Abstract
Geometric Deep Learning (GDL) has emerged as a powerful framework for complex computer vision and image analysis tasks, extending traditional deep learning techniques to non-Euclidean data structures such as graphs, manifolds, and meshes. This survey provides a comprehensive overview of recent advances in GDL for computer vision, highlighting its applications in 3D shape analysis, medical imaging, scene understanding, and object recognition. We discuss key architectural innovations, including graph neural networks, spectral methods, and message-passing algorithms, that enable the effective representation and processing of geometric data. We also examine open challenges such as computational complexity and generalization across diverse domains. Finally, we outline potential future research directions, including the integration of GDL with multimodal learning, improved scalability, and the development of more robust and interpretable models. This survey emphasizes GDL’s growing significance in advancing state-of-the-art computer vision techniques and its potential to solve increasingly complex tasks.
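As a rough illustration of the message-passing idea mentioned in the abstract, the sketch below runs one round of neighbour aggregation on a small graph. It is not taken from the survey: the function name `message_passing_step`, the mean aggregation, and the equal-weight update are assumptions chosen for brevity, and a real graph neural network would replace the fixed averaging with learned transformations.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]       # node id -> neighbour ids
Features = Dict[int, List[float]]  # node id -> feature vector


def message_passing_step(graph: Graph, features: Features) -> Features:
    """One round of mean-aggregation message passing (hypothetical helper).

    Each node receives its neighbours' feature vectors (the messages),
    averages them, and then averages that result with its own features.
    No weights are learned here; this only shows the data flow.
    """
    updated: Features = {}
    for node, neighbours in graph.items():
        own = features[node]
        if not neighbours:
            # Isolated nodes receive no messages and keep their features.
            updated[node] = list(own)
            continue
        dim = len(own)
        aggregated = [
            sum(features[n][d] for n in neighbours) / len(neighbours)
            for d in range(dim)
        ]
        updated[node] = [(own[d] + aggregated[d]) / 2.0 for d in range(dim)]
    return updated


# A single mesh triangle viewed as a graph: three mutually connected vertices,
# with 2D vertex coordinates used as the node features.
triangle: Graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
coords: Features = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
smoothed = message_passing_step(triangle, coords)
```

Because the update depends only on each node's neighbour set, and not on any grid ordering, the same step applies unchanged to meshes, point-cloud neighbourhood graphs, or other non-Euclidean structures.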
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.