Performance Benchmarking of State-of-the-Art GAN-Based Video Super-Resolution Algorithms Using PSNR and SSIM Metrics
Keywords:
Video super-resolution, Generative Adversarial Networks, GFPGAN, ESRGAN, TecoGAN, RRDB-ESRGAN, Deep learning, Perceptual quality, Temporal consistency, Benchmark evaluation

Abstract
Video super-resolution (VSR) has emerged as a critical research domain with applications spanning surveillance, medical imaging, entertainment, and remote sensing. This study presents a comprehensive evaluation of four state-of-the-art Generative Adversarial Network (GAN) architectures for video super-resolution: GFPGAN (Generative Facial Prior GAN), ESRGAN (Enhanced Super-Resolution GAN), TecoGAN (Temporally Coherent GAN), and RRDB-ESRGAN (Residual-in-Residual Dense Block ESRGAN). We conduct extensive experiments on the Large-scale Diverse Video (LDV) benchmark dataset, employing an evaluation framework that encompasses both distortion-based metrics (Peak Signal-to-Noise Ratio and Structural Similarity Index) and perceptual quality metrics (Learned Perceptual Image Patch Similarity and Natural Image Quality Evaluator). Additionally, we introduce temporal consistency analysis using optical flow warping error and inter-frame similarity metrics to assess motion coherence in reconstructed video sequences. Our experimental findings reveal that GFPGAN achieves the highest PSNR (34.052 dB) and SSIM (0.952), while TecoGAN demonstrates superior temporal consistency with the lowest temporal warping error (0.0234). Furthermore, we present ablation studies examining the impact of architectural components, loss function configurations, and training strategies on reconstruction quality. A computational complexity analysis reveals significant variations in inference time and memory requirements across algorithms, providing practical guidance for deployment scenarios. This research offers insights for researchers and practitioners seeking optimal GAN-based solutions for video enhancement applications.
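The distortion and temporal metrics named in the abstract can be illustrated in a few lines. Below is a minimal pure-Python sketch, assuming frames are flattened grayscale pixel lists in the 0-255 range. The single-window `ssim_global` and the zero-motion `temporal_warping_error` baseline are simplifications for illustration only: a real evaluation slides an 11x11 Gaussian window for SSIM and warps the previous frame with estimated optical flow before differencing.

```python
import math

def psnr(ref, out, max_val=255.0):
    """Peak Signal-to-Noise Ratio (dB) between two equal-length pixel sequences."""
    mse = sum((r - o) ** 2 for r, o in zip(ref, out)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(max_val ** 2 / mse)

def ssim_global(ref, out, max_val=255.0):
    """Single-window (global) SSIM; production code uses a sliding Gaussian window."""
    n = len(ref)
    mu_x, mu_y = sum(ref) / n, sum(out) / n
    var_x = sum((r - mu_x) ** 2 for r in ref) / n
    var_y = sum((o - mu_y) ** 2 for o in out) / n
    cov = sum((r - mu_x) * (o - mu_y) for r, o in zip(ref, out)) / n
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2  # standard SSIM constants
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def temporal_warping_error(frames):
    """Zero-motion stand-in for flow-warped error: mean |I_t - I_{t-1}| over the
    sequence (real pipelines warp I_{t-1} with optical flow before differencing)."""
    diffs = [sum(abs(a - b) for a, b in zip(f0, f1)) / len(f0)
             for f0, f1 in zip(frames, frames[1:])]
    return sum(diffs) / len(diffs)
```

Identical reference and output frames yield an infinite PSNR and an SSIM of 1.0, which gives a quick sanity check when wiring up an evaluation pipeline.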
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


