Capsule Neural Network and Determinantal Point Process (CAPSDPP) based Summarization of Surveillance Videos

Authors

  • Tabiya Manzoor Beigh, V. Prasanna Venkatesan, C. Punitha Devi

Keywords:

Capsule Neural Network; Determinantal point process; Keyframes; Redundancy; Segmentation; Summarization; Shot; Video surveillance

Abstract

Seamless deployment and the low cost of surveillance cameras have benefited various agencies, such as schools, colleges, airports, railway stations, and shopping malls. However, the data generated by these cameras is enormous, and locating a specific clip requires users to spend time and energy watching the entire video. Video summarization aims to produce a brief yet comprehensive depiction of the essential content of a video. The summary can be presented as keyframes or as short video summaries that avoid redundancy and emphasize important and varied segments. At the edge, constraints such as finite computational capacity and restricted bandwidth limit the resources available. This work proposes a lightweight model based on the Capsule neural network (CapsNet) to summarize surveillance videos. Capsule neural networks are employed to extract spatiotemporal features that capture both motion and visual information. These deep CapsNet features are used for shot segmentation, and a determinantal point process (DPP) then selects diverse keyframes within the segmented shots. We assessed the effectiveness of the proposed method on benchmark datasets from the Open Video Project (OVP) and YouTube (YT). Our findings indicate that the proposed approach outperforms existing methods.
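The two stages named in the abstract, shot segmentation on per-frame features followed by DPP-based keyframe selection, can be illustrated with a minimal sketch. It is not the authors' implementation. The per-frame feature matrix stands in for CapsNet features, and the cosine-distance threshold, the cosine-similarity kernel, the greedy MAP approximation, and the function names `segment_shots` and `greedy_dpp_select` are all assumptions made for illustration. Feature vectors are assumed to be nonzero.

```python
import numpy as np


def segment_shots(features, threshold=0.5):
    """Split frame indices into shots wherever consecutive features differ sharply.

    `features` is an (n_frames, dim) array; `threshold` is a hypothetical
    cosine-distance cutoff, not a value taken from the paper.
    """
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    # Cosine distance between frame i and frame i + 1.
    dist = 1.0 - np.sum(unit[:-1] * unit[1:], axis=1)
    # A shot boundary starts at the frame that follows a large jump.
    boundaries = np.where(dist > threshold)[0] + 1
    return np.split(np.arange(len(features)), boundaries)


def greedy_dpp_select(features, k):
    """Greedily pick up to k frames that maximize det(L_S).

    L is a cosine-similarity kernel (an assumption). The determinant of the
    submatrix L_S is large when the selected frames are mutually dissimilar,
    which is how a DPP favors diverse subsets.
    """
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    L = unit @ unit.T
    selected = []
    for _ in range(min(k, len(features))):
        best, best_logdet = None, -np.inf
        for i in range(len(features)):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best is None:
            # No remaining frame adds volume (all are near-duplicates).
            break
        selected.append(best)
    return sorted(selected)


# Toy features: frames 0-1 look alike, frames 2-3 look alike.
toy = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
shots = segment_shots(toy)
keyframes = greedy_dpp_select(toy, k=2)
```

On the toy input, the segmentation separates the two groups of similar frames, and the greedy selection returns one frame from each group rather than two near-duplicates. In a full pipeline the selection would be run within each segmented shot, as the abstract describes.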

Published

14.08.2024

How to Cite

Tabiya Manzoor Beigh. (2024). Capsule Neural Network and Determinantal Point Process (CAPSDPP) based Summarization of Surveillance Videos. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 2518 –. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/6678

Section

Research Article
