Optimized Support Vector Machine for Big Data Classification NPC
Keywords:
Neighbours Progressive Competition, Big Data, Support Vector Machine, Quantum particle swarm optimization, Feature selection
Abstract
The rapid advancement of contemporary technology and smart systems has produced a large influx of big data. Learning from many real-world datasets is limited by the class imbalance problem: a dataset is imbalanced when one class (the majority class) contains disproportionately more instances than the other (the minority class). Traditional machine learning algorithms struggle to perform well on classification tasks over such datasets. To compensate for the imbalance, Neighbours Progressive Competition (NPC) employs a hybrid machine learning approach that grades the training samples, using both local and global information to generate the grades. The contribution of this article is a new classifier that deals efficiently with the imbalance issue without requiring manually set parameters or expert knowledge. To this end, a Hybrid Support Vector Machine approach is designed around three major steps: pre-processing, dimension reduction, and classification. First, the pre-processing phase normalizes the data. Next, the extensive feature set is reduced using Quantum Theory-based Particle Swarm Optimization (QPSO). With this technique, a better solution can be obtained for classifying big data, so existing problems related to accuracy metrics can be resolved. Finally, a hybrid optimized support vector machine is proposed to carry out the big data classification task. The suggested technique is compared with sample algorithms on imbalanced datasets to demonstrate its efficacy.
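The first two stages named in the abstract, normalization and QPSO-based feature reduction, can be illustrated with a minimal sketch. The abstract gives no algorithmic details, so everything below is an assumption rather than the authors' implementation: min-max scaling stands in for the normalization step, particle positions in [0, 1] are thresholded at 0.5 to obtain a feature mask, the update is the basic textbook QPSO rule, and the names `min_max_normalize`, `to_mask`, `qpso_select` and the coefficient `beta=0.75` are hypothetical. The `fitness` callable is supplied by the caller; in a full pipeline it might be the validation error of an SVM trained on the selected features.

```python
import math
import random


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def to_mask(position):
    """Threshold a continuous particle position into a boolean feature mask."""
    return [x > 0.5 for x in position]


def qpso_select(fitness, n_features, n_particles=10, n_iter=30, beta=0.75, seed=0):
    """Minimise fitness(mask) with a basic QPSO update; returns (mask, score)."""
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(n_features)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fitness(to_mask(x)) for x in xs]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        # Mean of all personal best positions ("mbest" in QPSO literature).
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(n_features)]
        for i in range(n_particles):
            for d in range(n_features):
                phi = rng.random()
                u = 1.0 - rng.random()  # lies in (0, 1], so log(1/u) is finite
                # Local attractor between personal best and global best.
                attractor = phi * pbest[i][d] + (1 - phi) * gbest[d]
                step = beta * abs(mbest[d] - xs[i][d]) * math.log(1.0 / u)
                x_new = attractor + step if rng.random() < 0.5 else attractor - step
                xs[i][d] = min(1.0, max(0.0, x_new))  # keep position in [0, 1]
            fit = fitness(to_mask(xs[i]))
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = xs[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = xs[i][:], fit
    return to_mask(gbest), gbest_fit
```

Because the fitness function is passed in, the same search loop can be paired with any classifier; on an imbalanced dataset a class-sensitive score (for example, one based on minority-class recall) would usually be a more informative fitness than plain accuracy.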
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.