Conversion of Unstructured File to Structured File in Cloud Computing

Authors

  • Manjunath Singh H, Tanuja R

Keywords:

Unstructured Data, Cloud Computing, Machine Learning, Natural Language Processing, Data Structuring

Abstract

Data in the cloud are accumulating rapidly and are typically unstructured, which complicates storage, search, and analysis. This work examines how unstructured files can be converted to a structured format for archival and easy retrieval, and develops a cloud-based platform that employs natural language processing (NLP), machine learning (ML), and database optimisation techniques. The study uses NLP for text segmentation, ML for classification, and structured databases for efficient querying. The proposed method was tested on a set of 10,000 unstructured files and showed a 35% increase in data search efficiency and a 28% decrease in processing time compared with traditional rule-based algorithms. The data also showed a significant positive relationship between file size and processing time (R² = 0.82), indicating that the solution should be designed to scale. Hypothesis testing with ANOVA showed a statistically significant difference in structuring accuracy between the methods (p < 0.05). These results provide evidence that AI-based structuring improves data access and cloud system performance. Industries with large volumes of unstructured data should therefore consider implementing automated structuring systems.
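The three stages named in the abstract (segmentation, classification, structured querying) can be illustrated with a minimal sketch. This is not the authors' implementation: a regular-expression sentence splitter stands in for the NLP step, a keyword-overlap rule stands in for the trained ML classifier, and an in-memory SQLite table stands in for the cloud database. All names (CATEGORY_KEYWORDS, segment, classify, structure, the segments table) are hypothetical.

```python
import re
import sqlite3

# Hypothetical keyword lists; a stand-in for a trained ML classifier.
CATEGORY_KEYWORDS = {
    "invoice": {"invoice", "amount", "payment"},
    "report": {"summary", "findings", "analysis"},
}


def segment(text):
    """Split raw text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify(sentence):
    """Return the category with the most keyword hits, or "other" if none."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    best, best_hits = "other", 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = category, hits
    return best


def structure(file_name, text, conn):
    """Store each classified segment as one row; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS segments "
        "(file_name TEXT, position INTEGER, category TEXT, content TEXT)"
    )
    # An index on category keeps category lookups cheap as the table grows.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_category ON segments (category)"
    )
    rows = [(file_name, i, classify(s), s) for i, s in enumerate(segment(text))]
    conn.executemany("INSERT INTO segments VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
raw = "Invoice 17 lists the amount due. The analysis summary follows. Thanks!"
count = structure("note.txt", raw, conn)

# Structured retrieval: fetch only the segments classified as invoices.
invoice_rows = conn.execute(
    "SELECT content FROM segments WHERE category = ?", ("invoice",)
).fetchall()
```

Once the text is stored as rows, retrieval becomes an indexed query on the category column instead of a scan over raw files, which is the kind of search improvement the abstract attributes to structuring.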

DOI: https://doi.org/10.17762/ijisae.v12i3.7375


Published

24.03.2024

How to Cite

Manjunath Singh H. (2024). Conversion of Unstructured File to Structured File in Cloud Computing. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 4469 –. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/7375

Section

Research Article