Named Entity Recognition Driven Synthesis of IT Job Descriptions in Morocco: A Comparative Analysis of BERT and BiLSTM Models
Keywords:
BERT, BiLSTM, Information Technology, Job descriptions, Named Entity Recognition, Summarization.
Abstract
The information technology (IT) sector is dynamic and diverse, which poses a major challenge for jobseekers and recruiters alike: both must navigate massive lists of job offers to extract the relevant information. This article proposes a new approach to this challenge that integrates Named Entity Recognition (NER) into the synthesis of job descriptions in the IT domain, contributing to the optimization of job search processes and recruitment strategies specific to this sector. Our approach covers conceptualization, data preparation, and the training of BERT (Bidirectional Encoder Representations from Transformers) and BiLSTM (Bidirectional Long Short-Term Memory) models, and compares the performance of the two NER models through in-depth evaluation. Its originality lies in using NER as the cornerstone of automatic synthesis. By automating the extraction of relevant entities such as job titles, company names, work locations, responsibilities, technical and non-technical skills, diplomas, and years of experience required, we streamline the job search process. The results underline the potential of NER to improve the accessibility and comprehensibility of the complex information contained in IT job advertisements. Our evaluations show that BERT models outperform BiLSTM models in accuracy and overall performance in named entity recognition on this task.
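As an illustration of how NER output can drive the synthesis of a job description, the following is a minimal sketch, not the authors' implementation. It assumes the tagger (BERT or BiLSTM) emits one BIO tag per token; the label names (ORG, JOB_TITLE, LOC, SKILL), the example sentence, and the helper functions `group_entities` and `synthesize` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_entities(tokens, tags):
    """Merge BIO-tagged tokens into (label, text) spans.

    Assumption: tags follow the BIO scheme ("B-X", "I-X", "O").
    A stray "I-X" that does not continue an open "X" span is dropped.
    """
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: flush any entity that is still open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" tag or inconsistent "I-" tag: close the open entity.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


def synthesize(spans):
    """Build a short structured summary, one line per entity label."""
    by_label = defaultdict(list)
    for label, text in spans:
        if text not in by_label[label]:  # skip repeated mentions
            by_label[label].append(text)
    return "\n".join(
        f"{label}: {', '.join(texts)}" for label, texts in by_label.items()
    )


# Hypothetical tagger output for one sentence of a job offer.
tokens = ["Acme", "seeks", "a", "Data", "Engineer", "in",
          "Casablanca", "with", "Python", "and", "SQL"]
tags = ["B-ORG", "O", "O", "B-JOB_TITLE", "I-JOB_TITLE", "O",
        "B-LOC", "O", "B-SKILL", "O", "B-SKILL"]

spans = group_entities(tokens, tags)
summary = synthesize(spans)
print(summary)
```

In this sketch the synthesis step is a simple template that lists the extracted entities by type; the summary can be no more complete than the tags the NER model produces, which is why the comparison between the two taggers matters.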
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.