MiniMind-Dense: a Small Language Model that Supports HR Services by Adopting MiniMind with Supervised Fine-Tuning and LoRA
Keywords:
large language model; human resources (HR); Malaysia; reinforcement learning; supervised fine-tuning; LoRA; SME AI
Abstract
Small and medium-sized enterprises (SMEs) form the backbone of Malaysia’s economy, yet they often lack the resources to adopt advanced AI tools. Generative AI and large language models (LLMs) promise to transform human resources (HR) by automating tasks such as policy guidance, recruitment support, and employee coaching [1]. However, generic LLMs trained on broad data do not capture local legal norms or HR-specific knowledge and may produce irrelevant or non-compliant advice. To address this, we adopted and adapted MiniMind [2] to build a Malaysian HR-focused language assistant. Starting from a compact GPT-style base model, we implemented a three-stage refinement pipeline: (1) supervised fine-tuning (SFT) on an expanded HR-domain dialogue dataset; (2) reinforcement learning from human feedback (RLHF) via Proximal Policy Optimization to align outputs with HR professionals’ preferences; and (3) LoRA parameter-efficient tuning to inject final domain expertise. Building on this pipeline, we designed a new architecture, “MiniMind-Dense”, which incorporates Transformer improvements such as grouped-query attention, rotary position embeddings, and SwiGLU activations. Extensive evaluation on Malaysian HR queries shows dramatic improvements: the BLEU score [3] rose from ~5% for the pretrained model to ~79% after LoRA tuning, and ROUGE-L [4] from ~36% to ~92%. Qualitative analysis confirms highly fluent, relevant, and human-like responses, in contrast to the generic outputs of the base model. These results demonstrate the feasibility of a localized, aligned small language model (SLM) as an AI HR assistant. Future work will integrate retrieval-augmented generation (RAG) [5] for factual grounding and expand multilingual capabilities.
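To make the LoRA stage of the pipeline concrete, the sketch below shows the core low-rank update in plain NumPy. This is an illustration of the general LoRA technique [26], not the paper's actual code: the dimensions, rank `r`, and scaling `alpha` are placeholder values, and real training would update the factors by gradient descent rather than the random "trained" matrix used here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder sizes; the paper's actual layer widths and rank are not stated here.
d_in, d_out, r, alpha = 16, 8, 4, 8
W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight

# LoRA factors: A is small random, B starts at zero,
# so the adapted layer initially matches the base model exactly.
A = rng.normal(scale=0.01, size=(r, d_in))
B = np.zeros((d_out, r))

def base_forward(x):
    return x @ W.T

def lora_forward(x):
    # Adapter path adds a rank-r correction scaled by alpha / r.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(3, d_in))

# Before any training, the adapter is a no-op.
assert np.allclose(lora_forward(x), base_forward(x))

# Stand-in for a trained B; after training, the adapter can be
# merged into W so inference pays no extra cost.
B = rng.normal(size=(d_out, r))
W_merged = W + (alpha / r) * (B @ A)
assert np.allclose(x @ W_merged.T, lora_forward(x))
```

Only A and B (r·(d_in + d_out) parameters) are trained, which is why LoRA suits the final domain-injection stage on modest SME-scale hardware.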
References
J. Bersin, “The role of generative AI and large language models in HR,” Industry blog article, Mar. 10, 2023.
J. Gong, MiniMind [Computer software], 2025.
K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “BLEU: A method for automatic evaluation of machine translation,” in Proc. 40th Annu. Meeting Assoc. Comput. Linguistics, 2002, pp. 311–318.
C.-Y. Lin, “ROUGE: A package for automatic evaluation of summaries,” in Proc. ACL-04 Workshop: Text Summarization Branches Out, 2004, pp. 74–81.
P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal et al., “Retrieval-augmented generation for knowledge-intensive NLP tasks,” arXiv preprint arXiv:2005.11401, 2020.
R. Gong, “Not leaving MSMEs behind in the AI race,” Policy report, Khazanah Research Institute, Oct. 30, 2024.
D. Manickam, “Govt to develop localised large language model,” News Article, Dec. 4, 2024.
H. Zolkepli, A. Razak, K. Adha, and A. Nazhan, “MaLLaM - Malaysia Large Language Model,” arXiv preprint arXiv:2401.14680v2, Jan. 2024.
J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang, “BioBERT: A pre-trained biomedical language representation model for biomedical text mining,” Bioinformatics, vol. 36, no. 4, pp. 1234–1240, 2020.
I. Beltagy, K. Lo, and A. Cohan, “SciBERT: A pretrained language model for scientific text,” in Proc. EMNLP, 2019, pp. 3615–3620.
J. Su, Y. Lu, S. Pan, A. Murtadha, B. Wen, and Y. Liu, “RoFormer: Enhanced transformer with rotary position embedding,” arXiv preprint arXiv:2104.09864v5, Nov. 8, 2023. [Online]. Available: https://arxiv.org/abs/2104.09864
H. Touvron, T. Lavril, G. Izacard, X. Martinet, M. A. Lachaux, T. Lacroix et al., “LLaMA: Open and efficient foundation language models,” arXiv preprint arXiv:2302.13971, 2023.
N. Shazeer, “GLU variants improve transformer,” arXiv preprint arXiv:2002.05202, Feb. 12, 2020. [Online]. Available: https://arxiv.org/abs/2002.05202
Z. Zhang et al., “ReLU2 wins: Discovering efficient activation functions for sparse LLMs,” arXiv preprint arXiv:2402.03804v1, Feb. 6, 2024. [Online]. Available: https://arxiv.org/abs/2402.03804
J. Ainslie, J. Lee-Thorp, M. de Jong, Y. Zemlyanskiy, F. Lebron, and S. Sanghai, “GQA: Training generalized multi-query transformer models from multi-head checkpoints,” in Proc. 2023 Conf. Empirical Methods in Natural Language Processing (EMNLP), H. Bouamor, J. Pino, and K. Bali, Eds., Singapore, pp. 4895–4901, Dec. 2023. [Online]. Available: https://doi.org/10.18653/v1/2023.emnlp-main.298
C. Fang, C. Qin, Q. Zhang, and K. Yao, “RecruitPro: A pretrained language model with skill-aware prompt learning for intelligent recruitment,” in Proc. 29th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD '23), Aug. 2023. [Online]. Available: https://doi.org/10.1145/3580305.3599894
N. Otani, N. Bhutani, and E. Hruschka, “Natural language processing for human resources: A survey,” arXiv preprint arXiv:2410.16498v2, Mar. 25, 2025. [Online]. Available: https://arxiv.org/abs/2410.16498v2
L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin et al., “Training language models to follow instructions with human feedback,” arXiv preprint arXiv:2203.02155, 2022.
J. Chen, X. Han, Y. Ma, X. Zhou, and L. Xiang, “Unlock the correlation between supervised fine-tuning and reinforcement learning in training code large language models,” arXiv preprint arXiv:2406.10305v2, Dec. 17, 2024. [Online]. Available: https://arxiv.org/abs/2406.10305v2
E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang et al., “LoRA: Low-rank adaptation of large language models,” arXiv preprint arXiv:2106.09685, 2022.
R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn, “Direct preference optimization: Your language model is secretly a reward model,” arXiv preprint arXiv:2305.18290v3, Jul. 29, 2024. [Online]. Available: https://arxiv.org/abs/2305.18290v3
M. Madanchian, “From recruitment to retention: AI tools for human resource decision-making,” Appl. Sci., vol. 14, no. 24, p. 11750, 2024. [Online]. Available: https://doi.org/10.3390/app142411750
A. Lacroux and C. Martin Lacroux, “Should I trust the artificial intelligence to recruit? Recruiters’ perceptions and behavior when faced with algorithm-based recommendation systems during resume screening,” Front. Psychol., vol. 13, p. 895997, 2022. [Online]. Available: https://doi.org/10.3389/fpsyg.2022.895997
D. Zielinski, “How HR is using generative AI in performance management,” Society for Human Resource Management, Aug. 8, 2023. [Online]. Available: https://www.shrm.org/topics-tools/news/technology/how-hr-using-generative-ai-performance-management
S. K. Ho, “Amendments to Malaysia’s Personal Data Protection Act 2010 (PDPA),” Deloitte Southeast Asia, Oct. 29, 2024. [Online]. Available: https://www.deloitte.com/southeast-asia/en/services/consulting-risk/perspectives/my-pdpa-amendments.html
C. S. Seah, A. N. A. Nuar, Y. X. Loh, F. W. Jalaludin, H. Y. Foo, and L. L. Har, “Exploring the adoption of artificial intelligence in SMEs: An investigation into the Malaysian business landscape,” Pacific Corporate Sustainability, vol. 2, no. 3, Article 35, 2023. [Online]. Available: https://doi.org/10.55092/pcs2023020035
V. B. Parthasarathy, A. Zafar, A. Khan, and A. Shahid, “The ultimate guide to fine-tuning LLMs from basics to breakthroughs: An exhaustive review,” arXiv preprint arXiv:2408.13296, 2024.
S. Raschka, Build a Large Language Model (from scratch). Manning Publications, 2024.
A. T. Leejoy, “Training language models to follow instructions with human feedback: A comprehensive review,” Medium blog article, Mar. 2, 2025.
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.
P. F. Christiano, J. Leike, T. Brown, M. Martic, S. Legg, and D. Amodei, “Deep reinforcement learning from human preferences,” in Adv. Neural Inf. Process. Syst., vol. 30, 2017.
S. Chatterjee, N. P. Rana, K. Tamilmani, and A. Sharma, “The adoption of artificial intelligence in human resource management: Towards a research agenda,” Int. J. Inf. Manage., vol. 52, p. 102019, 2020.
A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán et al., “Unsupervised cross-lingual representation learning at scale,” arXiv preprint arXiv:1911.02116, 2020.
S. Zhang, S. Roller, N. Goyal, M. Artetxe, M. Chen, S. Chen et al., “OPT: Open pre-trained transformer language models,” arXiv preprint arXiv:2205.01068, 2022.
M. Delange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis et al., “A continual learning survey: Defying forgetting in classification tasks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 7, pp. 3366–3385, 2021.
B. Heo, S. Park, D. Han, and S. Yun, “Rotary position embedding for vision transformer,” arXiv preprint arXiv:2403.13298v2, Jul. 16, 2024. [Online]. Available: https://arxiv.org/abs/2403.13298
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.