MiniMind-Dense: a Small Language Model that Supports HR Services by Adopting MiniMind with Supervised Fine-Tuning and LoRA

Authors

  • Darren Chai Xin Lun, Lim Tong Ming

Keywords:

large language model; human resources (HR); Malaysia; reinforcement learning; supervised fine-tuning; LoRA; SME AI.

Abstract

Small and medium-sized enterprises (SMEs) form the backbone of Malaysia’s economy, yet they often lack the resources to adopt advanced AI tools. Generative AI and large language models (LLMs) promise to transform human resources (HR) by automating tasks such as policy guidance, recruitment support, and employee coaching [1]. However, generic LLMs trained on broad data do not capture local legal norms or HR-specific knowledge and may produce irrelevant or non-compliant advice. To address this, we adopted and adapted MiniMind [2] to build a Malaysian HR-focused language assistant. Starting from a compact GPT-style base model, we implemented a three-stage refinement pipeline: (1) supervised fine-tuning (SFT) on an expanded HR-domain dialogue dataset; (2) reinforcement learning from human feedback (RLHF) via Proximal Policy Optimization (PPO) to align outputs with HR professionals’ preferences; and (3) LoRA parameter-efficient tuning to inject final domain expertise. Building on this refinement pipeline, we designed a new architecture, named “MiniMind-Dense”, which incorporates further Transformer improvements such as grouped-query attention, rotary position embeddings, and SwiGLU to achieve the goals of the research. Extensive evaluation on Malaysian HR queries shows dramatic improvements: for example, the BLEU score [3] rose from ~5% for the pretrained model to ~79% after LoRA tuning, and ROUGE-L [4] from ~36% to ~92%. Qualitative analysis confirms highly fluent, relevant, and human-like responses, unlike the generic outputs of the base model. These results demonstrate the feasibility of a localized, aligned SLM as an AI HR assistant. Future work will integrate retrieval-augmented generation (RAG) [5] for factual grounding and expand multilingual capabilities.
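For readers who want a concrete picture of two of the components named in the abstract, the sketch below shows a SwiGLU feed-forward block and a LoRA-wrapped linear layer in PyTorch. It is only an illustration of the general techniques, not the authors’ MiniMind-Dense implementation; the dimensions and hyperparameters used here (dim=512, hidden_dim=1376, r=8, alpha=16) are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank (LoRA) update."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False            # freeze the pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # y = W x + scaling * B(A x); only lora_A and lora_B receive gradients
        return self.base(x) + F.linear(F.linear(x, self.lora_A), self.lora_B) * self.scaling


class SwiGLUFeedForward(nn.Module):
    """SwiGLU feed-forward block as used in LLaMA-style dense Transformers."""

    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        # SwiGLU: down( SiLU(gate(x)) * up(x) )
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


if __name__ == "__main__":
    x = torch.randn(2, 16, 512)                 # (batch, sequence, model dim) toy input
    ffn = SwiGLUFeedForward(dim=512, hidden_dim=1376)
    lora = LoRALinear(512, 512, r=8, alpha=16)
    print(lora(ffn(x)).shape)                   # torch.Size([2, 16, 512])
```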

References

J. Bersin, “The role of generative AI and large language models in HR,” industry blog article, Mar. 10, 2023.

J. Gong, MiniMind [Computer software], 2025.

K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “BLEU: A method for automatic evaluation of machine translation,” in Proc. 40th Annu. Meeting Assoc. Comput. Linguistics, 2002, pp. 311–318.

C.-Y. Lin, “ROUGE: A package for automatic evaluation of summaries,” in Proc. ACL-04 Workshop: Text Summarization Branches Out, 2004, pp. 74–81.

P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal et al., “Retrieval-augmented generation for knowledge-intensive NLP tasks,” arXiv preprint arXiv:2005.11401, 2020.

R. Gong, “Not leaving MSMEs behind in the AI race,” Policy report, Khazanah Research Institute, Oct. 30, 2024.

D. Manickam, “Govt to develop localised large language model,” News Article, Dec. 4, 2024.

H. Zolkepli, A. Razak, K. Adha, and A. Nazhan, “MaLLaM - Malaysia Large Language Model,” arXiv preprint arXiv:2401.14680, Jan. 2024.

J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang, “BioBERT: A pre-trained biomedical language representation model for biomedical text mining,” Bioinformatics, vol. 36, no. 4, pp. 1234–1240, 2020.

I. Beltagy, K. Lo, and A. Cohan, “SciBERT: A pretrained language model for scientific text,” in Proc. EMNLP, 2019, pp. 3615–3620.

J. Su, Y. Lu, S. Pan, A. Murtadha, B. Wen, and Y. Liu, “RoFormer: Enhanced transformer with rotary position embedding,” arXiv preprint arXiv:2104.09864, 2023. [Online]. Available: https://arxiv.org/abs/2104.09864

H. Touvron, T. Lavril, G. Izacard, X. Martinet, M. A. Lachaux, T. Lacroix et al., “LLaMA: Open and efficient foundation language models,” arXiv preprint arXiv:2302.13971, 2023.

N. Shazeer, “GLU variants improve transformer,” arXiv preprint arXiv:2002.05202, 2020. [Online]. Available: https://arxiv.org/abs/2002.05202

Z. Zhang et al., “ReLU2 wins: Discovering efficient activation functions for sparse LLMs,” arXiv preprint arXiv:2402.03804, 2024. [Online]. Available: https://arxiv.org/abs/2402.03804

J. Ainslie, J. Lee-Thorp, M. de Jong, Y. Zemlyanskiy, F. Lebron, and S. Sanghai, “GQA: Training generalized multi-query transformer models from multi-head checkpoints,” in Proc. 2023 Conf. Empirical Methods in Natural Language Processing (EMNLP), H. Bouamor, J. Pino, and K. Bali, Eds., Singapore, pp. 4895–4901, Dec. 2023. [Online]. Available: https://doi.org/10.18653/v1/2023.emnlp-main.298

C. Fang, C. Qin, Q. Zhang, and K. Yao, “RecruitPro: A pretrained language model with skill-aware prompt learning for intelligent recruitment,” in Proc. 29th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD '23), Aug. 2023. [Online]. Available: https://doi.org/10.1145/3580305.3599894

N. Otani, N. Bhutani, and E. Hruschka, “Natural language processing for human resources: A survey,” arXiv preprint arXiv:2410.16498, 2025. [Online]. Available: https://arxiv.org/abs/2410.16498v2

L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin et al., “Training language models to follow instructions with human feedback,” arXiv preprint arXiv:2203.02155, 2022.

J. Chen, X. Han, Y. Ma, X. Zhou, and L. Xiang, “Unlock the correlation between supervised fine-tuning and reinforcement learning in training code large language models,” arXiv preprint arXiv:2406.10305, 2024. [Online]. Available: https://arxiv.org/abs/2406.10305v2

E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang et al., “LoRA: Low-rank adaptation of large language models,” arXiv preprint arXiv:2106.09685, 2022.

R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn, “Direct preference optimization: Your language model is secretly a reward model,” arXiv preprint arXiv:2305.18290, 2024. [Online]. Available: https://arxiv.org/abs/2305.18290v3

M. Madanchian, “From recruitment to retention: AI tools for human resource decision-making,” Appl. Sci., vol. 14, no. 24, p. 11750, 2024. [Online]. Available: https://doi.org/10.3390/app142411750

A. Lacroux and C. Martin Lacroux, “Should I trust the artificial intelligence to recruit? Recruiters’ perceptions and behavior when faced with algorithm-based recommendation systems during resume screening,” Front. Psychol., vol. 13, p. 895997, 2022. [Online]. Available: https://doi.org/10.3389/fpsyg.2022.895997

D. Zielinski, “How HR is using generative AI in performance management,” Society for Human Resource Management, Aug. 8, 2023. [Online]. Available: https://www.shrm.org/topics-tools/news/technology/how-hr-using-generative-ai-performance-management

S. K. Ho, “Amendments to Malaysia’s Personal Data Protection Act 2010 (PDPA),” Deloitte Southeast Asia, Oct. 29, 2024. [Online]. Available: https://www.deloitte.com/southeast-asia/en/services/consulting-risk/perspectives/my-pdpa-amendments.html

C. S. Seah, A. N. A. Nuar, Y. X. Loh, F. W. Jalaludin, H. Y. Foo, and L. L. Har, “Exploring the adoption of artificial intelligence in SMEs: An investigation into the Malaysian business landscape,” Pacific Corporate Sustainability, vol. 2, no. 3, Article 35, 2023. [Online]. Available: https://doi.org/10.55092/pcs2023020035

V. B. Parthasarathy, A. Zafar, A. Khan, and A. Shahid, “The ultimate guide to fine-tuning LLMs from basics to breakthroughs: An exhaustive review,” arXiv preprint arXiv:2408.13296, 2024.

S. Raschka, Build a Large Language Model (from scratch). Manning Publications, 2024.

A. T. Leejoy, “Training language models to follow instructions with human feedback: A comprehensive review,” Medium blog article, Mar. 2, 2025.

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.

P. F. Christiano, J. Leike, T. Brown, M. Martic, S. Legg, and D. Amodei, “Deep reinforcement learning from human preferences,” in Adv. Neural Inf. Process. Syst., vol. 30, 2017.

S. Chatterjee, N. P. Rana, K. Tamilmani, and A. Sharma, “The adoption of artificial intelligence in human resource management: Towards a research agenda,” Int. J. Inf. Manage., vol. 52, p. 102019, 2020.

A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán et al., “Unsupervised cross-lingual representation learning at scale,” arXiv preprint arXiv:1911.02116, 2020.

S. Zhang, S. Roller, N. Goyal, M. Artetxe, M. Chen, S. Chen et al., “OPT: Open pre-trained transformer language models,” arXiv preprint arXiv:2205.01068, 2022.

M. Delange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis et al., “A continual learning survey: Defying forgetting in classification tasks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 7, pp. 3366–3385, 2021.

B. Heo, S. Park, D. Han, and S. Yun, “Rotary position embedding for vision transformer,” arXiv preprint arXiv:2403.13298, 2024. [Online]. Available: https://arxiv.org/abs/2403.13298

Published

19.04.2025

How to Cite

Darren Chai Xin Lun, & Lim Tong Ming. (2025). MiniMind-Dense: a Small Language Model that Supports HR Services by Adopting MiniMind with Supervised Fine-Tuning and LoRA. International Journal of Intelligent Systems and Applications in Engineering, 13(1), 321–. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/7696

Issue

Section

Research Article