Talk Smart, Talk Small: Crafting Domain-Specific LLMs for SME Customer Support
Keywords:
Artificial Intelligence, Customer Engagement, Fine-tuning, Large Language Model, Reinforcement Learning

Abstract
This project addresses key challenges faced by commercial large language models (LLMs) in customer engagement, such as inconsistent responses, inaccuracies, hallucinations, and lack of follow-up questions. The goal was to develop a domain-specific LLM from scratch for small and medium enterprises (SMEs), capable of delivering relevant, consistent, and human-like responses. The methodology involved studying LLM architectures, preparing and expanding datasets, developing a base model, fine-tuning with larger domain-specific data, applying reinforcement learning, and evaluating model performance. The initial model, trained on 1.5 million tokens, lacked the language understanding needed for coherence. Scaling the dataset to 445 million tokens with general and domain-specific data improved training dynamics and model stability. Fine-tuning with 550 million tokens enhanced relevance, consistency, and human-likeness, outperforming parameter-efficient methods such as LoRA. Reinforcement learning using Identity Preference Optimization (IPO) yielded mixed results. The Normal IPO approach maintained training stability and preserved response quality at both sentence and response levels. However, the Checkpoint and EMA strategies showed fluctuating training behavior and declines in response-level consistency, human-likeness, and relevance, likely due to the small reinforcement learning dataset and instability from evolving reference models. Despite these challenges, the project demonstrated the feasibility of building a domain-specific LLM tailored for SME customer engagement. Future directions include expanding the reinforcement learning dataset, exploring alternative optimization strategies, and incorporating human feedback to further refine performance.
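As a concrete illustration of the reinforcement-learning step described above: IPO regresses a reference-adjusted log-likelihood margin toward a fixed target set by a regularization strength, and the EMA strategy lets the reference model slowly track the policy. The sketch below is a minimal, illustrative rendering of those two ideas only; the function names, the per-pair scalar formulation, and the flat list-of-floats weight representation are assumptions for clarity, not the project's actual implementation.

```python
def ipo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, tau=0.1):
    """IPO per-pair loss: squared error between the reference-adjusted
    log-likelihood margin and the target 1 / (2 * tau)."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return (margin - 1.0 / (2.0 * tau)) ** 2


def ema_update(ref_weights, policy_weights, decay=0.99):
    """EMA reference-model update: ref <- decay * ref
    + (1 - decay) * policy, applied elementwise."""
    return [decay * r + (1.0 - decay) * p
            for r, p in zip(ref_weights, policy_weights)]
```

Because the EMA reference model changes at every update, the regression target of the IPO loss effectively moves during training, which is one plausible source of the instability reported for the Checkpoint and EMA strategies.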
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license allows readers to share and adapt the material, provided they give appropriate credit, link to the license, and indicate if changes were made; any remixed, transformed, or built-upon material must be distributed under the same license as the original.