Talk Smart, Talk Small: Crafting Domain-Specific LLMs for SME Customer Support
Keywords:
Artificial Intelligence, Customer Engagement, Fine-tuning, Large Language Model, Reinforcement Learning

Abstract
This project addresses key challenges faced by commercial large language models (LLMs) in customer engagement, such as inconsistent responses, inaccuracies, hallucinations, and lack of follow-up questions. The goal was to develop a domain-specific LLM from scratch for small and medium enterprises (SMEs), capable of delivering relevant, consistent, and human-like responses. The methodology involved studying LLM architectures, preparing and expanding datasets, developing a base model, fine-tuning with larger domain-specific data, applying reinforcement learning, and evaluating model performance. The initial model, trained on 1.5 million tokens, lacked the language understanding needed for coherence. Scaling the dataset to 445 million tokens with general and domain-specific data improved training dynamics and model stability. Fine-tuning with 550 million tokens enhanced relevance, consistency, and human-likeness, outperforming parameter-efficient methods such as LoRA. Reinforcement learning using Identity Preference Optimization (IPO) yielded mixed results. The Normal IPO approach maintained training stability and preserved response quality at both sentence and response levels. However, the Checkpoint and EMA strategies showed fluctuating training behavior and declines in response-level consistency, human-likeness, and relevance, likely due to the small reinforcement learning dataset and instability from evolving reference models. Despite these challenges, the project demonstrated the feasibility of building a domain-specific LLM tailored for SME customer engagement. Future directions include expanding the reinforcement learning dataset, exploring alternative optimization strategies, and incorporating human feedback to further refine performance.
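As a concrete illustration of the reinforcement-learning step described above: IPO regresses a reference-adjusted log-likelihood margin toward a fixed target set by a regularization strength, and the EMA strategy lets the reference model slowly track the policy. The sketch below is a minimal, illustrative rendering of those two ideas only; the function names, the per-pair scalar formulation, and the flat list-of-floats weight representation are assumptions for clarity, not the project's actual implementation.

```python
def ipo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, tau=0.1):
    """IPO per-pair loss: squared error between the reference-adjusted
    log-likelihood margin and the target 1 / (2 * tau)."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return (margin - 1.0 / (2.0 * tau)) ** 2


def ema_update(ref_weights, policy_weights, decay=0.99):
    """EMA reference-model update: ref <- decay * ref
    + (1 - decay) * policy, applied elementwise."""
    return [decay * r + (1.0 - decay) * p
            for r, p in zip(ref_weights, policy_weights)]
```

Because the EMA reference model changes at every update, the regression target of the IPO loss effectively moves during training, which is one plausible source of the instability reported for the Checkpoint and EMA strategies.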
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license allows readers to share and adapt the material, provided they give appropriate credit, link to the license, and indicate if changes were made; any remixed, transformed, or built-upon material must be distributed under the same license as the original.