Automatic Program Repair: A Comparative Study of LLMs on QuixBugs
Keywords:
Bugs, Debugging, Automatic Program Repair, ChatGPT, Gemini
Abstract
Software bugs are errors or flaws in a program's code that can lead to incorrect or unexpected behavior, so detecting and resolving them is crucial for reliable and secure software development. Debugging is a human-centric, time-consuming, and resource-intensive process, which makes it one of the most expensive phases of software development. Automatic Program Repair (APR) is an emerging area of research that aims to fix software bugs automatically with minimal human intervention. Traditional APR tools use search-based or learning-based techniques to find bugs based on test suites and bug patterns, and therefore rely heavily on test cases. AI-driven APR tools are trained on large-scale codebases, open-source bug-fix histories, and benchmarks such as QuixBugs. They can analyze buggy code and generate patches that are syntactically and semantically correct, which reduces debugging time and improves software reliability. The QuixBugs benchmark consists of 40 programs from the Quixey Challenge, each available in two languages, Python and Java. Each program contains a one-line defect and is accompanied by failing test cases. This paper presents a comparative study of APR techniques on the QuixBugs benchmark. It evaluates and compares the automatic bug-fixing capability of LLMs such as ChatGPT and Google Gemini, thereby contributing to the understanding of the role of LLMs in automatic program repair.
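To make the evaluation setting concrete, the following is a minimal Python sketch of test-based patch validation on a program with a one-line defect. It is illustrative only: the functions `buggy_largest` and `patched_largest`, the test cases, and the helpers `passes_all` and `repair_rate` are hypothetical and are not taken from QuixBugs or from the study's tooling. A candidate patch is counted as a repair here only if it passes every test case.

```python
from typing import Callable, List, Sequence, Tuple

# A test case pairs an input list with the expected result (hypothetical data).
TestCase = Tuple[Sequence[int], int]


def buggy_largest(values: Sequence[int]) -> int:
    """Hypothetical buggy program containing a one-line defect."""
    best = values[0]
    for v in values[1:]:
        if v < best:  # defect: the comparison operator is flipped
            best = v
    return best


def patched_largest(values: Sequence[int]) -> int:
    """Hypothetical candidate patch, e.g. one proposed by an LLM."""
    best = values[0]
    for v in values[1:]:
        if v > best:  # fix: only this one line differs from the buggy version
            best = v
    return best


TESTS: List[TestCase] = [
    ([3, 1, 2], 3),
    ([1, 5, 4], 5),
    ([7], 7),
    ([-2, -1, -3], -1),
]


def passes_all(program: Callable[[Sequence[int]], int],
               tests: List[TestCase]) -> bool:
    """Return True only if the program yields the expected output on every test."""
    for args, expected in tests:
        try:
            if program(args) != expected:
                return False
        except Exception:
            # A crash counts as a failing test.
            return False
    return True


def repair_rate(outcomes: List[bool]) -> float:
    """Fraction of programs whose candidate patch passed all tests."""
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    print("buggy passes:", passes_all(buggy_largest, TESTS))
    print("patch passes:", passes_all(patched_largest, TESTS))
```

Passing the available tests does not by itself guarantee that a patch is semantically correct, so a sketch like this would typically be complemented by manual inspection of the generated patches.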
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others remix, transform, and build upon the material, provided that they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.