Lineage, Traceability, and Reproducibility as Reliability Requirements in Enterprise AI Systems
Keywords:
Enterprise Artificial Intelligence, Traceability, Data Lineage, Reproducibility, Data Governance, AI Reliability, Model Versioning
Abstract
Enterprise systems increasingly apply artificial intelligence to key business and compliance decisions. Most such systems focus on model accuracy and neglect reliability aspects such as lineage, traceability, and reproducibility. In this paper, we treat these three aspects as fundamental reliability requirements of enterprise AI. The study examined a real enterprise AI deployment over 12 months using a before-and-after quantitative design. After structured lineage capture, version control, and reproducibility controls were introduced, lineage coverage rose to 0.91 and the reproducibility success rate rose to 92%. Incident investigation time fell by 66%, audit preparation by 62%, and compliance findings by 75%. A Monte Carlo simulation also indicated lower risk variability once the lineage controls were in place. These results indicate that integrating lineage, traceability, and reproducibility into AI platforms enhances reliability, audit readiness, and trust in AI results.
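The abstract reports ratio-style measures without stating how they are calculated. The following minimal sketch shows one plausible way to compute such measures. It assumes that lineage coverage is the share of tracked artifacts with recorded lineage and that reproducibility success is the share of re-runs that match the original output. The `Artifact` record, the function names, and the sample values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record for one dataset, feature set, or model version."""
    name: str
    has_lineage: bool    # upstream sources and transformations are recorded
    rerun_matches: bool  # a re-run from recorded inputs reproduced the output


def lineage_coverage(artifacts):
    """Fraction of artifacts with recorded lineage (assumed definition)."""
    if not artifacts:
        return 0.0
    return sum(a.has_lineage for a in artifacts) / len(artifacts)


def reproducibility_rate(artifacts):
    """Fraction of artifacts whose re-run matched the original (assumed definition)."""
    if not artifacts:
        return 0.0
    return sum(a.rerun_matches for a in artifacts) / len(artifacts)


def relative_reduction(before, after):
    """Before/after reduction expressed as a fraction of the baseline."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before


# Made-up inventory for illustration only; not data from the study.
inventory = [
    Artifact("raw_events", has_lineage=True, rerun_matches=True),
    Artifact("features_v2", has_lineage=True, rerun_matches=True),
    Artifact("model_v7", has_lineage=True, rerun_matches=False),
    Artifact("legacy_report", has_lineage=False, rerun_matches=False),
]

print(lineage_coverage(inventory))
print(reproducibility_rate(inventory))
print(relative_reduction(before=50.0, after=17.0))
```

Under these assumptions, a figure such as the 66% drop in incident investigation time would correspond to `relative_reduction` applied to the average investigation time before and after the controls were introduced.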
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.


