Lineage, Traceability, and Reproducibility as Reliability Requirements in Enterprise AI Systems

Divya Bonthala

Keywords:

Enterprise Artificial Intelligence, Traceability, Data Lineage, Reproducibility, Data Governance, AI Reliability, Model Versioning

Abstract

Enterprises increasingly apply artificial intelligence to key business and compliance decisions. Many such systems focus on model accuracy and neglect reliability aspects such as lineage, traceability, and reproducibility. In this paper, we treat these three aspects as fundamental reliability requirements of enterprise AI. The study examined a real enterprise AI system over 12 months using a before-and-after quantitative design. After structured lineage capture, version control, and reproducibility controls were introduced, lineage coverage rose to 0.91 and the reproducibility success rate rose to 92%. Incident investigation time fell by 66%, audit preparation time fell by 62%, and compliance findings fell by 75%. Monte Carlo simulation also indicated that risk variability was smaller once the lineage controls had been incorporated. These results indicate that integrating lineage, traceability, and reproducibility into AI platforms enhances reliability, audit readiness, and trust in AI results.
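The abstract does not define how its metrics are calculated, so the following is only a minimal sketch under stated assumptions. It takes lineage coverage to be the fraction of derived artifacts with at least one recorded upstream, reproducibility success to be the share of reruns whose output hash matches the original, and the reported reductions to be simple before/after percentages. The `Artifact` schema, all function names, and the toy values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A dataset, feature table, or model (hypothetical schema, not the paper's)."""
    name: str
    upstream: list[str] = field(default_factory=list)  # recorded lineage edges
    is_source: bool = False  # raw inputs have no upstream by definition


def lineage_coverage(artifacts: list[Artifact]) -> float:
    """Fraction of derived (non-source) artifacts with a recorded upstream."""
    derived = [a for a in artifacts if not a.is_source]
    if not derived:
        return 0.0
    return sum(1 for a in derived if a.upstream) / len(derived)


def reproducibility_rate(original: list[str], rerun: list[str]) -> float:
    """Share of reruns whose output hash equals the original run's hash."""
    if not original or len(original) != len(rerun):
        raise ValueError("expected two non-empty hash lists of equal length")
    return sum(o == r for o, r in zip(original, rerun)) / len(original)


def percent_reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * (before - after) / before


# Toy inputs for illustration only; these are not the study's data.
toy_artifacts = [
    Artifact("raw_events", is_source=True),
    Artifact("features", upstream=["raw_events"]),
    Artifact("model_v1", upstream=["features"]),
    Artifact("report"),  # derived, but its lineage was never recorded
]
toy_coverage = lineage_coverage(toy_artifacts)
toy_repro = reproducibility_rate(["a", "b", "c", "d"], ["a", "b", "x", "d"])
toy_reduction = percent_reduction(before=10.0, after=4.0)
```

Under these assumed definitions, coverage is measured only over derived artifacts, so raw sources do not inflate or deflate the score; other definitions (for example, column-level coverage) would be equally plausible.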


ijisae
