Multi-Version Infrastructure for Privacy-Preserving AI/ML Inference at Scale

Authors

  • Jay Bankimchandra Desai

Keywords:

Privacy-Preserving Inference, Multi-Version Feature Representations, Embedding Compliance, Differential Privacy, Regulatory-Aware Machine Learning

Abstract

As regulatory regimes, multi-stakeholder data relationships, and compliance requirements multiply, privacy becomes a growing architectural concern for large-scale AI/ML inference systems. Inference pipelines that apply a single, globally restrictive data policy to every inference context incur a measurable decrease in model performance, while dynamically modifying data usage per request risks introducing policy violations. To avoid both problems, our multi-version architecture explicitly maintains multiple versions of user and participant information at the feature and embedding levels. Context-aware version selection mechanisms deterministically map the metadata describing an incoming request to the appropriate data usage policy at runtime. Versioned feature vectors are generated from superset representations of available signals, and the appropriate version is selected based on the incoming request context and its corresponding data usage policy. Model-specific embeddings are derived from the privacy-compliant feature vectors to ensure end-to-end compliance. Rule-based selection schemes, implemented as abstractions decoupled from inference execution code, allow rapid regulatory adaptation without requiring service redeployment. Continuous monitoring helps validate selection quality and detect performance regressions in production environments. The computational overhead introduced by generating and maintaining multiple feature and embedding versions can be reduced through centralized build-once orchestration, shared feature storage schemas, and hybrid offline–online embedding generation, keeping inference within internet-scale latency budgets.
Beyond privacy, this architectural pattern generalizes to fairness-aware inference, multi-tenant data isolation, and auditable policy enforcement, establishing versioned feature and embedding representations as a foundational primitive for developing trustworthy, policy-compliant AI/ML systems.
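
As a rough illustration of the selection mechanism described in the abstract, the following minimal Python sketch builds every feature version once from a superset of signals and then maps request metadata to a version through an ordered rule table. All names here (POLICY_FEATURES, RULES, select_version, the signal names, and the jurisdiction and consent rules) are hypothetical and chosen only for illustration; they are not taken from the paper and do not reflect any specific regulation.

```python
# Hypothetical policy table: each version lists the signals it is allowed to use.
POLICY_FEATURES = {
    "full": ("age_bucket", "region", "purchase_history", "device_id"),
    "restricted": ("age_bucket", "region"),
    "minimal": ("region",),
}

# Hypothetical ordered rules kept apart from inference code.
# The first matching rule wins, so the same metadata always yields the same version.
RULES = (
    (lambda meta: not meta.get("consent", False), "minimal"),
    (lambda meta: meta.get("jurisdiction") in {"EU", "UK"}, "restricted"),
)
DEFAULT_VERSION = "full"


def select_version(meta):
    """Deterministically map request metadata to a policy version."""
    for predicate, version in RULES:
        if predicate(meta):
            return version
    return DEFAULT_VERSION


def build_versions(superset):
    """Build all feature versions once from the superset of available signals."""
    return {
        version: {name: superset[name] for name in allowed}
        for version, allowed in POLICY_FEATURES.items()
    }


def features_for(meta, versions):
    """Return the pre-built feature version that matches the request context."""
    return versions[select_version(meta)]


if __name__ == "__main__":
    superset = {
        "age_bucket": "25-34",
        "region": "EU-West",
        "purchase_history": [101, 202],
        "device_id": "d-1",
    }
    versions = build_versions(superset)
    request = {"jurisdiction": "EU", "consent": True}
    print(select_version(request), features_for(request, versions))
```

Because the rules live in a data structure rather than in the inference path, a rule change in this sketch only touches RULES or POLICY_FEATURES, which mirrors the decoupling the abstract describes; a production system would presumably load such rules from external configuration.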

DOI: https://doi.org/10.17762/ijisae.v14i1s.8129

Published

26.03.2026

How to Cite

Jay Bankimchandra Desai. (2026). Multi-Version Infrastructure for Privacy-Preserving AI/ML Inference at Scale. International Journal of Intelligent Systems and Applications in Engineering, 14(1s), 106–111. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/8129

Section

Research Article