SelfSuper-ID: Self-Supervised Deep Learning for Synthetic Identity Detection Under Extreme Label Sparsity

Authors

  • Suman Kumar Sanjeev Prasanna

Keywords:

Anomaly detection, Deep learning, Fraud detection, Identity representation learning, Label sparsity, Self-supervised learning, Synthetic identity fraud.

Abstract

Synthetic identity fraud is increasingly pervasive, yet supervised detection models often fail when labeled datasets are sparse or incomplete. This research introduces SelfSuper-ID, a self-supervised deep learning framework designed to detect synthetic identities under extreme label scarcity. The approach leverages contrastive representation learning and pseudo-label propagation to extract high-fidelity latent embeddings from multi-modal identity data, including biometric, behavioral, and transactional signals. By constructing a latent similarity graph and optimizing a cluster-aware contrastive objective, the model identifies anomalies indicative of synthetic or manipulated identities without relying on extensive labeled data. The framework also incorporates adversarial regularization to enhance robustness against emerging manipulation strategies. Empirical evaluation on large-scale, partially labeled synthetic identity datasets demonstrates that SelfSuper-ID achieves a 25–30% improvement in detection precision and recall compared to semi-supervised and unsupervised baselines, while maintaining stable performance under extreme label sparsity. These results establish self-supervised representation learning as a scalable, practical, and resilient methodology for operational identity verification in resource-constrained or rapidly evolving digital environments.
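The abstract names two building blocks, contrastive representation learning and pseudo-label propagation over a latent similarity graph, without giving their exact form. The following is a minimal sketch of one common way to realize each, not the authors' implementation: an InfoNCE-style contrastive loss stands in for the cluster-aware objective, and a clamped k-nearest-neighbour averaging scheme stands in for the propagation step. All function names, the temperature, and the choice of k are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss (assumed stand-in for the paper's objective).

    view_a[i] and view_b[i] are embeddings of two augmented views of the
    same identity record; every other pairing is treated as a negative.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denom - logits[i]
    return total / n


def propagate_labels(embeddings, seeds, k=2, iterations=20):
    """Spread a few seed scores over a k-NN cosine similarity graph.

    seeds maps a record index to a known score (1.0 = synthetic,
    0.0 = genuine). Unlabeled records start at 0.5 and are repeatedly
    set to the similarity-weighted mean of their neighbours; seeded
    records stay clamped to their known value.
    """
    n = len(embeddings)
    neighbours = []
    for i in range(n):
        sims = sorted(
            ((cosine(embeddings[i], embeddings[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        # Clip negative similarities so edge weights are non-negative.
        neighbours.append([(max(s, 0.0), j) for s, j in sims[:k]])

    scores = [seeds.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seeds:
                updated.append(seeds[i])
                continue
            weight = sum(w for w, _ in neighbours[i])
            if weight == 0.0:
                updated.append(scores[i])
            else:
                updated.append(sum(w * scores[j] for w, j in neighbours[i]) / weight)
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy embeddings: records 0 and 1 are close, as are records 2 and 3.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(info_nce(toy, toy))
    print(propagate_labels(toy, {0: 1.0, 2: 0.0}, k=1))
```

In this sketch only two of the four toy records carry labels, which mirrors the label-sparse setting the abstract describes: the unlabeled records inherit a score from whichever labeled record they sit closest to in embedding space.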

Published

26.03.2021

How to Cite

Suman Kumar Sanjeev Prasanna. (2021). SelfSuper-ID: Self-Supervised Deep Learning for Synthetic Identity Detection Under Extreme Label Sparsity. International Journal of Intelligent Systems and Applications in Engineering, 9(1), 164–174. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/8160

Section

Research Article
