SelfSuper-ID: Self-Supervised Deep Learning for Synthetic Identity Detection Under Extreme Label Sparsity
Keywords:
Anomaly detection, Deep learning, Fraud detection, Identity representation learning, Label sparsity, Self-supervised learning, Synthetic identity fraud.
Abstract
Synthetic identity fraud is increasingly pervasive, yet supervised detection models often fail when labeled datasets are sparse or incomplete. This research introduces SelfSuper-ID, a self-supervised deep learning framework designed to detect synthetic identities under extreme label scarcity. The approach leverages contrastive representation learning and pseudo-label propagation to extract high-fidelity latent embeddings from multi-modal identity data, including biometric, behavioral, and transactional signals. By constructing a latent similarity graph and optimizing a cluster-aware contrastive objective, the model identifies anomalies indicative of synthetic or manipulated identities without relying on extensive labeled data. The framework also incorporates adversarial regularization to enhance robustness against emerging manipulation strategies. Empirical evaluation on large-scale, partially labeled synthetic identity datasets demonstrates that SelfSuper-ID achieves a 25–30% improvement in detection precision and recall compared to semi-supervised and unsupervised baselines, while maintaining stable performance under extreme label sparsity. These results establish self-supervised representation learning as a scalable, practical, and resilient methodology for operational identity verification in resource-constrained or rapidly evolving digital environments.
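The abstract names contrastive representation learning and a latent similarity graph but does not give the loss or the scoring rule. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an InfoNCE-style contrastive loss over two augmented views of each identity record, a cosine k-nearest-neighbour graph over the resulting embeddings, and a simple isolation score in place of the paper's cluster-aware objective. All function names, the temperature, and `k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss (assumed form): view_a[i] should match view_b[i]
    and be dissimilar to every other row of view_b."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / n


def knn_similarity_graph(embeddings, k=2):
    """Map each index to the indices of its k most cosine-similar neighbours."""
    graph = {}
    for i, emb in enumerate(embeddings):
        scored = [(cosine(emb, other), j)
                  for j, other in enumerate(embeddings) if j != i]
        scored.sort(reverse=True)
        graph[i] = [j for _, j in scored[:k]]
    return graph


def isolation_scores(embeddings, k=2):
    """Hypothetical anomaly score: 1 minus mean similarity to the k nearest
    neighbours. Records that sit far from every neighbourhood score high."""
    graph = knn_similarity_graph(embeddings, k)
    scores = []
    for i in range(len(embeddings)):
        sims = [cosine(embeddings[i], embeddings[j]) for j in graph[i]]
        scores.append(1.0 - sum(sims) / len(sims))
    return scores


if __name__ == "__main__":
    # Toy embeddings: three similar identities and one outlier (index 3).
    toy = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
    scores = isolation_scores(toy, k=2)
    print("most isolated index:", scores.index(max(scores)))
```

In this toy setup the loss is lower when the two views of the same record are paired correctly than when they are shuffled, which is the property a contrastive encoder is trained to exploit. In practice the embeddings would come from a trained encoder and the score would be calibrated against whatever labels are available.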
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, anyone who shares the material must give appropriate credit, provide a link to the license, and indicate whether changes were made; anyone who remixes, transforms, or builds upon the material must distribute their contributions under the same license as the original.


