DART-Net: A Dual-Path Transformer Architecture Robust to Adversarial Attacks for Efficient and Flexible Spam Detection
Subject Areas: Natural Language Processing
Amin Hadi 1, Mahdi Mosleh 2, Keyvan Mohebbi 3
1 -
2 -
3 -
Keywords: Spam detection, Adversarial attacks, Adversarial learning, Dual-path architecture, Transformer, DART-Net
Abstract:
With the increasing sophistication of spam, particularly phishing messages and adversarially generated samples, the need for detection systems that simultaneously offer high accuracy, efficiency, and flexibility is greater than ever. Current models typically excel in only one of these dimensions: large language models achieve high accuracy but incur unacceptable inference latency, while lightweight models, although fast, are highly vulnerable to novel adversarial attacks. DART-Net introduces a novel architecture that dynamically balances performance and resilience. It leverages two parallel processing paths: a lightweight DistilBERT-based path for rapid initial evaluation, and a powerful RoBERTa-based path enhanced with online adversarial training for deeper analysis. An uncertainty-aware gating mechanism intelligently routes inputs, activating the robust path only for samples that are classified with low confidence or flagged as potentially harmful. We evaluated DART-Net on a diverse set of public datasets, including the contemporary SpamDam dataset. Experimental results show that DART-Net achieves classification performance competitive with large state-of-the-art models while reducing average inference latency by 70%. More importantly, under adversarial attacks such as TextFooler, DART-Net significantly outperforms standard models and reduces the attack success rate (ASR) by more than 40 percentage points. This research introduces a new paradigm for designing and implementing practical, secure, and scalable spam detection systems.
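The routing idea described in the abstract can be illustrated with a minimal sketch. The snippet below assumes two hypothetical scoring functions standing in for the DistilBERT and RoBERTa paths (the real system would use fine-tuned transformer classifiers), and uses predictive entropy of the fast path's output as the uncertainty signal; the entropy threshold `tau` and the toy scores are illustrative assumptions, not values from the paper.

```python
import math

def lightweight_scores(text):
    # Placeholder for the fast DistilBERT-based path:
    # returns [ham, spam] probabilities. Purely illustrative.
    return [0.9, 0.1] if "meeting" in text else [0.55, 0.45]

def robust_scores(text):
    # Placeholder for the adversarially trained RoBERTa-based path,
    # assumed slower but more decisive on hard inputs.
    return [0.02, 0.98] if "prize" in text else [0.97, 0.03]

def entropy(probs):
    """Shannon entropy (nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def classify(text, tau=0.5):
    """Uncertainty-aware gating: use the fast path, and escalate to
    the robust path only when the fast path's entropy exceeds tau."""
    probs = lightweight_scores(text)
    if entropy(probs) > tau:   # low confidence -> route to robust path
        probs = robust_scores(text)
    return "spam" if probs[1] > probs[0] else "ham"

print(classify("team meeting at 3pm"))  # confident: fast path only
print(classify("you won a prize!!!"))   # uncertain: escalated
```

Because most traffic is easy and resolved by the fast path alone, the expensive path runs on only a small fraction of inputs, which is how the architecture can cut average latency while retaining robustness on hard or adversarial samples.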
