Volume 18, Number 3

A Cross-Domain Benchmark and Evaluation of Cybersecurity-Specific BERT Models

  Authors

Laeeba Javed and Aasim Zafar, Aligarh Muslim University, India

  Abstract

Cybersecurity language models are typically evaluated in fragmented ways, which impedes meaningful comparison and operational understanding. Existing cybersecurity-specific BERT models are often tested in isolation, with inconsistent preprocessing, tokenization, and evaluation procedures. This paper presents a unified cross-domain benchmarking study that systematically evaluates cybersecurity-adapted BERT models under identical experimental conditions. The evaluation spans the CTI (cyber threat intelligence), phishing, log, and CVE domains using CTI-BERT, SecureBERT, CySecBERT, and SecBERT. Results reveal strong performance convergence across models and highlight domain-driven failure modes rather than architectural superiority. To examine real-world resilience, the study extends this benchmark with zero-shot and few-shot cross-domain evaluations, revealing asymmetric transfer behavior and domain-dependent adaptation efficiency. A controlled ablation of training methods is also performed, indicating that aggressive optimization does not always improve performance and can reduce stability in semantically rich domains. Stress filtering further exposes brittle reliance on lexical shortcuts and limited semantic grounding. These findings provide practical guidance for model–domain alignment and real-world cybersecurity deployment. They are especially relevant to network security and operational settings where cybersecurity models are deployed across heterogeneous data streams.

  Keywords

Cybersecurity, NLP, BERT, Threat Intelligence, Domain-Adaptive Pretraining, Network Security.