Model Card for cisco-ai/SecureBERT2.0-NER
The Secure Modern BERT NER Model is a fine-tuned transformer based on SecureBERT 2.0, designed for Named Entity Recognition (NER) in cybersecurity text.
It extracts domain-specific entities such as Indicators, Malware, Organizations, Systems, and Vulnerabilities from unstructured data sources like threat reports, incident analyses, advisories, and blogs.
NER in cybersecurity enables:
- Automated extraction of indicators of compromise (IOCs)
- Structuring of unstructured threat intelligence text
- Improved situational awareness for analysts
- Faster incident response and vulnerability triage
Model Details
Model Description
- Developed by: Cisco AI
- Model Type: ModernBertForTokenClassification
- Framework: TensorFlow / Transformers
- Tokenizer Type: PreTrainedTokenizerFast
- Number of Labels: 11
- Task: Named Entity Recognition (NER)
- License: Apache-2.0
- Language: English
- Base Model: cisco-ai/SecureBERT2.0
Supported Entity Labels
| Entity | Description |
|---|---|
| B-Indicator, I-Indicator | Indicators of Compromise (e.g., IPs, domains, hashes) |
| B-Malware, I-Malware | Malware or exploit names |
| B-Organization, I-Organization | Companies or groups mentioned |
| B-System, I-System | Affected software or platforms |
| B-Vulnerability, I-Vulnerability | Specific CVEs or flaw descriptions |
| O | Outside token |
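To make the BIO scheme concrete, here is a hand-constructed tagging of a short sentence. The sentence and labels are hypothetical illustration only, not model output:

```python
# Hand-written illustration of the BIO label scheme above; the sentence
# and tags are hypothetical examples, not predictions from the model.
words = ["Stealc", "exploits", "a", "flaw", "in", "Chrome", "to", "target", "Acme", "Corp"]
tags  = ["B-Malware", "O", "O", "O", "O", "B-System", "O", "O", "B-Organization", "I-Organization"]
for word, tag in zip(words, tags):
    print(f"{word:10s} {tag}")
```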
Model Configuration
| Parameter | Value |
|---|---|
| Hidden size | 768 |
| Intermediate size | 1152 |
| Hidden layers | 22 |
| Attention heads | 12 |
| Max sequence length | 8192 |
| Vocabulary size | 50368 |
| Activation | GELU |
| Dropout | 0.0 (embedding, attention, MLP, classifier) |
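These values can be read back from the released configuration. A quick check, assuming the checkpoint's `config.json` matches the table above:

```python
from transformers import AutoConfig

# Load the published configuration and compare it against the table above.
# Values are read from the hub config, not hard-coded here.
config = AutoConfig.from_pretrained("cisco-ai/SecureBERT2.0-NER")
print(config.hidden_size)              # expected: 768
print(config.intermediate_size)        # expected: 1152
print(config.num_hidden_layers)        # expected: 22
print(config.num_attention_heads)      # expected: 12
print(config.max_position_embeddings)  # expected: 8192
print(config.vocab_size)               # expected: 50368
print(config.num_labels)               # expected: 11
print(config.id2label)                 # BIO label mapping shipped with the model
```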
Uses
Direct Use
- Named Entity Recognition (NER) on cybersecurity text
- Threat intelligence enrichment
- IOC extraction and normalization
- Incident report analysis
- Vulnerability mention detection
Downstream Use
This model can be integrated into:
- Threat intelligence platforms (TIPs)
- SOC automation tools
- Cybersecurity knowledge graphs
- Vulnerability management and CVE monitoring systems
Out-of-Scope Use
- Non-technical or general-domain NER tasks
- Generative or conversational AI applications
Benchmark Cybersecurity NER Corpus
Dataset Overview
| Aspect | Description |
|---|---|
| Purpose | Benchmark dataset for extracting cybersecurity entities from unstructured reports |
| Data Source | Curated threat intelligence documents emphasizing malware and system analysis |
| Annotation Methodology | Fully hand-labeled by domain experts |
| Entity Types | Malware, Indicator, System, Organization, Vulnerability |
| Size | 3.4k training samples + 717 test samples |
How to Get Started with the Model
Example Usage (Transformers)
```python
from transformers import AutoTokenizer, TFAutoModelForTokenClassification, pipeline

model_name = "cisco-ai/SecureBERT2.0-NER"

# Load the tokenizer and the fine-tuned token-classification model.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForTokenClassification.from_pretrained(model_name)

# Build an NER pipeline and run it on a sample sentence.
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer)

text = "Stealc malware targets browser cookies and passwords."
entities = ner_pipeline(text)
print(entities)
```
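By default the pipeline returns per-token predictions. For downstream use (e.g., feeding a TIP or knowledge graph) it is often convenient to merge subword pieces into whole-entity spans. Continuing from the snippet above, a sketch using the standard Transformers aggregation option (not specific to this model); the printed record is illustrative:

```python
# Group subword tokens into whole-entity spans, then keep only the fields
# a downstream system typically needs.
ner_grouped = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
for ent in ner_grouped(text):
    record = {
        "type": ent["entity_group"],
        "value": ent["word"],
        "score": round(float(ent["score"]), 3),
    }
    print(record)
# e.g. {'type': 'Malware', 'value': 'Stealc', 'score': 0.99}  (illustrative output)
```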
Training Details
Training Objective and Procedure
SecureBERT2.0-NER was fine-tuned for token-level classification on cybersecurity text using a cross-entropy loss.
Training focused on accurately classifying entity boundaries and types across five cybersecurity-specific categories: Malware, Indicator, System, Organization, and Vulnerability.
The AdamW optimizer was used with a linear learning rate scheduler, and gradient clipping ensured stability during fine-tuning.
Training Configuration
| Setting | Value |
|---|---|
| Objective | Token-wise Cross Entropy |
| Optimizer | AdamW |
| Learning Rate | 1e-5 |
| Weight Decay | 0.001 |
| Batch Size per GPU | 8 |
| Epochs | 20 |
| Max Sequence Length | 1024 |
| Gradient Clipping Norm | 1.0 |
| Scheduler | Linear |
| Mixed Precision | fp16 |
| Framework | TensorFlow / Transformers |
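For reference, the recipe in the table corresponds roughly to the following Transformers `TrainingArguments`. This is an illustrative reconstruction, not the published training script: the card lists TensorFlow as the framework, while `TrainingArguments` targets the PyTorch `Trainer`, whose default optimizer is AdamW and whose default token-classification loss is cross-entropy.

```python
from transformers import TrainingArguments

# Illustrative mapping of the fine-tuning recipe above onto TrainingArguments.
# Hyperparameters are copied from the table; dataset loading and the Trainer
# call are omitted because the original training script is not published.
args = TrainingArguments(
    output_dir="securebert2-ner",
    learning_rate=1e-5,               # AdamW is the Trainer's default optimizer
    weight_decay=0.001,
    per_device_train_batch_size=8,
    num_train_epochs=20,
    max_grad_norm=1.0,                # gradient clipping
    lr_scheduler_type="linear",
    fp16=True,                        # mixed precision (requires a CUDA device)
)
```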
Training Dataset
The model was fine-tuned on a cybersecurity-specific NER corpus, containing annotated threat intelligence reports, advisories, and technical documentation.
| Property | Description |
|---|---|
| Dataset Type | Manually annotated corpus |
| Language | English |
| Entity Types | Malware, Indicator, System, Organization, Vulnerability |
| Train Size | 3,400 samples |
| Test Size | 717 samples |
| Annotation Method | Expert hand-labeling for accuracy and consistency |
Preprocessing
- Texts were tokenized using the `PreTrainedTokenizerFast` tokenizer from SecureBERT 2.0.
- All sequences were truncated or padded to 1024 tokens.
- Labels were aligned with subword tokens to maintain token–label consistency.
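A common way to implement the alignment step is via the fast tokenizer's `word_ids()`, as sketched below. The exact scheme used during training is not specified beyond the bullet above, so the masking of continuation subwords with `-100` (ignored by cross-entropy loss) and the label ids are assumptions for illustration:

```python
from transformers import AutoTokenizer

# Sketch of word-level label alignment to subword tokens.
tokenizer = AutoTokenizer.from_pretrained("cisco-ai/SecureBERT2.0-NER")

words = ["Stealc", "malware", "targets", "browser", "cookies"]
word_labels = [1, 0, 0, 0, 0]  # e.g. 1 = B-Malware, 0 = O (illustrative ids)

enc = tokenizer(words, is_split_into_words=True, truncation=True, max_length=1024)
aligned = []
previous = None
for word_id in enc.word_ids():
    if word_id is None:            # special tokens and padding
        aligned.append(-100)
    elif word_id != previous:      # first subword of a word keeps the label
        aligned.append(word_labels[word_id])
    else:                          # continuation subwords are masked out
        aligned.append(-100)
    previous = word_id
print(aligned)
```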
Hardware and Training Setup
| Component | Description |
|---|---|
| GPUs Used | 8× NVIDIA A100 |
| Precision | Mixed precision (fp16) |
| Batch Size | 8 per GPU |
| Framework | Transformers (TensorFlow backend) |
Optimization Summary
The model converged after approximately 20 epochs, with loss stabilizing at a low level.
Validation metrics (F1, precision, recall) showed steady improvement from epoch 3 onward, confirming effective domain-specific adaptation.
Evaluation
Testing Data, Factors & Metrics
Testing Data
Evaluation was conducted on a cybersecurity-specific NER benchmark corpus containing annotated threat reports, advisories, and incident analysis texts.
This benchmark includes five key entity types: Malware, Indicator, System, Organization, and Vulnerability.
Metrics
The following metrics were used to assess model performance:
- F1-score: Harmonic mean of precision and recall
- Recall: Measures how many true entities were correctly identified
- Precision: Measures how many predicted entities were correct
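The card does not state which toolkit computed these scores; entity-level precision, recall, and F1 for BIO-tagged output are commonly computed with `seqeval` (`pip install seqeval`), as in this illustrative sketch:

```python
from seqeval.metrics import precision_score, recall_score, f1_score

# Entity-level metrics over BIO tag sequences. The lists below are
# illustrative; in practice they come from decoding model predictions
# on the test split.
y_true = [["B-Malware", "O", "O", "B-System", "I-System"]]
y_pred = [["B-Malware", "O", "O", "B-System", "O"]]

print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1:       ", f1_score(y_true, y_pred))
```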
Results
| Model | F1 | Recall | Precision |
|---|---|---|---|
| CyBERT | 0.351 | 0.281 | 0.467 |
| SecureBERT | 0.734 | 0.759 | 0.717 |
| SecureBERT 2.0 (Ours) | 0.945 | 0.965 | 0.927 |
Summary
The SecureBERT 2.0 NER model significantly outperforms both CyBERT and the original SecureBERT across all metrics.
- It achieves an F1-score of 0.945, an absolute improvement of roughly 21 points over SecureBERT.
- Its recall (0.965) indicates excellent coverage of cybersecurity entities.
- Its precision (0.927) shows strong accuracy and low false-positive rates.
This demonstrates that domain-adaptive pretraining and fine-tuning on cybersecurity corpora dramatically improves NER performance compared to general or earlier models.
Reference
```bibtex
@article{aghaei2025securebert,
  title={SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence},
  author={Aghaei, Ehsan and Jain, Sarthak and Arun, Prashanth and Sambamoorthy, Arjun},
  journal={arXiv preprint arXiv:2510.00240},
  year={2025}
}
```
Model Card Authors
Cisco AI
Model Card Contact
For inquiries, please contact [email protected]