# DeepSeek-Prover-V2-7B · LoRA Adapter

This repository hosts a LoRA adapter fine-tuned on top of `deepseek-ai/DeepSeek-Prover-V2-7B` using 🤗 TRL's `SFTTrainer`.

## Training Setup
| Hyper‑parameter | Value |
|---|---|
| Learning rate | 2 × 10⁻⁴ |
| Batch size / device | 16 |
| Gradient accumulation steps | 1 |
| Effective batch size | 16 |
| Epochs | 1 |
| Scheduler | linear |
| Warm‑up ratio | 0.03 |
| Weight decay | 0.01 |
| Seed | 42 |
| Sequence length | 1792 |
| Flash-Attention-2 | ✅ (`use_flash_attention_2=True`) |
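For reference, a minimal `trl` `SFTConfig` matching the table above might look like the sketch below. Argument names can shift between `trl` releases (e.g. `max_seq_length` was later renamed `max_length`), and the `output_dir` is only a placeholder, so treat this as an approximation rather than the exact training script.

```python
from trl import SFTConfig

# Sketch of training arguments corresponding to the hyper-parameter table above.
# Exact keyword names depend on the trl/transformers versions in use.
sft_config = SFTConfig(
    output_dir="outputs",               # placeholder output directory
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    weight_decay=0.01,
    seed=42,
    max_seq_length=1792,                # renamed to `max_length` in newer trl releases
    bf16=True,
)
```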
### LoRA configuration
| Setting | Value |
|---|---|
| Rank r | 16 |
| α | 32 |
| Dropout | 0.05 |
| Target modules | all linear layers |
| Modules saved | embed_tokens, lm_head |
| Bias | none |
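A matching `peft` `LoraConfig` would look roughly like this. The `target_modules="all-linear"` shortcut requires a reasonably recent `peft` release, and `task_type="CAUSAL_LM"` is an assumption for a causal-LM setup.

```python
from peft import LoraConfig

# Approximate LoRA configuration corresponding to the table above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",                  # apply LoRA to all linear layers
    modules_to_save=["embed_tokens", "lm_head"],  # trained and saved in full
    bias="none",
    task_type="CAUSAL_LM",                        # assumed task type
)
```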
RoPE scaling: YaRN (factor = 16.0, β_fast = 32.0, β_slow = 1.0)
Training was performed on GPUs in bfloat16 precision (`torch_dtype=torch.bfloat16`).
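A hedged sketch of how the base model could be loaded with these settings is shown below. The exact `rope_scaling` keys are defined by the model's custom configuration (check its `config.json`), so the dictionary here is illustrative rather than copied from the training script.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

base_id = "deepseek-ai/DeepSeek-Prover-V2-7B"

# Apply the YaRN RoPE-scaling parameters listed above.
config = AutoConfig.from_pretrained(base_id, trust_remote_code=True)
config.rope_scaling = {
    "type": "yarn",      # key names follow the model's custom config; verify against config.json
    "factor": 16.0,
    "beta_fast": 32.0,
    "beta_slow": 1.0,
}

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    config=config,
    torch_dtype=torch.bfloat16,               # bfloat16 precision used for training
    attn_implementation="flash_attention_2",  # requires flash-attn to be installed
    trust_remote_code=True,
)
```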
## Loss Curve

*(Training loss curve figure.)*

## Usage
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the adapter together with its base model
# (AutoPeftModelForCausalLM lives in peft, not transformers).
model = AutoPeftModelForCausalLM.from_pretrained(
    "your-username/DeepSeek-Prover-V2-7B-conjecture-chat-new-config-20250724_0955",
    trust_remote_code=True,
)

# The tokenizer comes from the base model.
tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Prover-V2-7B", trust_remote_code=True)

prompt = "Prove that the sum of two even numbers is even."
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```
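If you need a standalone checkpoint, the adapter can be merged into the base weights using standard `peft` functionality; the output directory below is just an example.

```python
# Merge the LoRA weights into the base model and save a standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("DeepSeek-Prover-V2-7B-merged")  # example output directory
tok.save_pretrained("DeepSeek-Prover-V2-7B-merged")
```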
## Model tree

- This adapter: `haielab/DeepSeek-Prover-V2-7B-conjecture-base-FineTune-20250724_0955`
- Base model: `deepseek-ai/DeepSeek-Prover-V2-7B`