SmolLM3 German Teacher V6 (λ=2500)

A fine-tuned 3B-parameter model specialized in teaching German at A1-B1 level, optimized for grammatical accuracy using Elastic Weight Consolidation (EWC).

Model Details

| Property | Value |
|---|---|
| Base Model | HuggingFaceTB/SmolLM3-3B |
| Training Method | QLoRA + EWC (two-stage continual learning) |
| Training Data | ~1,800 examples (900 per stage) |
| EWC Lambda | 2500 (best from grid search) |
| Hardware | RTX 4090 (24 GB VRAM) |

Evaluation Results

| Metric | V6 (This Model) | V5 Baseline | Improvement |
|---|---|---|---|
| CoLA MCC | 0.6236 | 0.5927 | +5.2% |
| GEC Macro F1 | 0.1909 | 0.1121 | +70.3% |
| Generation Quality | 3.29/5.0 | 3.71/5.0 | -11.3% |
| Overall Score | 0.5158 | 0.4987 | +3.4% |

Why V6?

We tested multiple approaches (V5-V9) to balance accuracy and fluency:

| Model | Method | CoLA MCC | GEC F1 | Generation | Overall | Status |
|---|---|---|---|---|---|---|
| V5 | EWC Baseline | 0.5927 | 0.1121 | 3.71 | 0.4987 | Reference |
| V6 | EWC λ=2500 | 0.6236 | 0.1909 | 3.29 | 0.5158 | Best |
| V7 | Multi-Task SFT | 0.3333 | 0.0479 | 3.00 | 0.3858 | Failed |
| V8 | DPO | 0.5833 | 0.0625 | 3.63 | 0.4730 | Failed |
| V9 | SimPO | 0.5674 | 0.0731 | 3.72 | 0.4787 | Failed |

V6 has the best grammatical accuracy, which is critical for a German teacher model. Preference optimization (V8/V9) improved fluency but destroyed accuracy.

Capabilities

  • German grammar correction with clear explanations
  • Error identification: haben/sein in Perfekt, article gender, case endings, word order
  • Level-appropriate instruction for A1-B1 learners, following CEFR guidelines
  • Bilingual explanations (German/English)

Files

| File | Size | Quantization | VRAM Required |
|---|---|---|---|
| smollm3-german-teacher-v6-q4_k_m.gguf | 1.9 GB | Q4_K_M | ~3 GB |
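
The GGUF can also be fetched without cloning the repo, using huggingface_hub (assuming the file sits at the repo root, as listed above):

```python
from huggingface_hub import hf_hub_download

# Downloads to the local HF cache and returns the path to the GGUF file
gguf_path = hf_hub_download(
    repo_id="jaigouk/smollm3-german-teacher",
    filename="smollm3-german-teacher-v6-q4_k_m.gguf",
)
print(gguf_path)
```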

Usage

With Ollama

```bash
# Download and create model
ollama create smollm3-german:v6 -f - <<EOF
FROM ./smollm3-german-teacher-v6-q4_k_m.gguf

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

SYSTEM """You are a friendly German language teacher specializing in A1-B1 level instruction. Help students learn German grammar, correct mistakes with clear explanations, and provide examples."""

PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.7
EOF

# Use
ollama run smollm3-german:v6 "Korrigiere: Ich habe nach Berlin gefahren."
```
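
Once registered, the model can also be called programmatically; a minimal sketch against Ollama's local REST API (default port 11434), using the requests package:

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "smollm3-german:v6",
        "messages": [
            {"role": "user", "content": "Korrigiere: Ich habe nach Berlin gefahren."}
        ],
        "stream": False,  # return one JSON object instead of a token stream
    },
)
print(resp.json()["message"]["content"])
```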

With llama.cpp

```bash
# -e makes llama-cli interpret the \n escape sequences inside the prompt string
./llama-cli -m smollm3-german-teacher-v6-q4_k_m.gguf \
  -p "<|im_start|>user\nKorrigiere: Ich habe nach Berlin gefahren.<|im_end|>\n<|im_start|>assistant\n" \
  -n 256 --temp 0.7 -e
```
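
The same GGUF works through the llama-cpp-python bindings as well; a minimal sketch, assuming `pip install llama-cpp-python` and the file in the working directory (recent versions read the chat template from the GGUF metadata):

```python
from llama_cpp import Llama

llm = Llama(model_path="smollm3-german-teacher-v6-q4_k_m.gguf", n_ctx=4096, verbose=False)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Korrigiere: Ich habe nach Berlin gefahren."}],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```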

With Transformers (Full Model)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("jaigouk/smollm3-german-teacher")
tokenizer = AutoTokenizer.from_pretrained("jaigouk/smollm3-german-teacher")

# Build the prompt with the chat template so it matches the <|im_start|> format used in training
messages = [{"role": "user", "content": "Korrigiere: Ich habe nach Berlin gefahren."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
# Strip the prompt tokens before decoding
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Example Output

Input: Korrigiere: Ich habe nach Berlin gefahren.

Output:

Die korrekte Form ist: "Ich bin nach Berlin gefahren."

ErklΓ€rung: Bei Verben der Bewegung (fahren, gehen, kommen, fliegen) verwendet man im Perfekt das Hilfsverb "sein", nicht "haben".

Regel: Bewegungsverben + sein = Perfekt

  • Ich bin gefahren (nicht: Ich habe gefahren)
  • Ich bin gegangen
  • Ich bin gekommen

Training Details

Two-Stage EWC Training

  1. Stage 1: Grammar fundamentals (articles, cases, verb conjugation)
  2. Stage 2: Error correction and explanations with EWC regularization

EWC (Elastic Weight Consolidation)

EWC prevents catastrophic forgetting by:

  • Computing Fisher Information Matrix after Stage 1
  • Using Fisher as importance weights during Stage 2
  • Loss: L_total = L_task + λ · L_EWC, where L_EWC = Σ_i F_i (θ_i − θ*_i)² penalizes moving parameter θ_i away from its Stage-1 value θ*_i in proportion to its Fisher importance F_i

λ=2500 was selected from a grid search over {2500, 5000, 10000}; the two components of this loss are sketched below.
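
A minimal PyTorch sketch of those two pieces (illustrative only; `fisher_diagonal`, `ewc_penalty`, and the dataloader of tokenized Stage-1 batches are hypothetical names, not this repo's training code):

```python
import torch

def fisher_diagonal(model, dataloader, n_samples=500):
    # Diagonal Fisher estimate: average squared gradient of the Stage-1 loss
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    model.eval()
    seen = 0
    for batch in dataloader:
        if seen >= n_samples:
            break
        model.zero_grad()
        model(**batch).loss.backward()
        for n, p in model.named_parameters():
            if p.requires_grad and p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        seen += 1
    return {n: f / max(seen, 1) for n, f in fisher.items()}

def ewc_penalty(model, fisher, star_params, lam=2500.0):
    # lambda * sum_i F_i * (theta_i - theta*_i)^2, added to the Stage-2 task loss
    penalty = 0.0
    for n, p in model.named_parameters():
        if n in fisher:
            penalty = penalty + (fisher[n] * (p - star_params[n]) ** 2).sum()
    return lam * penalty
```

During Stage 2 the total loss is then `task_loss + ewc_penalty(model, fisher, star_params)`, where `star_params` is a frozen copy of the parameters at the end of Stage 1.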

Hyperparameters

```python
# Stage 1 & 2
learning_rate = 2e-5
batch_size = 4
epochs = 3
lora_r = 16
lora_alpha = 32

# EWC
ewc_lambda = 2500
fisher_samples = 500
```
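
For reference, a sketch of a PEFT adapter config matching the LoRA values above; `target_modules` is an assumption (typical attention projections), since the training script's exact choice isn't listed here:

```python
from peft import LoraConfig

# r and lora_alpha mirror the hyperparameters above; target_modules is assumed
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```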

Limitations

  • Optimized for A1-B1 level; may struggle with advanced grammar (C1+)
  • Generation fluency is lower than V8/V9 (trade-off for accuracy)
  • Best for structured grammar tasks; less natural in open conversation
  • German/English bilingual; other languages not supported

Citation

```bibtex
@misc{smollm3-german-teacher-v6,
  author = {Jaigouk Kim},
  title = {SmolLM3 German Teacher V6: EWC-Optimized Grammar Instruction Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/jaigouk/smollm3-german-teacher}
}
```

License

Apache 2.0 (same as base model)
