
Qwen3-0.6B-DPO

Model Card for Qwen3-0.6B-DPO

This model is a fine-tuned variant of Qwen/Qwen3-0.6B, trained with Direct Preference Optimization (DPO) on a preference-formatted version of the nvidia/HelpSteer2 dataset as part of the AIPlans Model Diffing Project.

Model Details

Model Description

This model is a 0.6B parameter language model based on Qwen3-0.6B and fine-tuned using DPO for preference optimization.
The goal of the fine-tuning was to improve helpfulness and harmlessness as measured by the HelpSteer2 preference dataset, while enabling controlled model diffing experiments within the AIPlans research workflow.

Special attention was paid to training efficiency, including gradient checkpointing and other memory-saving strategies.
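As a rough illustration of this setup, the sketch below uses TRL's DPOTrainer with gradient checkpointing enabled. It is a minimal sketch rather than the exact training script: the hyperparameters are assumptions, the dataset is assumed to expose prompt/chosen/rejected columns, and argument names (e.g. processing_class) follow recent TRL releases.

```python
# Minimal DPO fine-tuning sketch with TRL (hyperparameters are illustrative,
# not the values used to train this model).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "Qwen/Qwen3-0.6B"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumed to provide prompt / chosen / rejected preference triples.
train_dataset = load_dataset("Jennny/helpsteer2-helpfulness-preference", split="train")

args = DPOConfig(
    output_dir="qwen3-0.6b-dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,   # one of the memory-saving strategies noted above
    learning_rate=5e-7,
    beta=0.1,                      # DPO KL-regularisation strength
    num_train_epochs=1,
    logging_steps=50,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,    # a frozen reference model is created automatically
)
trainer.train()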

Developed by: AIPlans
Funded by: AIPlans
Shared by: AIPlans

Model type: Causal decoder-only Transformer (LLM)
Languages: English
License: MIT
Fine-tuned from: Qwen/Qwen3-0.6B
Training Method: Direct Preference Optimization (DPO)
Intended Use: Research on model diffing, preference fine-tuning, evaluation of lightweight LLM behavior changes.
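For the model-diffing use case, a simple starting point is to compare generations from the base model and this DPO variant on the same prompt. The snippet below is a sketch; the prompt is arbitrary and the repository id is taken from the Hub listing (substitute the actual id if it differs).

```python
# Compare base vs. DPO-tuned generations on the same prompt
# (a minimal model-diffing starting point).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "Qwen/Qwen3-0.6B"
DPO_ID = "AIPlans/Qwen3-0.6B-DPO_NOTLORA"  # repository id as listed on the Hub

prompt = "Give three practical tips for writing clear documentation."

for model_id in (BASE_ID, DPO_ID):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        output = model.generate(inputs, max_new_tokens=128, do_sample=False)
    print(f"=== {model_id} ===")
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```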


Training Details

Training Data

The dataset is taken from Jennny/helpsteer2-helpfulness-preference. Thanks to Jennny for preparing it.
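The dataset can be inspected directly from the Hub; the split name and column layout assumed below are illustrative.

```python
from datasets import load_dataset

# Load the preference dataset used for DPO training (split name is an assumption).
ds = load_dataset("Jennny/helpsteer2-helpfulness-preference", split="train")
print(ds)               # number of rows and column names
print(ds.column_names)  # expected to include preference fields such as prompt / chosen / rejected
print(ds[0])            # one preference example
```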

Evaluation


Evaluation Results

The model was evaluated using lm-eval-harness on multiple reasoning and truthfulness benchmarks.
Below is a comparison between the base Qwen3-0.6B model and this DPO-trained model (trained on HelpSteer2 preference data).

📊 Benchmark Comparison

| Task             | Metric   | Base Model | DPO Model | Change  |
|------------------|----------|------------|-----------|---------|
| ARC-Challenge    | acc      | 0.3148     | 0.3208    | +0.0060 |
| ARC-Challenge    | acc_norm | 0.3447     | 0.3430    | −0.0017 |
| ARC-Easy         | acc      | 0.6044     | 0.6069    | +0.0025 |
| ARC-Easy         | acc_norm | 0.5589     | 0.5610    | +0.0021 |
| HellaSwag        | acc      | 0.3751     | 0.3782    | +0.0031 |
| HellaSwag        | acc_norm | 0.4738     | 0.4799    | +0.0061 |
| TruthfulQA (MC2) | acc      | 0.4275     | 0.4335    | +0.0060 |
| Winogrande       | acc      | 0.5604     | 0.5627    | +0.0023 |
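A comparable evaluation can be run with lm-eval-harness. The snippet below is a sketch: the repository id is taken from the Hub listing, while the few-shot settings and batch size are assumptions rather than the exact configuration behind the table above.

```python
# Evaluation sketch with lm-eval-harness (settings are illustrative).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=AIPlans/Qwen3-0.6B-DPO_NOTLORA",
    tasks=["arc_challenge", "arc_easy", "hellaswag", "truthfulqa_mc2", "winogrande"],
    batch_size=8,
)

# Print per-task metrics (accuracy keys vary slightly across lm-eval versions).
for task, metrics in results["results"].items():
    print(task, metrics)
```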

Model Card Authors

Jithesh Pavan D Souza – AIPlans Research Intern

Model Card Contact

Jithesh – [email protected]
