jadohu/Qwen3-14B-GRPO

Reinforcement Learning

README.md exists but content is empty.

Downloads last month: 8

Safetensors

Model size: 15B params

Tensor type: BF16

Model tree for jadohu/Qwen3-14B-GRPO

Base model: Qwen/Qwen3-14B-Base

Finetuned (45): this model

Quantizations: 1 model

Dataset used to train jadohu/Qwen3-14B-GRPO

Collection including jadohu/Qwen3-14B-GRPO

MASA

Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning • 5 items • Updated 9 days ago