That's what some of my datasets do, but then you're still stuck with a single trained reply, not an entire conversation.
I keep racking my brain over that haha
Edit: I misread — if you add multiple thinking blocks to the context, the model gets confused, because the chat template trims them out of the context so we don't waste tokens we no longer need.
So we can't train it like this either, because then the bot would see multiple thinking processes in one conversation.
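A minimal sketch of the trimming behavior I mean, assuming a Qwen-style convention where thinking lives in `<think>...</think>` tags and the template keeps it only on the latest assistant turn (the function name and message format here are hypothetical, not any specific library's API):

```python
import re

def strip_prior_thinking(messages):
    """Drop <think>...</think> from every assistant turn except the last one,
    mimicking how some chat templates trim old reasoning to save tokens."""
    last_assistant = max(
        (i for i, m in enumerate(messages) if m["role"] == "assistant"),
        default=None,
    )
    out = []
    for i, m in enumerate(messages):
        if m["role"] == "assistant" and i != last_assistant:
            # Earlier turns lose their thinking block at inference time,
            # so training on full traces for every turn mismatches what
            # the model will actually see in context.
            cleaned = re.sub(r"<think>.*?</think>\s*", "", m["content"], flags=re.DOTALL)
            out.append({"role": m["role"], "content": cleaned})
        else:
            out.append(dict(m))
    return out

convo = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "<think>greet back</think>Hello!"},
    {"role": "user", "content": "how are you?"},
    {"role": "assistant", "content": "<think>answer politely</think>Doing well."},
]
print(strip_prior_thinking(convo)[1]["content"])  # → "Hello!"
```

The mismatch is exactly this: if the training data keeps every turn's thinking but the template strips all but the last one, the model never sees at inference what it saw in training.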