EpistemeAI
/

ReasoningCore-3B-Instruct-r01-Reflect-Math


	---
	base_model: EpistemeAI/ReasoningCore-3B-Instruct-r01-Reflect
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	license: llama3.2
	language:
	- en
	new_version: EpistemeAI/ReasoningCore-3B-Instruct-r01-Reflect-Math
	---

	This is a 3B-parameter instruction-tuned generative model for reasoning and reflection (text in/text out).

	Model Architecture:
	Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), with GRPO fine-tuning performed using Unsloth, to align with human preferences for helpfulness and safety.
	This model was fine-tuned on the Numina math dataset.
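
For readers unfamiliar with GRPO, the core idea is that several completions are sampled for the same prompt, each is scored by a reward function, and each completion's advantage is its reward standardized against the rest of its group. The sketch below illustrates only that normalization step in plain Python. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the sample rewards are all illustrative assumptions; none of this is taken from the training code used for this model.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of completions for a single prompt.

    Illustrative only: `eps` guards against division by zero when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ here
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one math problem
# (1.0 = correct final answer, 0.0 = incorrect).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Completions that score above the group mean get a positive advantage and are reinforced; those below the mean get a negative one.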


	### Use with transformers

	With `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by using the Auto classes with the `generate()` function.

	Make sure to update your transformers installation via `pip install --upgrade transformers`.

	```python
	import torch
	from transformers import pipeline

	model_id = "EpistemeAI/ReasoningCore-3B-Instruct-r01-Reflect-Math"
	pipe = pipeline(
	    "text-generation",
	    model=model_id,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	)

	# System prompt asking for the reasoning / reflecting / answer format.
	# A triple-quoted string is needed because the prompt spans several lines.
	system_prompt = """You are a powerful assistant. Respond in the following format:
	<reasoning>
	...
	</reasoning>
	<reflecting>
	...
	</reflecting>
	<answer>
	...
	</answer>"""

	messages = [
	    {"role": "system", "content": system_prompt},
	    {"role": "user", "content": "Which is bigger? 9.11 or 9.9?"},
	]
	outputs = pipe(
	    messages,
	    max_new_tokens=256,
	)
	# The last message in the returned conversation is the assistant's reply.
	print(outputs[0]["generated_text"][-1])
	```
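
The Auto classes route mentioned above is not shown in this card, so here is a minimal sketch of what it could look like. It assumes the tokenizer ships a chat template (as Llama 3.2 instruct models normally do). The helper names `build_messages` and `generate_reply` are hypothetical and exist only for this example, and the `transformers` imports are done lazily inside the function so that building the messages does not require loading PyTorch.

```python
MODEL_ID = "EpistemeAI/ReasoningCore-3B-Instruct-r01-Reflect-Math"

SYSTEM_PROMPT = (
    "You are a powerful assistant. Respond in the following format:\n"
    "<reasoning>\n...\n</reasoning>\n"
    "<reflecting>\n...\n</reflecting>\n"
    "<answer>\n...\n</answer>"
)


def build_messages(question, system_prompt=SYSTEM_PROMPT):
    """Return a chat-style message list for one user question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def generate_reply(question, model_id=MODEL_ID, max_new_tokens=256):
    """Load the model with the Auto classes and generate one reply."""
    # Imported lazily; these are the only heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    # Apply the chat template and append the assistant turn marker.
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply("Which is bigger? 9.11 or 9.9?")` downloads the weights on first use and returns the assistant's reply as a plain string. For repeated calls it would be better to load the tokenizer and model once and reuse them.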

	## Using [SuperTransformer](https://github.com/tomtyiu/SuperTransformer-SHF)
	```python
	# Assumes the SuperTransformers class is defined in SuperTransformer.py from the linked repository.
	from SuperTransformer import SuperTransformers

	# Arguments: (1) Hugging Face model ID, (2) system prompt, (3) user prompt, (4) max tokens
	st = SuperTransformers(
	    "EpistemeAI/ReasoningCore-3B-Instruct-r01-Reflect-Math",
	    "You are a highly knowledgeable assistant with expertise in mathematics. <reasoning>...</reasoning><reflecting>...</reflecting><answer>...</answer>",
	    "What is the area of a circle, radius=16, reason step by step",
	    2026,
	)
	# 8-bit quantization
	st.HuggingFaceTransformer8bit()
	# or 4-bit quantization
	st.HuggingFaceTransformer4bit()
	```
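
Both examples above prompt the model to answer inside `<reasoning>`, `<reflecting>`, and `<answer>` tags. A minimal sketch for splitting such a reply into its sections follows. The function name `split_tagged_reply` and the sample reply are hypothetical, and because the model is not guaranteed to follow the format, missing tags are returned as `None` rather than raising an error.

```python
import re

TAGS = ("reasoning", "reflecting", "answer")


def split_tagged_reply(text):
    """Return a dict mapping each tag name to its stripped contents.

    Tags that do not appear in the reply map to None.
    """
    sections = {}
    for tag in TAGS:
        # DOTALL lets "." match newlines, since sections span several lines.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


# Hypothetical reply used only to exercise the parser.
sample = (
    "<reasoning>\n9.9 equals 9.90, and 9.90 > 9.11.\n</reasoning>\n"
    "<reflecting>\nComparing the tenths digit first confirms this.\n</reflecting>\n"
    "<answer>\n9.9\n</answer>"
)
print(split_tagged_reply(sample)["answer"])  # prints: 9.9
```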


	# Uploaded model

	- Developed by: EpistemeAI
	- License: apache-2.0
	- Finetuned from model: EpistemeAI/ReasoningCore-3B-Instruct-r01-Reflect

	This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)