---
base_model: openai/gpt-oss-20b
library_name: peft
license: apache-2.0
tags:
- trl
- sft
- lora
- reasoning
- multilingual
model_type: lora
---
# gpt-oss-20b-multilingual-reasoner
This is a LoRA (Low-Rank Adaptation) fine-tuned model based on [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b).
## Model Details
- **Base Model**: openai/gpt-oss-20b
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Framework**: TRL (Transformer Reinforcement Learning)
- **LoRA Rank**: 8
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, o_proj, v_proj, k_proj
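These hyperparameters are recorded in `adapter_config.json`. As a minimal sketch, the relevant entries would look roughly like the following (field names follow the peft LoRA adapter format; remaining fields omitted):

```python
import json

# Sketch of the relevant adapter_config.json entries for this adapter
# (field names follow the peft LoRA adapter format; other fields omitted).
adapter_config = {
    "peft_type": "LORA",
    "base_model_name_or_path": "openai/gpt-oss-20b",
    "r": 8,            # LoRA rank
    "lora_alpha": 16,  # scaling factor; effective scale is lora_alpha / r
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

print(json.dumps(adapter_config, indent=2))
```

With `r=8` and `lora_alpha=16`, the adapter updates are scaled by `lora_alpha / r = 2` before being added to the frozen base weights.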
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "yiwenX/gpt-oss-20b-multilingual-reasoner")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("yiwenX/gpt-oss-20b-multilingual-reasoner")

# Generate text (move inputs to the model's device, since device_map="auto"
# may place the model on an accelerator)
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Training Details
This model was fine-tuned using:
- **Framework**: TRL (Transformer Reinforcement Learning)
- **Method**: Supervised Fine-Tuning (SFT)
- **PEFT Type**: LoRA
- **Transformers Version**: 4.56.0
- **PyTorch Version**: 2.8.0+cu128
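As a hypothetical sketch of how an SFT + LoRA run like this is typically wired up with TRL, the configuration below reproduces the adapter settings listed above; the dataset name and training arguments are illustrative assumptions, not the actual training script:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Illustrative dataset; the actual training data is not documented in this card.
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

# LoRA settings matching the adapter described above.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Training arguments are placeholders, not the values used for this model.
training_args = SFTConfig(
    output_dir="gpt-oss-20b-multilingual-reasoner",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```

Passing `peft_config` to `SFTTrainer` makes it wrap the base model with a LoRA adapter, so only the low-rank matrices in the listed attention projections are trained.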
## Model Files
- `adapter_config.json`: LoRA configuration
- `adapter_model.safetensors`: LoRA weights
- `tokenizer.json`: Tokenizer vocabulary
- `tokenizer_config.json`: Tokenizer configuration
- `special_tokens_map.json`: Special tokens mapping
- `chat_template.jinja`: Chat template for conversation format