---
language:
- pt
metrics:
- accuracy
- f1
- pearsonr
base_model:
- Qwen/Qwen2.5-7B
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
license: apache-2.0
---

### Amadeus-Verbo-BI-Qwen2.5-7B-PT-BR-Instruct
#### Introduction
Amadeus-Verbo-BI-Qwen2.5-7B-PT-BR-Instruct is a Brazilian Portuguese language model (PT-BR-LLM) developed from the base model Qwen2.5-7B by fine-tuning for 2 epochs on a 600k-instruction dataset.
Read our article [here](https://arxiv.org/abs/2506.00019).

## Details

- **Architecture:** Transformer-based model with RoPE, SwiGLU, RMSNorm, and attention QKV bias, pre-trained via causal language modeling
- **Number of Parameters:** 7.62B
- **Number of Parameters (Non-Embedding):** 6.53B
- **Number of Layers:** 28
- **Number of Attention Heads (GQA):** 28 for Q and 4 for KV
- **Context Length:** 131,072 tokens
- **Number of Steps:** 78,838
- **Language:** Brazilian Portuguese
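
The GQA layout above (4 KV heads against 28 query heads) directly bounds inference memory. As a rough illustration, the per-token KV-cache footprint can be estimated from the figures above (a sketch; the head dimension of 128 is an assumption derived from Qwen2.5-7B's hidden size of 3584 divided by its 28 query heads):

```python
# Rough per-token KV-cache estimate for this architecture in fp16/bf16.
num_layers = 28      # from the model card
num_kv_heads = 4     # GQA KV heads
head_dim = 128       # assumed: 3584 hidden size / 28 query heads
bytes_per_value = 2  # fp16/bf16

# Both keys and values are cached, hence the factor of 2.
kv_cache_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
print(kv_cache_per_token)  # 57344 bytes, i.e. 56 KiB per token
```

At the full 131,072-token context this works out to 7 GiB of KV cache, which GQA keeps 7x smaller than full multi-head attention with 28 KV heads would require.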

#### Usage

You can use Amadeus-Verbo-BI-Qwen2.5-7B-PT-BR-Instruct with the Hugging Face Transformers library; we advise using the latest version. With `transformers<4.37.0`, you will encounter the following error:

`KeyError: 'qwen2'`
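
A quick guard can fail fast with a clearer message before loading the model (a minimal sketch; it compares plain numeric version strings only, and assumes 4.37.0 is the first release registering the `qwen2` model type, as the error above indicates):

```python
MIN_TRANSFORMERS = "4.37.0"  # first Transformers release with the 'qwen2' model type

def _parse(v: str) -> tuple:
    # Handles plain numeric versions like "4.41.2"; pre-release
    # suffixes such as ".dev0" are not supported by this sketch.
    return tuple(int(p) for p in v.split("."))

def supports_qwen2(installed: str) -> bool:
    # True when the installed Transformers version is recent enough.
    return _parse(installed) >= _parse(MIN_TRANSFORMERS)

print(supports_qwen2("4.36.2"))  # False
print(supports_qwen2("4.41.0"))  # True
```

In practice you would pass `transformers.__version__` to `supports_qwen2` and raise an informative error when it returns `False`.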

#### Quickstart

The following snippets use `pipeline`, `AutoTokenizer`, `AutoModelForCausalLM`, and `apply_chat_template` to show how to load the tokenizer and the model and how to generate text.

Using the pipeline:
```python
from transformers import pipeline

messages = [
    {"role": "user", "content": "Faça uma planilha nutricional para uma alimentação fitness e mediterrânea com todos os dias da semana"},
]
pipe = pipeline("text-generation", model="amadeusai/AV-BI-Qwen2.5-7B-PT-BR-Instruct")
pipe(messages)
```
Or, loading the model and tokenizer directly and applying the chat template:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "amadeusai/AV-BI-Qwen2.5-7B-PT-BR-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Faça uma planilha nutricional para uma alimentação fitness e mediterrânea com todos os dias da semana."
messages = [
    {"role": "system", "content": "Você é um assistente útil."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
Or, using an explicit `GenerationConfig` with `TextGenerationPipeline`:
```python
from transformers import GenerationConfig, TextGenerationPipeline, AutoTokenizer, AutoModelForCausalLM
import torch

# Specify the model and tokenizer
model_id = "amadeusai/AV-BI-Qwen2.5-7B-PT-BR-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Specify the generation parameters as you like
generation_config = GenerationConfig(
    do_sample=True,
    max_new_tokens=512,
    renormalize_logits=True,
    repetition_penalty=1.2,
    temperature=0.1,
    top_k=50,
    top_p=1.0,
    use_cache=True,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator = TextGenerationPipeline(model=model, task="text-generation", tokenizer=tokenizer, device=device)

# Generate text
prompt = "Faça uma planilha nutricional para uma alimentação fitness e mediterrânea com todos os dias da semana"
completion = generator(prompt, generation_config=generation_config)
print(completion[0]['generated_text'])
```

#### Citation

If you find our work helpful, feel free to cite it.
```
@misc{cruzcastañeda2025amadeusverbotechnicalreportpowerful,
      title={Amadeus-Verbo Technical Report: The powerful Qwen2.5 family models trained in Portuguese}, 
      author={William Alberto Cruz-Castañeda and Marcellus Amadeus},
      year={2025},
      eprint={2506.00019},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.00019}, 
}
```