---
language: 
- ht
license: mit
library_name: peft
tags:
- peft
- text2text-generation
- text-generation
base_model: google/mt5-large
---

# PTHQL_language_Haitian_Creole

This is the Haitian Creole (hat_Latn) Phylogenetic Tree Hierarchical QLoRAs (PTHQL) adapter for AMR-to-Text generation from [Generating from AMRs into High and Low-Resource Languages using Phylogenetic Knowledge and Hierarchical QLoRA Training (HQL)](https://aclanthology.org/2024.inlg-main.7/).

# Use

This model is the last of four hierarchical LoRAs. It is strongly advisable to load and merge all four LoRAs in order: `PTHQL_level0_Indo_European`, `PTHQL_level1_Romance`, `PTHQL_level2_Gallo_Romance`, and finally `PTHQL_language_Haitian_Creole`.

The following is minimal code to generate Haitian Creole text from an AMR graph:
```python
import torch
from transformers import MT5ForConditionalGeneration, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
model = MT5ForConditionalGeneration.from_pretrained('google/mt5-large')
tokenizer = AutoTokenizer.from_pretrained('google/mt5-large')

# Merge the four hierarchical adapters in order, from the broadest
# phylogenetic level down to the target language
model = PeftModel.from_pretrained(model, 'WilliamSotoM/PTHQL_level0_Indo_European')
model = model.merge_and_unload()

model = PeftModel.from_pretrained(model, 'WilliamSotoM/PTHQL_level1_Romance')
model = model.merge_and_unload()

model = PeftModel.from_pretrained(model, 'WilliamSotoM/PTHQL_level2_Gallo_Romance')
model = model.merge_and_unload()

model = PeftModel.from_pretrained(model, 'WilliamSotoM/PTHQL_language_Haitian_Creole')
model = model.merge_and_unload()

# Example AMR graph (PENMAN notation)
graph = '''
(c / contrast-01
    :ARG2 (t / thing
        :quant (l2 / lot)
        :ARG0-of (l / look-02
            :ARG1 (d / dinosaur)
            :mod (s / still))
        :topic (b / bird)))
'''
# Tokenize the graph and generate text
tokenized_input = tokenizer(graph, return_tensors='pt')

with torch.inference_mode():
    prediction = model.generate(**tokenized_input)
    generated_text = tokenizer.batch_decode(prediction, skip_special_tokens=True)[0]

print('Generated text:', generated_text)
```
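
Because each `merge_and_unload` call folds the adapter weights into the base model, the final object is a plain `MT5ForConditionalGeneration`. As a minimal sketch (the directory name below is illustrative, not part of the original release), the merged model can be saved once so the four-step merge does not have to be repeated on every load:

```python
# Save the fully merged model and tokenizer for reuse
# (the directory name is illustrative)
save_dir = 'mt5_large_pthql_haitian_creole'
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
```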

Expected output:
```
Men, en ce qui concerne les oiseaux, il y a beaucoup de coses qui toujou semblen desinòxes.
```