Part of the **Taiwan Legal LLMs** collection: models trained for our paper "Continual Pre-Training is (not) What You Need in Domain Adaptation."
This model is a fine-tuned version of lopentu/Llama-3-8B-Taiwan-Llawa-TCxYZL-Instruct on an unknown dataset. It achieves the following results on the evaluation set (final epoch):

- Loss: 0.5434
- Rewards/chosen: -0.3119
- Rewards/rejected: -0.7953
- Rewards/accuracies: 0.8245
- Rewards/margins: 0.4833
- Logps/rejected: -118.7237
- Logps/chosen: -150.8103
- Logits/rejected: -0.1068
- Logits/chosen: -0.0702
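The card ships without an inference example. As a minimal sketch, the model could be queried with transformers as below; the repo id is a placeholder (this card never names the fine-tuned checkpoint itself) and the sample prompt is invented.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: substitute the actual Hugging Face id of this checkpoint.
model_id = "your-org/this-dpo-finetuned-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Chat formatting via the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Summarize the elements of tort liability under Article 184 of the Taiwan Civil Code."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Assuming the checkpoint keeps the base model's Llama-3 chat template, `apply_chat_template` handles the prompt formatting, so no manual templating is needed.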
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
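The hyperparameter values themselves are not listed in the card. For orientation only, a minimal DPO setup with trl's `DPOTrainer` might look like the sketch below; every value is an illustrative placeholder (only the roughly 3 epochs over 228 steps are suggested by the results table), and `preferences.jsonl` is a hypothetical dataset path.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "lopentu/Llama-3-8B-Taiwan-Llawa-TCxYZL-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical preference data with "prompt"/"chosen"/"rejected" columns;
# the dataset actually used for this checkpoint is not reported.
train_dataset = load_dataset("json", data_files="preferences.jsonl", split="train")

# All values below are guesses, not the card's (unlisted) hyperparameters.
config = DPOConfig(
    output_dir="llama3-taiwan-legal-dpo",
    beta=0.1,                       # scale of the KL-anchored implicit reward
    num_train_epochs=3,             # consistent with the ~3 epochs in the table
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    logging_steps=10,
)

# With no ref_model passed, trl snapshots the starting policy as the frozen
# reference model for the DPO loss.
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # older trl versions name this `tokenizer`
)
trainer.train()
```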
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | -39.1966 | -119.6160 | -0.1592 | -0.1378 |
| 0.6468 | 0.9937 | 76 | 0.6482 | -0.0350 | -0.1333 | 0.8596 | 0.0983 | -52.5218 | -123.1127 | -0.2376 | -0.2140 |
| 0.5604 | 1.9873 | 152 | 0.5646 | -0.2442 | -0.6292 | 0.8245 | 0.3850 | -102.1200 | -144.0388 | -0.1309 | -0.0973 |
| 0.5142 | 2.9810 | 228 | 0.5434 | -0.3119 | -0.7953 | 0.8245 | 0.4833 | -118.7237 | -150.8103 | -0.1068 | -0.0702 |
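The column names match the metrics trl's `DPOTrainer` logs, so the table is easiest to read through the DPO objective: Rewards/chosen and Rewards/rejected are the β-scaled policy-to-reference log-probability ratios for the preferred response $y_w$ and dispreferred response $y_l$, and the loss is

$$
\mathcal{L}_{\mathrm{DPO}}
= -\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right).
$$

The step-0 row doubles as a sanity check: before any update $\pi_\theta = \pi_{\mathrm{ref}}$, both rewards are exactly 0, and the loss is $-\log \sigma(0) = \log 2 \approx 0.6931$, which is the value logged. From there the margin (chosen minus rejected reward) widens each epoch, 0.0983 to 0.3850 to 0.4833, while Rewards/accuracies, the fraction of pairs whose chosen reward beats the rejected one, stays in the 0.82-0.86 range.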