---
license: other
base_model: meta-llama/Meta-Llama-3-70B
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: C015_Meta-Llama-3-70B_pretrain_20240509_173017
    results: []
---

# C015_Meta-Llama-3-70B_pretrain_20240509_173017

This model is a fine-tuned version of meta-llama/Meta-Llama-3-70B (loaded from the local path /mnt/fl/models/llama3/Meta-Llama-3-70B) on the C015_data dataset. It achieves the following results on the evaluation set:

- Loss: 2.0062
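
To try the checkpoint, a minimal loading sketch using the standard transformers API is below. The repository id is a placeholder, not this model's actual repository; substitute the real repository or a local checkpoint path.

```python
# Minimal sketch: load and sample from the fine-tuned model.
# NOTE: the repo id below is a placeholder, not this model's actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/C015_Meta-Llama-3-70B_pretrain"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 70B parameters; bf16 halves memory vs fp32
    device_map="auto",           # shard the model across available GPUs
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```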

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 3e-06
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- gradient_accumulation_steps: 4
- total_train_batch_size: 128 (= 1 per device × 32 devices × 4 accumulation steps)
- total_eval_batch_size: 64 (= 2 per device × 32 devices)
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_ratio: 0.075
- num_epochs: 3.0
- mixed_precision_training: Native AMP
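
For reference, here is how the settings above map onto transformers `TrainingArguments`. This is an illustrative sketch under the assumption of a standard Trainer-style setup, not the exact LLaMA-Factory launch configuration used for this run; the output path is a placeholder.

```python
# Illustrative sketch: the hyperparameters above expressed as
# transformers TrainingArguments (not the exact LLaMA-Factory config).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="C015_Meta-Llama-3-70B_pretrain",  # placeholder output path
    learning_rate=3e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,  # 1 x 32 GPUs x 4 steps = 128 effective batch
    num_train_epochs=3.0,
    lr_scheduler_type="polynomial",
    warmup_ratio=0.075,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # "Native AMP" = torch.cuda.amp; assumes fp16 on a CUDA device
)
```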

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.0376        | 0.1558 | 3    | 2.0736          |
| 2.0549        | 0.3117 | 6    | 2.0736          |
| 2.1192        | 0.4675 | 9    | 2.0664          |
| 2.1346        | 0.6234 | 12   | 2.0558          |
| 2.08          | 0.7792 | 15   | 2.0401          |
| 2.0576        | 0.9351 | 18   | 2.0248          |
| 2.0351        | 1.0909 | 21   | 2.0168          |
| 1.9633        | 1.2468 | 24   | 2.0135          |
| 2.0389        | 1.4026 | 27   | 2.0117          |
| 1.9637        | 1.5584 | 30   | 2.0107          |
| 1.9749        | 1.7143 | 33   | 2.0100          |
| 2.0231        | 1.8701 | 36   | 2.0094          |
| 1.9785        | 2.0260 | 39   | 2.0088          |
| 1.9619        | 2.1818 | 42   | 2.0083          |
| 1.9971        | 2.3377 | 45   | 2.0078          |
| 1.9992        | 2.4935 | 48   | 2.0075          |
| 1.9912        | 2.6494 | 51   | 2.0071          |
| 1.9894        | 2.8052 | 54   | 2.0067          |
| 2.0143        | 2.9610 | 57   | 2.0062          |
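
Since these are standard causal-LM cross-entropy losses, each value can be read as a perplexity via ppl = exp(loss). A minimal sketch of the conversion:

```python
# Convert a few of the validation losses above into perplexities (ppl = exp(loss)).
import math

val_losses = {3: 2.0736, 30: 2.0107, 57: 2.0062}  # step -> validation loss
for step, loss in val_losses.items():
    print(f"step {step:>2}: loss = {loss:.4f}, ppl = {math.exp(loss):.2f}")
# Final checkpoint: exp(2.0062) ≈ 7.44
```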

### Framework versions

- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1