End of training
- README.md +1 -1
- all_results.json +13 -0
- eval_results.json +8 -0
- train_results.json +9 -0
- trainer_state.json +0 -0
- training_eval_loss.png +0 -0
- training_loss.png +0 -0
    	
README.md CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # train_svamp_101112_1760638004
 
-This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on 
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the svamp dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.2142
 - Num Input Tokens Seen: 1430592
    	
all_results.json ADDED
@@ -0,0 +1,13 @@
+{
+    "epoch": 20.0,
+    "eval_loss": 0.214238703250885,
+    "eval_runtime": 1.0987,
+    "eval_samples_per_second": 63.714,
+    "eval_steps_per_second": 16.384,
+    "num_input_tokens_seen": 1430592,
+    "total_flos": 6.442059877318656e+16,
+    "train_loss": 0.39732415456136194,
+    "train_runtime": 576.3251,
+    "train_samples_per_second": 21.863,
+    "train_steps_per_second": 5.483
+}
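The throughput and runtime fields in all_results.json can be multiplied back into approximate totals. The following is a minimal sketch, not part of the commit: the function name `derived_metrics` and its output keys are hypothetical, the perplexity line assumes `eval_loss` is a mean per-token cross-entropy in nats, and the tokens-per-second figure assumes `num_input_tokens_seen` was accumulated over the training run only.

```python
import math

# Values copied verbatim from all_results.json in this commit.
results = {
    "epoch": 20.0,
    "eval_loss": 0.214238703250885,
    "eval_runtime": 1.0987,
    "eval_samples_per_second": 63.714,
    "eval_steps_per_second": 16.384,
    "num_input_tokens_seen": 1430592,
    "total_flos": 6.442059877318656e+16,
    "train_loss": 0.39732415456136194,
    "train_runtime": 576.3251,
    "train_samples_per_second": 21.863,
    "train_steps_per_second": 5.483,
}


def derived_metrics(r):
    """Recover approximate totals from rate * runtime (hypothetical helper)."""
    train_samples = r["train_samples_per_second"] * r["train_runtime"]
    return {
        # Rates are rounded in the JSON, so the products are only approximate.
        "approx_train_steps": round(r["train_steps_per_second"] * r["train_runtime"]),
        "approx_train_samples_processed": round(train_samples),
        "approx_samples_per_epoch": round(train_samples / r["epoch"]),
        "approx_eval_samples": round(r["eval_samples_per_second"] * r["eval_runtime"]),
        # Assumption: eval_loss is a mean per-token cross-entropy in nats.
        "eval_perplexity": math.exp(r["eval_loss"]),
        # Assumption: the token counter covers the training run only.
        "train_tokens_per_second": r["num_input_tokens_seen"] / r["train_runtime"],
    }


if __name__ == "__main__":
    for name, value in derived_metrics(results).items():
        print(f"{name}: {value}")
```

Under those assumptions the run works out to roughly 3160 optimizer steps over about 630 training samples per epoch, with about 70 evaluation samples. These are back-of-the-envelope figures; trainer_state.json would be the place to confirm exact step counts.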
    	
eval_results.json ADDED
@@ -0,0 +1,8 @@
+{
+    "epoch": 20.0,
+    "eval_loss": 0.214238703250885,
+    "eval_runtime": 1.0987,
+    "eval_samples_per_second": 63.714,
+    "eval_steps_per_second": 16.384,
+    "num_input_tokens_seen": 1430592
+}
    	
train_results.json ADDED
@@ -0,0 +1,9 @@
+{
+    "epoch": 20.0,
+    "num_input_tokens_seen": 1430592,
+    "total_flos": 6.442059877318656e+16,
+    "train_loss": 0.39732415456136194,
+    "train_runtime": 576.3251,
+    "train_samples_per_second": 21.863,
+    "train_steps_per_second": 5.483
+}
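The values in this commit are consistent with all_results.json being the key-wise union of train_results.json and eval_results.json, with the shared keys (`epoch`, `num_input_tokens_seen`) agreeing. The sketch below checks that relationship. The helper names `check_merged` and `check_from_dir` are hypothetical, and it assumes the three files sit together in one local directory; it does not claim to reproduce how the training script wrote them.

```python
import json
from pathlib import Path


def check_merged(all_results, train_results, eval_results):
    """Return (is_union, conflicting_keys) for the three metric dicts."""
    shared = train_results.keys() & eval_results.keys()
    # Keys present in both files must carry the same value.
    conflicts = sorted(k for k in shared if train_results[k] != eval_results[k])
    merged = {**train_results, **eval_results}
    return merged == all_results, conflicts


def check_from_dir(directory):
    """Load the three JSON files from `directory` (assumed layout) and compare."""
    base = Path(directory)
    load = lambda name: json.loads((base / name).read_text())
    return check_merged(
        load("all_results.json"),
        load("train_results.json"),
        load("eval_results.json"),
    )

# Example usage, assuming the repository is checked out in the current directory:
#   ok, conflicts = check_from_dir(".")
```

Dictionary equality in Python ignores key order, so the check passes even though all_results.json lists its keys alphabetically rather than in merge order.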
    	
trainer_state.json ADDED
The diff for this file is too large to render.
    	
training_eval_loss.png ADDED

training_loss.png ADDED
