End of training
- README.md +2 -2
- all_results.json +13 -0
- eval_results.json +8 -0
- train_results.json +9 -0
- trainer_state.json +0 -0
- training_eval_loss.png +0 -0
- training_loss.png +0 -0
    	
        README.md
    CHANGED
    
    @@ -17,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
     
     # train_svamp_101112_1760638001
     
    -This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on 
    +This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the svamp dataset.
     It achieves the following results on the evaluation set:
    -- Loss: 0.
    +- Loss: 0.2259
     - Num Input Tokens Seen: 1430592
     
     ## Model description
    	
        all_results.json
    ADDED
    
    @@ -0,0 +1,13 @@
    +{
    +    "epoch": 20.0,
    +    "eval_loss": 0.2259398251771927,
    +    "eval_runtime": 1.6777,
    +    "eval_samples_per_second": 41.723,
    +    "eval_steps_per_second": 10.729,
    +    "num_input_tokens_seen": 1430592,
    +    "total_flos": 6.441891117819494e+16,
    +    "train_loss": 0.06765125470246627,
    +    "train_runtime": 660.6029,
    +    "train_samples_per_second": 19.073,
    +    "train_steps_per_second": 4.784
    +}
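The two losses in `all_results.json` are easier to compare once converted to perplexities. The following is a minimal sketch, assuming both values are mean per-token cross-entropy in nats (the diff does not state this). The `summarize` helper is hypothetical and not part of the commit; the numbers are copied from the diff. If these files come from the Hugging Face `Trainer`, `train_loss` is typically an average over the whole run rather than the final-step loss, so the gap is only a rough indicator.

```python
import json
import math

# Values copied from the all_results.json diff; only the keys used here are kept.
ALL_RESULTS = json.loads("""
{
    "epoch": 20.0,
    "eval_loss": 0.2259398251771927,
    "train_loss": 0.06765125470246627
}
""")


def summarize(results: dict) -> dict:
    """Return perplexities and the eval/train loss gap.

    Assumption: both losses are mean per-token cross-entropy in nats,
    so perplexity is exp(loss).
    """
    train_loss = results["train_loss"]
    eval_loss = results["eval_loss"]
    return {
        "train_ppl": math.exp(train_loss),
        "eval_ppl": math.exp(eval_loss),
        "loss_gap": eval_loss - train_loss,
        "loss_ratio": eval_loss / train_loss,
    }


summary = summarize(ALL_RESULTS)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

An eval loss more than three times the train loss after 20 epochs may point to overfitting, but the per-step history in `trainer_state.json` would be needed to confirm that.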
    	
        eval_results.json
    ADDED
    
    @@ -0,0 +1,8 @@
    +{
    +    "epoch": 20.0,
    +    "eval_loss": 0.2259398251771927,
    +    "eval_runtime": 1.6777,
    +    "eval_samples_per_second": 41.723,
    +    "eval_steps_per_second": 10.729,
    +    "num_input_tokens_seen": 1430592
    +}
    	
        train_results.json
    ADDED
    
    @@ -0,0 +1,9 @@
    +{
    +    "epoch": 20.0,
    +    "num_input_tokens_seen": 1430592,
    +    "total_flos": 6.441891117819494e+16,
    +    "train_loss": 0.06765125470246627,
    +    "train_runtime": 660.6029,
    +    "train_samples_per_second": 19.073,
    +    "train_steps_per_second": 4.784
    +}
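The throughput fields in `train_results.json` can be multiplied back out to estimate how many steps and samples the run covered. This is a rough sketch: the `derive_counts` helper is hypothetical, the reported rates are rounded, and it assumes `train_runtime` is in seconds and that the rates were computed over that same interval.

```python
# Values copied from the train_results.json diff.
TRAIN_RESULTS = {
    "epoch": 20.0,
    "train_runtime": 660.6029,           # assumed to be seconds
    "train_samples_per_second": 19.073,
    "train_steps_per_second": 4.784,
}


def derive_counts(results: dict) -> dict:
    """Estimate step and sample totals from runtime and throughput.

    The inputs are rounded in the source file, so the totals are
    rounded to the nearest integer and should be read as estimates.
    """
    runtime = results["train_runtime"]
    total_steps = round(runtime * results["train_steps_per_second"])
    total_samples = round(runtime * results["train_samples_per_second"])
    epochs = results["epoch"]
    return {
        "total_steps": total_steps,
        "total_samples": total_samples,
        "steps_per_epoch": total_steps / epochs,
        "samples_per_epoch": total_samples / epochs,
        "samples_per_step": total_samples / total_steps,
    }


counts = derive_counts(TRAIN_RESULTS)
for name, value in counts.items():
    print(f"{name}: {value}")
```

The ratio of samples to steps comes out just under 4, which is consistent with an effective batch size of 4, although the commit itself does not state the batch size.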
    	
        trainer_state.json
    ADDED
    
    The diff for this file is too large to render.
    	
        training_eval_loss.png
    ADDED
    
    	
        training_loss.png
    ADDED
    
