End of training

Browse files
- README.md +21 -21
- adapter_model.bin +1 -1
README.md CHANGED

@@ -22,8 +22,8 @@ base_model: mistralai/Mistral-7B-v0.1
 model_type: MistralForCausalLM
 tokenizer_type: LlamaTokenizer
 
-load_in_8bit:
-load_in_4bit:
+load_in_8bit: true
+load_in_4bit: false
 strict: false
 
 data_seed: 42

@@ -38,7 +38,7 @@ output_dir: ./outputs/mistral/lora-out-templatefree
 hub_model_id: strickvl/isafpr-mistral-lora-templatefree
 
 
-sequence_len:
+sequence_len: 2048
 sample_packing: true
 pad_to_sequence_len: true
 

@@ -110,7 +110,7 @@ special_tokens:
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.0288
 
 ## Model description
 

@@ -147,23 +147,23 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 1.5339 | 0.0131 | 1 | 1.5408 |
+| 0.0671 | 0.2492 | 19 | 0.0549 |
+| 0.037 | 0.4984 | 38 | 0.0406 |
+| 0.0424 | 0.7475 | 57 | 0.0361 |
+| 0.035 | 0.9967 | 76 | 0.0351 |
+| 0.0322 | 1.2295 | 95 | 0.0336 |
+| 0.0247 | 1.4787 | 114 | 0.0314 |
+| 0.0229 | 1.7279 | 133 | 0.0313 |
+| 0.0241 | 1.9770 | 152 | 0.0299 |
+| 0.0222 | 2.2098 | 171 | 0.0307 |
+| 0.0183 | 2.4590 | 190 | 0.0296 |
+| 0.0205 | 2.7082 | 209 | 0.0291 |
+| 0.0153 | 2.9574 | 228 | 0.0281 |
+| 0.0162 | 3.1902 | 247 | 0.0286 |
+| 0.0126 | 3.4393 | 266 | 0.0290 |
+| 0.0147 | 3.6885 | 285 | 0.0287 |
+| 0.0157 | 3.9377 | 304 | 0.0288 |
 
 
 ### Framework versions
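As a quick sanity check on the updated model card, the validation-loss column of the table can be re-read directly: the reported card loss (0.0288) is the *final* eval loss at step 304, while the *best* checkpoint by eval loss is slightly lower, at step 228. A minimal sketch (the pairs below are copied from the table above):

```python
# (step, validation_loss) pairs taken from the training table in the README
eval_history = [
    (1, 1.5408), (19, 0.0549), (38, 0.0406), (57, 0.0361),
    (76, 0.0351), (95, 0.0336), (114, 0.0314), (133, 0.0313),
    (152, 0.0299), (171, 0.0307), (190, 0.0296), (209, 0.0291),
    (228, 0.0281), (247, 0.0286), (266, 0.0290), (285, 0.0287),
    (304, 0.0288),
]

# The card reports the loss of the last evaluation, not the minimum
final_step, final_loss = eval_history[-1]
best_step, best_loss = min(eval_history, key=lambda p: p[1])

print(final_loss)            # 0.0288 — the "Loss" value in the card
print(best_step, best_loss)  # 228 0.0281 — lowest eval loss seen
```

This is only a reading aid for the table; it makes no claim about which checkpoint was actually pushed.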
|
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:1a9f2c96b8754c87ccd86910df8f4514f74e07548f3f863e6c1c99422fe65200
 size 335706186
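The adapter_model.bin entry is not the weights themselves but a Git LFS pointer file (spec v1: `version`, `oid`, `size` keys, one per line). As an illustration, such a pointer can be parsed with a few lines of Python; the pointer text below is the new side of the diff above:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1a9f2c96b8754c87ccd86910df8f4514f74e07548f3f863e6c1c99422fe65200
size 335706186
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 335706186 — adapter size in bytes (~335 MB)
print(info["oid"])   # sha256:<digest> of the actual adapter blob
```

The changed sha256 digest is what marks the adapter weights as updated in this commit; the byte size is unchanged, as expected for a retrained adapter of identical architecture.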