SetFit with NovaSearch/stella_en_400M_v5
This is a SetFit model for text classification. It uses NovaSearch/stella_en_400M_v5 as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
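The contrastive step above trains on pairs of texts rather than on single labeled examples. The following is a minimal sketch of how such pairs could be built from a few labeled samples. The function name `make_contrastive_pairs` and the sample texts are hypothetical, and the exhaustive enumeration is a simplification: SetFit samples pairs according to its sampling strategy rather than listing every combination.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled examples.

    Pairs with the same label get target 1.0 (should be similar),
    pairs with different labels get target 0.0 (should be dissimilar).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical few-shot examples using two of this model's label names.
texts = ["Add 2 and 3.", "What is 7 times 8?", "Write a poem.", "Draft a short story."]
labels = ["Math", "Math", "Generation", "Generation"]

pairs = make_contrastive_pairs(texts, labels)
positive_pairs = [p for p in pairs if p[2] == 1.0]
negative_pairs = [p for p in pairs if p[2] == 0.0]
```

Even four labeled texts yield six training pairs, which is why this approach can work with few labeled examples per class.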
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: NovaSearch/stella_en_400M_v5
- Classification head: a SetFitHead instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 7
Model Sources
- Repository: SetFit on GitHub
- Paper: Efficient Few-Shot Learning Without Prompts
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts
Model Labels
| Label |
|---|
| Extraction |
| Math |
| Brainstorming |
| Factual QA |
| Generation |
| Coding |
| Reasoning |
Uses
Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")

# Run inference
preds = model("What is an SRE? Use only Korean in your response and provide a title wrapped in double angular brackets, such as <<SRE>>. Use the keywords 'indicator', 'objective' and 'management'.")
```
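A SetFit model can also be given a list of texts, in which case it returns one prediction per input. The sketch below groups a batch of prompts by predicted label. The helper `group_by_prediction` and the stand-in `keyword_stub` are hypothetical and exist only so the example runs without downloading the model; with the real model you would pass `model.predict` instead, assuming it returns string labels.

```python
from collections import defaultdict


def group_by_prediction(predict, texts):
    """Call a batch predict function and group the input texts by predicted label."""
    texts = list(texts)
    predictions = predict(texts)
    grouped = defaultdict(list)
    for text, label in zip(texts, predictions):
        grouped[str(label)].append(text)
    return dict(grouped)


def keyword_stub(texts):
    """Hypothetical stand-in for model.predict: NOT the real classifier."""
    return ["Math" if any(ch.isdigit() for ch in text) else "Generation" for text in texts]


grouped = group_by_prediction(
    keyword_stub,
    ["What is 12 * 7?", "Write a haiku about autumn."],
)
```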
Training Details
Training Set Metrics
| Training set | Min | Median | Max |
|---|---|---|---|
| Word count | 3 | 110.8976 | 8430 |
| Label | Training Sample Count |
|---|---|
| Brainstorming | 250 |
| Coding | 253 |
| Extraction | 250 |
| Factual QA | 255 |
| Generation | 250 |
| Math | 250 |
| Reasoning | 250 |
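As a quick check on class balance, the per-label counts in the table above can be summed and converted to shares. This is a small illustrative calculation using only the numbers listed; the variable names are arbitrary.

```python
# Training sample counts copied from the table above.
sample_counts = {
    "Brainstorming": 250,
    "Coding": 253,
    "Extraction": 250,
    "Factual QA": 255,
    "Generation": 250,
    "Math": 250,
    "Reasoning": 250,
}

total_samples = sum(sample_counts.values())

# Fraction of the training set taken up by each label.
label_shares = {label: count / total_samples for label, count in sample_counts.items()}

# Gap between the largest and smallest class.
imbalance = max(sample_counts.values()) - min(sample_counts.values())
```

The classes are close to uniform: the largest and smallest differ by only five samples.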
Training Hyperparameters
- batch_size: (16, 2)
- num_epochs: (1, 15)
- max_steps: 500
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.0001
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: True
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- run_name: stella_en_400M_v5
- evaluation_strategy: no
- eval_max_steps: -1
- load_best_model_at_end: False
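Several of the hyperparameters above are tuples. In SetFit, a tuple value gives one setting for the embedding fine-tuning phase and one for the classifier training phase, in that order. The sketch below unpacks those tuples from a plain dictionary and derives the number of warmup steps, assuming warmup is taken as a fraction of `max_steps`. The helper `split_phases` is hypothetical and is not part of the SetFit API.

```python
# Tuple-valued and related settings copied from the list above.
hyperparameters = {
    "batch_size": (16, 2),
    "num_epochs": (1, 15),
    "body_learning_rate": (2e-05, 1e-05),
    "max_steps": 500,
    "warmup_proportion": 0.1,
}


def split_phases(params):
    """Return {name: {"embedding": x, "classifier": y}} for tuple-valued settings."""
    phases = {}
    for name, value in params.items():
        if isinstance(value, tuple):
            embedding_value, classifier_value = value
            phases[name] = {"embedding": embedding_value, "classifier": classifier_value}
    return phases


phases = split_phases(hyperparameters)

# Assumption: warmup covers warmup_proportion of the capped step count.
warmup_steps = round(hyperparameters["max_steps"] * hyperparameters["warmup_proportion"])
```

Under these settings the embedding phase uses a batch size of 16 for 1 epoch, while the classifier phase uses a batch size of 2 for 15 epochs.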
Training Results
| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.002 | 1 | 0.2869 | - |
| 0.004 | 2 | 0.1469 | - |
| 0.006 | 3 | 0.2431 | - |
| 0.008 | 4 | 0.3568 | - |
| 0.01 | 5 | 0.2769 | - |
| 0.012 | 6 | 0.2425 | - |
| 0.014 | 7 | 0.2001 | - |
| 0.016 | 8 | 0.2825 | - |
| 0.018 | 9 | 0.2433 | - |
| 0.02 | 10 | 0.3096 | - |
| 0.022 | 11 | 0.2856 | - |
| 0.024 | 12 | 0.265 | - |
| 0.026 | 13 | 0.2476 | - |
| 0.028 | 14 | 0.1764 | - |
| 0.03 | 15 | 0.1491 | - |
| 0.032 | 16 | 0.3051 | - |
| 0.034 | 17 | 0.2445 | - |
| 0.036 | 18 | 0.249 | - |
| 0.038 | 19 | 0.1981 | - |
| 0.04 | 20 | 0.1892 | - |
| 0.042 | 21 | 0.1933 | - |
| 0.044 | 22 | 0.2331 | - |
| 0.046 | 23 | 0.2145 | - |
| 0.048 | 24 | 0.1708 | - |
| 0.05 | 25 | 0.2272 | - |
| 0.052 | 26 | 0.1714 | - |
| 0.054 | 27 | 0.2138 | - |
| 0.056 | 28 | 0.2178 | - |
| 0.058 | 29 | 0.1346 | - |
| 0.06 | 30 | 0.1939 | - |
| 0.062 | 31 | 0.1632 | - |
| 0.064 | 32 | 0.1934 | - |
| 0.066 | 33 | 0.1897 | - |
| 0.068 | 34 | 0.1558 | - |
| 0.07 | 35 | 0.1568 | - |
| 0.072 | 36 | 0.1116 | - |
| 0.074 | 37 | 0.1609 | - |
| 0.076 | 38 | 0.1294 | - |
| 0.078 | 39 | 0.1511 | - |
| 0.08 | 40 | 0.1654 | - |
| 0.082 | 41 | 0.1542 | - |
| 0.084 | 42 | 0.0887 | - |
| 0.086 | 43 | 0.0811 | - |
| 0.088 | 44 | 0.0991 | - |
| 0.09 | 45 | 0.0845 | - |
| 0.092 | 46 | 0.0875 | - |
| 0.094 | 47 | 0.0338 | - |
| 0.096 | 48 | 0.0945 | - |
| 0.098 | 49 | 0.0477 | - |
| 0.1 | 50 | 0.0696 | - |
| 0.102 | 51 | 0.136 | - |
| 0.104 | 52 | 0.099 | - |
| 0.106 | 53 | 0.0371 | - |
| 0.108 | 54 | 0.0513 | - |
| 0.11 | 55 | 0.0484 | - |
| 0.112 | 56 | 0.0194 | - |
| 0.114 | 57 | 0.0601 | - |
| 0.116 | 58 | 0.1149 | - |
| 0.118 | 59 | 0.0836 | - |
| 0.12 | 60 | 0.0865 | - |
| 0.122 | 61 | 0.0659 | - |
| 0.124 | 62 | 0.0849 | - |
| 0.126 | 63 | 0.0963 | - |
| 0.128 | 64 | 0.07 | - |
| 0.13 | 65 | 0.0233 | - |
| 0.132 | 66 | 0.1248 | - |
| 0.134 | 67 | 0.0561 | - |
| 0.136 | 68 | 0.0851 | - |
| 0.138 | 69 | 0.0638 | - |
| 0.14 | 70 | 0.0498 | - |
| 0.142 | 71 | 0.0311 | - |
| 0.144 | 72 | 0.1374 | - |
| 0.146 | 73 | 0.0502 | - |
| 0.148 | 74 | 0.0605 | - |
| 0.15 | 75 | 0.0137 | - |
| 0.152 | 76 | 0.065 | - |
| 0.154 | 77 | 0.0846 | - |
| 0.156 | 78 | 0.0347 | - |
| 0.158 | 79 | 0.0517 | - |
| 0.16 | 80 | 0.1447 | - |
| 0.162 | 81 | 0.0609 | - |
| 0.164 | 82 | 0.1423 | - |
| 0.166 | 83 | 0.0917 | - |
| 0.168 | 84 | 0.226 | - |
| 0.17 | 85 | 0.0595 | - |
| 0.172 | 86 | 0.0588 | - |
| 0.174 | 87 | 0.0228 | - |
| 0.176 | 88 | 0.0925 | - |
| 0.178 | 89 | 0.0595 | - |
| 0.18 | 90 | 0.044 | - |
| 0.182 | 91 | 0.0244 | - |
| 0.184 | 92 | 0.0939 | - |
| 0.186 | 93 | 0.0794 | - |
| 0.188 | 94 | 0.0501 | - |
| 0.19 | 95 | 0.1363 | - |
| 0.192 | 96 | 0.0502 | - |
| 0.194 | 97 | 0.0498 | - |
| 0.196 | 98 | 0.0562 | - |
| 0.198 | 99 | 0.0657 | - |
| 0.2 | 100 | 0.0397 | - |
| 0.202 | 101 | 0.0305 | - |
| 0.204 | 102 | 0.0559 | - |
| 0.206 | 103 | 0.0871 | - |
| 0.208 | 104 | 0.063 | - |
| 0.21 | 105 | 0.0143 | - |
| 0.212 | 106 | 0.0706 | - |
| 0.214 | 107 | 0.0627 | - |
| 0.216 | 108 | 0.1047 | - |
| 0.218 | 109 | 0.0487 | - |
| 0.22 | 110 | 0.0086 | - |
| 0.222 | 111 | 0.0562 | - |
| 0.224 | 112 | 0.0101 | - |
| 0.226 | 113 | 0.0235 | - |
| 0.228 | 114 | 0.0511 | - |
| 0.23 | 115 | 0.0295 | - |
| 0.232 | 116 | 0.0549 | - |
| 0.234 | 117 | 0.0554 | - |
| 0.236 | 118 | 0.0301 | - |
| 0.238 | 119 | 0.0152 | - |
| 0.24 | 120 | 0.0234 | - |
| 0.242 | 121 | 0.01 | - |
| 0.244 | 122 | 0.0372 | - |
| 0.246 | 123 | 0.0085 | - |
| 0.248 | 124 | 0.0205 | - |
| 0.25 | 125 | 0.0117 | - |
| 0.252 | 126 | 0.0039 | - |
| 0.254 | 127 | 0.0178 | - |
| 0.256 | 128 | 0.0276 | - |
| 0.258 | 129 | 0.0592 | - |
| 0.26 | 130 | 0.0143 | - |
| 0.262 | 131 | 0.0667 | - |
| 0.264 | 132 | 0.0059 | - |
| 0.266 | 133 | 0.0767 | - |
| 0.268 | 134 | 0.0088 | - |
| 0.27 | 135 | 0.0034 | - |
| 0.272 | 136 | 0.0031 | - |
| 0.274 | 137 | 0.0151 | - |
| 0.276 | 138 | 0.0072 | - |
| 0.278 | 139 | 0.0033 | - |
| 0.28 | 140 | 0.0188 | - |
| 0.282 | 141 | 0.0069 | - |
| 0.284 | 142 | 0.1552 | - |
| 0.286 | 143 | 0.0618 | - |
| 0.288 | 144 | 0.0043 | - |
| 0.29 | 145 | 0.0209 | - |
| 0.292 | 146 | 0.0094 | - |
| 0.294 | 147 | 0.0191 | - |
| 0.296 | 148 | 0.0119 | - |
| 0.298 | 149 | 0.0012 | - |
| 0.3 | 150 | 0.0014 | - |
| 0.302 | 151 | 0.0121 | - |
| 0.304 | 152 | 0.0018 | - |
| 0.306 | 153 | 0.0792 | - |
| 0.308 | 154 | 0.0027 | - |
| 0.31 | 155 | 0.0035 | - |
| 0.312 | 156 | 0.0009 | - |
| 0.314 | 157 | 0.0014 | - |
| 0.316 | 158 | 0.0068 | - |
| 0.318 | 159 | 0.0025 | - |
| 0.32 | 160 | 0.003 | - |
| 0.322 | 161 | 0.0116 | - |
| 0.324 | 162 | 0.0009 | - |
| 0.326 | 163 | 0.0404 | - |
| 0.328 | 164 | 0.0022 | - |
| 0.33 | 165 | 0.0011 | - |
| 0.332 | 166 | 0.0122 | - |
| 0.334 | 167 | 0.0006 | - |
| 0.336 | 168 | 0.0138 | - |
| 0.338 | 169 | 0.0101 | - |
| 0.34 | 170 | 0.0019 | - |
| 0.342 | 171 | 0.0033 | - |
| 0.344 | 172 | 0.0035 | - |
| 0.346 | 173 | 0.007 | - |
| 0.348 | 174 | 0.0008 | - |
| 0.35 | 175 | 0.002 | - |
| 0.352 | 176 | 0.0006 | - |
| 0.354 | 177 | 0.001 | - |
| 0.356 | 178 | 0.0011 | - |
| 0.358 | 179 | 0.0057 | - |
| 0.36 | 180 | 0.0003 | - |
| 0.362 | 181 | 0.001 | - |
| 0.364 | 182 | 0.0007 | - |
| 0.366 | 183 | 0.0016 | - |
| 0.368 | 184 | 0.0018 | - |
| 0.37 | 185 | 0.001 | - |
| 0.372 | 186 | 0.0009 | - |
| 0.374 | 187 | 0.0057 | - |
| 0.376 | 188 | 0.0008 | - |
| 0.378 | 189 | 0.0182 | - |
| 0.38 | 190 | 0.0005 | - |
| 0.382 | 191 | 0.053 | - |
| 0.384 | 192 | 0.0012 | - |
| 0.386 | 193 | 0.0158 | - |
| 0.388 | 194 | 0.0043 | - |
| 0.39 | 195 | 0.0074 | - |
| 0.392 | 196 | 0.0013 | - |
| 0.394 | 197 | 0.0016 | - |
| 0.396 | 198 | 0.0021 | - |
| 0.398 | 199 | 0.0007 | - |
| 0.4 | 200 | 0.002 | - |
| 0.402 | 201 | 0.0004 | - |
| 0.404 | 202 | 0.0008 | - |
| 0.406 | 203 | 0.0002 | - |
| 0.408 | 204 | 0.0026 | - |
| 0.41 | 205 | 0.0012 | - |
| 0.412 | 206 | 0.0004 | - |
| 0.414 | 207 | 0.0017 | - |
| 0.416 | 208 | 0.0038 | - |
| 0.418 | 209 | 0.0008 | - |
| 0.42 | 210 | 0.0008 | - |
| 0.422 | 211 | 0.0007 | - |
| 0.424 | 212 | 0.0577 | - |
| 0.426 | 213 | 0.0013 | - |
| 0.428 | 214 | 0.0005 | - |
| 0.43 | 215 | 0.0015 | - |
| 0.432 | 216 | 0.0006 | - |
| 0.434 | 217 | 0.0005 | - |
| 0.436 | 218 | 0.0017 | - |
| 0.438 | 219 | 0.001 | - |
| 0.44 | 220 | 0.0002 | - |
| 0.442 | 221 | 0.0005 | - |
| 0.444 | 222 | 0.003 | - |
| 0.446 | 223 | 0.0007 | - |
| 0.448 | 224 | 0.0002 | - |
| 0.45 | 225 | 0.001 | - |
| 0.452 | 226 | 0.0006 | - |
| 0.454 | 227 | 0.001 | - |
| 0.456 | 228 | 0.0506 | - |
| 0.458 | 229 | 0.0005 | - |
| 0.46 | 230 | 0.0009 | - |
| 0.462 | 231 | 0.0015 | - |
| 0.464 | 232 | 0.0003 | - |
| 0.466 | 233 | 0.0004 | - |
| 0.468 | 234 | 0.001 | - |
| 0.47 | 235 | 0.0004 | - |
| 0.472 | 236 | 0.0007 | - |
| 0.474 | 237 | 0.0014 | - |
| 0.476 | 238 | 0.0003 | - |
| 0.478 | 239 | 0.0004 | - |
| 0.48 | 240 | 0.0007 | - |
| 0.482 | 241 | 0.0002 | - |
| 0.484 | 242 | 0.0006 | - |
| 0.486 | 243 | 0.0003 | - |
| 0.488 | 244 | 0.0004 | - |
| 0.49 | 245 | 0.0587 | - |
| 0.492 | 246 | 0.0003 | - |
| 0.494 | 247 | 0.0007 | - |
| 0.496 | 248 | 0.0013 | - |
| 0.498 | 249 | 0.0507 | - |
| 0.5 | 250 | 0.0002 | - |
| 0.502 | 251 | 0.0004 | - |
| 0.504 | 252 | 0.0003 | - |
| 0.506 | 253 | 0.0004 | - |
| 0.508 | 254 | 0.0002 | - |
| 0.51 | 255 | 0.0003 | - |
| 0.512 | 256 | 0.0096 | - |
| 0.514 | 257 | 0.0002 | - |
| 0.516 | 258 | 0.0003 | - |
| 0.518 | 259 | 0.0003 | - |
| 0.52 | 260 | 0.0013 | - |
| 0.522 | 261 | 0.0004 | - |
| 0.524 | 262 | 0.0004 | - |
| 0.526 | 263 | 0.0007 | - |
| 0.528 | 264 | 0.0006 | - |
| 0.53 | 265 | 0.0003 | - |
| 0.532 | 266 | 0.0023 | - |
| 0.534 | 267 | 0.0008 | - |
| 0.536 | 268 | 0.0002 | - |
| 0.538 | 269 | 0.0018 | - |
| 0.54 | 270 | 0.0002 | - |
| 0.542 | 271 | 0.0007 | - |
| 0.544 | 272 | 0.0001 | - |
| 0.546 | 273 | 0.0004 | - |
| 0.548 | 274 | 0.0618 | - |
| 0.55 | 275 | 0.0192 | - |
| 0.552 | 276 | 0.0009 | - |
| 0.554 | 277 | 0.0142 | - |
| 0.556 | 278 | 0.0014 | - |
| 0.558 | 279 | 0.0006 | - |
| 0.56 | 280 | 0.0565 | - |
| 0.562 | 281 | 0.0006 | - |
| 0.564 | 282 | 0.0233 | - |
| 0.566 | 283 | 0.0004 | - |
| 0.568 | 284 | 0.0116 | - |
| 0.57 | 285 | 0.0002 | - |
| 0.572 | 286 | 0.0032 | - |
| 0.574 | 287 | 0.0001 | - |
| 0.576 | 288 | 0.0003 | - |
| 0.578 | 289 | 0.0004 | - |
| 0.58 | 290 | 0.0003 | - |
| 0.582 | 291 | 0.0003 | - |
| 0.584 | 292 | 0.0003 | - |
| 0.586 | 293 | 0.0012 | - |
| 0.588 | 294 | 0.0021 | - |
| 0.59 | 295 | 0.0002 | - |
| 0.592 | 296 | 0.0003 | - |
| 0.594 | 297 | 0.0022 | - |
| 0.596 | 298 | 0.0005 | - |
| 0.598 | 299 | 0.0005 | - |
| 0.6 | 300 | 0.0024 | - |
| 0.602 | 301 | 0.0008 | - |
| 0.604 | 302 | 0.0003 | - |
| 0.606 | 303 | 0.0022 | - |
| 0.608 | 304 | 0.0069 | - |
| 0.61 | 305 | 0.0009 | - |
| 0.612 | 306 | 0.0144 | - |
| 0.614 | 307 | 0.0004 | - |
| 0.616 | 308 | 0.0006 | - |
| 0.618 | 309 | 0.0006 | - |
| 0.62 | 310 | 0.0261 | - |
| 0.622 | 311 | 0.0002 | - |
| 0.624 | 312 | 0.0003 | - |
| 0.626 | 313 | 0.0003 | - |
| 0.628 | 314 | 0.0007 | - |
| 0.63 | 315 | 0.0603 | - |
| 0.632 | 316 | 0.0002 | - |
| 0.634 | 317 | 0.0003 | - |
| 0.636 | 318 | 0.0007 | - |
| 0.638 | 319 | 0.0006 | - |
| 0.64 | 320 | 0.0002 | - |
| 0.642 | 321 | 0.0016 | - |
| 0.644 | 322 | 0.0003 | - |
| 0.646 | 323 | 0.0003 | - |
| 0.648 | 324 | 0.0002 | - |
| 0.65 | 325 | 0.0006 | - |
| 0.652 | 326 | 0.0006 | - |
| 0.654 | 327 | 0.0006 | - |
| 0.656 | 328 | 0.0002 | - |
| 0.658 | 329 | 0.0004 | - |
| 0.66 | 330 | 0.0002 | - |
| 0.662 | 331 | 0.0002 | - |
| 0.664 | 332 | 0.0001 | - |
| 0.666 | 333 | 0.0466 | - |
| 0.668 | 334 | 0.0002 | - |
| 0.67 | 335 | 0.0003 | - |
| 0.672 | 336 | 0.0005 | - |
| 0.674 | 337 | 0.0013 | - |
| 0.676 | 338 | 0.0002 | - |
| 0.678 | 339 | 0.0004 | - |
| 0.68 | 340 | 0.0573 | - |
| 0.682 | 341 | 0.0001 | - |
| 0.684 | 342 | 0.0002 | - |
| 0.686 | 343 | 0.0002 | - |
| 0.688 | 344 | 0.0009 | - |
| 0.69 | 345 | 0.024 | - |
| 0.692 | 346 | 0.0003 | - |
| 0.694 | 347 | 0.0011 | - |
| 0.696 | 348 | 0.0002 | - |
| 0.698 | 349 | 0.0191 | - |
| 0.7 | 350 | 0.0001 | - |
| 0.702 | 351 | 0.0002 | - |
| 0.704 | 352 | 0.0009 | - |
| 0.706 | 353 | 0.0004 | - |
| 0.708 | 354 | 0.0001 | - |
| 0.71 | 355 | 0.0 | - |
| 0.712 | 356 | 0.0002 | - |
| 0.714 | 357 | 0.0002 | - |
| 0.716 | 358 | 0.0009 | - |
| 0.718 | 359 | 0.0005 | - |
| 0.72 | 360 | 0.0013 | - |
| 0.722 | 361 | 0.0046 | - |
| 0.724 | 362 | 0.0001 | - |
| 0.726 | 363 | 0.0005 | - |
| 0.728 | 364 | 0.0002 | - |
| 0.73 | 365 | 0.0017 | - |
| 0.732 | 366 | 0.0332 | - |
| 0.734 | 367 | 0.0004 | - |
| 0.736 | 368 | 0.0203 | - |
| 0.738 | 369 | 0.0003 | - |
| 0.74 | 370 | 0.0001 | - |
| 0.742 | 371 | 0.0003 | - |
| 0.744 | 372 | 0.0004 | - |
| 0.746 | 373 | 0.0133 | - |
| 0.748 | 374 | 0.0009 | - |
| 0.75 | 375 | 0.0017 | - |
| 0.752 | 376 | 0.0016 | - |
| 0.754 | 377 | 0.0022 | - |
| 0.756 | 378 | 0.0015 | - |
| 0.758 | 379 | 0.0004 | - |
| 0.76 | 380 | 0.0002 | - |
| 0.762 | 381 | 0.0001 | - |
| 0.764 | 382 | 0.0004 | - |
| 0.766 | 383 | 0.0001 | - |
| 0.768 | 384 | 0.0012 | - |
| 0.77 | 385 | 0.0005 | - |
| 0.772 | 386 | 0.0018 | - |
| 0.774 | 387 | 0.032 | - |
| 0.776 | 388 | 0.0002 | - |
| 0.778 | 389 | 0.0001 | - |
| 0.78 | 390 | 0.0019 | - |
| 0.782 | 391 | 0.001 | - |
| 0.784 | 392 | 0.0003 | - |
| 0.786 | 393 | 0.0001 | - |
| 0.788 | 394 | 0.0005 | - |
| 0.79 | 395 | 0.0016 | - |
| 0.792 | 396 | 0.0005 | - |
| 0.794 | 397 | 0.0018 | - |
| 0.796 | 398 | 0.0007 | - |
| 0.798 | 399 | 0.0002 | - |
| 0.8 | 400 | 0.0004 | - |
| 0.802 | 401 | 0.0002 | - |
| 0.804 | 402 | 0.001 | - |
| 0.806 | 403 | 0.0001 | - |
| 0.808 | 404 | 0.0002 | - |
| 0.81 | 405 | 0.0002 | - |
| 0.812 | 406 | 0.0004 | - |
| 0.814 | 407 | 0.0003 | - |
| 0.816 | 408 | 0.0001 | - |
| 0.818 | 409 | 0.0004 | - |
| 0.82 | 410 | 0.001 | - |
| 0.822 | 411 | 0.0005 | - |
| 0.824 | 412 | 0.0001 | - |
| 0.826 | 413 | 0.0002 | - |
| 0.828 | 414 | 0.0001 | - |
| 0.83 | 415 | 0.0004 | - |
| 0.832 | 416 | 0.0002 | - |
| 0.834 | 417 | 0.0002 | - |
| 0.836 | 418 | 0.0001 | - |
| 0.838 | 419 | 0.0002 | - |
| 0.84 | 420 | 0.0011 | - |
| 0.842 | 421 | 0.0002 | - |
| 0.844 | 422 | 0.0003 | - |
| 0.846 | 423 | 0.0002 | - |
| 0.848 | 424 | 0.0004 | - |
| 0.85 | 425 | 0.0002 | - |
| 0.852 | 426 | 0.0002 | - |
| 0.854 | 427 | 0.0501 | - |
| 0.856 | 428 | 0.0001 | - |
| 0.858 | 429 | 0.0002 | - |
| 0.86 | 430 | 0.0004 | - |
| 0.862 | 431 | 0.0003 | - |
| 0.864 | 432 | 0.0001 | - |
| 0.866 | 433 | 0.0001 | - |
| 0.868 | 434 | 0.0001 | - |
| 0.87 | 435 | 0.0002 | - |
| 0.872 | 436 | 0.0008 | - |
| 0.874 | 437 | 0.0001 | - |
| 0.876 | 438 | 0.0002 | - |
| 0.878 | 439 | 0.0002 | - |
| 0.88 | 440 | 0.0004 | - |
| 0.882 | 441 | 0.0002 | - |
| 0.884 | 442 | 0.0002 | - |
| 0.886 | 443 | 0.0001 | - |
| 0.888 | 444 | 0.0006 | - |
| 0.89 | 445 | 0.0002 | - |
| 0.892 | 446 | 0.0003 | - |
| 0.894 | 447 | 0.0002 | - |
| 0.896 | 448 | 0.0011 | - |
| 0.898 | 449 | 0.0002 | - |
| 0.9 | 450 | 0.0004 | - |
| 0.902 | 451 | 0.0001 | - |
| 0.904 | 452 | 0.0009 | - |
| 0.906 | 453 | 0.0001 | - |
| 0.908 | 454 | 0.0003 | - |
| 0.91 | 455 | 0.0006 | - |
| 0.912 | 456 | 0.0028 | - |
| 0.914 | 457 | 0.0002 | - |
| 0.916 | 458 | 0.0001 | - |
| 0.918 | 459 | 0.0002 | - |
| 0.92 | 460 | 0.0002 | - |
| 0.922 | 461 | 0.0004 | - |
| 0.924 | 462 | 0.0001 | - |
| 0.926 | 463 | 0.0001 | - |
| 0.928 | 464 | 0.0001 | - |
| 0.93 | 465 | 0.002 | - |
| 0.932 | 466 | 0.0003 | - |
| 0.934 | 467 | 0.0006 | - |
| 0.936 | 468 | 0.0001 | - |
| 0.938 | 469 | 0.0002 | - |
| 0.94 | 470 | 0.0002 | - |
| 0.942 | 471 | 0.0001 | - |
| 0.944 | 472 | 0.0002 | - |
| 0.946 | 473 | 0.0003 | - |
| 0.948 | 474 | 0.0003 | - |
| 0.95 | 475 | 0.001 | - |
| 0.952 | 476 | 0.0002 | - |
| 0.954 | 477 | 0.0001 | - |
| 0.956 | 478 | 0.0003 | - |
| 0.958 | 479 | 0.0002 | - |
| 0.96 | 480 | 0.0487 | - |
| 0.962 | 481 | 0.0002 | - |
| 0.964 | 482 | 0.0004 | - |
| 0.966 | 483 | 0.0002 | - |
| 0.968 | 484 | 0.0001 | - |
| 0.97 | 485 | 0.0003 | - |
| 0.972 | 486 | 0.0002 | - |
| 0.974 | 487 | 0.0003 | - |
| 0.976 | 488 | 0.0088 | - |
| 0.978 | 489 | 0.0003 | - |
| 0.98 | 490 | 0.0011 | - |
| 0.982 | 491 | 0.0003 | - |
| 0.984 | 492 | 0.0001 | - |
| 0.986 | 493 | 0.0001 | - |
| 0.988 | 494 | 0.0003 | - |
| 0.99 | 495 | 0.0002 | - |
| 0.992 | 496 | 0.0004 | - |
| 0.994 | 497 | 0.0003 | - |
| 0.996 | 498 | 0.0001 | - |
| 0.998 | 499 | 0.0002 | - |
| 1.0 | 500 | 0.0002 | - |
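Because the per-step loss is noisy, comparing averages over a window is an easier way to read the table above. The sketch below averages the training loss over the first ten and last ten logged steps, with the values copied from the table; the helper name `mean` is arbitrary.

```python
# Training loss for steps 1-10, copied from the table above.
first_ten_losses = [0.2869, 0.1469, 0.2431, 0.3568, 0.2769,
                    0.2425, 0.2001, 0.2825, 0.2433, 0.3096]

# Training loss for steps 491-500, copied from the table above.
last_ten_losses = [0.0003, 0.0001, 0.0001, 0.0003, 0.0002,
                   0.0004, 0.0003, 0.0001, 0.0002, 0.0002]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


first_mean = mean(first_ten_losses)
last_mean = mean(last_ten_losses)

# How many times smaller the loss is at the end of the run.
reduction_factor = first_mean / last_mean
```

The windowed averages show the loss falling by roughly three orders of magnitude over the 500 steps, despite occasional spikes in individual steps.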
Framework Versions
- Python: 3.11.3
- SetFit: 1.2.0.dev0
- Sentence Transformers: 3.4.1
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Datasets: 3.3.2
- Tokenizers: 0.21.0
Citation
BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```
Model tree for IIS-NLP/sigma-cls
Base model
NovaSearch/stella_en_400M_v5