Add/update the quantized ONNX model files and README.md for Transformers.js v3
Browse files## Applied Quantizations
### β
Based on `model.onnx` *with* slimming
β³ β
`int8`: `model_int8.onnx` (added)
β³ β
`uint8`: `model_uint8.onnx` (added)
β³ β
`q4`: `model_q4.onnx` (added)
β³ β
`q4f16`: `model_q4f16.onnx` (added)
β³ β
`bnb4`: `model_bnb4.onnx` (added)
### β
Based on `model.onnx` *with* slimming
β³ β
`int8`: `model_int8.onnx` (added)
β³ β
`uint8`: `model_uint8.onnx` (added)
β³ β
`q4`: `model_q4.onnx` (added)
β³ β
`q4f16`: `model_q4f16.onnx` (added)
β³ β
`bnb4`: `model_bnb4.onnx` (added)
- README.md +16 -0
- onnx/model_bnb4.onnx +3 -0
- onnx/model_int8.onnx +3 -0
- onnx/model_q4.onnx +3 -0
- onnx/model_q4f16.onnx +3 -0
- onnx/model_uint8.onnx +3 -0
README.md
CHANGED
|
@@ -5,4 +5,20 @@ library_name: transformers.js
|
|
| 5 |
|
| 6 |
https://huggingface.co/cambridgeltl/SapBERT-from-PubMedBERT-fulltext with ONNX weights to be compatible with Transformers.js.
|
| 7 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 8 |
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [π€ Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
|
|
|
|
| 5 |
|
| 6 |
https://huggingface.co/cambridgeltl/SapBERT-from-PubMedBERT-fulltext with ONNX weights to be compatible with Transformers.js.
|
| 7 |
|
| 8 |
+
## Usage (Transformers.js)
|
| 9 |
+
|
| 10 |
+
If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
|
| 11 |
+
```bash
|
| 12 |
+
npm i @huggingface/transformers
|
| 13 |
+
```
|
| 14 |
+
|
| 15 |
+
**Example:** Run feature extraction.
|
| 16 |
+
|
| 17 |
+
```js
|
| 18 |
+
import { pipeline } from '@huggingface/transformers';
|
| 19 |
+
|
| 20 |
+
const extractor = await pipeline('feature-extraction', 'Xenova/SapBERT-from-PubMedBERT-fulltext');
|
| 21 |
+
const output = await extractor('This is a simple test.');
|
| 22 |
+
```
|
| 23 |
+
|
| 24 |
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [π€ Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
|
onnx/model_bnb4.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d96881cf2e3559995b8ce2d98f09331b5e6fac25bd44b34d73813535fb49283a
|
| 3 |
+
size 143893470
|
onnx/model_int8.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:efc4d8a90da476381cb63c6592cfa086bfcbaf7216dbb6a493eac6f9586eced9
|
| 3 |
+
size 109622402
|
onnx/model_q4.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:511013e376a885477c665d961a8860428c542c520f757c3fe88a986a385aae07
|
| 3 |
+
size 149201358
|
onnx/model_q4f16.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d928da553e23e1858bbd37d37e14b42426531fa97601750ac9d65bdc5737bd4f
|
| 3 |
+
size 95979131
|
onnx/model_uint8.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:adbb643048e9d9bb1512bf15fb309eb1d7e7efab90a04ea49d903511723f7a92
|
| 3 |
+
size 109622449
|