Update README.md
README.md
CHANGED
</a>

## 🤗 Transformers Usage

You can run Bark locally with the 🤗 Transformers library from version 4.31.0 onwards.

1. First install the 🤗 [Transformers library](https://github.com/huggingface/transformers) and scipy:

[...]

For more details on using the Bark model for inference with the 🤗 Transformers library, refer to the [Bark docs](https://huggingface.co/docs/transformers/model_doc/bark).
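As a quick reference, end-to-end generation with the 🤗 Transformers API looks roughly like the sketch below. It follows the pattern in the Bark docs; the prompt text and the `v2/en_speaker_6` voice preset are placeholder choices, and `suno/bark-small` can be swapped for `suno/bark`.

```python
# Sketch: text-to-audio generation with Bark via the 🤗 Transformers library.
# Assumes transformers >= 4.31.0 and scipy are installed.
import scipy.io.wavfile

from transformers import AutoProcessor, BarkModel

processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained("suno/bark-small")

# Voice presets select the speaker; "v2/en_speaker_6" is one example preset.
inputs = processor("Hello, my dog is cute", voice_preset="v2/en_speaker_6")

speech_values = model.generate(**inputs)

# Write the waveform to disk at the model's native sampling rate.
sampling_rate = model.generation_config.sample_rate
scipy.io.wavfile.write("bark_out.wav", rate=sampling_rate, data=speech_values.cpu().numpy().squeeze())
```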
### Optimization tips

Refer to this [blog post](https://huggingface.co/blog/optimizing-bark#benchmark-results) to find out more about the following methods and a benchmark of their benefits.

#### Get significant speed-ups:

**Using 🤗 Better Transformer**

Better Transformer is an 🤗 Optimum feature that performs kernel fusion under the hood. You can gain 20% to 30% in speed with zero performance degradation. It only requires one line of code to export the model to 🤗 Better Transformer:

```python
model = model.to_bettertransformer()
```

Note that 🤗 Optimum must be installed before using this feature. [Here's how to install it.](https://huggingface.co/docs/optimum/installation)

**Using Flash Attention 2**

Flash Attention 2 is an even faster, optimized version of the previous optimization.

```python
model = BarkModel.from_pretrained("suno/bark-small", torch_dtype=torch.float16, use_flash_attention_2=True).to(device)
```

Make sure to load your model in half-precision (e.g. `torch.float16`) and to [install](https://github.com/Dao-AILab/flash-attention#installation-and-features) the latest version of Flash Attention 2.

**Note:** Flash Attention 2 is only available on newer GPUs; refer to 🤗 Better Transformer in case your GPU doesn't support it.

#### Reduce memory footprint:

**Using half-precision**

You can speed up inference and reduce memory footprint by 50% simply by loading the model in half-precision (e.g. `torch.float16`).
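The 50% figure follows from storage width alone: `float16` stores each weight in 2 bytes versus 4 bytes for `float32`. A quick back-of-the-envelope check (the parameter count below is made up for illustration, not Bark's actual size):

```python
# Rough weight-memory estimate at different precisions.
# num_params is a hypothetical figure, not Bark's real parameter count.
num_params = 1_000_000_000

bytes_fp32 = num_params * 4  # float32: 4 bytes per parameter
bytes_fp16 = num_params * 2  # float16: 2 bytes per parameter

savings = 1 - bytes_fp16 / bytes_fp32
print(f"fp32: {bytes_fp32 / 1e9:.0f} GB, fp16: {bytes_fp16 / 1e9:.0f} GB, savings: {savings:.0%}")
```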
**Using CPU offload**

Bark is made up of 4 sub-models, which are called up sequentially during audio generation. In other words, while one sub-model is in use, the other sub-models are idle.

If you're using a CUDA device, a simple solution to benefit from an 80% reduction in memory footprint is to offload idle sub-models from the GPU to the CPU. This operation, called CPU offloading, takes one line of code:

```python
model.enable_cpu_offload()
```

Note that 🤗 Accelerate must be installed before using this feature. [Here's how to install it.](https://huggingface.co/docs/accelerate/basic_tutorials/install)
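The intuition behind the saving can be checked with a toy model of the pipeline: with offloading, only the active sub-model is resident on the GPU, so peak GPU memory drops from the sum of all sub-model sizes to roughly the largest single one. The sizes and names below are invented for illustration; Bark's real sub-model sizes differ.

```python
# Toy model of sequential sub-model execution, with and without CPU offload.
# Sizes and stage names are invented for illustration only.
submodel_sizes_gb = {"semantic": 1.0, "coarse": 1.2, "fine": 1.3, "codec": 0.5}

# Without offload, all four sub-models stay resident on the GPU at once.
peak_without_offload = sum(submodel_sizes_gb.values())

# With offload, idle sub-models live on the CPU; only the active one is on the GPU.
peak_with_offload = max(submodel_sizes_gb.values())

reduction = 1 - peak_with_offload / peak_without_offload
print(f"without offload: {peak_without_offload:.1f} GB peak")
print(f"with offload:    {peak_with_offload:.1f} GB peak ({reduction:.0%} lower)")
```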
## Suno Usage

You can also run Bark locally through the original [Bark library](https://github.com/suno-ai/bark):