readme update
README.md CHANGED

@@ -183,9 +183,44 @@ img {
 It is an XXL version of the FastConformer CTC [1] model (around 1.1B parameters).
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) for complete architecture details.
 
-##
+## NVIDIA NeMo: Training
+
+To train, fine-tune, or play with the model you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend you install it after you've installed the latest PyTorch version.
+```
+pip install nemo_toolkit['all']
+```
+
+## How to Use this Model
+
+The model is available for use in the NeMo toolkit [3], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset. Moreover, you can now run Parakeet CTC natively with [Transformers](https://github.com/huggingface/transformers) 🤗.
+
+### Automatically instantiate the model
+
+Using [NVIDIA NeMo](https://github.com/NVIDIA/NeMo):
+```python
+import nemo.collections.asr as nemo_asr
+asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(model_name="nvidia/parakeet-ctc-1.1b")
+```
+
+Using [Transformers](https://github.com/huggingface/transformers) 🤗:
+```python
+from transformers import AutoModelForCTC
+model = AutoModelForCTC.from_pretrained("nvidia/parakeet-ctc-1.1b")
+```
+
+### Transcribing using NeMo
+First, let's get a sample:
+```
+wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav
+```
+Then simply do:
+```
+asr_model.transcribe(['2086-149220-0033.wav'])
+```
+
+### Transcribing using [Transformers](https://github.com/huggingface/transformers) 🤗
 
-
+Make sure to install `transformers` from source.
 
 ```bash
 pip install git+https://github.com/huggingface/transformers
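
Once `transformers` is installed, a minimal transcription sketch using the generic `automatic-speech-recognition` pipeline (this assumes the pipeline task supports this checkpoint and reuses the sample file downloaded above; `torch` and `ffmpeg` must be available):

```python
# Minimal sketch: transcribe a single file via the generic Transformers ASR pipeline.
# Assumes torch and ffmpeg are installed and that this checkpoint is supported by
# the "automatic-speech-recognition" pipeline task.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="nvidia/parakeet-ctc-1.1b")
result = asr("2086-149220-0033.wav")
print(result["text"])
```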

@@ -254,33 +289,7 @@ outputs.loss.backward()
 ```
 </details>
 
-
-
-To train, fine-tune or play with the model you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend you install it after you've installed latest PyTorch version.
-```
-pip install nemo_toolkit['all']
-```
-
-## How to Use this Model
-
-The model is available for use in the NeMo toolkit [3], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
-
-### Automatically instantiate the model
-
-```python
-import nemo.collections.asr as nemo_asr
-asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(model_name="nvidia/parakeet-ctc-1.1b")
-```
-
-### Transcribing using Python
-First, let's get a sample
-```
-wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav
-```
-Then simply do:
-```
-asr_model.transcribe(['2086-149220-0033.wav'])
-```
+For more details about usage, refer to the [Transformers documentation](https://huggingface.co/docs/transformers/en/index).
 
 ### Transcribing many audio files
 
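For several files, NeMo's `transcribe()` also accepts a list of paths; a minimal sketch (the `.wav` names are placeholders, and `batch_size` is NeMo's optional batching argument):

```python
# Minimal sketch: batch transcription with NeMo. The .wav paths are placeholders;
# batch_size controls how many files are decoded per forward pass.
import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(model_name="nvidia/parakeet-ctc-1.1b")
transcripts = asr_model.transcribe(["sample1.wav", "sample2.wav"], batch_size=2)
print(transcripts)
```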