Add link to Neuron-optimized version #59
opened by badaoui (HF Staff)
README.md CHANGED
@@ -63,4 +63,16 @@ print(outputs[0]["generated_text"])
 # How many helicopters can a human eat in one sitting?</s>
 # <|assistant|>
 # ...
-```
+```
+
+---
+## AWS Neuron Optimized Version Available
+
+A Neuron-optimized version of this model is available for improved performance on AWS Inferentia/Trainium instances:
+
+**[badaoui/TinyLlama-TinyLlama-1.1B-Chat-v1.0-neuron](https://huggingface.co/badaoui/TinyLlama-TinyLlama-1.1B-Chat-v1.0-neuron)**
+
+The Neuron-optimized version provides:
+- Pre-compiled artifacts for faster loading
+- Optimized performance on AWS Neuron devices
+- Same model capabilities with improved inference speed