Instructions to use raicrits/OpenLLama13b_Loquace_ITA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use raicrits/OpenLLama13b_Loquace_ITA with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="raicrits/OpenLLama13b_Loquace_ITA")

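# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("raicrits/OpenLLama13b_Loquace_ITA")
# AutoModelForCausalLM keeps the language-modeling head that generate() needs
model = AutoModelForCausalLM.from_pretrained("raicrits/OpenLLama13b_Loquace_ITA", dtype="auto")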

Local Apps

vLLM

How to use raicrits/OpenLLama13b_Loquace_ITA with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "raicrits/OpenLLama13b_Loquace_ITA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "raicrits/OpenLLama13b_Loquace_ITA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
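
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style completion body (`choices[0].text`). The helper names `build_completion_request` and `complete` are hypothetical and chosen for illustration.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5,
                             model="raicrits/OpenLLama13b_Loquace_ITA"):
    """Build the POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the response follows the OpenAI completions schema.
    return body["choices"][0]["text"]
```

With the server running, `complete("http://localhost:8000", "Once upon a time,")` would return the continuation as a string. Pointing `base_url` at `http://localhost:30000` should work the same way for the SGLang server described further down, since it serves the same route.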

Use Docker

docker model run hf.co/raicrits/OpenLLama13b_Loquace_ITA

SGLang

How to use raicrits/OpenLLama13b_Loquace_ITA with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "raicrits/OpenLLama13b_Loquace_ITA" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "raicrits/OpenLLama13b_Loquace_ITA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "raicrits/OpenLLama13b_Loquace_ITA" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "raicrits/OpenLLama13b_Loquace_ITA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
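
A 13B model can take a while to load, so the first curl call may fail if it is sent immediately after the server starts. The sketch below polls the server until it answers, assuming it exposes the OpenAI-style `/v1/models` route on the port used above. The helper names `server_is_up` and `wait_until_ready`, and the retry defaults, are hypothetical.

```python
import time
import urllib.error
import urllib.request


def server_is_up(base_url, timeout=2.0):
    """Return True if GET /v1/models answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/v1/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(base_url, attempts=60, delay=5.0,
                     probe=server_is_up, sleep=time.sleep):
    """Probe the server up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe(base_url):
            return True
        sleep(delay)
    return False
```

Calling `wait_until_ready("http://localhost:30000")` before sending the first completion request avoids racing the model load. The `probe` and `sleep` parameters exist only so the loop can be exercised without a live server.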

Docker Model Runner
How to use raicrits/OpenLLama13b_Loquace_ITA with Docker Model Runner:
```
docker model run hf.co/raicrits/OpenLLama13b_Loquace_ITA
```

OpenLLama13b_Loquace_ITA

Commit History

Update README.md

8e3327b

stefanoscotta committed on Jul 25, 2023

Update README.md

cfe9269

stefanoscotta committed on Jul 25, 2023

Update README.md

ab03ec0

stefanoscotta committed on Jul 25, 2023

Update README.md

8be35b4

stefanoscotta committed on Jul 25, 2023

Update README.md

d0c39e4

stefanoscotta committed on Jul 25, 2023

Update README.md

f22b3d4

amessina71 committed on Jul 25, 2023

Update README.md

92a070a

amessina71 committed on Jul 25, 2023

Update README.md

82d6f1a

amessina71 committed on Jul 25, 2023

First model version

0402336

Ubuntu committed on Jul 24, 2023

Update .gitattributes

39fbe95

stefanoscotta committed on Jul 24, 2023

initial commit

3d675db

stefanoscotta committed on Jul 24, 2023
