Instructions to use EssentialAI/rnj-1-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use EssentialAI/rnj-1-instruct with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="EssentialAI/rnj-1-instruct") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("EssentialAI/rnj-1-instruct") model = AutoModelForCausalLM.from_pretrained("EssentialAI/rnj-1-instruct") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use EssentialAI/rnj-1-instruct with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "EssentialAI/rnj-1-instruct" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "EssentialAI/rnj-1-instruct", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/EssentialAI/rnj-1-instruct
- SGLang
How to use EssentialAI/rnj-1-instruct with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "EssentialAI/rnj-1-instruct" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "EssentialAI/rnj-1-instruct", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "EssentialAI/rnj-1-instruct" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "EssentialAI/rnj-1-instruct", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use EssentialAI/rnj-1-instruct with Docker Model Runner:
docker model run hf.co/EssentialAI/rnj-1-instruct
gemma3 is absolutely trash.
while this model isn't that, it still uses the same implementation of attention. it's trash for quantization, it's trash for inference, it's trash for any sorta computation beyond take it and inference it on TPUs.
I might be alone in asking for this, but I'd very much appreciate it if you left a dumpster fire like google behind and used a normal implementation that can work with the rest of the libraries instead of fighting with the developer because google's fucking shares might drop. what has google done for us lately?! except sucking our bones dry like the parasite it is?