Instructions to use ruggsea/Llama3.1-Instruct-SEP-Chat with libraries, inference providers, notebooks, and local apps.

Libraries

How to use ruggsea/Llama3.1-Instruct-SEP-Chat with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ruggsea/Llama3.1-Instruct-SEP-Chat")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ruggsea/Llama3.1-Instruct-SEP-Chat")
model = AutoModelForCausalLM.from_pretrained("ruggsea/Llama3.1-Instruct-SEP-Chat")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use ruggsea/Llama3.1-Instruct-SEP-Chat with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ruggsea/Llama3.1-Instruct-SEP-Chat"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ruggsea/Llama3.1-Instruct-SEP-Chat",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
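The same request can be sent from Python. The sketch below uses only the standard library and assumes the server started above is reachable at http://localhost:8000. `build_chat_request`, `extract_reply`, and `ask` are hypothetical helper names, and the response layout (`choices[0].message.content`) is the usual OpenAI-compatible chat completion shape rather than anything specific to this model.

```python
import json
import urllib.request

MODEL_ID = "ruggsea/Llama3.1-Instruct-SEP-Chat"


def build_chat_request(user_text, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of a chat completion response (bytes or str)."""
    parsed = json.loads(response_body)
    return parsed["choices"][0]["message"]["content"]


def ask(user_text, base_url="http://localhost:8000"):
    """Send one question to a running server and return the reply text."""
    request = build_chat_request(user_text, base_url=base_url)
    with urllib.request.urlopen(request) as response:
        return extract_reply(response.read())
```

With a server running, `ask("What is the capital of France?")` returns the reply text. Passing `base_url="http://localhost:30000"` targets the SGLang server instead.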

Use Docker

docker model run hf.co/ruggsea/Llama3.1-Instruct-SEP-Chat

SGLang

How to use ruggsea/Llama3.1-Instruct-SEP-Chat with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ruggsea/Llama3.1-Instruct-SEP-Chat" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ruggsea/Llama3.1-Instruct-SEP-Chat",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ruggsea/Llama3.1-Instruct-SEP-Chat" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use ruggsea/Llama3.1-Instruct-SEP-Chat with Docker Model Runner:
```
docker model run hf.co/ruggsea/Llama3.1-Instruct-SEP-Chat
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Llama3-SEP-Chat: Philosophy Expert Assistant

This model is a LoRA fine-tune of meta-llama/Meta-Llama-3.1-8B-instruct, trained on a curated dataset of conversations derived from the Stanford Encyclopedia of Philosophy (SEP). It is designed to discuss philosophy in a formal yet accessible tone, drawing on the SEP's broad coverage of the field.

Model Description

The model was fine-tuned directly on the instruct variant of Llama 3.1, preserving its native chat format and instruction-following capabilities while strengthening its philosophical expertise.

Training Dataset

The training data consists of multi-turn conversations derived from the Stanford Encyclopedia of Philosophy, formatted as chat interactions between a user and an assistant. The conversations maintain academic rigor while ensuring accessibility.

Chat Format

The model uses Llama 3's native chat format, which is automatically applied by the tokenizer. No additional tokens or formatting were added during finetuning.
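For orientation, the sketch below hand-assembles a single-turn prompt from the Llama 3 special tokens. It is a simplified illustration only: `render_llama3_prompt` is a hypothetical helper, and the tokenizer's real template may add content this sketch omits (such as a default system block), so working code should rely on `tokenizer.apply_chat_template` instead.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Simplified rendering of the Llama 3 chat layout (illustration only)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then an end-of-turn token
        parts.append(f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n")
        parts.append(f"{message['content'].strip()}<|eot_id|>")
    if add_generation_prompt:
        # An open assistant header cues the model to write the next reply
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = render_llama3_prompt(
    [{"role": "user", "content": "What is the categorical imperative?"}]
)
print(prompt)
```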

Training Details

Model Configuration

Base Model: meta-llama/Meta-Llama-3.1-8B-instruct
Training Type: LoRA (Low-Rank Adaptation)
Quantization: 4-bit (NF4)
Compute: Mixed Precision (bfloat16)
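As a rough sense of what 4-bit loading saves, the sketch below estimates weight memory from the parameter count alone. It assumes a round 8 billion parameters and ignores quantization constants, activations, optimizer state, and the LoRA adapter itself. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8_000_000_000  # assumption: round figure for an 8B model

nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit NF4 base weights
bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16-bit weights for comparison

print(f"4-bit: ~{nf4_gb:.0f} GB, 16-bit: ~{bf16_gb:.0f} GB")
```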

Training Hyperparameters

Learning Rate: 2e-5
Train Batch Size: 16
Gradient Accumulation Steps: 2
Effective Batch Size: 32
Optimizer: paged_adamw_8bit
Training Epochs: 5
Warmup Ratio: 0.03
LoRA Configuration:
- Rank: 256
- Alpha: 128
- Dropout: 0.05
- Target: all-linear layers
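
The arithmetic behind these settings can be checked with a short sketch. The layer shapes below are assumptions taken from the public Llama 3.1 8B configuration (hidden size 4096, MLP size 14336, 32 layers, 1024-wide key/value projections), not from this card. The count also assumes that "all-linear" covers the seven projection matrices in each decoder layer and leaves the output head untouched. The scaling factor follows the standard LoRA convention of alpha divided by rank.

```python
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
LORA_RANK = 256
LORA_ALPHA = 128

# Samples contributing to each optimizer step (single device assumed)
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Standard LoRA scales the low-rank update by alpha / rank
lora_scaling = LORA_ALPHA / LORA_RANK

# Assumed (in_features, out_features) of the linear layers in one decoder block
HIDDEN, MLP, KV_DIM, NUM_LAYERS = 4096, 14336, 1024, 32
layer_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """A is (rank x in) and B is (out x rank), so the pair adds rank * (in + out)."""
    return rank * (in_features + out_features)


trainable = NUM_LAYERS * sum(
    lora_params(i, o, LORA_RANK) for i, o in layer_shapes.values()
)
print(effective_batch_size, lora_scaling, f"{trainable:,}")
```

Under these assumptions the adapter comes to roughly 671 million trainable parameters, and the alpha/rank ratio of 0.5 halves the raw low-rank update.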

Framework Versions

Transformers: latest
PEFT: latest
PyTorch: 2.1.0+cu121
TRL: latest
Accelerate: latest

Usage

The model can be used with the standard Hugging Face transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ruggsea/Llama3.1-Instruct-SEP-Chat"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Format your input using the chat template
messages = [
    {"role": "user", "content": "What is the categorical imperative?"}
]

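# Apply the chat template; add_generation_prompt=True appends the assistant
# header so the model answers instead of continuing the user turn
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate response
# The templated prompt already begins with the BOS token, so do not add another
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)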
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
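
Because the training data is multi-turn, follow-up questions can be asked by carrying the conversation history forward. The sketch below shows one way to manage that history. `chat_turn` and `generate_fn` are hypothetical names, and the stub generator stands in for a function that applies the chat template, calls `model.generate`, and decodes the new tokens as in the example above.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_fn: Callable[[List[Message]], str],
) -> List[Message]:
    """Append a user turn, obtain a reply, and return the extended history."""
    messages = history + [{"role": "user", "content": user_text}]
    reply = generate_fn(messages)
    return messages + [{"role": "assistant", "content": reply}]


def stub_generate(messages: List[Message]) -> str:
    """Placeholder for real generation; reports how many messages it was given."""
    return f"(reply after {len(messages)} message(s))"


history: List[Message] = []
history = chat_turn(history, "What is the categorical imperative?", stub_generate)
history = chat_turn(
    history, "How does it differ from a hypothetical imperative?", stub_generate
)
print([m["role"] for m in history])
```

Returning a new list on each turn keeps earlier histories intact, which makes it easy to branch a conversation or retry a turn.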

Limitations

- The model's knowledge is focused primarily on philosophical concepts, and it may perform less well on general topics.
- As with all language models, it may occasionally generate incorrect or inconsistent information.
- The model inherits any limitations and biases present in the base Llama 3.1 model and the SEP dataset.

License

This model is subject to the Meta Llama 3 license terms. Please refer to Meta's licensing for usage requirements and restrictions.

Safetensors

Model size: 8B params
Tensor type: F16

Model tree for ruggsea/Llama3.1-Instruct-SEP-Chat

Base model: meta-llama/Llama-3.1-8B
Finetuned: meta-llama/Llama-3.1-8B-Instruct
Finetuned (2760): this model
Quantizations: 2 models
