MetalGPT-1-FP8 / README.md
metadata
pipeline_tag: text-generation
library_name: transformers
tags:
  - mining
  - fp8
license: apache-2.0
language:
  - ru
base_model: nn-tech/MetalGPT-1

Description

MetalGPT-1 is built on Qwen/Qwen3-32B and adds both continual pre-training and supervised fine-tuning on domain-specific data from the mining and metallurgy industry.


Quantization

For convenience and improved performance, we also provide this FP8 checkpoint of the nn-tech/MetalGPT-1 model. FP8 weights occupy roughly half the memory of 16-bit weights and enable faster inference, while preserving model quality and numerical stability.
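
As a rough illustration of the memory saving, the sketch below estimates weight storage at 16-bit and 8-bit precision. The parameter count (about 32.8 billion, the figure reported for Qwen3-32B) is an assumption, and the estimate ignores the KV cache, activations, and any layers the quantization recipe may keep in higher precision, so real usage will be higher.

```python
# Rough weight-memory estimate. Ignores KV cache, activations, and any
# layers that may be kept in higher precision by the quantization recipe.
PARAM_COUNT = 32.8e9  # assumption: parameter count reported for Qwen3-32B

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_memory_gb(param_count: float, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(PARAM_COUNT, dtype):.1f} GB")
```

Under these assumptions the FP8 weights need about half the storage of the 16-bit weights, which is what leaves more accelerator memory for the KV cache or larger batches.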


vLLM usage

vllm serve nn-tech/MetalGPT-1-FP8 --reasoning-parser qwen3

from openai import OpenAI

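# Point the OpenAI client at the local vLLM server started above.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy",  # placeholder; the client requires a non-empty key
)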

response = client.chat.completions.create(
    model="nn-tech/MetalGPT-1-FP8",
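    messages=[
        # System prompt, in English: "You are a specialist in metallurgy."
        {"role": "system", "content": "Ты специалист в области металлургии."},
        # User prompt, in English: "List the pros and cons of the chloride
        # and sulfate technologies for nickel production."
        {"role": "user", "content": "Назови плюсы и минусы хлоридной и сульфатной технологии производства никеля."}
    ],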
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)
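
Because the server is started with `--reasoning-parser qwen3`, vLLM can return the model's reasoning in a separate field from the final answer. The sketch below shows one way to read both. The helper name `split_reasoning` is hypothetical, and the field names are an assumption: vLLM releases have exposed the reasoning as `reasoning_content`, and newer ones may use `reasoning`, so the helper checks both and falls back to `None`. A `SimpleNamespace` stands in for `response.choices[0].message` so the snippet runs without a server.

```python
from types import SimpleNamespace


def split_reasoning(message):
    """Return (reasoning, answer) from a chat completion message.

    Assumption: the reasoning parser stores the reasoning text under
    `reasoning_content` or `reasoning`, depending on the vLLM version.
    """
    reasoning = (
        getattr(message, "reasoning_content", None)
        or getattr(message, "reasoning", None)
    )
    return reasoning, message.content


# Stand-in for `response.choices[0].message` from the request above.
demo_message = SimpleNamespace(
    reasoning_content="Compare the two process routes step by step.",
    content="Chloride route: ... Sulfate route: ...",
)

reasoning, answer = split_reasoning(demo_message)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

With a live server, passing `response.choices[0].message` to the helper should work the same way, provided the running vLLM version uses one of the two field names.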