Introduction
The Ming large language model (Ming‑LLM) is a domain‑specialized LLM for the energy sector.
- We release both the base model and the supervised fine‑tuned (SFT) variant.
- The Ming base model is initialized from the Qwen2.5‑72B base model and is subsequently adapted via continued pretraining on a high‑quality energy‑domain corpus.
- The SFT variant is initialized from the Ming base model and is trained on instruction‑tuning datasets, including conversational QA, sentiment analysis, and information extraction, among others.
- Both models achieve competitive or improved scores on general benchmarks such as C‑Eval, CMMLU, MMLU, GSM8K, and IFEval (see Evaluation below).
Training Hyperparameters
Base model (continued pretraining):
- sequence_len: 4096
- gradient_accumulation_steps: 128
- learning_rate: 1.0e-5
- lr_scheduler_type: cosine
- warmup_ratio: 0
- num_train_epochs: 1.0
SFT:
- sequence_len: 4096
- gradient_accumulation_steps: 128
- learning_rate: 2.0e-6
- max_grad_norm: 1.0
- lr_scheduler_type: cosine
- warmup_ratio: 0.03
- num_train_epochs: 1.0
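The training framework used for Ming is not stated in this card. As a hedged illustration only, the SFT hyperparameters above could be expressed with Hugging Face TrainingArguments roughly as follows; the output path, per-device batch size, and bf16 flag are assumptions, not values from this card.

```python
# Minimal sketch (assumption): mapping the SFT hyperparameters above onto
# Hugging Face TrainingArguments. The base-model continued-pretraining
# settings map analogously (learning_rate=1.0e-5, warmup_ratio=0).
from transformers import TrainingArguments

sft_args = TrainingArguments(
    output_dir="ming-sft",             # hypothetical output directory
    per_device_train_batch_size=1,     # assumption; not stated in the card
    gradient_accumulation_steps=128,
    learning_rate=2.0e-6,              # peak LR under the cosine schedule
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_grad_norm=1.0,
    num_train_epochs=1.0,
    bf16=True,                         # assumption: bfloat16 training
)
```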
Evaluation
| Model | C-Eval (5-shot) | CMMLU (5-shot) | MMLU (5-shot) | GPQA (0-shot) | BBH (0-shot) | HellaSwag (10-shot) | GSM8K | IFEval |
|---|---|---|---|---|---|---|---|---|
| Qwen2.5-72B-base | 89.72 | 89.75 | 84.79 | 37.88 | 85.81 | 94.93 | 89.99 | - |
| Ming1.0-base | 90.11 | 89.84 | 84.97 | 41.92 | 84.80 | 92.73 | 89.23 | - |
| Qwen2.5-72B-instruct | 87.97 | 87.26 | 84.18 | 36.87 | 83.68 | 92.65 | 89.69 | 82.81 |
| Ming1.0 | 90.08 | 89.94 | 85.12 | 37.88 | 85.24 | 94.20 | 91.43 | 78.74 |
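The evaluation harness behind these numbers is not specified in this card. As one hedged example of how comparable few-shot settings can be reproduced, the open-source lm-evaluation-harness supports these benchmarks; the model path below is a placeholder.

```python
# Sketch only (assumption): the scores above were not necessarily produced with
# lm-evaluation-harness; this shows one common way to run a 5-shot CMMLU eval.
from lm_eval import evaluator  # requires the lm-evaluation-harness package

results = evaluator.simple_evaluate(
    model="hf",                                            # transformers backend
    model_args="pretrained=/model/path,dtype=bfloat16",    # placeholder model path
    tasks=["cmmlu"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"]["cmmlu"])
```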
Inference
You can run the Ming models with the standard Hugging Face transformers library:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

dtype = torch.bfloat16
device_map = "auto"
model_path = "/model/path"  # replace with the local path or Hub ID of the Ming checkpoint

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(
    model_path, use_fast=True, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=dtype, device_map=device_map, trust_remote_code=True
)

# Build a chat prompt using the model's chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.3,
        top_p=0.9,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=(tokenizer.pad_token_id or tokenizer.eos_token_id),
    )

# Decode only the newly generated tokens
gen_ids = output_ids[0, inputs["input_ids"].shape[1]:]
text = tokenizer.decode(gen_ids, skip_special_tokens=True)
print(text)
```
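If you want tokens printed to stdout as they are generated rather than decoded at the end, you can pass a TextStreamer to generate. A minimal sketch, reusing the tokenizer, model, and inputs from the snippet above:

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated;
# skip_prompt avoids echoing the input prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.no_grad():
    model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.3,
        top_p=0.9,
        streamer=streamer,
    )
```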
Bias, Risks, and Limitations
- Like any base or fine-tuned language model without safety filtering, these models can easily be prompted to generate harmful or sensitive content.
- Such content may also be produced unintentionally, especially in cases involving bias, so users should weigh these risks before applying this technology.
- Additionally, outputs from the Ming models, as from any LLM, can be inaccurate, so factual claims should be verified.
License and use
- Ming1.0 is built with Qwen2.5-72B. Qwen2.5-72B is licensed under the Qwen LICENSE AGREEMENT, Copyright (c) Alibaba Cloud. All Rights Reserved.
- Subject to the Qwen LICENSE AGREEMENT, Ming1.0 is released under the MIT License.
