# mohdusman001/llama-3p1-8b-instruct-PI2-table-sft-8192
Stage‑2 (π₂): text + schema → table. Fine‑tuned from meta-llama/Meta-Llama-3.1-8B-Instruct with a 3‑stage context schedule (2k → 4k → 8k).
- Context window: 8192 tokens
- Final eval: loss 2.395211, perplexity 10.9705
- Quick sanity: json_valid 0.078, key_order 0.078, type 0.078
## Contract (I/O)

Input (chat): a single user message containing:

- `[SCHEMA]`: a JSON object like `{"fields":[{"name":"col1","type":"string"}, ...]}`
- `<|document|>`: the unstructured text to parse

Output: raw JSON Lines (JSONL), one JSON object per row, with keys in exact schema order. No arrays, no prose, no code fences.
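The contract maps directly onto a tiny prompt builder. Here is a hypothetical convenience wrapper (the `build_user_message` name is my own; the string assembly mirrors the Quickstart below):

```python
import json

def build_user_message(schema: dict, document: str) -> str:
    """Assemble the [SCHEMA] + <|document|> user message this model expects."""
    return "[SCHEMA]\n" + json.dumps(schema) + "\n\n<|document|>\n" + document + "\n"
```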
## Quickstart (deterministic)

```python
import json, torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "mohdusman001/llama-3p1-8b-instruct-PI2-table-sft-8192"

tok = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

# Detect common EOS/EOT ids for clean stopping.
def eos_ids(tokenizer):
    ids = []
    if tokenizer.eos_token_id is not None:
        ids.append(int(tokenizer.eos_token_id))
    for s in ["<|eot_id|>", "<|end_of_text|>", "<|eos|>", "</s>", "<|im_end|>", "<|eot|>"]:
        try:
            tid = tokenizer.convert_tokens_to_ids(s)
            if tid is not None and tid != tokenizer.unk_token_id:
                ids.append(int(tid))
        except Exception:
            pass
    return list(dict.fromkeys(ids))  # dedupe, preserving order

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=(torch.bfloat16 if torch.cuda.is_available() else torch.float32),
    device_map="auto",
    attn_implementation="flash_attention_2",
)

schema = {"fields": [{"name": "order_id", "type": "string"},
                     {"name": "item", "type": "string"},
                     {"name": "qty", "type": "integer"}]}
doc = "Orders today:\n- O-1003: 2x pencil\n- O-1004: 1x notebook"

user = "[SCHEMA]\n" + json.dumps(schema) + "\n\n<|document|>\n" + doc + "\n"
chat = [{"role": "user", "content": user}]
txt = tok.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inp = tok(txt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inp, max_new_tokens=256, do_sample=False,
        eos_token_id=eos_ids(tok), pad_token_id=tok.pad_token_id,
    )
gen = tok.decode(out[0][inp["input_ids"].shape[1]:], skip_special_tokens=True)

rows = []
for ln in gen.splitlines():
    ln = ln.strip()
    if not ln:
        continue
    try:
        rows.append(json.loads(ln))
    except Exception:
        break  # stop at the first non-JSON line
print(rows)  # list of dicts
```
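Note that `attn_implementation="flash_attention_2"` requires the `flash-attn` package and a supported GPU; if it is unavailable, dropping the argument (or passing `"sdpa"`) should produce the same output, just slower.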
## Prompt recipe (text)

```text
[SCHEMA]
{"fields":[{"name":"order_id","type":"string"},{"name":"item","type":"string"},{"name":"qty","type":"integer"}]}

<|document|>
Orders today:
- O-1003: 2x pencil
- O-1004: 1x notebook
```
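For this document, a contract-conforming completion would look like the following (illustrative of the expected format, not a captured model output):

```jsonl
{"order_id": "O-1003", "item": "pencil", "qty": 2}
{"order_id": "O-1004", "item": "notebook", "qty": 1}
```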
## Preview (first rows from samples/)

Sample 1 (first rows):

```json
{"name": "Alimentum", "area": "city centre", "familyfriendly": ""}
```

Sample 2 (first rows):

```json
{"name": "Alimentum", "area": "city centre", "familyfriendly": "false"}
```

Sample 3 (first rows):

```json
{"name": "Alimentum", "area": "city centre", "familyfriendly": "false"}
```
## Notes

- Keep decoding deterministic (no sampling) for higher JSON validity.
- Validate downstream: enforce field order and simple type checks (see the sketch after this list).
- Respect the base model license and your data licenses.
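The downstream validation mentioned above can be a few lines of Python. A minimal sketch (the `validate_rows` helper is hypothetical, and it assumes only the `string` and `integer` field types shown in this card):

```python
def validate_rows(rows, schema):
    """Keep rows whose keys match the schema order and whose values type-check."""
    names = [f["name"] for f in schema["fields"]]
    types = {f["name"]: f["type"] for f in schema["fields"]}
    # Checks for the field types used in this card; extend as needed.
    checks = {
        "string": lambda v: isinstance(v, str),
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    }
    valid = []
    for row in rows:
        if list(row.keys()) != names:
            continue  # key set or order differs from the schema
        if all(checks.get(types[k], lambda v: True)(v) for k in names):
            valid.append(row)
    return valid
```

In the Quickstart, this would run right after the parsing loop: `rows = validate_rows(rows, schema)`.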