mohdusman001/llama-3p1-8b-instruct-PI2-table-sft-8192

Stage‑2 (π₂): text + schema → table. Fine‑tuned from meta-llama/Meta-Llama-3.1-8B-Instruct with a 3‑stage context schedule (2k → 4k → 8k).
Context window: 8192 tokens.

Final eval — loss: 2.395211, ppl: 10.9705
Quick sanity — json_valid: 0.078, key_order: 0.078, type: 0.078


Contract (I/O)

Input (chat):

  • user message containing:
    • [SCHEMA] — JSON object like {"fields":[{"name":"col1","type":"string"}, ...]}
    • <|document|> — the unstructured text to parse

Output: raw JSON Lines (JSONL), one JSON object per row, keys exactly in schema order.
No arrays, no prose, no code fences.
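
For example, with the order schema used in the quickstart below, a well-formed completion would be exactly two lines like these (hand-written illustration, not actual model output):

{"order_id": "O-1003", "item": "pencil", "qty": 2}
{"order_id": "O-1004", "item": "notebook", "qty": 1}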


Quickstart (deterministic)

import json, torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "mohdusman001/llama-3p1-8b-instruct-PI2-table-sft-8192"
tok = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tok.pad_token is None: tok.pad_token = tok.eos_token

# Detect common EOS/EOT ids for clean stopping.
def eos_ids(tokenizer):
    ids = []
    if tokenizer.eos_token_id is not None:
        ids.append(int(tokenizer.eos_token_id))
    for s in ["<|eot_id|>", "<|end_of_text|>", "<|eos|>", "</s>", "<|im_end|>", "<|eot|>"]:
        try:
            tid = tokenizer.convert_tokens_to_ids(s)
            if tid is not None and tid != tokenizer.unk_token_id:
                ids.append(int(tid))
        except Exception:
            pass
    return list(dict.fromkeys(ids))  # dedupe

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=(torch.bfloat16 if torch.cuda.is_available() else torch.float32),
    device_map="auto",
    # Requires the optional flash-attn package; drop this kwarg (or use
    # "sdpa") to fall back to the default attention implementation.
    attn_implementation="flash_attention_2",
)

schema = {"fields":[{"name":"order_id","type":"string"},{"name":"item","type":"string"},{"name":"qty","type":"integer"}]}
doc = "Orders today:\n- O-1003: 2x pencil\n- O-1004: 1x notebook"

user = "[SCHEMA]\n" + json.dumps(schema) + "\n\n<|document|>\n" + doc + "\n"
chat = [{"role":"user","content": user}]
txt = tok.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inp = tok(txt, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inp, max_new_tokens=256, do_sample=False,
        eos_token_id=eos_ids(tok), pad_token_id=tok.pad_token_id
    )
gen = tok.decode(out[0][inp["input_ids"].shape[1]:], skip_special_tokens=True)

rows = []
for ln in gen.splitlines():
    ln = ln.strip()
    if not ln: continue
    try:
        rows.append(json.loads(ln))
    except Exception:
        break  # stop at first non‑JSON line
print(rows)  # list of dicts
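
Since the context window is 8192 tokens, it is worth checking the prompt length before generating. A minimal sketch; the 256-token headroom mirrors max_new_tokens above:

n_prompt = inp["input_ids"].shape[1]
if n_prompt + 256 > 8192:
    print(f"warning: prompt is {n_prompt} tokens; prompt + generation may exceed the 8192-token window")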

Prompt recipe (text)

[SCHEMA]
{"fields":[{"name":"order_id","type":"string"},{"name":"item","type":"string"},{"name":"qty","type":"integer"}]}

<|document|>
Orders today:
- O-1003: 2x pencil
- O-1004: 1x notebook

Preview (first rows from samples/)

Sample 1 (first rows)

{"name": "Alimentum", "area": "city centre", "familyfriendly": ""}

Sample 2 (first rows)

{"name": "Alimentum", "area": "city centre", "familyfriendly": "false"}

Sample 3 (first rows)

{"name": "Alimentum", "area": "city centre", "familyfriendly": "false"}

Notes

  • Keep decoding deterministic (no sampling) for higher JSON validity.
  • Validate downstream: enforce field order and simple type checks (see the sketch after this list).
  • Respect the base model license and your data licenses.
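
A minimal downstream-validation sketch, assuming the {"fields": [...]} schema shape from the contract; the TYPES mapping is an assumption that covers the type names shown on this card:

# Assumed mapping from schema type names to Python types; "string" and
# "integer" are the only types this card shows, the others are guesses.
TYPES = {"string": str, "integer": int, "number": float, "boolean": bool}

def validate_row(row, schema):
    fields = schema["fields"]
    # Keys must match the schema field names exactly, in schema order.
    if list(row.keys()) != [f["name"] for f in fields]:
        return False
    # Simple type checks; unknown type names are not checked.
    for f in fields:
        expected = TYPES.get(f["type"])
        if expected is not None and not isinstance(row[f["name"]], expected):
            return False
    return True

rows = [r for r in rows if validate_row(r, schema)]  # keep only conforming rows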