πŸ“– Griot Story Identifier (1,000 Rows, DistilBERT)

This is a fine-tuned binary text classification model that predicts whether a given passage contains a story (story) or does not (not_story).
It was trained on a synthetic dataset of 1,000 rows, where each input text is at least 300 words long.

🧾 Model Details

  • Base model: distilbert-base-uncased
  • Task: Binary text classification (story vs. not_story)
  • Dataset size: 1,000 rows (balanced)
  • Sequence length: 256 max tokens
  • Training epochs: 4
  • Model size: ~67M parameters (F32, Safetensors)
  • Framework: πŸ€— Transformers + PyTorch
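
The original training script is not published with this card; the following is a minimal fine-tuning sketch consistent with the settings above (distilbert-base-uncased, 4 epochs, 256-token sequences). The tiny dataset below is only a placeholder for the unpublished synthetic data, and the label/index mapping and batch size are assumptions.

from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(
    base,
    num_labels=2,
    id2label={0: "not_story", 1: "story"},   # assumed index order; check the released config
    label2id={"not_story": 0, "story": 1},
)

# Placeholder stand-in for the unpublished 1,000-row synthetic dataset;
# real rows are 300+ word passages labeled not_story (0) or story (1).
train_ds = Dataset.from_dict({
    "text": [
        "Last summer, my friends and I built a treehouse in the backyard...",
        "The library is open from 9 a.m. to 6 p.m. on weekdays and closed on holidays.",
    ],
    "label": [1, 0],
})

def tokenize(batch):
    # Truncate to the 256-token limit listed in Model Details
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="griot-story-identifier",
    num_train_epochs=4,               # matches the 4 epochs listed above
    per_device_train_batch_size=16,   # assumed; not stated in the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()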

πŸ“Š Labels

  • story β†’ text contains a narrative arc (beginning, middle, end, events, characters)
  • not_story β†’ text is descriptive, conversational, or factual without a narrative arc
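
The mapping between class indices and these label names is stored in the model config. A quick way to inspect it (the exact index order shown in the comment is an assumption until you check the config yourself):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mjpsm/Griot-Story-Identifier-1k-v1")
print(config.id2label)   # e.g. {0: "not_story", 1: "story"} (order may differ)
print(config.label2id)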

πŸš€ Usage

Load the model directly from the Hub:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo_id = "mjpsm/Griot-Story-Identifier-1k-v1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()  # inference mode

text = """
Last summer, my friends and I built a treehouse in the backyard...
(300+ word passage here)
"""

# Tokenize and truncate to the 256-token limit used during training
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index back to its label name
pred_id = int(logits.argmax(dim=-1))
label = model.config.id2label[pred_id]

print("Predicted label:", label)