# UPSC Mains Question Generator
This is a fine-tuned version of ibm-granite/granite-3.0-350m specialized for generating UPSC Civil Services (Mains) style questions for General Studies (GS) papers.
It was trained on a dataset of 58 topics, each with numerous example questions from previous years, covering GS1, GS2, GS3, and GS4.
## How to Use
The model expects prompts in a specific `Question:`/`Answer:` format, so follow the template below to get a good response.
```python
import re

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# --- 1. Load the Model ---
MODEL_ID = "Hardman/upsc-question-generator"

# Configure for 4-bit loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    dtype=torch.float16,  # older transformers releases name this argument torch_dtype
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# padding=True below needs a pad token; fall back to EOS if none is defined
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# --- 2. Define the Prompt ---
topic = "GS3 - Cybersecurity threats in India"
num_questions = 2
instruction = f"""Generate exactly {num_questions} original UPSC-style Mains questions for the topic: "{topic}"."""
prompt = f"Question:\n{instruction}\n\nAnswer:"

# --- 3. Generate ---
inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# --- 4. Decode and Parse ---
full_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Keep only the text generated after the final "Answer:" marker
answer = full_output.split("Answer:")[-1].strip()

# Clean up the output: strip list markers and drop short fragments
for line in answer.split("\n"):
    line = line.strip()
    line = re.sub(r"^(Question \d+:|\d+\)|\d+\.)\s*", "", line)
    if len(line) > 25:  # Filter out junk
        print(line)
```
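If you call the model more than once, the prompt template and the clean-up step can be pulled out into small helpers. The following is a minimal sketch rather than part of the released code: the function names `build_prompt` and `parse_questions` and the `min_len` parameter are hypothetical, and the logic simply mirrors steps 2 and 4 of the example above. It needs only the standard library, so it can be tried without loading the model.

```python
import re
from typing import List

# Leading list markers the model tends to emit: "Question 1:", "1)" or "1."
_MARKER = re.compile(r"^(Question \d+:|\d+\)|\d+\.)\s*")


def build_prompt(topic: str, num_questions: int = 2) -> str:
    """Wrap a topic in the Question:/Answer: template (hypothetical helper)."""
    instruction = (
        f"Generate exactly {num_questions} original UPSC-style Mains "
        f'questions for the topic: "{topic}".'
    )
    return f"Question:\n{instruction}\n\nAnswer:"


def parse_questions(full_output: str, min_len: int = 25) -> List[str]:
    """Return cleaned question lines from decoded output (hypothetical helper)."""
    # Keep only the text after the final "Answer:" marker
    answer = full_output.split("Answer:")[-1].strip()
    questions = []
    for line in answer.split("\n"):
        # Strip whitespace and any leading list marker
        line = _MARKER.sub("", line.strip())
        # Drop short fragments, using the same threshold as the example above
        if len(line) > min_len:
            questions.append(line)
    return questions
```

With these in place, `prompt = build_prompt(topic, num_questions)` replaces step 2 and `parse_questions(full_output)` replaces the loop in step 4, returning a list instead of printing.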
## Training Details
- Base Model: ibm-granite/granite-3.0-350m
- Method: 4-bit QLoRA fine-tuning using Unsloth
- Data: A custom dataset of 58 topics and their associated past questions for all 4 GS papers
- Hardware: Google Colab T4 GPU
- Training Steps: 120