Smart Home Task Classification Model

Model Description

This is a fine-tuned DistilBERT (distilbert-base-uncased) model for routing smart home assistant requests to the appropriate agent. It classifies user prompts into five categories: lighting, climate, security, entertainment, and appliance.

Intended Uses & Limitations

Intended Uses

  • Routing smart home voice commands to the correct agent (e.g., lighting control, climate control)
  • Low-latency classification for IoT applications
  • Integration with home automation systems

Limitations

  • Trained on synthetic data only; it may not generalize to all real-world prompts
  • Limited to 5 predefined categories
  • English language only

Training Details

Training Data

  • 10,000 synthetic smart home prompts
  • Includes real-time sensor data (temperature, humidity, ambient light, human presence, location)
  • Format: JSONL with "context" and "output" fields (see the example record below)
  • Split: 8,000 train, 2,000 validation

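The following is an illustrative record in that format. The field names come from the description above; the specific values are made up for illustration, and the "output" value is assumed to be one of the five agent names:

{"context": "Turn on the living room lights. Current sensors: Temperature: 25°C, Humidity: 50%, Ambient Light: 200 lux, Human Presence: True, Location: living_room", "output": "lighting"}
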
Training Procedure

  • Base model: distilbert-base-uncased (66M parameters)
  • Task: Sequence Classification (5 classes)
  • Fine-tuning: 3 epochs
  • Learning rate: 2e-5
  • Batch size: 16
  • Optimizer: AdamW (a reproduction sketch follows this list)

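The card does not include the training script. The following is a minimal sketch of how this setup could be reproduced with the Hugging Face Trainer, using the hyperparameters above; the dataset path, label mapping, preprocessing, and split seed are assumptions for illustration, not the author's exact code.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

agents = ["lighting", "climate", "security", "entertainment", "appliance"]
label2id = {name: i for i, name in enumerate(agents)}
id2label = {i: name for name, i in label2id.items()}

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(agents),
    id2label=id2label, label2id=label2id,
)

# Assumes synthetic_data.jsonl provides "context" (prompt + sensor readings) and "output" (agent name)
dataset = load_dataset("json", data_files="synthetic_data.jsonl", split="train")
dataset = dataset.train_test_split(test_size=0.2, seed=42)  # 8,000 train / 2,000 validation

def preprocess(batch):
    # Tokenize the prompt text and attach integer class labels
    enc = tokenizer(batch["context"], truncation=True, max_length=512)
    enc["labels"] = [label2id[name] for name in batch["output"]]
    return enc

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

args = TrainingArguments(
    output_dir="smartHome_task_classification-distilBERT-66M",
    learning_rate=2e-5,              # AdamW is the Trainer's default optimizer
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["test"],
                  tokenizer=tokenizer)
trainer.train()
trainer.evaluate()
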
Training Logs

  • Epoch 1: Train Loss 0.85 → Val Loss 0.32
  • Epoch 2: Train Loss 0.28 → Val Loss 0.15
  • Epoch 3: Train Loss 0.12 → Val Loss 0.08
  • Final Validation: Accuracy 100%, F1 100%

Performance

These scores were measured on the 2,000-prompt synthetic validation split; a sketch for recomputing them follows the table.

Metric     Value
Accuracy   100%
F1 Score   100%

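A sketch of how such metrics could be recomputed. The validation examples below are hypothetical placeholders (in practice they would be built from the 2,000 held-out records), and the macro averaging for F1 is an assumption since the card does not state the averaging scheme:

import torch
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "SaiCharan7829/smartHome_task_classification-distilBERT-66M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

def predict(texts, batch_size=32):
    # Batch prompts through the classifier and return predicted class ids
    preds = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], return_tensors="pt",
                          truncation=True, padding=True, max_length=512)
        with torch.no_grad():
            logits = model(**batch).logits
        preds.extend(logits.argmax(dim=-1).tolist())
    return preds

# Hypothetical held-out examples; replace with the real validation texts and class ids (0-4)
val_texts = [
    "Turn on the living room lights. Current sensors: Ambient Light: 200 lux, Location: living_room",
    "Set the thermostat to 22 degrees. Current sensors: Temperature: 25°C, Location: bedroom",
]
val_labels = [0, 1]  # lighting, climate

pred_ids = predict(val_texts)
print("Accuracy:", accuracy_score(val_labels, pred_ids))
print("Macro F1:", f1_score(val_labels, pred_ids, average="macro"))
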
How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned classifier and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("SaiCharan7829/smartHome_task_classification-distilBERT-66M")
model = AutoModelForSequenceClassification.from_pretrained("SaiCharan7829/smartHome_task_classification-distilBERT-66M")

# Class index -> agent name, in the order used during training
agents = ["lighting", "climate", "security", "entertainment", "appliance"]

# Prompts combine the user command with the current sensor readings
prompt = "Turn on the living room lights. Current sensors: Temperature: 25°C, Humidity: 50%, Ambient Light: 200 lux, Human Presence: True, Location: living_room"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()
print(f"Predicted Agent: {agents[predicted_class]}")

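Since the intended use is routing, the predicted label would typically be dispatched to an agent. A minimal sketch building on the snippet above; the handler functions are hypothetical placeholders, not part of this repository:

def handle_lighting(prompt): print("lighting agent handles:", prompt)
def handle_climate(prompt): print("climate agent handles:", prompt)
def handle_security(prompt): print("security agent handles:", prompt)
def handle_entertainment(prompt): print("entertainment agent handles:", prompt)
def handle_appliance(prompt): print("appliance agent handles:", prompt)

# Map each agent name to its (placeholder) handler
handlers = {
    "lighting": handle_lighting,
    "climate": handle_climate,
    "security": handle_security,
    "entertainment": handle_entertainment,
    "appliance": handle_appliance,
}

# Dispatch the original prompt to whichever agent the classifier picked
handlers[agents[predicted_class]](prompt)
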
Model Files

  • model.safetensors: Model weights
  • config.json: Model configuration
  • tokenizer.json: Tokenizer
  • vocab.txt: Vocabulary
  • special_tokens_map.json: Special tokens
  • training_args.bin: Training arguments

Dataset

The synthetic dataset is included as synthetic_data.jsonl.

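A short sketch of fetching the file from this repo and loading it with the datasets library; it assumes the file sits at the repository root.

from huggingface_hub import hf_hub_download
from datasets import load_dataset

# Download synthetic_data.jsonl from the model repo (assumed to be at the repo root)
path = hf_hub_download("SaiCharan7829/smartHome_task_classification-distilBERT-66M",
                       filename="synthetic_data.jsonl")
data = load_dataset("json", data_files=path, split="train")
print(data[0])  # expected fields: "context" and "output"
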
License

MIT License
