# Smart Home Task Classification Model

## Model Description
This is a fine-tuned DistilBERT (`distilbert-base-uncased`) model for routing smart home assistant requests to the appropriate agent. The model classifies user prompts into five categories: `lighting`, `climate`, `security`, `entertainment`, and `appliance`.
## Intended Uses & Limitations

### Intended Uses
- Routing smart home voice commands to the correct agent (e.g., lighting control, climate control)
- Low-latency classification for IoT applications
- Integration with home automation systems
### Limitations
- Trained on synthetic data; it may not generalize to all real-world prompts
- Limited to 5 predefined categories
- English language only
## Training Details

### Training Data
- 10,000 synthetic smart home prompts
- Includes real-time sensor data (temperature, humidity, ambient light, human presence, location)
- Format: JSONL with "context" and "output" fields
- Split: 8,000 train, 2,000 validation
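
For reference, here is a minimal sketch of what a single training record might look like, given the `"context"` and `"output"` fields described above. The field names come from this card; the specific prompt and the exact sensor phrasing are assumptions, not a verbatim row from `synthetic_data.jsonl`.

```python
import json

# Hypothetical record illustrating the JSONL format described above.
# Field names ("context", "output") follow this card; the prompt text and
# sensor phrasing are assumptions, not a real sample from the dataset.
record = {
    "context": (
        "Dim the bedroom lights to 30%. Current sensors: Temperature: 22°C, "
        "Humidity: 45%, Ambient Light: 15 lux, Human Presence: True, Location: bedroom"
    ),
    "output": "lighting",
}

# Each line of synthetic_data.jsonl holds one such JSON object.
print(json.dumps(record, ensure_ascii=False))
```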
### Training Procedure
- Base model: distilbert-base-uncased (66M parameters)
- Task: Sequence Classification (5 classes)
- Fine-tuning: 3 epochs
- Learning rate: 2e-5
- Batch size: 16
- Optimizer: AdamW (see the fine-tuning sketch below)
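
The following is a minimal fine-tuning sketch that applies the hyperparameters above with the Hugging Face `Trainer` (which uses AdamW by default). The dataset path, the split seed, the output directory, and the label-column mapping are assumptions based on this card; the actual training script is not included in the repository.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

agents = ["lighting", "climate", "security", "entertainment", "appliance"]
label2id = {name: i for i, name in enumerate(agents)}
id2label = {i: name for name, i in label2id.items()}

# Load the bundled JSONL and reproduce the 8,000 / 2,000 split (the seed is an assumption).
raw = load_dataset("json", data_files="synthetic_data.jsonl", split="train")
splits = raw.train_test_split(test_size=0.2, seed=42)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def preprocess(example):
    # "context" is the input prompt, "output" the agent label string.
    enc = tokenizer(example["context"], truncation=True, max_length=512)
    enc["labels"] = label2id[example["output"]]
    return enc

tokenized = splits.map(preprocess, remove_columns=splits["train"].column_names)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(agents), id2label=id2label, label2id=label2id
)

args = TrainingArguments(
    output_dir="smarthome-task-classifier",  # arbitrary local output directory
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
)

trainer = Trainer(  # Trainer optimizes with AdamW by default, matching the card
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)
trainer.train()
```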
### Training Logs
- Epoch 1: Train Loss 0.85 → Val Loss 0.32
- Epoch 2: Train Loss 0.28 → Val Loss 0.15
- Epoch 3: Train Loss 0.12 → Val Loss 0.08
- Final validation (2,000 held-out synthetic examples): Accuracy 100%, F1 100%
## Performance

The metrics below are measured on the 2,000-example synthetic validation split.
| Metric | Value |
|---|---|
| Accuracy | 100% |
| F1 Score | 100% |
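
A hedged sketch of how these metrics could be recomputed is shown below. `val_records` is a placeholder for the 2,000-example validation split, and the F1 averaging mode is an assumption; neither is specified by the card.

```python
import torch
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "SaiCharan7829/smartHome_task_classification-distilBERT-66M"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)
model.eval()

agents = ["lighting", "climate", "security", "entertainment", "appliance"]

# Placeholder records; in practice these come from the 2,000-example validation split.
val_records = [
    {"context": "Turn on the living room lights. Current sensors: ...", "output": "lighting"},
]

y_true, y_pred = [], []
with torch.no_grad():
    for rec in val_records:
        inputs = tokenizer(rec["context"], return_tensors="pt", truncation=True, max_length=512)
        y_pred.append(model(**inputs).logits.argmax(dim=-1).item())
        y_true.append(agents.index(rec["output"]))

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 (macro):", f1_score(y_true, y_pred, average="macro"))  # averaging mode assumed
```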
## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("SaiCharan7829/smartHome_task_classification-distilBERT-66M")
model = AutoModelForSequenceClassification.from_pretrained("SaiCharan7829/smartHome_task_classification-distilBERT-66M")

# Class index -> agent name
agents = ["lighting", "climate", "security", "entertainment", "appliance"]

# Prompts combine the user request with the current sensor readings
prompt = "Turn on the living room lights. Current sensors: Temperature: 25°C, Humidity: 50%, Ambient Light: 200 lux, Human Presence: True, Location: living_room"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

predicted_class = outputs.logits.argmax(dim=-1).item()
print(f"Predicted Agent: {agents[predicted_class]}")
```
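
Alternatively, the model can be run through the high-level `pipeline` API, which also returns a softmax confidence score. Note that the label names it prints depend on the `id2label` mapping stored in `config.json`, which this card does not document, so they may appear as generic `LABEL_i` names.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="SaiCharan7829/smartHome_task_classification-distilBERT-66M",
)

result = classifier(
    "Turn on the living room lights. Current sensors: Temperature: 25°C, "
    "Humidity: 50%, Ambient Light: 200 lux, Human Presence: True, Location: living_room"
)
print(result)  # e.g. [{'label': <class name or LABEL_i>, 'score': <softmax confidence>}]
```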
## Model Files

- `model.safetensors`: Model weights
- `config.json`: Model configuration
- `tokenizer.json`: Tokenizer files
- `vocab.txt`: Vocabulary
- `special_tokens_map.json`: Special tokens
- `training_args.bin`: Training arguments
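
To fetch all of these files locally (for offline use, for example), a minimal sketch with `huggingface_hub.snapshot_download`:

```python
from huggingface_hub import snapshot_download

# Downloads the full repository (weights, tokenizer files, training_args.bin, dataset)
# into the local Hugging Face cache and returns the path.
local_dir = snapshot_download(repo_id="SaiCharan7829/smartHome_task_classification-distilBERT-66M")
print("Model files available at:", local_dir)
```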
## Dataset

The synthetic dataset is included in the repository as `synthetic_data.jsonl`.
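
A hedged sketch for loading the bundled dataset with the `datasets` library; it assumes `synthetic_data.jsonl` sits at the root of the model repository.

```python
from datasets import load_dataset
from huggingface_hub import hf_hub_download

# Download the JSONL file from the model repo, then load it as a Dataset.
path = hf_hub_download(
    repo_id="SaiCharan7829/smartHome_task_classification-distilBERT-66M",
    filename="synthetic_data.jsonl",
)
ds = load_dataset("json", data_files=path, split="train")

print(ds)     # expected: 10,000 rows with "context" and "output" columns
print(ds[0])  # first record
```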
## License
MIT License