Smart Home Task Classification Model

Model Description

This is a fine-tuned DistilBERT (distilbert-base-uncased) model for routing smart home assistant requests to the appropriate agent. It classifies user prompts into five categories: lighting, climate, security, entertainment, and appliance.

Intended Uses & Limitations

Intended Uses

  • Routing smart home voice commands to the correct agent (e.g., lighting control, climate control)
  • Low-latency classification for IoT applications
  • Integration with home automation systems

Limitations

  • Trained on synthetic data only; it may not generalize to all real-world prompts
  • Limited to 5 predefined categories
  • English language only

Training Details

Training Data

  • 10,000 synthetic smart home prompts
  • Includes real-time sensor data (temperature, humidity, ambient light, human presence, location)
  • Format: JSONL with "context" and "output" fields (see the example record below)
  • Split: 8,000 train, 2,000 validation

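The following is an illustrative record in that format. The field names come from the description above; the specific values are made up for illustration, and the "output" value is assumed to be one of the five agent names:

{"context": "Turn on the living room lights. Current sensors: Temperature: 25°C, Humidity: 50%, Ambient Light: 200 lux, Human Presence: True, Location: living_room", "output": "lighting"}
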
Training Procedure

  • Base model: distilbert-base-uncased (66M parameters)
  • Task: Sequence Classification (5 classes)
  • Fine-tuning: 3 epochs
  • Learning rate: 2e-5
  • Batch size: 16
  • Optimizer: AdamW (a reproduction sketch follows this list)

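The card does not include the training script. The following is a minimal sketch of how this setup could be reproduced with the Hugging Face Trainer, using the hyperparameters above; the dataset path, label mapping, preprocessing, and split seed are assumptions for illustration, not the author's exact code.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

agents = ["lighting", "climate", "security", "entertainment", "appliance"]
label2id = {name: i for i, name in enumerate(agents)}
id2label = {i: name for name, i in label2id.items()}

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(agents),
    id2label=id2label, label2id=label2id,
)

# Assumes synthetic_data.jsonl provides "context" (prompt + sensor readings) and "output" (agent name)
dataset = load_dataset("json", data_files="synthetic_data.jsonl", split="train")
dataset = dataset.train_test_split(test_size=0.2, seed=42)  # 8,000 train / 2,000 validation

def preprocess(batch):
    # Tokenize the prompt text and attach integer class labels
    enc = tokenizer(batch["context"], truncation=True, max_length=512)
    enc["labels"] = [label2id[name] for name in batch["output"]]
    return enc

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

args = TrainingArguments(
    output_dir="smartHome_task_classification-distilBERT-66M",
    learning_rate=2e-5,              # AdamW is the Trainer's default optimizer
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["test"],
                  tokenizer=tokenizer)
trainer.train()
trainer.evaluate()
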
Training Logs

  • Epoch 1: Train Loss 0.85 → Val Loss 0.32
  • Epoch 2: Train Loss 0.28 → Val Loss 0.15
  • Epoch 3: Train Loss 0.12 → Val Loss 0.08
  • Final Validation: Accuracy 100%, F1 100%

Performance

These scores were measured on the 2,000-prompt synthetic validation split; a sketch for recomputing them follows the table.

Metric     Value
Accuracy   100%
F1 Score   100%

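A sketch of how such metrics could be recomputed. The validation examples below are hypothetical placeholders (in practice they would be built from the 2,000 held-out records), and the macro averaging for F1 is an assumption since the card does not state the averaging scheme:

import torch
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "SaiCharan7829/smartHome_task_classification-distilBERT-66M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

def predict(texts, batch_size=32):
    # Batch prompts through the classifier and return predicted class ids
    preds = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], return_tensors="pt",
                          truncation=True, padding=True, max_length=512)
        with torch.no_grad():
            logits = model(**batch).logits
        preds.extend(logits.argmax(dim=-1).tolist())
    return preds

# Hypothetical held-out examples; replace with the real validation texts and class ids (0-4)
val_texts = [
    "Turn on the living room lights. Current sensors: Ambient Light: 200 lux, Location: living_room",
    "Set the thermostat to 22 degrees. Current sensors: Temperature: 25°C, Location: bedroom",
]
val_labels = [0, 1]  # lighting, climate

pred_ids = predict(val_texts)
print("Accuracy:", accuracy_score(val_labels, pred_ids))
print("Macro F1:", f1_score(val_labels, pred_ids, average="macro"))
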
How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned classifier and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("SaiCharan7829/smartHome_task_classification-distilBERT-66M")
model = AutoModelForSequenceClassification.from_pretrained("SaiCharan7829/smartHome_task_classification-distilBERT-66M")

# Class index -> agent name, in the order used during training
agents = ["lighting", "climate", "security", "entertainment", "appliance"]

# Prompts combine the user command with the current sensor readings
prompt = "Turn on the living room lights. Current sensors: Temperature: 25°C, Humidity: 50%, Ambient Light: 200 lux, Human Presence: True, Location: living_room"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()
print(f"Predicted Agent: {agents[predicted_class]}")

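Since the intended use is routing, the predicted label would typically be dispatched to an agent. A minimal sketch building on the snippet above; the handler functions are hypothetical placeholders, not part of this repository:

def handle_lighting(prompt): print("lighting agent handles:", prompt)
def handle_climate(prompt): print("climate agent handles:", prompt)
def handle_security(prompt): print("security agent handles:", prompt)
def handle_entertainment(prompt): print("entertainment agent handles:", prompt)
def handle_appliance(prompt): print("appliance agent handles:", prompt)

# Map each agent name to its (placeholder) handler
handlers = {
    "lighting": handle_lighting,
    "climate": handle_climate,
    "security": handle_security,
    "entertainment": handle_entertainment,
    "appliance": handle_appliance,
}

# Dispatch the original prompt to whichever agent the classifier picked
handlers[agents[predicted_class]](prompt)
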
Model Files

  • model.safetensors: Model weights
  • config.json: Model configuration
  • tokenizer.json: Tokenizer
  • vocab.txt: Vocabulary
  • special_tokens_map.json: Special tokens
  • training_args.bin: Training arguments

Dataset

The synthetic dataset is included as synthetic_data.jsonl.

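A short sketch of fetching the file from this repo and loading it with the datasets library; it assumes the file sits at the repository root.

from huggingface_hub import hf_hub_download
from datasets import load_dataset

# Download synthetic_data.jsonl from the model repo (assumed to be at the repo root)
path = hf_hub_download("SaiCharan7829/smartHome_task_classification-distilBERT-66M",
                       filename="synthetic_data.jsonl")
data = load_dataset("json", data_files=path, split="train")
print(data[0])  # expected fields: "context" and "output"
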
License

MIT License
