Singapore Hawker Centers Status Prediction Model
Model Description
This machine learning model predicts the status of Singapore hawker centers from their characteristics. It uses a Random Forest classifier to assign each center to one of several status categories.
Model Performance
- Accuracy: 88.46% (held-out test split)
- Algorithm: Random Forest Classifier
- Training Data: 129 Singapore hawker centers
- Features: 8 engineered features
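To put the accuracy figure in context, the arithmetic below shows how many test samples sit behind it. It is only a sketch: it assumes the accuracy was measured on the 20% split described under Model Training and that the test-set size is rounded up, as scikit-learn's `train_test_split` does. The count of 23 correct predictions is a hypothetical value chosen because it reproduces the reported figure.

```python
import math

n_samples = 129        # hawker centers in the dataset
test_fraction = 0.20   # from the 80/20 train/test split

# Assumption: the test-set size is rounded up (scikit-learn's behaviour).
n_test = math.ceil(n_samples * test_fraction)

# Hypothetical number of correct predictions that matches the reported accuracy.
n_correct = 23

print(n_test)                               # 26
print(f"{n_correct / n_test:.2%}")          # 88.46%
print(f"{1 / n_test:.2%} per test sample")  # 3.85% per test sample
```

Under these assumptions, a single misclassified center moves the accuracy by almost four percentage points, so the figure should be read as a rough estimate.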
Model Details
Model Architecture
- Type: Random Forest Classifier
- Number of Estimators: 100
- Random State: 42
Training Data
The model was trained on Singapore hawker centers data from the GeoJSON dataset, containing:
- 129 hawker centers
- Location coordinates (longitude, latitude)
- Number of food stalls
- Completion dates
- Status information
Feature Engineering
- Age: Calculated from completion year
- Location: Normalized longitude and latitude
- Stalls: Number of cooked food stalls
- Size Categories: Small (<20), Medium (20-50), Large (51-100), Very Large (>100)
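The size categories can be derived and one-hot encoded roughly as follows. This is a minimal sketch using pandas, not the original training code. The helper name `encode_stall_size` is hypothetical, and the boundary handling (50 stalls counts as Medium, 100 as Large) is an assumption taken from the usage example, which encodes 50 stalls as Medium.

```python
import numpy as np
import pandas as pd

SIZE_LABELS = ["Small", "Medium", "Large", "Very Large"]


def encode_stall_size(stalls: pd.Series) -> pd.DataFrame:
    """One-hot encode integer stall counts into four size columns."""
    # Assumed right-inclusive bins on integer counts:
    # (-inf, 19], (19, 50], (50, 100], (100, inf)
    size = pd.cut(
        stalls,
        bins=[-np.inf, 19, 50, 100, np.inf],
        labels=SIZE_LABELS,
    )
    # prefix="stall" yields stall_Small, stall_Medium, stall_Large, stall_Very Large
    return pd.get_dummies(size, prefix="stall", dtype=int)


demo = encode_stall_size(pd.Series([12, 50, 51, 140]))
print(demo)
```

Because `pd.cut` returns a categorical column, `get_dummies` emits all four columns even when a category does not occur in the input, which keeps the feature layout stable.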
Target Classes
- Existing: 103 centers (79.8%)
- Existing (new): 16 centers (12.4%)
- Under Construction: 6 centers (4.7%)
- Existing (replacement): 3 centers (2.3%)
- Other: 1 center (0.8%)
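The class shares above also give a useful reference point for the reported accuracy: a model that always predicts the most common class. The short calculation below reproduces the percentages from the listed counts and derives that baseline. It is computed on the full dataset; the share within the actual test split may differ slightly.

```python
class_counts = {
    "Existing": 103,
    "Existing (new)": 16,
    "Under Construction": 6,
    "Existing (replacement)": 3,
    "Other": 1,
}

total = sum(class_counts.values())
shares = {name: count / total for name, count in class_counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")

# Accuracy of a trivial model that always predicts the majority class.
majority_class = max(class_counts, key=class_counts.get)
baseline_accuracy = shares[majority_class]
print(f"Majority baseline ({majority_class}): {baseline_accuracy:.1%}")  # 79.8%
```

Against this baseline of about 79.8%, the reported 88.46% is an improvement of roughly nine percentage points.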
Feature Importance
- Age: 55.2% - Most important predictor
- Latitude: 20.6% - Geographic location matters
- Longitude: 9.3% - East-west positioning
- Stalls: 8.8% - Size of the center
- Size Categories: 4.1% - Categorical size information
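Importances like these can be read from a fitted scikit-learn forest through its `feature_importances_` attribute. The sketch below assumes the "Size Categories" figure is the sum over the four one-hot columns, which the source does not state. It trains a stand-in model on random data so that it runs without the real model file, and the printed values are therefore meaningless. Note that scikit-learn's importances sum to 100%, while the values listed above sum to 98.0%, so that list may round or omit some share.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURE_NAMES = [
    "stalls", "age", "longitude_norm", "latitude_norm",
    "stall_Small", "stall_Medium", "stall_Large", "stall_Very Large",
]

# Synthetic stand-in data; with the real model, load it via joblib instead.
rng = np.random.default_rng(0)
X = rng.random((60, len(FEATURE_NAMES)))
y = rng.integers(0, 2, size=60)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

importances = dict(zip(FEATURE_NAMES, model.feature_importances_))

# Keep the numeric features as they are and sum the one-hot size columns.
grouped = {k: v for k, v in importances.items() if not k.startswith("stall_")}
grouped["size_categories"] = sum(
    v for k, v in importances.items() if k.startswith("stall_")
)

for name, value in sorted(grouped.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.1%}")
```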
Usage
Python Example
import joblib
# Load the trained model
model = joblib.load('hawker_centers_model.pkl')
# Prepare one row of features, in this order:
# [stalls, age, longitude_norm, latitude_norm, stall_Small, stall_Medium, stall_Large, stall_Very Large]
features = [[50, 20, 0.5, 0.3, 0, 1, 0, 0]]  # 50 stalls, 20 years old, Medium size
# Predict the status and use the highest class probability as the confidence
prediction = model.predict(features)
confidence = model.predict_proba(features).max()
print(f"Predicted status: {prediction[0]}")
print(f"Confidence: {confidence:.2%}")
Input Format
- stalls: Number of food stalls (integer)
- age: Age of hawker center in years (integer)
- longitude_norm: Normalized longitude (0-1, float)
- latitude_norm: Normalized latitude (0-1, float)
- stall_Small: Binary indicator for small size (0 or 1)
- stall_Medium: Binary indicator for medium size (0 or 1)
- stall_Large: Binary indicator for large size (0 or 1)
- stall_Very Large: Binary indicator for very large size (0 or 1)
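Since the model expects the eight values in a fixed order, a small helper can assemble and check a row before prediction. The function `build_features` below is hypothetical and not part of the released model. Its size thresholds assume that 50 stalls counts as Medium (as in the usage example) and that 100 counts as Large.

```python
SIZE_LABELS = ["Small", "Medium", "Large", "Very Large"]


def build_features(stalls: int, age: int,
                   longitude_norm: float, latitude_norm: float) -> list:
    """Return one feature row in the order the model expects."""
    for name, value in (("longitude_norm", longitude_norm),
                        ("latitude_norm", latitude_norm)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if stalls < 0 or age < 0:
        raise ValueError("stalls and age must be non-negative")

    # Assumed boundaries: <20, 20-50, 51-100, >100
    if stalls < 20:
        size = "Small"
    elif stalls <= 50:
        size = "Medium"
    elif stalls <= 100:
        size = "Large"
    else:
        size = "Very Large"

    # Exactly one of the four indicators is set to 1.
    one_hot = [int(size == label) for label in SIZE_LABELS]
    return [stalls, age, longitude_norm, latitude_norm, *one_hot]


row = build_features(50, 20, 0.5, 0.3)
print(row)  # [50, 20, 0.5, 0.3, 0, 1, 0, 0]
```

The resulting row can be passed to the model as `model.predict([row])`.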
Training Details
Data Preprocessing
- Extracted completion years from date strings
- Calculated age from completion year
- Normalized coordinates to 0-1 range
- Created size categories based on stall count
- Grouped rare classes to avoid stratification issues
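The preprocessing steps above might look roughly like this in pandas. Everything specific here is an assumption made for illustration: the column names, the date format, the reference year of 2024 (taken from the citation year), the rare-class threshold, and the three sample rows. The original pipeline may differ on each point.

```python
import pandas as pd

REFERENCE_YEAR = 2024   # assumption: ages are relative to the citation year
MIN_CLASS_SIZE = 2      # hypothetical threshold below which a class is "rare"

# Made-up sample rows; column names and date format are hypothetical.
raw = pd.DataFrame({
    "completion_date": ["1/1/1975", "15/6/2011", "1/3/2022"],
    "longitude": [103.80, 103.85, 103.95],
    "latitude": [1.30, 1.35, 1.40],
    "status": ["Existing", "Existing", "Some rare status"],
})

# Extract the first four-digit year from each date string, then compute the age.
year = raw["completion_date"].str.extract(r"(\d{4})", expand=False).astype(int)
raw["age"] = REFERENCE_YEAR - year

# Min-max normalize the coordinates to the 0-1 range.
for col in ("longitude", "latitude"):
    lo, hi = raw[col].min(), raw[col].max()
    raw[f"{col}_norm"] = (raw[col] - lo) / (hi - lo)

# Replace classes with too few members by a single "Other" label.
counts = raw["status"].value_counts()
rare = counts[counts < MIN_CLASS_SIZE].index
raw["status_grouped"] = raw["status"].where(~raw["status"].isin(rare), "Other")

print(raw[["age", "longitude_norm", "latitude_norm", "status_grouped"]])
```

Min-max scaling depends on the minimum and maximum seen in the training data, so the same bounds would need to be reused when normalizing coordinates for new predictions.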
Model Training
- Train/Test Split: 80/20
- Cross-validation: Not used (small dataset)
- Class Balancing: Rare classes grouped as "Other"
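A training run with these settings could be sketched as below. The data is synthetic (random features and a made-up three-class label distribution), so the printed accuracy says nothing about the real model. The use of `stratify=y` is an assumption suggested by the note about stratification issues: scikit-learn refuses to stratify when a class has only one member, which is one reason to group rare classes first.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 129 x 8 feature matrix and imbalanced labels.
rng = np.random.default_rng(0)
X = rng.random((129, 8))
y = np.array(["Existing"] * 103 + ["Existing (new)"] * 16 + ["Other"] * 10)

# 80/20 split; stratification keeps class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameters as listed under Model Architecture.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print(len(X_train), len(X_test))  # 103 26
print(f"Accuracy on synthetic data: {accuracy_score(y_test, model.predict(X_test)):.2%}")
```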
Limitations
- Small Dataset: Only 129 samples, which limits generalization
- Class Imbalance: 79.8% of centers have "Existing" status, so always predicting "Existing" would already be right about 80% of the time
- Geographic Bias: Model trained only on Singapore data
- Temporal Bias: Data from specific time period
- Feature Limitations: Only basic features used
Ethical Considerations
- Model should not be used for discriminatory purposes
- Results should be interpreted with caution due to small dataset
- Consider local context when applying to other regions
- Regular retraining recommended as new data becomes available
Citation
@misc{singapore-hawker-centers-model,
title={Singapore Hawker Centers Status Prediction Model},
author={Custom AI Studio},
year={2024},
url={https://huggingface.co/your-username/singapore-hawker-centers-model}
}
License
This model is released under the MIT License.
Contact
For questions or issues, please open an issue on the model repository.