Singapore Hawker Centers Status Prediction Model

Model Description

This model predicts the status of a Singapore hawker center (for example, "Existing" or "Under Construction") from its age, location, and number of stalls. It uses a Random Forest classifier trained on 129 centers.

Model Performance

  • Accuracy: 88.46% on the held-out 20% test split (consistent with 23 of 26 test centers classified correctly)
  • Algorithm: Random Forest Classifier
  • Training Data: 129 Singapore hawker centers
  • Features: 8 engineered features

Model Details

Model Architecture

  • Type: Random Forest Classifier
  • Number of Estimators: 100
  • Random State: 42

Training Data

The model was trained on a GeoJSON dataset of Singapore hawker centers, containing:

  • 129 hawker centers
  • Location coordinates (longitude, latitude)
  • Number of food stalls
  • Completion dates
  • Status information

Feature Engineering

  1. Age: Calculated from completion year
  2. Location: Normalized longitude and latitude
  3. Stalls: Number of cooked food stalls
  4. Size Categories: Small (<20), Medium (20-50), Large (51-100), Very Large (>100), one-hot encoded as four binary indicators
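
The feature groups above can be sketched in plain Python. This is a minimal illustration, not the original pipeline: the function names, the reference year used for age, and the choice of min-max scaling are assumptions. How the boundary values 20, 50 and 100 are assigned is also an assumption; the sketch follows the usage example in this card, which treats 50 stalls as Medium.

```python
REFERENCE_YEAR = 2024  # assumption: the year used to compute age is not stated


def center_age(completion_year, reference_year=REFERENCE_YEAR):
    """Age in years, calculated from the completion year."""
    return reference_year - completion_year


def min_max_normalize(values):
    """Scale a list of numbers to the 0-1 range (assumed min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def size_category(stalls):
    """Map a stall count to a size category.

    Upper bounds are treated as inclusive (an assumption), so 50 -> Medium.
    """
    if stalls < 20:
        return "Small"
    if stalls <= 50:
        return "Medium"
    if stalls <= 100:
        return "Large"
    return "Very Large"


# Hypothetical longitudes, for illustration only
longitudes = [103.70, 103.85, 104.00]
normalized = min_max_normalize(longitudes)
print(normalized)
print(size_category(50), center_age(2004))
```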

Target Classes

  • Existing: 103 centers (79.8%)
  • Existing (new): 16 centers (12.4%)
  • Under Construction: 6 centers (4.7%)
  • Existing (replacement): 3 centers (2.3%)
  • Other: 1 center (0.8%)
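
Because the classes are this uneven, it helps to compare any accuracy figure against a majority-class baseline. The arithmetic below uses only the class counts listed above; it is a sanity check, not part of the original training code.

```python
# Class counts as listed above
class_counts = {
    "Existing": 103,
    "Existing (new)": 16,
    "Under Construction": 6,
    "Existing (replacement)": 3,
    "Other": 1,
}

total = sum(class_counts.values())
shares = {label: count / total for label, count in class_counts.items()}

# Accuracy of a classifier that always predicts the most common class
majority_label = max(class_counts, key=class_counts.get)
baseline_accuracy = shares[majority_label]

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"Majority baseline ({majority_label}): {baseline_accuracy:.1%}")
```

On the full dataset this baseline is about 79.8%, so the reported 88.46% accuracy is best read as an improvement of roughly nine percentage points over always predicting "Existing". The baseline on the test split alone may differ slightly.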

Feature Importance

  1. Age: 55.2% - Most important predictor
  2. Latitude: 20.6% - Geographic location matters
  3. Longitude: 9.3% - East-west positioning
  4. Stalls: 8.8% - Size of the center
  5. Size Categories: 4.1% - Categorical size information
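
A fitted scikit-learn random forest exposes one importance value per input column through `feature_importances_`, so a single "Size Categories" figure presumably comes from summing the four one-hot columns. The sketch below shows that aggregation. The helper name and the example values are hypothetical; the column names are taken from the input format documented in this card.

```python
def group_importances(names, importances, prefix="stall_", group_name="Size Categories"):
    """Sum the importances of all one-hot columns that share a prefix."""
    grouped = {}
    for name, value in zip(names, importances):
        key = group_name if name.startswith(prefix) else name
        grouped[key] = grouped.get(key, 0.0) + value
    # Sort from most to least important
    return dict(sorted(grouped.items(), key=lambda item: item[1], reverse=True))


feature_names = [
    "stalls", "age", "longitude_norm", "latitude_norm",
    "stall_Small", "stall_Medium", "stall_Large", "stall_Very Large",
]

# Illustrative values only; with a fitted model, pass model.feature_importances_
example_importances = [0.10, 0.50, 0.10, 0.20, 0.02, 0.04, 0.03, 0.01]

grouped = group_importances(feature_names, example_importances)
for name, value in grouped.items():
    print(f"{name}: {value:.1%}")
```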

Usage

Python Example

import joblib

# Load the trained model
model = joblib.load('hawker_centers_model.pkl')

# Features must be given in exactly this order:
# [stalls, age, longitude_norm, latitude_norm, stall_Small, stall_Medium, stall_Large, stall_Very Large]
features = [50, 20, 0.5, 0.3, 0, 1, 0, 0]

# predict() and predict_proba() expect a 2D input: one row per hawker center
prediction = model.predict([features])[0]
probabilities = model.predict_proba([features])[0]

# Confidence is the probability assigned to the most likely class
confidence = probabilities.max()

print(f"Predicted status: {prediction}")
print(f"Confidence: {confidence:.2%}")

Input Format

  • stalls: Number of food stalls (integer)
  • age: Age of hawker center in years (integer)
  • longitude_norm: Normalized longitude (0-1, float)
  • latitude_norm: Normalized latitude (0-1, float)
  • stall_Small: Binary indicator for small size (0 or 1)
  • stall_Medium: Binary indicator for medium size (0 or 1)
  • stall_Large: Binary indicator for large size (0 or 1)
  • stall_Very Large: Binary indicator for very large size (0 or 1)
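
Since the model receives a bare list, the order of the eight values matters. The helper below is a hypothetical convenience wrapper (it is not part of the released model) that assembles the vector in the order documented above and rejects obviously invalid input.

```python
SIZE_CATEGORIES = ["Small", "Medium", "Large", "Very Large"]


def build_feature_vector(stalls, age, longitude_norm, latitude_norm, size):
    """Return the 8 features in the documented order.

    `size` must be one of SIZE_CATEGORIES; it is expanded into the four
    binary indicator columns.
    """
    if size not in SIZE_CATEGORIES:
        raise ValueError(f"size must be one of {SIZE_CATEGORIES}, got {size!r}")
    for name, value in (("longitude_norm", longitude_norm), ("latitude_norm", latitude_norm)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in the 0-1 range, got {value}")
    one_hot = [1 if size == category else 0 for category in SIZE_CATEGORIES]
    return [stalls, age, longitude_norm, latitude_norm] + one_hot


features = build_feature_vector(50, 20, 0.5, 0.3, "Medium")
print(features)  # [50, 20, 0.5, 0.3, 0, 1, 0, 0]
```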

Training Details

Data Preprocessing

  • Extracted completion years from date strings
  • Calculated age from completion year
  • Normalized coordinates to 0-1 range
  • Created size categories based on stall count
  • Grouped rare classes to avoid stratification issues
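
Two of these steps, year extraction and rare-class grouping, can be sketched with the standard library. The date format, the `min_count` threshold, the example labels and both function names are assumptions; the original preprocessing code is not shown in this card.

```python
import re
from collections import Counter


def extract_year(date_string):
    """Return the first four-digit year (19xx or 20xx) in a string, or None."""
    match = re.search(r"\b(19|20)\d{2}\b", date_string or "")
    return int(match.group(0)) if match else None


def group_rare_classes(labels, min_count=2, other_label="Other"):
    """Replace labels that occur fewer than `min_count` times with `other_label`."""
    counts = Counter(labels)
    return [label if counts[label] >= min_count else other_label for label in labels]


# Hypothetical inputs, for illustration only
print(extract_year("1/4/1976"))  # 1976
print(extract_year("TBC"))       # None

labels = ["Existing", "Existing", "Existing (new)", "Existing (new)", "Proposed"]
grouped_labels = group_rare_classes(labels)
print(grouped_labels)
```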

Model Training

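  • Train/Test Split: 80/20 (about 103 training and 26 test centers)
  • Cross-validation: Not used (small dataset), so the reported accuracy comes from a single split
  • Class Balancing: Rare classes grouped as "Other" (this makes a stratified split possible but does not rebalance the classes)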
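
The training setup described here and under Model Architecture can be sketched with scikit-learn. The data below is synthetic, and the use of `stratify` is an assumption based on the preprocessing notes. The class sizes are also hypothetical: the two smallest documented classes are merged so that every class has at least two members, which a stratified split requires.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the real data: 129 rows, 8 features
X = rng.random((129, 8))
y = np.array(
    ["Existing"] * 103
    + ["Existing (new)"] * 16
    + ["Under Construction"] * 6
    + ["Other"] * 4
)

# 80/20 split; stratify keeps class proportions similar in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameters as documented: 100 trees, random_state=42
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Train: {len(X_train)}, test: {len(X_test)}, accuracy: {accuracy:.2%}")
```

With 129 rows, scikit-learn's 80/20 split leaves 26 test samples, so each individual test prediction moves the accuracy by about 3.8 percentage points. The accuracy printed here is meaningless because the features are random; only the mechanics carry over.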

Limitations

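  1. Small Dataset: Only 129 samples (about 26 in the test split), which limits generalization and makes the accuracy estimate noisy
  2. Class Imbalance: 79.8% of centers are "Existing", so always predicting that class would already score close to 80%
  3. Geographic Bias: Model trained only on Singapore data
  4. Temporal Bias: Data reflects a single point in time; center ages and statuses change as centers are built and replaced
  5. Feature Limitations: Only age, location, and stall count are used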

Ethical Considerations

  • Model should not be used for discriminatory purposes
  • Results should be interpreted with caution due to small dataset
  • Consider local context when applying to other regions
  • Regular retraining recommended as new data becomes available

Citation

@misc{singapore-hawker-centers-model,
  title={Singapore Hawker Centers Status Prediction Model},
  author={Custom AI Studio},
  year={2024},
  url={https://huggingface.co/your-username/singapore-hawker-centers-model}
}

License

This model is released under the MIT License.

Contact

For questions or issues, please open an issue on the model repository.
