---
title: Gambling Site Detector
emoji: 🐢
colorFrom: gray
colorTo: purple
sdk: gradio
sdk_version: 5.27.0
app_file: app.py
pinned: false
license: mit
short_description: Detects whether a website is related to gambling or not
---

# 🕵️ Indonesian Gambling Website Detection - Hugging Face Space

This Space detects whether a website is related to gambling, based on a screenshot of the site and the text extracted from that screenshot by OCR.

## Features

- Single URL Prediction
- Batch URLs Prediction
- Screenshot capture using an external API (ApiFlash)
- OCR extraction using a hybrid approach (OCR.Space API + EasyOCR fallback)
- Multimodal model (image + text fusion)

## Instructions

1. Enter a website URL or upload a `.txt` file containing multiple URLs (one URL per line).
2. The system will:
   - Take a screenshot of the website.
   - Extract text from the screenshot using OCR (OCR.Space API, with EasyOCR as a fallback).
   - Predict whether it is a Gambling or Non-Gambling site.

## Model

- Fusion model (`best_fusion_model.pth`) built on EfficientNet (image) + IndoBERT (text).

## Deployment

This Space requires:

- Gradio
- Torch
- Transformers
- EasyOCR
- Pillow
- Pandas
- Requests

## Important Notes

⚡ **Inference may take longer than expected** because this Space runs on **CPU-only hardware**. Performance is significantly slower than in GPU-enabled environments such as Google Colab.

Each prediction can take several minutes because the multimodal fusion model processes both the image and the text.

For faster performance, consider running the model locally with a GPU, or upgrading to a GPU-enabled Hugging Face Space.
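
## Example Sketch

The batch input format (one URL per line) and the hybrid OCR approach (a primary engine with a fallback) described above can be illustrated with a minimal sketch. This is not the code in `app.py`; the function names `parse_url_list` and `extract_text_with_fallback` are hypothetical, and the two OCR engines are passed in as plain callables so that no particular API signature is assumed.

```python
from typing import Callable, List


def parse_url_list(text: str) -> List[str]:
    """Return the URLs from the contents of a .txt upload (one URL per line).

    Hypothetical helper: surrounding whitespace is stripped and blank
    lines are skipped.
    """
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if url:
            urls.append(url)
    return urls


def extract_text_with_fallback(
    image_path: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Run the primary OCR engine; use the fallback if it fails or is empty.

    Hypothetical helper: `primary` and `fallback` are any callables that
    take a screenshot path and return the recognized text.
    """
    try:
        text = primary(image_path)
        if text and text.strip():
            return text.strip()
    except Exception:
        # Assumption: any primary failure (network error, quota, timeout)
        # should fall through to the fallback engine.
        pass
    return fallback(image_path).strip()
```

In this Space, `primary` would presumably wrap a call to the OCR.Space API and `fallback` a local EasyOCR reader. Keeping them as injected callables means the fallback logic can be exercised without network access.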