# Zero-Shot Model Selection Feature

## Overview

You can now **choose which AI model** to use for zero-shot classification! This lets you balance accuracy against speed based on your needs.

## Available Zero-Shot Models

### 1. **BART-large-MNLI** (Current Default)
- **Size**: 400M parameters
- **Speed**: Slow
- **Best for**: Maximum accuracy, works out of the box
- **Description**: Large sequence-to-sequence model with excellent zero-shot performance
- **Model ID**: `facebook/bart-large-mnli`

### 2. **DeBERTa-v3-base-MNLI** ⭐ **Recommended**
- **Size**: 86M parameters (about 4.5x smaller than BART)
- **Speed**: Fast
- **Best for**: Fast zero-shot classification with good accuracy
- **Description**: DeBERTa trained on NLI datasets; excellent zero-shot quality at much better speed
- **Model ID**: `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`

### 3. **DistilBART-MNLI**
- **Size**: 134M parameters
- **Speed**: Medium
- **Best for**: Balanced zero-shot performance
- **Description**: Distilled BART for zero-shot; a good balance of speed and accuracy
- **Model ID**: `valhalla/distilbart-mnli-12-3`
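As a rough sketch, the three options above could live in a small registry mapping a settings key to its Hugging Face model ID. Only the `deberta-v3-base-mnli` key is confirmed by the API example later in this guide; the other keys here are illustrative assumptions:

```python
# Hypothetical registry of the three models above. Only the
# "deberta-v3-base-mnli" key is confirmed by the API docs; the
# other keys are assumptions for illustration.
ZERO_SHOT_MODELS = {
    "bart-large-mnli": "facebook/bart-large-mnli",
    "deberta-v3-base-mnli": "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
    "distilbart-mnli": "valhalla/distilbart-mnli-12-3",
}

def resolve_model_id(model_key: str) -> str:
    """Map a settings key to its Hugging Face model ID."""
    try:
        return ZERO_SHOT_MODELS[model_key]
    except KeyError:
        raise ValueError(f"Unknown zero-shot model key: {model_key!r}")
```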
## How to Use

### Step 1: Go to the Training Page
1. Navigate to the **Admin Panel** → **Training** tab
2. Look for the **"Zero-Shot Classification Model"** section at the top

### Step 2: View the Current Model
- The dropdown shows the currently active model
- Below it, you'll see the model's information (size, speed, description)

### Step 3: Change the Model
1. Select a different model from the dropdown
2. The system will ask for confirmation
3. The analyzer will reload with the new model
4. **All future classifications** will use the selected model

### Step 4: Test It
- Go to the **Submissions** page
- Click "Re-analyze" on any submission
- The new model will be used for classification!
## When to Use Each Model

### Use BART-large-MNLI if:
- ✅ Accuracy is more important than speed
- ✅ You have powerful hardware
- ✅ You don't mind waiting a bit longer

### Use DeBERTa-v3-base-MNLI if: ⭐ **RECOMMENDED**
- ✅ You want good accuracy with better speed
- ✅ You're working with many submissions
- ✅ You want to save computational resources
- ✅ You need faster response times

### Use DistilBART-MNLI if:
- ✅ You want something in between
- ✅ You're familiar with BART but need better speed
## Technical Details

### How It Works
1. **Settings Storage**: The selected model is stored in the database (`Settings` table)
2. **Dynamic Loading**: The analyzer checks the setting and loads the selected model
3. **Hot Reload**: When you change models, the analyzer reloads automatically
4. **No Data Loss**: Changing models doesn't affect your training data or fine-tuned models
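The load-and-reload behavior above can be sketched as a thin wrapper around a pipeline loader. The class and method names here are illustrative assumptions, not the app's actual code:

```python
# Illustrative sketch of the hot-reload behavior; names are assumptions.
class ZeroShotAnalyzer:
    def __init__(self, load_pipeline, model_id):
        # load_pipeline would be something like
        # functools.partial(transformers.pipeline, "zero-shot-classification")
        self._load = load_pipeline
        self._model_id = model_id
        self._pipeline = load_pipeline(model_id)

    def set_model(self, model_id: str) -> None:
        """Hot reload: swap pipelines only when the setting actually changed."""
        if model_id != self._model_id:
            self._model_id = model_id
            self._pipeline = self._load(model_id)

    def classify(self, text: str, labels: list[str]):
        return self._pipeline(text, labels)
```

Note that training data and fine-tuned models live elsewhere, so swapping the pipeline object touches neither.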
### Model Persistence
- The selected model remains active even after the app restarts
- Each submission classification uses the currently active zero-shot model
- Fine-tuned models override zero-shot models when deployed

### API Endpoints

**Get Current Model:**
```
GET /admin/api/get-zero-shot-model
```

**Change Model:**
```
POST /admin/api/set-zero-shot-model
Body: {"model_key": "deberta-v3-base-mnli"}
```
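Assuming the app serves these endpoints at its usual base URL (the `localhost` address below is a placeholder), calling them from Python's standard library might look like:

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # placeholder; use your deployment's address

def build_set_model_request(model_key: str) -> urllib.request.Request:
    """POST /admin/api/set-zero-shot-model with a JSON body."""
    return urllib.request.Request(
        f"{BASE_URL}/admin/api/set-zero-shot-model",
        data=json.dumps({"model_key": model_key}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def build_get_model_request() -> urllib.request.Request:
    """GET /admin/api/get-zero-shot-model."""
    return urllib.request.Request(f"{BASE_URL}/admin/api/get-zero-shot-model")

# To actually send a request:
# with urllib.request.urlopen(build_get_model_request()) as resp:
#     print(json.load(resp))
```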
## Performance Comparison

| Model | Parameters | Classification Speed | Relative Accuracy |
|-------|-----------|---------------------|-------------------|
| BART-large-MNLI | 400M | 1x (baseline) | 100% |
| DeBERTa-v3-base-MNLI | 86M | ~4x faster | ~95-98% |
| DistilBART-MNLI | 134M | ~2x faster | ~92-95% |

*Note: Actual performance may vary based on your hardware and text length.*
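To measure speeds on your own hardware, a small timing harness like the one below can compare any two classifiers. The classifier argument here is a stand-in for a loaded zero-shot pipeline:

```python
import time

def avg_seconds_per_call(classify, texts, labels):
    """Average wall-clock seconds per classification over sample texts."""
    start = time.perf_counter()
    for text in texts:
        classify(text, labels)
    return (time.perf_counter() - start) / len(texts)

# Usage sketch: replace the callable with a real pipeline, e.g.
#   pipe = transformers.pipeline("zero-shot-classification",
#                                model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
#   avg_seconds_per_call(lambda t, l: pipe(t, candidate_labels=l), samples, labels)
```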
## Fine-Tuning vs Zero-Shot

### Zero-Shot Model Selection
- **When**: Before you have training data
- **What**: Chooses which pre-trained model to use
- **Where**: Admin → Training → Zero-Shot Classification Model
- **Effect**: Affects all new classifications immediately

### Fine-Tuning Model Selection
- **When**: When training with your labeled data
- **What**: Chooses which model architecture to fine-tune
- **Where**: Admin → Training → Base Model Architecture for Fine-Tuning
- **Effect**: Only affects that specific training run

### Can I use both?
**Yes!** You can:
1. **Select a zero-shot model** (e.g., DeBERTa-v3-base-MNLI) for initial classifications
2. **Fine-tune** using any model (e.g., DeBERTa-v3-small) for better performance
3. **Deploy** the fine-tuned model, which will override the zero-shot model
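The override rule ("a deployed fine-tuned model wins") boils down to a simple fallback, sketched here with assumed names:

```python
from typing import Optional

def active_model_id(deployed_fine_tuned: Optional[str],
                    zero_shot_setting: str) -> str:
    """A deployed fine-tuned model takes precedence; otherwise fall back
    to the zero-shot model chosen in settings."""
    return deployed_fine_tuned or zero_shot_setting
```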
## Troubleshooting

**Q: I changed the model but nothing happened.**
A: The change affects new classifications only. Try clicking "Re-analyze" on a submission to see the new model in action.

**Q: Which model should I choose?**
A: Start with **DeBERTa-v3-base-MNLI**: it's faster than BART with minimal accuracy loss.

**Q: Does this affect my fine-tuned models?**
A: No! Zero-shot models are only used when no fine-tuned model is deployed.

**Q: Can I switch back to BART?**
A: Yes! Just select BART-large-MNLI from the dropdown at any time.

**Q: Will changing models break anything?**
A: No, it's completely safe. Your data, training runs, and fine-tuned models are unaffected.

## Best Practices
1. **Start with DeBERTa-v3-base-MNLI** for better speed
2. **Compare results**: try re-analyzing the same submission with different models
3. **Consider your hardware**: larger models need more RAM
4. **Fine-tune eventually**: zero-shot is great, but fine-tuning is better!
## Example Workflow

```
1. Install app
   ↓
2. Select DeBERTa-v3-base-MNLI (for speed)
   ↓
3. Collect submissions
   ↓
4. Correct categories (builds training data)
   ↓
5. Fine-tune using DeBERTa-v3-small (best for small datasets)
   ↓
6. Deploy fine-tuned model (overrides zero-shot)
   ↓
7. Enjoy better accuracy! 🎉
```
## What's Next?

After selecting your zero-shot model:
- **Collect data**: Let users submit and classify with the selected model
- **Review & correct**: Use the admin panel to fix any misclassifications
- **Build training set**: Corrections are automatically saved
- **Fine-tune**: Once you have 20+ examples, train a custom model
- **Deploy**: Your fine-tuned model will outperform any zero-shot model!

---

**Ready to try it?** Go to Admin → Training and select your model! 🚀

For questions or issues:
1. Check the model info displayed below the dropdown
2. Review this guide
3. Try switching back to BART if issues occur