# Zero-Shot Model Selection Feature

## Overview
You can now choose which AI model to use for zero-shot classification! This lets you balance accuracy against speed based on your needs.
## Available Zero-Shot Models

### 1. BART-large-MNLI (Current Default)
- Size: 400M parameters
- Speed: Slow
- Best for: Maximum accuracy; works out of the box
- Description: Large sequence-to-sequence model with excellent zero-shot performance
- Model ID: `facebook/bart-large-mnli`
### 2. DeBERTa-v3-base-MNLI (Recommended)
- Size: 86M parameters (4.5x smaller than BART)
- Speed: Fast
- Best for: Fast zero-shot classification with good accuracy
- Description: DeBERTa trained on NLI datasets; excellent zero-shot accuracy at better speed
- Model ID: `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`
### 3. DistilBART-MNLI
- Size: 134M parameters
- Speed: Medium
- Best for: Balanced zero-shot performance
- Description: Distilled BART; a good balance of speed and accuracy
- Model ID: `valhalla/distilbart-mnli-12-3`
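Internally, options like these are often kept in a small registry that maps a short key to the Hugging Face model ID to load. The sketch below is an assumption about how that might look, not the app's actual schema: the model IDs come from the list above, and the `deberta-v3-base-mnli` key appears in the API example later in this guide, but the other keys and fields are illustrative. The resolved ID is what you would hand to a zero-shot pipeline, e.g. `transformers.pipeline("zero-shot-classification", model=...)`.

```python
# Hypothetical registry of the three zero-shot options; only the Hugging Face
# model IDs (and the "deberta-v3-base-mnli" key) are taken from this guide.
ZERO_SHOT_MODELS = {
    "bart-large-mnli": {
        "model_id": "facebook/bart-large-mnli",
        "params": "400M",
        "speed": "slow",
    },
    "deberta-v3-base-mnli": {
        "model_id": "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
        "params": "86M",
        "speed": "fast",
    },
    "distilbart-mnli": {
        "model_id": "valhalla/distilbart-mnli-12-3",
        "params": "134M",
        "speed": "medium",
    },
}

def resolve_model_id(key: str) -> str:
    """Map a registry key to the Hugging Face model ID to load."""
    if key not in ZERO_SHOT_MODELS:
        raise ValueError(f"Unknown zero-shot model key: {key}")
    return ZERO_SHOT_MODELS[key]["model_id"]
```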
## How to Use

### Step 1: Go to the Training Page
- Navigate to the Admin Panel → Training tab
- Look for the "Zero-Shot Classification Model" section at the top
### Step 2: View the Current Model
- The dropdown shows the currently active model
- Below it, you'll see model information (size, speed, description)
### Step 3: Change the Model
- Select a different model from the dropdown
- The system will ask for confirmation
- The analyzer will reload with the new model
- All future classifications will use the selected model
### Step 4: Test It
- Go to Submissions page
- Click "Re-analyze" on any submission
- The new model will be used for classification!
## When to Use Each Model

**Use BART-large-MNLI if:**
- Accuracy is more important than speed
- You have powerful hardware
- You don't mind waiting a bit longer

**Use DeBERTa-v3-base-MNLI if:** (recommended)
- You want good accuracy with better speed
- You're working with many submissions
- You want to save computational resources
- You need faster response times

**Use DistilBART-MNLI if:**
- You want something in between
- You're familiar with BART but need better speed
## Technical Details

### How It Works
- Settings storage: the selected model is stored in the database (`Settings` table)
- Dynamic loading: the analyzer checks the setting and loads the selected model
- Hot reload: when you change models, the analyzer reloads automatically
- No data loss: changing models doesn't affect your training data or fine-tuned models
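The check-setting-then-reload behavior described above can be sketched in a few lines. This is a minimal illustration assuming a plain key/value settings store; the class and method names are hypothetical, not the app's real API, and the real `_load` would build a transformers zero-shot pipeline rather than just record the ID.

```python
# Sketch of dynamic loading + hot reload against a key/value settings store.
# All names here are illustrative, not the app's actual code.
class ZeroShotAnalyzer:
    DEFAULT_MODEL = "facebook/bart-large-mnli"

    def __init__(self, settings: dict):
        self.settings = settings          # stands in for the Settings table
        self.loaded_model_id = None

    def _load(self, model_id: str) -> None:
        # The real app would build a zero-shot classification pipeline here.
        self.loaded_model_id = model_id

    def classify(self, text: str) -> str:
        wanted = self.settings.get("zero_shot_model", self.DEFAULT_MODEL)
        if wanted != self.loaded_model_id:   # hot reload when the setting changed
            self._load(wanted)
        return f"{text!r} classified with {self.loaded_model_id}"
```

Because `classify` re-checks the setting on every call, a model change takes effect on the next classification without restarting the app, which matches the "hot reload" behavior above.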
### Model Persistence
- The selected model remains active even after app restart
- Each submission classification uses the currently active zero-shot model
- Fine-tuned models override zero-shot models when deployed
### API Endpoints

Get the current model:

```
GET /admin/api/get-zero-shot-model
```

Change the model:

```
POST /admin/api/set-zero-shot-model
Body: {"model_key": "deberta-v3-base-mnli"}
```
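One way a client might call the set-model endpoint from Python's standard library; the base URL and port are placeholders for wherever the app is hosted, and the helper name is ours, not part of the app.

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # placeholder: adjust to your deployment

def build_set_model_request(model_key: str) -> request.Request:
    """Build the POST request that switches the active zero-shot model."""
    body = json.dumps({"model_key": model_key}).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/admin/api/set-zero-shot-model",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires the app to be running):
# with request.urlopen(build_set_model_request("deberta-v3-base-mnli")) as resp:
#     print(resp.read())
```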
## Performance Comparison
| Model | Parameters | Classification Speed | Relative Accuracy |
|---|---|---|---|
| BART-large-MNLI | 400M | 1x (baseline) | 100% |
| DeBERTa-v3-base-MNLI | 86M | ~4x faster | ~95-98% |
| DistilBART-MNLI | 134M | ~2x faster | ~92-95% |
*Note: actual performance may vary based on your hardware and text length.*
## Fine-Tuning vs. Zero-Shot

### Zero-Shot Model Selection
- When: Before you have training data
- What: Chooses which pre-trained model to use
- Where: Admin → Training → Zero-Shot Classification Model
- Effect: Affects all new classifications immediately
### Fine-Tuning Model Selection
- When: When training with your labeled data
- What: Chooses which model architecture to fine-tune
- Where: Admin → Training → Base Model Architecture for Fine-Tuning
- Effect: Only affects that specific training run
### Can I use both?
Yes! You can:
- Select a zero-shot model (e.g., DeBERTa-v3-base-MNLI) for initial classifications
- Fine-tune using any model (e.g., DeBERTa-v3-small) for better performance
- Deploy the fine-tuned model, which will override the zero-shot model
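The precedence rule above (a deployed fine-tuned model overrides the zero-shot model) amounts to a one-line resolver. The function and argument names below are hypothetical, chosen only to illustrate the rule:

```python
from typing import Optional

def active_model(deployed_fine_tuned: Optional[str], zero_shot_model: str) -> str:
    """Return the model that will actually classify new submissions:
    a deployed fine-tuned model wins; otherwise fall back to zero-shot."""
    return deployed_fine_tuned or zero_shot_model
```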
## Troubleshooting

**Q: I changed the model but nothing happened.**
A: The change affects new classifications only. Click "Re-analyze" on a submission to see the new model in action.

**Q: Which model should I choose?**
A: Start with DeBERTa-v3-base-MNLI: it's faster than BART with minimal accuracy loss.

**Q: Does this affect my fine-tuned models?**
A: No. Zero-shot models are only used when no fine-tuned model is deployed.

**Q: Can I switch back to BART?**
A: Yes. Just select BART-large-MNLI from the dropdown at any time.

**Q: Will changing models break anything?**
A: No, it's completely safe. Your data, training runs, and fine-tuned models are unaffected.
## Best Practices
- Start with DeBERTa-v3-base-MNLI for better speed
- Compare results - try re-analyzing the same submission with different models
- Consider your hardware - larger models need more RAM
- Fine-tune eventually - zero-shot is great, but fine-tuning is better!
## Example Workflow

1. Install the app
2. Select DeBERTa-v3-base-MNLI (for speed)
3. Collect submissions
4. Correct categories (builds training data)
5. Fine-tune using DeBERTa-v3-small (best for small datasets)
6. Deploy the fine-tuned model (overrides zero-shot)
7. Enjoy better accuracy!
## What's Next?
After selecting your zero-shot model:
- Collect data: Let users submit and classify with the selected model
- Review & correct: Use the admin panel to fix any misclassifications
- Build training set: Corrections are automatically saved
- Fine-tune: Once you have 20+ examples, train a custom model
- Deploy: Your fine-tuned model will outperform any zero-shot model!
Ready to try it? Go to Admin → Training and select your model!
For questions or issues:
- Check the model info displayed below the dropdown
- Review this guide
- Try switching back to BART if issues occur