# Zero-Shot Model Selection Feature
## Overview
You can now **choose which AI model** to use for zero-shot classification! This lets you balance accuracy against speed based on your needs.
## Available Zero-Shot Models
### 1. **BART-large-MNLI** (Current Default)
- **Size**: 400M parameters
- **Speed**: Slow
- **Best for**: Maximum accuracy, works out of the box
- **Description**: Large sequence-to-sequence model, excellent zero-shot performance
- **Model ID**: `facebook/bart-large-mnli`
### 2. **DeBERTa-v3-base-MNLI** ⭐ **Recommended**
- **Size**: 86M parameters (4.5x smaller than BART)
- **Speed**: Fast
- **Best for**: Fast zero-shot classification with good accuracy
- **Description**: DeBERTa-v3 trained on NLI datasets; excellent zero-shot accuracy with much better speed than BART
- **Model ID**: `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`
### 3. **DistilBART-MNLI**
- **Size**: 134M parameters
- **Speed**: Medium
- **Best for**: Balanced zero-shot performance
- **Description**: Distilled BART for zero-shot, good balance of speed and accuracy
- **Model ID**: `valhalla/distilbart-mnli-12-3`
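All three entries are standard Hugging Face model IDs, so you can also try them directly with the `transformers` zero-shot pipeline outside the app. A minimal sketch (the candidate labels below are illustrative only, not the app's actual category list):

```python
# Minimal sketch: loading one of the listed model IDs with the standard
# Hugging Face zero-shot classification pipeline.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
)

# The candidate labels here are placeholders; the app defines its own categories.
result = classifier(
    "We need more bike lanes on the main street.",
    candidate_labels=["transportation", "housing", "public safety"],
)
print(result["labels"][0], result["scores"][0])
```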
## How to Use
### Step 1: Go to Training Page
1. Navigate to the **Admin Panel** → **Training** tab
2. Look for the **"Zero-Shot Classification Model"** section at the top
### Step 2: View Current Model
- The dropdown shows the currently active model
- Below it, you'll see model information (size, speed, description)
### Step 3: Change Model
1. Select a different model from the dropdown
2. The system will ask for confirmation
3. The analyzer will reload with the new model
4. **All future classifications** will use the selected model
### Step 4: Test It
- Go to **Submissions** page
- Click "Re-analyze" on any submission
- The new model will be used for classification!
## When to Use Each Model
### Use BART-large-MNLI if:
- ✅ Accuracy is more important than speed
- ✅ You have powerful hardware
- ✅ You don't mind waiting a bit longer
### Use DeBERTa-v3-base-MNLI if: ⭐ **RECOMMENDED**
- ✅ You want good accuracy with better speed
- ✅ You're working with many submissions
- ✅ You want to save computational resources
- ✅ You need faster response times
### Use DistilBART-MNLI if:
- ✅ You want something in between
- ✅ You like BART's behavior but need better speed
## Technical Details
### How It Works
1. **Settings Storage**: The selected model is stored in the database (`Settings` table)
2. **Dynamic Loading**: The analyzer checks the setting and loads the selected model
3. **Hot Reload**: When you change models, the analyzer reloads automatically
4. **No Data Loss**: Changing models doesn't affect your training data or fine-tuned models
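The sketch below illustrates this flow under stated assumptions: the helper name `get_classifier`, the `ZERO_SHOT_MODELS` mapping, and the non-default model keys are hypothetical stand-ins, not the app's actual identifiers.

```python
# Hypothetical sketch of dynamic model loading with hot reload.
from transformers import pipeline

# Mapping of setting keys to model IDs; only "deberta-v3-base-mnli" is
# confirmed by this guide, the other keys are assumed for illustration.
ZERO_SHOT_MODELS = {
    "bart-large-mnli": "facebook/bart-large-mnli",
    "deberta-v3-base-mnli": "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
    "distilbart-mnli": "valhalla/distilbart-mnli-12-3",
}

_classifier = None
_loaded_key = None

def get_classifier(selected_key: str):
    """Load (or hot-reload) the zero-shot pipeline for the selected model key."""
    global _classifier, _loaded_key
    if _classifier is None or _loaded_key != selected_key:
        _classifier = pipeline(
            "zero-shot-classification",
            model=ZERO_SHOT_MODELS[selected_key],
        )
        _loaded_key = selected_key
    return _classifier
```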
### Model Persistence
- The selected model remains active even after app restart
- Each submission classification uses the currently active zero-shot model
- Fine-tuned models override zero-shot models when deployed
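The override in the last point can be thought of as a simple priority check. A hypothetical sketch (names are illustrative):

```python
# Hypothetical resolution order: a deployed fine-tuned model takes priority,
# otherwise the currently selected zero-shot model is used.
def resolve_active_model(deployed_finetuned_path, selected_zero_shot_key):
    if deployed_finetuned_path is not None:
        return ("fine-tuned", deployed_finetuned_path)
    return ("zero-shot", selected_zero_shot_key)
```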
### API Endpoints
**Get Current Model:**
```
GET /admin/api/get-zero-shot-model
```
**Change Model:**
```
POST /admin/api/set-zero-shot-model
Body: {"model_key": "deberta-v3-base-mnli"}
```
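A quick way to exercise these endpoints is with the `requests` library. This is a sketch: the base URL, any authentication, and the response shape are assumptions; only the routes and the `model_key` body field come from this guide.

```python
# Example calls against the two endpoints documented above.
import requests

BASE = "http://localhost:5000"  # assumed local deployment; adjust as needed

# Read the currently active zero-shot model
current = requests.get(f"{BASE}/admin/api/get-zero-shot-model")
print(current.json())

# Switch to DeBERTa-v3-base-MNLI
resp = requests.post(
    f"{BASE}/admin/api/set-zero-shot-model",
    json={"model_key": "deberta-v3-base-mnli"},
)
print(resp.status_code, resp.json())
```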
## Performance Comparison
| Model | Parameters | Classification Speed | Relative Accuracy |
|-------|-----------|---------------------|-------------------|
| BART-large-MNLI | 400M | 1x (baseline) | 100% |
| DeBERTa-v3-base-MNLI | 86M | ~4x faster | ~95-98% |
| DistilBART-MNLI | 134M | ~2x faster | ~92-95% |
*Note: Actual performance may vary based on your hardware and text length*
## Fine-Tuning vs Zero-Shot
### Zero-Shot Model Selection
- **When**: Before you have training data
- **What**: Chooses which pre-trained model to use
- **Where**: Admin → Training → Zero-Shot Classification Model
- **Effect**: Affects all new classifications immediately
### Fine-Tuning Model Selection
- **When**: When training with your labeled data
- **What**: Chooses which model architecture to fine-tune
- **Where**: Admin → Training → Base Model Architecture for Fine-Tuning
- **Effect**: Only affects that specific training run
### Can I use both?
**Yes!** You can:
1. **Select a zero-shot model** (e.g., DeBERTa-v3-base-MNLI) for initial classifications
2. **Fine-tune** using any model (e.g., DeBERTa-v3-small) for better performance
3. **Deploy** the fine-tuned model, which will override the zero-shot model
## Troubleshooting
**Q: I changed the model, but nothing seems to have happened. Why?**
A: The change affects new classifications. Try clicking "Re-analyze" on a submission to see the new model in action.
**Q: Which model should I choose?**
A: Start with **DeBERTa-v3-base-MNLI** - it's faster than BART with minimal accuracy loss.
**Q: Does this affect my fine-tuned models?**
A: No! Zero-shot models are only used when no fine-tuned model is deployed.
**Q: Can I switch back to BART?**
A: Yes! Just select BART-large-MNLI from the dropdown anytime.
**Q: Will changing models break anything?**
A: No, it's completely safe. Your data, training runs, and fine-tuned models are unaffected.
## Best Practices
1. **Start with DeBERTa-v3-base-MNLI** for better speed
2. **Compare results** - try re-analyzing the same submission with different models
3. **Consider your hardware** - larger models need more RAM
4. **Fine-tune eventually** - zero-shot is great, but fine-tuning is better!
## Example Workflow
```
1. Install app
↓
2. Select DeBERTa-v3-base-MNLI (for speed)
↓
3. Collect submissions
↓
4. Correct categories (builds training data)
↓
5. Fine-tune using DeBERTa-v3-small (best for small datasets)
↓
6. Deploy fine-tuned model (overrides zero-shot)
↓
7. Enjoy better accuracy! 🎉
```
## What's Next?
After selecting your zero-shot model:
- **Collect data**: Let users submit; each submission is classified with the selected model
- **Review & correct**: Use the admin panel to fix any misclassifications
- **Build training set**: Corrections are automatically saved
- **Fine-tune**: Once you have 20+ examples, train a custom model
- **Deploy**: Your fine-tuned model will outperform any zero-shot model!
---
**Ready to try it?** Go to Admin → Training and select your model! 🚀
For questions or issues:
1. Check the model info displayed below the dropdown
2. Review this guide
3. Try switching back to BART if issues occur