# Zero-Shot Model Selection Feature
## Overview
You can now **choose which AI model** to use for zero-shot classification! This lets you balance accuracy against speed based on your needs.
## Available Zero-Shot Models
### 1. **BART-large-MNLI** (Current Default)
- **Size**: 400M parameters
- **Speed**: Slow
- **Best for**: Maximum accuracy, works out of the box
- **Description**: Large sequence-to-sequence model, excellent zero-shot performance
- **Model ID**: `facebook/bart-large-mnli`
### 2. **DeBERTa-v3-base-MNLI** ⭐ **Recommended**
- **Size**: 86M parameters (4.5x smaller than BART)
- **Speed**: Fast
- **Best for**: Fast zero-shot classification with good accuracy
- **Description**: DeBERTa-v3 trained on NLI datasets; excellent zero-shot accuracy with much better speed than BART
- **Model ID**: `MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`
### 3. **DistilBART-MNLI**
- **Size**: 134M parameters
- **Speed**: Medium
- **Best for**: Balanced zero-shot performance
- **Description**: Distilled BART for zero-shot, good balance of speed and accuracy
- **Model ID**: `valhalla/distilbart-mnli-12-3`
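All three entries are standard Hugging Face model IDs, so you can also try them directly with the `transformers` zero-shot pipeline outside the app. A minimal sketch (the candidate labels below are illustrative only, not the app's actual category list):

```python
# Minimal sketch: loading one of the listed model IDs with the standard
# Hugging Face zero-shot classification pipeline.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
)

# The candidate labels here are placeholders; the app defines its own categories.
result = classifier(
    "We need more bike lanes on the main street.",
    candidate_labels=["transportation", "housing", "public safety"],
)
print(result["labels"][0], result["scores"][0])
```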
## How to Use
### Step 1: Go to Training Page
1. Navigate to the **Admin Panel** → **Training** tab
2. Look for the **"Zero-Shot Classification Model"** section at the top
### Step 2: View Current Model
- The dropdown shows the currently active model
- Below it, you'll see model information (size, speed, description)
### Step 3: Change Model
1. Select a different model from the dropdown
2. The system will ask for confirmation
3. The analyzer will reload with the new model
4. **All future classifications** will use the selected model
### Step 4: Test It
- Go to **Submissions** page
- Click "Re-analyze" on any submission
- The new model will be used for classification!
## When to Use Each Model
### Use BART-large-MNLI if:
- ✅ Accuracy is more important than speed
- ✅ You have powerful hardware
- ✅ You don't mind waiting a bit longer
### Use DeBERTa-v3-base-MNLI if: ⭐ **RECOMMENDED**
- ✅ You want good accuracy with better speed
- ✅ You're working with many submissions
- ✅ You want to save computational resources
- ✅ You need faster response times
### Use DistilBART-MNLI if:
- ✅ You want something in between
- ✅ You like BART's behavior but need better speed
## Technical Details
### How It Works
1. **Settings Storage**: The selected model is stored in the database (`Settings` table)
2. **Dynamic Loading**: The analyzer checks the setting and loads the selected model
3. **Hot Reload**: When you change models, the analyzer reloads automatically
4. **No Data Loss**: Changing models doesn't affect your training data or fine-tuned models
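The sketch below illustrates this flow under stated assumptions: the helper name `get_classifier`, the `ZERO_SHOT_MODELS` mapping, and the non-default model keys are hypothetical stand-ins, not the app's actual identifiers.

```python
# Hypothetical sketch of dynamic model loading with hot reload.
from transformers import pipeline

# Mapping of setting keys to model IDs; only "deberta-v3-base-mnli" is
# confirmed by this guide, the other keys are assumed for illustration.
ZERO_SHOT_MODELS = {
    "bart-large-mnli": "facebook/bart-large-mnli",
    "deberta-v3-base-mnli": "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
    "distilbart-mnli": "valhalla/distilbart-mnli-12-3",
}

_classifier = None
_loaded_key = None

def get_classifier(selected_key: str):
    """Load (or hot-reload) the zero-shot pipeline for the selected model key."""
    global _classifier, _loaded_key
    if _classifier is None or _loaded_key != selected_key:
        _classifier = pipeline(
            "zero-shot-classification",
            model=ZERO_SHOT_MODELS[selected_key],
        )
        _loaded_key = selected_key
    return _classifier
```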
### Model Persistence
- The selected model remains active even after app restart
- Each submission classification uses the currently active zero-shot model
- Fine-tuned models override zero-shot models when deployed
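The override in the last point can be thought of as a simple priority check. A hypothetical sketch (names are illustrative):

```python
# Hypothetical resolution order: a deployed fine-tuned model takes priority,
# otherwise the currently selected zero-shot model is used.
def resolve_active_model(deployed_finetuned_path, selected_zero_shot_key):
    if deployed_finetuned_path is not None:
        return ("fine-tuned", deployed_finetuned_path)
    return ("zero-shot", selected_zero_shot_key)
```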
### API Endpoints
**Get Current Model:**
```
GET /admin/api/get-zero-shot-model
```
**Change Model:**
```
POST /admin/api/set-zero-shot-model
Body: {"model_key": "deberta-v3-base-mnli"}
```
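A quick way to exercise these endpoints is with the `requests` library. This is a sketch: the base URL, any authentication, and the response shape are assumptions; only the routes and the `model_key` body field come from this guide.

```python
# Example calls against the two endpoints documented above.
import requests

BASE = "http://localhost:5000"  # assumed local deployment; adjust as needed

# Read the currently active zero-shot model
current = requests.get(f"{BASE}/admin/api/get-zero-shot-model")
print(current.json())

# Switch to DeBERTa-v3-base-MNLI
resp = requests.post(
    f"{BASE}/admin/api/set-zero-shot-model",
    json={"model_key": "deberta-v3-base-mnli"},
)
print(resp.status_code, resp.json())
```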
## Performance Comparison
| Model | Parameters | Classification Speed | Relative Accuracy |
|-------|-----------|---------------------|-------------------|
| BART-large-MNLI | 400M | 1x (baseline) | 100% |
| DeBERTa-v3-base-MNLI | 86M | ~4x faster | ~95-98% |
| DistilBART-MNLI | 134M | ~2x faster | ~92-95% |
*Note: Actual performance may vary based on your hardware and text length*
## Fine-Tuning vs Zero-Shot
### Zero-Shot Model Selection
- **When**: Before you have training data
- **What**: Chooses which pre-trained model to use
- **Where**: Admin → Training → Zero-Shot Classification Model
- **Effect**: Affects all new classifications immediately
### Fine-Tuning Model Selection
- **When**: When training with your labeled data
- **What**: Chooses which model architecture to fine-tune
- **Where**: Admin → Training → Base Model Architecture for Fine-Tuning
- **Effect**: Only affects that specific training run
### Can I use both?
**Yes!** You can:
1. **Select a zero-shot model** (e.g., DeBERTa-v3-base-MNLI) for initial classifications
2. **Fine-tune** using any model (e.g., DeBERTa-v3-small) for better performance
3. **Deploy** the fine-tuned model, which will override the zero-shot model
## Troubleshooting
**Q: I changed the model, but nothing seems to have happened. Why?**
A: The change affects new classifications. Try clicking "Re-analyze" on a submission to see the new model in action.
**Q: Which model should I choose?**
A: Start with **DeBERTa-v3-base-MNLI** - it's faster than BART with minimal accuracy loss.
**Q: Does this affect my fine-tuned models?**
A: No! Zero-shot models are only used when no fine-tuned model is deployed.
**Q: Can I switch back to BART?**
A: Yes! Just select BART-large-MNLI from the dropdown anytime.
**Q: Will changing models break anything?**
A: No, it's completely safe. Your data, training runs, and fine-tuned models are unaffected.
## Best Practices
1. **Start with DeBERTa-v3-base-MNLI** for better speed
2. **Compare results** - try re-analyzing the same submission with different models
3. **Consider your hardware** - larger models need more RAM
4. **Fine-tune eventually** - zero-shot is great, but fine-tuning is better!
## Example Workflow
```
1. Install app
↓
2. Select DeBERTa-v3-base-MNLI (for speed)
↓
3. Collect submissions
↓
4. Correct categories (builds training data)
↓
5. Fine-tune using DeBERTa-v3-small (best for small datasets)
↓
6. Deploy fine-tuned model (overrides zero-shot)
↓
7. Enjoy better accuracy! 🎉
```
## What's Next?
After selecting your zero-shot model:
- **Collect data**: Let users submit; each submission is classified with the selected model
- **Review & correct**: Use the admin panel to fix any misclassifications
- **Build training set**: Corrections are automatically saved
- **Fine-tune**: Once you have 20+ examples, train a custom model
- **Deploy**: Your fine-tuned model will outperform any zero-shot model!
---
**Ready to try it?** Go to Admin → Training and select your model! 🚀
For questions or issues:
1. Check the model info displayed below the dropdown
2. Review this guide
3. Try switching back to BART if issues occur