
Zero-Shot Model Selection Feature

Overview

You can now choose which AI model to use for zero-shot classification! This allows you to balance between accuracy and speed based on your needs.

Available Zero-Shot Models

1. BART-large-MNLI (Current Default)

  • Size: 400M parameters
  • Speed: Slow
  • Best for: Maximum accuracy, works out of the box
  • Description: Large sequence-to-sequence model, excellent zero-shot performance
  • Model ID: facebook/bart-large-mnli

2. DeBERTa-v3-base-MNLI ⭐ Recommended

  • Size: 86M parameters (~4.7x smaller than BART)
  • Speed: Fast
  • Best for: Fast zero-shot classification with good accuracy
  • Description: DeBERTa trained on NLI datasets, excellent zero-shot with better speed
  • Model ID: MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli

3. DistilBART-MNLI

  • Size: 134M parameters
  • Speed: Medium
  • Best for: Balanced zero-shot performance
  • Description: Distilled BART for zero-shot, good balance of speed and accuracy
  • Model ID: valhalla/distilbart-mnli-12-3
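The three options above can be kept in a small registry that maps a settings key to a Hugging Face model ID. This is a hypothetical sketch — the keys and field names are illustrative, not the app's actual schema:

```python
# Hypothetical registry mirroring the model list above;
# keys and field names are illustrative, not the app's real schema.
ZERO_SHOT_MODELS = {
    "bart-large-mnli": {
        "model_id": "facebook/bart-large-mnli",
        "parameters": "400M",
        "speed": "slow",
    },
    "deberta-v3-base-mnli": {
        "model_id": "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
        "parameters": "86M",
        "speed": "fast",
    },
    "distilbart-mnli": {
        "model_id": "valhalla/distilbart-mnli-12-3",
        "parameters": "134M",
        "speed": "medium",
    },
}

def resolve_model_id(key, default="deberta-v3-base-mnli"):
    """Map a stored settings key to a Hugging Face model ID,
    falling back to the recommended default for unknown keys."""
    return ZERO_SHOT_MODELS.get(key, ZERO_SHOT_MODELS[default])["model_id"]

print(resolve_model_id("bart-large-mnli"))  # facebook/bart-large-mnli
```

Falling back to a default instead of raising keeps classification working even if a stale key is left in the database.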

How to Use

Step 1: Go to Training Page

  1. Navigate to Admin Panel → Training tab
  2. Look for the "Zero-Shot Classification Model" section at the top

Step 2: View Current Model

  • The dropdown shows the currently active model
  • Below it, you'll see model information (size, speed, description)

Step 3: Change Model

  1. Select a different model from the dropdown
  2. The system will ask for confirmation
  3. The analyzer will reload with the new model
  4. All future classifications will use the selected model

Step 4: Test It

  • Go to Submissions page
  • Click "Re-analyze" on any submission
  • The new model will be used for classification!

When to Use Each Model

Use BART-large-MNLI if:

  • ✅ Accuracy is more important than speed
  • ✅ You have powerful hardware
  • ✅ You don't mind waiting a bit longer

Use DeBERTa-v3-base-MNLI if: ⭐ RECOMMENDED

  • ✅ You want good accuracy with better speed
  • ✅ You're working with many submissions
  • ✅ You want to save computational resources
  • ✅ You need faster response times

Use DistilBART-MNLI if:

  • ✅ You want something in between
  • ✅ You're familiar with BART but need better speed

Technical Details

How It Works

  1. Settings Storage: The selected model is stored in the database (Settings table)
  2. Dynamic Loading: The analyzer checks the setting and loads the selected model
  3. Hot Reload: When you change models, the analyzer reloads automatically
  4. No Data Loss: Changing models doesn't affect your training data or fine-tuned models

Model Persistence

  • The selected model remains active even after app restart
  • Each submission classification uses the currently active zero-shot model
  • Fine-tuned models override zero-shot models when deployed

API Endpoints

Get Current Model:

GET /admin/api/get-zero-shot-model

Change Model:

POST /admin/api/set-zero-shot-model
Body: {"model_key": "deberta-v3-base-mnli"}
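A request body for the POST endpoint can be built and validated client-side before anything is sent. This sketch uses only the standard library; note that `"deberta-v3-base-mnli"` is the only key shown verbatim in this guide — the other two keys are hypothetical placeholders for the remaining dropdown options.

```python
import json

# "deberta-v3-base-mnli" appears in the example body above; the other
# two keys are hypothetical placeholders for the remaining options.
ALLOWED_KEYS = {"bart-large-mnli", "deberta-v3-base-mnli", "distilbart-mnli"}

def build_set_model_body(model_key):
    """Build the JSON body for POST /admin/api/set-zero-shot-model,
    rejecting unknown keys before any request is made."""
    if model_key not in ALLOWED_KEYS:
        raise ValueError(f"unknown model key: {model_key!r}")
    return json.dumps({"model_key": model_key})

print(build_set_model_body("deberta-v3-base-mnli"))
# {"model_key": "deberta-v3-base-mnli"}
```

Validating before the round-trip gives a clearer error than a generic 400 response from the server.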

Performance Comparison

| Model                | Parameters | Classification Speed | Relative Accuracy |
|----------------------|------------|----------------------|-------------------|
| BART-large-MNLI      | 400M       | 1x (baseline)        | 100%              |
| DeBERTa-v3-base-MNLI | 86M        | ~4x faster           | ~95-98%           |
| DistilBART-MNLI      | 134M       | ~2x faster           | ~92-95%           |

Note: Actual performance varies with your hardware and text length.

Fine-Tuning vs Zero-Shot

Zero-Shot Model Selection

  • When: Before you have training data
  • What: Chooses which pre-trained model to use
  • Where: Admin → Training → Zero-Shot Classification Model
  • Effect: Affects all new classifications immediately

Fine-Tuning Model Selection

  • When: When training with your labeled data
  • What: Chooses which model architecture to fine-tune
  • Where: Admin → Training → Base Model Architecture for Fine-Tuning
  • Effect: Only affects that specific training run

Can I use both?

Yes! You can:

  1. Select a zero-shot model (e.g., DeBERTa-v3-base-MNLI) for initial classifications
  2. Fine-tune using any model (e.g., DeBERTa-v3-small) for better performance
  3. Deploy the fine-tuned model, which will override the zero-shot model

Troubleshooting

Q: I changed the model, but nothing seems to have happened. Why?
A: The change affects new classifications. Try clicking "Re-analyze" on a submission to see the new model in action.

Q: Which model should I choose?
A: Start with DeBERTa-v3-base-MNLI - it's faster than BART with minimal accuracy loss.

Q: Does this affect my fine-tuned models?
A: No! Zero-shot models are only used when no fine-tuned model is deployed.

Q: Can I switch back to BART?
A: Yes! Just select BART-large-MNLI from the dropdown anytime.

Q: Will changing models break anything?
A: No, it's completely safe. Your data, training runs, and fine-tuned models are unaffected.

Best Practices

  1. Start with DeBERTa-v3-base-MNLI for better speed
  2. Compare results - try re-analyzing the same submission with different models
  3. Consider your hardware - larger models need more RAM
  4. Fine-tune eventually - zero-shot is great, but fine-tuning is better!

Example Workflow

1. Install app
   ↓
2. Select DeBERTa-v3-base-MNLI (for speed)
   ↓
3. Collect submissions
   ↓
4. Correct categories (builds training data)
   ↓
5. Fine-tune using DeBERTa-v3-small (best for small datasets)
   ↓
6. Deploy fine-tuned model (overrides zero-shot)
   ↓
7. Enjoy better accuracy! 🎉

What's Next?

After selecting your zero-shot model:

  • Collect data: Let users submit and classify with the selected model
  • Review & correct: Use the admin panel to fix any misclassifications
  • Build training set: Corrections are automatically saved
  • Fine-tune: Once you have 20+ examples, train a custom model
  • Deploy: Your fine-tuned model will outperform any zero-shot model!

Ready to try it? Go to Admin → Training and select your model! 🚀

For questions or issues:

  1. Check the model info displayed below the dropdown
  2. Review this guide
  3. Try switching back to BART if issues occur