HuggingFace Collection Integration - Complete
🎯 Overview
Full integration with HuggingFace Collection for LEMM LoRAs and datasets, including automatic syncing, import/export, and name conflict resolution.
✅ Implemented Features
1. Dataset Import (`import_prepared_dataset`)
- Location: `backend/services/dataset_service.py`
- Purpose: Import prepared datasets from ZIP files
- Features:
  - Supports both root-level and subfolder `dataset_info.json` structures
  - Automatic name conflict resolution with numeric suffixes (`_1`, `_2`, etc.)
  - Validates dataset structure before import
  - Updates metadata with new dataset key if renamed
```python
# Example usage in app.py
def import_dataset(zip_file):
    dataset_service = DatasetService()
    dataset_key = dataset_service.import_prepared_dataset(zip_file)
    return f"✅ Imported dataset: {dataset_key}"
```
2. LoRA Collection Sync (`sync_on_startup`)
- Location: `backend/services/hf_storage_service.py`
- Purpose: Automatically download missing LoRAs from the HF collection on app startup
- Features:
  - Lists all LoRAs in the collection
  - Compares with the local LoRA directory
  - Downloads only missing LoRAs
  - Handles name conflicts with numeric suffixes
  - Logs sync activity
```python
# Called automatically on app startup (app.py line 82)
hf_storage = HFStorageService(username="Gamahea", collection_slug="lemm-100-pre-beta")
sync_result = hf_storage.sync_on_startup(loras_dir=Path("models/loras"))
```
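The compare-then-download step above can be sketched as a pure comparison helper (`find_missing_loras` is a hypothetical name; the real service wraps `huggingface_hub` calls for listing the collection and downloading repos):

```python
from pathlib import Path

def find_missing_loras(collection_names, loras_dir):
    """Return collection LoRA names that have no local copy yet.

    `collection_names` would come from listing the HF collection;
    `loras_dir` holds one subfolder per downloaded LoRA.
    (Illustrative sketch, not the actual service code.)
    """
    loras_dir = Path(loras_dir)
    # Treat each subfolder of loras_dir as one locally available LoRA
    local = {p.name for p in loras_dir.iterdir() if p.is_dir()} if loras_dir.exists() else set()
    return sorted(set(collection_names) - local)
```

Only the names returned here would then be downloaded, so repeated startups stay cheap.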
3. Enhanced LoRA Upload
- Location: `app.py` - `start_lora_training()` function
- Purpose: Upload trained LoRAs to the HF collection with full metadata
- Features:
  - Uploads LoRA to individual model repo
  - Adds to collection automatically
  - Includes training config in metadata
  - Returns repo URL and collection link
  - Graceful error handling (saves locally if upload fails)
```python
# Upload after training (app.py lines 1397-1411)
upload_result = hf_storage.upload_lora(lora_dir, training_config=config)
if upload_result and 'repo_id' in upload_result:
    # Success - show URLs
    progress += f"\n✅ LoRA uploaded successfully!"
    progress += f"\n🔗 Model: {upload_result['repo_id']}"
    progress += f"\n🔗 Collection: https://huggingface.co/collections/Gamahea/lemm-100-pre-beta"
```
📦 Name Conflict Resolution
All import functions implement automatic name conflict resolution:
- First Check: Try the original name
- If Exists: Append `_1`, `_2`, `_3`, etc.
- Update Metadata: Store the new name in `dataset_info.json` or `metadata.json`
- Log Action: Inform the user of the renaming
Example Flow
```
Original: my_dataset
Already exists → my_dataset_1
Already exists → my_dataset_2
Available → Use my_dataset_2 ✅
```
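The flow above boils down to a small loop; a minimal sketch (illustrative helper name, the actual service code may differ):

```python
from pathlib import Path

def resolve_name_conflict(base_dir, name):
    """Append _1, _2, ... to `name` until no folder with that
    name exists under `base_dir` (sketch of the scheme above)."""
    candidate, suffix = name, 0
    while (Path(base_dir) / candidate).exists():
        suffix += 1
        candidate = f"{name}_{suffix}"
    return candidate
```

The same routine covers both dataset and LoRA imports, since each lives in its own subfolder.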
🔄 Automatic Workflows
On App Startup
- Check HF collection for LoRAs
- Compare with the local `models/loras/` directory
- Download any missing LoRAs
- Log sync results
After LoRA Training
- Train LoRA adapter locally
- Upload to HF as individual model repo
- Add to collection
- Return URLs for viewing
Dataset Import
- User uploads ZIP file
- Extract and validate structure
- Check for name conflicts
- Copy to the `training_data/` directory
- Update dropdown lists
🛠️ Technical Details
File Structure Support
LoRA ZIP Files (both supported):
Option 1 (root):

```
my_lora.zip/
├── metadata.json
├── adapter_config.json
└── adapter_model.safetensors
```

Option 2 (subfolder):

```
my_lora.zip/
└── my_lora/
    ├── metadata.json
    ├── adapter_config.json
    └── adapter_model.safetensors
```
Dataset ZIP Files (both supported):
Option 1 (root):

```
my_dataset.zip/
├── dataset_info.json
├── audio/
│   ├── sample_000001.wav
│   └── sample_000002.wav
└── splits.json
```

Option 2 (subfolder):

```
my_dataset.zip/
└── my_dataset/
    ├── dataset_info.json
    ├── audio/
    └── splits.json
```
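Supporting both layouts comes down to locating `dataset_info.json` after extraction; a minimal sketch (hypothetical helper name, actual implementation may differ):

```python
from pathlib import Path

def locate_dataset_root(extract_dir):
    """Return the folder holding dataset_info.json: either the
    extraction root (Option 1) or its single subfolder (Option 2)."""
    root = Path(extract_dir)
    if (root / "dataset_info.json").exists():
        return root
    # Option 2: a lone subfolder wrapping the dataset contents
    subdirs = [p for p in root.iterdir() if p.is_dir()]
    if len(subdirs) == 1 and (subdirs[0] / "dataset_info.json").exists():
        return subdirs[0]
    raise ValueError("dataset_info.json not found in extracted ZIP")
```

The analogous check for LoRA ZIPs would key on `adapter_config.json` instead.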
Error Handling
All import/sync functions include:
- Try-catch blocks for graceful error handling
- Comprehensive logging with context
- User-friendly error messages
- Fallback behavior (e.g., save locally if upload fails)
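The fallback behavior can be sketched as a thin wrapper around the upload call (illustrative only, not the exact app code):

```python
import logging

logger = logging.getLogger("lemm.upload")

def upload_with_fallback(upload_fn, lora_dir, config):
    """Attempt the HF upload; on any failure, log the error with
    context and return None so the caller keeps the local copy."""
    try:
        return upload_fn(lora_dir, training_config=config)
    except Exception as exc:
        logger.warning("Upload failed (%s); LoRA kept locally at %s", exc, lora_dir)
        return None
```

The caller only has to check for `None` to know the LoRA was saved locally rather than published.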
📚 HuggingFace Collection Structure
Collection: Gamahea/lemm-100-pre-beta
- Purpose: Organize all LEMM LoRA adapters
- Visibility: Public
- Items: Individual model repos
Model Repos: Gamahea/lemm-lora-{name}
- Type: LoRA adapters (safetensors)
- Metadata: Training config, dataset info, creation date
- Files: adapter_model.safetensors, adapter_config.json, metadata.json
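A plausible shape for the per-repo `metadata.json`, with field names assumed from the description above (the actual keys may differ):

```json
{
  "name": "my_lora",
  "created_at": "2025-01-01T00:00:00Z",
  "dataset": "my_dataset",
  "training_config": {
    "epochs": 10,
    "learning_rate": 0.0001
  }
}
```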
🎯 User Workflows
Train & Share a LoRA
- Prepare dataset (curated or user audio)
- Configure training parameters
- Click "Start Training"
- Wait for completion
- LoRA automatically uploaded to HF collection
- Share collection link with others
Use Someone's LoRA
- Open LEMM Space
- App automatically syncs LoRAs from collection
- Select LoRA in generation dropdown
- Generate music with custom style
Import a Dataset
- Export dataset from another LEMM instance
- Click "Import Dataset" in training tab
- Upload ZIP file
- Dataset appears in training dropdown
- Use for LoRA training
📁 Related Files
- HF Storage Service: `backend/services/hf_storage_service.py`
- Dataset Service: `backend/services/dataset_service.py`
- Main App: `app.py`
- LoRA Training Service: `backend/services/lora_training_service.py`
📝 Commit History
`17f5813` (latest): Add dataset import & LoRA collection sync
- `import_prepared_dataset()` method
- `sync_on_startup()` method
- Enhanced `upload_lora()` with `training_config`
- Numeric suffix naming for conflicts
`f65e448`: Fixed LoRA import to support both ZIP structures
`2f0c8b4`: Added "Load for Training" workflow
`b40ee5f`: Fixed DataFrame handling in dataset preparation
🎉 Result
Complete HuggingFace ecosystem integration!
- ✅ Auto-sync LoRAs from collection
- ✅ Upload trained LoRAs to collection
- ✅ Import/export datasets
- ✅ Name conflict resolution
- ✅ Comprehensive error handling
- ✅ User-friendly feedback
All three issues from the screenshots are now resolved! 🎉