|
|
--- |
|
|
license: openrail++ |
|
|
library_name: diffusers |
|
|
pipeline_tag: text-to-image |
|
|
tags: |
|
|
- sdxl |
|
|
- text-to-image |
|
|
- image-generation |
|
|
--- |
|
|
|
|
|
<!-- README Version: v1.4 --> |
|
|
|
|
|
# Stable Diffusion XL FP16 Model Repository |
|
|
|
|
|
Local repository containing Stable Diffusion XL (SDXL) checkpoint models in FP16 precision for high-quality text-to-image generation. |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This repository contains two SDXL checkpoint models optimized for different use cases: |
|
|
|
|
|
- **SDXL Base**: Full-featured SDXL 1.0 base model for high-quality image generation with standard inference steps |
|
|
- **SDXL Turbo**: Fast inference variant optimized for fewer steps (1-4 steps) while maintaining quality |
|
|
|
|
|
Both models use FP16 (16-bit floating point) precision, providing a balance between quality and VRAM efficiency. |
|
|
|
|
|
## Repository Contents |
|
|
|
|
|
``` |
|
|
E:\huggingface\sdxl-fp16\ |
|
|
βββ checkpoints/ |
|
|
β βββ sdxl/ |
|
|
β βββ sdxl-base.safetensors (6.94 GB) |
|
|
β βββ sdxl-turbo.safetensors (13.88 GB) |
|
|
βββ diffusion_models/ |
|
|
β βββ sdxl/ (empty - reserved) |
|
|
βββ loras/ |
|
|
βββ sdxl/ (empty - reserved) |
|
|
``` |
|
|
|
|
|
**Total Repository Size**: ~20.82 GB |
|
|
|
|
|
### Model Files |
|
|
|
|
|
| File | Size | Description | |
|
|
|------|------|-------------| |
|
|
| `sdxl-base.safetensors` | 6.94 GB | SDXL 1.0 base checkpoint (FP16) | |
|
|
| `sdxl-turbo.safetensors` | 13.88 GB | SDXL Turbo checkpoint (FP16) | |
|
|
|
|
|
## Hardware Requirements |
|
|
|
|
|
### SDXL Base |
|
|
- **VRAM**: 8GB minimum, 12GB+ recommended |
|
|
- **Disk Space**: 7GB for model file |
|
|
- **System RAM**: 16GB+ recommended |
|
|
- **GPU**: NVIDIA GPU with CUDA support |
|
|
|
|
|
### SDXL Turbo |
|
|
- **VRAM**: 12GB minimum, 16GB+ recommended |
|
|
- **Disk Space**: 14GB for model file |
|
|
- **System RAM**: 16GB+ recommended |
|
|
- **GPU**: NVIDIA GPU with CUDA support |
|
|
|
|
|
## Usage Examples |
|
|
|
|
|
### SDXL Base (Standard Quality) |
|
|
|
|
|
```python |
|
|
from diffusers import DiffusionPipeline |
|
|
import torch |
|
|
|
|
|
# Load SDXL base model from local path |
|
|
pipe = DiffusionPipeline.from_single_file( |
|
|
"E:/huggingface/sdxl-fp16/checkpoints/sdxl/sdxl-base.safetensors", |
|
|
torch_dtype=torch.float16 |
|
|
) |
|
|
|
|
|
pipe.to("cuda") |
|
|
|
|
|
# Generate image with standard settings |
|
|
image = pipe( |
|
|
prompt="a beautiful mountain landscape at sunset, photorealistic, highly detailed", |
|
|
negative_prompt="blurry, low quality, distorted", |
|
|
num_inference_steps=50, |
|
|
guidance_scale=7.5, |
|
|
width=1024, |
|
|
height=1024 |
|
|
).images[0] |
|
|
|
|
|
image.save("output.png") |
|
|
``` |
|
|
|
|
|
### SDXL Turbo (Fast Generation) |
|
|
|
|
|
```python |
|
|
from diffusers import DiffusionPipeline |
|
|
import torch |
|
|
|
|
|
# Load SDXL Turbo for fast inference |
|
|
pipe = DiffusionPipeline.from_single_file( |
|
|
"E:/huggingface/sdxl-fp16/checkpoints/sdxl/sdxl-turbo.safetensors", |
|
|
torch_dtype=torch.float16 |
|
|
) |
|
|
|
|
|
pipe.to("cuda") |
|
|
|
|
|
# Generate with minimal steps (1-4 steps) |
|
|
image = pipe( |
|
|
prompt="a futuristic cityscape at night, neon lights, cyberpunk", |
|
|
num_inference_steps=4, # Turbo optimized for 1-4 steps |
|
|
guidance_scale=0.0, # Turbo works best with guidance_scale=0 |
|
|
width=1024, |
|
|
height=1024 |
|
|
).images[0] |
|
|
|
|
|
image.save("turbo_output.png") |
|
|
``` |
|
|
|
|
|
### Memory Optimization |
|
|
|
|
|
```python |
|
|
import torch |
|
|
from diffusers import DiffusionPipeline |
|
|
|
|
|
# Enable memory-efficient attention |
|
|
pipe = DiffusionPipeline.from_single_file( |
|
|
"E:/huggingface/sdxl-fp16/checkpoints/sdxl/sdxl-base.safetensors", |
|
|
torch_dtype=torch.float16 |
|
|
) |
|
|
|
|
|
# Apply optimizations |
|
|
pipe.enable_attention_slicing() |
|
|
pipe.enable_vae_slicing() |
|
|
pipe.to("cuda") |
|
|
|
|
|
# Generate with optimized memory usage |
|
|
image = pipe( |
|
|
prompt="your prompt here", |
|
|
num_inference_steps=30 |
|
|
).images[0] |
|
|
``` |
|
|
|
|
|
## Model Specifications |
|
|
|
|
|
### SDXL Base |
|
|
- **Architecture**: Latent Diffusion Model with UNet |
|
|
- **Parameters**: ~2.6B (UNet backbone) |
|
|
- **Precision**: FP16 (16-bit floating point) |
|
|
- **Format**: SafeTensors (secure, efficient) |
|
|
- **Resolution**: 1024x1024 native, supports 512-2048px |
|
|
- **Text Encoders**: Dual CLIP (OpenCLIP ViT-bigG, OpenAI CLIP ViT-L) |
|
|
- **Inference Steps**: 30-50 recommended |
|
|
|
|
|
### SDXL Turbo |
|
|
- **Architecture**: Adversarial Diffusion Distillation (ADD) |
|
|
- **Parameters**: Similar to base with distillation optimizations |
|
|
- **Precision**: FP16 (16-bit floating point) |
|
|
- **Format**: SafeTensors |
|
|
- **Resolution**: 1024x1024 native |
|
|
- **Inference Steps**: 1-4 steps (optimized) |
|
|
- **Guidance Scale**: 0.0 recommended (classifier-free guidance disabled) |
|
|
|
|
|
## Performance Tips |
|
|
|
|
|
### Speed Optimization |
|
|
- **SDXL Turbo**: Use 1-4 steps with `guidance_scale=0.0` for fastest generation |
|
|
- **Attention Slicing**: Enable with `pipe.enable_attention_slicing()` for memory efficiency |
|
|
- **VAE Slicing**: Enable with `pipe.enable_vae_slicing()` to reduce VRAM usage |
|
|
- **Lower Resolutions**: Use 768x768 or 512x512 for faster generation |
|
|
- **Batch Processing**: Process multiple prompts together when VRAM allows |
|
|
|
|
|
### Quality Optimization |
|
|
- **SDXL Base**: Use 40-50 steps for highest quality |
|
|
- **Guidance Scale**: 7.0-9.0 for base model (higher = more prompt adherence) |
|
|
- **Negative Prompts**: Use detailed negative prompts to avoid unwanted elements |
|
|
- **Resolution**: 1024x1024 is the native resolution for best results |
|
|
- **Aspect Ratios**: Multiples of 64 recommended (1024x768, 768x1024, etc.) |
|
|
|
|
|
### VRAM Management |
|
|
- **8GB VRAM**: Use attention slicing, VAE slicing, lower batch sizes |
|
|
- **12GB VRAM**: Standard settings with optimizations |
|
|
- **16GB+ VRAM**: Can handle higher resolutions and batch sizes |
|
|
|
|
|
## Changelog |
|
|
|
|
|
### v1.4 (2025-10-28) |
|
|
- Final verification of repository structure and model integrity |
|
|
- Confirmed all file sizes and paths are accurate |
|
|
- Validated YAML frontmatter format and HuggingFace compliance |
|
|
- Documentation verified complete and production-ready |
|
|
|
|
|
### v1.3 (2025-10-28) |
|
|
- Verified repository structure and model file integrity |
|
|
- Confirmed YAML frontmatter compliance with HuggingFace standards |
|
|
- Validated all file paths and sizes |
|
|
- Updated documentation timestamp |
|
|
|
|
|
### v1.2 (2025-10-14) |
|
|
- Fixed YAML frontmatter: removed base_model fields (these are base models, not derived) |
|
|
- Streamlined tags to essential categories only |
|
|
- Improved metadata compliance with Hugging Face standards |
|
|
|
|
|
### v1.1 (2025-10-14) |
|
|
- Updated YAML frontmatter format (metadata now precedes version header) |
|
|
- Optimized tag ordering for better discoverability |
|
|
- Verified all model files and sizes |
|
|
|
|
|
### v1.0 (2025-10-13) |
|
|
- Initial repository documentation |
|
|
- Added SDXL Base checkpoint (6.94 GB) |
|
|
- Added SDXL Turbo checkpoint (13.88 GB) |
|
|
- Organized directory structure for checkpoints, diffusion models, and LoRAs |
|
|
|
|
|
## License |
|
|
|
|
|
**License**: CreativeML Open RAIL++-M License |
|
|
|
|
|
Stable Diffusion XL models are released under the [CreativeML Open RAIL++-M license](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md), which permits commercial use with the following key terms: |
|
|
|
|
|
- β
Commercial use permitted |
|
|
- β
Modification and redistribution allowed |
|
|
- β οΈ Use restrictions apply (see full license) |
|
|
- β οΈ Must include license and attribution |
|
|
|
|
|
**Key Restrictions**: Cannot be used for illegal activities, generating harmful content, or violating privacy rights. See full license for complete terms. |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use these models in your research or applications, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{podell2023sdxl, |
|
|
title={SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis}, |
|
|
author={Dustin Podell and Zion English and Kyle Lacey and Andreas Blattmann and Tim Dockhorn and Jonas MΓΌller and Joe Penna and Robin Rombach}, |
|
|
year={2023}, |
|
|
eprint={2307.01952}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CV} |
|
|
} |
|
|
|
|
|
@inproceedings{sauer2023adversarial, |
|
|
title={Adversarial Diffusion Distillation}, |
|
|
author={Sauer, Axel and Lorenz, Dominik and Blattmann, Andreas and Rombach, Robin}, |
|
|
booktitle={arXiv preprint arXiv:2311.17042}, |
|
|
year={2023} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Official Resources |
|
|
|
|
|
- [SDXL Base Model](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) |
|
|
- [SDXL Turbo Model](https://huggingface.co/stabilityai/sdxl-turbo) |
|
|
- [SDXL Documentation](https://huggingface.co/docs/diffusers/using-diffusers/sdxl) |
|
|
- [Diffusers Library](https://github.com/huggingface/diffusers) |
|
|
- [SDXL Paper](https://arxiv.org/abs/2307.01952) |
|
|
- [SDXL Turbo Paper](https://arxiv.org/abs/2311.17042) |
|
|
|
|
|
## Contact & Support |
|
|
|
|
|
- **Issues**: Report issues with models or documentation on [Hugging Face Discussions](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/discussions) |
|
|
- **Community**: Join [Hugging Face Discord](https://discord.gg/hugging-face) for community support |
|
|
- **Repository**: This is a local storage repository - for upstream issues, see official model pages |
|
|
|
|
|
--- |
|
|
|
|
|
**Repository maintained locally** | Last updated: 2025-10-28 | Version: v1.4 |
|
|
|