duongve/NetaYume-Lumina-Image-2.0

	---
	pipeline_tag: text-to-image
	license: apache-2.0
	base_model:
	- neta-art/Neta-Lumina
	- Alpha-VLLM/Lumina-Image-2.0
	tags:
	- stable-diffusion
	- text-to-image
	- comfyui
	- diffusion-single-file
	---

	# NetaYume Lumina Image v2.0
	![NetaYume Lumina Image v2.0](./Example/Demo_v2.png)

	---
	I. Introduction

	NetaYume Lumina is a text-to-image model fine-tuned from [Neta Lumina](https://huggingface.co/neta-art/Neta-Lumina), a high-quality anime-style image generation model developed by [Neta.art Lab](https://huggingface.co/neta-art). Neta Lumina in turn builds upon [Lumina-Image-2.0](https://huggingface.co/Alpha-VLLM/Lumina-Image-2.0), an open-source base model released by the [Alpha-VLLM](https://huggingface.co/Alpha-VLLM) team at Shanghai AI Laboratory.

	This model was trained to generate both realistic human images and high-quality anime-style images. Although it was fine-tuned on a specific dataset, it retains much of the base model's knowledge.

	Key Features:
	- High-Quality Anime Generation: Generates detailed anime-style images with sharp outlines, vibrant colors, and smooth shading.
	- Improved Character Understanding: Better captures characters, especially those from the Danbooru dataset, resulting in more coherent and accurate character representations.
	- Enhanced Fine Details: Accurately generates accessories, clothing textures, hairstyles, and background elements with greater clarity.


	The file `NetaYume_Lumina_v2_all_in_one.safetensors` bundles the weights for the VAE, text encoder, and image backbone into a single checkpoint for use with ComfyUI.
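To see what a single-file checkpoint bundles without loading any weights, one option is to read the safetensors header: an unsigned little-endian 64-bit length followed by a JSON table of tensor names. The sketch below uses only the standard library. The helper names are hypothetical, and grouping by the first dot-separated name segment is an assumption about how the components are laid out in this file, not something this card confirms.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def count_top_level_prefixes(header):
    """Count tensors per first dot-separated name segment (illustrative grouping)."""
    # "__metadata__" is an optional free-form entry, not a tensor.
    names = [key for key in header if key != "__metadata__"]
    return Counter(name.split(".", 1)[0] for name in names)


# Usage, assuming the checkpoint has been downloaded to the working directory:
# header = read_safetensors_header("NetaYume_Lumina_v2_all_in_one.safetensors")
# for prefix, count in count_top_level_prefixes(header).most_common():
#     print(f"{prefix}: {count} tensors")
```

If the printed prefixes do not map cleanly onto VAE, text encoder, and backbone, the grouping key would need to be adjusted to match the file's actual naming.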

	---

	II. Model Components & Training Details
	- Text Encoder: Pre-trained Gemma-2-2b
	- Variational Autoencoder: Pre-trained VAE from Flux.1 dev
	- Image Backbone: Fine-tuned from Neta Lumina's image backbone

	---

	III. Suggestions

	System Prompt: A system prompt helps the model understand and align with your prompt, making it easier to generate the image you want.

	For anime-style images using Danbooru tags:
	- `You are an assistant designed to generate anime images based on textual prompts.`
	- `You are an assistant designed to generate high-quality images based on user prompts and danbooru tags.`
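As a rough illustration of how a system prompt and a tag prompt might be joined into one text input, the sketch below prepends one of the system prompts above to the user's tags. The ` <Prompt Start> ` separator is an assumption taken from the Lumina-Image-2.0 prompt convention and is not stated in this card. The dictionary keys and the `build_prompt` helper are hypothetical names.

```python
# Separator assumed from the Lumina-Image-2.0 convention; this card does not specify it.
PROMPT_SEPARATOR = " <Prompt Start> "

# The two system prompts listed in this card; the keys are illustrative labels only.
SYSTEM_PROMPTS = {
    "anime": "You are an assistant designed to generate anime images "
             "based on textual prompts.",
    "danbooru": "You are an assistant designed to generate high-quality images "
                "based on user prompts and danbooru tags.",
}


def build_prompt(user_prompt, style="anime"):
    """Join the chosen system prompt and the user's prompt into one string."""
    system_prompt = SYSTEM_PROMPTS[style]
    return f"{system_prompt}{PROMPT_SEPARATOR}{user_prompt.strip()}"


# The example tags below are made up for illustration.
print(build_prompt("1girl, silver hair, school uniform, cherry blossoms", style="danbooru"))
```

The resulting string would go into the positive text-encode node in ComfyUI. Whether the separator is required depends on the workflow, so treat it as a starting point.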

	Recommended Settings
	- CFG: 4–7
	- Sampling Steps: 40–50
	- Sampler:
	  - Euler a (with scheduler: normal)
	  - res_multistep (with scheduler: linear_quadratic)
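The recommended settings can be captured as small presets so that a script or workflow generator stays inside the suggested ranges. This is a minimal sketch. The field names mirror the inputs of ComfyUI's KSampler node (`steps`, `cfg`, `sampler_name`, `scheduler`), and mapping "Euler a" to `euler_ancestral` is an assumption about ComfyUI's naming. The default values of 40 steps and CFG 5.0 are arbitrary picks within the recommended ranges, and the helper names are hypothetical.

```python
# Recommended ranges from this card (inclusive bounds).
RECOMMENDED = {"cfg": (4.0, 7.0), "steps": (40, 50)}

# Sampler/scheduler pairings from this card; "euler_ancestral" is assumed to be
# the ComfyUI identifier for "Euler a".
SAMPLER_PRESETS = {
    "euler_a": {"sampler_name": "euler_ancestral", "scheduler": "normal"},
    "res_multistep": {"sampler_name": "res_multistep", "scheduler": "linear_quadratic"},
}


def make_settings(preset, steps=40, cfg=5.0):
    """Build a settings dict for one preset; the defaults are illustrative."""
    return dict(SAMPLER_PRESETS[preset], steps=steps, cfg=cfg)


def out_of_range(settings):
    """Return the names of settings that fall outside the recommended ranges."""
    return [
        key
        for key, (low, high) in RECOMMENDED.items()
        if not low <= settings[key] <= high
    ]


settings = make_settings("res_multistep", steps=45, cfg=5.5)
print(settings, out_of_range(settings))
```

Values outside the ranges are reported rather than rejected, since the card presents them as recommendations, not hard limits.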

	---
	IV. Acknowledgments
	- [narugo1992](https://huggingface.co/narugo) – for the invaluable Danbooru dataset
	- [Alpha-VLLM](https://huggingface.co/Alpha-VLLM) – for creating a wonderful model
	- [Neta.art](https://huggingface.co/neta-art/Neta-Lumina) and team – for openly sharing an awesome model