---
license: apache-2.0
base_model: Wan-AI/Wan2.1-T2V-14B
tags:
  - wan
  - video
  - text-to-video
  - diffusion-pipe
  - lora
  - template:sd-lora
  - standard
library_name: diffusers
pipeline_tag: text-to-video
instance_prompt: Jodorowsky psychedelic montage 1970s film by jodorowsky
widget:
  - text: Wan2.1 LoRA:Run3@2950
    output:
      url: samples/1755569403645__000002950_0.webp
  - text: Wan2.1 LoRA:Run3@3450
    output:
      url: 3450_Run3_ii.mov
  - text: Wan2.1 LoRA:Run3@2750
    output:
      url: samples/1755566258585__000002750_0.webp
  - text: Wan2.1 LoRA:Run2@1600
    output:
      url: samples/1755545990540__000001600_0.webp
  - text: Wan2.1 LoRA:Run1@1000
    output:
      url: 1000_Wan21i.mov
  - text: Wan2.1 LoRA:Run3@2450
    output:
      url: samples/1755561576004__000002450_0.webp
  - text: Wan2.1 LoRA:Run3@3450
    output:
      url: 3450_Run3_i.mov
  - text: Wan2.1 LoRA:Run3@2350
    output:
      url: samples/1755560004870__000002350_0.webp
  - text: Wan2.1 LoRA:Run2_pEMA_SigmaRel0.19
    output:
      url: 2000pEMA019_Run2_i.mov
  - text: Wan2.2 LoRA:Run1@1000
    output:
      url: Wan22_St1000_jodorowsky1.mov
---

# ALEJANDRO JODOROWSKY's CINE-SURREELS

## A Low(ish)-Rank Adapter (LoRA) for Wan2.* 14B Text-to-Video Models

*||| By SilverAgePoets.com |||*

An artistically specialized, fine-tuned rank-16 low-rank adapter (LoRA) for text-to-video generation with the 14-billion-parameter Wan2.1, Wan2.2, and derived base models.
This LoRA was trained on a custom dataset of video clips from classic films by Alejandro Jodorowsky: the great filmmaker, artist, author, psychoanalyst, sage, & occultist/psyche-mage...


To reinforce the adapter, prepend to (or amend) your prompts:

[Jodorowsky] psychedelic montage 1970s film by Alejandro Jodorowsky, etc...
Other suggested prompt-charms: surrealist occult cinema, eclectically collaged scene, dynamic motion, kodachrome, classic countercultural movie, experimental arthouse analog footage, etc...

### Training/Usage Notes

The training, orchestrated via Ostris' ai-toolkit trainer, was conducted in several stages/runs, with each pause/restart involving a partial swap of the trained-on clips and substantial modifications of hyperparameters:

- **Run 1:** steps 0–1000, with `lr: 1e-4`, `content_or_style: content` (high-noise-stage emphasis), and medium-resolution samples.
- **Run 2:** steps 1001–1800, with `content_or_style: balanced` (balanced noise schedule), `lr: 9e-5`, and lower-resolution, partially swapped-out, more numerous samples.
- **Run 3:** steps 1801–3450, with higher-resolution samples than the previous runs, plus training of the conv & conv_alpha networks (in addition to linear) at rank 16, `force_consistent_noise: true`, `content_or_style: content` (high-noise-stage emphasis), and `lr: 1e-4`.
This adapter works with both Wan2.1 and Wan2.2. In our tests, Euler schedulers work best; lower "shift" values typically yield more realism/analog quality, depending on other factors.

### With Accelerated Inference LoRAs

- All checkpoints are confirmed to work with the Wan2.1 Self-Forcing (T2V), FastWan, and CausVid accelerator adapter LoRAs.
- Checkpoints from Run 3 (but not Runs 1 or 2) are confirmed to work with the Wan2.2 T2V Lightning/4-step adapter (for the Low Noise Expert transformer).
- All checkpoints are also likely to work with the Wan2.2 High Noise T2V Lightning/4-step adapter, albeit with worse quality/detailing/chromatic range.