Wan2.1-T2V-CausVid is a text-to-video generation model built on the Wan2.1-T2V foundation model and enhanced with CausVid's causal diffusion approach. This combination lets the model generate high-quality, temporally consistent videos from text prompts. Causal diffusion generates video autoregressively, which helps the model produce coherent long-form videos and addresses the temporal consistency limitations common in traditional diffusion models. The approach also needs significantly fewer inference steps, which substantially reduces generation time while maintaining high-quality output.
Wan21 CausVid LoRA
LORA
Version Detail
WAN_2_1_14B
A further pruned version that keeps only the attention layers and drops the first block. It fixes flashing and retains motion better, but it needs more steps and can also benefit from CFG (classifier-free guidance).
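The source does not say how this version was pruned. As a rough illustration only, the sketch below filters a LoRA state dict so that it keeps attention-layer entries and drops everything in block 0. The key layout (`blocks.<index>.<module>...`), the module names `self_attn` and `cross_attn`, and the helper `prune_lora_keys` are all assumptions made for the example; the actual checkpoint may name its keys differently.

```python
import re

# Hypothetical key layout: "blocks.<index>.<module>.<param>".
# Real checkpoint keys may differ; adjust the pattern to match them.
BLOCK_KEY = re.compile(r"^blocks\.(\d+)\.(.+)$")


def prune_lora_keys(state_dict, attn_markers=("self_attn", "cross_attn"), skip_blocks=(0,)):
    """Return a new dict with only attention-layer entries outside skipped blocks."""
    pruned = {}
    for key, value in state_dict.items():
        match = BLOCK_KEY.match(key)
        if match is None:
            continue  # entry is not inside a numbered block
        block_index = int(match.group(1))
        parts = match.group(2).split(".")
        if block_index in skip_blocks:
            continue  # drop the first block entirely
        if not any(marker in parts for marker in attn_markers):
            continue  # drop non-attention layers (e.g. feed-forward)
        pruned[key] = value
    return pruned


# Toy state dict with placeholder values instead of tensors.
example = {
    "blocks.0.self_attn.q.lora_A.weight": 1,
    "blocks.1.self_attn.q.lora_A.weight": 2,
    "blocks.1.ffn.0.lora_A.weight": 3,
    "blocks.2.cross_attn.k.lora_B.weight": 4,
    "head.lora_A.weight": 5,
}
kept = prune_lora_keys(example)
print(sorted(kept))
```

With a real checkpoint, the same filter could be applied to the loaded state dict before saving it again, once the key pattern has been checked against the file's actual keys.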
Project Permissions
Model reprinted from: https://huggingface.co/Kijai/WanVideo_comfy/tree/main
Reprinted models are for communication and learning purposes only, not for commercial use. Original authors can contact us through our Discord channel #claim-models to have their models transferred.