LiconStudio Ltx2.3-VBVR-lora-I2V

LTX-2 VBVR LoRA - Video Reasoning:

LoRA fine-tuned weights for LTX-2.3 22B on the VBVR (A Very Big Video Reasoning Suite) dataset.

Training Data

To ensure training quality, we preprocessed all 1,000,000 videos from the official dataset and sample from them at random during training to maintain data diversity. We adopt the official parameters, batch_size=16 and rank=32; the moderate rank helps avoid the catastrophic forgetting that an excessively large rank can cause.

The VBVR dataset contains 200 reasoning task categories, with ~5,000 variants per task, totaling ~1M videos. Main task types include:

  • Object Trajectory: Objects moving to target positions

  • Physical Reasoning: Rolling balls, collisions, gravity

  • Causal Relationships: Conditional triggers, chain reactions

  • Spatial Relationships: Relative positions, path planning
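The dataset arithmetic and the random sampling described above can be sketched in a few lines. This is only an illustration using the Python standard library: the function name `sample_batch_indices` and the fixed seed are assumptions made for the example, and the actual training pipeline's data loader is not documented here.

```python
import random

NUM_TASK_CATEGORIES = 200   # from the dataset description
VARIANTS_PER_TASK = 5_000   # approximate figure from the dataset description
TOTAL_VIDEOS = NUM_TASK_CATEGORIES * VARIANTS_PER_TASK  # about 1M videos

BATCH_SIZE = 16             # batch size stated in the training notes


def sample_batch_indices(rng: random.Random) -> list[int]:
    """Draw one batch of distinct video indices uniformly at random.

    Hypothetical helper: it only illustrates uniform sampling over the
    preprocessed pool, not the real data loader.
    """
    return rng.sample(range(TOTAL_VIDEOS), BATCH_SIZE)


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
batch = sample_batch_indices(rng)
print(TOTAL_VIDEOS, len(batch))
```

Sampling uniformly over the whole pool means each of the 200 task categories is expected to appear in proportion to its share of the data, which is one simple way to keep batches diverse.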

Model Details:

Base Model : ltx-2.3-22b-dev

Training Method: LoRA Fine-tuning

LoRA Rank : 32

Effective Batch Size: 16

Mixed Precision : BF16
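To give a feel for what rank 32 means in practice, the sketch below counts the trainable parameters that a LoRA adapter adds to a single linear layer. The layer width of 4096 is a hypothetical value chosen for illustration and is not the actual hidden size of LTX-2.3; only the rank comes from this card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


HIDDEN = 4096   # hypothetical layer width, NOT taken from the LTX-2.3 config
RANK = 32       # LoRA rank stated in this model card

full_params = HIDDEN * HIDDEN                        # dense weight being adapted
lora_params = lora_param_count(HIDDEN, HIDDEN, RANK)
fraction = lora_params / full_params                 # share that is trainable

print(f"LoRA params per layer: {lora_params:,} ({fraction:.2%} of the dense weight)")
```

Under these assumptions the adapter trains only a small fraction of each adapted weight matrix, which is the usual reason a modest rank leaves most of the base model's behavior intact.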

----------

LoRA Capabilities

This LoRA adapter enhances the base LTX-2 model for production video generation workflows:

  • Enhanced Complex Prompt Understanding: Accurately interprets multi-object, multi-condition prompts with detailed spatial descriptions and temporal sequences, reducing prompt misinterpretation in production scenarios.

  • Improved Motion Dynamics: Generates smooth, physically plausible object movements with natural acceleration, deceleration, and trajectory curves, avoiding robotic or unnatural motion patterns.

  • Temporal Consistency: Maintains object appearance, lighting, and scene coherence throughout the video sequence, reducing flickering and frame-to-frame artifacts common in generated videos.

  • Precise Timing Control: Enables accurate control over action duration, pacing, and synchronization between multiple moving elements based on prompt semantics.

  • Multi-Object Interaction: Handles complex scenes with multiple objects interacting simultaneously, including collisions, following, avoiding, and coordinated movements.

  • Camera and Framing Stability: Maintains consistent camera perspective and framing throughout the sequence, avoiding unwanted camera shake or unexpected viewpoint changes.

Training Configuration:

Learning Rate : 1e-4

Scheduler : Cosine

Gradient Accumulation : 16 steps

Gradient Clipping : 1.0

Optimizer : AdamW
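As a rough illustration of the schedule above, the following sketch computes a cosine learning-rate curve and the effective batch size. Several details are assumptions, since the card does not state them: no warmup, decay to zero, a total of 6,000 steps taken from the approximate step count reported on this card, and a micro-batch of 1 on a single device.

```python
import math

PEAK_LR = 1e-4       # learning rate from the training configuration
TOTAL_STEPS = 6_000  # assumption: the approximate step count reported on this card
GRAD_ACCUM = 16      # gradient accumulation steps from the configuration

MICRO_BATCH = 1      # assumption: per-device batch size is not stated
NUM_DEVICES = 1      # assumption: device count is not stated


def cosine_lr(step: int) -> float:
    """Cosine decay from PEAK_LR to 0 (assumed: no warmup, no LR floor)."""
    progress = min(step, TOTAL_STEPS) / TOTAL_STEPS
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Effective batch size is the product of micro-batch, accumulation, and devices.
effective_batch = MICRO_BATCH * GRAD_ACCUM * NUM_DEVICES

for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{cosine_lr(step):.2e}")
print("effective batch size:", effective_batch)
```

With these assumed values the product matches the effective batch size of 16 listed under Model Details; other combinations of micro-batch and device count would require a different accumulation setting to reach the same figure.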

Evaluation Metrics:

Training Steps : ~6,000

Final Loss : ~0.008

Loss Reduction : 44% (from 0.014 to 0.008)
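The relative loss reduction can be recomputed from the two reported loss values. The sketch below uses the rounded endpoints shown on this card, so it gives roughly 43%; the small gap to the reported 44% is presumably due to rounding of the endpoints, which is an assumption rather than something the card states.

```python
INITIAL_LOSS = 0.014  # rounded starting loss reported on this card
FINAL_LOSS = 0.008    # rounded final loss reported on this card

# Relative reduction: the fraction of the initial loss that was removed.
reduction = (INITIAL_LOSS - FINAL_LOSS) / INITIAL_LOSS

print(f"relative loss reduction: {reduction:.1%}")
```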

Dataset :

This model is trained on the VBVR (A Very Big Video Reasoning Suite) dataset.

Version Detail

LTX-2_3 Img2Video

Project Permissions

Model reprinted from : https://huggingface.co/LiconStudio/Ltx2.3-VBVR-lora-I2V

Reprinted models are for communication and learning purposes only, not for commercial use. Original authors can contact us through our Discord channel #claim-models to have their models transferred to them.