Which architectural modifications does LCT introduce?


Answers (1)

    2025-04-01T05:29:01+00:00

    LCT introduces three key architectural modifications:
    - Long-context MMDiT blocks whose full attention spans all text and video tokens
    - Interleaved 3D Rotary Position Embedding (RoPE) to distinguish between different shots
    - Asynchronous timestep strategy supporting both joint denoising and conditional generation
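
To make the last two points more concrete, here is a minimal sketch in plain Python. It is not taken from the LCT code: the helper names (`shot_timesteps`, `interleaved_time_offsets`), the `t_condition` parameter, and the exact position layout are all assumptions for illustration. The first helper assigns one diffusion timestep per shot, so shots can either share a timestep (joint denoising) or have some shots pinned to a low-noise timestep and used as context (conditional generation). The second gives each shot's text and video tokens their own range along one position axis, which is one plausible way an interleaved layout could keep shots distinguishable.

```python
from typing import List, Sequence, Tuple


def shot_timesteps(num_shots: int, t: float,
                   condition_shots: Sequence[int] = (),
                   t_condition: float = 0.0) -> List[float]:
    """Return one diffusion timestep per shot (hypothetical helper).

    With no condition shots, every shot shares `t` (joint denoising).
    Shots listed in `condition_shots` are pinned to `t_condition`, so they
    act as clean context for the remaining shots (conditional generation).
    """
    pinned = set(condition_shots)
    return [t_condition if i in pinned else t for i in range(num_shots)]


def interleaved_time_offsets(
        shot_lengths: Sequence[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Assign each shot disjoint position ranges (assumed layout).

    `shot_lengths` holds (text_len, video_len) per shot. Positions are laid
    out as text, video, text, video, ... so no two shots overlap.
    Returns (text_start, video_start) for each shot.
    """
    offsets = []
    cursor = 0
    for text_len, video_len in shot_lengths:
        text_start = cursor
        video_start = text_start + text_len
        offsets.append((text_start, video_start))
        cursor = video_start + video_len
    return offsets


if __name__ == "__main__":
    print(shot_timesteps(3, 0.8))                       # joint denoising
    print(shot_timesteps(3, 0.8, condition_shots=[0]))  # shot 0 as context
    print(interleaved_time_offsets([(2, 4), (3, 5)]))
```

In a real model the per-shot timesteps would be broadcast to every token of the shot, and the offsets would feed the rotary embedding on the relevant axis. Both steps are left out here.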
