Multimodal-driven architecture for customized video generation. Enables identity-preserving, style-consistent, and subject-driven video creation from reference images and text prompts.

video generation customization