UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating Paper • 2606.21661 • Published 7 days ago • 21
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 15 days ago • 109
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published about 1 month ago • 75
Running on Zero Agents 52 CharacterFactory 🖼 52 Generate consistent character images from text prompts
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 194
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published Apr 24 • 231
Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models Paper • 2603.22782 • Published Mar 24 • 20
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published Apr 20 • 47
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published Apr 13 • 72
view article Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 167
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published Apr 6 • 204
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published Mar 26 • 157
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published Mar 24 • 36