Image to Video
Upload any image and describe how it should move. The Generator Tower produces temporally consistent video clips with strong controllability over camera motion, physical dynamics, and scene behavior.
Camera Guide
Use cinematography terms in your motion prompt to give the Generator Tower precise direction.
Dolly In
Camera moves forward toward the subject. Creates intimacy and focus.
e.g. "slowly dolly in toward the face"
Dolly Out
Camera pulls back from the subject. Creates a reveal or isolation effect.
e.g. "dolly out to reveal the full scene"
Orbit
Camera circles around the subject. Great for 3D product showcases.
e.g. "smooth orbit left around the object"
Tracking
Camera follows the subject laterally. Good for motion and action.
e.g. "tracking shot following the car"
Tilt
Camera tilts vertically. Good for establishing height and scale.
e.g. "tilt up to reveal the skyscraper"
Crane
Camera rises or descends. Creates a dramatic reveal from above.
e.g. "crane shot lifting above the forest"
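If you assemble motion prompts in a script, the camera terms above can be kept in a small lookup. This is only a sketch: `build_motion_prompt` and the dictionary keys are hypothetical names chosen for illustration, the phrases are the example prompts from this guide, and nothing here calls or reflects an actual product API.

```python
# Hypothetical helper for composing a motion prompt locally.
# The phrases are the example prompts from the Camera Guide; the keys
# are shorthand labels chosen for this sketch.
CAMERA_EXAMPLES = {
    "dolly in": "slowly dolly in toward the face",
    "dolly out": "dolly out to reveal the full scene",
    "orbit": "smooth orbit left around the object",
    "tracking": "tracking shot following the car",
    "tilt": "tilt up to reveal the skyscraper",
    "crane": "crane shot lifting above the forest",
}


def build_motion_prompt(move: str, scene_note: str = "") -> str:
    """Return the example phrase for a camera move, plus an optional note."""
    key = move.strip().lower()
    if key not in CAMERA_EXAMPLES:
        known = ", ".join(sorted(CAMERA_EXAMPLES))
        raise ValueError(f"unknown camera move {move!r}; expected one of: {known}")
    phrase = CAMERA_EXAMPLES[key]
    note = scene_note.strip()
    # Append the free-text note only when one was given.
    return f"{phrase}, {note}" if note else phrase
```

For instance, `build_motion_prompt("crane", "mist drifting between the trees")` joins the crane example with the extra scene detail; in practice you would replace the stock phrases with wording that matches your own image.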
FAQ
Cosmos 3 Super accepts PNG, JPG, and WebP images up to 10 MB. For best results, use a clear, high-resolution image with a well-defined subject and good lighting.
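The format and size limits can be checked locally before uploading. The sketch below is an illustration under stated assumptions, not the service's own validation: `check_upload` is a hypothetical name, the 10 MB limit is read as 10 × 1024 × 1024 bytes, and `.jpeg` is assumed to be accepted alongside `.jpg`.

```python
from pathlib import Path

# Assumption: ".jpeg" is treated the same as ".jpg".
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
# Assumption: the 10 MB limit means 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(no extension)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems
```

This checks only the file extension and size. It does not inspect the image contents, so a mislabeled file would still pass.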
The Generator Tower uses a diffusion-based process conditioned on the Reasoner Tower's understanding of your image's physical properties. It generates future frames that are physically consistent with the original scene.
You can generate 3-, 5-, or 7-second videos. The recommended default for most use cases is 5 seconds. Longer videos use more credits.
Videos are generated at 720p. Landscape is 1280×720, portrait is 720×1280, and square is 720×720.
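The duration and resolution options can be captured in a small lookup, which is handy when preparing source images or planning a batch. This is a minimal sketch: `output_settings` and the dictionary layout are hypothetical, and only the numbers stated in this FAQ are used.

```python
# Values taken from the FAQ; the names and structure are illustrative only.
RESOLUTIONS = {
    "landscape": (1280, 720),
    "portrait": (720, 1280),
    "square": (720, 720),
}
DURATIONS_S = (3, 5, 7)
DEFAULT_DURATION_S = 5  # the recommended default


def output_settings(orientation: str, duration_s: int = DEFAULT_DURATION_S) -> dict:
    """Return width, height, and duration for a supported combination."""
    key = orientation.strip().lower()
    if key not in RESOLUTIONS:
        raise ValueError(f"orientation must be one of {sorted(RESOLUTIONS)}")
    if duration_s not in DURATIONS_S:
        raise ValueError(f"duration must be one of {DURATIONS_S} seconds")
    width, height = RESOLUTIONS[key]
    return {"width": width, "height": height, "duration_s": duration_s}
```

Calling `output_settings("portrait")` falls back to the 5-second default, while an unsupported value such as a 10-second duration raises a `ValueError` instead of being silently adjusted.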