Image to Video

Cosmos 3 Super Image to Video

Upload a PNG, JPG, or WEBP image and describe how it should move. The Generator Tower produces temporally consistent video clips and gives you control over camera motion, physical dynamics, and scene behavior.

Camera Guide

Camera Motion Language for Image to Video

Use cinematography terms in your motion prompt to give the Generator Tower precise direction.

Dolly In

Camera moves forward toward the subject. Creates intimacy and focus.

e.g. "slowly dolly in toward the face"

Dolly Out

Camera pulls back from the subject. Creates a reveal or a sense of isolation.

e.g. "dolly out to reveal the full scene"

Orbit / Arc

Camera circles around the subject. Great for 3D product showcases.

e.g. "smooth orbit left around the object"

Tracking Shot

Camera follows the subject laterally. Good for motion and action.

e.g. "tracking shot following the car"

Tilt Up/Down

Camera tilts vertically. Good for establishing height and scale.

e.g. "tilt up to reveal the skyscraper"

Crane Shot

Camera rises or descends. Creates a dramatic reveal from above.

e.g. "crane shot lifting above the forest"
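The camera terms above can also be assembled into a motion prompt in code. The following is a minimal sketch only: `CAMERA_MOVES` and `build_motion_prompt` are hypothetical names, the phrase templates are adapted from the examples in this guide, and nothing here is an official Cosmos 3 Super API.

```python
# Hypothetical helper, not part of any official Cosmos 3 Super API.
# Phrase templates are adapted from the example prompts in the camera guide.
CAMERA_MOVES = {
    "dolly_in": "slowly dolly in toward {subject}",
    "dolly_out": "dolly out to reveal {subject}",
    "orbit": "smooth orbit left around {subject}",
    "tracking": "tracking shot following {subject}",
    "tilt_up": "tilt up to reveal {subject}",
    "crane": "crane shot lifting above {subject}",
}


def build_motion_prompt(move: str, subject: str, extra: str = "") -> str:
    """Return a motion prompt for one camera move and one subject.

    `extra` is optional free text (for example, scene behavior) appended
    after the camera direction.
    """
    if move not in CAMERA_MOVES:
        raise ValueError(f"unknown camera move: {move!r}")
    prompt = CAMERA_MOVES[move].format(subject=subject)
    return f"{prompt}, {extra}" if extra else prompt


if __name__ == "__main__":
    print(build_motion_prompt("dolly_in", "the face"))
    print(build_motion_prompt("crane", "the forest", "mist drifting between the trees"))
```

Keeping one camera move per prompt mirrors the examples in the guide; whether combining several moves in a single prompt works well is not stated on this page.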

FAQ

What image formats can I upload?

Cosmos 3 Super accepts PNG, JPG, and WEBP images up to 10MB. For best results, use a clear, high-resolution image with a well-defined subject and good lighting.
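If you prepare images in a script before uploading, the limits above can be checked locally first. This is a rough sketch under stated assumptions: `check_upload` is a hypothetical helper, "10MB" is interpreted as 10 × 1024 × 1024 bytes, and `.jpeg` is treated as equivalent to JPG. The page does not confirm either interpretation.

```python
from pathlib import Path

# Assumptions (not confirmed by the page): ".jpeg" counts as JPG,
# and "10MB" means 10 * 1024 * 1024 bytes.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
MAX_BYTES = 10 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_BYTES})")
    return problems


if __name__ == "__main__":
    print(check_upload("portrait.PNG", 2_000_000))
    print(check_upload("clip.gif", 12 * 1024 * 1024))
```

The check looks only at the file extension and size; it does not inspect the image data itself.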

How does the Generator Tower animate my image?

The Generator Tower uses a diffusion-based process conditioned on the Reasoner Tower's understanding of your image's physical properties. It generates future frames that are physically consistent with the original scene.

How long can the generated videos be?

You can generate 3-, 5-, or 7-second videos. Five seconds is the recommended default for most use cases. Longer videos use more credits.

What resolution are the output videos?

Videos are generated at 720p. Landscape is 1280×720, portrait is 720×1280, and square is 720×720.
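The duration and format options described in this FAQ can be captured in a small settings object. This is a sketch, not an official client: `VideoSettings`, `OUTPUT_SIZES`, and `ALLOWED_DURATIONS` are hypothetical names, and only the values (3, 5, or 7 seconds; the three output sizes) come from this page.

```python
from dataclasses import dataclass

# Values taken from the FAQ; the names themselves are hypothetical.
OUTPUT_SIZES = {
    "landscape": (1280, 720),
    "portrait": (720, 1280),
    "square": (720, 720),
}
ALLOWED_DURATIONS = (3, 5, 7)


@dataclass(frozen=True)
class VideoSettings:
    duration_s: int = 5              # 5 seconds is the recommended default
    video_format: str = "landscape"  # default format is an assumption

    def __post_init__(self) -> None:
        if self.duration_s not in ALLOWED_DURATIONS:
            raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
        if self.video_format not in OUTPUT_SIZES:
            raise ValueError(f"format must be one of {sorted(OUTPUT_SIZES)}")

    @property
    def size(self) -> tuple[int, int]:
        """Output (width, height) in pixels for the chosen format."""
        return OUTPUT_SIZES[self.video_format]


if __name__ == "__main__":
    print(VideoSettings().size)
    print(VideoSettings(duration_s=7, video_format="portrait").size)
```

Validating in `__post_init__` rejects unsupported values, such as a 4-second duration, before any request is made.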