Last week, OpenAI released Sora 2, its most advanced video and audio generation model to date. The Sora 2 and Sora 2 Pro models have been available to consumers through the Sora iOS app, ChatGPT, and Sora.com. At DevDay 2025, OpenAI announced the availability of Sora 2 models via API for developers.
OpenAI claims that Sora 2 models can generate richly detailed, dynamic clips with audio from natural language or images. They also bring a deep understanding of 3D space, motion, and scene continuity to text-to-video generation. The Sora 2 APIs include the following five endpoints:
- Create video: Start a new render job from a prompt, with optional reference inputs or a remix ID.
- Get video status: Retrieve the current state of a render job and monitor its progress.
- Download video: Fetch the finished MP4 once the job is completed.
- List videos: Enumerate your videos with pagination for history, dashboards, or housekeeping.
- Delete video: Remove an individual video, identified by its ID, from OpenAI’s storage.
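Because rendering is asynchronous, a client typically creates a job, polls its status, and downloads the file only once the job finishes. The following is a minimal sketch of the polling step, not OpenAI's client code. The article does not give endpoint paths or response schemas, so the HTTP call is left to a caller-supplied `get_status` function. The helper name `wait_for_video` and the status strings `"completed"` and `"failed"` are assumptions made for illustration.

```python
import time
from typing import Callable, Dict

# Assumed terminal status strings; the article does not list the actual values.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_video(
    get_status: Callable[[str], Dict[str, str]],
    video_id: str,
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll a render job until it reaches a terminal state or the timeout expires.

    `get_status` is a hypothetical callable that wraps the "Get video status"
    endpoint and returns a dict containing at least a "status" key.
    """
    waited = 0.0
    while True:
        job = get_status(video_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if waited >= timeout:
            raise TimeoutError(f"video {video_id} not finished after {timeout}s")
        sleep(poll_interval)
        waited += poll_interval
```

Injecting `get_status` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise with a fake status sequence before pointing it at the real service.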
To support a variety of use cases, OpenAI offers two variants of the Sora 2 model. The sora-2 model is the more affordable variant, designed for speed and flexibility. It is suitable for rapid iteration, concepting, and rough cuts, and OpenAI says it is more than sufficient for generating social media content. The sora-2-pro model takes longer to render but can generate production-quality output. It is suitable for creating cinematic footage and marketing assets.
Videos from the sora-2 model are limited to 1280x720 resolution, whereas the sora-2-pro model can generate 1792x1024 content. Both models support landscape and portrait orientation, and clips up to 12 seconds long. OpenAI mentioned that video input and image-to-video of real people (the cameo feature) aren’t supported yet via the API.
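These limits can be checked on the client before a job is submitted. The sketch below is illustrative only: the function name `is_supported` is hypothetical, portrait sizes are assumed to be the landscape dimensions swapped, and sora-2-pro is assumed to also accept 1280x720, since it is priced at 720p.

```python
# Landscape sizes per model, taken from the limits described above.
# Assumption: sora-2-pro also accepts 1280x720 (it is priced at 720p).
LANDSCAPE_SIZES = {
    "sora-2": {(1280, 720)},
    "sora-2-pro": {(1280, 720), (1792, 1024)},
}

MAX_SECONDS = 12  # maximum clip length stated for both models


def is_supported(model: str, width: int, height: int, seconds: int) -> bool:
    """Return True if the requested size and duration fit the stated limits."""
    sizes = LANDSCAPE_SIZES.get(model)
    if sizes is None:
        return False
    if not 0 < seconds <= MAX_SECONDS:
        return False
    # Assumption: portrait output is the landscape size with the axes swapped.
    return (width, height) in sizes or (height, width) in sizes
```

For example, `is_supported("sora-2", 1792, 1024, 8)` is rejected under these assumptions, while the same request against `"sora-2-pro"` is accepted.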
The sora-2 model costs $0.10 per second for 720p videos, while the sora-2-pro model costs $0.30 per second for 720p videos and $0.50 per second for 1024p videos.
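Since billing is per second of generated video, the cost of a clip is the per-second rate multiplied by its length. A small sketch of that arithmetic, using the rates quoted above, follows. The function name `estimate_cost` and the tier labels used as dictionary keys are illustrative choices, not part of any OpenAI SDK.

```python
from decimal import Decimal

# Per-second rates in USD, as quoted above.
PRICE_PER_SECOND = {
    ("sora-2", "720p"): Decimal("0.10"),
    ("sora-2-pro", "720p"): Decimal("0.30"),
    ("sora-2-pro", "1024p"): Decimal("0.50"),
}


def estimate_cost(model: str, tier: str, seconds: int) -> Decimal:
    """Estimate the cost in USD of one clip: per-second rate times duration."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    try:
        rate = PRICE_PER_SECOND[(model, tier)]
    except KeyError:
        raise ValueError(f"no listed price for {model} at {tier}") from None
    return rate * seconds


if __name__ == "__main__":
    # A maximum-length 12-second clip at each listed rate.
    for (model, tier) in PRICE_PER_SECOND:
        print(model, tier, estimate_cost(model, tier, 12))
```

At these rates a 12-second clip costs $1.20 on sora-2, $3.60 on sora-2-pro at 720p, and $6.00 on sora-2-pro at 1024p. `Decimal` is used rather than floats so that currency amounts stay exact.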