
Sora 2 from OpenAI - Have You Tried It Yet?

OpenAI’s second-generation Sora adds synchronized dialogue and sound, stronger physics, and much tighter control over multi-shot scenes. It can also insert a short “cameo” of a real person, with their voice and appearance, into generated footage, and it launches alongside an invite-only iOS app for creation and remixing.

● Sora 2 is trained and post-trained on large-scale video, so the model keeps track of objects and cause-and-effect over time. Shots link together more coherently, bodies and materials behave more plausibly, and audio is generated in step with the visuals to sell the scene.

● Despite being a video model, Sora 2 can “solve” text benchmarks when they are framed visually. EpochAI tested a small GPQA Diamond sample, and Sora 2 reached 55% (vs. 72% for GPT-5) when prompted to generate a video of a professor holding up the answer letter. Four videos were generated per question, and any clip without a clear letter was marked wrong.
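The grading protocol above can be sketched in a few lines. This is my reconstruction of the described setup, not Epoch AI's actual harness: `extract_letter` is a placeholder for reading the letter out of a generated clip, and a clip with no clear letter counts as wrong.

```python
# Hedged sketch (a reconstruction, not Epoch AI's actual code) of the
# grading protocol: four clips per question, each clip is read for an
# answer letter, and an unreadable clip is marked wrong. Accuracy is the
# fraction of clips whose letter matches the answer key.
from typing import Optional

def extract_letter(clip: str) -> Optional[str]:
    # Stand-in for OCR / human reading of the letter the professor holds up.
    # Here a "clip" is just a string; a real pipeline would decode frames.
    return clip if clip in {"A", "B", "C", "D"} else None

def score_clip(clip: str, answer_key: str) -> bool:
    letter = extract_letter(clip)
    return letter == answer_key  # unreadable (None) is marked wrong

def accuracy(samples: list[tuple[list[str], str]]) -> float:
    # samples: (four clips, answer key) per question; average over all clips.
    scores = [score_clip(c, key) for clips, key in samples for c in clips]
    return sum(scores) / len(scores)

samples = [(["A", "A", "?", "A"], "A"),   # 3 of 4 clips readable and correct
           (["?", "?", "?", "?"], "B")]   # no clear letter -> all wrong
print(accuracy(samples))  # 0.375
```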


● A likely explanation is a prompt-rewriting LLM layer that first solves the question and then embeds the solution in the video prompt, similar to the re-prompting used in some other video generators.
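If that hypothesis holds, the pipeline might look roughly like this. This is a minimal sketch of the conjecture only; `solve_with_llm` and the prompt template are placeholders, not anything OpenAI has documented.

```python
# Hypothetical prompt-rewriting layer (an assumption, not a documented
# pipeline): a text LLM answers the question first, and the answer letter
# is baked into the video prompt, so the video model only has to render it.

def solve_with_llm(question: str) -> str:
    # Stand-in for a call to a text model that returns an answer letter.
    return "C"

def rewrite_for_video(question: str) -> str:
    # Embed the pre-computed answer in a scene description for the
    # video model, matching the "professor holds up the letter" setup.
    letter = solve_with_llm(question)
    return (f"A professor stands at a whiteboard and holds up a large card "
            f"clearly showing the letter {letter}.")

print(rewrite_for_video("Which option is correct? A) ... D) ..."))
```

On this reading, the benchmark score would reflect the text LLM's reasoning plus the video model's ability to render legible text, rather than reasoning inside the video model itself.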


