SORA 2 from OpenAI - Did you try it yet?
- AIR Street Capital
- Nov 4
OpenAI’s second-generation Sora adds synchronized dialogue and sound, stronger physics, and much
tighter control over multi-shot scenes. It can also insert a short “cameo” of a real person, with their voice
and appearance, into generated footage, and it launches alongside an invite-only iOS app for creation and remixing.
● Sora 2 is trained and post-trained on large-scale video, so the model tracks
objects and cause and effect over time. Shots link together more coherently,
bodies and materials behave more plausibly, and audio is generated in step with
the visuals to sell the scene.
● Despite being a video model, Sora 2 can “solve” text benchmarks when the
questions are framed visually. Epoch AI tested a small GPQA Diamond sample, and
Sora 2 reached 55% (vs 72% for GPT-5) when prompted for a video of a professor
holding up the answer letter. Four videos were generated per question, and any
clip without a clearly legible letter was marked wrong; a minimal sketch of
this scoring rule appears after the list.

● A likely explanation is a prompt-rewriting LLM layer that first solves the
question and then embeds the solution in the video prompt, similar to the
re-prompting used in some other video generators; the second sketch below
shows roughly how such a layer would work.
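
The scoring rule is concrete enough to write down. Here is a minimal Python sketch of it, assuming hypothetical `generate_videos` and `read_letter_from_clip` helpers that stand in for the Sora 2 call and the letter-reading judge; the prompt wording is illustrative, not Epoch AI’s actual prompt.

```python
from typing import Optional

def generate_videos(prompt: str, n: int) -> list[str]:
    """Placeholder for the Sora 2 generation call; not a real API."""
    raise NotImplementedError

def read_letter_from_clip(clip: str) -> Optional[str]:
    """Placeholder judge that reads the held-up letter from a clip,
    returning None when no letter is clearly visible."""
    raise NotImplementedError

def question_accuracy(question: str, answer_key: str, n_clips: int = 4) -> float:
    """Score each of n_clips generated videos independently: a clip
    counts as correct only if a clearly legible letter matches the
    answer key, so illegible clips are marked wrong."""
    prompt = (
        "A video of a professor holding up a card with the single "
        f"letter (A, B, C, or D) that answers: {question}"
    )
    clips = generate_videos(prompt, n=n_clips)
    return sum(
        read_letter_from_clip(clip) == answer_key for clip in clips
    ) / n_clips
```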
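
If that hypothesis is right, the pipeline would look roughly like the sketch below; `solve_with_llm` stands in for whatever text model does the rewriting, which OpenAI has not documented.

```python
def rewrite_prompt(question: str, solve_with_llm) -> str:
    """Hypothesized rewriter: solve the question in text first, then
    bake the answer into the prompt the video model actually sees."""
    letter = solve_with_llm(
        "Answer this multiple-choice question with a single letter "
        f"(A, B, C, or D): {question}"
    )
    return (
        "A professor holds up a large card clearly showing the letter "
        f"'{letter}'."
    )
```

On this reading, the 55% would measure the rewriter’s text reasoning plus rendering losses (wrong or illegible letters in the clip), rather than any visual reasoning by the video model itself.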


