Skip to content

Codex CLI Complete Guide

Sora 2 Prompt Playbook: Scene-Specific Templates & Winning Workflows

What you’ll get from this guide

Apply Sora 2’s six-element prompt skeleton across multiple scene types
Start from ready-to-use templates for live action, ads, anime, hybrids, and explainers
Diagnose failures quickly with before/after fixes while keeping credit spend under control

Building on the Sora 2 Prompt Engineering Guide, this playbook treats the six elements—Subject / Action / Setting / Style / Audio / Length—as reusable modules. Each section provides templates, recommended settings, and tuning switches for specific production goals ranging from cinematic commercials to vertical UGC and Cameo hybrids.

Revisit the Prompt Skeleton

ElementPurposeTips to avoid common failures
SubjectWho/what appearsDescribe up to three attributes (age, attire, props)
ActionMotion / behaviorUse verb + adverb to define energy and timing
SettingLocation / time / lightingLock in color tone with “place + time + lighting”
StyleCamera / lookChoose one or two shot types or rendering styles
AudioSound designPair one environment bed with one focal sound or dialogue
LengthDurationDefault to 10–15 s, then extend for long-form needs

Scene Template Library

Templates follow the “skeleton + recommended settings + quick fix” format. Items in parentheses are swap-in placeholders.

1. Cinematic Commercial (16:9)

A mid-30s female chef finishing a seasonal tasting menu in a dimly lit fine-dining kitchen.
Warm tungsten lighting, cinemascope framing, shallow depth of field. Smooth dolly-in, 4K quality.
Stainless counter ambience, gentle simmering sound. 15 seconds.
  • Recommended: 4K, 15 s, Plus/Pro
  • Goal: Highlight steam, gleam, and lens flare for appetizing mood
  • Quick fix: If motion feels unstable, change “smooth dolly-in” → “steady tripod shot”

2. Documentary Slice of Life (16:9)

A 50-year-old fisherman prepping his boat at dawn on a misty harbor.
Handheld camera, documentary color grade, realistic skin textures. 1080p.
Waves slapping against hull, clinking gear, quiet muttering. 12 seconds.
  • Recommended: 1080p, 12 s
  • Goal: Naturalistic handheld sway + sunrise crosslight for authenticity
  • Quick fix: If grain becomes excessive, remove “handheld” and keep “documentary color grade”

3. Mobile Vertical UGC (9:16)

A high-school dance crew live-streaming practice in the gym after class.
Vertical smartphone footage with gentle hand shake, light film grain, 1080x1920.
Energetic hip-hop track, cheers and sneakers squeaking. 9 seconds.
  • Recommended: 9 s, 1080x1920 (Plus-ready)
  • Goal: Social-first energy with casual authenticity
  • Quick fix: If shake causes nausea, rephrase as “subtle handheld wobble”

4. Product Showcase Macro (16:9)

The latest smart watch rotating on a marble pedestal while UI animations light up the screen.
Macro lens, rim lighting, motion-control rotation, pure white backdrop. 4K resolution.
Crystal-clear chime and interface tap sounds. 10 seconds.
  • Recommended: 4K, 10 s
  • Goal: Crisp reflections and premium finish for hardware marketing
  • Quick fix: Add “softbox lighting, glare controlled” if reflections clip

5. Anime-Style Promo (16:9)

Two high-school protagonists sprint through cherry blossoms in springtime Kyoto.
Hand-drawn 2D anime aesthetic, Studio Ghibli-inspired palette, painterly backgrounds. 12 seconds.
String quartet underscoring, laughter and breathing layered in. 12 seconds.
  • Recommended: 12 s, 1080p
  • Goal: Soft colors, hand-drawn lines, deep parallax
  • Quick fix: If results skew 3D, add “flat shading, limited animation frames, hand-drawn line art”

6. Anime × Live Action Hybrid (Cameo) (16:9)

A star-shaped animated mascot dances across Tokyo’s neon skyline and finishes by lifting the uploaded brand logo.
Live-action night city plate with toon-shaded character composite, smooth camera track. Pro tier.
Synth-pop soundtrack with celebratory fanfare ending. 15 seconds.
  • Recommended: Pro only, 15 s
  • Goal: Seamlessly blend Cameo assets with stylized characters
  • Quick fix: If character blends too much, append “toon outline 4px, minimal ambient occlusion”

7. Corporate Explainer (16:9)

A suited analyst briefs quarterly market trends in front of an interactive whiteboard.
Studio lighting, slide overlays, gentle dolly-out. 1080p. Hand gestures align with captions.
Calm male narration with subtle ambient background music. 14 seconds.
  • Recommended: 14 s, 1080p
  • Goal: Onboarding or webinar clips with clear lip-sync
  • Quick fix: If lip-sync slips, simplify dialogue to “explains key points” and record VO separately

8. Establishing Drone Glide (21:9)

A futuristic skyline filmed by a drone gliding between skyscrapers on a rainy night.
Wide lens, anamorphic flares, wet streets reflecting neon. 21:9, 8K render, Pro tier.
Low-frequency soundscape with distant sirens. 12 seconds.
  • Recommended: Pro, 12 s
  • Goal: World-building shots for sci-fi intros or transitions
  • Quick fix: If physics break, specify “moderate speed, stable flight path”

How to Work with the Templates

  1. Pick a skeleton that matches your goal; swap in highlighted nouns/adjectives.
  2. Add or trim elements—start with Action → Style → Audio, remove conflicting descriptors.
  3. Test short + lower res (1080p / ≤10 s) before scaling to 4K or 20 s.
  4. Log iterations with thumbnails and credits to compare quality vs. spend.

Failure Patterns & Fix Notes

SymptomLikely causeFix
Chaotic camera shakeAggressive “handheld/drone” wordingReplace with “gentle handheld” or “steady flight path”
Audio cuts outToo many overlapping soundsLimit to one environment bed + one feature sound
Flickering highlightsUndefined lightingSpecify “soft diffused lighting” or “studio key + fill”
Wardrobe driftConflicting outfit descriptorsLock a single outfit, add accessories later
Anime looks 3DStyle instructions too sparseAdd “cel shading, flat lighting, hand-drawn edges”

Style Lexicon (Cheat Sheet)

  • Camera moves: tracking shot / dolly-in / jib shot / static tripod / macro pass
  • Lighting: golden hour / tungsten practical / bounce fill / rim light / volumetric haze
  • Look & texture: glossy finish / matte texture / grainy film / watercolor wash / cel shading
  • Audio cues: diegetic crowd noise / whispered narration / foley footsteps / synthesized chime

Workflow Integration Tips

  • Version control: Store prompts + frames together (Notion, Sheets) with tags (#liveaction, #anime).
  • Shotlisting: For long-form, break scripts into shot units → apply relevant template per shot.
  • Collaboration: Share via GitHub Gist or Google Docs and review diffs when teammates tweak prompts.
  • Compliance: Review OpenAI’s Launching Sora responsibly for rights management and safety guidelines.

Further Reading