Session 2 of 3

Multi-Scene Videos & Advanced Workflows

Expand from 60 seconds to 3-4 minute videos with multiple scenes, AI-generated backgrounds, and professional editing techniques.

⏱️ 1 Hour 🎬 Multi-Scene Editing 🎨 AI Backgrounds

🎯 What You'll Create Today

3-4 Minute Video Preview

Multi-scene historical comedy

Your Final Video Will Include:

  • ✅ 4-6 distinct scenes (45-60 seconds each)
  • ✅ Multiple characters or angles
  • ✅ AI-generated backgrounds and environments
  • ✅ Smooth transitions between scenes
  • ✅ Background music and sound effects
  • ✅ Professional pacing and narrative structure
Total Time: roughly 90-115 minutes (including rendering), based on the step estimates below

💡 Understanding the Challenge

No AI tool currently generates a complete 3-4 minute video in one click. Here's why:

⏱️

Duration Limits

Sora 2: 20 seconds max
Veo 3: 8 seconds max
Kling AI: Up to 2 minutes (best option)

🧠

AI Limitations

Current AI models struggle with long-term consistency, complex narratives, and sustained character development beyond short clips.

✂️

The Solution

Modular Approach: Create 4-6 separate 45-60 second scenes, then edit them together. This is how professionals work!

💡 Pro Insight: Even big-budget AI film projects use this approach. The viral "AI movie trailers" you see on YouTube are actually dozens of short clips edited together, not single long generations.

📋 Phase 1: Plan Your Multi-Scene Story

Before touching any tools, we need a solid plan. This saves hours of wasted generation time.

1

Design Your Story Structure

⏱️ 15 minutes

Break your 3-4 minute video into digestible scenes. Here's a proven formula:

The Classic 4-Scene Structure

Scene 1: Setup (45 sec)

Introduce your historical character in a modern situation. Establish the comedy premise.

Example: Henry VIII discovers Twitter and creates his first account.

Scene 2: Escalation (60 sec)

The character interacts with the modern world, and the confusion and humor build.

Example: Henry tries to understand hashtags, accidentally starts trending.

Scene 3: Peak Comedy (60 sec)

The funniest moment. The character's historical traits clash as sharply as possible with the modern world.

Example: Henry live-tweets his frustration about his ex-wives, goes viral.

Scene 4: Payoff (45 sec)

Resolution or twist. Leave them laughing.

Example: Henry gets "cancelled" on Twitter, doesn't understand why.

Use This Planning Template

VIDEO TITLE: [Your concept]
TOTAL LENGTH: 3-4 minutes
CHARACTER: [Historical figure]

SCENE 1 - [Title]
Duration: 45 seconds
Setting: [Where does this happen?]
What happens: [Brief description]
Key joke: [Main punchline]
Visual: [Avatar or background generation needed?]

SCENE 2 - [Title]
Duration: 60 seconds
Setting: [Where does this happen?]
What happens: [Brief description]
Key joke: [Main punchline]
Visual: [Avatar or background generation needed?]

[Continue for Scenes 3-4]

Example: "Henry VIII Joins Twitter"

Scene 1 (45s): Henry VIII talking head - discovers Twitter, confused by bird logo, creates account "@TheRealKingHenry"

Scene 2 (60s): Henry's first tweets - complaining about the Pope, food photos of his banquets, doesn't understand hashtags

Scene 3 (60s): Henry goes viral - tweets about his "complicated relationship history," people start meme-ing him, he's delighted

Scene 4 (45s): Henry gets cancelled - feminist Twitter finds his tweets problematic, he's confused, ends with "What means this 'ratio'?"
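A quick way to sanity-check a plan like the one above is to add up the scene durations. The sketch below is only an illustration: the `Scene` class and the `total_runtime` and `fits_target` helpers are hypothetical names, and the 180-240 second window simply restates the 3-4 minute goal.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    title: str
    seconds: int  # planned duration of the scene


# The example plan from this section: 45/60/60/45 seconds.
PLAN = [
    Scene("Henry discovers Twitter", 45),
    Scene("Henry's first tweets", 60),
    Scene("Henry goes viral", 60),
    Scene("Henry gets cancelled", 45),
]


def total_runtime(scenes):
    """Return the summed planned duration in seconds."""
    return sum(scene.seconds for scene in scenes)


def fits_target(scenes, low=180, high=240):
    """True when the plan lands inside the 3-4 minute target."""
    return low <= total_runtime(scenes) <= high


if __name__ == "__main__":
    runtime = total_runtime(PLAN)
    print(f"Planned runtime: {runtime} s ({runtime / 60:.1f} min)")
    print("Within 3-4 minutes:", fits_target(PLAN))
```

The four-scene example adds up to 210 seconds (3.5 minutes), which leaves a little room for title cards and an outro.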

2

Write Individual Scene Scripts

⏱️ 15 minutes

Now write the actual dialogue for each scene. Use the same ChatGPT/Claude method from Session 1, but for each scene separately.

Scene Script Prompt Formula

Write a [X]-second comedy monologue for Scene [NUMBER] of a 4-minute video.

Context: [Brief recap of what happened in previous scenes]

This scene: [What happens in this specific scene]

Character: [Historical figure] - personality traits: [list 3-4 traits]

Key joke: [The main punchline you want to hit]

Tone: [Describe the energy - frantic? confused? pompous?]

Word count: Approximately [X * 2.5] words

Format: Only dialogue - what the character says directly to camera.
💡 Continuity Tip: Write all 4 scene scripts in the same ChatGPT conversation. This helps the AI maintain consistency in the character's voice and reference earlier jokes.

Save Each Script Separately

Create a folder on your computer called "Henry_Twitter_Video" with files:

  • scene1_script.txt
  • scene2_script.txt
  • scene3_script.txt
  • scene4_script.txt

🎬 Phase 2: Generate Your Video Assets

Now we create the actual video clips. You'll use multiple tools depending on your needs.

3

Create Avatar Scenes (HeyGen)

⏱️ 20 minutes

For talking head scenes (like Scenes 1 and 4), use HeyGen exactly like Session 1, but do it 2-4 times.

Batch Generation Strategy

  1. Open HeyGen and create Scene 1 video (same process as Session 1)
  2. While Scene 1 renders (2-5 min), start creating Scene 2 in a new tab
  3. Continue this pattern for all avatar scenes
  4. Download all videos when complete
  5. Name them clearly: "henry_scene1.mp4", "henry_scene2.mp4", etc.

Variation Techniques

To make each scene feel different even with the same character:

  • Change backgrounds: Different colors or settings per scene
  • Vary avatar size: Close-up for intimate moments, medium shot for casual
  • Vary the delivery: Keep the same voice, but use a slightly different emotional tone per scene
  • Adjust camera angle: If using multiple portraits, use different angles
⚠️ HeyGen Credit Warning: Each scene uses 1 credit on the free plan, so four avatar scenes need 4 credits. In practice that means upgrading to the paid plan ($29/mo). Splitting scenes across multiple free trial accounts is not recommended because it violates the ToS.

4

Generate Background Scenes (Optional)

⏱️ 20 minutes

For b-roll, establishing shots, or visual transitions between talking scenes, use text-to-video AI.

Tool Selection Guide

Sora 2 (Best Quality)

  • ✅ 20 seconds, native audio
  • ✅ Best physics and realism
  • ❌ Requires ChatGPT Plus ($20/mo)
  • ❌ Short clips only (20 seconds max)

Best for: High-quality establishing shots (Tudor castle, modern city)

Veo 3 (Native Audio)

  • ✅ 8 seconds with sound
  • ✅ Excellent quality
  • ❌ Very short
  • ❌ UK image-to-video unavailable

Best for: Quick reaction shots with sound effects

Example Prompts for Background Scenes

🏰 Tudor England Establishing Shot

"Cinematic drone shot slowly pushing in on Hampton Court Palace at golden hour, 16th century English architecture, manicured gardens in foreground, dramatic clouds, period-accurate details, film grain, establishing shot for historical drama"

💻 Modern Social Media Scene

"Close-up shot of smartphone screen showing Twitter/X app interface with notifications rapidly appearing, finger scrolling through timeline, likes and retweets counting up, modern bright lighting, tech commercial aesthetic"

😱 Comedic Reaction Moment

"Dramatic zoom into medieval crown sitting on wooden table, camera shaking slightly for comedic effect, Instagram notification pops up on screen next to it, anachronistic humor, meme-style editing"

💡 B-Roll Strategy: You don't need AI backgrounds for every scene. Mix: (1) Avatar talking heads 60%, (2) AI-generated b-roll 20%, (3) Stock footage or simple graphics 20%. This balances cost and quality.
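To turn the 60/20/20 mix above into concrete numbers, multiply each share by your planned runtime. A minimal sketch, assuming a 210-second video (the 45/60/60/45 plan); `seconds_per_source` is a hypothetical helper name.

```python
# Shares suggested by the B-roll strategy tip, in percent.
MIX = {
    "avatar talking heads": 60,
    "AI-generated b-roll": 20,
    "stock footage or graphics": 20,
}


def seconds_per_source(total_seconds, mix=MIX):
    """Split a total runtime into whole seconds per footage source."""
    return {name: round(total_seconds * pct / 100) for name, pct in mix.items()}


if __name__ == "__main__":
    for name, seconds in seconds_per_source(210).items():
        print(f"{name}: {seconds} s")
```

For a 210-second video this gives 126 seconds of avatar footage and 42 seconds each of AI b-roll and stock footage or graphics. With 8-20 second generation limits, 42 seconds of AI b-roll means only a handful of clips.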
5

Organize Your Assets

⏱️ 5 minutes

Before editing, organize everything. Future you will thank present you.

Folder Structure

Henry_Twitter_Video/
├── 01_scripts/
│   ├── scene1_script.txt
│   ├── scene2_script.txt
│   ├── scene3_script.txt
│   └── scene4_script.txt
├── 02_avatar_clips/
│   ├── henry_scene1.mp4
│   ├── henry_scene2.mp4
│   └── henry_scene4.mp4
├── 03_background_clips/
│   ├── tudor_castle_establishing.mp4
│   ├── phone_twitter_closeup.mp4
│   └── viral_tweet_animation.mp4
├── 04_audio/
│   ├── background_music.mp3
│   └── sound_effects/
└── 05_final/
    └── [Your finished video goes here]
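If you would rather not create these folders by hand, a short script can do it. This is a minimal sketch using Python's standard `pathlib` module; `create_project` is a hypothetical helper name, and the folder names are copied from the structure above.

```python
from pathlib import Path

# Subfolders taken from the folder structure in this section.
SUBFOLDERS = [
    "01_scripts",
    "02_avatar_clips",
    "03_background_clips",
    "04_audio/sound_effects",
    "05_final",
]


def create_project(root):
    """Create the project folder layout under `root` and return its path."""
    root = Path(root)
    for sub in SUBFOLDERS:
        # parents=True also creates 04_audio; exist_ok=True makes reruns safe.
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


if __name__ == "__main__":
    project = create_project("Henry_Twitter_Video")
    print("Created:", project.resolve())
```

Running it a second time is harmless, so you can reuse the same script for every new video by changing the folder name.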

✂️ Phase 3: Edit & Polish

This is where your separate clips become a cohesive story. We'll use free/affordable editing software.

6

Choose Your Video Editor

⏱️ 2 minutes

Descript (Most Innovative)

  • ✅ Edit by editing text
  • ✅ AI-powered features
  • ❌ $12/month
  • ✅ Free trial available

DaVinci Resolve (Professional)

  • ✅ Free version available (the paid Studio edition adds extra features)
  • ✅ Hollywood-grade tools
  • ❌ Steeper learning curve
  • ✅ Best for serious creators

7

Assemble Your Video (CapCut Example)

⏱️ 30 minutes

Let's walk through the complete editing process in CapCut (similar in other editors).

7.1 Import All Assets

  1. Open CapCut and create new project
  2. Click "Import" and select your entire "Henry_Twitter_Video" folder
  3. All clips appear in the media library

7.2 Build the Timeline

  1. Drag Scene 1 avatar clip to timeline
  2. Add Scene 2, then Scene 3, Scene 4 in order
  3. Play through - you now have the basic structure
💡 Pro Tip: Don't worry about perfection yet. Get all main clips on timeline first, then refine.

7.3 Add Transitions Between Scenes

  1. Click the point where Scene 1 meets Scene 2
  2. Click "Transition" in toolbar
  3. Choose a transition (recommended: "Dissolve" or "Fade" for comedy)
  4. Set duration to 0.5-1 second
  5. Repeat for all scene breaks
🎬 Transition Guide:
• Comedy content: Simple fades or quick cuts
• Avoid flashy transitions (star wipes, etc.) - they look amateur
• Match transition energy to your content's pacing

7.4 Insert B-Roll & Background Clips

  1. Find spots where you want visual variety (e.g., when Henry mentions "Twitter," show phone screen)
  2. Split your main clip at that point
  3. Insert your AI-generated background clip
  4. Adjust duration to match narration
Example Insert Points:

Scene 1: Henry says "I discovered this Twitter" → cut to 3-second AI shot of phone with Twitter app

Scene 2: Henry mentions "my castle" → cut to 5-second AI establishing shot of Tudor palace

Scene 3: "I'm going viral!" → cut to animated likes/retweets counting up

7.5 Add Background Music

  1. In CapCut, click "Audio" → "Music"
  2. Search for "comedy" or "playful" in CapCut's free library
  3. Or import your own from YouTube Audio Library
  4. Drag music to audio track (below video)
  5. Set volume to 20-30% (music should support, not overpower voice)
  6. Add fade in/out at start and end
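If your editor shows volume in decibels rather than percent, the 20-30% guideline above can be converted. This is a rough sketch that assumes the percentage is a linear amplitude scale; CapCut's slider may map values differently, so treat the result as a starting point. `percent_to_db` is a hypothetical helper name.

```python
import math


def percent_to_db(percent):
    """Convert a linear volume percentage to decibels relative to full volume."""
    if percent <= 0:
        raise ValueError("percent must be greater than 0")
    return 20 * math.log10(percent / 100)


if __name__ == "__main__":
    for percent in (20, 30, 100):
        print(f"{percent}% is about {percent_to_db(percent):.1f} dB")
```

Under that assumption, 20% is about -14 dB and 30% is about -10.5 dB below full volume. Whatever the number says, let your ears decide: the voice should stay clearly on top.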

7.6 Auto-Generate Subtitles

  1. Click "Text" → "Auto Captions"
  2. Select language: English (UK)
  3. Wait 30-60 seconds for processing
  4. Subtitles appear automatically
  5. Click subtitles to edit style - choose bold, readable font
  6. Position at bottom-center for YouTube, center for TikTok
💡 Subtitle Styling: White text with black outline is most readable. Yellow can work for comedy. Avoid thin fonts - they're hard to read on mobile.

7.7 Add Title Cards

  1. At the beginning, add 2-second title card:
    • "King Henry VIII Joins Twitter"
    • Subtitle: "An AI Comedy"
  2. Between scenes (optional), add 1-second chapter cards:
    • "10 Minutes Later..."
    • "Meanwhile..."
  3. At the end, 3-second outro:
    • "Created with AI"
    • Your channel name/handle

7.8 Sound Effects (Optional)

Add punch to key moments:

  • Twitter notification "ding" when showing phone
  • Dramatic music sting for comedic reveals
  • Crowd gasp when Henry says something outrageous

Free sound effects: Freesound.org or CapCut's built-in library

7.9 Final Review

  1. Watch the entire video start-to-finish
  2. Check:
    • Audio levels consistent throughout?
    • Subtitles accurate?
    • Transitions smooth?
    • Pacing feels right? (not too slow or rushed)
    • Jokes land?
  3. Make adjustments
  4. Watch one more time

8

Export Your Final Video

⏱️ 5 minutes

CapCut Export Settings

  1. Click "Export" in top-right
  2. Settings:
    • Resolution: 1080p (1920x1080 for horizontal, 1080x1920 for vertical)
    • Frame Rate: 30 FPS (or 60 FPS if all source clips are 60fps)
    • Format: MP4
    • Quality: High or Best
  3. Click "Export"
  4. Wait 2-5 minutes for rendering
  5. Save to your "05_final" folder
⚠️ File Size Alert: 3-4 minute 1080p videos are typically 100-300 MB. If much larger, reduce quality slightly. Most platforms handle up to 500MB.
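File size is roughly average bitrate multiplied by duration, which explains the 100-300 MB range above. A minimal sketch of that arithmetic follows; the function names are hypothetical and the 8 Mbps figure is an illustrative assumption, not a CapCut setting.

```python
def estimated_size_mb(bitrate_mbps, seconds):
    """Approximate file size in megabytes (8 megabits = 1 megabyte)."""
    return bitrate_mbps * seconds / 8


def implied_bitrate_mbps(size_mb, seconds):
    """Average bitrate in megabits per second implied by a file size."""
    return size_mb * 8 / seconds


if __name__ == "__main__":
    # A 3.5-minute video at an assumed 8 Mbps average bitrate.
    print(f"{estimated_size_mb(8, 210):.0f} MB")
    # Bitrates implied by the two ends of the 100-300 MB range.
    print(f"{implied_bitrate_mbps(100, 240):.2f} Mbps")
    print(f"{implied_bitrate_mbps(300, 180):.2f} Mbps")
```

A 100 MB file for a 4-minute video implies about 3.3 Mbps, and a 300 MB file for a 3-minute video implies about 13.3 Mbps. If your export lands far above that, lowering the quality setting brings the bitrate, and therefore the size, back down.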

🚀 Advanced Techniques

Multiple Characters

Create different avatars for different characters, then show "conversations" by cutting between them. Works great for historical debates!

Green Screen Compositing

Generate avatar with plain green background in HeyGen, then use CapCut's "Remove Background" to composite over AI-generated scenes.

Recurring Elements

Create a signature intro/outro, lower-third graphics, or recurring visual gags. Build a template you can reuse for future videos.

Viewer Engagement

Add "Subscribe" animations, end screens with video suggestions, or interactive elements to boost engagement.

🔧 Common Multi-Scene Issues

❌ Problem: Scenes feel disconnected

Solution: Add transition clips (establishing shots) between scenes. Use consistent background music throughout. Add chapter cards to signal scene changes.

❌ Problem: Voice sounds different between scenes

Solution: Use the EXACT same voice in HeyGen for all scenes. Save the voice as a preset. Consider generating all scenes in one sitting to ensure consistency.

❌ Problem: Pacing feels off - too slow or too fast

Solution: The 45/60/60/45 second formula is a guide, not a rule. If a joke needs more setup, let it breathe. If something drags, cut it ruthlessly. First viewers will tell you the truth!

❌ Problem: Too expensive - used too many AI credits

Solution: Mix AI with non-AI: (1) Text-only title cards, (2) Stock footage from Pexels/Pixabay, (3) Simple animations you create in CapCut, (4) Longer pauses on static images

🎉 You're a Multi-Scene Video Creator!

You now have a complete workflow for creating polished 3-4 minute AI comedy videos. It follows the same modular approach described at the start of this session: generate short clips separately, then edit them together.

✅ Skills Unlocked

  • ✅ Multi-scene story structure and planning
  • ✅ Batch avatar generation workflow
  • ✅ AI background scene generation
  • ✅ Professional video editing and assembly
  • ✅ Subtitle creation and audio mixing
  • ✅ Export optimization for all platforms

📚 Homework Before Session 3

  1. Complete and publish your first 3-4 minute multi-scene video
  2. Share it on at least one platform (TikTok, YouTube, Instagram)
  3. Analyze viewer feedback - what jokes landed? What dragged?
  4. Calculate your total cost (AI tool subscriptions used)
  5. Identify which parts were most time-consuming

Why? Session 3 will show you how to reduce costs with open source alternatives. Knowing your current workflow costs helps you make smart decisions.
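For the cost calculation in the homework, a few lines of arithmetic are enough. The sketch below uses only the monthly prices quoted in this session; prices change, so check the current ones before relying on the totals. `monthly_cost` and `cost_per_video` are hypothetical helper names.

```python
# Monthly prices quoted earlier in this session (USD).
SUBSCRIPTIONS = {
    "HeyGen paid plan": 29,
    "ChatGPT Plus (Sora 2)": 20,
    "Descript": 12,
}


def monthly_cost(used, prices=SUBSCRIPTIONS):
    """Sum the monthly price of every subscription named in `used`."""
    return sum(prices[name] for name in used)


def cost_per_video(used, videos_per_month, prices=SUBSCRIPTIONS):
    """Spread the monthly cost over the videos produced that month."""
    if videos_per_month <= 0:
        raise ValueError("videos_per_month must be greater than 0")
    return monthly_cost(used, prices) / videos_per_month


if __name__ == "__main__":
    used = ["HeyGen paid plan", "ChatGPT Plus (Sora 2)"]
    print(f"Monthly total: ${monthly_cost(used)}")
    print(f"Per video at 4 videos/month: ${cost_per_video(used, 4):.2f}")
```

With HeyGen and ChatGPT Plus only, the monthly total is $49. Bring your own figure to Session 3 so you can compare it against the open source alternatives.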