Resources - AI Video Masterclass

📚 Complete AI Video Tools Inventory

All 40+ tools mentioned in the masterclass, organized by category with pricing and capabilities.

🏢 Major AI Company Solutions

OpenAI Sora 2

$20/mo ChatGPT Plus

Capabilities: Up to 20 seconds, native audio, photorealistic quality, 1080p output

Best For: High-quality short clips, establishing shots, cinematic scenes

Limitations: 20s max, requires ChatGPT Plus subscription

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit OpenAI Sora →

Google Veo 3

Free Trial Beta

Capabilities: 8 seconds with sound, text-to-video, excellent quality

Best For: Quick clips with native audio, reaction shots

UK Note: Image-to-video is NOT available in the UK; text-to-video works

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Google Veo →

Meta Movie Gen

Not Public Research

Capabilities: 16 seconds, native audio, best physics simulation

Best For: Future use - currently not publicly available

Limitations: No public access yet, research preview only

Quality: ⭐⭐⭐⭐⭐ | Ease: N/A

Learn More →

🎬 Specialized Video Generation

Runway Gen-3 Alpha Turbo

$12/mo Popular

Capabilities: 10 seconds, cinematic control, motion brushes, camera controls

Best For: Professional creators, fine control over camera movement

Strengths: Best-in-class camera controls, consistent style

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Runway →

Pika 2.0

Free + Paid Beginner-Friendly

Capabilities: 8 seconds, sound effects, lip-sync, scene explosion effects

Best For: Quick tests, stylized content, adding effects

Strengths: Free tier generous, easy interface, fast generation

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit Pika →

Kling AI

$7/mo Longest Duration

Capabilities: Up to 2 minutes! Best for longer sequences

Best For: Extended scenes, full conversations, narrative sequences

Strengths: Duration is unmatched, good character consistency

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Kling →

Luma Dream Machine

Free + Paid Fast

Capabilities: 5-second clips, very fast generation (about 120 seconds per clip)

Best For: Rapid prototyping, testing many variations

Strengths: Speed, good free tier, smooth motion

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit Luma →

Haiper AI

$10/mo Unlimited

Capabilities: 4-6 seconds, unlimited generations on paid plan

Best For: High volume creators, testing many prompts

Strengths: Unlimited makes it cost-effective for bulk work

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Haiper →

Freepik AI Video

$9/mo Multi-Model

Capabilities: Access to 5+ models in one subscription

Best For: Comparing different models, budget-conscious

Strengths: Multiple engines, cheap access to variety

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Freepik →

🎭 Avatar & Talking Head Generators

HeyGen

$29/mo ⭐ Recommended

Capabilities: Most realistic avatars, 300+ voices, custom backgrounds

Best For: Historical characters, talking heads, professional videos

Free Trial: 1 minute video credit

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit HeyGen →

D-ID

$5.90/mo Budget

Capabilities: Animate photos, good lip-sync, 120+ voices

Best For: Animating historical portraits, budget projects

Strengths: Cheapest avatar option, good quality

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit D-ID →

Synthesia

$22/mo Corporate

Capabilities: 200+ avatars, templates, team features

Best For: Professional/corporate content, templates

Strengths: Most polished, best for business use

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit Synthesia →

Colossyan

$35/mo Enterprise

Capabilities: AI script writer, conversation mode, 70+ languages

Best For: Training videos, multi-lingual content

Strengths: Built-in scriptwriting, conversation features

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Colossyan →

Elai.io

$23/mo Versatile

Capabilities: Custom avatars, PPT to video, article to video

Best For: Repurposing content, custom avatars

Strengths: Content conversion tools, API access

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Elai →

🎤 Voice & Audio Generation

ElevenLabs

Free + $5/mo ⭐ Best Voice

Capabilities: Voice cloning, 29 languages, ultra-realistic

Best For: Custom character voices, voice cloning

Free Tier: 10,000 characters/month (about 10 minutes of audio)

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Visit ElevenLabs →
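
The free-tier figure above implies roughly 1,000 characters per minute of audio. The sketch below uses that ratio to check how much of the monthly allowance a script would use. The rate is only derived from the figure quoted here, real pacing varies by voice, and the function names are hypothetical.

```python
# Assumption: ~1,000 characters per minute of audio, derived from the
# free-tier figure above (10,000 characters ~ 10 minutes). Real pace varies.
CHARS_PER_MINUTE = 1_000
FREE_TIER_CHARS = 10_000


def estimated_minutes(script: str) -> float:
    """Rough audio length of a script, in minutes."""
    return len(script) / CHARS_PER_MINUTE


def scripts_per_month(script: str) -> int:
    """How many scripts of this length fit in the free monthly allowance."""
    return FREE_TIER_CHARS // max(len(script), 1)


if __name__ == "__main__":
    script = "x" * 900  # stand-in for a 900-character monologue
    print(f"{estimated_minutes(script):.1f} min per script, "
          f"{scripts_per_month(script)} scripts per month")
```

Counting characters before generating audio helps avoid spending the allowance on drafts.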

Play.ht

$39/mo Professional

Capabilities: 900+ voices, voice cloning, commercial rights

Best For: Podcast creators, audiobooks

Strengths: Huge voice library, clear licensing

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Play.ht →

Murf.ai

$23/mo Team Features

Capabilities: 120+ voices, video sync, team collaboration

Best For: Teams, professional narration

Strengths: Good team features, voice changer

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Murf →

🔓 Open Source Models

HunyuanVideo (Tencent)

Free ⭐ Best Quality

Capabilities: 13B parameters, text-to-video, image-to-video

Hardware: 48GB+ VRAM (dual GPUs or cloud)

Fine-Tune: SkyReels V1 specializes in humans - perfect for historical characters!

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐

HuggingFace →

Mochi 1 (Genmo)

Free Accessible

Capabilities: 10B parameters, 5.4s at 30fps, Apache 2.0 license

Hardware: 24GB+ VRAM (single RTX 4090)

Strengths: Most accessible high-quality model

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐

Download →

LTXVideo (Lightricks)

Free Low VRAM

Capabilities: Optimized for speed, 24fps, image/video-to-video

Hardware: 12GB VRAM minimum (RTX 3060)

Strengths: Runs on consumer hardware, very fast

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐

Download →

Wan-2.1 (Alibaba)

Free Efficient

Capabilities: Multiple variants (1.3B and 14B), excellent i2v

Hardware: 8GB+ VRAM (small model)

Strengths: Very efficient, good for limited hardware

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐

Download →

Open-Sora 2.0

Free Research

Capabilities: 11B params, Sora-like architecture

Hardware: 40GB+ VRAM

Strengths: Academic research, experimental features

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐

GitHub →

✂️ Video Editing & Enhancement

CapCut

Free ⭐ Beginner

Capabilities: Auto-captions, effects, transitions, templates

Best For: Beginners, social media content, quick edits

Strengths: 100% free, easy interface, AI features

Quality: ⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐⭐

Download CapCut →

Descript

$12/mo Innovative

Capabilities: Edit video by editing text, AI voices, studio sound

Best For: Podcasters, creators who think in text

Strengths: Unique transcript editing, voice cloning

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Descript →

DaVinci Resolve

Free Professional

Capabilities: Hollywood-grade editing, color grading, effects

Best For: Serious creators, professional quality

Strengths: Free full version, industry standard

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐

Download →

Topaz Video AI

$299 Enhancement

Capabilities: Upscale to 8K, denoise, deinterlace, frame interpolation

Best For: Enhancing AI-generated or old footage

Strengths: Best upscaling available, one-time purchase

Quality: ⭐⭐⭐⭐⭐ | Ease: ⭐⭐⭐⭐

Visit Topaz →

☁️ Cloud GPU Providers

RunPod

$0.39/hr ⭐ Recommended

GPUs: RTX 4090, A6000, A100

Best For: Beginners, on-demand usage

Strengths: Simple interface, community templates

Visit RunPod →

Vast.ai

$0.20/hr Cheapest

GPUs: Marketplace - various options

Best For: Budget-conscious, flexible needs

Strengths: Lowest prices, many GPU choices

Visit Vast.ai →

Lambda Labs

$1.10/hr Reliable

GPUs: A100, H100

Best For: Heavy models, professional projects

Strengths: Most reliable, fastest GPUs

Visit Lambda →

💡 Ready-to-Use Workflows

Proven workflows for different video types and historical periods.

🎭 Single Character Monologue

Best for: Comedy routines, historical commentary

Script (ChatGPT/Claude)

Write 150-word monologue in character's voice

Voice (ElevenLabs or HeyGen)

Select period-appropriate voice, generate audio

Avatar (HeyGen)

Upload portrait, sync with voice, generate video

Polish (CapCut)

Add subtitles, music, export

⏱️ Total Time: 45-60 minutes | 💰 Cost: $0-2 per video

🎬 Multi-Scene Narrative

Best for: 3-4 minute episodes with story arcs

Plan Structure

4 scenes × 45-60 seconds each. Outline full story

Generate Scenes

Create 4 avatar clips in HeyGen. Batch process to save time

B-Roll (Optional)

Use Sora/Kling for establishing shots, transitions

Edit & Assemble

Stitch in CapCut, add transitions, music, subtitles

⏱️ Total Time: 2-3 hours | 💰 Cost: $5-10 per video
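
The episode length in this workflow follows directly from the scene plan. A small sketch of that arithmetic, with hypothetical helper names, for anyone adjusting the scene count or scene length:

```python
def episode_length_range(scenes: int, min_s: int, max_s: int) -> tuple[int, int]:
    """Shortest and longest total runtime in seconds for a scene plan."""
    return scenes * min_s, scenes * max_s


def as_mmss(seconds: int) -> str:
    """Format a number of seconds as minutes:seconds."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    low, high = episode_length_range(scenes=4, min_s=45, max_s=60)
    print(as_mmss(low), "to", as_mmss(high))  # 3:00 to 4:00
```

Four scenes of 45-60 seconds land in the 3-4 minute range this workflow targets; B-roll and transitions add to that total.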

💬 Historical Conversation

Best for: Debates, interviews between historical figures

Write Dialogue

Back-and-forth script between 2 characters

Create Both Avatars

Generate each character separately in HeyGen

Cut Between Characters

Edit conversation by cutting between speakers

Add Split Screen

Optional: Show both characters simultaneously

⏱️ Total Time: 1.5-2 hours | 💰 Cost: $4-8 per video

🔓 Open Source Workflow

Best for: High volume, cost-sensitive projects

Set Up RunPod Instance

Deploy Mochi 1 or HunyuanVideo on cloud GPU

Batch Generate

Create 5-10 clips in one session (saves money)

Download & Terminate

Save all files locally, end GPU rental immediately

Edit Locally

Use DaVinci Resolve (free) for final assembly

⏱️ Total Time: 2-4 hours setup + 1hr per video | 💰 Cost: $0.50-2 per video
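
Batching saves money because the fixed part of a rental session (starting the instance, loading the model) is paid once rather than once per clip. The sketch below is a rough model of that, not a pricing calculator: the $0.39/hr rate comes from the RunPod entry in the list above, while the setup and per-clip times are hypothetical placeholders, and storage or data-transfer charges are ignored.

```python
def cost_per_clip(hourly_rate: float, setup_hours: float,
                  hours_per_clip: float, clips: int) -> float:
    """Average rental cost per clip for one GPU session.

    The setup time is paid once per session, so it is spread
    across every clip generated in that session.
    """
    if clips < 1:
        raise ValueError("clips must be at least 1")
    session_hours = setup_hours + hours_per_clip * clips
    return hourly_rate * session_hours / clips


if __name__ == "__main__":
    RATE = 0.39            # $/hr, RunPod figure from the list above
    SETUP_HOURS = 0.25     # hypothetical: instance start + model load
    HOURS_PER_CLIP = 0.10  # hypothetical: generation time per clip

    for n in (1, 5, 10):
        cost = cost_per_clip(RATE, SETUP_HOURS, HOURS_PER_CLIP, n)
        print(f"{n:>2} clips: ${cost:.3f} per clip")
```

Replace the placeholder times with your own measurements; the per-clip cost drops as the batch grows, which is why terminating the instance only after the whole batch is done pays off.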

⚖️ Ethics & Compliance

Legal requirements and ethical guidelines for AI-generated content in the UK.

🔴 MANDATORY: Disclosure Requirements

UK Law Requirements:

Must disclose content is AI-generated
Label as "Created with AI" or "AI-Generated Content"
Apply to ALL platforms (YouTube, TikTok, Instagram)
Include in video itself AND description

Example Disclosure:

"This video features AI-generated characters and voices. Historical figures are portrayed for entertainment purposes."

🔴 MANDATORY: Consent & Rights

Portrait Rights:

Living People: ALWAYS obtain written consent
Historical Figures (pre-1900): Generally public domain
Recent Historical (post-1900): Check estate rights
Paintings/Photos: Check copyright of the IMAGE itself

⚠️ Never use photos of living people without explicit consent, even for parody.

📋 Platform-Specific Rules

YouTube:

Check "altered or synthetic content" in upload settings
Clearly label in description
No misleading thumbnails

TikTok:

Use #AI hashtag
Label in video or caption
Follow community guidelines on synthetic media

Instagram:

Similar to TikTok requirements
Use AI content tags when available
Clear disclosure in caption

⚠️ Avoid These Violations

DON'T:

❌ Use AI to deceive or spread misinformation
❌ Create deepfakes of living people without consent
❌ Defame historical figures in a way that harms their legacy
❌ Use copyrighted music without license
❌ Violate platform ToS by hiding AI generation
❌ Create content that could be considered hate speech

✅ Best Practices

DO:

✅ Always disclose AI use prominently
✅ Use public domain or Creative Commons images
✅ Check tool ToS for commercial use rights
✅ Keep records of image sources and permissions
✅ Portray historical figures respectfully (even in comedy)
✅ Use royalty-free music or original scores
✅ Credit any third-party assets

📄 Commercial Use Rights

Tool Licensing:

HeyGen: Creator plan allows commercial use
D-ID: All plans include commercial rights
Open Source: Check specific license (Apache 2.0 = commercial OK)
Free Trials: Usually personal use only

Always read the Terms of Service before monetizing content!

📚 UK-Specific Resources:
• Ofcom Broadcasting Code: ofcom.org.uk
• ASA Advertising Standards: asa.org.uk
• ICO (Data Protection): ico.org.uk

🎓 Glossary of Terms

Plain English explanations of AI video terminology.

Avatar

A digital character (usually a talking head) generated by AI. In this masterclass, avatars are historical figures brought to life from portraits.

B-Roll

Supplementary footage that plays while narration continues. Examples: establishing shots of castles, close-ups of objects, transition scenes.

Batch Processing

Creating multiple videos in one session. More efficient than generating one at a time. Saves money on cloud GPU rentals.

ControlNet

A technique for guiding AI generation with reference images. Helps maintain consistency across multiple clips.

Deepfake

AI-generated video where a person's face is replaced with someone else's. Requires consent if using real people. Not used in this masterclass.

Diffusion Model

The AI technology behind most video generators. Works by gradually removing noise from random pixels until a coherent video emerges.

Fine-Tune

Further training an existing AI model on specific data to specialize it. Example: SkyReels is a fine-tune of HunyuanVideo specialized for human characters.

Frame Rate (fps)

Frames per second. Higher = smoother motion. Most AI videos are 24-30fps. Some tools offer 60fps for ultra-smooth results.
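
Frame rate and clip length together determine how many frames a model has to generate. A minimal sketch of that arithmetic (the helper name is hypothetical, and real models may round to their own frame counts):

```python
def frame_count(duration_s: float, fps: int) -> int:
    """Approximate number of frames in a clip of the given length."""
    return round(duration_s * fps)


if __name__ == "__main__":
    print(frame_count(5.4, 30))  # 162
    print(frame_count(8, 24))    # 192
```

Doubling the frame rate doubles the frames to generate for the same duration, which is one reason 60fps output costs more time.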

Green Screen / Chroma Key

Filming a subject against a solid green background, then replacing that background digitally. HeyGen can generate avatars with green backgrounds.
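
To make the idea concrete, here is a deliberately simplified sketch of chroma keying on tiny in-memory images (lists of RGB tuples). The green test and its margin are assumptions chosen for illustration; real editors such as CapCut or DaVinci Resolve use far more refined keying with soft edges and spill suppression.

```python
Pixel = tuple[int, int, int]  # (red, green, blue), each 0-255
Image = list[list[Pixel]]     # rows of pixels


def is_green(pixel: Pixel, margin: int = 60) -> bool:
    """Treat a pixel as background when green clearly dominates.

    The margin of 60 is an illustrative assumption, not a standard value.
    """
    r, g, b = pixel
    return g - max(r, b) > margin


def chroma_key(foreground: Image, background: Image) -> Image:
    """Replace green foreground pixels with the matching background pixel."""
    return [
        [bg if is_green(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


if __name__ == "__main__":
    avatar = [[(0, 255, 0), (200, 150, 120)]]  # green pixel, skin-tone pixel
    castle = [[(10, 10, 10), (20, 20, 20)]]    # replacement background
    print(chroma_key(avatar, castle))
```

Only the green pixel is swapped for the background; the skin-tone pixel is kept as is.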

Image-to-Video (i2v)

AI that animates a still image into a video. Used to bring historical portraits to life.

Inference

Running the AI model to generate output. "Inference time" = how long generation takes. Faster inference = less cloud GPU cost.

Latent Space

The mathematical "imagination space" where AI models create content. Not important for users, but you'll see this term in technical docs.

Lip-Sync

Matching mouth movements to audio. HeyGen and D-ID do this automatically. Quality varies - HeyGen is best.

LoRA (Low-Rank Adaptation)

A lightweight fine-tuning method. Adds specialized capabilities to a model without retraining the entire thing. Much faster and cheaper.

Parameters

The "size" of an AI model, measured in billions (B). More parameters generally = better quality but needs more VRAM. Example: Mochi 1 has 10B parameters.
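
A rough way to see why parameter count drives VRAM needs: each parameter is stored as a number, and the storage format sets its size in bytes. The sketch below estimates the memory for the weights alone. It is a lower bound under stated assumptions; actual usage is higher because of activations and extra components, and can be lowered with quantization or offloading.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for size in (10, 13):
        print(f"{size}B parameters at fp16: about {weight_memory_gb(size):.0f} GB")
```

A 10B-parameter model at 16-bit precision needs about 20 GB just for its weights, before any working memory.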

Prompt

The text description you give the AI. Good prompts = better results. Example: "King Henry VIII sits on throne, looking confused at smartphone"

Render

The process of computing the final video file. Can take 2-10 minutes depending on length and quality. Also called "export."

Resolution

Video dimensions in pixels. Common sizes: 1080p (1920×1080), 720p (1280×720), 4K (3840×2160). Higher = better quality but larger files.
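
The jump between resolutions is larger than the names suggest, because pixel count grows with width times height. A short sketch comparing the sizes listed above:

```python
# Width and height in pixels for the resolutions listed above.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixels_per_frame(name: str) -> int:
    """Total pixels in a single frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


if __name__ == "__main__":
    base = pixels_per_frame("1080p")
    for name in RESOLUTIONS:
        count = pixels_per_frame(name)
        print(f"{name}: {count:,} pixels ({count / base:.2f}x 1080p)")
```

4K has four times the pixels of 1080p per frame, which is why it produces much larger files and slower renders.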

Seed

A number that controls randomness in AI generation. Same prompt + same seed (with the same model and settings) = the same output. Useful for consistency.
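
The behavior of a seed can be shown with Python's standard random number generator. The function below is only a stand-in for a video model (it returns a few random numbers rather than a video), but the principle is the same: the seed fixes the starting randomness, so repeating it repeats the result.

```python
import random


def fake_generation(prompt: str, seed: int, steps: int = 4) -> list[float]:
    """Stand-in for a generator: returns seeded pseudo-random values.

    Hypothetical helper for illustration; a real model would use the
    seed to create its initial noise instead.
    """
    rng = random.Random(f"{prompt}|{seed}")
    return [round(rng.random(), 4) for _ in range(steps)]


if __name__ == "__main__":
    prompt = "King Henry VIII sits on throne"
    first = fake_generation(prompt, seed=42)
    repeat = fake_generation(prompt, seed=42)
    other = fake_generation(prompt, seed=43)
    print("same seed, same output:", first == repeat)
    print("different seed, same output:", first == other)
```

Noting the seed of a clip you like makes it possible to regenerate it later or to vary only the prompt while keeping the rest stable.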

Synthetic Media

Content created or modified by AI. Includes AI-generated videos, voices, and images. Must be labeled as such in the UK.

Text-to-Speech (TTS)

AI that converts written text into spoken audio. ElevenLabs is the best TTS for character voices.

Text-to-Video (t2v)

AI that creates video from text descriptions. Sora 2, Veo 3, and Runway Gen-3 are all text-to-video models.

Token

Unit of text processing in AI. One token is roughly 0.75 English words. Some tools charge by tokens instead of time.
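
The rule of thumb in this entry can be turned into a quick estimate for a script. This is a sketch only: real tokenizers split text differently depending on the model and language, so treat the result as a ballpark figure.

```python
import math

# Rule of thumb from this glossary entry: one token is about 0.75 words.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Ballpark token count based on whitespace-separated words."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


if __name__ == "__main__":
    script = "word " * 150  # stand-in for a 150-word monologue
    print(estimate_tokens(script))  # 200
```

A 150-word monologue comes out at roughly 200 tokens under this estimate.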

Uncanny Valley

When AI-generated humans look almost-but-not-quite real, causing discomfort. Good lip-sync and natural movement help avoid this.

Upscale

Increasing video resolution using AI. Topaz Video AI is the best tool for this. Can turn 480p AI output into 1080p or 4K.

Video-to-Video (v2v)

AI that transforms existing video into a different style. Example: making a real video look like a painting.

Voice Cloning

Training AI to replicate a specific voice. ElevenLabs can clone your voice from 1 minute of audio. Requires consent if cloning others.

VRAM

Video RAM - memory on your GPU. More VRAM = can run larger AI models. Open source models need 12-48GB+ VRAM.

Weights

The learned values of a trained AI model. "Downloading the weights" means getting the model files. Large files - often 10-50GB.

⚡ Quick Reference

🆓 Best Free Tools

CapCut (editing)
DaVinci Resolve (pro editing)
Pika 2.0 (video generation - limited)
All open source models

💰 Best Value ($10-30/mo)

HeyGen ($29) - best avatars
Sora 2 ($20 via ChatGPT Plus)
Descript ($12) - innovative editing
Cloud GPUs (~$10-20/mo light use)

⭐ Beginner Recommended

Start: HeyGen + CapCut
Voice: HeyGen built-in (easier) or ElevenLabs (better)
Editing: CapCut (simplest)
Avoid open source until comfortable

🚀 Pro Creator Stack

Avatars: HeyGen or own hardware
Scenes: HunyuanVideo (open source)
Editing: DaVinci Resolve
Voice: ElevenLabs Professional
