How to Generate AI Videos with Your Agent (OpenClaw + Veevid)

You can generate AI videos by talking to your agent. No hand-written API calls, no dashboards, no manual polling - just say what you want and get a video back.

The Veevid skill connects your OpenClaw agent to Veevid's API, giving it access to 10 video generation models. Your agent handles model selection, cost estimation, generation, and delivery.

What You Need

An OpenClaw agent
A Veevid account with credits
An API key from veevid.ai/settings/api-keys

Step 1: Install the Skill

One command:

npx clawhub@latest install veevid

Then save your API key:

mkdir -p ~/.config/veevid
echo "vv_sk_your_key_here" > ~/.config/veevid/api_key

Your agent now knows how to generate videos.
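
If you want to check that the key was saved correctly, or reuse it from your own scripts, here is a minimal sketch. It relies only on the file path from the commands above; the function name load_api_key is illustrative and not part of the skill.

```python
from pathlib import Path

# Location used by the shell commands above.
DEFAULT_KEY_PATH = Path.home() / ".config" / "veevid" / "api_key"


def load_api_key(path: Path = DEFAULT_KEY_PATH) -> str:
    """Read the key file and strip the trailing newline that echo adds."""
    key = path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"API key file {path} is empty")
    return key
```

Stripping whitespace matters because echo appends a newline, and a key with a trailing newline is an easy way to end up with confusing authentication errors.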

Step 2: Talk to Your Agent

Text to Video

Describe what you want:

You: "Generate a cinematic video of a golden retriever running through sunflowers"

Agent: "I'll use Veo 3.1 standard mode. Cost: 20 credits. You have 451 credits. Proceed?"

You: "Yes"

Agent: "Done! Here's your 8-second video: [video URL]"

What the agent does:

Picks the best model for your request
Calls /api/quote to get the exact cost
Asks you to confirm before spending credits
Generates the video and polls until ready
Returns the video URL
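
The confirmation step is the part that protects your credits, so it is worth seeing in isolation. The sketch below is an illustration only: the Quote class, its fields, and the list of accepted replies are assumptions, not the skill's actual code.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """Hypothetical container for what the agent knows before generating."""
    model: str
    cost: int      # credits the generation would consume
    balance: int   # credits currently in the account


def confirmation_prompt(quote: Quote) -> str:
    """Build the message shown to the user before any credits are spent."""
    if quote.cost > quote.balance:
        short = quote.cost - quote.balance
        return (f"{quote.model} costs {quote.cost} credits, but you have "
                f"{quote.balance}. You need {short} more credits.")
    return (f"I'll use {quote.model}. Cost: {quote.cost} credits. "
            f"You have {quote.balance} credits. Proceed?")


def is_confirmed(reply: str) -> bool:
    """Treat only an explicit yes-style reply as consent (assumed reply list)."""
    return reply.strip().lower() in {"yes", "y", "go ahead", "proceed"}
```

Anything that is not a clear yes counts as a no, which is the safer default when a reply costs credits.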

Image to Video

Send an image and ask your agent to animate it:

You: "Turn this product photo into a video" (attach image)

Agent: (finds your image in chat history and quotes the message) "Is this the right image?"

You: "Yes, use Kling 3.0, 10 seconds"

Agent: "Kling 3.0 Standard, 10 seconds with audio. Cost: 250 credits. You have 400 credits. Proceed?"

You: "Go ahead"

Agent: "Video ready! [video URL]"

The agent handles image upload, model configuration, and polling - you just talk.

On a Budget

You: "Make a quick test video, keep it cheap"

Agent: "The cheapest option is Grok Imagine at 480p for 6 seconds - 10 credits. Want me to use that?"

The 10 Models

Your agent can access all of these:

Model	Best For	Credits
Veo 3.1	Budget-friendly, native audio	20-140 credits
Grok Imagine	Fastest generation, native audio	10-60 credits
Veevid 1.0 Pro	Audio-visual sync	12-288 credits
Seedance 1.5 Pro	Audio, wide aspect ratios	12-288 credits
Kling 3.0	Multi-shot, audio, 3-15s	48-495 credits
LTX 2.3	Up to 4K (2160p), audio	48-960 credits
Kling 2.6	Cinematic, multi-shot, audio	70-280 credits
Sora 2 Stable	Best prompt accuracy, native audio	80-2000 credits
Sora 2	Longest clips (25s), storyboard mode	20-315 credits
Wan 2.6	Video-to-video, multi-shot, audio	100-450 credits
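
To make a request like "keep it cheap" concrete, here is a rough sketch that filters the table by minimum cost. The credit ranges are copied from the table above; the function name and the selection rule are assumptions, and the real cost depends on resolution, duration, and audio, so /api/quote remains the authority.

```python
# (model, minimum credits, maximum credits), copied from the table above.
MODELS = [
    ("Veo 3.1", 20, 140),
    ("Grok Imagine", 10, 60),
    ("Veevid 1.0 Pro", 12, 288),
    ("Seedance 1.5 Pro", 12, 288),
    ("Kling 3.0", 48, 495),
    ("LTX 2.3", 48, 960),
    ("Kling 2.6", 70, 280),
    ("Sora 2 Stable", 80, 2000),
    ("Sora 2", 20, 315),
    ("Wan 2.6", 100, 450),
]


def cheapest_models(budget: int) -> list[str]:
    """Return models whose minimum cost fits the budget, cheapest first."""
    affordable = [(low, name) for name, low, _high in MODELS if low <= budget]
    # Sorting the (cost, name) pairs orders by cost, then alphabetically.
    return [name for _low, name in sorted(affordable)]
```

With a budget of 20 credits, the first entry returned is Grok Imagine, which matches the budget example earlier in this article.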

You can let the agent choose, or specify:

"Use Kling 3.0 for this one"

"Give me the cheapest option"

"I need a 20-second video, what can do that?"

Advanced: Multi-Shot and Storyboard

Kling 3.0 Multi-Shot

Kling 3.0 supports multi-shot sequences - multiple prompts with different durations in one video:

You: "Make a multi-shot product ad: first 3 seconds close-up of the phone, then 5 seconds of someone using it, then 2 seconds logo reveal"

The agent builds the multi_prompt array from your description, so you never write it by hand.
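
For a sense of what such an array might look like, here is a sketch for the product ad request. Only the name multi_prompt comes from the skill; the "prompt" and "duration" keys and the build_multi_prompt helper are hypothetical, so check the API docs for the real schema.

```python
def build_multi_prompt(shots: list[tuple[str, int]]) -> list[dict]:
    """Turn (prompt, seconds) pairs into a list of per-shot entries.

    The "prompt" and "duration" keys are assumed for illustration.
    """
    total = sum(seconds for _prompt, seconds in shots)
    # Kling 3.0 is listed in the model table as supporting 3-15 second clips.
    if not 3 <= total <= 15:
        raise ValueError(f"total duration {total}s is outside 3-15s")
    return [{"prompt": prompt, "duration": seconds} for prompt, seconds in shots]


multi_prompt = build_multi_prompt([
    ("Close-up of the phone", 3),
    ("Someone using the phone", 5),
    ("Logo reveal", 2),
])
```

Checking the total duration up front avoids spending a quote or a generation attempt on a sequence the model cannot produce.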

Sora 2 Storyboard

Sora 2 has a storyboard mode for multi-scene narratives:

You: "Create a storyboard video: Scene 1 - sunrise over mountains (5s), Scene 2 - hiker reaching the summit (5s), Scene 3 - panoramic view (5s)"

Start + End Frames (Kling 3.0)

Send two images to control exactly where the clip starts and where it ends:

You: "Use these two images as start and end frames for a 10-second transition" (attach two images)

How It Works

The Veevid skill teaches your agent this workflow:

1. Parse user intent - pick model + parameters
2. POST /api/quote - get exact credit cost
3. Confirm with user - "250 credits, proceed?"
4. POST /api/generate-video - start generation
5. Poll /api/video-generation/{id}/status - wait
6. Return video URL to user

Your agent reads the skill's SKILL.md and follows these steps. No coding needed.
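
If you are curious what those six steps amount to in code, here is a rough sketch. The three endpoint paths come from the list above; everything else is an assumption made for illustration, including the response fields ("credits", "id", "status", "video_url"), the status values, and the send callable that stands in for an authenticated HTTP client.

```python
import time
from typing import Callable, Optional

# Hypothetical transport: (method, path, json_body) -> parsed JSON response.
Transport = Callable[[str, str, Optional[dict]], dict]


def generate_video(
    send: Transport,
    params: dict,
    confirm: Callable[[int], bool],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> Optional[str]:
    """Quote, confirm, generate, then poll. Returns None if the user declines."""
    # Step 2: ask for the exact cost before spending anything.
    quote = send("POST", "/api/quote", params)
    # Step 3: stop here unless the user agrees to the quoted cost.
    if not confirm(quote["credits"]):  # assumed field name
        return None
    # Step 4: start the generation job.
    job = send("POST", "/api/generate-video", params)
    # Step 5: poll the status endpoint until the job finishes.
    for _ in range(max_polls):
        status = send("GET", f"/api/video-generation/{job['id']}/status", None)
        if status["status"] == "completed":  # assumed status value
            return status["video_url"]       # Step 6, assumed field name
        if status["status"] == "failed":     # assumed status value
            raise RuntimeError("video generation failed")
        time.sleep(poll_seconds)
    raise TimeoutError("gave up waiting for the video")
```

Passing the transport in as a function keeps the flow easy to test with a fake, and the confirm callback makes it hard to spend credits without an explicit yes.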

Tips

Costs are confirmed first - The agent quotes before generating, so no surprises
Be specific - "10-second vertical product video with Kling 3.0" works better than "make a video"
Pick the right model - Grok Imagine or Veo 3.1 for quick and cheap, Kling 3.0 for quality, Sora 2 for long-form
Check your balance - Ask "How many credits do I have?"

Get Started

Install: npx clawhub@latest install veevid
Get your API key: veevid.ai/settings/api-keys
Save it: mkdir -p ~/.config/veevid && echo "vv_sk_xxx" > ~/.config/veevid/api_key
Tell your agent: "Generate a video of a cat playing piano"

One skill, 10 models, no friction.

The Veevid skill is on ClawHub. For API docs, see veevid.ai/developers.