How to Generate AI Videos with Your Agent (OpenClaw + Veevid)

Install the Veevid skill, say one sentence, and let your AI agent generate videos for you. 10 models, automatic quoting, and zero manual API work.

Alex Chen
You can generate AI videos by talking to your agent. No hand-written API calls, no dashboards, no manual polling - just say what you want and get a video back.

The Veevid skill connects your OpenClaw agent to Veevid's API, giving it access to 10 video generation models. Your agent handles model selection, cost estimation, generation, and delivery.

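What You Need

  • An OpenClaw agent
  • A Veevid API key (veevid.ai/settings/api-keys)
  • Veevid credits in your account
  • Node.js, which provides the npx command used to install the skill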

Step 1: Install the Skill

One command:

npx clawhub@latest install veevid

Then save your API key:

mkdir -p ~/.config/veevid
echo "vv_sk_your_key_here" > ~/.config/veevid/api_key

Your agent now knows how to generate videos.
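
If you want to check the key file yourself, the snippet below is a minimal sketch of reading it back. The path comes from the commands above; the `load_api_key` helper and the `vv_sk_` prefix check are assumptions based on the placeholder key shown there, not part of the skill.

```python
from pathlib import Path

# Location used by the shell commands above.
KEY_PATH = Path.home() / ".config" / "veevid" / "api_key"


def load_api_key(path: Path = KEY_PATH) -> str:
    """Read the key file and strip the trailing newline that echo adds."""
    key = path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    # Assumption: real keys share the "vv_sk_" prefix of the placeholder.
    if not key.startswith("vv_sk_"):
        raise ValueError("key does not start with the expected 'vv_sk_' prefix")
    return key
```

Calling `load_api_key()` with no arguments reads the default location and raises if the file is missing or empty.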

Step 2: Talk to Your Agent

Text to Video

Describe what you want:

You: "Generate a cinematic video of a golden retriever running through sunflowers"

Agent: "I'll use Veo 3.1 standard mode. Cost: 20 credits. You have 451 credits. Proceed?"

You: "Yes"

Agent: "Done! Here's your 8-second video: [video URL]"

What the agent does:

  1. Picks the best model for your request
  2. Calls /api/quote to get the exact cost
  3. Asks you to confirm before spending credits
  4. Generates the video and polls until ready
  5. Returns the video URL
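
The confirmation in step 3 can be pictured as a small gate between the quote and the generation call. This is a sketch only: both function names are hypothetical, and the insufficient-balance message and the list of accepted replies are assumptions rather than the skill's documented behavior.

```python
def confirmation_prompt(model: str, cost: int, balance: int) -> str:
    """Build the message shown to the user before any credits are spent."""
    if cost > balance:
        # Assumption: the agent declines rather than overdrawing the balance.
        return (f"{model} would cost {cost} credits, but you have {balance}. "
                "Pick a cheaper option or top up.")
    return (f"I'll use {model}. Cost: {cost} credits. "
            f"You have {balance} credits. Proceed?")


def is_confirmed(reply: str) -> bool:
    """Treat a few affirmative replies as consent; anything else is a no."""
    normalized = reply.strip().lower().rstrip("!.")
    return normalized in {"yes", "y", "go ahead", "proceed", "ok"}
```

With the numbers from the conversation above, `confirmation_prompt("Veo 3.1 standard mode", 20, 451)` reproduces the agent's question, and both "Yes" and "Go ahead" count as approval.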

Image to Video

Send an image and ask your agent to animate it:

You: "Turn this product photo into a video" (attach image)

Agent: (finds your image in chat history and quotes the message) "Is this the right image?"

You: "Yes, use Kling 3.0, 10 seconds"

Agent: "Kling 3.0 Standard, 10 seconds with audio. Cost: 250 credits. You have 400 credits. Proceed?"

You: "Go ahead"

Agent: "Video ready! [video URL]"

The agent handles image upload, model configuration, and polling - you just talk.

On a Budget

You: "Make a quick test video, keep it cheap"

Agent: "The cheapest option is Grok Imagine at 480p for 6 seconds - 10 credits. Want me to use that?"

The 10 Models

Your agent can access all of these:

Model             Best For                              Credits
Veo 3.1           Budget-friendly, native audio         20-140
Grok Imagine      Fastest generation, native audio      10-60
Veevid 1.0 Pro    Audio-visual sync                     12-288
Seedance 1.5 Pro  Audio, wide aspect ratios             12-288
Kling 3.0         Multi-shot, audio, 3-15s              48-495
LTX 2.3           Up to 4K (2160p), audio               48-960
Kling 2.6         Cinematic, multi-shot, audio          70-280
Sora 2 Stable     Best prompt accuracy, native audio    80-2000
Sora 2            Longest clips (25s), storyboard mode  20-315
Wan 2.6           Video-to-video, multi-shot, audio     100-450
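
To see how a "keep it cheap" request narrows the choice, here is a rough sketch that treats the table above as data. The credit ranges are copied from the table; the helper names are hypothetical, and the real agent relies on /api/quote for exact costs rather than on these ranges.

```python
# model -> (minimum credits, maximum credits), copied from the table above.
CREDIT_RANGES = {
    "Veo 3.1": (20, 140),
    "Grok Imagine": (10, 60),
    "Veevid 1.0 Pro": (12, 288),
    "Seedance 1.5 Pro": (12, 288),
    "Kling 3.0": (48, 495),
    "LTX 2.3": (48, 960),
    "Kling 2.6": (70, 280),
    "Sora 2 Stable": (80, 2000),
    "Sora 2": (20, 315),
    "Wan 2.6": (100, 450),
}


def cheapest_model(ranges=CREDIT_RANGES):
    """Return the model with the lowest starting price."""
    return min(ranges, key=lambda name: ranges[name][0])


def models_within_budget(budget, ranges=CREDIT_RANGES):
    """Models whose cheapest configuration fits the budget, cheapest first."""
    fits = [name for name, (low, _high) in ranges.items() if low <= budget]
    return sorted(fits, key=lambda name: ranges[name][0])
```

`cheapest_model()` returns Grok Imagine, which matches the agent's answer in the budget example.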

You can let the agent choose, or specify:

"Use Kling 3.0 for this one"

"Give me the cheapest option"

"I need a 20-second video, what can do that?"

Advanced: Multi-Shot and Storyboard

Kling 3.0 Multi-Shot

Kling 3.0 supports multi-shot sequences - multiple prompts with different durations in one video:

You: "Make a multi-shot product ad: first 3 seconds close-up of the phone, then 5 seconds of someone using it, then 2 seconds logo reveal"

The agent constructs the multi_prompt array and handles it for you.
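
As an illustration of what such an array might look like, the sketch below turns the three shots from the request into a list of per-shot entries. Only the name multi_prompt comes from the skill. The "prompt" and "duration" field names are hypothetical, and reading the 3-15s range from the model table as a limit on total length is an assumption.

```python
def build_multi_prompt(shots):
    """Turn (description, seconds) pairs into a list of per-shot entries."""
    total = sum(seconds for _text, seconds in shots)
    # Assumption: the 3-15s range applies to the combined length of all shots.
    if not 3 <= total <= 15:
        raise ValueError(f"total duration {total}s is outside the 3-15s range")
    # Hypothetical field names; check the API docs for the real schema.
    return [{"prompt": text, "duration": seconds} for text, seconds in shots]


multi_prompt = build_multi_prompt([
    ("close-up of the phone", 3),
    ("someone using the phone", 5),
    ("logo reveal", 2),
])
```

The three durations add up to 10 seconds, so the request passes the length check.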

Sora 2 Storyboard

Sora 2 Pro has a storyboard mode for multi-scene narratives:

You: "Create a storyboard video: Scene 1 - sunrise over mountains (5s), Scene 2 - hiker reaching the summit (5s), Scene 3 - panoramic view (5s)"
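
One way to picture the structure in that request is as a list of scenes, each with a description and a duration. The parser below is a sketch for illustration only: the agent interprets natural language itself, and the `parse_scenes` helper and its output field names are hypothetical.

```python
import re

# Matches fragments such as "Scene 2 - hiker reaching the summit (5s)".
SCENE_PATTERN = re.compile(r"Scene\s+(\d+)\s*-\s*(.+?)\s*\((\d+)s\)")


def parse_scenes(text):
    """Extract numbered scenes with their descriptions and durations."""
    return [
        {"scene": int(number), "prompt": description, "duration": int(seconds)}
        for number, description, seconds in SCENE_PATTERN.findall(text)
    ]


scenes = parse_scenes(
    "Scene 1 - sunrise over mountains (5s), "
    "Scene 2 - hiker reaching the summit (5s), "
    "Scene 3 - panoramic view (5s)"
)
total_seconds = sum(scene["duration"] for scene in scenes)
```

For this request the parser finds three scenes totalling 15 seconds.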

Start + End Frames (Kling 3.0)

Send two images for precise control:

You: "Use these two images as start and end frames for a 10-second transition" (attach two images)

How It Works

The Veevid skill teaches your agent this workflow:

1. Parse user intent - pick model + parameters
2. POST /api/quote - get exact credit cost
3. Confirm with user - "250 credits, proceed?"
4. POST /api/generate-video - start generation
5. Poll /api/video-generation/{id}/status - wait
6. Return video URL to user
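
For readers curious what those steps look like as code, here is a minimal sketch with the HTTP layer passed in as plain functions. The endpoint paths come from the list above. The JSON field names ("credits", "id", "status", "video_url"), the status values, and the use of a GET-style call for polling are all assumptions; check the API docs for the real schema.

```python
import time
from typing import Callable, Optional


def generate_video(
    request: dict,
    post: Callable[[str, dict], dict],
    get: Callable[[str], dict],
    confirm: Callable[[int], bool],
    poll_interval: float = 5.0,
    max_polls: int = 120,
) -> Optional[str]:
    """Quote, confirm, generate, then poll until the video is ready."""
    quote = post("/api/quote", request)              # step 2: exact cost
    if not confirm(quote["credits"]):                # step 3: user approval
        return None
    job = post("/api/generate-video", request)       # step 4: start the job
    status_path = f"/api/video-generation/{job['id']}/status"
    for _ in range(max_polls):                       # step 5: wait
        status = get(status_path)
        if status["status"] == "completed":
            return status["video_url"]               # step 6: hand back the URL
        if status["status"] == "failed":
            raise RuntimeError("generation failed")
        time.sleep(poll_interval)
    raise TimeoutError("video was not ready in time")
```

Passing `post`, `get`, and `confirm` as parameters keeps the sketch independent of any HTTP library and makes it easy to try with stub functions before pointing it at a real endpoint.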

Your agent reads the skill's SKILL.md and follows these steps. No coding needed.

Tips

  • Costs are confirmed first - the agent quotes before generating, so there are no surprise charges
  • Be specific - "10-second vertical product video with Kling 3.0" works better than "make a video"
  • Pick the right model - Veo 3.1 for quick and cheap, Kling 3.0 for quality, Sora 2 for long-form
  • Check your balance - Ask "How many credits do I have?"

Get Started

  1. Install: npx clawhub@latest install veevid
  2. Get your API key: veevid.ai/settings/api-keys
  3. Save it: mkdir -p ~/.config/veevid && echo "vv_sk_xxx" > ~/.config/veevid/api_key
  4. Tell your agent: "Generate a video of a cat playing piano"

One skill, 10 models, no friction.


The Veevid skill is on ClawHub. For API docs, see veevid.ai/developers.

Alex Chen

AI Video Technology Writer at Veevid AI. Covers AI video generation, creative tools, and emerging trends in generative media.