Guide: Multi-Entity Consistency

Welcome to the guide for Vidu AI's Multi-Entity Consistency feature. Character consistency, meaning a character looks the same from shot to shot, has long been one of the hardest problems in AI video. Vidu's "Reference-to-Video" approach tackles it by letting you supply reference images for the things that must stay the same. Let's walk through how to use it.

What is Multi-Entity Consistency?

It's a feature that lets you upload up to seven reference images to define the key "entities" (characters, objects, or scenes) in your video. Vidu then uses these images as a guide, ensuring the elements in the final video match your references. This is how you tell a story with the same character across multiple clips.
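
If you keep your reference list in a script or spreadsheet before uploading, a small check can catch mistakes early. The sketch below is not part of Vidu; the `Reference` class, the `validate_references` function, and the unique-name rule are assumptions made for illustration. Only the seven-image limit and the three entity kinds come from this guide.

```python
from dataclasses import dataclass

MAX_REFERENCES = 7  # limit stated in this guide
VALID_KINDS = {"character", "object", "scene"}  # entity kinds named in this guide


@dataclass(frozen=True)
class Reference:
    """Hypothetical local record for one reference image (not a Vidu type)."""
    name: str        # e.g. "AstroGirl"
    kind: str        # "character", "object", or "scene"
    image_path: str  # where the image lives on your machine


def validate_references(refs):
    """Return refs unchanged if they pass the checks, otherwise raise ValueError."""
    if len(refs) > MAX_REFERENCES:
        raise ValueError(
            f"{len(refs)} references given; the limit is {MAX_REFERENCES}"
        )
    names = [ref.name for ref in refs]
    if len(set(names)) != len(names):
        # Assumption: unique names keep prompts unambiguous.
        raise ValueError("reference names should be unique")
    for ref in refs:
        if ref.kind not in VALID_KINDS:
            raise ValueError(f"unknown kind {ref.kind!r} for {ref.name!r}")
    return refs
```

Running the check before you upload means a list with too many images fails in your own notes rather than halfway through building a scene.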

Step 1: Create Your Character

Before you can make a video, you need a character. You can use any AI image generator (like Midjourney, Stable Diffusion, or DALL-E 3) to create a high-quality, clear image of your character. For best results, use a simple, neutral background.

Pro Tip: Create a character sheet showing your character in several poses (front, side, back) so you have extra references if you need them. A single clear, front-facing image is often enough to start.

Step 2: Upload to "My References"

Inside the Vidu AI platform, you'll find a feature called "My References". This is your personal library of characters, objects, and scenes. Upload your character image here. Give it a simple, memorable name (e.g., "AstroGirl").

Step 3: Build Your Scene

Now, select your character from the "My References" library to add them to your prompt. You can also add other references, like a background image (e.g., a "Cyberpunk City") or a key object (e.g., a "Robot Dog").

Step 4: Write the Action Prompt

With your references selected, your text prompt now only needs to describe the action. You don't need to describe the character's appearance because Vidu already knows what they look like from the reference image.

Example Scenario:

  • Reference 1: Image of your character, "AstroGirl".
  • Reference 2: Image of a "Robot Dog".
  • Reference 3: Image of a "Cyberpunk City" background.

Your Prompt:

AstroGirl is walking her Robot Dog down the street. Cinematic, neon lighting, wide angle shot.

Vidu will now generate a video that combines all three reference images into a single, coherent scene, with your character and her dog matching their reference images.
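
The example scenario can also be written down as a small data structure, which is handy if you plan shots in a script before entering them on the platform. This is a minimal sketch only: `build_shot` and the dictionary layout are hypothetical and do not represent Vidu's API or any request format. The point it illustrates is that the prompt carries only action and style, while appearance is left to the references.

```python
def build_shot(reference_names, action, style=()):
    """Bundle reference names with an action-only prompt (hypothetical helper)."""
    prompt = action.strip()
    if style:
        # Style and camera notes go after the action sentence.
        prompt = f"{prompt} {', '.join(style)}."
    return {"references": list(reference_names), "prompt": prompt}


shot = build_shot(
    ["AstroGirl", "Robot Dog", "Cyberpunk City"],
    "AstroGirl is walking her Robot Dog down the street.",
    style=("Cinematic", "neon lighting", "wide angle shot"),
)
print(shot["prompt"])
```

Note that the prompt never describes what AstroGirl or the Robot Dog look like; it only names them and says what happens.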


Pro Tips for Best Results

  • High-Quality References: The better your reference images, the better the final video. Use clear, high-resolution images.
  • Isolate Your Subject: For characters and objects, use images with clean, simple backgrounds. This helps Vidu understand exactly what the subject is.
  • Be Specific, But Not Redundant: Your text prompt should focus on the action, mood, and camera work. Let the reference images handle the appearance.
  • Generate Multiple Shots: To create a full scene, generate multiple clips. Use the same references for each clip but change the action prompt (e.g., "AstroGirl pets her Robot Dog," "A close-up of the Robot Dog wagging its tail").
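
The multi-shot tip can be sketched as a simple shot list: the references stay fixed and only the action changes from clip to clip. The code below is an illustration under assumptions; `plan_shots` and the dictionary fields are hypothetical names, not part of Vidu, and nothing here submits a generation request.

```python
REFERENCES = ["AstroGirl", "Robot Dog", "Cyberpunk City"]
STYLE = "Cinematic, neon lighting."  # shared mood for every clip

ACTIONS = [
    "AstroGirl is walking her Robot Dog down the street",
    "AstroGirl pets her Robot Dog",
    "A close-up of the Robot Dog wagging its tail",
]


def plan_shots(references, actions, style):
    """Return one shot description per action, all sharing the same references."""
    shots = []
    for number, action in enumerate(actions, start=1):
        shots.append(
            {
                "shot": number,
                "references": list(references),  # identical for every clip
                "prompt": f"{action.rstrip('.')}. {style}",
            }
        )
    return shots


for shot in plan_shots(REFERENCES, ACTIONS, STYLE):
    print(shot["shot"], shot["prompt"])
```

Keeping the reference list in one place makes it harder to swap in a different character image by accident halfway through a scene.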

Ready to Be the Director?

You can now create consistent characters. The next step is to learn how to control the camera to make your scenes truly cinematic.