How to Create an AI Clone

This article is a step-by-step tutorial on how to use the AI Clone function of a2e.ai.

What is AI Clone?

An AI clone of a specific user is a way to create videos of that person without the person being physically present or speaking. The AI trains a neural network that can synthesize the audio and images of that user. This training needs only a short video clip of the user.

A2E’s AI clone is also referred to as “instant avatar” or “small sample learning” on other avatar creation websites. In academia, this method is called “fine-tuning”.

We have trained a large AI model on audio and video data collected from thousands of people of different ethnicities and languages. This model is referred to as the base model. Each base model is assigned a version, e.g. V3, which you can see in the video creation tab.

Given video data of a new user (who is likely not part of our base model training data), we continue training the base model using only the new user’s data, so that it learns the facial appearance, lip motion, and audio rhythm of the new user. The result of this training process is called a “fine-tuned” model. A fine-tuned model can only create videos of one specific user, while a base model can create videos of many users. In return, a fine-tuned model produces much better video and audio of that specific user than the base model used directly.
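The relationship between a base model and a fine-tuned model can be illustrated with a toy sketch. This is not A2E's model or training code: it assumes a hypothetical one-parameter model y = w * x trained by gradient descent, and every name in it (`train`, `mse`, `many_users`, `new_user`) is made up for illustration. It only shows the idea that fine-tuning starts from the base weights and continues training on one user's data.

```python
def mse(w, data):
    """Mean squared error of the toy model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr=0.01, steps=200):
    """Run gradient descent on the mean squared error, starting from weight w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Hypothetical "many users" data: each user has a slightly different slope.
many_users = [(x, s * x) for s in (1.8, 1.9, 2.0, 2.1, 2.2) for x in (1.0, 2.0, 3.0)]
# Hypothetical new user whose slope was not part of the base training data.
new_user = [(x, 2.5 * x) for x in (1.0, 2.0, 3.0)]

base_w = train(0.0, many_users)         # base model: trained from scratch on many users
fine_tuned_w = train(base_w, new_user)  # fine-tuning: continue from the base weights

print("base model error on new user:      ", mse(base_w, new_user))
print("fine-tuned model error on new user:", mse(fine_tuned_w, new_user))
```

In this toy setting the fine-tuned weight fits the new user better than the base weight does, and fits the many-user data worse, which mirrors the trade-off described above.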

Step 1: Prepare Data

High-quality data is the key to a high-quality AI clone. Make sure your data is a video clip in which the target user is speaking from start to end.

| Criterion | Ideal | Bad |
|---|---|---|
| Video Length | 60-120 seconds | Less than 60 seconds; longer than 300 seconds |
| Face | Containing only a front-facing face | Side face, swinging face, upside-down face |
| # of persons | Only 1 person | Frames with no person at all; frames with more than 1 person |
| Audio Sync | Audio in sync with lip motion | Audio not in sync with lip motion |
| Speaking Volume | Mild and comfortable. Normal person speaking. | Too loud. Too quiet. |
| Noise | No noise at all | Background noise; noise of air conditioner; other people speaking; noise from streets |
| Resolution | 1080p | Less than 720p |
| Trim | Starting from a frame in the middle of the person’s speech. Ending at a frame in the middle of the person’s speech. | Starting or ending at a frame when the user stops speaking. |

Example of ideal data (video clip of a target user) for obtaining AI clone
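The length and resolution rules above are easy to check before uploading. The sketch below is a hypothetical helper, not part of a2e.ai: the function name `rate_video_clip` and the label "unrated" are assumptions, and "unrated" is used for values the table does not classify (for example a length between 120 and 300 seconds). You would read the duration and frame height from your video editor or player.

```python
def rate_video_clip(duration_s, height_px):
    """Rate a clip against the Video Length and Resolution rules listed above."""
    if 60 <= duration_s <= 120:
        length = "ideal"
    elif duration_s < 60 or duration_s > 300:
        length = "bad"
    else:
        length = "unrated"  # 120-300 seconds is not classified in the table

    if height_px == 1080:
        resolution = "ideal"
    elif height_px < 720:
        resolution = "bad"
    else:
        resolution = "unrated"  # other heights are not classified in the table

    return {"length": length, "resolution": resolution}


# Example: a 90-second clip recorded at 1080p.
print(rate_video_clip(90, 1080))
```

The remaining rules (face angle, number of persons, audio sync, noise, trim) still need a visual and listening check.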

Step 2: Train Voice and Video

2.1 Train the Voice Model

Click the “Voice Clone” tab and type a voice name that you can remember.

Upload the audio that you extracted from the video file. Several free tools can extract audio from a video file, e.g. CapCut, VLC media player, or ffmpeg.
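If you choose ffmpeg, the extraction can be scripted. The sketch below assumes ffmpeg is installed and on your PATH. The file names are hypothetical, and the choice of a 16-bit PCM WAV output is an assumption, since this article does not list the audio formats that the upload accepts. `-i`, `-vn`, and `-acodec` are standard ffmpeg options.

```python
import subprocess


def build_extract_command(video_path, audio_path):
    """Build the ffmpeg command that copies only the audio track into a WAV file."""
    return [
        "ffmpeg",
        "-i", video_path,         # input video file
        "-vn",                    # drop the video stream
        "-acodec", "pcm_s16le",   # encode audio as 16-bit PCM (assumed target format)
        audio_path,
    ]


def extract_audio(video_path, audio_path):
    """Run ffmpeg and raise an error if it exits with a failure code."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)


# Example with hypothetical file names:
# extract_audio("clone_source.mp4", "clone_voice.wav")
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.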

Alternatively, you can record live by clicking “Start Recording”.

The uploaded audio file should follow these rules:

| Criterion | Ideal | Bad |
|---|---|---|
| Length | 2-5 minutes | < 1 minute |
| Noise | No noise | Background noise; noise of air conditioner; other people speaking; noise from streets |
| # of persons | Only 1 person | Audio containing more than 1 person speaking |
| Speaking Volume | Mild and comfortable. Normal person speaking. | Too loud. Too quiet. |
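The length rule, and a rough view of the volume, can be checked on a WAV file with Python's standard library before uploading. This is a minimal sketch that assumes the audio is a 16-bit PCM WAV file. The names `inspect_wav` and `rate_length` are hypothetical, and "unrated" marks lengths the rules above do not classify. The rules give no numeric volume limits, so the peak level is only reported, not judged.

```python
import array
import sys
import wave


def inspect_wav(path):
    """Return (duration in seconds, peak level as a fraction of full scale)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        n_frames = wav.getnframes()
        duration_s = n_frames / wav.getframerate()
        samples = array.array("h", wav.readframes(n_frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    peak = max((abs(s) for s in samples), default=0) / 32768
    return duration_s, peak


def rate_length(duration_s):
    """Rate the audio length against the Length rule listed above."""
    if 120 <= duration_s <= 300:
        return "ideal"
    if duration_s < 60:
        return "bad"
    return "unrated"  # 1-2 minutes and over 5 minutes are not classified


# Example with a hypothetical file name:
# duration_s, peak = inspect_wav("clone_voice.wav")
# print(rate_length(duration_s), peak)
```

A peak close to 1.0 suggests the recording may be clipping, and a very small peak suggests it may be too quiet, but noise and the number of speakers still need a listening check.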

Click “Start Training” and agree to the credits that you will pay.

Wait for 1 minute. After that, you can use the voice in English. If you want to use other languages, e.g. Chinese, you will need to wait for 24 hours.

2.2 Train the Video Model

Click the “Video Twin” tab.

Click create new. Upload the video of the user.

Hover the mouse over the thumbnail picture. Click Animate.

Wait for about 2 hours. Then go to the create video tab and refresh your browser. You will see the thumbnail of your video clone.

Step 3: Use the AI Clone

Click the create video tab. You will see the picture of your video clone with a diamond sticker.

Select your AI clone. Type some text in the text box on the left. Select the appropriate language. Select “Voice Clone / XXX” from the Voices drop-down menu.

Click “Preview Audio”. Wait until the loading spinner finishes, then listen to the audio.

Click “Generate Video” if you are happy with the generated audio, or click the speaker icon to listen to the generated audio again.

Wait several minutes for your result. You will receive a success notice by email. Go to the “My Assets” tab to view or download your result.