ByteDance, the Chinese company behind TikTok and viral video editor CapCut, has released its first AI text-to-video model to compete with OpenAI’s yet-to-be-released Sora—but for now, it’s only available in China.
Jimeng AI was developed by Faceu Technology, a ByteDance-owned company that produces the video editing app CapCut, which is available for iPhone, Android, and online.
To gain access, you must log in with an account for Douyin, the Chinese version of TikTok. This suggests that if the app comes to other regions, it will be linked to TikTok or CapCut. It’s possible, though purely speculative, that a version of Jimeng will be integrated into CapCut in the future.
ByteDance is not the only Chinese company developing AI video models. Kuaishou, one of China’s largest video apps, last month made Kling AI Video available outside of China for the first time. It’s one of my favorite AI tools, with impressive motion quality and video realism.
What is Jimeng AI?
Jimeng AI is a text-to-video model trained and operated by Faceu Technology, the Chinese company behind video editor CapCut. Like Kling, Sora, Runway, and Luma Labs Dream Machine, it takes a text input and generates a few seconds of realistic video content.
The self-proclaimed “one-stop platform for AI creation” lets you create videos from text or images, and gives you control over camera movement as well as the option to supply the first and last frames. This is something most modern AI video generators offer: you give it two images and it fills in the moments in between.
Faceu’s focus was to ensure that its model could understand and accurately follow Chinese text instructions and transform abstract ideas into visual works.
How does Jimeng AI compare?
From the video clips I’ve seen on social media and the Jimeng website, it looks more like Runway Gen-2 or Pika Labs than Sora, Gen-3, or even Kling. The video movement appears slightly blurry or shaky, and the output seems more comical than realistic.
What I couldn’t confirm, since it’s not available outside of China, is how long each video clip is when first generated and whether you can extend a clip.
Most tools, including Kling, start with 5-second clips, while Runway starts at 10 seconds and Sora is said to start at 15 seconds. Many of them also allow for multiple extensions of that first clip.
I think that because Jimeng is mobile and tied to apps like Douyin and CapCut, it falls into a different category than Kling and Dream Machine. It’s better compared with Captions App or Diffuse, since its content is primarily geared toward social video rather than production.