Build scalable video infrastructure with the HeyGen API
Cut production costs and time by 95%; generate, translate and scale videos across 175+ languages and dialects.
Total videos generated
Total videos translated
Languages
API uptime
Enterprise-grade video intelligence
Build robust video infrastructures with enterprise API and AI capabilities designed for scale, automation, and global reach.

Before translating your video, quickly review and edit the transcript so your message is accurate and clear.

Localise training and product launches in over 175 languages and dialects with 99% lip-sync accuracy.

Turn your internal wiki or knowledge base into engaging, expressive videos with AI-powered text-to-video.

Automate the creation of avatar-led onboarding and L&D videos without cameras, studios or production teams.
Straightforward for developers. Quick for teams to ship
Generate your video in minutes with our straightforward REST API.
curl --request POST \
  --url https://api.heygen.com/v2/video_translate \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --header 'x-api-key: <your-api-key>' \
  --data '{ "translate_audio_only":"false", "keep_the_same_format":"false", "mode":"precision"}'

See how teams build scalable video workflows on HeyGen’s global video intelligence infrastructure.
Enterprise API security, reliability and control
HeyGen gives enterprises a video infrastructure built to scale AI video with the security, uptime, and controls global teams require.
Security
Your data security is our top priority. HeyGen is independently audited and certified for SOC 2 Type II and GDPR compliance, so your information stays protected.
Reliability
HeyGen is built for dependable performance, with 99.8% API uptime, so your video infrastructure stays available and your automated video workflows run without interruption.
Support
Enterprise customers get direct access to dedicated support engineers who help ensure smooth deployments, fast issue resolution, and reliable video operations at scale.
Control
Manage permissions with role-based access control (RBAC), so teams can assign access securely, maintain governance, and control how video workflows are created and managed.


Our enterprise API integration brings AI video directly into your creator workflows, so you can generate high-quality videos quickly without switching between tools.
GDPR
SOC 2 TYPE II
CCPA
AI ACT
DPF
Certified to meet global security and compliance standards
The primary distinction lies in the balance between automation and granular control.
The Video Agent API takes a single text prompt and autonomously orchestrates avatar creation, script writing, and the creation and layout of visual assets. It offers precise control where you need it whilst also allowing complete creative freedom, which makes it effective for large-scale content exploration, internal video creation and automation. It is a distinctive offering in the industry.
In comparison, the Standard Video Generation APIs have two main parts: 1) Avatar Video Generation and 2) Template Video Generation. Developers create Avatars and Video Templates on HeyGen’s web platform, and the API then uses them. Although they require more setup, these APIs provide the precise control needed for brand-consistent, high-production-value assets. Enterprise customers have created millions of videos through them to automate their content pipelines.
Yes, here are the steps to use the Photo Avatar API.
To use the resulting avatar with Avatar Video Generation, plug it in by following this guide.
Yes, the API supports pure text-to-avatar generation through a structured descriptive framework that eliminates the need for external image assets. By providing specific parameters across eight required fields—including age, gender, ethnicity, and style—the AI synthesises a unique, high-resolution persona. For example, selecting 'East Asian' ethnicity with a 'Professional' style and 'Cinematic' lighting will prompt the engine to return a selection of unique Avatars and Looks, effectively allowing enterprises to scale diverse cast libraries that do not exist in the real world.
You can follow the guide here for prompt-to-avatar.
The template system is designed for high-efficiency 'mail-merge' style video production, where a master layout acts as a container for dynamic data. Users first create or select a template via the Dashboard or API, then identify specific placeholders for text, images, or audio. When you send a single POST request to the template's generate endpoint with a JSON payload of variables, the system renders a unique video file for that recipient. Repeating the request per recipient makes the approach well suited to personalised sales outreach and customised customer onboarding at scale.
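The 'mail-merge' flow described above can be sketched in Python as follows. This is a minimal illustration, not the documented API: the endpoint path, the `variables` body shape, the template ID `tmpl_123`, and the placeholder names `first_name` and `company` are all assumptions made for the example. The requests are built but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.heygen.com"  # host taken from the curl example on this page


def build_generate_request(template_id, api_key, variables):
    """Build (without sending) one POST request for a single recipient."""
    # Assumption: this path and body shape are illustrative only;
    # check the API reference for the real endpoint and schema.
    url = f"{API_BASE}/v2/template/{template_id}/generate"
    body = json.dumps({"variables": variables}).encode("utf-8")
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "x-api-key": api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical recipient data: one dict of placeholder values per video.
recipients = [
    {"first_name": "Ada", "company": "Example Ltd"},
    {"first_name": "Grace", "company": "Sample Inc"},
]

# One request per recipient, all against the same (hypothetical) template.
batch = [build_generate_request("tmpl_123", "<your-api-key>", r) for r in recipients]

for req in batch:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To send for real, pass each request to urllib.request.urlopen(req).
```

Building the requests separately from sending them makes it straightforward to inspect or queue a batch before any rendering is triggered.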
For the most realistic result and the best lip-sync accuracy, the recommended 'Golden Path' is to programmatically retrieve and use the default_voice_id associated with a specific avatar. The vocal characteristics of that voice (such as gender, tone, and regional accent) are already optimised for the avatar's visual persona, which significantly reduces the risk of 'uncanny valley' effects. If a bespoke voice is required, developers should always filter the v2/voices list to match the avatar’s metadata to maintain audiovisual consistency.
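The selection logic above might look like this in Python. It is a sketch under stated assumptions: only `default_voice_id` comes from the text, while the record fields `voice_id`, `gender` and `language` and the sample data are hypothetical stand-ins for whatever the avatar and voice listings actually return.

```python
def matching_voices(avatar, voices, language=None):
    """Return voices whose metadata matches the avatar (and language, if given)."""
    # Assumption: both records carry a comparable "gender" field.
    result = []
    for voice in voices:
        if voice.get("gender") != avatar.get("gender"):
            continue
        if language is not None and voice.get("language") != language:
            continue
        result.append(voice)
    return result


def pick_voice_id(avatar, voices, language=None):
    """Prefer the avatar's default voice; otherwise fall back to a metadata match."""
    default_id = avatar.get("default_voice_id")
    if default_id:
        return default_id
    candidates = matching_voices(avatar, voices, language)
    return candidates[0]["voice_id"] if candidates else None


# Hypothetical sample data standing in for API responses.
avatar_a = {"avatar_id": "avatar_a", "gender": "female", "default_voice_id": "voice_1"}
avatar_b = {"avatar_id": "avatar_b", "gender": "male", "default_voice_id": None}
voices = [
    {"voice_id": "voice_1", "gender": "female", "language": "English"},
    {"voice_id": "voice_2", "gender": "male", "language": "English"},
    {"voice_id": "voice_3", "gender": "male", "language": "Spanish"},
]

print(pick_voice_id(avatar_a, voices))
print(pick_voice_id(avatar_b, voices, language="Spanish"))
```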
Because high-fidelity AI video rendering is a resource-intensive process that can take several minutes, the API is designed to be used with an asynchronous, event-driven architecture via webhooks. Instead of holding a connection open (which leads to timeouts), your application should register a webhook URL to receive an automated 'push' notification once the avatar_video.success event triggers. Your backend stays performant and processes the video, via the provided video_url, only once it becomes available.
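A webhook receiver for this pattern could be sketched as below. The event name `avatar_video.success` and the `video_url` field come from the text; the envelope keys `event_type` and `event_data`, the `video_id` field, and the sample payloads are assumptions for illustration, so the real payload should be checked against the webhook documentation.

```python
import json

SUCCESS_EVENT = "avatar_video.success"  # event name as given in the text


def extract_video_url(raw_body):
    """Return the video URL from a success notification, or None otherwise."""
    payload = json.loads(raw_body)
    # Assumption: the event name sits under "event_type" and the details
    # under "event_data"; adjust to the actual webhook payload.
    if payload.get("event_type") != SUCCESS_EVENT:
        return None
    return payload.get("event_data", {}).get("video_url")


# Hypothetical request bodies, as a web framework would hand them to a handler.
success_body = json.dumps({
    "event_type": "avatar_video.success",
    "event_data": {"video_id": "vid_123", "video_url": "https://example.com/video.mp4"},
}).encode("utf-8")
other_body = json.dumps({"event_type": "some.other_event", "event_data": {}}).encode("utf-8")

for body in (success_body, other_body):
    url = extract_video_url(body)
    if url is not None:
        print("video ready:", url)  # e.g. enqueue a download job here
    else:
        print("ignored event")
```

Keeping the handler this small lets it return quickly; any heavy processing of the finished video can be handed to a background job.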
The API supports over 40 languages and a library of more than 300 diverse voices, enabling cross-border communication. Beyond text-to-speech in different languages, the platform offers 'Video Translation' features that can take an existing video and translate the audio whilst simultaneously re-syncing the avatar’s lip movements to the new language. This ensures that the visual performance remains as authentic in Spanish or Japanese as it was in the original English recording.
HeyGen Video Translation supports 175 languages and dialects.
See how businesses like yours scale content creation and drive growth with the most innovative AI video technology.
