Automatic Captions
Automatic captions are only available for Pro and Enterprise plans.
Automatic captions let you add subtitles to your videos without manually writing transcript text. JsonCut analyses the final audio of your video and overlays subtitles that follow the spoken content.
The feature is multilingual. It takes into account all audio sources in your video configuration (for example, background music, voiceovers in clips, and separate audio tracks), so the generated subtitles reflect the complete audio mix.
Example
Below is a simple example that renders a gradient background with a fixed title and enables automatic captions.
{
  "type": "video",
  "config": {
    "width": 1080,
    "height": 720,
    "format": "mp4",
    "audioFilePath": "/audio/example.mp3",
    "loopAudio": true,
    "clipsAudioVolume": 0.5,
    "autoSubtitles": true,
    "autoSubtitlesStyle": {
      "animation": "letter-by-letter",
      "textColor": "#FFFFFF",
      "fontSize": 55,
      "position": {
        "x": 0.5,
        "y": 0.5,
        "originX": "center",
        "originY": "center"
      },
      "outlineColor": "#000000",
      "outlineWidth": 4,
      "outlineStyle": "outline"
    },
    "clips": [
      {
        "duration": 27,
        "layers": [
          {
            "type": "linear-gradient",
            "colors": ["#000000", "#333333"],
            "direction": "vertical"
          },
          {
            "type": "title",
            "text": "Auto subtitle example",
            "fontSize": 40,
            "textColor": "#FFFFFF",
            "position": "top"
          }
        ]
      }
    ]
  }
}
This configuration:
- Plays background audio from audioFilePath in a loop.
- Renders a 27-second video with a gradient background and a fixed title.
- Enables automatic captions with a letter-by-letter animation.
- Positions subtitles in the center of the frame.
- Uses a black outline for better readability.
You can combine automatic captions with more complex audio setups (for example multiple audioTracks and clip audio). JsonCut will always base the subtitles on the effective audio mix of your video, so spoken content across different layers is captured in the captions.
Configuration
Enable automatic captions at the top level of your video config.
Core properties
| Property | Type | Required | Description |
|---|---|---|---|
| autoSubtitles | boolean | ❌ | Enables automatic subtitle generation from the video audio. Default: false. |
| autoSubtitlesStyle | object | ❌ | Optional styling for the generated subtitle text, similar to title layers. |
autoSubtitlesStyle properties
autoSubtitlesStyle follows the same constraints as in the official video schema.
| Property | Type | Required | Description |
|---|---|---|---|
| animation | string | ❌ | Optional animation style for how text appears. Allowed values: "word-by-word", "letter-by-letter". |
| textColor | string | ❌ | Text color in hex format. Pattern: #RRGGBB. Default: #ffffff. |
| fontSize | number | ❌ | Font size in pixels. Range: 8–500. |
| fontPath | string | ❌ | Path to a custom font file. Must start with / (e.g. /font/.../my-font.ttf). Cannot be used together with googleFont. |
| googleFont | string | ❌ | Google Font specification in the format "FontName:weight", for example "Roboto:600". Cannot be used together with fontPath. |
| position | object/string | ❌ | Position of the subtitles. Either a string shortcut (for example "bottom", "bottom-left", "bottom-right") or an object with numeric x/y coordinates between 0 and 1 and optional originX/originY anchors. |
| outlineColor | string | ❌ | Outline color in hex format (#RRGGBB). Default: #000000. |
| outlineWidth | number | ❌ | Outline width in pixels. Must be ≥ 0. |
| outlineStyle | string | ❌ | Outline rendering style. Allowed values: "outline", "shadow", "glow". Default: "outline". |
Font selection rule: the schema prevents using both fontPath and googleFont at the same time. Choose exactly one of them when customizing fonts.
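
The constraints listed in the table, together with the font selection rule, can be checked locally before submitting a job. The Python sketch below is one possible pre-flight check, not the official schema validator: the function name validate_auto_subtitles_style is hypothetical, the googleFont check only assumes a non-empty name and weight separated by a colon, and string position shortcuts are not checked because the full list of allowed values is not given here.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")   # #RRGGBB
GOOGLE_FONT = re.compile(r"^[^:]+:[^:]+$")     # assumed shape of "FontName:weight"
ANIMATIONS = {"word-by-word", "letter-by-letter"}
OUTLINE_STYLES = {"outline", "shadow", "glow"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_auto_subtitles_style(style):
    """Return a list of problems found in an autoSubtitlesStyle dict."""
    errors = []
    if "fontPath" in style and "googleFont" in style:
        errors.append("fontPath and googleFont cannot be used together")
    if "animation" in style and style["animation"] not in ANIMATIONS:
        errors.append("animation must be one of " + ", ".join(sorted(ANIMATIONS)))
    for key in ("textColor", "outlineColor"):
        value = style.get(key)
        if key in style and not (isinstance(value, str) and HEX_COLOR.match(value)):
            errors.append(f"{key} must match #RRGGBB")
    size = style.get("fontSize")
    if "fontSize" in style and not (_is_number(size) and 8 <= size <= 500):
        errors.append("fontSize must be a number between 8 and 500")
    if "fontPath" in style and not str(style["fontPath"]).startswith("/"):
        errors.append("fontPath must start with /")
    if "googleFont" in style and not GOOGLE_FONT.match(str(style["googleFont"])):
        errors.append('googleFont must look like "FontName:weight"')
    width = style.get("outlineWidth")
    if "outlineWidth" in style and not (_is_number(width) and width >= 0):
        errors.append("outlineWidth must be a number >= 0")
    if "outlineStyle" in style and style["outlineStyle"] not in OUTLINE_STYLES:
        errors.append("outlineStyle must be one of " + ", ".join(sorted(OUTLINE_STYLES)))
    position = style.get("position")
    if isinstance(position, dict):
        for axis in ("x", "y"):
            if axis in position and not (
                _is_number(position[axis]) and 0 <= position[axis] <= 1
            ):
                errors.append(f"position.{axis} must be a number between 0 and 1")
    return errors


# Style taken from the example configuration earlier on this page.
ok_style = {
    "animation": "letter-by-letter",
    "textColor": "#FFFFFF",
    "fontSize": 55,
    "position": {"x": 0.5, "y": 0.5, "originX": "center", "originY": "center"},
    "outlineColor": "#000000",
    "outlineWidth": 4,
    "outlineStyle": "outline",
}

# Deliberately invalid style; the font path is a made-up placeholder.
bad_style = {
    "fontPath": "/font/placeholder.ttf",
    "googleFont": "Roboto:600",
    "fontSize": 4,
    "textColor": "white",
}

print(validate_auto_subtitles_style(ok_style))
for problem in validate_auto_subtitles_style(bad_style):
    print(problem)
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a style object at once.
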