The Headless AI
Video Clipping API
for Developers & Company Owners.

Ship short-form video automation as infrastructure. Developers get a clean REST API with structured callbacks. Owners get faster campaign turnaround, lower manual editing overhead, and scalable output across teams.

For Developers

Predictable API contracts, async jobs, and webhook-first delivery.

For Owners

Higher team throughput with less manual post-production effort.

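voxcut_request.py
import os

import requests

# Read the key from the environment rather than hard-coding it.
# The variable name VOXCUT_API_KEY is a placeholder; use your own.
api_key = os.environ["VOXCUT_API_KEY"]

payload = {
    "video_url": "s3://raw/podcast_042.mp4",   # long-form source video
    "aspect_ratio": "9:16",                    # vertical output
    "speaker_tracking": True,
    "captions": True,
    "webhook": "https://yourdomain.com/hook",  # receives the finished clip
}

response = requests.post(
    "https://api.voxcut.dev/v1/clip",
    json=payload,
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)
response.raise_for_status()  # surface 4xx/5xx errors immediately

webhook_response.json 200 OK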
{
  "job_id": "vx_9f2a_01",
  "status": "complete",
  "clip_url": "https://cdn.voxcut.dev/clips/vx_9f2a_01.mp4",
  "duration_seconds": 58,
  "speaker_id": "speaker_01",
  "bounding_box": {
    "x": 120, "y": 440,
    "w": 1080, "h": 1920
  },
  "caption_track": "vx_9f2a_01.vtt",
  "topic_tags": ["product_launch", "founder_story"],
  "highlight_score": 0.91
}
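
On the receiving side, a payload like the one above can be loaded into a typed object before it reaches downstream services. The following is a minimal sketch, assuming every field in the sample is always present; the class name ClipResult and its from_payload helper are illustrative and not part of the VoxCut API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ClipResult:
    """Typed view of the sample callback payload (illustrative, not an official SDK type)."""
    job_id: str
    status: str
    clip_url: str
    duration_seconds: int
    speaker_id: str
    bounding_box: Dict[str, int]
    caption_track: str
    topic_tags: List[str]
    highlight_score: float

    @classmethod
    def from_payload(cls, payload: dict) -> "ClipResult":
        # A missing key raises KeyError, so malformed callbacks fail early.
        return cls(
            job_id=payload["job_id"],
            status=payload["status"],
            clip_url=payload["clip_url"],
            duration_seconds=int(payload["duration_seconds"]),
            speaker_id=payload["speaker_id"],
            bounding_box=dict(payload["bounding_box"]),
            caption_track=payload["caption_track"],
            topic_tags=list(payload["topic_tags"]),
            highlight_score=float(payload["highlight_score"]),
        )


# The sample payload shown on this page.
sample = {
    "job_id": "vx_9f2a_01",
    "status": "complete",
    "clip_url": "https://cdn.voxcut.dev/clips/vx_9f2a_01.mp4",
    "duration_seconds": 58,
    "speaker_id": "speaker_01",
    "bounding_box": {"x": 120, "y": 440, "w": 1080, "h": 1920},
    "caption_track": "vx_9f2a_01.vtt",
    "topic_tags": ["product_launch", "founder_story"],
    "highlight_score": 0.91,
}

clip = ClipResult.from_payload(sample)
```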

Operational outcomes that matter

Faster Time to Publish

Clip delivery without manual timeline work

Lower Ops Friction

One API flow across teams and workflows

Developer-Ready Interfaces

REST endpoints, JSON payloads, signed callbacks

Owner-Level Visibility

Metadata that supports reporting and ROI tracking

One Platform, Two Critical Wins

The same infrastructure serves two stakeholders: engineers integrating automation and owners responsible for delivery speed, margin, and consistency.

Developer Priorities

Integration Without Guesswork

  • Clean REST ingestion with explicit payload fields.
  • Async lifecycle updates via webhook callbacks.
  • Structured metadata ready for downstream services.

Owner Priorities

Scalable Output Economics

  • Reduce dependency on manual editor throughput.
  • Shorten turnaround from source video to distribution.
  • Standardize quality with repeatable API workflows.

Speaker Tracking

Dynamic bounding box adjustment for multi-speaker layouts. Detects and tracks active speakers with sub-frame precision.

Semantic Extraction

AI-driven selection of high-signal moments based on narrative density and engagement metadata. Zero human intervention.

Programmatic Branding

Inject custom caption styles, progress bars, and overlays via JSON configuration. No rendering templates needed.

Webhook Delivery

Receive structured payloads with direct CDN links once processing completes. Designed for programmatic integration, not manual polling.
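
A webhook receiver for this flow can be small. The sketch below uses only the Python standard library and assumes the callback arrives as an HTTP POST whose JSON body is an object shaped like the webhook_response.json sample; the handler name, the port, and the 400 response for malformed bodies are illustrative choices, not documented VoxCut behavior.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_clip(body: bytes) -> dict:
    """Parse a callback body and keep the fields a consumer typically needs."""
    payload = json.loads(body)
    return {
        "job_id": payload["job_id"],
        "status": payload["status"],
        "clip_url": payload.get("clip_url"),  # may be absent if a job did not complete
    }


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            clip = extract_clip(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Invalid JSON or an unexpected shape: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        print("clip ready:", clip)  # hand off to a queue or database here
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Block forever, handling callbacks on the given port."""
    HTTPServer(("", port), HookHandler).serve_forever()
```

Calling serve() starts the listener. In production this would normally sit behind HTTPS and verify the callback signature before trusting the body.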

Integration Flow: Source to Delivery

A production path that gives developers deterministic control and gives owners predictable execution at scale.

01

Ingest via API

Push MP4/MOV URLs or cloud storage paths. VoxCut validates payloads, queues jobs, and starts processing immediately.

POST /v1/clip
{ "video_url": "s3://raw/file.mp4" }
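
Checking a source path on the client before submitting saves a round trip on obviously bad input. This is a sketch only: it reuses the field names from the voxcut_request.py sample, the helper name build_clip_payload is hypothetical, and the list of accepted URL schemes is an assumption (the page mentions URLs and cloud storage paths without enumerating them).

```python
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = (".mp4", ".mov")  # formats named in the ingest step
ALLOWED_SCHEMES = ("https", "s3")      # assumption: not an official list


def build_clip_payload(video_url: str, webhook: str, aspect_ratio: str = "9:16") -> dict:
    """Validate the source URL locally and return a request payload."""
    parsed = urlparse(video_url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError("source must be an MP4 or MOV file")
    return {
        "video_url": video_url,
        "aspect_ratio": aspect_ratio,
        "speaker_tracking": True,
        "captions": True,
        "webhook": webhook,
    }


payload = build_clip_payload("s3://raw/file.mp4", "https://yourdomain.com/hook")
```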

02

Async Processing

Our engine extracts speakers, generates dynamic captions, and crops to 9:16 based on focal point analysis, so teams can scale output without scaling manual editing labor.

STATUS: ANALYZING_SCENE
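
To make the 9:16 crop concrete, here is one generic way to place a vertical window around a horizontal focal point. It is an illustration of the idea, not VoxCut's focal point analysis; the function name and the clamping behavior are assumptions.

```python
def vertical_crop(frame_w: int, frame_h: int, focal_x: float, ratio=(9, 16)) -> dict:
    """Return a ratio-shaped crop window centred on focal_x and clamped to the frame."""
    crop_h = frame_h
    crop_w = round(frame_h * ratio[0] / ratio[1])
    if crop_w > frame_w:
        # Source is narrower than the target shape: keep full width, trim height.
        crop_w = frame_w
        crop_h = round(frame_w * ratio[1] / ratio[0])
    # Centre on the focal point, then keep the window inside the frame.
    x = round(focal_x - crop_w / 2)
    x = max(0, min(x, frame_w - crop_w))
    y = (frame_h - crop_h) // 2
    return {"x": x, "y": y, "w": crop_w, "h": crop_h}


# A speaker centred in a 1920x1080 landscape frame.
window = vertical_crop(1920, 1080, focal_x=960)
```

The returned keys mirror the bounding_box object in the callback payload, so the same shape can be compared against what the API reports.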

03

Structured Callback

Your registered endpoint receives a signed callback with the finished clip URL, full metadata JSON, and caption track, ready for your CDN, product workflows, and reporting systems.

HTTP 200 CALLBACK
[ { "clip_id": "vc_99" } ]
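
A signed callback should be verified before its contents are trusted. The page does not specify the signing scheme, so the sketch below assumes a common one: an HMAC-SHA256 hex digest of the raw request body, computed with a shared secret and delivered in a request header. The scheme, the secret handling, and both function names are assumptions.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Compare signatures in constant time to avoid timing leaks."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received.encode())


secret = b"shared-secret-from-your-dashboard"  # placeholder value
body = b'[ { "clip_id": "vc_99" } ]'
signature = sign(secret, body)  # stands in for the header value sent with the callback
```

Verification must run on the exact bytes received, before any JSON parsing or re-serialization, because a single changed byte produces a different digest.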

Output Intelligence Schema

v0.2 // application/json

Key Path           | Data Type     | Description
clip_url           | string <url>  | Direct signed URL to the processed 9:16 MP4.
bounding_box       | object        | Dynamic crop coordinates used for the vertical output.
caption_track      | string <url>  | SRT/VTT formatted captions with precise timestamps.
speaker_confidence | float [0–1]   | Confidence score of the active speaker identification.
highlight_score    | float [0–1]   | Predicted engagement score for this clip segment.
topic_tags         | array[string] | Semantic tags derived from transcript for campaign taxonomy.
speaker_id         | string        | Unique speaker identifier for multi-guest podcasts.
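
The fields listed above translate directly into a lightweight check that a consumer can run on each callback. This is a sketch under stated assumptions: fields are treated as optional (the sample callback on this page omits speaker_confidence), and the function name schema_errors is illustrative.

```python
# Expected Python types for the schema fields listed on this page.
SCHEMA = {
    "clip_url": str,
    "bounding_box": dict,
    "caption_track": str,
    "speaker_confidence": float,
    "highlight_score": float,
    "topic_tags": list,
    "speaker_id": str,
}
UNIT_RANGE_FIELDS = ("speaker_confidence", "highlight_score")  # documented as [0-1]


def schema_errors(payload: dict) -> list:
    """Return human-readable problems; an empty list means the payload looks fine."""
    errors = []
    for key, expected in SCHEMA.items():
        if key not in payload:
            continue  # assumption: absent fields are allowed
        value = payload[key]
        if expected is float:
            # Accept ints such as 1, but not booleans.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            errors.append(f"{key}: expected {expected.__name__}")
        elif key in UNIT_RANGE_FIELDS and not 0 <= value <= 1:
            errors.append(f"{key}: outside [0, 1]")
    tags = payload.get("topic_tags")
    if isinstance(tags, list) and not all(isinstance(t, str) for t in tags):
        errors.append("topic_tags: expected strings")
    return errors
```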

Built for Teams That Ship and Scale

VoxCut is purpose-built for technical builders and business operators who need reliable, repeatable short-form output from long-form media.

Company Owners & Operators

Build a repeatable production system that improves delivery speed while reducing dependence on manual post-production.

  • + REPEATABLE OPERATING WORKFLOWS
  • + BETTER RESOURCE UTILIZATION

Developers & Product Teams

Embed video clipping into your SaaS, internal tooling, or media automation stack using APIs and callback-driven orchestration.

  • + CLEAN JSON METADATA
  • + ASYNC WEBHOOK CALLBACKS

Media Brands & Distribution Teams

Convert large media libraries into publish-ready social clips with consistent structure and downstream delivery compatibility.

  • + BULK JOB PROCESSING
  • + AUTOMATED S3 EXPORTS

Get Early API Access

VoxCut is in private beta. We are onboarding developers and company owners who need production-grade short-form automation before general availability. Fill out the form below and we will follow up within 48 hours.

Business and developer accounts only. We review every request manually.