Agentic Video Production

EDBOT

AI-powered video editing pipeline. Natural language commands → real edits. Local inference, zero cloud dependency.

37 Tools · 57+ Endpoints · 6 Agents · 95% Local

What

An AI Agent That Edits Video

EdBot is an agentic video production pipeline built on local LLMs and GPU-accelerated tooling. It ingests raw footage, transcribes speech, detects chapters, scores editorial quality, catalogues everything into a searchable library, and assembles rough cuts — all driven by natural language commands. No cloud APIs. No subscriptions. One machine, one pipeline, full control.

How

Seven-Stage Pipeline

Raw footage flows through seven processing stages. Four are live and battle-tested. The pipeline extends as each stage comes online.

Ingest (LIVE) → Transcribe (LIVE) → Analyze (LIVE) → Catalogue (LIVE) → Rough Cut (NEXT) → Export (PLANNED) → Distribute (PLANNED)

Why

Hours Become Minutes

Manual video editing means scrubbing through hours of footage, labeling clips by hand, scoring takes by instinct, and building timelines one cut at a time. EdBot replaces that with natural language commands and automated analysis.

Before — Manual Workflow

Scrub hours of raw footage
Manually label every clip
Guess which takes are best
Hand-build timelines cut by cut
Re-watch to find specific dialogue
Export and re-export for each platform

After — EdBot Pipeline

Drop footage → auto-transcribed in minutes
Every segment indexed and searchable
Chapters scored by editorial quality
Natural language search: "find all mentions of Scrooge"
Rough cut assembled from top-scoring chapters
Multi-format export: 16:9, 9:16, 1:1, ProRes

Footage

From the Library

Real frames from the catalogued footage library — 27 chapters extracted, scored, and indexed.

Features

System Capabilities

Each feature is a self-contained module in the pipeline. Live features have real data behind them. Planned features are in development.

Seven-stage processing pipeline from raw ingest through to distribution. Four stages operational.

7 stages · 4 live

Transcription LIVE

GPU-accelerated speech-to-text via faster-whisper. Full-text search across all indexed segments.

148 segments indexed
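
As a rough illustration of what searching indexed segments could look like, the sketch below filters an in-memory list of transcript segments by keyword. The Segment fields, the clip names, and the sample text are illustrative assumptions, not EdBot's actual schema or data, and the real index may use a database rather than a plain list.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    clip: str     # source clip name (hypothetical)
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    text: str     # transcribed speech


def search_segments(segments, query):
    """Return segments whose text contains every word of the query.

    Matching is a case-insensitive substring test, so "dead" also
    matches "dead," or "deadline".
    """
    words = query.lower().split()
    return [s for s in segments if all(w in s.text.lower() for w in words)]


# Illustrative sample data, not taken from the real library.
segments = [
    Segment("take_01.mp4", 0.0, 4.2, "Marley was dead, to begin with."),
    Segment("take_01.mp4", 4.2, 9.8, "Scrooge knew he was dead."),
    Segment("take_02.mp4", 1.0, 5.5, "Old Scrooge sat in his counting-house."),
]

hits = search_segments(segments, "scrooge")
for s in hits:
    print(f"{s.clip} [{s.start:.1f}s to {s.end:.1f}s] {s.text}")
```

In a real pipeline the segment list would come from the transcription stage; each faster-whisper segment carries start, end, and text values that map directly onto these fields.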

Chapter Detection LIVE

NLP-based chapter boundary detection with weighted editorial scoring across four dimensions.

27 chapters · scores 3.0–7.3
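
One plausible way to combine four dimensions into a single editorial score is a weighted average. The sketch below assumes that approach; the dimension names and weights are hypothetical, since the page does not name the four dimensions or say how they are weighted.

```python
# Hypothetical dimension names and integer weights (assumptions, not EdBot's).
WEIGHTS = {"clarity": 4, "energy": 3, "pacing": 2, "relevance": 1}


def editorial_score(dimensions, weights=WEIGHTS):
    """Weighted average of per-dimension scores, rounded to one decimal.

    `dimensions` maps each dimension name to a score on a 0-10 scale.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * dimensions[name] for name in weights)
    return round(weighted_sum / total_weight, 1)


chapter = {"clarity": 8, "energy": 6, "pacing": 7, "relevance": 5}
print(editorial_score(chapter))
```

Integer weights keep the arithmetic easy to check by hand: (4*8 + 3*6 + 2*7 + 1*5) / 10.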

Footage Library LIVE

Catalogued footage with searchable metadata, auto-extracted thumbnails, and sortable indices.

20 clips · 27 thumbnails

Dialogue Matching LIVE

Cross-reference spoken content across takes and angles for best-take selection and continuity.

175 match groups
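
A minimal sketch of grouping repeated dialogue across takes, assuming simple fuzzy text matching with Python's difflib. This is a stand-in technique for illustration; the page does not describe how EdBot actually builds its match groups, and the take names and the 0.8 threshold are assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip punctuation so takes compare on words only."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def group_matches(lines, threshold=0.8):
    """Greedily group (take, text) pairs whose text is near-identical.

    Each line is compared against the first member of every existing
    group and joins the first group that clears the threshold.
    """
    groups = []
    for take, text in lines:
        norm = normalize(text)
        for group in groups:
            representative = normalize(group[0][1])
            if SequenceMatcher(None, norm, representative).ratio() >= threshold:
                group.append((take, text))
                break
        else:
            groups.append([(take, text)])
    return groups


# Illustrative sample lines, not taken from the real library.
lines = [
    ("take_01", "Marley was dead, to begin with."),
    ("take_02", "Marley was dead to begin with"),
    ("take_01", "Bah, humbug!"),
]
groups = group_matches(lines)
for g in groups:
    print([take for take, _ in g], "->", g[0][1])
```

Groups with more than one member are the candidates for best-take selection, since they hold the same line delivered in different takes.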

Silence Detection LIVE

Automated dead-air identification with timestamp-level gap mapping per clip.

Auto-gap identification
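
Gap mapping can be sketched from speech timestamps alone: any stretch between two speech spans longer than a threshold counts as dead air. This is an assumption about the approach; EdBot may analyze the audio signal directly instead. The function name and the one-second default are hypothetical.

```python
def find_gaps(speech_spans, clip_duration, min_gap=1.0):
    """Return (start, end) gaps of at least `min_gap` seconds with no speech.

    `speech_spans` is a list of (start, end) tuples in seconds. Leading
    and trailing silence in the clip is included.
    """
    gaps = []
    cursor = 0.0  # end of the speech covered so far
    for start, end in sorted(speech_spans):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping spans
    if clip_duration - cursor >= min_gap:
        gaps.append((cursor, clip_duration))
    return gaps


spans = [(0.5, 4.0), (6.5, 10.0), (10.2, 12.0)]
print(find_gaps(spans, clip_duration=15.0))
```

Short pauses, such as the 0.2 second breath between the second and third spans, fall under the threshold and are left alone.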

Agent System LIVE

Six specialized AI agents orchestrated by AriBot. Local LLM inference via Ollama with per-agent model routing.

6 agents · 11 recipes
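
Per-agent model routing can be as simple as a lookup table in front of Ollama's local HTTP API. The sketch below assumes that design; the agent names and model tags are placeholders, not EdBot's actual agents or models. Only the endpoint and the model, prompt, and stream fields come from Ollama's generate API.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical agent-to-model table; swap in whatever models are pulled locally.
MODEL_ROUTES = {
    "transcript": "llama3.1:8b",
    "editor": "qwen2.5:14b",
}
DEFAULT_MODEL = "llama3.1:8b"


def build_payload(agent, prompt):
    """Pick the model for an agent and build the request body."""
    return {
        "model": MODEL_ROUTES.get(agent, DEFAULT_MODEL),
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
    }


def ask(agent, prompt):
    """Send the prompt to a running Ollama server and return the reply text."""
    data = json.dumps(build_payload(agent, prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


print(build_payload("editor", "Summarize chapter 3 in one sentence."))
```

Calling ask() requires an Ollama server running locally with the routed model already pulled; build_payload() can be exercised without one.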

Automated Rough Cut NEXT

Score-driven clip assembly. Auto-sequence building from highest-scoring chapters in the catalogue.
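
Since this stage is not built yet, the following is only a sketch of how score-driven assembly might work: take chapters in descending score order until a target runtime is filled, then restore chronological order. The chapter fields, the greedy strategy, and the sample values are all assumptions.

```python
def assemble_rough_cut(chapters, target_seconds):
    """Greedily pick the highest-scoring chapters that fit the target runtime.

    Each chapter is a dict with "id", "start", "end" (seconds) and "score".
    The result is sorted by start time, which assumes a single source
    timeline.
    """
    chosen = []
    total = 0.0
    for chapter in sorted(chapters, key=lambda c: c["score"], reverse=True):
        length = chapter["end"] - chapter["start"]
        if total + length <= target_seconds:
            chosen.append(chapter)
            total += length
    return sorted(chosen, key=lambda c: c["start"])


# Illustrative sample chapters, not taken from the real catalogue.
chapters = [
    {"id": 1, "start": 0, "end": 60, "score": 5.1},
    {"id": 2, "start": 60, "end": 150, "score": 7.3},
    {"id": 3, "start": 150, "end": 200, "score": 3.0},
    {"id": 4, "start": 200, "end": 260, "score": 6.4},
]
cut = assemble_rough_cut(chapters, target_seconds=160)
print([c["id"] for c in cut])
```

A greedy pick is easy to reason about but can leave runtime unused; a knapsack-style selection would pack the target more tightly at the cost of complexity.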

Resolve Integration PLANNED

DaVinci Resolve Studio API integration for programmatic timeline building and multi-format export.

Multi-Cam Sync PLANNED

Feed-ID-scoped session management with auto-alignment of camera angles and audio sources.

Analytics Loop PLANNED

Post-publish performance data feeds back into the edit cycle. Engagement metrics inform the next rough cut.

YouTube Shorts PLANNED

Portrait reframe pipeline for 9:16 output. Automated YouTube Shorts export from long-form content.
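
The simplest portrait reframe is a centered crop. The sketch below computes the crop window for a 9:16 output and formats it as an FFmpeg crop filter string (crop=w:h:x:y). It assumes a static center crop; a production reframe would more likely track the subject. The function name is hypothetical.

```python
def portrait_crop(width, height, aspect_w=9, aspect_h=16):
    """Return an FFmpeg crop filter for a centered portrait window.

    Keeps the full frame height and trims the sides. The crop width is
    rounded down to an even number, which most video encoders require.
    """
    crop_w = int(height * aspect_w / aspect_h)
    crop_w -= crop_w % 2          # force an even width
    crop_w = min(crop_w, width)   # never exceed the source width
    x_offset = (width - crop_w) // 2
    return f"crop={crop_w}:{height}:{x_offset}:0"


print(portrait_crop(1920, 1080))
```

The resulting string can be chained with a scale filter, for example scale=1080:1920, to reach a standard Shorts resolution.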

Subtitle Automation PLANNED

Transcription-to-SRT generation with subtitle burn-in via FFmpeg, encoded on the GPU with h264_nvenc.
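
As a sketch of the transcription-to-SRT step, the code below converts timed segments into SubRip text and builds an FFmpeg command that burns the subtitles in while encoding with h264_nvenc. The function names and file names are hypothetical, and the subtitles filter assumes an FFmpeg build with libass and an NVIDIA GPU for the encoder.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start, end, text) tuples, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, 1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)  # blank line between cues


def burn_command(video, srt_path, output):
    """Return an FFmpeg argument list that burns subtitles and encodes on the GPU."""
    return [
        "ffmpeg", "-i", video,
        "-vf", f"subtitles={srt_path}",  # render the SRT onto the frames
        "-c:v", "h264_nvenc",            # NVIDIA hardware H.264 encoder
        "-c:a", "copy",                  # leave the audio untouched
        output,
    ]


print(to_srt([(0.0, 4.2, "Marley was dead, to begin with.")]))
print(" ".join(burn_command("take_01.mp4", "take_01.srt", "take_01_subbed.mp4")))
```

The command list can be passed to subprocess.run once the SRT text has been written to disk.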