TakeSeek is an MCP server that indexes and searches a local video library. It extracts metadata with ffprobe, transcribes audio with faster-whisper, and uses Claude’s vision to describe video frames. Everything it collects is searchable via full-text search.
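As a rough illustration of the metadata step, the sketch below asks ffprobe for JSON output and reduces it to a few searchable fields. The helper names (ffprobe_command, summarize, probe_file) and the choice of fields are assumptions made for this example, not TakeSeek's actual code; only the ffprobe flags and JSON keys are standard.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe invocation that prints container and stream info as JSON."""
    return [
        "ffprobe", "-v", "quiet",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def summarize(probe):
    """Reduce ffprobe's JSON to a small record (field selection is hypothetical)."""
    fmt = probe.get("format", {})
    # Use the first video stream, if the file has one.
    video = next(
        (s for s in probe.get("streams", []) if s.get("codec_type") == "video"),
        {},
    )
    return {
        "duration_s": float(fmt.get("duration", 0.0)),  # ffprobe reports a string
        "container": fmt.get("format_name"),
        "codec": video.get("codec_name"),
        "width": video.get("width"),
        "height": video.get("height"),
    }


def probe_file(path):
    """Run ffprobe on a file and return the summarized record (needs ffprobe on PATH)."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarize(json.loads(result.stdout))
```

Keeping the parsing in summarize, separate from the subprocess call, means the field extraction can be checked against a saved ffprobe dump without a video file on hand.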

A key design choice: rather than making separate vision API calls, Claude Code is the vision model. The extract_frames tool returns actual images that Claude observes directly, then documents what it sees and stores those descriptions for search.
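To make the idea concrete: an MCP tool result can carry image content blocks (base64 data plus a MIME type) alongside text, which is how a tool can hand frames to the model instead of calling a vision API itself. The sketch below builds such a result from raw frame bytes. The function names and the timestamp labels are hypothetical, and how extract_frames actually assembles its response is not shown in this document.

```python
import base64


def image_content(image_bytes, mime_type="image/jpeg"):
    """Wrap raw image bytes as an MCP-style image content block."""
    return {
        "type": "image",
        "data": base64.b64encode(image_bytes).decode("ascii"),
        "mimeType": mime_type,
    }


def frames_result(frames):
    """Build a tool result from (timestamp_seconds, jpeg_bytes) pairs.

    Each frame is preceded by a short text block so the model can tie
    its description back to a position in the video (an assumption of
    this sketch, not a documented TakeSeek behavior).
    """
    content = []
    for timestamp, jpeg_bytes in frames:
        content.append({"type": "text", "text": f"Frame at {timestamp:.1f}s"})
        content.append(image_content(jpeg_bytes))
    return content
```

The descriptions Claude writes after looking at these frames would then be passed back through another tool call for storage and indexing.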

Transcription and frame description run in parallel: Whisper transcribes in the background while Claude describes frames, and neither waits for the other. When the two stages take similar amounts of time, this roughly halves the total indexing time.
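A minimal sketch of that overlap, using stand-in functions rather than the real Whisper and Claude steps: both jobs are submitted to a thread pool, so the total time is close to the slower of the two instead of their sum. The function names and sleep durations are placeholders for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transcribe(path):
    """Stand-in for the Whisper transcription step."""
    time.sleep(0.2)  # simulated work
    return f"transcript of {path}"


def describe_frames(path):
    """Stand-in for the frame description step."""
    time.sleep(0.2)  # simulated work
    return [f"frame description for {path}"]


def index_video(path):
    """Run both steps concurrently and collect their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        transcript_future = pool.submit(transcribe, path)
        descriptions_future = pool.submit(describe_frames, path)
        # result() blocks until each job finishes; neither job waits on the other.
        return {
            "transcript": transcript_future.result(),
            "descriptions": descriptions_future.result(),
        }
```

With two stages of similar length, the wall-clock time here is about one stage rather than two, which is where the rough halving comes from. If one stage dominates, the saving shrinks accordingly.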

Install

pip install takeseek