
Meeting Transcription & Summaries

Live · 2025

Whisper · ASR · LLM · Python

People kept asking the same thing after every meeting: who agreed to what? This pipeline answers it. You drop in a recording, and it returns a tidy summary — decisions, owners, and next steps — instead of a raw wall of text.

The problem

Meeting notes are either skipped or written by whoever was least busy, which means they’re inconsistent and often wrong. The useful signal — this person committed to this thing by this date — gets lost. I wanted that captured automatically and in a shape people would actually read.

The pipeline

flowchart LR
A[Audio file] --> B[Whisper<br/>transcribe]
B --> C[Diarize<br/>who spoke]
C --> D[LLM<br/>summarize]
D --> E[Structured notes<br/>decisions · actions]

Four stages, each one swappable on its own.
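One way to picture "swappable" is a pipeline that holds each stage as a plain function. This is a minimal sketch, not the project's actual code: it assumes the four stages are transcribe, diarize, summarize, and a final step that shapes the output into structured notes. The `Segment` and `Pipeline` types, the stage signatures, and the stub stages are all hypothetical and stand in for the real Whisper, diarization, and LLM calls.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass
class Segment:
    start: float              # seconds from the start of the recording
    end: float
    text: str
    speaker: str = "unknown"


# Hypothetical stage signatures: each stage only sees the previous stage's output.
Transcribe = Callable[[str], list[Segment]]
Diarize = Callable[[str, list[Segment]], list[Segment]]
Summarize = Callable[[list[Segment]], str]
Structure = Callable[[str], dict]


@dataclass
class Pipeline:
    transcribe: Transcribe
    diarize: Diarize
    summarize: Summarize
    structure: Structure

    def run(self, audio_path: str) -> dict:
        segments = self.transcribe(audio_path)
        segments = self.diarize(audio_path, segments)
        raw_summary = self.summarize(segments)
        return self.structure(raw_summary)


# Stub stages so the sketch runs without any models installed.
def fake_transcribe(path: str) -> list[Segment]:
    return [Segment(0.0, 2.0, "I'll send the report Friday.")]


def fake_diarize(path: str, segments: list[Segment]) -> list[Segment]:
    return [replace(s, speaker="Speaker 1") for s in segments]


def fake_summarize(segments: list[Segment]) -> str:
    return "; ".join(f"{s.speaker}: {s.text}" for s in segments)


def fake_structure(raw_summary: str) -> dict:
    return {"decisions": [], "actions": [raw_summary]}


pipeline = Pipeline(fake_transcribe, fake_diarize, fake_summarize, fake_structure)
notes = pipeline.run("meeting.wav")

# Swapping one stage leaves the other three untouched.
def named_diarize(path: str, segments: list[Segment]) -> list[Segment]:
    return [replace(s, speaker="Sohail") for s in segments]


improved = replace(pipeline, diarize=named_diarize)
improved_notes = improved.run("meeting.wav")
```

Because the pipeline only depends on the function signatures, a better diarizer is a one-argument change and nothing upstream or downstream needs to know.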

Notes from building it

  • Transcription is the easy part now — Whisper handles it well.
  • Diarization (separating speakers) is where most of the mess lives, and it matters because “Sohail will do X” is useless if you can’t tell who Sohail is.
  • Summarization only works if you give the model structure to fill in. Asking for “a summary” gives mush; asking for decisions, owners, and deadlines gives something usable.
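The last two notes can be sketched together: label each transcript segment with a speaker, then hand the model a fixed shape to fill in. This is an illustration under assumptions, not the project's implementation. The `Segment` and `Turn` types, the "largest time overlap" rule, the prompt wording, and the JSON schema are all hypothetical. It also assumes diarization turns already carry usable names. The call to the self-hosted LLM is left out because the write-up does not say which model or API is used.

```python
import json
from dataclasses import dataclass


@dataclass
class Segment:        # one transcribed chunk, times in seconds
    start: float
    end: float
    text: str


@dataclass
class Turn:           # one diarization turn: who spoke, and when
    start: float
    end: float
    speaker: str


def label_speakers(segments, turns):
    """Pair each segment with the speaker whose turn overlaps it the most."""
    labelled = []
    for seg in segments:
        best, best_overlap = "unknown", 0.0
        for turn in turns:
            overlap = min(seg.end, turn.end) - max(seg.start, turn.start)
            if overlap > best_overlap:
                best, best_overlap = turn.speaker, overlap
        labelled.append((best, seg.text))
    return labelled


def build_prompt(labelled):
    """Ask for specific fields instead of a free-form summary."""
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in labelled)
    return (
        "Read the meeting transcript and reply with JSON only, in this shape:\n"
        '{"decisions": ["..."], '
        '"actions": [{"task": "...", "owner": "...", "deadline": "..."}]}\n'
        "Use empty lists when nothing applies.\n\n"
        f"Transcript:\n{transcript}"
    )


def parse_notes(reply):
    """Turn the model's JSON reply into notes, filling in missing fields."""
    data = json.loads(reply)
    actions = [
        {
            "task": item.get("task", ""),
            "owner": item.get("owner", "unassigned"),
            "deadline": item.get("deadline"),
        }
        for item in data.get("actions", [])
    ]
    return {"decisions": list(data.get("decisions", [])), "actions": actions}


segments = [
    Segment(0.0, 4.0, "I'll send the budget by Friday."),
    Segment(4.0, 6.0, "Agreed, we ship in May."),
]
turns = [Turn(0.0, 3.8, "Sohail"), Turn(3.8, 6.5, "Speaker 2")]

labelled = label_speakers(segments, turns)
prompt = build_prompt(labelled)
```

The prompt string is what would go to the model, and `parse_notes` is what would read its reply. Keeping owner and deadline on the same action record is a design choice here, so a commitment stays attached to the person who made it.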

Stack

Python, Whisper (ASR), speaker diarization, a self-hosted LLM for summarization.

Result

Keeping the four stages independent meant I could improve diarization later without touching anything else — the kind of boring modularity that pays off every time you need to change one piece.