Open Source · Free Forever · MIT License

Your voice.
Your machine.

Push-to-talk dictation powered by NVIDIA's Parakeet TDT model — running entirely on your own hardware. No cloud, no subscription, no surveillance.

Get the Server · Get the App
97.84% accuracy · 30× faster than real-time · 25 languages · OpenAI-compatible API · Runs on CPU, no GPU needed

Architecture
Two projects. One seamless stack.

Spin up the inference server once, then use the desktop client to dictate anywhere — or plug the server into any OpenAI-compatible workflow you already have.

🖥️

Parakeet Server

Python · FastAPI · ONNX Runtime
Runs on localhost:5092

audio / text
🎙️

Parakeet Flow

TypeScript · React · TanStack
Push-to-talk client app

also compatible with
🔌

Any OpenAI Client

Open WebUI, custom scripts,
or your own app
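
For custom scripts, a request to the server might look roughly like the sketch below. It uses only the Python standard library. The endpoint path /v1/audio/transcriptions, the multipart field names, and the {"text": ...} response shape are assumptions borrowed from the OpenAI audio API convention, not details confirmed on this page, so check the server's Swagger docs before relying on them. The host, port, and model name come from the Quick Start. The helper names are hypothetical.

```python
import json
import os
import urllib.request
import uuid
from typing import Optional

BASE_URL = "http://localhost:5092"      # server address from the Quick Start
ENDPOINT = "/v1/audio/transcriptions"   # ASSUMPTION: OpenAI-style path
MODEL = "parakeet-tdt-0.6b-v3"          # default model named on this page


def build_request(audio: bytes, filename: str,
                  api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the model name and audio bytes."""
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{MODEL}\r\n"
    ).encode()
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + audio + b"\r\n"
    closing = f"--{boundary}--\r\n".encode()

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if api_key:
        # Only needed if the endpoint is protected by an API key.
        headers["Authorization"] = f"Bearer {api_key}"

    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=model_part + file_part + closing,
        headers=headers,
        method="POST",
    )


def transcribe(path: str, api_key: Optional[str] = None) -> str:
    """Send one audio file to the local server and return the transcript."""
    with open(path, "rb") as f:
        request = build_request(f.read(), os.path.basename(path), api_key)
    with urllib.request.urlopen(request) as response:
        # ASSUMPTION: the response is JSON with a "text" field, as in the OpenAI API.
        return json.load(response)["text"]


if __name__ == "__main__":
    print(transcribe("sample.wav"))  # "sample.wav" is a placeholder file name
```

Building the request separately from sending it keeps the network call in one place, which makes it easy to swap in a different base URL or an API key.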


Open Source Projects
Everything you need to run it.

Both projects are MIT licensed, free to use, and built to work together out of the box.

🖥️
Python · FastAPI · ONNX · Docker

Parakeet Server

A high-performance transcription server built around NVIDIA's Parakeet TDT 0.6B v3 model via ONNX Runtime — screaming fast on CPU, with full GPU support when you want it.

  • Up to 30× faster than real-time on consumer CPUs
  • Beats GPU-accelerated Faster Whisper on CPU alone
  • OpenAI-compatible API — works with any existing tooling
  • Silero-VAD auto-chunking for long audio files
  • CPU & GPU Docker images included
  • Built-in web UI for drag-and-drop testing
  • INT8 / FP16 / FP32 model variants selectable per request
  • Open WebUI integration out of the box
View on GitHub: https://github.com/dustinwloring1988/parakeet-server
🎙️
TypeScript · React · TanStack

Parakeet Flow

A sleek local-first dictation app that turns your voice into clipboard text in an instant. Hold Space to record — release to get your transcript.

  • Push-to-talk — hold Space anywhere on the page to record
  • Transcript lands on clipboard automatically
  • Full history with one-click re-copy
  • All audio stays on your machine, never sent to a cloud
  • Point it at any Parakeet TDT endpoint you control
  • Model selector and optional API key support
  • Built with Radix UI + Tailwind CSS 4
  • Runs with Bun, pnpm, or npm
View on GitHub: https://github.com/dustinwloring1988/parakeet

Performance
Built to be embarrassingly fast.

Benchmarked on LibriSpeech test-clean against professionally verified ground truth transcriptions.

30× faster than real-time on a modern CPU
97.84% transcription accuracy on LibriSpeech test-clean
25 languages with automatic language detection

Zero accuracy loss at INT8

All three quantization variants — INT8, FP16, FP32 — reach identical 97.84% accuracy. Choosing INT8 gets you maximum speed with nothing left on the table.
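
To put the speed figure in concrete terms, here is a small back-of-the-envelope sketch. It assumes the 30× real-time factor holds for your hardware and audio, which is the best case quoted above; actual throughput will vary. The function name is hypothetical.

```python
def transcription_seconds(audio_seconds: float, speedup: float = 30.0) -> float:
    """Estimate processing time for a clip at a given real-time speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# One hour of audio at 30x real-time: 3600 / 30 = 120 seconds.
print(f"{transcription_seconds(3600):.0f} s")  # prints "120 s"
```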

English, Spanish, French, German, Russian, Italian, Polish, +18 more

Quick Start
Up and running in minutes.

Start the server first, then launch the client and point it at localhost:5092.

Docker (recommended)
# Clone and start (CPU)
git clone https://github.com/dustinwloring1988/parakeet-server
cd parakeet-server
docker compose up parakeet-cpu -d

# Server ready at http://localhost:5092
# Swagger docs at /docs

Conda
# Create an isolated environment, then install and run the server
conda create -n parakeet python=3.10
conda activate parakeet
git clone https://github.com/dustinwloring1988/parakeet-server
cd parakeet-server
pip install -r requirements.txt
python server.py
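
Before launching the client, it can help to confirm the server is answering. The sketch below polls the Swagger docs page at /docs, the only server URL this page confirms; whether the server also exposes a dedicated health endpoint is not stated here, so none is assumed. The function names are hypothetical.

```python
import time
import urllib.error
import urllib.request


def docs_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/docs answers with HTTP 200."""
    url = base_url.rstrip("/") + "/docs"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_for_server(base_url: str, attempts: int = 10, delay: float = 1.0,
                    probe=docs_reachable) -> bool:
    """Call probe(base_url) up to `attempts` times, sleeping between failures."""
    for _ in range(attempts):
        if probe(base_url):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    ready = wait_for_server("http://localhost:5092")
    print("server ready" if ready else "server not reachable")
```

The probe is passed in as a parameter so the retry loop can be exercised without a running server.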

Bun
# Clone and start the Parakeet Flow client
git clone https://github.com/dustinwloring1988/parakeet
cd parakeet
bun install
bun run dev

# Open http://localhost:3000
# Set server URL in ⚙️ Settings

npm / pnpm
git clone https://github.com/dustinwloring1988/parakeet
cd parakeet
pnpm install  # or npm install
pnpm dev      # or npm run dev

# Open http://localhost:3000
# Set server URL in ⚙️ Settings

💡 Configure the client: Click the gear icon in the top-right corner of Parakeet Flow, enter http://localhost:5092 as the server URL, choose your model (parakeet-tdt-0.6b-v3 is the default), and save. Then hold Space to speak.