How it works

The pipeline from voice to a shareable diagram artefact.

ThoughtSift has three runtime paths. First, native capture (or text input) creates a transcript. Second, content intelligence and prompt routing shape the transcript into diagram, narration, and image artefacts. Third, the artefacts are persisted locally and can be shared to a public web viewer.

The pipeline

Record — audio is captured locally.
Transcribe — a transcription engine turns audio into a transcript (see Capture for the engines).
Content intelligence — the transcript is cleaned, summarised, and classified, and a prompt style is recommended.
Diagram generation — a structured DiagramCard is produced and rendered.
Image & narration (optional) — an accompanying image and meaning-first narration can be generated.
History — the result is saved as a local history entry you can replay.
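
The stages above can be read as a chain of functions, each consuming the previous stage's output. The sketch below is illustrative only: every name in it (HistoryEntry, transcribe, analyse, generate_diagram, run_pipeline) is hypothetical, the stage bodies are trivial stand-ins rather than ThoughtSift's actual engines, and the optional image step is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoryEntry:
    """Hypothetical local history record; field names are illustrative."""
    transcript: str
    summary: str
    prompt_style: str
    diagram: dict
    narration: Optional[str] = None


def transcribe(audio: bytes) -> str:
    # Stand-in for a transcription engine: treats the bytes as UTF-8 text.
    return audio.decode("utf-8")


def analyse(transcript: str) -> dict:
    # Stand-in for content intelligence: clean, summarise, recommend a style.
    cleaned = " ".join(transcript.split())
    sentences = [s.strip() for s in cleaned.split(".") if s.strip()]
    return {
        "sentences": sentences,
        "summary": sentences[0] if sentences else "",
        # Assumed style labels, not ThoughtSift's real prompt styles.
        "prompt_style": "stepwise" if len(sentences) > 1 else "overview",
    }


def generate_diagram(analysis: dict) -> dict:
    # Stand-in for diagram generation: one node per sentence.
    return {"title": analysis["summary"], "nodes": list(analysis["sentences"])}


def run_pipeline(audio: bytes, history: List[HistoryEntry],
                 with_narration: bool = False) -> HistoryEntry:
    transcript = transcribe(audio)
    analysis = analyse(transcript)
    diagram = generate_diagram(analysis)
    # Narration is optional, mirroring the optional stage in the list above.
    narration = f"In short: {analysis['summary']}." if with_narration else None
    entry = HistoryEntry(transcript, analysis["summary"],
                         analysis["prompt_style"], diagram, narration)
    history.append(entry)  # saved so the result can be replayed later
    return entry


history: List[HistoryEntry] = []
entry = run_pipeline(b"Boil  water. Then add tea.", history, with_narration=True)
```

Keeping each stage a plain function of its input makes it easy to skip the optional stages or to start from typed text instead of audio.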

The DiagramCard

The most important contract is the DiagramCard. It is produced from a transcript, stored in history, uploaded into shares, and rendered by the same diagram viewer both in the app and on the web. That single canonical viewer is why a shared diagram looks identical wherever you open it.
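
To make the idea of a single contract concrete, here is a minimal sketch in which one serialised card feeds one rendering function on every surface. The real DiagramCard schema is not described in this section, so the fields (title, layout, nodes) and the render function are assumptions made purely for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class DiagramCard:
    """Hypothetical shape; the actual DiagramCard fields may differ."""
    title: str
    layout: str
    nodes: List[str]

    def to_json(self) -> str:
        # One serialised form, reused for history storage and for shares.
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(payload: str) -> "DiagramCard":
        data = json.loads(payload)
        return DiagramCard(title=data["title"], layout=data["layout"],
                           nodes=list(data["nodes"]))


def render(card: DiagramCard) -> str:
    # The one canonical viewer: every surface calls this same function.
    lines = [f"[{card.layout}] {card.title}"]
    lines += [f"  {i}. {node}" for i, node in enumerate(card.nodes, start=1)]
    return "\n".join(lines)


card = DiagramCard(title="Making tea", layout="flow",
                   nodes=["Boil water", "Add tea"])
stored = card.to_json()                         # what history or a share holds
in_app = render(DiagramCard.from_json(stored))  # app surface
on_web = render(DiagramCard.from_json(stored))  # web surface
```

Because both surfaces decode the same payload and call the same renderer, their output cannot drift apart, which is the property the paragraph above describes.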

Direct vs. cloud generation

The result screen can request a diagram directly for a fast preview, while an asynchronous cloud workflow can run the full intelligence → diagram → image sequence server-side and update your history entry as each stage completes.
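
The difference between the two paths can be sketched with asyncio: a direct call returns only a diagram, while the cloud path runs its stages in order and updates the history entry after each one. This is a sketch under stated assumptions. The function names, the dictionary-based entry, and the on_update callback are all hypothetical, and asyncio.sleep(0) merely stands in for server-side work.

```python
import asyncio
from typing import Callable, Dict, List


def direct_preview(transcript: str) -> Dict[str, str]:
    # Direct path: a single request that yields only the diagram.
    return {"diagram": f"diagram({transcript})"}


async def cloud_workflow(transcript: str, entry: Dict[str, str],
                         on_update: Callable[[str], None]) -> None:
    # Cloud path: intelligence, then diagram, then image, in that order.
    for stage in ("intelligence", "diagram", "image"):
        await asyncio.sleep(0)  # placeholder for server-side processing
        entry[stage] = f"{stage}({transcript})"
        on_update(stage)        # the entry is refreshed as each stage completes


entry: Dict[str, str] = {}
entry.update(direct_preview("make tea"))  # fast preview appears first
completed: List[str] = []
asyncio.run(cloud_workflow("make tea", entry, completed.append))
```

After the workflow finishes, completed lists the stages in the order they ran, and entry holds one result per stage alongside the earlier preview.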

Diagram types

Diagrams render as one of three layouts — flow, map, or summary — plus a Mermaid type for flowcharts and sequence diagrams.
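
One way to model these types is an enumeration with a renderer that dispatches on it. The sketch below is hypothetical throughout: DiagramType, render_by_type, and the text output for each layout are illustrative and not ThoughtSift's actual rendering. Only the emitted Mermaid text follows real Mermaid flowchart syntax (a flowchart TD header followed by --> edges), and it assumes labels contain no double quotes.

```python
from enum import Enum
from typing import List


class DiagramType(Enum):
    FLOW = "flow"
    MAP = "map"
    SUMMARY = "summary"
    MERMAID = "mermaid"


def render_by_type(diagram_type: DiagramType, title: str, items: List[str]) -> str:
    if diagram_type is DiagramType.FLOW:
        # Ordered steps on one line.
        return f"{title}: " + " -> ".join(items)
    if diagram_type is DiagramType.MAP:
        # A central topic with branches beneath it.
        return "\n".join([title] + [f"  - {item}" for item in items])
    if diagram_type is DiagramType.SUMMARY:
        # A compact list of key points.
        return f"{title} ({len(items)} points): " + "; ".join(items)
    # MERMAID: emit flowchart source chaining each item to the next.
    edges = [
        f'    n{i}["{items[i]}"] --> n{i + 1}["{items[i + 1]}"]'
        for i in range(len(items) - 1)
    ]
    return "\n".join(["flowchart TD"] + edges)


steps = ["Boil water", "Add tea", "Pour"]
flow_text = render_by_type(DiagramType("flow"), "Tea", steps)
mermaid_text = render_by_type(DiagramType.MERMAID, "Tea", steps)
```

Looking the type up with DiagramType("flow") raises ValueError for an unknown value, so an unsupported type fails early instead of silently falling through to the Mermaid branch.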