01 — Audio Capture
input
audiodevicein1
Audio Device In CHOP
Headset mic input
44100 Hz mono
audiofileout1
Audio File Out CHOP
Writes .wav
Record toggled by script
analyze1
Analysis CHOP
RMS Power
per-frame value
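For reference, the per-frame RMS value that analyze1 reports can be approximated outside TouchDesigner as the square root of the mean squared sample. This is a minimal sketch of the arithmetic only; the function name `rms` is illustrative and is not part of the project.

```python
import math

def rms(samples):
    """Root-mean-square level of one block of audio samples (floats in -1..1)."""
    if not samples:
        return 0.0  # treat an empty block as silence
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# A square wave of amplitude 0.1 has an RMS equal to its amplitude.
frame = [0.1, -0.1, 0.1, -0.1]
level = rms(frame)
```

A value computed this way is what the thresholds in the later sections are compared against.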
02 — RMS Trigger
auto trigger
analyze1
Analysis CHOP
RMS value stream
chopexec2
CHOP Execute DAT
onValueChange()
threshold: 0.08
silence timeout: 2s
cooldown: 3s
audiofileout1
Audio File Out CHOP
par.record toggled
par.file set per take
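The trigger logic above (threshold, silence timeout, cooldown) can be sketched as a small state machine. This is a hedged illustration in plain Python, not the contents of chopexec2: the class name `RmsTrigger` is hypothetical, and it assumes the cooldown is measured from the end of the previous take.

```python
class RmsTrigger:
    """Decides when a take starts and stops from a stream of RMS values."""

    def __init__(self, threshold=0.08, silence_timeout=2.0, cooldown=3.0):
        self.threshold = threshold
        self.silence_timeout = silence_timeout
        self.cooldown = cooldown
        self.recording = False
        self.last_loud = None  # time of the last value at or above threshold
        self.last_stop = None  # time the previous take ended

    def update(self, rms, now):
        """Feed one RMS value with a timestamp in seconds.

        Returns 'start', 'stop', or None.
        """
        loud = rms >= self.threshold
        if self.recording:
            if loud:
                self.last_loud = now
            elif now - self.last_loud >= self.silence_timeout:
                self.recording = False
                self.last_stop = now
                return "stop"
            return None
        # Assumption: the cooldown counts from the end of the previous take.
        cooled = self.last_stop is None or now - self.last_stop >= self.cooldown
        if loud and cooled:
            self.recording = True
            self.last_loud = now
            return "start"
        return None
```

Inside TouchDesigner, `update()` would presumably be called from `onValueChange()` with the channel value and a clock reading, with `'start'` and `'stop'` mapped to toggling `par.record` on audiofileout1.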
03 — Manual Override
manual
button1
Button COMP
Press to start
Press to stop
chopexec1
CHOP Execute DAT
onOffToOn()
toggle() start/stop
button1
Button COMP
[ REC ] / Sending...
/ Press to record
04 — Transcription
speech-to-text
chopexec2
CHOP Execute DAT
_transcribe()
reads .wav file
OpenAI Whisper
External API
whisper-1 model
/v1/audio/transcriptions
multipart/form-data
transcript_output
Text DAT
Latest transcription
resets each take
Text TOP
TOP
Renders text overlay on output
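The upload step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's `_transcribe()`: the helper names `build_multipart` and `transcribe` are hypothetical, the API key is passed in by the caller, and the response is assumed to be the default JSON format with a `text` field.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"

def build_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one 'file' part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = (f"--{boundary}\r\n"
                  f'Content-Disposition: form-data; name="{name}"\r\n\r\n')
        parts.append((header + f"{value}\r\n").encode("utf-8"))
    file_header = (f"--{boundary}\r\n"
                   f'Content-Disposition: form-data; name="file"; '
                   f'filename="{filename}"\r\n'
                   "Content-Type: audio/wav\r\n\r\n")
    parts.append(file_header.encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def transcribe(wav_path, api_key, model="whisper-1", language="en"):
    """POST a .wav file and return the transcript text (blocking call)."""
    with open(wav_path, "rb") as f:
        audio = f.read()
    fields = {"model": model}
    if language:  # an empty string leaves the language to auto-detect
        fields["language"] = language
    body, content_type = build_multipart(
        fields, os.path.basename(wav_path), audio)
    req = urllib.request.Request(API_URL, data=body, method="POST", headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": content_type,
    })
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))["text"]
```

Because the request blocks until the API answers, calling it directly on TouchDesigner's main thread would likely stall the frame; running it on a worker thread and writing the result to transcript_output afterwards is one option.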
05 — Image Snapshot
visual capture
Visual Output TOP
TOP chain
StreamDiffusion + Text overlay + Composite
moviefileout1
Movie File Out TOP
top.save() called on silence trigger
take_YYYYMMDD_HHMMSS.jpg
File on disk
Saved to Desktop/recordings/
06 — Clap Detection / Question Navigator
high freq
audiodevicein1
Audio Device In CHOP
same mic input
audiofilter1
Audio Filter CHOP
Bandpass
4000–8000 Hz
analyze1_clap
Analysis CHOP
RMS Power
high band only
chopexec_clap
CHOP Execute DAT
onValueChange()
threshold: 0.15
cooldown: 1s
current_question
Text DAT
Reads from questions Table DAT
loops back after question 6
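The clap-driven navigation above can be sketched as a threshold check with a cooldown and a wrapping index. This is an illustrative sketch only: the class name `ClapNavigator` is hypothetical, and it assumes "loops at 6" means six questions that wrap back to the first.

```python
class ClapNavigator:
    """Advances through a list of questions each time a clap is detected."""

    def __init__(self, questions, threshold=0.15, cooldown=1.0):
        self.questions = list(questions)
        self.threshold = threshold
        self.cooldown = cooldown
        self.index = 0
        self.last_clap = None  # time of the last accepted clap

    def update(self, high_band_rms, now):
        """Feed one band-passed RMS value; return the current question."""
        ready = (self.last_clap is None
                 or now - self.last_clap >= self.cooldown)
        if high_band_rms >= self.threshold and ready:
            self.last_clap = now
            # Modulo wraps the index back to 0 after the last question.
            self.index = (self.index + 1) % len(self.questions)
        return self.questions[self.index]

# Placeholder labels; the real text would come from the questions Table DAT.
nav = ClapNavigator(["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"])
```

Filtering to 4000–8000 Hz before measuring RMS keeps ordinary speech from crossing the clap threshold, which is why this detector uses its own Analysis CHOP rather than analyze1.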
07 — Output Files Per Take
.wav
Raw audio recording
Uncompressed PCM 16-bit
Written by audiofileout1
take_20260306_114319.wav
.txt
Transcription result
Plain UTF-8 text
Written by _save_transcript()
take_20260306_114319.txt
.jpg
Visual snapshot
Captured at silence trigger
Written by moviefileout1.save()
take_20260306_114319.jpg
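The three files of a take share one timestamped base name. A minimal sketch of how such names could be generated follows; the helper `take_paths` is hypothetical, and the timestamp format is inferred from the example filenames above.

```python
from datetime import datetime
from pathlib import Path

def take_paths(folder, when=None):
    """Return the .wav, .txt and .jpg paths for one take."""
    stamp = (when or datetime.now()).strftime("%Y%m%d_%H%M%S")
    base = Path(folder) / f"take_{stamp}"
    return {ext: base.with_suffix("." + ext) for ext in ("wav", "txt", "jpg")}

# The same timestamp is reused for all three files of a take.
paths = take_paths("recordings", datetime(2026, 3, 6, 11, 43, 19))
```

Computing the stamp once per take, rather than once per file, keeps the three outputs matched even if transcription finishes several seconds after the recording stops.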
08 — Key Configuration Parameters
| Parameter | Default | Description |
| RMS_THRESHOLD | 0.08 | Minimum RMS to start recording. Set to ~2x your room noise floor. |
| SILENCE_TIMEOUT | 2.0s | Seconds of silence before auto-stop and transcription. |
| RESTART_COOLDOWN | 3.0s | Minimum gap between takes. Prevents noise re-triggering. |
| CLAP_THRESHOLD | 0.15 | High-band RMS needed to detect a clap. Tune after filtering to 4–8 kHz. |
| CLAP_COOLDOWN | 1.0s | Ignores repeat clap signals within this window. |
| MODEL | whisper-1 | OpenAI model. Alternatively: gpt-4o-transcribe. |
| LANGUAGE | en | Language hint for Whisper. Empty string for auto-detect. |
| MAX_FILE_MB | 24 MB | Maximum file size; larger files are not sent. The OpenAI limit is 25 MB. |
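The parameters in the table could be held as module-level constants. The sketch below is an assumption about layout, not the project's actual script; `within_size_limit` and `suggested_threshold` are hypothetical helpers that illustrate the size cap and the "~2x noise floor" guidance.

```python
import os

RMS_THRESHOLD = 0.08      # minimum RMS to start recording
SILENCE_TIMEOUT = 2.0     # seconds of silence before auto-stop
RESTART_COOLDOWN = 3.0    # minimum gap between takes, in seconds
CLAP_THRESHOLD = 0.15     # high-band RMS that counts as a clap
CLAP_COOLDOWN = 1.0       # seconds during which repeat claps are ignored
MODEL = "whisper-1"
LANGUAGE = "en"           # empty string for auto-detect
MAX_FILE_MB = 24          # kept below the 25 MB API limit

def within_size_limit(path, max_mb=MAX_FILE_MB):
    """True if the file is small enough to send for transcription."""
    return os.path.getsize(path) <= max_mb * 1024 * 1024

def suggested_threshold(noise_floor_rms, factor=2.0):
    """Start threshold as a multiple of the measured room noise floor."""
    return noise_floor_rms * factor
```

For example, a measured noise floor of 0.04 RMS gives a suggested threshold of about 0.08, the default above.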