Audio Steps

QuickFlo includes five audio processing steps powered by FFmpeg. Convert between formats, trim by time or silence, merge multiple files, extract audio from video, and probe metadata — all without external services or custom code.

Audio sources can come from managed storage, GCS, S3, or SFTP, or be passed as base64 data from a previous step.

Convert audio between formats with optional sample rate, channel, and bitrate adjustments.

[Image: Audio convert step editor showing format selection and audio settings]

| Field | Description |
| --- | --- |
| Audio Source | Stored file URL or base64 data |
| Audio Format | Target format (see supported formats below) |
| Configuration | Auto (recommended) or manual sample rate, channels, and bitrate |
| Output Filename | Custom filename; supports templates like {{workflow.name}}_output.wav |

Auto mode selects optimal settings per format. Switch to Manual to override sample rate (8–48 kHz), channels (mono/stereo), and bitrate (64–320 kbps).
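As a rough illustration of the Manual ranges above, the sketch below checks a set of values before they are entered into the step. The `ManualSettings` class and `validate_manual_settings` function are hypothetical helpers, not part of QuickFlo, and the sketch assumes any value inside the documented ranges is accepted (the editor may in practice offer only specific presets).

```python
from dataclasses import dataclass

# Limits copied from the documented Manual ranges; all names are hypothetical.
SAMPLE_RATE_HZ = (8_000, 48_000)
BITRATE_KBPS = (64, 320)
CHANNEL_MODES = ("mono", "stereo")


@dataclass
class ManualSettings:
    sample_rate_hz: int
    channels: str
    bitrate_kbps: int


def validate_manual_settings(settings: ManualSettings) -> list[str]:
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    low, high = SAMPLE_RATE_HZ
    if not low <= settings.sample_rate_hz <= high:
        problems.append(f"sample rate {settings.sample_rate_hz} Hz is outside {low}-{high} Hz")
    if settings.channels not in CHANNEL_MODES:
        problems.append(f"channels must be one of {CHANNEL_MODES}, got {settings.channels!r}")
    low, high = BITRATE_KBPS
    if not low <= settings.bitrate_kbps <= high:
        problems.append(f"bitrate {settings.bitrate_kbps} kbps is outside {low}-{high} kbps")
    return problems


ok = validate_manual_settings(ManualSettings(44_100, "stereo", 192))
bad = validate_manual_settings(ManualSettings(96_000, "5.1", 32))
print(ok)   # no problems reported
print(bad)  # one message per out-of-range field
```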

| Format | Use case |
| --- | --- |
| MP3 | General purpose, widely compatible |
| WAV | Uncompressed, lossless |
| FLAC | Lossless compression |
| OGG Vorbis | Open format, good compression |
| AAC | Apple/iTunes compatible |
| M4A | Apple audio (MP4 container) |
| Opus | Modern codec, excellent quality/compression ratio |
| WebM | Web standard |
| G.711 u-law | Telephony standard (8 kHz mono) |
| G.711 A-law | Telephony standard (8 kHz mono) |
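To put the uncompressed and telephony rows in perspective, the bitrate of uncompressed or companded PCM audio follows directly from sample rate, bits per sample, and channel count. The sketch below is plain arithmetic rather than anything QuickFlo-specific: G.711 stores 8 bits per sample, while the 16-bit, 44.1 kHz stereo WAV is only an assumed example (WAV files can use other bit depths and rates).

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of fixed-width PCM-style audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# G.711 (u-law or A-law): 8 kHz, 8 bits per sample, mono.
g711_kbps = pcm_bitrate_kbps(8_000, 8, 1)

# Assumed example: a 16-bit, 44.1 kHz stereo WAV file.
wav_kbps = pcm_bitrate_kbps(44_100, 16, 2)

print(g711_kbps)  # 64.0
print(wav_kbps)   # 1411.2
```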

Trim audio by time range or by detecting and removing silence.

[Image: Audio trim step editor showing time-based trim configuration]

| Field | Description |
| --- | --- |
| Start Time | Where to begin: seconds (30), MM:SS (1:30), or HH:MM:SS (00:01:30) |
| End Time | Where to stop; leave empty for end of file |
| Output Format | Leave empty to keep the original format |
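The three accepted time notations all reduce to a number of seconds. The sketch below shows one way to normalize them, for example to validate user input before it reaches the step. `parse_time` is a hypothetical helper, and whether QuickFlo accepts fractional seconds is not stated here; this version allows them as an assumption.

```python
def parse_time(value: str) -> float:
    """Convert 'SS', 'MM:SS', or 'HH:MM:SS' into seconds."""
    parts = value.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected SS, MM:SS, or HH:MM:SS, got {value!r}")
    seconds = 0.0
    for part in parts:
        number = float(part)  # raises ValueError for empty or non-numeric parts
        if number < 0:
            raise ValueError(f"negative time component in {value!r}")
        # Each additional component shifts the running total up by a factor of 60.
        seconds = seconds * 60 + number
    return seconds


print(parse_time("30"))        # 30.0
print(parse_time("1:30"))      # 90.0
print(parse_time("00:01:30"))  # 90.0
```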

Silence trimming automatically detects and removes silent regions from the beginning and/or end of the audio.

| Field | Description |
| --- | --- |
| Threshold | Silence sensitivity in dB (-50 recommended) |
| Min Duration | How long silence must last to be trimmed (1 second recommended) |
| Trim Start / End | Choose which sides to trim |

The step output includes originalDurationMs and trimmedStartMs / trimmedEndMs so you can see exactly what was removed.
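A small sketch of how those fields might be combined in a later step. It assumes that `trimmedStartMs` and `trimmedEndMs` hold the amount of audio removed from each side in milliseconds, which the text implies but does not spell out; the sample values and the `summarize_silence_trim` helper are made up for illustration.

```python
def summarize_silence_trim(output: dict) -> dict:
    """Summarize how much audio a silence trim removed (assumed field meanings)."""
    original_ms = output["originalDurationMs"]
    # Assumption: these are amounts removed, not positions; default to 0 if a side was not trimmed.
    start_ms = output.get("trimmedStartMs", 0)
    end_ms = output.get("trimmedEndMs", 0)
    removed_ms = start_ms + end_ms
    return {"removedMs": removed_ms, "remainingMs": original_ms - removed_ms}


# Hypothetical step output: a 60 s file with 1.5 s of leading and 2.5 s of trailing silence.
sample_output = {"originalDurationMs": 60_000, "trimmedStartMs": 1_500, "trimmedEndMs": 2_500}
summary = summarize_silence_trim(sample_output)
print(summary)  # {'removedMs': 4000, 'remainingMs': 56000}
```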

Combine multiple audio files by concatenation (sequential) or mixing (overlay).

[Image: Audio merge step editor showing multiple audio source inputs]

| Field | Description |
| --- | --- |
| Audio Files | Two or more audio sources (click Add Item for more) |
| Mode | Concatenate (play in sequence) or Mix (overlay simultaneously) |
| Output Format | Target format for the merged output |
| Configuration | Auto or manual sample rate and channels |

Concatenate plays files one after another — output duration is the sum of all inputs. Mix overlays all files at the same time — output duration matches the longest input.
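The duration rule for the two modes can be expressed in a few lines, which is handy for sanity-checking a merge result against its inputs. `expected_duration_ms` is a hypothetical helper and the mode strings are assumed names; real output may differ by a few milliseconds because of encoder padding.

```python
def expected_duration_ms(durations_ms: list[int], mode: str) -> int:
    """Expected merged duration: sum for concatenate, longest input for mix."""
    if len(durations_ms) < 2:
        raise ValueError("merge needs two or more audio sources")
    if mode == "concatenate":
        return sum(durations_ms)
    if mode == "mix":
        return max(durations_ms)
    raise ValueError(f"unknown mode: {mode!r}")


inputs_ms = [3_000, 5_000, 2_000]
print(expected_duration_ms(inputs_ms, "concatenate"))  # 10000
print(expected_duration_ms(inputs_ms, "mix"))          # 5000
```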

Extract the audio track from a video file. Supports MP4, MKV, WebM, AVI, MOV, WMV, and FLV.

| Field | Description |
| --- | --- |
| Video Source | Video file URL (managed, GCS, S3, or HTTP) |
| Output Format | Audio format for the extracted track |
| Configuration | Auto or manual encoding settings |
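For readers curious about what extraction involves, the sketch below builds a plain FFmpeg command that drops the video stream and writes only the audio. This is not QuickFlo's actual invocation, which is not documented here; the file names and helper functions are hypothetical, and running the command requires `ffmpeg` on the PATH.

```python
import subprocess


def build_extract_command(video_path: str, audio_path: str) -> list[str]:
    """Build an FFmpeg argument list that keeps only the audio track."""
    # -i names the input file; -vn disables video in the output.
    # FFmpeg infers the output format from the audio_path extension.
    return ["ffmpeg", "-i", video_path, "-vn", audio_path]


def run_extract(video_path: str, audio_path: str) -> None:
    """Run the extraction; raises CalledProcessError if FFmpeg fails."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)


command = build_extract_command("interview.mp4", "interview.mp3")
print(" ".join(command))  # ffmpeg -i interview.mp4 -vn interview.mp3
```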

Analyze an audio file’s metadata without processing it. Returns format, duration, sample rate, channels, bitrate, and file size.

[Image: Audio probe step editor and output showing metadata like format, duration, and sample rate]

| Field | Description |
| --- | --- |
| format | Audio codec (e.g., mp3, aac, pcm_s16le) |
| container | Container format (e.g., mp3, wav, m4a) |
| duration | Duration in seconds |
| durationFormatted | Duration as HH:MM:SS |
| bitrate | Bitrate in kbps |
| sampleRate | Sample rate in Hz |
| channels | Number of channels (1 = mono, 2 = stereo) |
| size | File size in bytes |
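One use for probe output is deciding whether a file needs conversion before being handed to another system. The sketch below checks the 8 kHz mono requirement of the G.711 telephony formats using the `sampleRate` and `channels` fields. It assumes the probe fields arrive as a flat dictionary; the sample values and the `needs_telephony_conversion` helper are hypothetical.

```python
def needs_telephony_conversion(probe: dict) -> bool:
    """True unless the file is already 8 kHz mono, as G.711 telephony expects."""
    return not (probe["sampleRate"] == 8_000 and probe["channels"] == 1)


# Hypothetical probe results for two files.
music_probe = {"format": "mp3", "container": "mp3", "duration": 12.5,
               "bitrate": 128, "sampleRate": 44_100, "channels": 2, "size": 200_000}
call_probe = {"format": "pcm_s16le", "container": "wav", "duration": 4.0,
              "bitrate": 128, "sampleRate": 8_000, "channels": 1, "size": 64_044}

print(needs_telephony_conversion(music_probe))  # True
print(needs_telephony_conversion(call_probe))   # False
```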

All audio processing steps (except Probe) return an audio object:

{
  "audio": {
    "url": "gs://your-org/audio/converted_abc123.mp3",
    "filename": "converted_abc123.mp3",
    "format": "mp3",
    "size": 245760
  },
  "durationMs": 5392
}
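If the step output is consumed outside QuickFlo, for example in a webhook handler, the object can be read like any other JSON. The sketch below parses the sample above and derives a couple of convenience values; the `describe_audio_output` helper is hypothetical.

```python
import json

raw_output = """
{
  "audio": {
    "url": "gs://your-org/audio/converted_abc123.mp3",
    "filename": "converted_abc123.mp3",
    "format": "mp3",
    "size": 245760
  },
  "durationMs": 5392
}
"""


def describe_audio_output(output: dict) -> dict:
    """Pull out the storage URL and convert size and duration to friendlier units."""
    audio = output["audio"]
    return {
        "url": audio["url"],
        "sizeKiB": audio["size"] / 1024,            # bytes to kibibytes
        "durationSeconds": output["durationMs"] / 1000,  # milliseconds to seconds
    }


description = describe_audio_output(json.loads(raw_output))
print(description["url"])              # gs://your-org/audio/converted_abc123.mp3
print(description["sizeKiB"])          # 240.0
print(description["durationSeconds"])  # 5.392
```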

Reference the output URL in later steps:

{{ audio-convert.audio.url }}