Architecture

Updated: 2026-05-11

System Context

VideoToolbox Remote bridges a standard FFmpeg client to a dedicated macOS compression server.

flowchart LR
    User["Human User"] --> Client["FFmpeg Client"]
    Client -->|"TCP (Frames/Annex B)"| Server["vtremoted (macOS)"]
    Server -->|CVPixelBuffer| VT["VideoToolbox API"]
    VT -->|"Hardware Encode"| HW["Apple Silicon / T2"]

1. Components

Client (FFmpeg)

Server (vtremoted, macOS)

2. Data Flow (Encode)

  1. Handshake: Client and Server exchange HELLO messages.
  2. Config: Client sends CONFIGURE, Server creates VTCompressionSession.
  3. Stream:
    • In: FRAME (pixels, optional side data)
    • Out: PACKET (H.264/HEVC, optional side data)
  4. Teardown: Client sends FLUSH, then closes.
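
The ordering of the encode steps above can be sketched as a small state machine over client-sent messages. This is a minimal illustration, not the actual vtremoted logic: the state names and the `validate_encode_sequence` helper are hypothetical, and strict rejection of out-of-order messages is an assumption.

```python
from enum import Enum, auto


class State(Enum):
    START = auto()
    HELLO_DONE = auto()
    CONFIGURED = auto()
    FLUSHED = auto()


# Client-to-server messages allowed in each state, and the state they lead to.
# Assumption: the server rejects anything outside this table.
TRANSITIONS = {
    (State.START, "HELLO"): State.HELLO_DONE,
    (State.HELLO_DONE, "CONFIGURE"): State.CONFIGURED,
    (State.CONFIGURED, "FRAME"): State.CONFIGURED,
    (State.CONFIGURED, "FLUSH"): State.FLUSHED,
}


def validate_encode_sequence(messages):
    """Walk a list of message names and return the final state.

    Raises ValueError on the first message that is not allowed
    in the current state.
    """
    state = State.START
    for msg in messages:
        key = (state, msg)
        if key not in TRANSITIONS:
            raise ValueError(f"{msg} not allowed in state {state.name}")
        state = TRANSITIONS[key]
    return state
```

A well-formed encode session (`HELLO`, `CONFIGURE`, any number of `FRAME` messages, `FLUSH`) ends in `State.FLUSHED`; sending `FRAME` before `CONFIGURE` raises.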

3. Data Flow (Decode)

  1. Handshake: Client and Server exchange HELLO messages.
  2. Config: Client sends CONFIGURE, Server creates VTDecompressionSession.
  3. Stream:
    • In: PACKET (Annex B, optional side data)
    • Out: FRAME (software planes or negotiated VideoToolbox output)
  4. Teardown: Client sends FLUSH, then closes.

4. Data Flow (Transcode)

  1. Handshake: Client and Server exchange HELLO messages.
  2. Config: Client sends CONFIGURE with mode=transcode.
  3. Stream:
    • In: PACKET (Annex B, optional side data)
    • Out: PACKET (Annex B, optional side data)
  4. Teardown: Client sends FLUSH, then closes.
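
The three data flows differ only in which message type travels in each direction during the stream phase. The lookup below summarizes that. It is a sketch: only `mode=transcode` appears in this document, so the `"encode"` and `"decode"` mode strings and the `stream_messages` helper are assumptions.

```python
from typing import NamedTuple


class ModeIO(NamedTuple):
    client_sends: str    # message type flowing into the server
    server_returns: str  # message type flowing back to the client


# Stream-phase message types per mode, taken from the data-flow lists.
# The "encode" and "decode" keys are assumed names for the other two modes.
MODE_IO = {
    "encode": ModeIO("FRAME", "PACKET"),
    "decode": ModeIO("PACKET", "FRAME"),
    "transcode": ModeIO("PACKET", "PACKET"),
}


def stream_messages(mode):
    """Return the (client_sends, server_returns) pair for a mode."""
    try:
        return MODE_IO[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```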

5. Capability-Gated Media Surfaces

The protocol advertises optional capabilities. A newer client can therefore keep using the original software-frame paths against an older server, while requests for newer features fail at configure time. The negotiated 0.4.1 surfaces include:

Hardware-frame ingest across a network is represented as an explicit upload path: local VideoToolbox frames are mapped into the negotiated wire pixel format before the server creates its own CVPixelBuffer. Handles such as IOSurface or CVPixelBuffer references are not treated as cross-host zero-copy objects.
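
Capability gating at configure time might look like the following sketch. The capability name `hw_frame_upload`, the `ConfigureError` type, and the `check_configure` helper are all hypothetical; the real wire encoding of capabilities is not shown here.

```python
class ConfigureError(Exception):
    """Raised when a CONFIGURE request needs a capability the server lacks."""


def check_configure(server_caps, requested):
    """Accept a request only if every requested capability was advertised.

    server_caps: capabilities the server advertised during the handshake.
    requested:   capabilities the client's CONFIGURE depends on.
    An empty request stands for the original software-frame path.
    """
    missing = sorted(set(requested) - set(server_caps))
    if missing:
        raise ConfigureError("server lacks: " + ", ".join(missing))
    return True
```

With an older server that advertises nothing, `check_configure(set(), set())` still succeeds, while a request that depends on `hw_frame_upload` fails during configure instead of partway through the stream.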

6. Repository Layout

7. Performance Defaults

Defaults applied when the client does not override settings:

| Property | Default | Purpose |
| --- | --- | --- |
| ExpectedFrameRate | from client | Helps VT optimize encode pipeline |
| PrioritizeEncodingSpeedOverQuality | unset | Uses VideoToolbox default unless explicitly set |
| RealTime | false | Maximize throughput over latency |
| MaximizePowerEfficiency | false | Maximize speed over power |
| MaxFrameDelayCount | from -bf | Enable/limit frame reordering |
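
One way to read these defaults is as a base property set that client overrides are layered on top of. The sketch below assumes a hypothetical `resolve_properties` helper and assumes the `-bf` value is passed through unchanged as `MaxFrameDelayCount`; the actual mapping may differ.

```python
def resolve_properties(expected_frame_rate, bf, overrides=None):
    """Build the session property set from defaults plus client overrides.

    expected_frame_rate: frame rate reported by the client.
    bf:                  the client's -bf value (direct pass-through is assumed).
    overrides:           properties the client set explicitly.
    """
    props = {
        "ExpectedFrameRate": expected_frame_rate,
        "RealTime": False,
        "MaximizePowerEfficiency": False,
        "MaxFrameDelayCount": bf,
    }
    # PrioritizeEncodingSpeedOverQuality is deliberately absent: leaving it
    # unset lets VideoToolbox apply its own default unless the client sets it.
    props.update(overrides or {})
    return props
```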

[!NOTE] Remote decode defaults to async with a reorder depth of 2. The reorder buffer sorts by PTS and clamps only when PTS would regress.
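
The reorder behavior in the note can be illustrated with a small min-heap keyed on PTS. This is a sketch under assumptions: the `ReorderBuffer` class is hypothetical, and clamping a regressing PTS to the last emitted PTS is one possible reading of "clamps".

```python
import heapq


class ReorderBuffer:
    """Hold up to `depth` frames and release them in PTS order."""

    def __init__(self, depth=2):
        self.depth = depth
        self._heap = []        # entries are (pts, arrival_seq, frame)
        self._seq = 0          # tie-breaker so frames are never compared
        self._last_pts = None  # PTS of the most recently emitted frame

    def _emit(self):
        pts, _, frame = heapq.heappop(self._heap)
        # Clamp only when the output PTS would otherwise go backwards.
        if self._last_pts is not None and pts < self._last_pts:
            pts = self._last_pts
        self._last_pts = pts
        return pts, frame

    def push(self, pts, frame):
        """Add a decoded frame; return frames that are ready for output."""
        heapq.heappush(self._heap, (pts, self._seq, frame))
        self._seq += 1
        out = []
        while len(self._heap) > self.depth:
            out.append(self._emit())
        return out

    def flush(self):
        """Drain all remaining frames in PTS order."""
        out = []
        while self._heap:
            out.append(self._emit())
        return out
```

Frames arriving with PTS 0, 3, 1, 2 leave the buffer as 0, 1, 2, 3. A frame that arrives too late to be sorted into place is emitted with its PTS raised to the last emitted value rather than dropped.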