feat(media): mandatory metadata scrubbing on /feed/publish + FFmpeg sidecar

Every photo from a phone camera ships with an EXIF block that leaks
GPS coordinates, camera model + serial, original timestamp, software
name, author/copyright fields, and sometimes an embedded thumbnail that
survives cropping. For a social feed positioned as privacy-friendly
we can't trust the client alone to scrub: a compromised build,
a future plugin, or a hostile fork would simply skip the step and
leak authorship data.

So: server-side scrub is mandatory for every /feed/publish upload.

New package: media

media/scrub.go
- Scrubber type with Scrub(ctx, bytes, claimedMIME) → (clean, actualMIME)
- ScrubImage handles JPEG/PNG/GIF/WebP in-process: decodes, optionally
  downscales to 1080px max-dim, re-encodes as JPEG Q=75. Stdlib
  jpeg.Encode emits ZERO metadata → scrub is complete by construction.
- Sidecar client (HTTP): posts video/audio bytes to an external
  FFmpeg worker at DCHAIN_MEDIA_SIDECAR_URL
- Magic-byte MIME detection: rejects uploads where declared MIME
  doesn't match actual bytes (prevents a PDF dressed as image/jpeg
  from bypassing the scrubber)
- ErrSidecarUnavailable: explicit error when video arrives but no
  sidecar is wired; operator opts in to fallback via
  --allow-unscrubbed-video (default: reject)
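
For orientation, the image path described above (sniff the real type, reject a mismatch, decode, re-encode) could look roughly like the sketch below. It is not the code in media/scrub.go: the function name scrubImage, the errMIMEMismatch sentinel, and the use of http.DetectContentType for magic-byte sniffing are assumptions made for illustration, and the downscale step and WebP decoding (which the standard library does not provide) are left out.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"image"
	"image/color"
	_ "image/gif" // registers the GIF decoder with image.Decode
	"image/jpeg"
	"image/png"
	"net/http"
)

// errMIMEMismatch is a hypothetical sentinel; the real package may differ.
var errMIMEMismatch = errors.New("declared MIME does not match content")

// scrubImage sniffs the real type from the leading bytes, rejects a
// mismatch with the claimed MIME, then decodes and re-encodes as JPEG.
// Only decoded pixels reach the encoder, so EXIF/GPS segments in the
// input are not carried over.
func scrubImage(raw []byte, claimedMIME string) ([]byte, string, error) {
	if actual := http.DetectContentType(raw); actual != claimedMIME {
		return nil, "", fmt.Errorf("%w: claimed %q, detected %q",
			errMIMEMismatch, claimedMIME, actual)
	}
	img, _, err := image.Decode(bytes.NewReader(raw))
	if err != nil {
		return nil, "", fmt.Errorf("decode image: %w", err)
	}
	var out bytes.Buffer
	if err := jpeg.Encode(&out, img, &jpeg.Options{Quality: 75}); err != nil {
		return nil, "", fmt.Errorf("encode jpeg: %w", err)
	}
	return out.Bytes(), "image/jpeg", nil
}

// samplePNG builds a tiny in-memory PNG for the demo in main.
func samplePNG() []byte {
	img := image.NewRGBA(image.Rect(0, 0, 8, 8))
	img.Set(1, 1, color.RGBA{R: 200, G: 30, B: 30, A: 255})
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	clean, mime, err := scrubImage(samplePNG(), "image/png")
	fmt.Println(mime, len(clean) > 0, err)

	// A PDF header declared as image/jpeg is rejected before decoding.
	_, _, err = scrubImage([]byte("%PDF-1.7\n"), "image/jpeg")
	fmt.Println(errors.Is(err, errMIMEMismatch))
}
```

Sniffing before decoding means a mislabeled payload never reaches an image decoder, which is the property the magic-byte bullet is after.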

media/scrub_test.go
- Crafted EXIF segment with a "SECRETGPS-…Canon-EOS-R5" canary;
  verifies the string is gone after ScrubImage
- Downscale test (2000×1000 → 1080×540, aspect preserved)
- MIME-mismatch rejection
- Magic-byte detector sanity table
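
The canary and downscale checks listed above can be reproduced in a few lines. The sketch below is an assumption about how such a test might be built, not the contents of media/scrub_test.go: the canary string is a shortened stand-in, reencode stands in for ScrubImage, and fitWithin is a hypothetical helper that only computes target dimensions.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg"
)

// canary is an illustrative marker, not the commit's exact string.
const canary = "SECRETGPS-canary"

// jpegWithCanary encodes a small JPEG and splices an APP1 ("Exif")
// segment carrying the canary right after the SOI marker.
func jpegWithCanary() []byte {
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, image.NewGray(image.Rect(0, 0, 16, 16)), nil); err != nil {
		panic(err)
	}
	src := buf.Bytes()
	payload := append([]byte("Exif\x00\x00"), canary...)
	segLen := len(payload) + 2 // the length field counts its own two bytes
	var out bytes.Buffer
	out.Write(src[:2]) // SOI
	out.Write([]byte{0xFF, 0xE1, byte(segLen >> 8), byte(segLen)})
	out.Write(payload)
	out.Write(src[2:])
	return out.Bytes()
}

// reencode decodes and re-encodes a JPEG, standing in for ScrubImage.
func reencode(raw []byte) ([]byte, error) {
	img, err := jpeg.Decode(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	if err := jpeg.Encode(&out, img, &jpeg.Options{Quality: 75}); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// containsCanary reports whether the canary bytes appear in b.
func containsCanary(b []byte) bool {
	return bytes.Contains(b, []byte(canary))
}

// fitWithin scales (w, h) so the longer side is at most maxDim,
// preserving aspect ratio with integer arithmetic.
func fitWithin(w, h, maxDim int) (int, int) {
	if w <= maxDim && h <= maxDim {
		return w, h
	}
	if w >= h {
		return maxDim, h * maxDim / w
	}
	return w * maxDim / h, maxDim
}

func main() {
	dirty := jpegWithCanary()
	clean, err := reencode(dirty)
	if err != nil {
		panic(err)
	}
	fmt.Println(containsCanary(dirty), containsCanary(clean))
	fmt.Println(fitWithin(2000, 1000, 1080))
}
```

Asserting on a known byte string keeps the test independent of any EXIF parser: if the marker survives anywhere in the output, the scrub has failed.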

FFmpeg sidecar: new docker/media-sidecar/

Tiny Go HTTP service (~180 LOC, no non-stdlib deps) that shells out
to ffmpeg with -map_metadata -1 plus -map 0:v -map 0:a? to guarantee
only video + audio streams survive (no subtitles, attached pictures,
or data channels that could carry hidden info).

Re-encode profile:
  video → H.264 CRF 28 preset=fast, Opus 64k, MP4 faststart
  audio → Opus 64k, Ogg container

Dockerfile: two-stage build (Go → alpine+ffmpeg), ~90 MB image,
non-root user, /healthz endpoint for compose probes.

Node reaches it via DCHAIN_MEDIA_SIDECAR_URL. Without it, video uploads
are rejected with 503 unless the operator sets
DCHAIN_ALLOW_UNSCRUBBED_VIDEO.
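
As a rough illustration of the invocation described in this section, the argument lists might be assembled as below. The function names, the placeholder file names, and the encoder names libx264 and libopus are assumptions; the commit only states the codecs, rates, and mapping flags. On older ffmpeg builds, Opus inside MP4 may additionally require relaxing the strictness setting.

```go
package main

import (
	"fmt"
	"strings"
)

// videoArgs builds an ffmpeg argument list for the video profile:
// H.264 CRF 28 preset fast, Opus 64k, MP4 with faststart.
func videoArgs(in, out string) []string {
	return []string{
		"-i", in,
		"-map_metadata", "-1", // drop global metadata
		"-map", "0:v", // keep video streams
		"-map", "0:a?", // keep audio streams if any exist
		"-c:v", "libx264", "-crf", "28", "-preset", "fast",
		"-c:a", "libopus", "-b:a", "64k",
		"-movflags", "+faststart",
		"-f", "mp4", out,
	}
}

// audioArgs builds the argument list for the audio profile:
// Opus 64k in an Ogg container.
func audioArgs(in, out string) []string {
	return []string{
		"-i", in,
		"-map_metadata", "-1",
		"-map", "0:a",
		"-c:a", "libopus", "-b:a", "64k",
		"-f", "ogg", out,
	}
}

// hasPair reports whether flag is immediately followed by value.
func hasPair(args []string, flag, value string) bool {
	for i := 0; i+1 < len(args); i++ {
		if args[i] == flag && args[i+1] == value {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("ffmpeg " + strings.Join(videoArgs("in.bin", "out.mp4"), " "))
	fmt.Println("ffmpeg " + strings.Join(audioArgs("in.bin", "out.ogg"), " "))
}
```

In a real service these slices would be passed to exec.CommandContext so the request deadline also bounds the ffmpeg process. One caveat worth checking against the ffmpeg documentation: the lowercase v stream specifier matches every video stream, including attached pictures and cover art, while the uppercase V specifier is the one documented to exclude them; the sketch keeps the 0:v form the commit message states.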

/feed/publish wiring
- cfg.Scrubber is a required dependency (a nil Scrubber fails the
  upload with 503)
- Before storing the post body we call scrubber.Scrub(); attachment bytes
  + MIME are replaced with the cleaned version
- content_hash is computed over the SCRUBBED bytes, so the on-chain
  CREATE_POST tx references exactly what readers will fetch
- EstimatedFeeUT uses the scrubbed size, so the author's fee reflects
  actual on-disk cost
- Content-type mismatches → 400; sidecar unavailable for video → 503
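
The status mapping and the hash-over-scrubbed-bytes rule above can be condensed into a small sketch. It is written under assumptions: ErrSidecarUnavailable here is a local stand-in for media.ErrSidecarUnavailable, scrubOutcome and contentHash are hypothetical helper names, the hex encoding of the digest is a guess, and errors.Is is used so that wrapped errors still map to 503 (the handler in the diff compares with ==).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"net/http"
)

// ErrSidecarUnavailable stands in for media.ErrSidecarUnavailable.
var ErrSidecarUnavailable = errors.New("media sidecar unavailable")

// contentHash hashes the post text followed by the scrubbed attachment,
// so the digest describes exactly the bytes that get stored and served.
func contentHash(content string, scrubbed []byte) string {
	h := sha256.New()
	h.Write([]byte(content))
	h.Write(scrubbed)
	return hex.EncodeToString(h.Sum(nil))
}

// scrubOutcome maps a Scrub error to an HTTP status. A result of 0
// means "carry on": either the scrub succeeded or the operator opted
// in to storing unscrubbed video.
func scrubOutcome(err error, allowUnscrubbedVideo bool) int {
	switch {
	case err == nil:
		return 0
	case errors.Is(err, ErrSidecarUnavailable):
		if allowUnscrubbedVideo {
			return 0
		}
		return http.StatusServiceUnavailable
	default:
		return http.StatusBadRequest
	}
}

// wrap mimics the handler's error wrapping before the response is sent.
func wrap(err error) error {
	return fmt.Errorf("scrub attachment: %w", err)
}

func main() {
	fmt.Println(scrubOutcome(wrap(ErrSidecarUnavailable), false))
	fmt.Println(scrubOutcome(errors.New("mime mismatch"), false))
	fmt.Println(contentHash("hello", []byte{0xFF, 0xD8}))
}
```

Hashing after the scrub matters because re-encoding changes the bytes: a digest taken over the uploaded file would never match what readers later download.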

Flags / env vars
  --feed-db                / DCHAIN_FEED_DB                 (existing)
  --feed-ttl-days          / DCHAIN_FEED_TTL_DAYS           (existing)
  --media-sidecar-url      / DCHAIN_MEDIA_SIDECAR_URL       (NEW)
  --allow-unscrubbed-video / DCHAIN_ALLOW_UNSCRUBBED_VIDEO  (NEW; default false)
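
One plausible way to wire the two new settings is to let the environment variable supply the flag's default, so an explicit flag wins. This is only a sketch under that assumption; the node's actual precedence rules, the mediaOpts type, and the parseMediaOpts helper are not taken from the commit.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// mediaOpts is a hypothetical holder for the two new settings.
type mediaOpts struct {
	SidecarURL           string
	AllowUnscrubbedVideo bool
}

// parseMediaOpts reads defaults from the environment (via getenv) and
// lets command-line flags override them. An unset or unparsable
// DCHAIN_ALLOW_UNSCRUBBED_VIDEO falls back to false.
func parseMediaOpts(args []string, getenv func(string) string) (mediaOpts, error) {
	fs := flag.NewFlagSet("node", flag.ContinueOnError)
	envAllow, _ := strconv.ParseBool(getenv("DCHAIN_ALLOW_UNSCRUBBED_VIDEO"))

	var o mediaOpts
	fs.StringVar(&o.SidecarURL, "media-sidecar-url",
		getenv("DCHAIN_MEDIA_SIDECAR_URL"), "base URL of the FFmpeg sidecar")
	fs.BoolVar(&o.AllowUnscrubbedVideo, "allow-unscrubbed-video",
		envAllow, "store video as-is when no sidecar is configured")

	if err := fs.Parse(args); err != nil {
		return mediaOpts{}, err
	}
	return o, nil
}

func main() {
	o, err := parseMediaOpts(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Passing getenv as a parameter keeps the parser testable without mutating the process environment.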

Client responsibilities (for reference; client work lands in Phase C)

Even with server-side scrub, the client should still compress aggressively
BEFORE upload, because:
- upload time is ~N× larger for uncompressed media (mobile networks)
- the server's 256 KiB MaxPostSize is a HARD cap: oversized uploads
  are rejected, not silently truncated
- the on-chain fee is size-based, so users pay for every byte the
  client didn't bother to shrink

Recommended client pipeline:
  images → expo-image-manipulator: resize max-dim 1080px, WebP or
           JPEG quality 50-60
  videos → react-native-compressor: H.264 CRF 28, 720p max, 64k audio
  audio  → expo-audio's default Opus 32k (already compressed)

Documented in docs/media-sidecar.md (added later with Phase C PR).

Tests
- go test ./... green across 6 packages (blockchain, consensus, identity,
  media, relay, vm)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -29,11 +29,13 @@ package node
 // re-publish to another relay.
 
 import (
+	"context"
 	"crypto/sha256"
 	"encoding/base64"
 	"encoding/hex"
 	"encoding/json"
 	"fmt"
+	"log"
 	"net/http"
 	"sort"
 	"strings"
@@ -41,6 +43,7 @@ import (
 
 	"go-blockchain/blockchain"
 	"go-blockchain/identity"
+	"go-blockchain/media"
 	"go-blockchain/relay"
 )
 
@@ -53,6 +56,18 @@ type FeedConfig struct {
 	// /feed/publish so the client knows who to put in CREATE_POST tx.
 	HostingRelayPub string
 
+	// Scrubber strips metadata from image/video/audio attachments before
+	// they are stored. MUST be non-nil; a zero Scrubber (NewScrubber with
+	// empty sidecar URL) still handles images in-process — only video/audio
+	// require sidecar config.
+	Scrubber *media.Scrubber
+
+	// AllowUnscrubbedVideo controls server behaviour when a video upload
+	// arrives and no sidecar is configured. false (default) → reject; true
+	// → store as-is with a warning log. Set via --allow-unscrubbed-video
+	// flag on the node. Leave false in production.
+	AllowUnscrubbedVideo bool
+
 	// Chain lookups (nil-safe; endpoints degrade gracefully).
 	GetPost func(postID string) (*blockchain.PostRecord, error)
 	LikeCount func(postID string) (uint64, error)
@@ -136,6 +151,7 @@ func feedPublish(cfg FeedConfig) http.HandlerFunc {
 
 		// Decode attachment.
 		var attachment []byte
+		var attachmentMIME string
 		if req.AttachmentB64 != "" {
 			b, err := base64.StdEncoding.DecodeString(req.AttachmentB64)
 			if err != nil {
@@ -145,11 +161,48 @@ func feedPublish(cfg FeedConfig) http.HandlerFunc {
 				}
 			}
 			attachment = b
+			attachmentMIME = req.AttachmentMIME
+
+			// MANDATORY server-side scrub: strip ALL metadata (EXIF/GPS/
+			// camera/author/ICC/etc.) and re-compress. Client is expected
+			// to have done a first pass, but we never trust it — a photo
+			// from a phone carries GPS coordinates by default and the client
+			// might forget or a hostile client might skip the scrub entirely.
+			//
+			// Images are handled in-process (stdlib re-encode to JPEG kills
+			// all metadata by construction). Videos/audio are forwarded to
+			// the media sidecar; if none is configured and the operator
+			// hasn't opted in to AllowUnscrubbedVideo, we reject.
+			if cfg.Scrubber == nil {
+				jsonErr(w, fmt.Errorf("media scrubber not configured on this node"), 503)
+				return
+			}
+			ctx, cancel := context.WithTimeout(r.Context(), 60*time.Second)
+			cleaned, newMIME, err := cfg.Scrubber.Scrub(ctx, attachment, attachmentMIME)
+			cancel()
+			if err != nil {
+				// Graceful video fallback only when explicitly allowed.
+				if err == media.ErrSidecarUnavailable && cfg.AllowUnscrubbedVideo {
+					// Keep bytes as-is (operator accepted the risk), just log.
+					log.Printf("[feed] WARNING: storing unscrubbed video — no sidecar configured (author=%s)", req.Author)
+				} else {
+					status := 400
+					if err == media.ErrSidecarUnavailable {
+						status = 503
+					}
+					jsonErr(w, fmt.Errorf("scrub attachment: %w", err), status)
+					return
+				}
+			} else {
+				attachment = cleaned
+				attachmentMIME = newMIME
+			}
 		}
 
-		// Content hash binds the body to the on-chain metadata. We hash
-		// content+attachment so the client can't publish body-A off-chain
-		// and commit hash-of-body-B on-chain.
+		// Content hash is computed over the scrubbed bytes — that's what
+		// the on-chain tx will reference, and what readers fetch. Binds
+		// the body to the metadata so a misbehaving relay can't substitute
+		// a different body under the same PostID.
 		h := sha256.New()
 		h.Write([]byte(req.Content))
 		h.Write(attachment)
@@ -181,7 +234,7 @@ func feedPublish(cfg FeedConfig) http.HandlerFunc {
 			Content: req.Content,
 			ContentType: req.ContentType,
 			Attachment: attachment,
-			AttachmentMIME: req.AttachmentMIME,
+			AttachmentMIME: attachmentMIME,
 			ReplyTo: req.ReplyTo,
 			QuoteOf: req.QuoteOf,
 		}