Every photo from a phone camera ships with an EXIF block that leaks:
GPS coordinates, camera model + serial, original timestamp, software
name, author/copyright fields, sometimes an embedded thumbnail that
survives cropping. For a social feed positioned as privacy-friendly
we can't trust the client alone to scrub — a compromised build,
a future plugin, or a hostile fork would simply skip the step and
leak authorship data.
So: server-side scrub is mandatory for every /feed/publish upload.
New package: media
media/scrub.go
- Scrubber type with Scrub(ctx, bytes, claimedMIME) → (clean, actualMIME)
- ScrubImage handles JPEG/PNG/GIF/WebP in-process: decodes, optionally
downscales to 1080px max-dim, re-encodes as JPEG Q=75. Stdlib
jpeg.Encode emits ZERO metadata → scrub is complete by construction.
- Sidecar client (HTTP): posts video/audio bytes to an external
FFmpeg worker at DCHAIN_MEDIA_SIDECAR_URL
- Magic-byte MIME detection: rejects uploads where declared MIME
doesn't match actual bytes (prevents a PDF dressed as image/jpeg
from bypassing the scrubber)
- ErrSidecarUnavailable: explicit error when video arrives but no
sidecar is wired; operator opts in to fallback via
--allow-unscrubbed-video (default: reject)
media/scrub_test.go
- Crafted EXIF segment with "SECRETGPS-…Canon-EOS-R5" canary —
verifies the string is gone after ScrubImage
- Downscale test (2000×1000 → 1080×540, aspect preserved)
- MIME-mismatch rejection
- Magic-byte detector sanity table
FFmpeg sidecar — new docker/media-sidecar/
Tiny Go HTTP service (~180 LOC, no non-stdlib deps) that shells out
to ffmpeg with -map_metadata -1 + -map 0:v -map 0:a? to guarantee
only video + audio streams survive (no subtitles, attached pictures,
or data channels that could carry hidden info).
Re-encode profile:
video → H.264 CRF 28 preset=fast, Opus 64k, MP4 faststart
audio → Opus 64k, Ogg container
Dockerfile: two-stage build (Go → alpine+ffmpeg), ~90 MB image,
non-root user, /healthz endpoint for compose probes.
Node reaches it via DCHAIN_MEDIA_SIDECAR_URL. Without it, video uploads
are rejected with 503 unless operator sets DCHAIN_ALLOW_UNSCRUBBED_VIDEO.
/feed/publish wiring
- cfg.Scrubber is a required dependency
- Before storing post body we call scrubber.Scrub(); attachment bytes
+ MIME are replaced with the cleaned version
- content_hash is computed over the SCRUBBED bytes — so the on-chain
CREATE_POST tx references exactly what readers will fetch
- EstimatedFeeUT uses the scrubbed size, so author's fee reflects
actual on-disk cost
- Content-type mismatches → 400; sidecar unavailable for video → 503
Flags / env vars
--feed-db / DCHAIN_FEED_DB (existing)
--feed-ttl-days / DCHAIN_FEED_TTL_DAYS (existing)
--media-sidecar-url / DCHAIN_MEDIA_SIDECAR_URL (NEW)
--allow-unscrubbed-video / DCHAIN_ALLOW_UNSCRUBBED_VIDEO (NEW; default false)
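One way the two new settings could be parsed, shown as a sketch: the mediaConfig type and parseMediaConfig function are hypothetical, and the precedence (an explicit flag overrides the environment variable, which overrides the default) is an assumption rather than something this message specifies.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// mediaConfig is a hypothetical holder for the two new settings.
type mediaConfig struct {
	SidecarURL           string
	AllowUnscrubbedVideo bool
}

// parseMediaConfig reads env vars as defaults and lets flags override them.
// getenv is injected so the function can be exercised without touching the
// real process environment.
func parseMediaConfig(args []string, getenv func(string) string) (mediaConfig, error) {
	allowDefault := false // default: reject unscrubbed video
	if v := getenv("DCHAIN_ALLOW_UNSCRUBBED_VIDEO"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return mediaConfig{}, fmt.Errorf("DCHAIN_ALLOW_UNSCRUBBED_VIDEO: %w", err)
		}
		allowDefault = b
	}

	var cfg mediaConfig
	fs := flag.NewFlagSet("node", flag.ContinueOnError)
	fs.StringVar(&cfg.SidecarURL, "media-sidecar-url",
		getenv("DCHAIN_MEDIA_SIDECAR_URL"), "URL of the FFmpeg sidecar")
	fs.BoolVar(&cfg.AllowUnscrubbedVideo, "allow-unscrubbed-video",
		allowDefault, "accept video without scrubbing when no sidecar is configured")
	if err := fs.Parse(args); err != nil {
		return mediaConfig{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseMediaConfig(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("sidecar=%q allowUnscrubbed=%v\n", cfg.SidecarURL, cfg.AllowUnscrubbedVideo)
}
```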
Client responsibilities (for reference — client work lands in Phase C)
Even with server-side scrub, the client should still compress aggressively
BEFORE upload, because:
- upload time is ~N× longer for uncompressed media (mobile networks)
- the server's 256 KiB MaxPostSize is a HARD cap — oversized uploads
are rejected, not silently truncated
- the on-chain fee is size-based, so users pay for every byte the
client didn't bother to shrink
Recommended client pipeline:
images → expo-image-manipulator: resize max-dim 1080px, WebP or
JPEG quality 50-60
videos → react-native-compressor: H.264 CRF 28, 720p max, 64k audio
audio → expo-audio's default Opus 32k (already compressed)
Documented in docs/media-sidecar.md (added later with Phase C PR).
Tests
- go test ./... green across 6 packages (blockchain consensus identity
media relay vm)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
150 lines
4.6 KiB
Go
package media

import (
	"bytes"
	"image"
	"image/color"
	"image/jpeg"
	"testing"
)

// TestScrubImageRemovesEXIF: our scrubber re-encodes via stdlib JPEG, which
// does not preserve EXIF by construction. We verify that a crafted input
// carrying an EXIF marker produces an output without one.
func TestScrubImageRemovesEXIF(t *testing.T) {
	// Build a JPEG that explicitly contains an APP1 EXIF segment.
	// Structure: JPEG SOI + APP1 with "Exif\x00\x00" header + real image data.
	var base bytes.Buffer
	img := image.NewRGBA(image.Rect(0, 0, 8, 8))
	for y := 0; y < 8; y++ {
		for x := 0; x < 8; x++ {
			img.Set(x, y, color.RGBA{uint8(x * 32), uint8(y * 32), 128, 255})
		}
	}
	if err := jpeg.Encode(&base, img, &jpeg.Options{Quality: 80}); err != nil {
		t.Fatalf("encode base: %v", err)
	}
	input := injectEXIF(t, base.Bytes())

	if !bytes.Contains(input, []byte("Exif\x00\x00")) {
		t.Fatalf("test setup broken: EXIF not injected")
	}
	// Also drop an identifiable string in the EXIF payload so we can prove
	// it's gone.
	if !bytes.Contains(input, []byte("SECRETGPS")) {
		t.Fatalf("test setup broken: EXIF marker not injected")
	}

	cleaned, mime, err := ScrubImage(input, "image/jpeg")
	if err != nil {
		t.Fatalf("ScrubImage: %v", err)
	}
	if mime != "image/jpeg" {
		t.Errorf("mime: got %q, want image/jpeg", mime)
	}
	// Verify the scrubbed output doesn't contain our canary string.
	if bytes.Contains(cleaned, []byte("SECRETGPS")) {
		t.Errorf("EXIF canary survived scrub: metadata not stripped")
	}
	// Verify the output doesn't contain the EXIF segment marker.
	if bytes.Contains(cleaned, []byte("Exif\x00\x00")) {
		t.Errorf("EXIF header string survived scrub")
	}
	// Output must still be a valid JPEG.
	if _, err := jpeg.Decode(bytes.NewReader(cleaned)); err != nil {
		t.Errorf("scrubbed output is not a valid JPEG: %v", err)
	}
}

// injectEXIF splices a synthetic APP1 EXIF segment after the JPEG SOI.
// Segment layout: FF E1 <len_hi> <len_lo> "Exif\0\0" + arbitrary payload.
// The payload is NOT valid TIFF; that's fine because the stdlib JPEG decoder
// skips unknown APP1 segments rather than aborting.
func injectEXIF(t *testing.T, src []byte) []byte {
	t.Helper()
	if len(src) < 2 || src[0] != 0xFF || src[1] != 0xD8 {
		t.Fatalf("not a JPEG")
	}
	payload := []byte("Exif\x00\x00" + "SECRETGPS-51.5074N-0.1278W-Canon-EOS-R5")
	segmentLen := len(payload) + 2 // +2 = 2 bytes of len field itself
	var seg bytes.Buffer
	seg.Write([]byte{0xFF, 0xE1})
	seg.WriteByte(byte(segmentLen >> 8))
	seg.WriteByte(byte(segmentLen & 0xff))
	seg.Write(payload)
	out := make([]byte, 0, len(src)+seg.Len())
	out = append(out, src[:2]...) // SOI
	out = append(out, seg.Bytes()...)
	out = append(out, src[2:]...)
	return out
}

// TestScrubImageMIMEMismatch: rejects bytes that don't match claimed MIME.
func TestScrubImageMIMEMismatch(t *testing.T) {
	var buf bytes.Buffer
	img := image.NewRGBA(image.Rect(0, 0, 4, 4))
	// Fail fast if the fixture itself cannot be built, so a setup error is
	// never mistaken for a correct rejection.
	if err := jpeg.Encode(&buf, img, nil); err != nil {
		t.Fatalf("encode: %v", err)
	}
	// Claim it's a PNG.
	_, _, err := ScrubImage(buf.Bytes(), "image/png")
	if err == nil {
		t.Fatalf("expected ErrMIMEMismatch, got nil")
	}
}

// TestScrubImageDownscale: images over ImageMaxDim are shrunk.
func TestScrubImageDownscale(t *testing.T) {
	// Make a 2000×1000 image: the larger dim, 2000, exceeds 1080.
	img := image.NewRGBA(image.Rect(0, 0, 2000, 1000))
	for y := 0; y < 1000; y++ {
		for x := 0; x < 2000; x++ {
			img.Set(x, y, color.RGBA{128, 64, 200, 255})
		}
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: 80}); err != nil {
		t.Fatalf("encode: %v", err)
	}
	cleaned, _, err := ScrubImage(buf.Bytes(), "image/jpeg")
	if err != nil {
		t.Fatalf("ScrubImage: %v", err)
	}
	decoded, err := jpeg.Decode(bytes.NewReader(cleaned))
	if err != nil {
		t.Fatalf("decode scrubbed: %v", err)
	}
	b := decoded.Bounds()
	if b.Dx() > ImageMaxDim || b.Dy() > ImageMaxDim {
		t.Errorf("not downscaled: got %dx%d, want max %d", b.Dx(), b.Dy(), ImageMaxDim)
	}
	// Aspect ratio roughly preserved (2:1 → 1080:540 with rounding slack).
	if b.Dx() != ImageMaxDim {
		t.Errorf("larger dim: got %d, want %d", b.Dx(), ImageMaxDim)
	}
	// The smaller dim should be about half the larger one, within ±1 px.
	if want := ImageMaxDim / 2; b.Dy() < want-1 || b.Dy() > want+1 {
		t.Errorf("smaller dim: got %d, want about %d", b.Dy(), want)
	}
}

// TestDetectMIME: a few magic-byte cases to ensure magic detection works.
func TestDetectMIME(t *testing.T) {
	cases := []struct {
		data []byte
		want string
	}{
		{[]byte("\xff\xd8\xff\xe0garbage"), "image/jpeg"},
		{[]byte("\x89PNG\r\n\x1a\n..."), "image/png"},
		{[]byte("GIF89a..."), "image/gif"},
		{[]byte{}, ""},
	}
	for _, tc := range cases {
		got := detectMIME(tc.data)
		if got != tc.want {
			t.Errorf("detectMIME(%q): got %q want %q", string(tc.data[:min(len(tc.data), 12)]), got, tc.want)
		}
	}
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}