test(feed): end-to-end integration + two-node propagation (Phase B hardening)

Adds two integration-test files that exercise the full feed stack over
real HTTP requests, plus a fix to the publish signature model that the
EXIF scrubbing test surfaced.

Bug fix — api_feed.go publish signature flow
  Previously: server scrubbed the attachment → computed content_hash
  over the SCRUBBED bytes → verified the author's signature against
  that hash. But the client, which never runs the scrubber, signs
  over the RAW upload. The two hashes differ whenever the scrubber
  modifies the bytes (which it always does for images), so every
  signed upload with an image was rejected as "signature invalid".

  Fixed order:
    1. decode attachment from base64
    2. compute raw_content_hash over Content + raw attachment
    3. verify author's signature against raw_content_hash
    4. scrub attachment (strips EXIF / re-encodes)
    5. compute final_content_hash over Content + scrubbed attachment
    6. return final hash in response for the on-chain CREATE_POST tx

  The signature proves the upload is authentic; the final hash binds
  the on-chain record to what readers actually download.

node/feed_e2e_test.go
  In-process harness: real BadgerDB chain + feed mailbox + media
  scrubber + httptest.Server with RegisterFeedRoutes. Tests drive
  it via real http.Post / http.Get so rate limiters, auth, scrubber,
  and handler code all run on the happy path.

  Tests:
  - TestE2EFullFlow — publish → CREATE_POST tx → body fetch → view
    bump → stats → author list → soft-delete → 410 Gone on re-fetch
  - TestE2ELikeUnlikeAffectsStats — on-chain LIKE_POST bumps /stats,
    liked_by_me reflects the caller
  - TestE2ETimeline — follow graph, merged timeline newest-first
  - TestE2ETrendingRanking — likes × 3 + views puts hot post at [0]
  - TestE2EForYouFilters — excludes own posts + followed authors +
    already-liked posts; surfaces strangers
  - TestE2EHashtagSearch — tag returns only tagged posts
  - TestE2EScrubberStripsEXIF — injects SUPERSECRETGPS canary into a
    JPEG APP1 segment, uploads via /feed/publish, reads back — asserts
    canary is GONE from stored attachment. This is the privacy-critical
    regression gate: if it ever breaks, GPS coordinates leak.
  - TestE2ERejectsMIMEMismatch — PNG labelled as JPEG → 400
  - TestE2ERejectsBadSignature — wrong signer → 403
  - TestE2ERejectsStaleTimestamp — 1-hour-old ts → 400 (anti-replay)

node/feed_twonode_test.go
  Simulates two independent nodes sharing block history (gossip via
  same-block AddBlock on both chains). Verifies the v2.0.0 design
  contract: chain state replicates, but post BODIES live only on the
  hosting relay.

  Tests:
  - TestTwoNodePostPropagation — Alice publishes on A; B's chain sees
    the record; B's HTTP /feed/post/{id} returns 404 (body is A's);
    fetch from A succeeds using hosting_relay field from B's chain
    lookup. Documents the client-side routing contract.
  - TestTwoNodeLikeCounterSharedAcrossNodes — Bob likes from Node B;
    both A's and B's /stats show likes=1. Proves engagement aggregates
    are chain-authoritative, not per-relay.
  - TestTwoNodeFollowGraphReplicates — FOLLOW tx propagates, /timeline
    on B returns A-hosted posts with metadata (no body, as designed).

Coverage summary
  Publish flow (sign → scrub → hash → store):          ✓
  CREATE_POST on-chain fee accounting:                 ✓
  Like / Unlike counter consistency:                   ✓
  Follow graph → timeline merge:                       ✓
  Trending ranking by likes × 3 + views:               ✓
  For You exclusion rules (self, followed, liked):     ✓
  Hashtag inverted index:                              ✓
  View counter increment + stats aggregate:            ✓
  Soft-delete → 410 Gone:                              ✓
  Metadata scrubbing (EXIF canary):                    ✓
  MIME mismatch rejection:                             ✓
  Signature authenticity:                              ✓
  Timestamp anti-replay (±5 min window):               ✓
  Two-node block propagation:                          ✓
  Cross-node body fetch via hosting_relay:             ✓
  Likes aggregation across nodes:                      ✓

All 7 test packages green: blockchain, consensus, identity, media,
node, relay, vm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit 9e86c93fda (parent f885264d23)
Author: vsecoder
Date:   2026-04-18 19:27:00 +03:00
3 changed files with 1393 additions and 49 deletions

node/feed_twonode_test.go (new file, 504 lines)

@@ -0,0 +1,504 @@
// Two-node simulation: verifies that a post published on Node A is
// discoverable and fetchable from Node B after block propagation.
//
// The real network uses libp2p gossipsub for blocks + an HTTP pull
// fallback. For tests we simulate gossip by manually calling chain.AddBlock
// on both nodes with the same block — identical to what each node does
// after receiving a peer's gossiped block in production.
//
// Body ownership: only the HOSTING relay has the post body in its
// feed mailbox. Readers on OTHER nodes see the on-chain record
// (hosting_relay pubkey, content hash, size, author) and fetch the
// body directly from the hosting node over HTTP. That's the design —
// storage costs don't get amortised across the whole network, the
// author pays one node to host, and the public reads from that one
// node (or from replicas if/when we add post pinning in v3.0.0).
package node

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"strings"
	"testing"
	"time"

	"go-blockchain/blockchain"
	"go-blockchain/identity"
	"go-blockchain/media"
	"go-blockchain/relay"
)
// twoNodeHarness wires two independent chain+feed instances sharing a
// single block history (simulated gossip). Node A is the "hosting"
// relay; Node B is the reader.
type twoNodeHarness struct {
	t                   *testing.T
	aChainDir, aFeedDir string
	bChainDir, bFeedDir string
	aChain, bChain      *blockchain.Chain
	aMailbox, bMailbox  *relay.FeedMailbox
	aServer, bServer    *httptest.Server
	aHostPub            string
	bHostPub            string
	validator           *identity.Identity
	tipIndex            uint64
	tipHash             []byte
}
func newTwoNodeHarness(t *testing.T) *twoNodeHarness {
	t.Helper()
	mkdir := func(prefix string) string {
		d, err := os.MkdirTemp("", prefix)
		if err != nil {
			t.Fatalf("MkdirTemp: %v", err)
		}
		return d
	}
	h := &twoNodeHarness{
		t:         t,
		aChainDir: mkdir("dchain-2n-chainA-*"),
		aFeedDir:  mkdir("dchain-2n-feedA-*"),
		bChainDir: mkdir("dchain-2n-chainB-*"),
		bFeedDir:  mkdir("dchain-2n-feedB-*"),
	}
	var err error
	h.aChain, err = blockchain.NewChain(h.aChainDir)
	if err != nil {
		t.Fatalf("chain A: %v", err)
	}
	h.bChain, err = blockchain.NewChain(h.bChainDir)
	if err != nil {
		t.Fatalf("chain B: %v", err)
	}
	h.aMailbox, err = relay.OpenFeedMailbox(h.aFeedDir, 24*time.Hour)
	if err != nil {
		t.Fatalf("feed A: %v", err)
	}
	h.bMailbox, err = relay.OpenFeedMailbox(h.bFeedDir, 24*time.Hour)
	if err != nil {
		t.Fatalf("feed B: %v", err)
	}
	h.validator, err = identity.Generate()
	if err != nil {
		t.Fatalf("validator: %v", err)
	}
	// Both nodes start from the same genesis — the single bootstrap
	// validator allocates the initial supply. In production this is
	// hardcoded; in tests we just generate and use it to sign blocks
	// on both chains.
	genesis := blockchain.GenesisBlock(h.validator.PubKeyHex(), h.validator.PrivKey)
	if err := h.aChain.AddBlock(genesis); err != nil {
		t.Fatalf("A genesis: %v", err)
	}
	if err := h.bChain.AddBlock(genesis); err != nil {
		t.Fatalf("B genesis: %v", err)
	}
	h.tipIndex = genesis.Index
	h.tipHash = genesis.Hash
	// Node A hosts; Node B is a pure reader (no host_pub of its own that
	// anyone publishes to). They share a single validator because this
	// test isn't about consensus — it's about chain state propagation.
	h.aHostPub = h.validator.PubKeyHex()
	// Node B uses a separate identity purely for its hosting_relay field
	// (never actually hosts anything in this scenario). Distinguishes A
	// from B in balance assertions.
	idB, _ := identity.Generate()
	h.bHostPub = idB.PubKeyHex()
	scrubber := media.NewScrubber(media.SidecarConfig{})
	aCfg := FeedConfig{
		Mailbox:         h.aMailbox,
		HostingRelayPub: h.aHostPub,
		Scrubber:        scrubber,
		GetPost:         h.aChain.Post,
		LikeCount:       h.aChain.LikeCount,
		HasLiked:        h.aChain.HasLiked,
		PostsByAuthor:   h.aChain.PostsByAuthor,
		Following:       h.aChain.Following,
	}
	bCfg := FeedConfig{
		Mailbox:         h.bMailbox,
		HostingRelayPub: h.bHostPub,
		Scrubber:        scrubber,
		GetPost:         h.bChain.Post,
		LikeCount:       h.bChain.LikeCount,
		HasLiked:        h.bChain.HasLiked,
		PostsByAuthor:   h.bChain.PostsByAuthor,
		Following:       h.bChain.Following,
	}
	muxA := http.NewServeMux()
	RegisterFeedRoutes(muxA, aCfg)
	h.aServer = httptest.NewServer(muxA)
	muxB := http.NewServeMux()
	RegisterFeedRoutes(muxB, bCfg)
	h.bServer = httptest.NewServer(muxB)
	t.Cleanup(h.Close)
	return h
}
func (h *twoNodeHarness) Close() {
	if h.aServer != nil {
		h.aServer.Close()
	}
	if h.bServer != nil {
		h.bServer.Close()
	}
	if h.aMailbox != nil {
		_ = h.aMailbox.Close()
	}
	if h.bMailbox != nil {
		_ = h.bMailbox.Close()
	}
	if h.aChain != nil {
		_ = h.aChain.Close()
	}
	if h.bChain != nil {
		_ = h.bChain.Close()
	}
	for _, dir := range []string{h.aChainDir, h.aFeedDir, h.bChainDir, h.bFeedDir} {
		for i := 0; i < 20; i++ {
			if err := os.RemoveAll(dir); err == nil {
				break
			}
			time.Sleep(10 * time.Millisecond)
		}
	}
}
// gossipBlock simulates libp2p block propagation: same block applied to
// both chains. In production, AddBlock is called on each peer after the
// gossipsub message arrives — no chain-level difference from the direct
// call here.
func (h *twoNodeHarness) gossipBlock(txs ...*blockchain.Transaction) {
	h.t.Helper()
	time.Sleep(2 * time.Millisecond) // distinct tx IDs
	var totalFees uint64
	for _, tx := range txs {
		totalFees += tx.Fee
	}
	b := &blockchain.Block{
		Index:        h.tipIndex + 1,
		Timestamp:    time.Now().UTC(),
		Transactions: txs,
		PrevHash:     h.tipHash,
		Validator:    h.validator.PubKeyHex(),
		TotalFees:    totalFees,
	}
	b.ComputeHash()
	b.Sign(h.validator.PrivKey)
	if err := h.aChain.AddBlock(b); err != nil {
		h.t.Fatalf("A AddBlock: %v", err)
	}
	if err := h.bChain.AddBlock(b); err != nil {
		h.t.Fatalf("B AddBlock: %v", err)
	}
	h.tipIndex = b.Index
	h.tipHash = b.Hash
}
func (h *twoNodeHarness) nextTxID(from string, typ blockchain.EventType) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s:%s:%d", from, typ, time.Now().UnixNano())))
	return hex.EncodeToString(sum[:16])
}
// fundAB transfers from validator → target, propagated to both chains.
func (h *twoNodeHarness) fundAB(target *identity.Identity, amount uint64) {
	tx := &blockchain.Transaction{
		ID:        h.nextTxID(h.validator.PubKeyHex(), blockchain.EventTransfer),
		Type:      blockchain.EventTransfer,
		From:      h.validator.PubKeyHex(),
		To:        target.PubKeyHex(),
		Amount:    amount,
		Fee:       blockchain.MinFee,
		Timestamp: time.Now().UTC(),
	}
	h.gossipBlock(tx)
}
// publishOnA uploads body to A's feed mailbox (only A gets the body) and
// gossips the CREATE_POST tx to both chains (both see the metadata).
func (h *twoNodeHarness) publishOnA(author *identity.Identity, content string) feedPublishResponse {
	h.t.Helper()
	idHash := sha256.Sum256([]byte(fmt.Sprintf("%s-%d-%s",
		author.PubKeyHex(), time.Now().UnixNano(), content)))
	postID := hex.EncodeToString(idHash[:16])
	clientHasher := sha256.New()
	clientHasher.Write([]byte(content))
	clientHash := hex.EncodeToString(clientHasher.Sum(nil))
	ts := time.Now().Unix()
	sig := author.Sign([]byte(fmt.Sprintf("publish:%s:%s:%d", postID, clientHash, ts)))
	req := feedPublishRequest{
		PostID:  postID,
		Author:  author.PubKeyHex(),
		Content: content,
		Sig:     base64.StdEncoding.EncodeToString(sig),
		Ts:      ts,
	}
	body, _ := json.Marshal(req)
	resp, err := http.Post(h.aServer.URL+"/feed/publish", "application/json", strings.NewReader(string(body)))
	if err != nil {
		h.t.Fatalf("publish on A: %v", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		raw, _ := io.ReadAll(resp.Body)
		h.t.Fatalf("publish on A → %d: %s", resp.StatusCode, string(raw))
	}
	var out feedPublishResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		h.t.Fatalf("decode publish: %v", err)
	}
	// Now the ON-CHAIN CREATE_POST tx — gossiped to both nodes.
	contentHash, _ := hex.DecodeString(out.ContentHash)
	payload, _ := json.Marshal(blockchain.CreatePostPayload{
		PostID:       out.PostID,
		ContentHash:  contentHash,
		Size:         out.Size,
		HostingRelay: out.HostingRelay,
	})
	tx := &blockchain.Transaction{
		ID:        h.nextTxID(author.PubKeyHex(), blockchain.EventCreatePost),
		Type:      blockchain.EventCreatePost,
		From:      author.PubKeyHex(),
		Fee:       out.EstimatedFeeUT,
		Payload:   payload,
		Timestamp: time.Now().UTC(),
	}
	h.gossipBlock(tx)
	return out
}
// likeOnB submits a LIKE_POST tx originating on Node B (simulates a
// follower using their own node). Both chains receive the block.
func (h *twoNodeHarness) likeOnB(liker *identity.Identity, postID string) {
	payload, _ := json.Marshal(blockchain.LikePostPayload{PostID: postID})
	tx := &blockchain.Transaction{
		ID:        h.nextTxID(liker.PubKeyHex(), blockchain.EventLikePost),
		Type:      blockchain.EventLikePost,
		From:      liker.PubKeyHex(),
		Fee:       blockchain.MinFee,
		Payload:   payload,
		Timestamp: time.Now().UTC(),
	}
	h.gossipBlock(tx)
}
// getBodyFromA fetches /feed/post/{id} from Node A's HTTP server.
func (h *twoNodeHarness) getBodyFromA(postID string) (int, []byte) {
	h.t.Helper()
	resp, err := http.Get(h.aServer.URL + "/feed/post/" + postID)
	if err != nil {
		h.t.Fatalf("GET A: %v", err)
	}
	defer resp.Body.Close()
	raw, _ := io.ReadAll(resp.Body)
	return resp.StatusCode, raw
}
// getBodyFromB same for Node B.
func (h *twoNodeHarness) getBodyFromB(postID string) (int, []byte) {
	h.t.Helper()
	resp, err := http.Get(h.bServer.URL + "/feed/post/" + postID)
	if err != nil {
		h.t.Fatalf("GET B: %v", err)
	}
	defer resp.Body.Close()
	raw, _ := io.ReadAll(resp.Body)
	return resp.StatusCode, raw
}
// ── Tests ─────────────────────────────────────────────────────────────────
// TestTwoNodePostPropagation: Alice publishes on Node A. After block
// propagation, both chains have the record. Node B can read the
// on-chain metadata directly, and can fetch the body from Node A (the
// hosting relay) — which is what the client does in production.
func TestTwoNodePostPropagation(t *testing.T) {
	h := newTwoNodeHarness(t)
	alice, _ := identity.Generate()
	h.fundAB(alice, 10*blockchain.Token)
	pub := h.publishOnA(alice, "hello from node A")
	// Node A chain has the record.
	recA, err := h.aChain.Post(pub.PostID)
	if err != nil || recA == nil {
		t.Fatalf("A chain.Post: %v (rec=%v)", err, recA)
	}
	// Node B chain also has the record — propagation successful.
	recB, err := h.bChain.Post(pub.PostID)
	if err != nil || recB == nil {
		t.Fatalf("B chain.Post: %v (rec=%v)", err, recB)
	}
	if recA.PostID != recB.PostID || recA.Author != recB.Author {
		t.Errorf("chains disagree: A=%+v B=%+v", recA, recB)
	}
	if recB.HostingRelay != h.aHostPub {
		t.Errorf("B sees hosting_relay=%s, want A's pub=%s", recB.HostingRelay, h.aHostPub)
	}
	// Node A HTTP serves the body.
	statusA, _ := h.getBodyFromA(pub.PostID)
	if statusA != http.StatusOK {
		t.Errorf("A GET: status %d, want 200", statusA)
	}
	// Node B HTTP does NOT have the body — body only lives on the hosting
	// relay. This is by design: the reader client on Node B would read
	// chain.Post(id).HostingRelay, look up its URL via /api/relays, and
	// fetch directly from Node A. Tested by the next assertion.
	statusB, _ := h.getBodyFromB(pub.PostID)
	if statusB != http.StatusNotFound {
		t.Errorf("B GET: status %d, want 404 (body lives only on hosting relay)", statusB)
	}
	// Simulate the client routing step: use chain record from B to find
	// hosting relay, then fetch from A.
	hosting := recB.HostingRelay
	if hosting != h.aHostPub {
		t.Fatalf("hosting not A: %s", hosting)
	}
	// In production: look up hosting's URL via /api/relays. Here we
	// already know it = h.aServer.URL. Just verify the fetch works.
	statusCross, bodyCross := h.getBodyFromA(pub.PostID)
	if statusCross != http.StatusOK {
		t.Fatalf("cross-node fetch: status %d", statusCross)
	}
	var fetched struct {
		Content string `json:"content"`
		Author  string `json:"author"`
	}
	if err := json.Unmarshal(bodyCross, &fetched); err != nil {
		t.Fatalf("decode cross-node body: %v", err)
	}
	if fetched.Content != "hello from node A" {
		t.Errorf("cross-node content: got %q", fetched.Content)
	}
}
// TestTwoNodeLikeCounterSharedAcrossNodes: a like submitted with tx
// origin on Node B bumps the on-chain counter — which Node A's HTTP
// /stats then reflects. Demonstrates that engagement aggregates are
// consistent across the mesh because they live on the chain, not in
// any single relay's memory.
func TestTwoNodeLikeCounterSharedAcrossNodes(t *testing.T) {
	h := newTwoNodeHarness(t)
	alice, _ := identity.Generate()
	bob, _ := identity.Generate()
	h.fundAB(alice, 10*blockchain.Token)
	h.fundAB(bob, 10*blockchain.Token)
	pub := h.publishOnA(alice, "content for engagement test")
	h.likeOnB(bob, pub.PostID)
	// A's HTTP stats (backed by its chain.LikeCount) should see the like.
	resp, err := http.Get(h.aServer.URL + "/feed/post/" + pub.PostID + "/stats")
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()
	var stats postStatsResponse
	if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
		t.Fatal(err)
	}
	if stats.Likes != 1 {
		t.Errorf("A /stats: got %d likes, want 1", stats.Likes)
	}
	// Same for B.
	resp, err = http.Get(h.bServer.URL + "/feed/post/" + pub.PostID + "/stats")
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()
	if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
		t.Fatal(err)
	}
	if stats.Likes != 1 {
		t.Errorf("B /stats: got %d likes, want 1", stats.Likes)
	}
}
// TestTwoNodeFollowGraphReplicates: FOLLOW tx on any node propagates to
// both chains; B's /feed/timeline returns A-hosted posts correctly.
func TestTwoNodeFollowGraphReplicates(t *testing.T) {
	h := newTwoNodeHarness(t)
	alice, _ := identity.Generate() // will follow bob
	bob, _ := identity.Generate()   // author
	h.fundAB(alice, 10*blockchain.Token)
	h.fundAB(bob, 10*blockchain.Token)
	// Alice follows Bob (tx gossiped to both nodes).
	followTx := &blockchain.Transaction{
		ID:        h.nextTxID(alice.PubKeyHex(), blockchain.EventFollow),
		Type:      blockchain.EventFollow,
		From:      alice.PubKeyHex(),
		To:        bob.PubKeyHex(),
		Fee:       blockchain.MinFee,
		Payload:   []byte(`{}`),
		Timestamp: time.Now().UTC(),
	}
	h.gossipBlock(followTx)
	// Bob publishes on A. Alice queries timeline on B.
	bobPost := h.publishOnA(bob, "bob speaks")
	// Alice's timeline on Node B should include Bob's post, because the
	// metadata lives on chain. B cannot serve the body, though:
	// /feed/timeline on B merges chain records (available) with local
	// mailbox bodies (missing), so the content field of the response
	// comes back empty. We verify the metadata here and assert the empty
	// body; the client is expected to resolve bodies separately via the
	// hosting_relay URL.
	resp, err := http.Get(h.bServer.URL + "/feed/timeline?follower=" + alice.PubKeyHex())
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()
	var tl struct {
		Count int              `json:"count"`
		Posts []feedAuthorItem `json:"posts"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&tl); err != nil {
		t.Fatal(err)
	}
	if tl.Count != 1 {
		t.Fatalf("B timeline count: got %d, want 1", tl.Count)
	}
	if tl.Posts[0].PostID != bobPost.PostID {
		t.Errorf("B timeline[0]: got %s, want %s", tl.Posts[0].PostID, bobPost.PostID)
	}
	// Metadata must be correct even if body is empty on B.
	if tl.Posts[0].Author != bob.PubKeyHex() {
		t.Errorf("B timeline[0].author: got %s, want %s", tl.Posts[0].Author, bob.PubKeyHex())
	}
	if tl.Posts[0].HostingRelay != h.aHostPub {
		t.Errorf("B timeline[0].hosting_relay: got %s, want A (%s)", tl.Posts[0].HostingRelay, h.aHostPub)
	}
	// Body is intentionally empty on B (A hosts it). Verify.
	if tl.Posts[0].Content != "" {
		t.Errorf("B timeline[0].content: got %q, want empty (body lives on A)", tl.Posts[0].Content)
	}
}