ContextSync at Enterprise Scale

From Hackathon to Production

The reference implementation uses SQLite and a local filesystem — the right choices for proving a protocol works in eleven hours. But ContextSync is designed as a protocol, not a product. The same API surface, the same URI scheme, the same permission tuples, and the same provenance records map cleanly onto production cloud infrastructure.

This page shows how.

Architecture Comparison

Protocol Layer	v0.1 (Reference)	Google Cloud Platform	Amazon Web Services
Artifact storage	Local filesystem, SHA-256 content-addressed blobs	Cloud Storage (Standard), objects keyed by SHA-256 hash, versioning enabled	S3 (Standard), objects keyed by SHA-256 hash, versioning enabled
Version graph	SQLite (WAL mode), single file	Cloud Spanner (global strong consistency) or Firestore (document model, regional)	Aurora PostgreSQL (Multi-AZ) or DynamoDB (global tables for multi-region)
Change feed	In-memory SSE hub, single-process fan-out	Pub/Sub for durable message delivery + Eventarc for event routing to Cloud Run consumers	SNS for fan-out + SQS for durable queues + EventBridge for routing rules
Real-time subscriptions	Server-Sent Events from Express.js	Pub/Sub push subscriptions to Cloud Run endpoints, or Firebase Realtime Database for browser clients	API Gateway WebSocket API backed by Lambda, or AppSync subscriptions
Permissions	SQLite table with glob-pattern matching middleware	Firestore security rules for artifact-level access + Cloud IAM for service-level auth, or Open Policy Agent sidecar on Cloud Run	Amazon Verified Permissions (Cedar policy engine) for fine-grained artifact-level decisions + Cognito for identity
Provenance log	SQLite append-only table	BigQuery (append-only, columnar, petabyte scale), streamed via Pub/Sub	Timestream for time-series provenance, or S3 + Athena for cost-optimized archival query
Authentication	X-Actor-Id header (trust-based)	Cloud Identity Platform + JWT validation middleware on Cloud Run	Amazon Cognito user pools (app clients using the client-credentials grant for machine identities) + JWT validation on ALB or API Gateway
Compute	Single Node.js process on a Linode VPS	Cloud Run (auto-scaling, zero-to-N, pay-per-request) + Cloud CDN for dashboard	ECS Fargate (auto-scaling, no cluster management) + CloudFront for dashboard
Search	SQL LIKE substring match	Vertex AI Search (semantic + keyword, managed)	OpenSearch Serverless (full-text + vector, managed)
Monitoring	Console logs	Cloud Logging + Cloud Trace + Error Reporting	CloudWatch Logs + X-Ray + CloudWatch Alarms
CI/CD	Manual restart	Cloud Build + Artifact Registry + Cloud Run deploy	CodePipeline + ECR + ECS rolling deploy

Google Cloud Architecture

                            Judges / Agents / Humans
                                     |
                              Cloud CDN + LB
                                     |
                            +--------v---------+
                            |  Cloud Run       |
                            |  ContextSync API |
                            |  (auto-scaling)  |
                            +--------+---------+
                                     |
          +-----------+--------------+--------------+-------------+
          |           |              |              |             |
    +-----v----+ +---v------+ +----v-----+ +------v-----+ +----v--------+
    | Cloud    | | Cloud    | | Pub/Sub  | | Cloud IAM  | | BigQuery    |
    | Storage  | | Spanner  | |          | | + OPA      | | (provenance)|
    | (blobs)  | | (version | | (change  | | (policy    | |             |
    |          | |  graph)  | |  events) | |  checks)   | |             |
    +----------+ +----------+ +----+-----+ +------------+ +-------------+
                                   |
                            +------v-------+
                            | Eventarc     |
                            | (route to    |
                            |  subscribers)|
                            +--------------+

Why Cloud Run: ContextSync is a stateless HTTP + SSE server. Cloud Run scales to zero when idle, scales out to hundreds of instances under load, and requires no cluster management. The SSE hub moves from in-memory to Pub/Sub: each instance holds its own subscription, so every instance receives every change event and relays it to the SSE clients connected to it.
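
The hub migration can be pictured as swapping the implementation behind a small bus interface. The sketch below is illustrative only: `ChangeEvent`, `ChangeBus`, `InMemoryBus`, and `SseHub` are hypothetical names, the event fields are borrowed from the change metadata listed later on this page, and the Pub/Sub-backed adapter itself is not shown.

```typescript
// Hypothetical change-event shape; the real schema is defined by the protocol.
interface ChangeEvent {
  uri: string;
  version: number;
  author: string;
  summary: string;
}

type Listener = (event: ChangeEvent) => void;

// Minimal bus contract. The v0.1 in-memory hub and a Pub/Sub-backed
// adapter (not shown) could both sit behind it.
interface ChangeBus {
  publish(event: ChangeEvent): void;
  subscribe(listener: Listener): () => void; // returns an unsubscribe function
}

// In-memory stand-in for the message bus.
class InMemoryBus implements ChangeBus {
  private listeners: Listener[] = [];

  publish(event: ChangeEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }

  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
}

// Per-instance hub: one bus subscription, many SSE connections.
class SseHub {
  private clients = new Map<string, (frame: string) => void>();

  constructor(bus: ChangeBus) {
    bus.subscribe((event) => this.broadcast(event));
  }

  addClient(id: string, write: (frame: string) => void): void {
    this.clients.set(id, write);
  }

  removeClient(id: string): void {
    this.clients.delete(id);
  }

  private broadcast(event: ChangeEvent): void {
    // An SSE frame is a "data:" line followed by a blank line.
    const frame = `data: ${JSON.stringify(event)}\n\n`;
    this.clients.forEach((write) => write(frame));
  }
}

// Two hubs on one bus model two server instances behind a load balancer.
const bus = new InMemoryBus();
const hubA = new SseHub(bus);
const hubB = new SseHub(bus);
const received: string[] = [];
hubA.addClient("client-1", (frame) => received.push(`A:${frame}`));
hubB.addClient("client-2", (frame) => received.push(`B:${frame}`));
bus.publish({ uri: "ctx://acme/design/spec-1", version: 2, author: "alice", summary: "edit" });
```

Both clients receive the event even though they are connected to different hubs, which is the property the bus has to preserve once instances no longer share memory.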

Why Cloud Spanner: the version graph needs strong consistency (version numbers must be monotonically increasing, no gaps). Spanner provides this globally. For smaller deployments, Firestore is simpler and cheaper.
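
The "monotonic, no gaps" requirement comes down to making "read the head, write head + 1" a single atomic step. A minimal sketch of that check follows, using an in-memory map as a stand-in for the database transaction. `VersionStore`, `appendVersion`, `expectedParent`, and the record fields are hypothetical names, not taken from the reference implementation.

```typescript
interface VersionRecord {
  uri: string;
  version: number;       // monotonic per artifact, starting at 1
  parent: number | null; // version this write was based on
  blobHash: string;      // SHA-256 key of the content-addressed blob
}

type AppendResult =
  | { ok: true; record: VersionRecord }
  | { ok: false; head: number };

class VersionStore {
  private history = new Map<string, VersionRecord[]>();

  // The append succeeds only if the caller based its write on the current head.
  // In Spanner or PostgreSQL, this check-and-insert would run inside one transaction.
  appendVersion(uri: string, expectedParent: number, blobHash: string): AppendResult {
    const versions = this.history.get(uri) ?? [];
    const head = versions.length; // versions are numbered 1..n, so length is the head
    if (expectedParent !== head) {
      return { ok: false, head }; // another writer got there first; caller must re-read
    }
    const record: VersionRecord = {
      uri,
      version: head + 1,
      parent: head === 0 ? null : head,
      blobHash,
    };
    versions.push(record);
    this.history.set(uri, versions);
    return { ok: true, record };
  }
}

const store = new VersionStore();
const first = store.appendVersion("ctx://acme/design/spec-1", 0, "hash-a");
const second = store.appendVersion("ctx://acme/design/spec-1", 1, "hash-b");
const stale = store.appendVersion("ctx://acme/design/spec-1", 1, "hash-c"); // stale parent
```

The third call is rejected because its parent is no longer the head, so no version number is skipped or issued twice.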

Why BigQuery for provenance: provenance is append-only, rarely queried in real time, and grows without bound. BigQuery handles petabyte-scale append workloads at low cost with full SQL query capability for compliance audits.

Amazon Web Services Architecture

                            Judges / Agents / Humans
                                     |
                              CloudFront + ALB
                                     |
                            +--------v---------+
                            |  ECS Fargate     |
                            |  ContextSync API |
                            |  (auto-scaling)  |
                            +--------+---------+
                                     |
          +-----------+--------------+--------------+-------------+
          |           |              |              |             |
    +-----v----+ +---v------+ +----v-----+ +------v-----+ +----v--------+
    | S3       | | Aurora   | | SNS/SQS  | | Verified   | | Timestream  |
    | (blobs)  | | Postgres | |          | | Permissions| | or S3 +     |
    |          | | (version | | (change  | | (Cedar)    | | Athena      |
    |          | |  graph)  | |  events) | |            | | (provenance)|
    +----------+ +----------+ +----+-----+ +------------+ +-------------+
                                   |
                            +------v-------+
                            | EventBridge  |
                            | (routing     |
                            |  rules)      |
                            +--------------+

Why ECS Fargate: same rationale as Cloud Run — stateless compute, auto-scaling, no infrastructure to manage. Fargate tasks map 1:1 to ContextSync server instances.

Why Aurora PostgreSQL: the version graph is a relational workload (foreign keys, monotonic sequences, joins for history queries). Aurora gives Multi-AZ durability with PostgreSQL compatibility. For global deployments, DynamoDB Global Tables trade strong consistency for multi-region availability.

Why Amazon Verified Permissions: Cedar is a purpose-built policy language for fine-grained authorization. ContextSync's permission tuples (actor, pattern, operations) map directly onto Cedar policies, with evaluation offloaded to a managed service instead of in-process middleware.
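
To make the tuple model concrete, here is a minimal in-process sketch of default-deny evaluation over (actor, pattern, operations) tuples. The operation names, the actor ids, and the glob dialect ("*" within one path segment, "**" across segments) are assumptions for illustration; the page does not specify them.

```typescript
type Operation = "read" | "write" | "subscribe"; // hypothetical operation names

interface PermissionTuple {
  actor: string;
  pattern: string; // glob over ctx:// URIs
  operations: Operation[];
}

// Assumed glob dialect: "*" matches within one path segment, "**" across segments.
function globToRegExp(pattern: string): RegExp {
  // Escape every regex metacharacter except "*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const source = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so "**" survives the next step
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${source}$`);
}

// Default-deny: access is granted only if some tuple matches actor, operation, and URI.
function isAllowed(
  tuples: PermissionTuple[],
  actor: string,
  uri: string,
  op: Operation,
): boolean {
  return tuples.some(
    (t) =>
      t.actor === actor &&
      t.operations.indexOf(op) !== -1 &&
      globToRegExp(t.pattern).test(uri),
  );
}

const tuples: PermissionTuple[] = [
  { actor: "agent:summarizer", pattern: "ctx://acme/design/*", operations: ["read"] },
  { actor: "user:alice", pattern: "ctx://acme/**", operations: ["read", "write"] },
];

const agentReadsDesign = isAllowed(tuples, "agent:summarizer", "ctx://acme/design/spec-1", "read");
const agentWritesDesign = isAllowed(tuples, "agent:summarizer", "ctx://acme/design/spec-1", "write");
const agentReadsFinance = isAllowed(tuples, "agent:summarizer", "ctx://acme/finance/q3", "read");
const aliceWritesFinance = isAllowed(tuples, "user:alice", "ctx://acme/finance/q3", "write");
const unknownActor = isAllowed(tuples, "user:bob", "ctx://acme/design/spec-1", "read");
```

With a managed policy engine, the tuples would be translated into policies and the body of `isAllowed` would become a call to that engine, while the calling code stays the same.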

Federation: Multi-Region, Multi-Cloud

The hardest enterprise question is not "which cloud?" but "how do two ContextSync servers stay in sync across regions or providers?"

The protocol's change-event schema is the federation primitive. Each server publishes change events to its own message bus (Pub/Sub or SNS). A federation bridge subscribes to the remote server's change feed and replays events locally:

  Region: EU (GCP)                          Region: US (AWS)
  +------------------+                      +------------------+
  | ContextSync      |   change events      | ContextSync      |
  | Server (EU)      | <------------------> | Server (US)      |
  +--------+---------+   (Pub/Sub <-> SNS   +--------+---------+
           |               bridge)                    |
     +-----+-----+                             +-----+-----+
     | Spanner   |                             | Aurora    |
     | (EU data) |                             | (US data) |
     +-----------+                             +-----------+
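
The replay loop in the diagram can be sketched in a few lines. This is a toy model under stated assumptions: `Server`, `bridge`, and the `origin` field are hypothetical, and the origin check only prevents echo loops between two servers. Larger topologies would need de-duplication by event id, which is not shown.

```typescript
interface FederatedEvent {
  uri: string;
  version: number;
  author: string;
  summary: string;
  origin: string; // assumed field: id of the server where the write first happened
}

type Handler = (event: FederatedEvent) => void;

class Server {
  readonly id: string;
  readonly applied: FederatedEvent[] = [];
  private handlers: Handler[] = [];

  constructor(id: string) {
    this.id = id;
  }

  onChange(handler: Handler): void {
    this.handlers.push(handler);
  }

  // A local write originates here.
  write(uri: string, version: number, author: string, summary: string): void {
    this.apply({ uri, version, author, summary, origin: this.id });
  }

  // Applying an event (local or replayed) also publishes it on this server's feed.
  apply(event: FederatedEvent): void {
    this.applied.push(event);
    this.handlers.forEach((handler) => handler(event));
  }
}

// One-directional bridge: replay the remote server's events on the local server.
function bridge(remote: Server, local: Server): void {
  remote.onChange((event) => {
    if (event.origin === local.id) return; // the event started here: do not echo it back
    local.apply(event);
  });
}

const eu = new Server("eu");
const us = new Server("us");
bridge(eu, us); // EU feed replayed in US
bridge(us, eu); // US feed replayed in EU
eu.write("ctx://acme/design/spec-1", 1, "alice", "initial draft");
us.write("ctx://acme/design/spec-2", 1, "bob", "initial draft");
```

After the two writes, each server has applied both events exactly once.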

Conflict resolution: in v0.1, ContextSync detects but does not auto-resolve conflicts. In a federated deployment, the same artifact could be written simultaneously in two regions. The federation bridge detects the version fork (two versions with the same parent) and flags it for human review. Automatic merge strategies (last-writer-wins, region-priority, or domain-specific rules) are a v0.3 feature, but the detection mechanism works today: because version numbers are monotonic per artifact, two concurrent writes against the same parent both claim the same next version number, and the collision surfaces as soon as the remote event is replayed.
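
Fork detection as described above amounts to grouping versions by (URI, parent) and flagging any group with more than one member. A minimal sketch, assuming hypothetical `VersionNode` and `Fork` shapes and an illustrative `region` field:

```typescript
interface VersionNode {
  uri: string;
  version: number;
  parent: number | null;
  region: string; // assumed field: where the version was written
}

interface Fork {
  uri: string;
  parent: number | null;
  candidates: VersionNode[];
}

// Group versions by (uri, parent); a group with more than one member is a fork.
function detectForks(nodes: VersionNode[]): Fork[] {
  const groups = new Map<string, Fork>();
  nodes.forEach((node) => {
    const key = `${node.uri}#${node.parent}`;
    const group = groups.get(key) ?? { uri: node.uri, parent: node.parent, candidates: [] };
    group.candidates.push(node);
    groups.set(key, group);
  });
  const forks: Fork[] = [];
  groups.forEach((group) => {
    if (group.candidates.length > 1) forks.push(group);
  });
  return forks;
}

// Both regions wrote version 2 on top of version 1.
const forks = detectForks([
  { uri: "ctx://acme/design/spec-1", version: 1, parent: null, region: "eu" },
  { uri: "ctx://acme/design/spec-1", version: 2, parent: 1, region: "eu" },
  { uri: "ctx://acme/design/spec-1", version: 2, parent: 1, region: "us" },
  { uri: "ctx://acme/design/spec-2", version: 1, parent: null, region: "us" },
]);
```

The result contains one fork for `spec-1` at parent 1, with the EU and US versions as candidates for human review.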

Data residency: federation does not require full replication. An EU server can subscribe to only the change metadata (URI, version, author, summary) from the US server without replicating the artifact payload. This satisfies data residency constraints: the EU server knows what changed in the US and can request the payload on demand if permissions allow.
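
A metadata-only subscription can be sketched as a projection that drops the payload before an event leaves its home region. `FullChangeEvent`, `toMetadata`, and `fetchPayload` are hypothetical names, and the `canRead` callback stands in for whatever permission check the home region runs.

```typescript
interface FullChangeEvent {
  uri: string;
  version: number;
  author: string;
  summary: string;
  payload: string; // artifact content; stays in its home region
}

type ChangeMetadata = Omit<FullChangeEvent, "payload">;

// Copy only the metadata fields, so the payload cannot leak through the feed.
function toMetadata(event: FullChangeEvent): ChangeMetadata {
  const { uri, version, author, summary } = event;
  return { uri, version, author, summary };
}

// On-demand fetch: the home region releases the payload only if its own check passes.
function fetchPayload(
  event: FullChangeEvent,
  requester: string,
  canRead: (actor: string, uri: string) => boolean,
): string | null {
  return canRead(requester, event.uri) ? event.payload : null;
}

const usEvent: FullChangeEvent = {
  uri: "ctx://acme/design/spec-1",
  version: 3,
  author: "bob",
  summary: "tightened requirements",
  payload: "full artifact text",
};

const metadata = toMetadata(usEvent);
const denied = fetchPayload(usEvent, "agent:eu-indexer", () => false);
const granted = fetchPayload(usEvent, "user:alice", () => true);
```

The explicit field copy is deliberate: a spread followed by `delete` would also work, but listing the fields means a new sensitive field added later is excluded by default.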

What Changes, What Stays the Same

Aspect	Stays the same	Changes
API surface	Every endpoint, every request/response schema	Transport (HTTP/2, gRPC option)
URI scheme	`ctx://{org}/{domain}/{id}`	Nothing
Permission model	Tuples, default-deny, glob patterns	Evaluation engine (Cedar, OPA, custom)
Provenance schema	Same fields, same append-only semantics	Storage backend (BigQuery, Timestream)
Change event format	Same JSON schema	Delivery mechanism (Pub/Sub, SNS, Kafka)
Actor model	Human vs agent, actor_id, agent_class	Auth mechanism (JWT, mTLS, API keys)
Diff format	Line-level diff, JSON patch	Nothing
Version numbering	Monotonic integers per artifact	Nothing

The protocol surface is invariant. The infrastructure underneath is a deployment decision. That is the point of building a protocol instead of a product.