Initial commit: AS4/411 directory and discovery service for Sankofa Marketplace

Author: defiQUG
Co-authored-by: Cursor <cursoragent@cursor.com>
Date: 2026-02-08 08:44:20 -08:00
Commit: c24ae925cf
109 changed files with 7222 additions and 0 deletions

# ADR-000: Scope and Non-Goals
## Status
Accepted.
## Context
as4-411 must have a locked scope so that "interact" is not interpreted as brokering, orchestration, or config generation. The system boundary and trust model depend on this.
## Decision
### In Scope
- as4-411 is a **directory + discovery + routing directive generator**.
- It stores participants, identifiers, endpoints, capabilities, credentials references, and policies.
- It resolves identifiers to **routing directives** (target protocol, address, profile, security refs, QoS). Gateways **execute** these directives; as4-411 does **not** transmit messages on their behalf.
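The directive fields listed above can be sketched as a type. This is an illustrative shape only, not the canonical definition in `packages/core`:

```typescript
// Hypothetical sketch of a routing directive; field names are illustrative.
// Gateways execute these directives; as4-411 never transmits messages itself.
interface RouteDirective {
  protocol: string;        // target protocol (rail), e.g. "as4"
  address: string;         // endpoint address the gateway should contact
  profile?: string;        // messaging profile to apply
  securityRefs?: string[]; // references to credentials/policies, never raw secrets
  qos?: { priority?: number; timeoutMs?: number }; // quality-of-service hints
}

const directive: RouteDirective = {
  protocol: "as4",
  address: "https://gw.example.com/msh",
  securityRefs: ["cred:abc"],
};
```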
### Out of Scope (Unless Explicitly Added Later)
- **Brokering / orchestration:** Sending or relaying messages between parties is out of scope. If added in the future, it must be a **separate component** (e.g. `as4-411-broker`) with a separate trust boundary so the directory's integrity and confidentiality are not contaminated.
- **Config generation for multiple gateway stacks:** Generating full gateway configuration (e.g. PMode files, STP config) may be added as a separate tool or module; it is not part of the core directory/resolver.
### Integration Default
- Gateways may consume as4-411 as an **embedded library** (core + resolver + storage) or as a **sidecar/shared service** (REST or gRPC). The default pattern is documented in the README and deployment docs; both are supported.
## Consequences
- All feature work stays within directory, discovery, and directive generation.
- Brokering or message transmission, if ever required, is a distinct service with its own security and compliance story.

# ADR-001: Adapter Interface and Semantic Versioning
## Status
Accepted.
## Context
Multi-rail support requires a strict plugin boundary so that each rail has a single adapter, version negotiation is clear, and compatibility is guaranteed. The protocol registry must define the minimum interface surface and versioning rules.
## Decision
### ProtocolAdapter Interface
Every rail adapter implements the following (see `packages/core` adapter-interface.ts):
- **validateIdentifier(type, value): boolean** — Validate format for the rail.
- **normalizeIdentifier(type, value): string | null** — Return normalized value for lookup/storage, or null if invalid.
- **resolveCandidates(ctx, request, options): Promise<AdapterCandidate[]>** — Use the supplied context (directory view) to return candidate participant+endpoint pairs.
- **evaluateCapabilities(candidate, serviceContext): boolean** — Whether the candidate matches the requested service/action/process.
- **renderRouteDirective(candidate, options): RouteDirective** — Build the canonical directive from a candidate.
- **ingestSource?(config): Promise<IngestResult>** — Optional; for connectors that pull from external directories (SMP, file, etc.).
The resolver (or a registry) supplies an **AdapterContext** to adapters; the context exposes `findParticipantsByIdentifiers`, `getEndpointsByParticipantId`, and `getCapabilitiesByParticipantId`. The storage layer implements this context.
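The interface surface above can be rendered as TypeScript. The method names follow the list above; the candidate and result types are simplified stand-ins for the definitions in `packages/core`:

```typescript
// Simplified stand-in types; the real definitions live in packages/core.
type AdapterCandidate = { participantId: string; endpointId: string };
type RouteDirective = { protocol: string; address: string };
type IngestResult = { recordsIngested: number };

// Directory view supplied by the resolver/registry; implemented by storage.
interface AdapterContext {
  findParticipantsByIdentifiers(ids: { type: string; value: string }[]): Promise<unknown[]>;
  getEndpointsByParticipantId(participantId: string): Promise<unknown[]>;
  getCapabilitiesByParticipantId(participantId: string): Promise<unknown[]>;
}

interface ProtocolAdapter {
  version: string; // the adapter implementation's own semver
  validateIdentifier(type: string, value: string): boolean;
  normalizeIdentifier(type: string, value: string): string | null;
  resolveCandidates(ctx: AdapterContext, request: unknown, options?: unknown): Promise<AdapterCandidate[]>;
  evaluateCapabilities(candidate: AdapterCandidate, serviceContext: unknown): boolean;
  renderRouteDirective(candidate: AdapterCandidate, options?: unknown): RouteDirective;
  ingestSource?(config: unknown): Promise<IngestResult>; // optional connector hook
}
```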
### Plugin Boundaries
- One adapter per rail (or per protocol family). Adapters are discovered by config or package layout (e.g. registered by protocol name or identifier type prefix).
- No adapter depends on another adapter; shared logic lives in core or a shared utility package.
### Semantic Versioning
- The **adapter interface** (ProtocolAdapter) follows semantic versioning. Backward-compatible changes only: new optional methods, new optional fields on types. Breaking changes require a new major version of the interface.
- Each **adapter implementation** has its own version (e.g. `version: "1.0.0"`). Registry can enforce minimum interface version when loading adapters.
### Compatibility Guarantees
- New optional methods or optional parameters do not break existing adapters.
- New required methods or required fields are breaking; they belong to a new major version of the interface contract.
## Consequences
- Rails can be added by implementing ProtocolAdapter and registering; the resolver delegates to the appropriate adapter by identifier type or protocol.
- Version mismatches can be detected at load time; operators can pin adapter or interface versions.

# ADR-001: Persistence and Caching Strategy
## Status
Accepted.
## Context
as4-411 needs canonical persistence for directory data (tenants, participants, identifiers, endpoints, capabilities, credentials, policies) and a caching strategy for resolution results to support low-latency gateway lookups and resilience.
## Decision
### Persistence
- **Primary store: PostgreSQL.** Chosen for ACID guarantees, relational model matching the [data model](../architecture/data-model.md), and operational familiarity (replication, backups, tooling).
- **Migrations:** SQL migrations live under `packages/storage/migrations/` (e.g. `001_initial.sql`). Applied out-of-band or via a migration runner; no automatic migrate on startup by default.
- **Alternatives:** In-memory store for development and tests; SQLite for embedded/library deployments where Postgres is not available. Both implement the same `DirectoryStore`/`AdminStore` port.
### Caching
- **Resolution cache:** In-process TTL cache (e.g. `InMemoryResolveCache`) keyed by canonical `ResolveRequest` (identifiers, serviceContext, constraints, tenant). Positive and negative results are cached; negative TTL is shorter (e.g. 60s) to avoid prolonged stale “not found.”
- **Cache key:** Deterministic and stable for same inputs (see [ADR-002](002-resolution-scoring-determinism.md)).
- **Invalidation:** On directory mutation (participant/identifier/endpoint/policy change), invalidate by tenant or by cache key prefix when a proper event or hook is available; until then, rely on TTL.
- **Optional:** Redis or similar for shared cache across multiple resolver instances; same interface `ResolveCache`.
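The positive/negative TTL split can be sketched as a minimal in-process cache in the spirit of `InMemoryResolveCache`. Class name and TTL defaults here are illustrative, not the actual implementation:

```typescript
type CacheEntry<T> = { value: T | null; expiresAt: number };

// Minimal TTL cache sketch: null is a cached negative result ("not found"),
// stored with a shorter TTL so stale misses age out quickly.
class TtlResolveCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  constructor(
    private positiveTtlMs = 300_000, // illustrative: 5 minutes for hits
    private negativeTtlMs = 60_000,  // illustrative: 60s for negatives, per the ADR
  ) {}

  set(key: string, value: T | null): void {
    const ttl = value === null ? this.negativeTtlMs : this.positiveTtlMs;
    this.entries.set(key, { value, expiresAt: Date.now() + ttl });
  }

  // undefined = miss or expired; null = cached negative result.
  get(key: string): T | null | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= Date.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```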
## Consequences
- Gateways can rely on Postgres for durability and use in-memory or Redis cache for latency.
- Embedded use cases can use SQLite or in-memory without Postgres dependency.

# ADR-002: Resolution Scoring and Determinism
## Status
Accepted.
## Context
Resolution must return a stable, ordered list of routing directives for the same inputs and store state.
## Decision
### Determinism
- Same normalized request + same directory state implies same ordered list of RouteDirectives.
- Tie-break when scores are equal: (1) higher explicit priority first, (2) lexical order by endpoint id, then participant id.
### Scoring
- Factors: endpoint priority, endpoint status (active preferred over draining over inactive). No randomness; same inputs imply same scores and order.
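The scoring and tie-break rules above can be sketched as a comparator. Status weights are illustrative; the ordering logic follows the ADR: status, then priority, then lexical endpoint id, then participant id.

```typescript
type Candidate = {
  participantId: string;
  endpointId: string;
  priority: number;
  status: "active" | "draining" | "inactive";
};

// Illustrative weights: active preferred over draining, draining over inactive.
const STATUS_WEIGHT = { active: 2, draining: 1, inactive: 0 } as const;

// Deterministic comparator: no randomness, so the same inputs always sort the same way.
function compareCandidates(a: Candidate, b: Candidate): number {
  const statusDiff = STATUS_WEIGHT[b.status] - STATUS_WEIGHT[a.status];
  if (statusDiff !== 0) return statusDiff;
  if (a.priority !== b.priority) return b.priority - a.priority; // higher priority first
  if (a.endpointId !== b.endpointId) return a.endpointId < b.endpointId ? -1 : 1;
  return a.participantId < b.participantId ? -1 : a.participantId > b.participantId ? 1 : 0;
}
```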
### Cache Key
- Derived from canonical request (sorted identifiers, serialized serviceContext and constraints, tenant).
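A key derivation along these lines might look like the sketch below. The request shape is a simplified stand-in, and for brevity the sketch assumes `serviceContext` and `constraints` are already in canonical form; only the identifier list is sorted here so that input order cannot change the key.

```typescript
type ResolveRequestLike = {
  tenant?: string;
  identifiers: { type: string; value: string }[];
  serviceContext?: unknown;
  constraints?: unknown;
};

// Deterministic cache key: identifiers are sorted, tenant is always included,
// and absent fields serialize as null so the key shape is stable.
function cacheKey(req: ResolveRequestLike): string {
  const ids = [...req.identifiers].map((i) => `${i.type}:${i.value}`).sort();
  return JSON.stringify({
    tenant: req.tenant ?? null,
    ids,
    serviceContext: req.serviceContext ?? null,
    constraints: req.constraints ?? null,
  });
}
```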
## Consequences
- Caching and retries are reproducible and safe.

# ADR-003: Multi-Tenancy and RLS Strategy
## Status
Accepted.
## Context
Tenant scoping is required for isolation. Shared (global) data (e.g. BIC, LEI) and tenant-private data must be clearly separated, and access enforced at the database and application layer.
## Decision
### Model
- **Global objects:** Identifiers or metadata that are public or shared (e.g. BIC, LEI, BIN range metadata). Stored with `tenant_id` null or a dedicated global tenant. Readable by all tenants for resolution when the identifier is public.
- **Tenant-private objects:** All participant-specific data, contractual endpoints, MID/TID, and tenant-specific routing artifacts. Must be scoped by `tenant_id`; only the owning tenant can read/write.
### Enforcement
- **Postgres Row Level Security (RLS):** Enable on tenant-scoped tables. Policy: restrict to rows where `tenant_id` matches the session/connection tenant (set after auth). Allow read of global rows (`tenant_id IS NULL`) where applicable.
- **Application:** Resolver and Admin API set tenant context from JWT or request; all queries filter by tenant. No cross-tenant data in responses.
- **Per-tenant encryption:** For confidential data (Tier 2+), use per-tenant keys so compromise is isolated (see ADR-004).
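A minimal sketch of such an RLS policy follows. Table name, session-setting name, and the `uuid` type of `tenant_id` are illustrative assumptions, not the actual migration:

```sql
-- Illustrative only: restrict tenant-scoped rows to the session's tenant
-- (set after auth, e.g. via SET LOCAL app.tenant_id = '...'), while allowing
-- reads of global rows (tenant_id IS NULL).
ALTER TABLE participants ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON participants
  USING (
    tenant_id = current_setting('app.tenant_id', true)::uuid
    OR tenant_id IS NULL
  );
```

Passing `true` as the second argument to `current_setting` makes it return NULL instead of erroring when the setting is absent, so unauthenticated sessions match only global rows.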
### Caching
- Cache key includes tenant. Per-tenant TTL and invalidation optional.
## Consequences
- Tenants cannot see each other's private data. Global data remains available for public identifier resolution. RLS provides defense in depth alongside application checks.

# ADR-003: Policy Engine Model (ABAC)
## Status
Accepted.
## Context
Resolution must respect tenant scope and allow/deny rules using an attribute-based model.
## Decision
### Model
- Policies are stored per tenant with rule_json (ABAC attributes), effect (allow/deny), and priority.
- Tenant is enforced by restricting resolution to that tenant when request.tenant is set.
### MVP Rule Shape
- Deny: `rule_json.participantId` or `rule_json.participantIds` — exclude those participants.
- Allow (restrictive): if any allow policy exists, only the participants listed in `rule_json.participantId`/`participantIds` are included.
### Ordering
- Deny applied first; then allow restriction. Policies loaded by tenant and ordered by priority.
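The deny-first, allow-as-restriction ordering can be sketched as a filter. The policy shape is a simplified stand-in for `rule_json` rows loaded per tenant:

```typescript
// Simplified stand-in for a stored policy row (rule_json flattened to participantIds).
type Policy = { effect: "allow" | "deny"; priority: number; participantIds: string[] };

// Deny applied first; then, if any allow policy exists, it acts as a
// restriction: only participants listed in some allow policy survive.
function applyPolicies(candidateIds: string[], policies: Policy[]): string[] {
  const ordered = [...policies].sort((a, b) => b.priority - a.priority);
  const denied = new Set(
    ordered.filter((p) => p.effect === "deny").flatMap((p) => p.participantIds),
  );
  let result = candidateIds.filter((id) => !denied.has(id));
  const allows = ordered.filter((p) => p.effect === "allow");
  if (allows.length > 0) {
    const allowed = new Set(allows.flatMap((p) => p.participantIds));
    result = result.filter((id) => allowed.has(id));
  }
  return result;
}
```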
## Consequences
- Simple allow/deny by participant supported; ABAC can be extended via rule_json and filter logic.

# ADR-004: Sensitive Data Classification and Encryption
## Status
Accepted.
## Context
The directory holds mixed sensitivity data: public identifiers (BIC, LEI), internal endpoints and participant data, and confidential or regulated data (MID/TID, contract routing, key references). We need a clear classification and enforcement policy so that storage and access controls are consistent and auditable.
## Decision
- **Four tiers:** Tier 0 (public), Tier 1 (internal), Tier 2 (confidential), Tier 3 (regulated/secrets). See [data-classification.md](../security/data-classification.md) for definitions and examples.
- **Enforcement:** Field-level encryption for Tier 2+ at rest; strict RBAC/ABAC; immutable audit logs for mutations and Tier 2+ access. Tier 3: only references (e.g. vault_ref) stored; no private keys or tokens in the directory.
- **Mapping:** All tables and fields used for directory and routing artifacts are mapped to a tier. New fields require a tier before merge. Per-tenant encryption keys for Tier 2+ are recommended (see ADR-003).
## Consequences
- Operators and developers have a single reference for how to handle each data type. Compliance and security reviews can align on tier and controls.

# ADR-005: Connector Trust and Caching Strategy
## Status
Accepted.
## Context
Connectors ingest data from external or file-based sources (SMP/SML, file, SS7 feeds). Trust anchors, signature validation, caching, and resilience must be defined so that bad or stale data does not compromise resolution.
## Decision
### Per-Connector Requirements
For each connector (SMP/SML, file, SS7, etc.) the following must be defined and documented (see [connectors.md](../architecture/connectors.md)):
- **Trust anchors and signature validation:** Which certificates or keys are trusted for signed payloads; how to validate signatures on ingested bundles. Pinning and trust anchor refresh policy.
- **Caching and refresh:** TTL for cached data, jitter to avoid thundering herd, negative caching (how long to cache "not found" or fetch failure).
- **Resilience:** Timeouts, retries, circuit-breaker thresholds. Behavior on failure: fall back to cached only, fail closed, or fail open (document per connector).
- **Data provenance tagging:** Every ingested record or edge must be tagged with source (e.g. "smp", "file", "gtt_feed"), last_verified (or fetched_at), and optional confidence score. Exposed in resolution evidence and resolution_trace.
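The provenance requirement above can be sketched as a tag attached at ingest time. Field and function names are illustrative:

```typescript
// Illustrative provenance tag carried on every ingested record or edge,
// surfaced later in resolution evidence and resolution_trace.
interface Provenance {
  source: string;       // e.g. "smp", "file", "gtt_feed"
  lastVerified: string; // ISO timestamp of the last successful fetch/verification
  confidence?: number;  // optional score, e.g. 0..1
}

// Hypothetical helper: stamps a record with its source and fetch time at ingest.
function tagRecord<T extends object>(record: T, source: string): T & { provenance: Provenance } {
  return { ...record, provenance: { source, lastVerified: new Date().toISOString() } };
}
```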
### SMP/SML Specifics
- Cache TTL policy: document default TTL for SMP metadata and SML lookups; jitter on refresh.
- Pinning and trust anchors: SML and SMP TLS and optional payload signing; which CAs or pins are accepted.
- Failure behavior: on network or SMP failure, fall back to cached data only; do not serve stale data beyond a documented maximum stale window. No silent fallback to unrelated data.
## Consequences
- Operators can configure trust and cache per connector. Provenance is always available for audit and explainability.