barry/ZKAC

everbarry 316e3dc0bd v0.5.0

2026-04-19 23:19:24 +02:00

Security model and audit notes (ZKAC 0.5.0)

This document summarizes the design, residual risks, and recommendations for operators integrating ZKAC. It is not a substitute for independent review before high-assurance deployment.

Goals

Authentication: Only holders of a valid BBS+ credential for a registered role can complete verify_auth for that role.
Server identity: The server proves its long-term identity to the client via a Schnorr signature over the session transcript; clients verify against a pinned public key. This prevents MITM attacks without requiring TLS.
Confidentiality & integrity: All traffic (management and authenticated sessions) is authenticated-encrypted (ChaCha20-Poly1305) with keys derived from an ephemeral X25519 handshake.
Replay resistance: Duplicate ciphertexts in a direction are rejected (sliding window + monotonic counter).
Unlinkability (credential layer): BBS+ presentations are unlinkable across sessions when the presentation header (the session transcript hash) differs; the verifier learns only the disclosed attributes (opaque role_id, epoch) and validity. Client anonymity is preserved: the client never reveals its long-term key during the handshake.
Server cannot forge credentials: The server stores only the issuer public key per role; forging requires the issuer secret key.
Opaque server: The server stores only cryptographically verified state blobs and opaque grant ciphertexts. No user identities, role names, or credential material are stored or visible to the server.
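
The replay-resistance goal above (sliding window plus monotonic counter) can be illustrated with a minimal sketch. The class name `ReplayWindow`, the window size, and the bitmap layout are assumptions for illustration, not the actual `Session` internals; a real implementation would consult the window only after the AEAD tag has verified.

```python
class ReplayWindow:
    """Per-direction anti-replay filter: highest counter seen plus a bitmap of recent ones."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = -1   # highest counter accepted so far
        self.bitmap = 0     # bit i set => counter (highest - i) was already accepted

    def check_and_update(self, counter: int) -> bool:
        """Return True if the counter is fresh (and record it), False if it must be rejected."""
        if counter > self.highest:
            # New high-water mark: slide the window forward and mark the new counter.
            shift = counter - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = counter
            return True
        offset = self.highest - counter
        if offset >= self.size:
            return False            # too old: fell out of the window
        if (self.bitmap >> offset) & 1:
            return False            # duplicate inside the window
        self.bitmap |= 1 << offset  # late but fresh: accept and remember
        return True
```

A window lets slightly reordered frames through while still rejecting every duplicate; a pure monotonic counter would reject any out-of-order frame.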

Cryptographic components

Layer	Primitive	Purpose
Transport	X25519 ephemeral DH, HKDF-SHA256, ChaCha20-Poly1305	Session keys, AEAD
Identity	Schnorr on Ristretto255, BLAKE2b-512 challenge	Server identity binding
Credentials	BBS+ on BLS12-381 (zkryptium), SHAKE256 ciphersuite	Blind issuance, ZK presentations
Role IDs	BLAKE2b-512 (truncated to 32 bytes)	Opaque role identifiers
Grant delivery	X25519 static/ephemeral DH, HKDF-SHA256, ChaCha20-Poly1305	E2E-encrypted credential grants
Grant discovery	X25519 DH + BLAKE2b-512 truncated to 16 bytes	Detection tags for anonymous matching
PIR	LWE (n=1024, q=2^32, p=256, σ=6.4)	Single-server private record retrieval
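
As an illustration of the HKDF-SHA256 step in the Transport and Grant delivery rows, the following is a minimal sketch of RFC 5869 extract-and-expand using only the Python standard library. The `info` label, the use of the transcript hash as salt, and the split into two directional keys are assumptions made for the example; the table does not specify how ZKAC parameterizes HKDF.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by a block of zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(shared_secret: bytes, transcript_hash: bytes):
    """Hypothetical split of one DH secret into client-to-server and server-to-client keys."""
    okm = hkdf_sha256(shared_secret, transcript_hash, b"example-session-keys", 64)
    return okm[:32], okm[32:]
```

Separate keys per direction keep a ciphertext from being reflected back to its sender as a valid frame.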

Protocol flow

Unified channel (all connections)

Client                                Server
  |--- init_msg (eph_pk) ------------>|
  |                                   | accept()
  |                                   | prove_identity() → sign(transcript)
  |<-- response_msg + identity_pkt ---|
  | complete DH                       |
  | decrypt + verify server sig       |
  |===== encrypted session ==========>|
  |--- {op: "mgmt"} or {op: "auth"}->|

Management commands (create_registry, post_grant, etc.) and BBS+ role authentication both run inside the same encrypted, server-authenticated channel. There is no unencrypted management path.

Grant delivery (admin → recipient, through server)

Grants live in a single anonymous append-only pool (no recipient identifier on the server). Each grant entry carries an ephemeral public key, the E2E-encrypted credential payload, and a 16-byte detection tag.

Discovery (cheap, no PIR): The server exposes a pool_tags command returning all (eph_pk, tag) pairs. The client computes X25519(my_issuance_sk, eph_pk_j) for each entry and derives the expected tag via BLAKE2b-512("zkac-grant-tag" || shared_secret)[..16]. Matching entries are the client's grants. This scan is a single round-trip transferring ~48 bytes per pool entry and is computed locally.
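
The discovery scan described above can be sketched as follows. This is an illustration only: the pure-Python X25519 ladder follows RFC 7748 but is not constant-time and should not be used in production, and the function names `detection_tag` and `find_matches` are hypothetical. The tag formula is the one stated above, with BLAKE2b-512 taken as `hashlib.blake2b` at its default 64-byte digest size.

```python
import hashlib

P = 2**255 - 19
A24 = 121665
BASE_POINT = (9).to_bytes(32, "little")


def x25519(k: bytes, u: bytes) -> bytes:
    """RFC 7748 X25519 (Montgomery ladder). Illustrative only: NOT constant-time."""
    kb = bytearray(k)
    kb[0] &= 248
    kb[31] &= 127
    kb[31] |= 64
    k_int = int.from_bytes(kb, "little")
    ub = bytearray(u)
    ub[31] &= 127
    x1 = int.from_bytes(ub, "little")
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k_int >> t) & 1
        if swap ^ bit:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, z2 = x3, z3
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


def detection_tag(shared_secret: bytes) -> bytes:
    """BLAKE2b-512("zkac-grant-tag" || shared_secret), truncated to 16 bytes."""
    return hashlib.blake2b(b"zkac-grant-tag" + shared_secret).digest()[:16]


def find_matches(my_issuance_sk: bytes, pool_tags) -> list:
    """Return indices j whose tag matches the one derived from (my_issuance_sk, eph_pk_j)."""
    return [
        j for j, (eph_pk, tag) in enumerate(pool_tags)
        if detection_tag(x25519(my_issuance_sk, eph_pk)) == tag
    ]
```

The scan costs one X25519 operation and one hash per pool entry, and the resulting indices never leave the client until they are used as PIR inputs.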

Retrieval (PIR + split payload): For each matching pool index, the client runs LWE-based single-server SimplePIR (pir_query). The PIR database row is only a small handle (JSON: version, grant_id, SHA-256 of the ciphertext) padded to PIR_RECORD_BYTES; the bulk ciphertext is fetched in a second management round-trip (get_grant_blob). The client checks that the blob’s hash matches the handle before decrypting. The server learns which grant_id was requested on the second hop (unlike the PIR index, which stays private). Hints use H = D · A^T (seeded public matrix A); the client caches hints keyed by pool_version.
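
The handle check in the retrieval step can be sketched as below. The JSON field names (`version`, `grant_id`, `sha256`), the zero-byte padding, and the value `PIR_RECORD_BYTES = 256` are assumptions chosen for illustration; only the overall shape (a small padded handle carrying a SHA-256 of the ciphertext, verified before decryption) comes from the description above.

```python
import hashlib
import hmac
import json

PIR_RECORD_BYTES = 256  # assumed value for the example


def make_handle(grant_id: str, ciphertext: bytes) -> bytes:
    """Build a fixed-size PIR row that commits to the grant ciphertext."""
    handle = json.dumps(
        {"version": 1, "grant_id": grant_id,
         "sha256": hashlib.sha256(ciphertext).hexdigest()},
        separators=(",", ":"),
    ).encode()
    if len(handle) > PIR_RECORD_BYTES:
        raise ValueError("handle does not fit in one PIR record")
    return handle.ljust(PIR_RECORD_BYTES, b"\x00")  # hypothetical zero padding


def parse_handle(row: bytes) -> dict:
    """Strip the padding and decode the handle recovered from PIR."""
    return json.loads(row.rstrip(b"\x00").decode())


def blob_matches_handle(handle: dict, blob: bytes) -> bool:
    """Check the blob from get_grant_blob against the hash in the handle before decrypting."""
    expected = bytes.fromhex(handle["sha256"])
    return hmac.compare_digest(expected, hashlib.sha256(blob).digest())
```

Because the hash arrives through the private PIR row, a server that substitutes a different blob on the second hop is detected before any decryption is attempted.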

Admin                  Server (opaque relay)        Recipient
  |-- post_grant ------->|                            |
  |   (admin_proof,      | appends to pool:           |
  |    eph_pk,           |  {grant_id, eph_pk,        |
  |    ciphertext,       |   ciphertext, to_tag}      |
  |    to_tag)           |  (no recipient address)    |
  |                      |                            |
  |                      |<-- pool_tags --------------|
  |                      |--- [(eph_pk, tag), …] ---->|
  |                      |                            | local tag match
  |                      |<-- pir_query(j) -----------|
  |                      |--- answer ---------------->|
  |                      |                            | PIR decode → handle
  |                      |<-- get_grant_blob ---------|
  |                      |--- blob fields ----------->|
  |                      |                            | verify hash → decrypt
  |                      |<-- claim_grant ------------|
  |                      |  (tombstone / claimed)     |

PIR security (LWE)

Private information retrieval uses the SimplePIR construction (Henzinger–Hong–Corrigan-Gibbs–Meiklejohn–Vaikuntanathan, USENIX Security '23). Security rests on the decisional Learning With Errors (LWE) assumption:

Parameters: LWE dimension n=1024, ciphertext modulus q=2^32, plaintext modulus p=256, discrete Gaussian noise σ=6.4.
Classical security: ~128 bits (based on lattice estimator analysis at these parameters).
Post-quantum: LWE is believed hard for quantum computers; no known quantum algorithm breaks it in polynomial time.
Single-server: No non-collusion assumption. Privacy holds against an honest-but-curious server that inspects all queries and answers.

The PIR scheme is honest-but-curious only: a malicious server can return incorrect answers. This is acceptable because grant payloads are E2E-encrypted (ChaCha20-Poly1305) and credential finalization validates BBS+ blind signatures — a corrupted PIR answer causes decryption or BBS+ verification to fail, not credential forgery.
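
To make the SimplePIR data flow concrete, here is a toy sketch using the q and p stated above but a deliberately tiny LWE dimension. It is insecure by construction: `N_LWE = 16` is far below the real n=1024, Python's `random` module is not a cryptographic RNG, and the noise sampler is a rounded Gaussian chosen for simplicity. All function names are hypothetical; the database layout (one record per column, hint H = D · A^T) follows the description in this document.

```python
import random

Q = 2**32             # ciphertext modulus
P_MOD = 256           # plaintext modulus: one byte per database entry
DELTA = Q // P_MOD    # scaling factor that lifts a plaintext byte above the noise
N_LWE = 16            # TOY dimension; the real parameter is 1024
SIGMA = 6.4


def build_db(records):
    """Matrix D with one record per column: D[r][j] = byte r of record j."""
    return [[rec[r] for rec in records] for r in range(len(records[0]))]


def gen_matrix(seed: int, n_cols: int):
    """Seeded public matrix A (N_LWE x n)."""
    rng = random.Random(seed)
    return [[rng.randrange(Q) for _ in range(n_cols)] for _ in range(N_LWE)]


def build_hint(db, a):
    """H = D * A^T mod q (record_bytes x N_LWE)."""
    return [[sum(d * x for d, x in zip(row, a_row)) % Q for a_row in a] for row in db]


def make_query(a, index: int, rng):
    """query = A^T s + e + DELTA * unit(index); the secret s stays with the client."""
    s = [rng.randrange(Q) for _ in range(N_LWE)]
    query = []
    for col in range(len(a[0])):
        val = sum(a[k][col] * s[k] for k in range(N_LWE)) + round(rng.gauss(0, SIGMA))
        if col == index:
            val += DELTA
        query.append(val % Q)
    return s, query


def answer(db, query):
    """Server side: one matrix-vector product, linear in the pool length."""
    return [sum(d * x for d, x in zip(row, query)) % Q for row in db]


def decode(ans, hint, s) -> bytes:
    """Subtract H*s, then round to the nearest multiple of DELTA to strip the noise."""
    out = []
    for a_val, h_row in zip(ans, hint):
        noisy = (a_val - sum(h * x for h, x in zip(h_row, s))) % Q
        out.append(((noisy + DELTA // 2) // DELTA) % P_MOD)
    return bytes(out)
```

Decoding succeeds as long as the accumulated noise term D · e stays below DELTA / 2, which is why the plaintext modulus, noise width, and pool length have to be chosen together.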

Detection tags

Each grant carries a 16-byte detection tag: BLAKE2b-512("zkac-grant-tag" || X25519(eph_sk, recipient_pk))[..16].

Privacy properties:

The tag is a deterministic function of the shared secret, which requires knowledge of either the ephemeral secret key or the recipient's issuance secret key to compute. An observer (including the server) who knows neither key cannot link a tag to a recipient.
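The pool_tags list contains only what the server already sees at grant insertion time, so serving it gives the server no new information. Other clients learn the pool's (eph_pk, tag) pairs, but without the corresponding secret keys they cannot link those pairs to recipients.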
A client downloading pool_tags reveals that it is checking for pending grants, but not which entries matched. Matching is a local computation.
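Tags are 16 bytes (128 bits): an entry not addressed to the client matches the client's expected tag with probability about 2^-128, so false positives are negligible. Collisions between two arbitrary tags in the pool follow the birthday bound (about 2^64 entries), far beyond any realistic pool size.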

Scaling and complexity (transport, credentials, registries)

This section complements the grant pool / PIR analysis above. Asymptotics use: R = number of roles in one registry state, G = number of registries hosted in memory, L = byte length of an application payload (JSON management command or auth packet body after decryption).

Transport and session crypto

Operation	Time	Bandwidth / memory
Handshake (`connect` / `accept`)	O(1)	Fixed 32-byte handshake messages; one X25519 DH, one HKDF derivation, one ChaCha20-Poly1305 open.
Server identity proof	O(1)	Schnorr verify on Ristretto255 over a short transcript-derived message.
`Session::encrypt` / `decrypt` per frame	O(L)	ChaCha20-Poly1305 is linear in payload size; replay window checks are O(1) per direction.

Bottlenecks: negligible compared to BBS+ unless payloads are pushed toward frame limits. Python framing caps TCP payloads at MAX_BBS_AUTH_PROOF_BYTES + 4 KiB (~260 KiB), bounding worst-case allocations per read.
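
The frame cap can be illustrated with a length-prefixed reader that rejects oversized frames before allocating. The 4-byte big-endian length prefix and the helper names are assumptions for this sketch; only the cap value (`MAX_BBS_AUTH_PROOF_BYTES` + 4 KiB) comes from the text above.

```python
import io
import struct

MAX_BBS_AUTH_PROOF_BYTES = 256 * 1024
MAX_FRAME_BYTES = MAX_BBS_AUTH_PROOF_BYTES + 4 * 1024  # about 260 KiB


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes or raise EOFError on a short stream."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-frame")
        buf += chunk
    return buf


def read_frame(stream) -> bytes:
    """Read one frame; the declared length is checked before any payload is read."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))  # assumed prefix format
    if length > MAX_FRAME_BYTES:
        raise ValueError(f"frame of {length} bytes exceeds cap of {MAX_FRAME_BYTES}")
    return read_exact(stream, length)


def write_frame(payload: bytes) -> bytes:
    """Encode one frame with the same assumed 4-byte length prefix."""
    if len(payload) > MAX_FRAME_BYTES:
        raise ValueError("payload exceeds frame cap")
    return struct.pack(">I", len(payload)) + payload
```

Checking the declared length first means a peer cannot force a large allocation by sending only a 4-byte header.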

BBS+ credentials (issuance and verification)

Operation	Time	Notes
Blind `issue_blind` / `finalize` (issuer / member)	O(1) in R and G	Dominated by BLS12-381 and BBS+ proof math in zkryptium (pairings, multi-scalar muls); not sensitive to registry count or pool size.
`present` (proof generation)	O(1)	Produces a presentation bound to a nonce (e.g. transcript hash).
`verify_presentation`	O(1)	One proof check against one issuer public key.
Proof size on the wire	≤ 256 KiB	`MAX_BBS_AUTH_PROOF_BYTES`; caps attacker-controlled allocation for auth packets.

Bottlenecks: BBS+ verify and present dominate CPU on authenticated paths (role auth, admin proofs for post_grant, registry state certification). Cost is per event, not per grant in pool, but high QPS auth still needs horizontal scaling or hardware tuned for pairing-heavy crypto.

Registry state (client-managed blob on server)

Operation	Time	Size
`RegistryState::serialize` / `deserialize`	O(R)	Linear in number of role entries (each: fixed `role_id`, variable-length issuer pk bytes, epoch).
`state_hash`	O(|state_bytes|)
`certify` / `verify_cert`	Same as BBS+ present / verify	One presentation over `state_hash`.
`RegistryManager::update`	O(R) for cache rebuild	Deserializes old + new state, verifies cert and version chain, rebuilds `RoleRegistry` cache by iterating all roles (`build_role_cache`).

Bandwidth: get_registry / create_registry / update_registry move the full serialized state and state certificate each time — O(R) bytes per round-trip. Very large role lists mean large management frames and more CPU on every update.

Bottlenecks: Large R (many roles in one registry) inflates state blob size, hash work, and cache rebuild. Frequent updates multiply BBS+ certify/verify cost.

RegistryManager (multi-registry server)

Operation	Time	Notes
`create` / `get` / `update` / `verify_*`	O(1) expected in G	Hash map on `registry_id`; work is on one stored registry at a time.
In-memory footprint	O(G × (|state| + …))

Bottlenecks: G grows with every distinct registry the server accepts — mostly a RAM and operational concern. Per-request CPU is still dominated by BBS+ and (for managed flows) issuance queue handling.

Issuance request queues (`RegistryManager`)

Structure	Growth	Risk
`pending_requests` / `granted` maps	Unbounded per registry unless the application drains them	A client could queue many `queue_issuance_request` entries; server memory grows with pending items. Not the same as the grant pool file, but a similar resource exhaustion class.

Bottlenecks: Queue depth per registry; mitigations are rate limits, caps, or TTL policies at the application layer (not enforced in core today).
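
One possible application-layer mitigation is a per-registry queue with a hard cap and a TTL. The sketch below is hypothetical: `BoundedRequestQueue` and its parameters are not part of the core library, which (as noted above) enforces no such limits today.

```python
import time
from collections import OrderedDict


class BoundedRequestQueue:
    """Pending-request map with a size cap and a fixed time-to-live per entry."""

    def __init__(self, max_pending: int = 1024, ttl_seconds: float = 3600.0,
                 clock=time.monotonic):
        self.max_pending = max_pending
        self.ttl = ttl_seconds
        self._clock = clock
        self._items = OrderedDict()  # request_id -> (expires_at, payload)

    def _evict_expired(self) -> None:
        # The TTL is constant, so insertion order equals expiry order.
        now = self._clock()
        while self._items:
            _, (expires_at, _) = next(iter(self._items.items()))
            if expires_at > now:
                break
            self._items.popitem(last=False)

    def push(self, request_id: str, payload: bytes) -> bool:
        """Queue a request; returns False if it is a duplicate or the queue is full."""
        self._evict_expired()
        if request_id in self._items or len(self._items) >= self.max_pending:
            return False
        self._items[request_id] = (self._clock() + self.ttl, payload)
        return True

    def pop(self, request_id: str):
        """Remove and return a pending payload, or None if absent or expired."""
        self._evict_expired()
        entry = self._items.pop(request_id, None)
        return None if entry is None else entry[1]

    def __len__(self) -> int:
        return len(self._items)
```

The cap bounds memory per registry and the TTL keeps abandoned requests from occupying slots indefinitely; a rate limit per connection would complement both.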

Issuance encryption (X25519 + ChaCha)

Operation	Time
`encrypt` / `decrypt` (grant payloads, admin replies)	O(L) for payload length L

Negligible vs BBS+ for typical small JSON blobs.

Summary: dominant costs outside the grant pool

BBS+ present/verify on every auth, admin proof, and registry certificate path — pairing-heavy, fixed per operation, proof capped at 256 KiB.
Registry state size and update — O(R) serialization, hashing, and full cache rebuild.
Issuance queues — unbounded pending entries per registry if abused.
Transport — O(L) per frame; handshake O(1).

The grant pool remains the subsystem whose per-operation cost scales with pool length n (discovery, PIR query, PIR answer compute); the rest of the protocol scales mainly with roles per registry, registry count, and proof operations per session, not with anonymous pool size.

Threats considered

Network attacker (passive)

Observes ciphertexts; cannot break ChaCha20-Poly1305 or derive session keys without breaking X25519 / HKDF under standard assumptions.
Management traffic is indistinguishable from auth traffic at the wire level (same handshake, same framing).

Network attacker (active / MITM)

Server impersonation: The server signs the session transcript hash with its long-term Ristretto255 key (prove_identity). The client verifies this signature against the pinned server public key. A MITM running a separate DH exchange produces a different transcript; it cannot forge the server's signature. The client aborts on mismatch.
Client impersonation: The BBS+ presentation is bound to the session transcript hash. A MITM cannot relay a presentation from one session to another (different transcripts) or forge one (requires a valid credential from the issuer).
Relay attack: A MITM that relays the real server's identity proof to a client fails because the proof is encrypted under the MITM-to-server session keys (not the client-to-MITM keys), and the signature is over the wrong transcript.
Management channel: All management commands (registry creation, grants) are protected by the same encrypted channel, eliminating the previous plaintext management path.

Malicious server

Can learn opaque role_id, current epoch, and that some valid member authenticated.
Sees registry_id values (needed for routing) but not role names or registry contents beyond opaque state bytes.
Sees eph_pk, to_tag, and ciphertext per grant in the anonymous pool, and pool size / timing of syncs, but cannot decrypt grant payloads or link tags to recipients.
Sees PIR queries, but these are LWE-encrypted: under the decisional LWE assumption the server cannot determine which pool index the client is retrieving (single-server, no collusion needed). The follow-up get_grant_blob request does reveal the requested grant_id.
Cannot forge BBS+ credentials without the issuer secret key.
Cannot learn member_secret from presentations under the BBS+ security assumptions.
Cannot distinguish which specific member authenticated among valid credential holders (unlinkability holds against the verifier for distinct presentation headers).
Cannot learn the client's long-term public key — it is never transmitted during handshake or auth.
Cannot perform admin operations (registry updates, grant posting) without a valid admin BBS+ credential.
Cannot correlate a recipient's mailbox identity with their authenticated sessions (different keys, unlinkable proofs).
Can censor grants by omitting tags from pool_tags or returning corrupted PIR answers. Corrupted answers are caught by E2E decryption / BBS+ verification failures. Censorship is a residual operational risk; cross-checking pool hashes across replicas mitigates it.

Malicious client

Cannot decrypt others' traffic without session keys.
Cannot produce valid auth for a role without a valid credential + correct epoch + registry entry.

Denial of service

Auth packet size: Proof length is capped (MAX_BBS_AUTH_PROOF_BYTES, 256 KiB) to bound allocations.
Handshake: Fixed 32-byte messages; no variable-length handshake parsing.
Grant pool growth: The anonymous pool is append-only with tombstoned rows (claimed), so pool length n never shrinks on disk. A malicious or careless admin can grow n without bound: larger pool_tags downloads, longer PIR hint recomputation when the pool version bumps, and per-query PIR cost linear in n (see Known limitations). This is a storage and workload amplification vector, not credential forgery. Mitigation belongs in future work (pool caps, compaction, generations).
General packet limits should still be enforced at the application layer (total message size, rate limits).

Key distribution

The server's long-term PublicKey (32-byte Ristretto255 point) functions as a self-authenticating identity — no certificate authority is required. The client must obtain and pin this key before connecting.

Recommended strategies:

Static configuration (default): embed the server public key in client config or CLI pin command (zkac-node server pin <userid> <host:port> --key <hex>). Equivalent to WireGuard's [Peer] PublicKey = ....
Trust On First Use (TOFU): accept the server's key on first connection, pin it for subsequent sessions. Risk: first connection is vulnerable.
Out-of-band verification: compare public key fingerprints over a trusted side channel (phone, in-person, encrypted messaging).
Key registry / directory: a trusted service maps names to public keys. Shifts trust to the registry and its authentication channel.
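
The difference between static pinning and TOFU can be sketched with a small in-memory pin store. `PinStore` and its methods are hypothetical names for illustration and are unrelated to the `zkac-node server pin` command; persistence is omitted.

```python
import hmac


class PinStore:
    """Maps a server address to its pinned 32-byte public key."""

    def __init__(self):
        self._pins = {}  # "host:port" -> public key bytes

    def pin(self, address: str, server_pk: bytes) -> None:
        """Static configuration: record a key obtained out-of-band."""
        if len(server_pk) != 32:
            raise ValueError("expected a 32-byte public key")
        self._pins[address] = server_pk

    def check(self, address: str, presented_pk: bytes, allow_tofu: bool = False) -> bool:
        """Return True only if the presented key matches the pin for this address."""
        pinned = self._pins.get(address)
        if pinned is None:
            if not allow_tofu:
                return False                 # no pin and TOFU disabled: refuse
            self.pin(address, presented_pk)  # TOFU: trust the first key seen
            return True
        return hmac.compare_digest(pinned, presented_pk)
```

With `allow_tofu=False` an unknown server is always refused, which matches the static-configuration default; enabling TOFU trades that guarantee for convenience on the first connection only.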

Operational requirements

Issuer secret key: Protect BbsIssuer secret material (HSM, KMS, or encrypted at rest). Compromise = ability to issue arbitrary credentials for that role.
Server long-term key: Protect the server's server_key.json. Compromise = ability to impersonate the server. Rotate the key and distribute the new public key to clients if compromised.
Member storage: member_secret and finalized Credential material must be protected; loss = re-enrollment required.
Epoch revocation: On compromise or policy change, call set_epoch and re-issue credentials only to legitimate members; old credentials become invalid at verification time.
Registry integrity: Registry state is integrity-protected by BBS+ state certificates (admin must sign updates). The server verifies these certificates before accepting changes.
Role ID privacy: role_id("myrole") is a plain hash of the role name, so anyone who can guess the name can recompute the ID. Treat role names as secrets if enumeration is a concern, or derive role IDs with an additional secret salt known to members.
Recipient addressing: Admins encrypt grants to the recipient's issuance public key off-server; that key is not used as a server-side mailbox index. Recipients are identified to the issuer out-of-band only.
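
The role ID privacy point above can be illustrated with a short sketch. The exact input encoding of the real `role_id` function is not specified here, so hashing the UTF-8 role name directly is an assumption; `salted_role_id` and `enumerate_role` are hypothetical helpers, and the salted variant uses BLAKE2b's keyed mode as one possible way to mix in a secret.

```python
import hashlib


def role_id(role_name: str) -> bytes:
    """BLAKE2b-512 of the role name, truncated to 32 bytes (input encoding assumed)."""
    return hashlib.blake2b(role_name.encode("utf-8")).digest()[:32]


def salted_role_id(role_name: str, salt: bytes) -> bytes:
    """Hypothetical salted variant: keyed BLAKE2b with a secret shared among members."""
    if not 16 <= len(salt) <= 64:
        raise ValueError("salt must be 16 to 64 bytes")
    return hashlib.blake2b(role_name.encode("utf-8"), key=salt).digest()[:32]


def enumerate_role(target: bytes, wordlist):
    """What an observer can do with unsalted IDs: hash guesses until one matches."""
    for guess in wordlist:
        if role_id(guess) == target:
            return guess
    return None
```

An observer with a list of likely names recovers an unsalted role in one pass, while the salted ID stays opaque to anyone who does not hold the salt.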

Implementation notes (audit checklist)

BBS+ proof verification uses the same header and presentation binding as proof generation (verify_presentation in Rust).
Session transcript is included in the presentation via present(transcript_hash).
Server identity proof: Schnorr signature over transcript_hash, verified against pinned public key before any traffic.
Schnorr nonce is deterministic (H("zkac-schnorr-nonce" || sk || msg)); there is no dependence on RNG quality at signing time.
Replay protection is symmetric per direction in Session.
Constant-time comparisons are used where critical in transport/replay paths (subtle crate).
Client long-term key is never transmitted, preserving BBS+ unlinkability.
Management and auth channels use the same encrypted handshake (no plaintext management path).
Admin proofs in post_grant are bound to the session transcript hash (no separate nonce); the CLI uses one TCP session per grant so each proof uses a fresh transcript.
After collect, the client persists the server public key from server_info (never a placeholder key).
Server stores only opaque state bytes, state certs, and encrypted grant blobs (no role names, no user IDs).
PIR queries are LWE-encrypted; the server cannot determine the queried index.
Detection tags are derived from X25519 shared secrets and cannot be linked to recipients by the server.
External: Python bindings surface raw bytes; callers must not log secrets (secret_key_bytes, member_secret, prover_blind).
External: Use secure randomness from the OS (library uses OS RNG for key generation paths exposed in Rust).

Design decisions

Unified encrypted channel: All traffic (management and auth) uses the same anonymous handshake. This eliminates the attack surface of an unencrypted management path and simplifies the protocol to a single mode.
Anonymous handshake (complete_connect_anon): The client verifies the server's identity but does not authenticate itself during the handshake. BBS+ auth is sent as an application-layer message inside the encrypted session, not as part of the handshake. This allows the same channel for both anonymous management and authenticated role access.
Server-only identity proof: Only the server signs the transcript. Adding client long-term signing would break BBS+ unlinkability (the server could correlate sessions by client public key). Client authentication is handled entirely by the anonymous BBS+ credential.
Deterministic Schnorr nonces: The signing nonce is derived as H("zkac-schnorr-nonce" || sk || msg), eliminating a class of RNG-failure attacks (cf. PS3 ECDSA, Sony 2010). Same key + same message = same signature.
Anonymous grant pool: Grant entries contain (eph_pk, ciphertext, to_tag) plus stable row metadata — no registry ID or role name. Recipients discover their grants via detection tags and retrieve them via LWE PIR. Pool rows use tombstones (claimed) so indices stay stable for PIR hints.
No user IDs on server: The server has no concept of user accounts and keeps no per-user state. It is an opaque relay that accepts writes only when they carry valid cryptographic proofs.
Single-server PIR (LWE): Eliminates the two-server non-collusion assumption of the previous XOR PIR design. Query privacy rests on decisional LWE, not operational trust in multiple server operators.
Detection tags for discovery: A 16-byte tag derived from X25519 DH allows O(n) local matching from a cheap bulk download, reducing PIR usage from O(n) queries to O(matches) queries per scan.
One session per admin grant (CLI): Each post_grant runs in its own connection, so the session transcript hash that verify_admin uses as its nonce is never reused across grants.
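
The deterministic nonce derivation can be sketched as follows. The domain separator is the one given above; using BLAKE2b-512 as H and reducing the 64-byte digest (read as little-endian) modulo the Ristretto255 group order are assumptions for illustration, and `schnorr_nonce` is a hypothetical name.

```python
import hashlib

# Order of the Ristretto255 prime-order group (same as the Ed25519 subgroup order).
GROUP_ORDER = 2**252 + 27742317777372353535851937790883648493


def schnorr_nonce(sk: bytes, msg: bytes) -> int:
    """Derive the signing nonce from the secret key and message; no RNG involved."""
    digest = hashlib.blake2b(b"zkac-schnorr-nonce" + sk + msg).digest()
    # A 512-bit digest reduced mod a ~252-bit order is statistically close to uniform.
    return int.from_bytes(digest, "little") % GROUP_ORDER
```

Because the nonce depends on the message, two different transcripts never share a nonce under the same key, which is the failure that leaks a Schnorr or ECDSA secret key when nonces repeat.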

Known limitations

Epoch granularity: Revocation is coarse (epoch bump); plan issuance and rotation policy accordingly.
zkryptium dependency: Security follows the underlying crate and BLS12-381/BBS+ standards; keep dependencies updated.
Key distribution: The library provides the cryptographic mechanism; initial key distribution is an application-layer responsibility.
Honest-but-curious PIR: The server can return incorrect PIR answers. Corrupted answers are caught by E2E decryption / BBS+ verification, but censorship (omitting grants) is not detected at the PIR layer. Cross-replica hash comparison or a transparency log can mitigate this.
Hint size: PIR hints are approximately 56 + record_bytes × N_LWE × 4 bytes (on the order of 1 MiB with record_bytes = 256 and N_LWE = 1024). Hints are cached client-side and only refetched when the pool version changes.
Unbounded grant pool: Rows are never removed from the pool file; only marked claimed. Pool length n therefore grows monotonically with every posted grant. That increases discovery traffic (pool_tags is O(n)), PIR query size (O(n) bytes per query), server work per PIR answer (O(n × record_bytes)), and hint rebuild cost when the pool changes (O(n × record_bytes × N_LWE)). Operators should plan for bounded pools or archival; the codebase does not yet enforce limits.
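
The size formulas in this section can be turned into a small planning calculator. The function names are hypothetical; the constants (56-byte hint header, 4 bytes per element, about 48 bytes per pool_tags entry, about 4n bytes per query) are taken from this document.

```python
N_LWE = 1024
HINT_HEADER_BYTES = 56
BYTES_PER_ELEMENT = 4      # one value mod q = 2^32
POOL_TAG_ENTRY_BYTES = 48  # 32-byte eph_pk + 16-byte tag


def hint_bytes(record_bytes: int, n_lwe: int = N_LWE) -> int:
    """Approximate hint size: 56 + record_bytes * N_LWE * 4."""
    return HINT_HEADER_BYTES + record_bytes * n_lwe * BYTES_PER_ELEMENT


def pool_costs(n: int, record_bytes: int = 256) -> dict:
    """Per-operation costs as a function of pool length n."""
    return {
        "pool_tags_bytes": POOL_TAG_ENTRY_BYTES * n,           # discovery download
        "pir_query_bytes": BYTES_PER_ELEMENT * n,              # client upload per query
        "server_mults_per_answer": n * record_bytes,           # matrix-vector product
        "hint_rebuild_mults": n * record_bytes * N_LWE,        # when the pool version bumps
        "hint_bytes": hint_bytes(record_bytes),                # independent of n
    }
```

For example, `pool_costs(100_000)` gives a 4.8 MB pool_tags download and a 400 KB query upload, while the hint stays at roughly 1 MiB; the hint rebuild is the term that grows fastest on the server.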

Future work

Bounded grant pool and anti-DoS: Introduce explicit pool caps, rate limits on post_grant, per-registry quotas, or pool generations (rotate to a fresh empty pool while archiving the old one). Optionally compact the on-disk pool by rewriting only unclaimed rows and bumping a generation id so PIR indices stay meaningful without retaining every tombstone forever. Any design must preserve stable addressing for in-flight collects or migrate clients with explicit pool ids.
Scale beyond large n: Today’s bottleneck is linear cost in pool length n for each PIR retrieval: client upload ~4n bytes per query, server matrix–vector multiply O(n × record_bytes), and discovery O(n). For very large pools, future work includes sublinear-communication PIR (e.g. DoublePIR-style layering), sharded pools with client-side routing, streaming or chunked hints, or moving heavy work off the hot path (precomputed answers, CDN for hints) — trading complexity, trust, or privacy for throughput.
DoublePIR / layered PIR: The Rust tree still carries a Figure‑14 DoublePIR reference implementation (fig14) for tests and research. Production mailbox PIR is SimplePIR on handle-only rows plus get_grant_blob for ciphertext.
Verifiable PIR: Adding a commitment to the pool state (e.g. Merkle tree or KZG) and proof of correct answer computation would defend against malicious server responses beyond what E2E encryption catches.
Pool commitment / transparency: Publishing a hash of (pool_version, hints, tags) to a public log or allowing cross-replica comparison would detect censorship by a malicious server.
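
A cross-replica comparison could look like the sketch below. The encoding is entirely hypothetical (domain label, field order, SHA-256), and for brevity it commits only to the pool version and the tag list, not to the hints mentioned above.

```python
import hashlib


def pool_commitment(pool_version: int, tags) -> bytes:
    """Hash (pool_version, [(eph_pk, tag), ...]) into one comparable digest."""
    h = hashlib.sha256()
    h.update(b"example-pool-commitment")      # hypothetical domain separator
    h.update(pool_version.to_bytes(8, "big"))
    h.update(len(tags).to_bytes(8, "big"))
    for eph_pk, tag in tags:
        if len(eph_pk) != 32 or len(tag) != 16:
            raise ValueError("unexpected entry size")
        h.update(eph_pk)                      # fixed-size fields need no length prefix
        h.update(tag)
    return h.digest()


def replicas_agree(commitments) -> bool:
    """True when every replica (or log entry) reports the same commitment."""
    return len(commitments) > 0 and len(set(commitments)) == 1
```

A client that computes the commitment over its own pool_tags download and compares it with a published value would notice a server that omits entries for that client only.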

Reporting issues

Report security-sensitive findings through your project's private disclosure channel (configure SECURITY.md contact or GitHub security advisories when the repository is public).
