ZKAC/docs/SECURITY.md
everbarry 6e67836e95 v0.4
2026-04-18 01:06:12 +02:00

# Security model and audit notes (ZKAC 0.3)
This document summarizes the design, residual risks, and recommendations for operators integrating **ZKAC**. It is not a substitute for independent review before high-assurance deployment.
## Goals
- **Authentication:** Only holders of a valid BBS+ credential for a registered role can complete `verify_auth` for that role.
- **Server identity:** The server proves its long-term identity to the client via a Schnorr signature over the session transcript; clients verify against a pinned public key. This prevents MITM attacks without requiring TLS.
- **Confidentiality & integrity:** All traffic (management and authenticated sessions) is authenticated-encrypted (ChaCha20-Poly1305) with keys derived from an ephemeral X25519 handshake.
- **Replay resistance:** Duplicate ciphertexts in a direction are rejected (sliding window + monotonic counter).
- **Unlinkability (credential layer):** BBS+ presentations are unlinkable across sessions when the presentation header (the session transcript hash) differs; the verifier learns only the disclosed attributes (opaque `role_id`, epoch) and validity. Client anonymity is preserved: the client never reveals its long-term key during the handshake.
- **Server cannot forge credentials:** The server stores only the issuer **public** key per role; forging requires the issuer secret key.
- **Opaque server:** The server stores only cryptographically verified state blobs and opaque grant ciphertexts. No user identities, role names, or credential material are stored or visible to the server.
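
The replay-resistance goal above (sliding window plus monotonic counter) can be sketched as follows. This is a minimal illustration, not the `Session` implementation: the class name `ReplayWindow`, the 64-entry window, and the bitmap layout are assumptions chosen for the example. In a real transport the window would only be updated after the AEAD tag has been verified.

```python
class ReplayWindow:
    """Per-direction anti-replay state (hypothetical sketch).

    Bit k of `bitmap` is set when counter `highest - k` has been accepted.
    """

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = -1   # no counter accepted yet
        self.bitmap = 0

    def check_and_update(self, counter: int) -> bool:
        """Return True and record `counter` if it is fresh, else False."""
        if counter < 0:
            return False
        if counter > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = counter - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = counter
            return True
        offset = self.highest - counter
        if offset >= self.size:
            return False    # too old: fell out of the window
        mask = 1 << offset
        if self.bitmap & mask:
            return False    # duplicate inside the window
        self.bitmap |= mask
        return True


window = ReplayWindow()
print(window.check_and_update(0))  # first use of counter 0
print(window.check_and_update(0))  # same counter replayed
```

One window per direction keeps the check symmetric: each side tracks the counters it has received independently of those it has sent.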
## Cryptographic components
| Layer | Primitive | Purpose |
|-------|-----------|---------|
| Transport | X25519 ephemeral DH, HKDF-SHA256, ChaCha20-Poly1305 | Session keys, AEAD |
| Identity | Schnorr on Ristretto255, BLAKE2b-512 challenge | Server identity binding |
| Credentials | BBS+ on BLS12-381 (zkryptium), SHAKE256 ciphersuite | Blind issuance, ZK presentations |
| Role IDs | BLAKE2b-512 (truncated to 32 bytes) | Opaque role identifiers |
| Grant delivery | X25519 static/ephemeral DH, HKDF-SHA256, ChaCha20-Poly1305 | E2E-encrypted credential grants |
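
To make the transport row concrete, the sketch below implements HKDF-SHA256 (RFC 5869) with the standard library and derives one key per direction from a shared secret. How ZKAC actually feeds the X25519 output into HKDF is not stated here, so using the transcript hash as salt, the `info` labels, and the function `derive_session_keys` are all assumptions made for illustration.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-256 (RFC 5869, section 2.2)."""
    if not salt:
        salt = bytes(hashlib.sha256().digest_size)
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand with SHA-256 (RFC 5869, section 2.3)."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(shared_secret: bytes, transcript_hash: bytes):
    """Hypothetical key schedule: one 32-byte AEAD key per direction."""
    prk = hkdf_extract(transcript_hash, shared_secret)
    client_to_server = hkdf_expand(prk, b"sketch client-to-server", 32)  # assumed label
    server_to_client = hkdf_expand(prk, b"sketch server-to-client", 32)  # assumed label
    return client_to_server, server_to_client


c2s, s2c = derive_session_keys(b"\x01" * 32, b"\x02" * 64)
print(len(c2s), len(s2c), c2s != s2c)
```

Separate keys per direction mean a ciphertext captured in one direction cannot be reflected back in the other.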
## Protocol flow
### Unified channel (all connections)
```
Client Server
|--- init_msg (eph_pk) ------------>|
| | accept()
| | prove_identity() → sign(transcript)
|<-- response_msg + identity_pkt ---|
| complete DH |
| decrypt + verify server sig |
|===== encrypted session ==========>|
|--- {op: "mgmt"} or {op: "auth"}->|
```
Management commands (`create_registry`, `post_grant`, etc.) and BBS+ role authentication both run inside the same encrypted, server-authenticated channel. There is no unencrypted management path.
### Grant delivery (admin → recipient, through server)
Grants live in a single **anonymous append-only pool** (no `recipient_pk` on the server). Recipients fetch rows only via **two-server XOR PIR** (`mail_pool_len` + `pir_fold` on two replicas with identical pools). Each replica sees only a random-looking subset of indices and returns the XOR of those rows; *which* logical index the recipient recovers stays hidden as long as the replicas do not collude. There is **no** full-pool download API. Scanning every row for pending grants takes **O(n) PIR round-trips** (one Chor-style query per index).
```
Admin Server (opaque relay) Recipient
|-- post_grant ------->| |
| (admin_proof, | appends to pool: |
| eph_pk, | {grant_id, eph_pk, ct} |
| ciphertext) | (no recipient address) |
| |<-- pir_fold (replica A/B) --|
| |--- XOR of subset rows ----->|
| | | combine → one row
| | | trial-decrypt
| |<-- claim_grant ------------|
| | (tombstone / claimed) |
```
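
The two-server XOR retrieval described above can be illustrated in a few lines. This is a sketch of the Chor-style technique only: the function names (`fold_rows`, `make_queries`, `recover`) and the set-of-indices query encoding are assumptions, not the wire format of `pir_fold`, and rows are assumed to be padded to a common length in a non-empty pool.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def fold_rows(pool: list, subset: set) -> bytes:
    """Replica side: XOR together the rows whose indices are in `subset`."""
    acc = bytes(len(pool[0]))
    for index in subset:
        acc = xor_bytes(acc, pool[index])
    return acc


def make_queries(pool_len: int, target: int):
    """Client side: a random subset for replica A, and the same subset
    with `target` toggled for replica B."""
    subset_a = {i for i in range(pool_len) if secrets.randbits(1)}
    subset_b = subset_a ^ {target}
    return subset_a, subset_b


def recover(response_a: bytes, response_b: bytes) -> bytes:
    """All rows common to both subsets cancel, leaving only the target row."""
    return xor_bytes(response_a, response_b)


pool = [bytes([i]) * 16 for i in range(8)]   # stand-in for encrypted grant rows
query_a, query_b = make_queries(len(pool), target=5)
row = recover(fold_rows(pool, query_a), fold_rows(pool, query_b))
print(row == pool[5])
```

Each subset on its own is uniformly distributed, so a single replica learns nothing about the target; two replicas that compare their queries recover it immediately, which is why the non-collusion assumption matters.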
## Threats considered
### Network attacker (passive)
- Observes ciphertexts; cannot break ChaCha20-Poly1305 or derive session keys without breaking X25519 / HKDF under standard assumptions.
- Management traffic is indistinguishable from auth traffic at the wire level (same handshake, same framing).
### Network attacker (active / MITM)
- **Server impersonation:** The server signs the session transcript hash with its long-term Ristretto255 key (`prove_identity`). The client verifies this signature against the **pinned** server public key. A MITM running a separate DH exchange produces a different transcript; it cannot forge the server's signature. The client aborts on mismatch.
- **Client impersonation:** The BBS+ presentation is bound to the session transcript hash. A MITM cannot relay a presentation from one session to another (different transcripts) or forge one (requires a valid credential from the issuer).
- **Relay attack:** A MITM that relays the real server's identity proof to a client fails because the proof is encrypted under the MITM-to-server session keys (not the client-to-MITM keys), and the signature is over the wrong transcript.
- **Management channel:** All management commands (registry creation, grants) are protected by the same encrypted channel, eliminating the previous plaintext management path.
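
The reason a MITM cannot reuse the server's signature follows from the transcript depending on both ephemeral keys. The sketch below uses a hypothetical transcript construction (BLAKE2b-512 over a label and the two 32-byte handshake messages); the real `transcript_hash` layout is not specified in this document, and signature verification itself is not modelled.

```python
import hashlib
import os


def transcript_hash(client_eph_pk: bytes, server_eph_pk: bytes) -> bytes:
    """Hypothetical transcript: hash of both fixed-size handshake messages."""
    if len(client_eph_pk) != 32 or len(server_eph_pk) != 32:
        raise ValueError("handshake messages are fixed at 32 bytes")
    h = hashlib.blake2b(digest_size=64)
    h.update(b"zkac-transcript-sketch")   # assumed domain-separation label
    h.update(client_eph_pk)
    h.update(server_eph_pk)
    return h.digest()


# Random stand-ins for ephemeral public keys.
client_pk, server_pk = os.urandom(32), os.urandom(32)
mitm_to_client_pk, mitm_to_server_pk = os.urandom(32), os.urandom(32)

# What the client computes versus what the real server signed when a MITM
# terminates both handshakes with its own ephemeral keys.
seen_by_client = transcript_hash(client_pk, mitm_to_client_pk)
signed_by_server = transcript_hash(mitm_to_server_pk, server_pk)
print(seen_by_client != signed_by_server)
```

Because the client verifies the signature over its own view of the transcript, a signature over the server-side transcript does not verify, and the client aborts.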
### Malicious server
- Can **learn** opaque `role_id`, current epoch, and that *some* valid member authenticated.
- Sees `registry_id` values (needed for routing) but not role names or registry contents beyond opaque state bytes.
- Sees `eph_pk` and ciphertext per grant in the anonymous pool, and pool size / timing of syncs, but cannot decrypt grant payloads. It does **not** see a per-recipient mailbox key for addressing.
- **Cannot** forge BBS+ credentials without the issuer secret key.
- **Cannot** learn `member_secret` from presentations under the BBS+ security assumptions.
- **Cannot** distinguish which specific member authenticated among valid credential holders (unlinkability holds against the verifier for distinct presentation headers).
- **Cannot** learn the client's long-term public key — it is never transmitted during handshake or auth.
- **Cannot** perform admin operations (registry updates, grant posting) without a valid admin BBS+ credential.
- **Cannot** correlate a recipient's mailbox identity with their authenticated sessions (different keys, unlinkable proofs).
### Malicious client
- Cannot decrypt others' traffic without session keys.
- Cannot produce valid auth for a role without a valid credential + correct epoch + registry entry.
### Denial of service
- **Auth packet size:** Proof length is capped (`MAX_BBS_AUTH_PROOF_BYTES`, 256 KiB) to bound allocations.
- **Handshake:** Fixed 32-byte messages; no variable-length handshake parsing.
- General packet limits should still be enforced at the application layer (total message size, rate limits).
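
The proof-size cap is most useful when it is checked before the body is read. The sketch below assumes a hypothetical framing (4-byte big-endian length prefix followed by the proof); only the constant `MAX_BBS_AUTH_PROOF_BYTES` and its 256 KiB value come from this document.

```python
import io
import struct

MAX_BBS_AUTH_PROOF_BYTES = 256 * 1024  # cap stated above


def read_auth_proof(stream) -> bytes:
    """Read one length-prefixed proof, rejecting oversize claims up front."""
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated length prefix")
    (declared,) = struct.unpack(">I", header)
    if declared > MAX_BBS_AUTH_PROOF_BYTES:
        # Reject before reading or allocating anything for the body.
        raise ValueError("proof exceeds MAX_BBS_AUTH_PROOF_BYTES")
    body = stream.read(declared)
    if len(body) != declared:
        raise ValueError("truncated proof body")
    return body


frame = struct.pack(">I", 3) + b"abc"
print(read_auth_proof(io.BytesIO(frame)))
```

A total-message limit and rate limiting at the application layer are still needed; this check only bounds a single allocation.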
## Key distribution
The server's long-term `PublicKey` (32-byte Ristretto255 point) functions as a **self-authenticating identity** — no certificate authority is required. The client must obtain and pin this key before connecting.
Recommended strategies:
1. **Static configuration** (default): embed the server public key in client config or CLI pin command (`zkac-node server pin <userid> <host:port> --key <hex>`). Equivalent to WireGuard's `[Peer] PublicKey = ...`.
2. **Trust On First Use (TOFU):** accept the server's key on first connection, pin it for subsequent sessions. Risk: first connection is vulnerable.
3. **Out-of-band verification:** compare public key fingerprints over a trusted side channel (phone, in-person, encrypted messaging).
4. **Key registry / directory:** a trusted service maps names to public keys. Shifts trust to the registry and its authentication channel.
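
Strategies 1 and 2 differ only in what happens when no pin exists yet. A minimal sketch of that decision, assuming a hypothetical in-memory `PinStore` keyed by `host:port` (the CLI's actual pin storage is not described here):

```python
import hmac


class PinStore:
    """Hypothetical pin store mapping 'host:port' to a 32-byte public key."""

    def __init__(self) -> None:
        self._pins = {}

    def pin(self, host: str, public_key: bytes) -> None:
        """Static configuration: record a key obtained out of band."""
        if len(public_key) != 32:
            raise ValueError("expected a 32-byte public key")
        self._pins[host] = bytes(public_key)

    def check(self, host: str, presented: bytes, allow_tofu: bool = False) -> bool:
        """Accept only a matching pin; optionally pin on first use."""
        if len(presented) != 32:
            return False
        pinned = self._pins.get(host)
        if pinned is None:
            if allow_tofu:
                self._pins[host] = bytes(presented)   # first connection is trusted
                return True
            return False
        # Constant-time comparison against the stored pin.
        return hmac.compare_digest(pinned, presented)


store = PinStore()
store.pin("example.invalid:7000", b"\x11" * 32)
print(store.check("example.invalid:7000", b"\x11" * 32))
```

With `allow_tofu=False` an unknown host is always rejected, which matches the static-configuration default; TOFU trades that guarantee for convenience on the first connection only.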
## Operational requirements
1. **Issuer secret key:** Protect `BbsIssuer` secret material (HSM, KMS, or encrypted at rest). Compromise = ability to issue arbitrary credentials for that role.
2. **Server long-term key:** Protect the server's `server_key.json`. Compromise = ability to impersonate the server. Rotate the key and distribute the new public key to clients if compromised.
3. **Member storage:** `member_secret` and finalized `Credential` material must be protected; loss = re-enrollment required.
4. **Epoch revocation:** On compromise or policy change, call `set_epoch` and re-issue credentials only to legitimate members; old credentials become invalid at verification time.
5. **Registry integrity:** Registry state is integrity-protected by BBS+ state certificates (admin must sign updates). The server verifies these certificates before accepting changes.
6. **Role ID privacy:** When derived with `role_id("myrole")`, a `role_id` is a plain hash of the role name, so guessable names can be enumerated by hashing candidates. Treat role names as secrets if enumeration is a concern, or derive role IDs with an additional secret salt known to members.
7. **Recipient addressing:** Admins encrypt grants to the recipient's issuance public key off-server; that key is not used as a server-side mailbox index. Recipients are identified to the issuer out-of-band only.
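
The role ID privacy point in item 6 can be illustrated with the standard library. The sketch assumes the unsalted ID is BLAKE2b-512 of the UTF-8 role name truncated to 32 bytes with no extra framing, which may differ from the library's exact encoding; `salted_role_id` and its length-prefixed salt are hypothetical.

```python
import hashlib


def role_id(name: str) -> bytes:
    """Assumed unsalted form: BLAKE2b-512 of the name, truncated to 32 bytes."""
    return hashlib.blake2b(name.encode("utf-8"), digest_size=64).digest()[:32]


def salted_role_id(name: str, salt: bytes) -> bytes:
    """Hypothetical salted variant; the salt is shared only with members."""
    h = hashlib.blake2b(digest_size=64)
    h.update(len(salt).to_bytes(2, "big"))   # length prefix avoids ambiguity
    h.update(salt)
    h.update(name.encode("utf-8"))
    return h.digest()[:32]


# An observer who sees an unsalted role_id can test guesses offline.
observed = role_id("admin")
guesses = ["user", "operator", "admin"]
print([g for g in guesses if role_id(g) == observed])

# With a secret salt, the same dictionary no longer matches.
secret_salt = b"\x42" * 16
print(any(role_id(g) == salted_role_id("admin", secret_salt) for g in guesses))
```

The salt has to be distributed to members through the same trusted path as credentials, since anyone who holds it can enumerate names again.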
## Implementation notes (audit checklist)
- [x] BBS+ proof verification uses the same header and presentation binding as proof generation (`verify_presentation` in Rust).
- [x] Session transcript is included in the presentation via `present(transcript_hash)`.
- [x] Server identity proof: Schnorr signature over `transcript_hash`, verified against pinned public key before any traffic.
- [x] Schnorr nonce is deterministic (`H(sk || msg)`) — no dependence on RNG quality at signing time.
- [x] Replay protection is symmetric per direction in `Session`.
- [x] Constant-time comparisons are used where critical in transport/replay paths (`subtle` crate).
- [x] Client long-term key is never transmitted, preserving BBS+ unlinkability.
- [x] Management and auth channels use the same encrypted handshake (no plaintext management path).
- [x] Admin proofs in `post_grant` are bound to the session transcript hash (no separate nonce); the CLI uses **one TCP session per grant** so each proof uses a fresh transcript.
- [x] After collect, the client persists the server public key from `server_info` (never a placeholder key).
- [x] Server stores only opaque state bytes, state certs, and encrypted grant blobs (no role names, no user IDs).
- [ ] **External:** Python bindings surface raw bytes; callers must not log secrets (`secret_key_bytes`, `member_secret`, `prover_blind`).
- [ ] **External:** Use secure randomness from the OS (library uses OS RNG for key generation paths exposed in Rust).
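
For the first external item, one way for Python callers to reduce accidental logging is to wrap raw secret bytes in a type whose string forms are redacted. `SecretBytes` below is a hypothetical helper for application code, not part of the ZKAC bindings.

```python
class SecretBytes:
    """Wrapper that keeps secret material out of repr(), str() and f-strings."""

    __slots__ = ("_value",)

    def __init__(self, value: bytes) -> None:
        self._value = bytes(value)

    def expose(self) -> bytes:
        """Explicit, greppable access to the raw bytes."""
        return self._value

    def __repr__(self) -> str:
        return "SecretBytes(<redacted, %d bytes>)" % len(self._value)

    __str__ = __repr__


member_secret = SecretBytes(bytes.fromhex("deadbeef"))  # placeholder value
print(f"loaded {member_secret}")
```

This does not protect against memory dumps or deliberate calls to `expose()`; it only makes the default logging and debugging paths safe.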
## Design decisions
- **Unified encrypted channel:** All traffic (management and auth) uses the same anonymous handshake. This eliminates the attack surface of an unencrypted management path and simplifies the protocol to a single mode.
- **Anonymous handshake (`complete_connect_anon`):** The client verifies the server's identity but does not authenticate itself during the handshake. BBS+ auth is sent as an application-layer message inside the encrypted session, not as part of the handshake. This allows the same channel for both anonymous management and authenticated role access.
- **Server-only identity proof:** Only the server signs the transcript. Adding client long-term signing would break BBS+ unlinkability (the server could correlate sessions by client public key). Client authentication is handled entirely by the anonymous BBS+ credential.
- **Deterministic Schnorr nonces:** The signing nonce is derived as `H("zkac-schnorr-nonce" || sk || msg)`, eliminating a class of RNG-failure attacks (cf. PS3 ECDSA, Sony 2010). Same key + same message = same signature.
- **Anonymous grant pool:** Grant entries contain only `(eph_pk, ciphertext)` plus stable row metadata — no registry ID or role name. Recipients find their grants by trial-decrypting after two-server XOR PIR (or an O(n) PIR scan over the pool). Pool rows use tombstones (`claimed`) so indices stay stable for replicated PIR.
- **No user IDs on server:** The server has no concept of user accounts and keeps no per-user state. It relays and stores only opaque blobs, and is authenticated only by cryptographic proofs.
- **One session per admin grant (CLI):** Each `post_grant` runs in its own connection so that no two `verify_admin` proofs are bound to the same session transcript hash (the transcript takes the place of a nonce). Registry updates use separate connections for `get_registry` and `update_registry`. Collect uses separate connections for `server_info`, pool fetch / PIR, `claim_grant`, and `get_registry` so those operations are not tied to one transcript.
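
The deterministic nonce rule `H("zkac-schnorr-nonce" || sk || msg)` can be sketched as below. The choice of BLAKE2b-512 for `H` and the little-endian wide reduction modulo the Ristretto255 group order are assumptions for illustration; the document only fixes the label and the input order.

```python
import hashlib

# Order of the Ristretto255 group (same prime as the Ed25519 subgroup order).
GROUP_ORDER = 2**252 + 27742317777372353535851937790883648493


def schnorr_nonce(secret_key: bytes, message: bytes) -> int:
    """Derive the signing nonce from the key and message, with no RNG input."""
    h = hashlib.blake2b(digest_size=64)       # assumed hash function
    h.update(b"zkac-schnorr-nonce")
    h.update(secret_key)
    h.update(message)
    # A 512-bit digest reduced mod a ~252-bit order has negligible bias.
    return int.from_bytes(h.digest(), "little") % GROUP_ORDER


sk = b"\x07" * 32                              # placeholder key bytes
k1 = schnorr_nonce(sk, b"transcript-A")
k2 = schnorr_nonce(sk, b"transcript-A")
k3 = schnorr_nonce(sk, b"transcript-B")
print(k1 == k2, k1 != k3)
```

Because distinct messages map to distinct nonces, the classic key-recovery attack from a repeated nonce across two different messages does not apply, even with a weak RNG at signing time.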
## Known limitations
- **No post-quantum** primitives: classical security assumptions only.
- **Epoch granularity:** Revocation is coarse (epoch bump); plan issuance and rotation policy accordingly.
- **zkryptium dependency:** Security follows the underlying crate and BLS12-381/BBS+ standards; keep dependencies updated.
- **Key distribution:** The library provides the cryptographic mechanism; initial key distribution is an application-layer responsibility.
- **Pool metadata:** Each replica sees `pir_fold` subset queries (random-looking index sets) and timing. Two-server XOR PIR hides the target index from each server if they do not collude; running both replicas under one operator does not provide that privacy. A full-pool scan issues **n** PIR queries and has high cost; the issuer should send **`pool_index` out-of-band** so the recipient runs **one** PIR retrieval for collect.
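
The cost gap between a full scan and a single retrieval in the pool metadata item can be estimated with simple arithmetic. The sketch assumes each query is encoded as an n-bit subset vector sent to both replicas and that each replica answers with one fixed-size row; the real encoding and row size may differ.

```python
def pir_cost(pool_rows: int, row_bytes: int, queries: int):
    """Return (upload_bytes, download_bytes) summed over both replicas."""
    query_bytes = (pool_rows + 7) // 8          # one bit per pool row
    upload = 2 * queries * query_bytes
    download = 2 * queries * row_bytes
    return upload, download


rows, size = 10_000, 256                         # illustrative pool shape
full_scan = pir_cost(rows, size, queries=rows)   # one query per index
single = pir_cost(rows, size, queries=1)         # pool_index known out-of-band
print(full_scan, single)
```

Under these assumptions a full scan of a 10,000-row pool uploads 25,000,000 bytes and downloads 5,120,000 bytes, against 2,500 and 512 bytes for one retrieval. Upload grows quadratically with pool size for a scan, which is why sending `pool_index` out-of-band matters.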
## Future work
- **Single-server sublinear PIR:** The CLI uses **two-server XOR PIR** (Chor-style) only. **Single-server** private information retrieval with **sublinear** client communication (e.g. **SealPIR**, **DoublePIR**, or other lattice / homomorphic-encryption-based schemes) is **not** implemented; adding it would require new dependencies, fixed database encoding, and a distinct query/response protocol. That would allow a lone replica without a non-colluding peer, at the cost of heavier crypto and implementation complexity.
## Reporting issues
Report security-sensitive findings through your project's private disclosure channel (configure `SECURITY.md` contact or GitHub security advisories when the repository is public).