
# Security model and audit notes (ZKAC 0.3)

This document summarizes the design, residual risks, and recommendations for operators integrating **ZKAC**. It is not a substitute for independent review before high-assurance deployment.

## Goals

- **Authentication:** Only holders of a valid BBS+ credential for a registered role can complete `verify_auth` for that role.
- **Server identity:** The server proves its long-term identity to the client via a Schnorr signature over the session transcript; clients verify against a pinned public key. This prevents MITM attacks without requiring TLS.
- **Confidentiality & integrity:** All traffic (management and authenticated sessions) is authenticated-encrypted (ChaCha20-Poly1305) with keys derived from an ephemeral X25519 handshake.
- **Replay resistance:** Replayed or duplicated ciphertexts are rejected independently in each direction (monotonic counter plus sliding window).
- **Unlinkability (credential layer):** BBS+ presentations are unlinkable across sessions when the presentation header (the session transcript hash) differs; the verifier learns only the disclosed attributes (opaque `role_id`, epoch) and validity. Client anonymity is preserved: the client never reveals its long-term key during the handshake.
- **Server cannot forge credentials:** The server stores only the issuer **public** key per role; forging requires the issuer secret key.
- **Opaque server:** The server stores only cryptographically verified state blobs and opaque grant ciphertexts. No user identities, role names, or credential material are stored or visible to the server.
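
The replay-resistance goal above can be illustrated with a minimal sliding-window sketch. It is not the `Session` implementation: the class name `ReplayWindow`, the window size, and the bitmap layout are assumptions chosen for illustration. In practice such a check is normally applied only after the AEAD tag has verified, so that forged packets cannot advance the window.

```python
class ReplayWindow:
    """Per-direction replay filter: a high-water mark plus a bitmap of recent counters.

    Hypothetical sketch; one instance would be kept for each traffic direction.
    """

    def __init__(self, size: int = 64) -> None:
        self.size = size     # how far behind the newest counter is still accepted
        self.highest = -1    # largest counter accepted so far
        self.bitmap = 0      # bit k set means counter (highest - k) was already seen

    def check_and_update(self, counter: int) -> bool:
        """Return True and record the counter if it is fresh, else False."""
        if counter < 0:
            return False
        if counter > self.highest:
            # New high-water mark: slide the window forward and mark bit 0.
            shift = counter - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = counter
            return True
        offset = self.highest - counter
        if offset >= self.size:
            return False     # too old to track, reject
        mask = 1 << offset
        if self.bitmap & mask:
            return False     # duplicate
        self.bitmap |= mask
        return True
```

Out-of-order packets inside the window are accepted exactly once; anything older than the window or already seen is dropped.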

## Cryptographic components

| Layer | Primitive | Purpose |
|-------|-----------|---------|
| Transport | X25519 ephemeral DH, HKDF-SHA256, ChaCha20-Poly1305 | Session keys, AEAD |
| Identity | Schnorr on Ristretto255, BLAKE2b-512 challenge | Server identity binding |
| Credentials | BBS+ on BLS12-381 (zkryptium), SHAKE256 ciphersuite | Blind issuance, ZK presentations |
| Role IDs | BLAKE2b-512 (truncated to 32 bytes) | Opaque role identifiers |
| Grant delivery | X25519 static/ephemeral DH, HKDF-SHA256, ChaCha20-Poly1305 | E2E-encrypted credential grants |
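
As a rough illustration of the HKDF-SHA256 step in the transport row, the sketch below implements HKDF as specified in RFC 5869 using only the Python standard library and derives one key per direction. The `info` labels, the use of the transcript hash as salt, and the function `derive_direction_keys` are assumptions for illustration; the actual ZKAC key schedule may differ.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract: an empty salt is treated as 32 zero bytes, as in the RFC.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # Expand: T(n) = HMAC(PRK, T(n-1) || info || n)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_direction_keys(shared_secret: bytes, transcript_hash: bytes) -> Tuple[bytes, bytes]:
    """Derive separate 32-byte AEAD keys per direction (labels are hypothetical)."""
    client_to_server = hkdf_sha256(shared_secret, transcript_hash, b"example c2s", 32)
    server_to_client = hkdf_sha256(shared_secret, transcript_hash, b"example s2c", 32)
    return client_to_server, server_to_client
```

Using distinct `info` labels per direction keeps the two ChaCha20-Poly1305 keys independent even though both come from the same X25519 shared secret.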

## Protocol flow

### Unified channel (all connections)

```
Client                                Server
  |--- init_msg (eph_pk) ------------>|
  |                                   | accept()
  |                                   | prove_identity() → sign(transcript)
  |<-- response_msg + identity_pkt ---|
  | complete DH                       |
  | decrypt + verify server sig       |
  |===== encrypted session ==========>|
  |--- {op: "mgmt"} or {op: "auth"}->|
```

Management commands (`create_registry`, `post_grant`, etc.) and BBS+ role authentication both run inside the same encrypted, server-authenticated channel. There is no unencrypted management path.

### Grant delivery (admin → recipient, through server)

Grants live in a single **anonymous append-only pool** (no `recipient_pk` on the server). Recipients fetch rows only via **two-server XOR PIR** (`mail_pool_len` + `pir_fold` on two replicas with identical pools). Each replica sees only a random-looking subset of indices and returns the XOR of those rows; *which* logical index the recipient recovers is hidden as long as the replicas do not collude. There is **no** full-pool download API. A recipient that does not know its row index must scan the pool for pending grants, which costs **O(n) PIR round-trips** (one Chor-style query per index).

```
Admin                  Server (opaque relay)        Recipient
  |-- post_grant ------->|                            |
  |   (admin_proof,      | appends to pool:           |
  |    eph_pk,           |  {grant_id, eph_pk, ct}    |
  |    ciphertext)       |  (no recipient address)    |
  |                      |<-- pir_fold (replica A/B) --|
  |                      |--- XOR of subset rows ----->|
  |                      |                            | combine → one row
  |                      |                            | trial-decrypt
  |                      |<-- claim_grant ------------|
  |                      |  (tombstone / claimed)     |
```
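
The two-server XOR PIR retrieval described above can be sketched as follows. This is a generic Chor-style construction, not the ZKAC wire protocol: the function signatures, the bit-vector query encoding, and the assumption that all pool rows are padded to equal length are illustrative choices.

```python
import secrets
from typing import List, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(pool_len: int, target: int) -> Tuple[List[int], List[int]]:
    """Client side: a uniformly random bit vector for replica A, and the same
    vector with the target bit flipped for replica B."""
    query_a = [secrets.randbelow(2) for _ in range(pool_len)]
    query_b = list(query_a)
    query_b[target] ^= 1
    return query_a, query_b


def pir_fold(pool: List[bytes], query: List[int]) -> bytes:
    """Replica side: XOR together every row whose query bit is set."""
    acc = bytes(len(pool[0]))
    for row, bit in zip(pool, query):
        if bit:
            acc = xor_bytes(acc, row)
    return acc


def recover(answer_a: bytes, answer_b: bytes) -> bytes:
    """Client side: the two answers differ only by the target row."""
    return xor_bytes(answer_a, answer_b)
```

Each query on its own is a uniformly random bit vector, so a single replica learns nothing about `target`; only a party holding both queries can see which bit differs. That is why the non-collusion assumption matters.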

## Threats considered

### Network attacker (passive)

- Observes ciphertexts; cannot break ChaCha20-Poly1305 or derive session keys without breaking X25519 / HKDF under standard assumptions.
- Management traffic is indistinguishable from auth traffic at the wire level (same handshake, same framing).

### Network attacker (active / MITM)

- **Server impersonation:** The server signs the session transcript hash with its long-term Ristretto255 key (`prove_identity`). The client verifies this signature against the **pinned** server public key. A MITM running a separate DH exchange produces a different transcript; it cannot forge the server's signature. The client aborts on mismatch.
- **Client impersonation:** The BBS+ presentation is bound to the session transcript hash. A MITM cannot relay a presentation from one session to another (different transcripts) or forge one (requires a valid credential from the issuer).
- **Relay attack:** A MITM that relays the real server's identity proof to a client fails because the proof is encrypted under the MITM-to-server session keys (not the client-to-MITM keys), and the signature is over the wrong transcript.
- **Management channel:** All management commands (registry creation, grants) are protected by the same encrypted channel, eliminating the previous plaintext management path.
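
The transcript-binding argument in the bullets above can be illustrated with a small simulation. It assumes the transcript hash covers the two fixed 32-byte handshake messages and uses BLAKE2b as a stand-in hash; the real transcript layout and the names `transcript_hash` and `simulate` are hypothetical, and random bytes stand in for ephemeral public keys.

```python
import hashlib
import secrets
from typing import Tuple


def transcript_hash(init_msg: bytes, response_msg: bytes) -> bytes:
    """Hash both fixed-size handshake messages (hypothetical transcript layout)."""
    if len(init_msg) != 32 or len(response_msg) != 32:
        raise ValueError("handshake messages are fixed at 32 bytes")
    return hashlib.blake2b(init_msg + response_msg, digest_size=32).digest()


def simulate(mitm: bool) -> Tuple[bytes, bytes]:
    """Return (hash the client expects to be signed, hash the server actually signs)."""
    client_eph = secrets.token_bytes(32)   # stand-in for the client's ephemeral key
    server_eph = secrets.token_bytes(32)   # stand-in for the server's ephemeral key
    if not mitm:
        shared = transcript_hash(client_eph, server_eph)
        return shared, shared
    # A MITM terminates both legs with its own ephemeral keys.
    mitm_to_client = secrets.token_bytes(32)
    mitm_to_server = secrets.token_bytes(32)
    client_view = transcript_hash(client_eph, mitm_to_client)
    server_view = transcript_hash(mitm_to_server, server_eph)
    return client_view, server_view
```

With a MITM in the path the two views differ, so a signature the real server produces over its own transcript does not verify against the transcript the client computed.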

### Malicious server

- Can **learn** opaque `role_id`, current epoch, and that *some* valid member authenticated.
- Sees `registry_id` values (needed for routing) but not role names or registry contents beyond opaque state bytes.
- Sees `eph_pk` and ciphertext per grant in the anonymous pool, and pool size / timing of syncs, but cannot decrypt grant payloads. It does **not** see a per-recipient mailbox key for addressing.
- **Cannot** forge BBS+ credentials without the issuer secret key.
- **Cannot** learn `member_secret` from presentations under the BBS+ security assumptions.
- **Cannot** distinguish which specific member authenticated among valid credential holders (unlinkability holds against the verifier for distinct presentation headers).
- **Cannot** learn the client's long-term public key — it is never transmitted during handshake or auth.
- **Cannot** perform admin operations (registry updates, grant posting) without a valid admin BBS+ credential.
- **Cannot** correlate a recipient's mailbox identity with their authenticated sessions (different keys, unlinkable proofs).

### Malicious client

- Cannot decrypt others' traffic without session keys.
- Cannot produce valid auth for a role without a valid credential + correct epoch + registry entry.

### Denial of service

- **Auth packet size:** Proof length is capped (`MAX_BBS_AUTH_PROOF_BYTES`, 256 KiB) to bound allocations.
- **Handshake:** Fixed 32-byte messages; no variable-length handshake parsing.
- General packet limits should still be enforced at the application layer (total message size, rate limits).
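
A size cap like the one above is most useful when it is checked before any buffer is allocated. The sketch below assumes a hypothetical 4-byte big-endian length prefix in front of the proof; only the constant name and the 256 KiB value come from this document.

```python
import io
import struct
from typing import BinaryIO

MAX_BBS_AUTH_PROOF_BYTES = 256 * 1024  # cap named in this document


class ProofTooLarge(ValueError):
    """Raised when a declared proof length exceeds the cap."""


def read_proof(stream: BinaryIO) -> bytes:
    """Read a length-prefixed proof, rejecting oversized lengths before reading the body."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("truncated length prefix")
    (length,) = struct.unpack(">I", header)
    if length > MAX_BBS_AUTH_PROOF_BYTES:
        # Fail before allocating or reading anything attacker-sized.
        raise ProofTooLarge("declared proof length %d exceeds cap" % length)
    body = stream.read(length)
    if len(body) != length:
        raise EOFError("truncated proof body")
    return body
```

For example, `read_proof(io.BytesIO(struct.pack(">I", 3) + b"abc"))` returns `b"abc"`, while a prefix declaring more than 256 KiB raises `ProofTooLarge` without touching the body.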

## Key distribution

The server's long-term `PublicKey` (32-byte Ristretto255 point) functions as a **self-authenticating identity** — no certificate authority is required. The client must obtain and pin this key before connecting.

Recommended strategies:

1. **Static configuration** (default): embed the server public key in client config or CLI pin command (`zkac-node server pin <userid> <host:port> --key <hex>`). Equivalent to WireGuard's `[Peer] PublicKey = ...`.
2. **Trust On First Use (TOFU):** accept the server's key on first connection, pin it for subsequent sessions. Risk: first connection is vulnerable.
3. **Out-of-band verification:** compare public key fingerprints over a trusted side channel (phone, in-person, encrypted messaging).
4. **Key registry / directory:** a trusted service maps names to public keys. Shifts trust to the registry and its authentication channel.
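
The difference between static pinning and TOFU can be sketched with a small in-memory pin store. `PinStore`, `PinMismatch`, and the `allow_tofu` flag are hypothetical names for illustration and do not describe the `zkac-node` CLI's storage format.

```python
import hmac
from typing import Dict


class PinMismatch(Exception):
    """Raised when a presented key is unknown or differs from the pinned key."""


class PinStore:
    """Maps an endpoint string to a pinned 32-byte server public key."""

    def __init__(self) -> None:
        self._pins: Dict[str, bytes] = {}

    def pin(self, endpoint: str, public_key: bytes) -> None:
        if len(public_key) != 32:
            raise ValueError("expected a 32-byte public key")
        self._pins[endpoint] = bytes(public_key)

    def check(self, endpoint: str, presented_key: bytes, allow_tofu: bool = False) -> bytes:
        """Return the trusted key, or raise PinMismatch."""
        pinned = self._pins.get(endpoint)
        if pinned is None:
            if not allow_tofu:
                raise PinMismatch("no pinned key for " + endpoint)
            # TOFU: trust the first key seen. This first contact is unauthenticated.
            self.pin(endpoint, presented_key)
            return bytes(presented_key)
        if not hmac.compare_digest(pinned, presented_key):
            raise PinMismatch("server key changed for " + endpoint)
        return pinned
```

With `allow_tofu=False` the store behaves like static configuration: an unknown endpoint is an error rather than an opportunity for an attacker to supply the first key.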

## Operational requirements

1. **Issuer secret key:** Protect `BbsIssuer` secret material (HSM, KMS, or encrypted at rest). Compromise = ability to issue arbitrary credentials for that role.
2. **Server long-term key:** Protect the server's `server_key.json`. Compromise = ability to impersonate the server. Rotate the key and distribute the new public key to clients if compromised.
3. **Member storage:** `member_secret` and finalized `Credential` material must be protected; loss = re-enrollment required.
4. **Epoch revocation:** On compromise or policy change, call `set_epoch` and re-issue credentials only to legitimate members; old credentials become invalid at verification time.
5. **Registry integrity:** Registry state is integrity-protected by BBS+ state certificates (admin must sign updates). The server verifies these certificates before accepting changes.
6. **Role ID privacy:** `role_id("myrole")` is a plain hash of the role name, so anyone who can guess the name can recompute the ID. Treat role names as secrets if enumeration is a concern, or derive role IDs with an additional secret salt known to members.
7. **Recipient addressing:** Admins encrypt grants to the recipient’s issuance public key off-server; that key is not used as a server-side mailbox index. Recipients are identified to the issuer out-of-band only.
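
The role ID privacy point can be illustrated with a short sketch. It assumes the ID is BLAKE2b-512 over the UTF-8 role name truncated to 32 bytes, as the components table suggests; the library's real `role_id` may add domain separation, and the salted variant below (keyed BLAKE2b) is one hypothetical way to add a member-known secret.

```python
import hashlib
from typing import Iterable, Optional


def role_id_unsalted(role_name: str) -> bytes:
    """BLAKE2b-512 of the role name, truncated to 32 bytes (assumed encoding)."""
    return hashlib.blake2b(role_name.encode("utf-8"), digest_size=64).digest()[:32]


def role_id_salted(role_name: str, salt: bytes) -> bytes:
    """Hypothetical salted variant: keyed BLAKE2b with a secret shared by members."""
    if not 16 <= len(salt) <= 64:
        raise ValueError("salt should be 16 to 64 bytes")
    digest = hashlib.blake2b(role_name.encode("utf-8"), key=salt, digest_size=64).digest()
    return digest[:32]


def enumerate_role(observed_id: bytes, guesses: Iterable[str]) -> Optional[str]:
    """What an observer can do against unsalted IDs: hash a dictionary of names."""
    for name in guesses:
        if role_id_unsalted(name) == observed_id:
            return name
    return None
```

An observer holding an unsalted ID for a guessable name such as `"admin"` recovers the name with `enumerate_role`; the same dictionary attack fails against the salted form unless the salt leaks.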

## Implementation notes (audit checklist)

- [x] BBS+ proof verification uses the same header and presentation binding as proof generation (`verify_presentation` in Rust).
- [x] Session transcript is included in the presentation via `present(transcript_hash)`.
- [x] Server identity proof: Schnorr signature over `transcript_hash`, verified against pinned public key before any traffic.
- [x] Schnorr nonce is deterministic (`H("zkac-schnorr-nonce" || sk || msg)`), so signing does not depend on RNG quality.
- [x] Replay protection is symmetric per direction in `Session`.
- [x] Constant-time comparisons are used where critical in transport/replay paths (`subtle` crate).
- [x] Client long-term key is never transmitted, preserving BBS+ unlinkability.
- [x] Management and auth channels use the same encrypted handshake (no plaintext management path).
- [x] Admin proofs in `post_grant` are bound to the session transcript hash (no separate nonce); the CLI uses **one TCP session per grant** so each proof uses a fresh transcript.
- [x] After collect, the client persists the server public key from `server_info` (never a placeholder key).
- [x] Server stores only opaque state bytes, state certs, and encrypted grant blobs (no role names, no user IDs).
- [ ] **External:** Python bindings surface raw bytes; callers must not log secrets (`secret_key_bytes`, `member_secret`, `prover_blind`).
- [ ] **External:** Use secure randomness from the OS (library uses OS RNG for key generation paths exposed in Rust).
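
For the two external items, callers of the Python bindings might wrap raw secret bytes so they cannot be logged by accident, and draw any locally generated key material from the OS RNG. `SecretBytes` is a hypothetical helper, not part of the ZKAC bindings.

```python
import secrets


class SecretBytes:
    """Holds secret bytes and refuses to reveal them through repr, str, or formatting."""

    __slots__ = ("_value",)

    def __init__(self, value: bytes) -> None:
        self._value = bytes(value)

    def expose(self) -> bytes:
        """Explicit, greppable access to the raw bytes."""
        return self._value

    def __repr__(self) -> str:
        return "SecretBytes(<redacted, %d bytes>)" % len(self._value)

    __str__ = __repr__


def new_random_secret(length: int = 32) -> SecretBytes:
    """Generate secret bytes from the operating system's CSPRNG."""
    return SecretBytes(secrets.token_bytes(length))
```

Values such as `secret_key_bytes`, `member_secret`, or `prover_blind` could be wrapped this way at the boundary, so that a stray `logging.info("%s", value)` prints only the redacted form.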

## Design decisions

- **Unified encrypted channel:** All traffic (management and auth) uses the same anonymous handshake. This eliminates the attack surface of an unencrypted management path and simplifies the protocol to a single mode.
- **Anonymous handshake (`complete_connect_anon`):** The client verifies the server's identity but does not authenticate itself during the handshake. BBS+ auth is sent as an application-layer message inside the encrypted session, not as part of the handshake. This allows the same channel for both anonymous management and authenticated role access.
- **Server-only identity proof:** Only the server signs the transcript. Adding client long-term signing would break BBS+ unlinkability (the server could correlate sessions by client public key). Client authentication is handled entirely by the anonymous BBS+ credential.
- **Deterministic Schnorr nonces:** The signing nonce is derived as `H("zkac-schnorr-nonce" || sk || msg)`, eliminating a class of RNG-failure attacks (cf. the 2010 Sony PS3 ECDSA nonce-reuse failure). Same key + same message = same signature.
- **Anonymous grant pool:** Grant entries contain only `(eph_pk, ciphertext)` plus stable row metadata — no registry ID or role name. Recipients find their grants by trial-decrypting after two-server XOR PIR (or an O(n) PIR scan over the pool). Pool rows use tombstones (`claimed`) so indices stay stable for replicated PIR.
- **No user IDs on server:** The server has no concept of user accounts. It keeps no per-user state and acts as a relay that authorizes requests only by cryptographic proofs.
- **One session per admin grant (CLI):** Each `post_grant` runs in its own connection, so every `verify_admin` proof is bound to a fresh transcript hash and is never reused across grants in a single session. Registry updates use separate connections for `get_registry` and `update_registry`. Collect uses separate connections for `server_info`, pool fetch / PIR, `claim_grant`, and `get_registry` so those operations are not tied to one transcript.
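
The deterministic nonce derivation in the list above can be sketched as follows. The domain-separation string comes from this document; the choice of BLAKE2b-512 for `H` and the wide reduction modulo the Ristretto255 group order are assumptions, and the Rust implementation may encode its inputs differently.

```python
import hashlib

# Order of the prime-order group used by Ristretto255 (same as the Ed25519 subgroup order).
GROUP_ORDER = 2**252 + 27742317777372353535851937790883648493


def schnorr_nonce(sk: bytes, msg: bytes) -> int:
    """Derive a signing nonce as H("zkac-schnorr-nonce" || sk || msg) reduced mod the group order.

    Assumes H is BLAKE2b-512; the 64-byte digest is interpreted little-endian and
    reduced, which keeps the bias of the resulting scalar negligible.
    """
    digest = hashlib.blake2b(b"zkac-schnorr-nonce" + sk + msg, digest_size=64).digest()
    return int.from_bytes(digest, "little") % GROUP_ORDER
```

Because the nonce depends only on the key and the message, signing the same transcript twice yields the same nonce and therefore the same signature, while any change to the message yields an unrelated nonce.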

## Known limitations

- **No post-quantum** primitives: classical security assumptions only.
- **Epoch granularity:** Revocation is coarse (epoch bump); plan issuance and rotation policy accordingly.
- **zkryptium dependency:** Security follows the underlying crate and BLS12-381/BBS+ standards; keep dependencies updated.
- **Key distribution:** The library provides the cryptographic mechanism; initial key distribution is an application-layer responsibility.
- **Pool metadata:** Each replica sees `pir_fold` subset queries (random-looking index sets) and timing. Two-server XOR PIR hides the target index from each server if they do not collude; running both replicas under one operator does not provide that privacy. A full-pool scan issues **n** PIR queries and has high cost; the issuer should send **`pool_index` out-of-band** so the recipient runs **one** PIR retrieval for collect.

## Future work

- **Single-server sublinear PIR:** The CLI uses **two-server XOR PIR** (Chor-style) only. **Single-server** private information retrieval with **sublinear** client communication (e.g. **SealPIR**, **DoublePIR**, or other lattice / homomorphic-encryption–based schemes) is **not** implemented; adding it would require new dependencies, fixed database encoding, and a distinct query/response protocol. That would allow a lone replica without a non-colluding peer, at the cost of heavier crypto and implementation complexity.

## Reporting issues

Report security-sensitive findings through your project's private disclosure channel (configure `SECURITY.md` contact or GitHub security advisories when the repository is public).