CONFIDENTIAL & PROPRIETARY © 2026 Inkwell Finance, Inc. All Rights Reserved. This document is for informational purposes only and does not constitute legal, tax, or investment advice, nor an offer to sell or a solicitation to buy any security or other financial instrument. Any examples, structures, or flows described here are design intent only and may change.

Performance claims in a pre-alpha FHE product are easy to get wrong: per-op microbenchmarks projected into graph-level wall-clock; numbers from a different graph quoted as current-state; scheme comparisons that mix hardware tiers. Dagon’s benchmark discipline is built around four status tags. Every number in the pitch, the docs, and the dashboard must fit one of them.
The four tags
✓ measured
Wall-clock recorded end-to-end on the real graph. Parity verified against the plaintext shadow within tolerance. Reproducible from a committed script. Cite directly.

⚠ measured-with-caveat
Wall-clock recorded but correctness is compromised — typically CKKS at a bootstrap chain depth that exceeds the absolute noise floor. Cite only with the failure mode stated inline.

⊘ projected
Computed from measured per-op times × the target graph’s op histogram. No end-to-end wall-clock run. Acceptable when the full-graph run OOMs locally; never cite without the “(projected)” suffix.

✗ extrapolated from SOTA
Not run at all; computed from published literature constants (e.g., a published per-bootstrap cost). Always cite as “expected,” never “measured.”
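A projected number is just the dot product of the target graph’s op histogram with measured per-op times. A minimal sketch of that arithmetic, with hypothetical op names, costs, and counts (none of these figures are real Dagon measurements):

```python
# Hypothetical per-op wall-clock means in seconds, measured on a dev circuit.
per_op_s = {"mul": 0.004, "rotate": 0.009, "bootstrap": 1.8}

# Hypothetical op histogram of the target production graph.
target_ops = {"mul": 5200, "rotate": 1100, "bootstrap": 96}

def project_runtime(per_op: dict[str, float], histogram: dict[str, int]) -> float:
    """Graph-level projection: sum over ops of (per-op cost x op count).

    No end-to-end wall-clock run backs this number, so it must carry
    the "(projected)" suffix wherever it is cited.
    """
    return sum(per_op[op] * count for op, count in histogram.items())

print(f"{project_runtime(per_op_s, target_ops):.1f} s (projected)")
```

The projection is only as good as the histogram: if the target graph’s op counts are wrong, the number is wrong with them.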
Rules for citation
- Always attach the tag. “CKKS on H100 runs the production matcher in 257 s (measured, parity FAILED)” is OK. “CKKS on H100 is 257 s” is not.
- Any measured-with-caveat number must state the failure mode inline. The 257 s H100 CKKS run measured graph throughput, not end-to-end correctness: decryption after the bisection chain diverges from the plaintext shadow by more than tolerance. Every use of the number must acknowledge that. See Why CKKS fails at scale.
- Next-generation-backend numbers are always prefixed expected. The production-target FHE backend is pre-alpha. Any figure for it is a target derived from public primitive costs, not a measurement.
- Compare same hardware. “TFHE on CPU at 731 ms vs CKKS on H100 at 257 s” is an apples-to-oranges comparison. CPU vs CPU or GPU vs GPU only.
- Name the precision tier. The match circuit exists at several precision tiers that differ in op count. Numbers measured at one tier are not interchangeable with another.
- Logged or it didn’t happen. Every measured and measured-with-caveat number is backed by a committed run log. “It was 7.77 s on H100” without a log reference is not a defensible claim.
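The tag-attachment rule is mechanical enough to lint. A minimal sketch, assuming claims are plain strings; only the tag names come from this page, the regex and function are illustrative:

```python
import re

# The four citation tags from this page ("expected" is the mandatory
# prefix for extrapolated next-gen-backend numbers).
TAGS = ("measured", "measured-with-caveat", "projected", "expected")

def tagged(claim: str) -> bool:
    """True if a timing claim both states a number and names a status tag."""
    has_number = re.search(r"\d+(\.\d+)?\s*(s|ms)\b", claim)
    has_tag = any(tag in claim for tag in TAGS)
    return bool(has_number) and has_tag
```

Under this check, “CKKS on H100 runs the production matcher in 257 s (measured, parity FAILED)” passes, while the bare “CKKS on H100 is 257 s” fails.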
Why this discipline matters
Three things go wrong in FHE benchmarking when the discipline isn’t enforced:
- Graph-swap smuggling: a number measured on a small development circuit gets cited as the current-state number for a much larger production circuit. The kind of mistake that produces an order-of-magnitude over-projection.
- Correctness laundering: GPU wall-clocks get reported without the decryption round-trip. A throughput number that doesn’t pass parity is not the same as a correctness-preserving match wall-clock; the word “measured” without qualification hides that.
- Extrapolation without provenance: “the next-gen scheme will be under 1 second” with no path to how. Defensible extrapolations are op-count × per-primitive cost × overhead. An undefended claim isn’t a number.
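The defensible form of an extrapolation is a visible arithmetic chain. A hypothetical example with made-up constants (none of these figures are real measurements or real literature values):

```python
# All three inputs are assumptions, stated so the claim can be attacked.
published_bootstrap_ms = 7.0   # hypothetical literature constant
bootstraps_in_graph = 96       # hypothetical op count for the target circuit
overhead_factor = 1.5          # allowance for scheduling and data movement

expected_s = published_bootstrap_ms * bootstraps_in_graph * overhead_factor / 1000
print(f"expected {expected_s:.2f} s")  # cite as "expected", never "measured"
```

Anyone who disputes the result can point at exactly which input they dispute, which is the whole point of showing the chain.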
Active gaps
| Cell | How to fill | Cost |
|---|---|---|
| CKKS on H100 (alternative library) | run on H100 pod | small pod cost |
| Next-gen FHE end-to-end | wait for production SDK | blocked upstream |
| Improved scheduler | move from O(N²) pairwise to O(N log N) | engineering |
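The scheduler row describes an asymptotic change, not an FHE-specific one. A toy illustration of the shape of that change, assuming candidate pairs are generated from a single comparable key (the key, the tolerance, and both functions are hypothetical):

```python
def pairs_quadratic(keys: list[int], tol: int) -> int:
    """O(N^2): compare every pair of keys directly."""
    n, count = len(keys), 0
    for i in range(n):
        for j in range(i + 1, n):
            if abs(keys[i] - keys[j]) <= tol:
                count += 1
    return count

def pairs_sorted(keys: list[int], tol: int) -> int:
    """O(N log N): sort once, then slide a window over the sorted keys."""
    s = sorted(keys)
    count, lo = 0, 0
    for hi in range(len(s)):
        while s[hi] - s[lo] > tol:
            lo += 1
        count += hi - lo  # every key in the window pairs with s[hi]
    return count
```

Both functions count the same pairs; only the second scales to large N, which is what the table row is asking for.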