CONFIDENTIAL & PROPRIETARY © 2026 Inkwell Finance, Inc. All Rights Reserved. This document is for informational purposes only and does not constitute legal, tax, or investment advice, nor an offer to sell or a solicitation to buy any security or other financial instrument. Any examples, structures, or flows described here are design intent only and may change.
Every number on this page is tagged with a status (measured, measured-with-caveat, projected, or extrapolated-from-SOTA) and backed by a committed log. No figure on this page is an opaque estimate. For the methodology that defines those tags, see Methodology.
Headline
N=32 match cycle · H100
7.77 s wall-clock on an NVIDIA H100 SXM, with every output
decrypted and asserted against the plaintext expected value.
Correctness-preserving per-op programmable bootstrap.
Atomic homomorphic min · H100
15.1 ms mean (== p50) over 20 iterations. 20/20
decryptions match the plaintext answer. The one-comparison floor
— every higher-level op composes from it.
The headline numbers were measured on a fresh H100 SXM cloud pod.
Inter-pod variance on this hardware is typically in the low double
digits of percent; the 7.77 s figure is a stable reference within
that envelope.
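The mean and p50 quoted for the atomic min are summary statistics over per-iteration latencies. A minimal sketch of how such a summary could be produced is shown below. It is not the committed harness: `bench` and `summarize` are hypothetical names, and a plaintext `min` stands in for the homomorphic operation so the example runs without a GPU. Verification happens outside the timed region, in line with the page's rule that checking results must not inflate the measurement.

```python
import statistics
import time
from typing import Callable, List, Tuple


def summarize(samples_ms: List[float]) -> Tuple[float, float]:
    """Return (mean, p50) of per-iteration latencies in milliseconds."""
    return statistics.mean(samples_ms), statistics.median(samples_ms)


def bench(op: Callable[[], int], expected: int,
          iterations: int = 20) -> Tuple[float, float, int]:
    """Time `op` for `iterations` runs and count results equal to `expected`."""
    samples_ms: List[float] = []
    verified = 0
    for _ in range(iterations):
        start = time.perf_counter()
        result = op()
        # Stop the clock before checking the result.
        samples_ms.append((time.perf_counter() - start) * 1e3)
        if result == expected:
            verified += 1
    mean_ms, p50_ms = summarize(samples_ms)
    return mean_ms, p50_ms, verified


# Plaintext stand-in (assumption): a real run would call the FHE backend here.
mean_ms, p50_ms, verified = bench(lambda: min(7, 3), expected=3)
print(f"mean={mean_ms:.4f} ms  p50={p50_ms:.4f} ms  verified={verified}/20")
```

A mean that coincides with the median, as reported for the atomic min, suggests the samples are not skewed by a few slow outlier iterations.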
Programmable-bootstrap headlines
The match cycle uses a programmable-bootstrap FHE backend: every homomorphic primitive folds one bootstrap, so chain depth is not a correctness constraint. Headline figures on the production target (H100 SXM):

| workload | wall-clock | verification |
|---|---|---|
| Atomic min(u32, u32) | 15.1 ms mean | 20/20 outputs verified |
| N=32 match cycle | 7.77 s | every output decrypted and asserted |

Verification decrypt runs after cudaStreamSynchronize; its
wall-clock is reported separately from the match timer. Decrypt
overhead is well under one percent of the match wall-clock.
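The timing discipline described here (stop the match timer, then decrypt and assert, and report the two wall-clocks separately) can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's harness: `run_verified`, `RunReport`, and the XOR-based toy "encryption" are hypothetical stand-ins, and a real GPU run would synchronize the device before reading the match timer.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class RunReport:
    match_s: float        # wall-clock of the match phase only
    decrypt_s: float      # wall-clock of decryption, reported separately
    verified: int         # outputs that matched the plaintext expectation
    expected_count: int   # outputs that should have been checked

    @property
    def decrypt_overhead(self) -> float:
        """Decrypt time as a fraction of match time."""
        return self.decrypt_s / self.match_s if self.match_s > 0 else float("inf")


def run_verified(match: Callable[[], List[int]],
                 decrypt: Callable[[int], int],
                 expected: Sequence[int]) -> RunReport:
    t0 = time.perf_counter()
    ciphertexts = match()                 # assumed to block until work is done
    match_s = time.perf_counter() - t0    # match timer stops here

    t1 = time.perf_counter()
    outputs = [decrypt(c) for c in ciphertexts]
    decrypt_s = time.perf_counter() - t1  # decrypt is timed on its own clock

    verified = 0
    for got, want in zip(outputs, expected):
        assert got == want, f"decrypted {got}, expected {want}"
        verified += 1
    # Final invariant: zip() truncates silently, so confirm nothing was skipped.
    assert verified == len(outputs) == len(expected), "silent undercount"
    return RunReport(match_s, decrypt_s, verified, len(expected))


# Toy stand-ins (assumption): XOR with a fixed key plays the role of encryption.
KEY = 0x5A5A
toy_match = lambda: [v ^ KEY for v in (3, 1, 4, 1)]
toy_decrypt = lambda c: c ^ KEY

report = run_verified(toy_match, toy_decrypt, expected=[3, 1, 4, 1])
print(f"verified {report.verified}/{report.expected_count}, "
      f"decrypt overhead {report.decrypt_overhead:.2%}")
```

Keeping the two timers apart lets a reviewer confirm from the report alone that verification cost is not hidden inside, or subtracted from, the headline figure.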
CKKS reference (parity-fail exhibit)
A leveled CKKS path on the same hardware was run as a throughput reference. At the bootstrap chain depth this matcher needs, CKKS does not preserve correctness: the wall-clock measurement completes but the decrypted answer diverges from the plaintext shadow beyond tolerance. We cite this number only with the failure mode stated inline; see Why CKKS fails at scale for the structural reason.
Next-generation FHE (projected, pre-alpha)
The production REFHE substrate is expected to deliver a substantial per-primitive speedup over current TFHE-style benchmarks once mainnet alpha lands. Concrete margins will be measured at that point. Any pre-mainnet figure for this backend is treated as extrapolated-from-SOTA under the
methodology discipline: it is derived
from published primitive cost, not measurement.
Verification scope
Every measured TFHE-style number on this page decrypts every output after the match timer stops and asserts equality against the plaintext expected value. A final invariant assertion in each harness guards against silent undercounts. Decrypt wall-clock is reported separately so reviewers can confirm verification runs outside the match timer.
Related
Methodology
The four status tags (measured / measured-with-caveat / projected /
extrapolated-from-SOTA) and the rules for citing each.
Why CKKS fails at scale
What the CKKS H100 parity failure means, the absolute bootstrap
noise floor, and why programmable-bootstrap is the
correctness-preserving path.
SIMD amortization
How packing takes single-batch latency down to per-batch
amortized cost on a saturated GPU.
Interactive scaling dashboard ↗
Walk through the headline measured anchors interactively.