Fail-closed by default: what a trust gate owes you when it breaks

2026-06-24 · Engineering · fail-closed · trust-layer · transparency

A trust gate has one failure mode that matters more than all the others: the one that looks like success.

If a risk check errors and your code sees an error, that's recoverable — you retry, or you stop. But if a risk check breaks and your code sees allow, you sign against a token nobody actually screened, with confidence the response implied but never earned. For a service whose whole job is to be the thing you trust before you act, a silent allow isn't a bug. It's the bug.

Two properties are supposed to make that impossible at PaladinFi, and this post is about both — including the bugs we had to fix to make the first one true:

An enforced fail-closed contract. A source that can't be reached returns a visible unreachable marker, never silence. If no source succeeds, the verdict is warn, never allow. This is covered by a dedicated contract test suite and observable in /health.
Signed verdicts. Every free OFAC response is Ed25519-signed, so your agent can verify the verdict it acted on actually came from us and wasn't altered in transit.

We're a small team and we publish this kind of thing because, for a trust vendor, the part worth trusting isn't a clean history — it's the contract that holds when things break, and the evidence you can check yourself.

Where the discipline came from

The contract didn't start as a contract. It started as a scare.

In an earlier release we found a calldata-validation gap on the swap path — the kind of bug where, in the wrong conditions, a returned transaction could have done something other than the swap you asked for. No funds loss has ever been tied to it — no customer report, external incident, or forensic evidence — but it was latent across several releases, and it slipped past single-reviewer checks each time. We wrote it up in full at Calldata defense-in-depth.

What mattered more than the fix was the process change it forced: we moved from one reviewer to a three-adversary review, with instructions to treat the exercise as an audit rather than a code review and to name the funds-loss vector explicitly. That harder lens is what then surfaced the failure mode this post is really about.

The failure that looks like a pass

With the audit lens on, we pointed it at the function that composes an OFAC wallet screen, contract-anomaly checks, and third-party scam intel into one allow / warn / block verdict. Each source ran in its own try/except so one source failing wouldn't take down the whole response.

The gap: when a source raised, its factor was silently dropped. And when every source raised, the verdict step had nothing to work with and returned the default — allow, risk_score: 0 — on a token whose risk we had never actually evaluated. There was a quieter version too: if just one source failed, its factor vanished, and you couldn't tell "this source checked the token and it's clean" from "this source was down and we said nothing."

It was found, pointedly, while reviewing a homepage rewrite: we wanted to put the words "fail-closed, not silent-allow" on the site, and the review gate blocked the claim — the code didn't match the words yet. So we fixed the code before the words went live. (We wrote up that episode at Operational posture.)

The fix turned fail-closed into a contract, documented at Trust-block fail-closed contract:

A source that fails emits an explicit factor — signal: "unreachable", real: false — so silence is never a valid answer from any source.
If no source succeeds, the verdict is forced to warn, never allow. The check is a per-source success boolean, not string-matching on factor names (which could drift).
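
The two rules above can be sketched in a few lines of Python. This is an illustration of the contract's shape, not PaladinFi's actual code: the Factor class, the compose_verdict function, the per-factor verdict field, and the "worst real factor wins" rule for partial outages are all assumptions made for the example. Only the unreachable factor (signal: "unreachable", real: false), the per-source success boolean, and the forced warn come from the contract as described.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical ordering used to pick the most severe verdict.
RANK = {"allow": 0, "warn": 1, "block": 2}


@dataclass
class Factor:
    source: str
    signal: str
    real: bool              # False: the source never actually evaluated the input
    verdict: str = "allow"  # hypothetical: what this factor argues for


def compose_verdict(
    sources: Dict[str, Callable[[str], List[Factor]]], subject: str
) -> dict:
    factors: List[Factor] = []
    succeeded: Dict[str, bool] = {}  # per-source success boolean, not name matching

    for name, check in sources.items():
        try:
            factors.extend(check(subject))
            succeeded[name] = True
        except Exception:
            # A failing source is never silent: it emits an explicit factor.
            factors.append(Factor(source=name, signal="unreachable", real=False))
            succeeded[name] = False

    if not any(succeeded.values()):
        # Nothing was evaluated, so the verdict is forced to warn, never allow.
        verdict = "warn"
    else:
        # Assumption for this sketch: the most severe real factor wins.
        verdict = max(
            (f.verdict for f in factors if f.real),
            key=RANK.__getitem__,
            default="allow",
        )
    return {"verdict": verdict, "factors": factors}
```

With every source raising, the result is warn plus one unreachable factor per source. With one source down and another reporting clean, the verdict can still be allow, but the outage stays visible in the factor list instead of vanishing.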

Making it generalize, not just patch

A fix that only closes the one spot you found isn't a contract — it's a patch. So we kept walking it down. The next release pushed the same unreachable discipline one layer deeper, into the sub-checks inside the scam-intel module, which had their own silent "return nothing on failure" path.

And then the instructive one: the bug reappeared in our own published tutorial. The copy-paste React hook we ship for the free endpoint mapped the verdict as recommendation === "block" ? "block" : "allow" — which forgets the third verdict. On a source outage the endpoint returns 200 with warn (the contract working), and the hook mapped that warn straight to allow. The silent-allow we'd spent releases eliminating on the server had walked into our own client example. We caught it in review before a developer shipped it, and fixed it the same day to handle allow / warn / block explicitly and fail closed on anything unrecognized.
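
The shape of that client-side fix, sketched in Python rather than as the React hook itself: every known verdict is handled by name, and anything else is treated as the most restrictive outcome. The function names are hypothetical, and mapping unrecognized values to block is an assumption about what "fail closed" means here. The shipped hook may choose differently.

```python
KNOWN_VERDICTS = ("allow", "warn", "block")


def map_recommendation(recommendation: object) -> str:
    """Map the server's recommendation to a client verdict, failing closed."""
    if isinstance(recommendation, str) and recommendation in KNOWN_VERDICTS:
        return recommendation
    # Missing, misspelled, or future values never fall through to allow.
    return "block"


def may_sign(recommendation: object) -> bool:
    """Only an explicit allow lets the caller proceed without review."""
    return map_recommendation(recommendation) == "allow"
```

The buggy ternary made allow the default branch. Here allow has to be matched exactly, so a warn returned during a source outage stays a warn on the client.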

That's the honest version of "generalized": the contract is now enforced by a dedicated fail-closed test suite, so the guarantee is test-enforced rather than reviewer-enforced. That is the direct lesson from a bug that slipped past single reviewers for several releases. The broader enforcement posture is observable in /health, whose selector_enforcement block reports mode: enforce (the calldata-path defense from the earlier scare, now running in enforce mode rather than warn-only). The tutorial slip wasn't caught by luck: we now know exactly what shape this bug takes and review for it everywhere, including our own docs.
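
If you want to check that posture from your own code, a small guard over the parsed /health body might look like the sketch below. The payload shape is an assumption inferred from the description above (a selector_enforcement object with a mode field), and the function name is hypothetical. The helper itself fails closed: a missing or malformed block counts as "not enforcing".

```python
from collections.abc import Mapping
from typing import Any


def selector_enforcement_is_enforcing(health: Mapping[str, Any]) -> bool:
    """Return True only if /health explicitly reports mode == "enforce"."""
    block = health.get("selector_enforcement")
    if not isinstance(block, Mapping):
        # Absent or wrong-typed block: treat as not enforcing.
        return False
    return block.get("mode") == "enforce"
```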

Why this beats calling a single source directly

Here's the part most vendor posts won't say: on a freshly-launched Base token, PaladinFi's detection signal overlaps heavily with what a good single-source tool already surfaces. We do not claim a secret detection edge.

What single-source tools generally don't give you is failure semantics. Call most token-risk APIs during an upstream degradation and you get 200 with an empty result — which your code, reasonably, reads as "clean." That's a silent-allow you inherit, in your own integration, exactly like the one in our tutorial. PaladinFi's contract makes the difference explicit: "we couldn't check this" comes back as warn with an unreachable factor, not as a clean pass. And the verdict is signed, so an agent can prove what it acted on. That — composition with honest failure semantics, plus a verifiable verdict — is the thing you'd otherwise have to build and maintain yourself.

We're also precise about scope: the free endpoint is a wallet-address OFAC screen, not a token-contract sanctions check. Every response says so, in the _scope field on its trust object. Saying exactly what a check does and doesn't do is part of failing closed.

Try it

The token-risk check this whole post is about — OFAC + GoPlus + Etherscan + anomaly, composed under the fail-closed contract — is /v1/trust-check, behind a free API key you can get in about two minutes at paladinfi.com/signup.

If you want a zero-friction, no-signup taste first, the anonymous wallet-OFAC screen is a single POST (real-data, rate-limited):

POST https://swap.paladinfi.com/v1/trust-check/ofac
{ "address": "0x...", "chainId": 8453 }
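
The same request using only the Python standard library, as a minimal sketch. The URL and the address / chainId body fields come from the snippet above. The function names, the Content-Type header, and the timeout are assumptions, and the response is returned as parsed JSON without assuming any particular fields.

```python
import json
import urllib.request

OFAC_URL = "https://swap.paladinfi.com/v1/trust-check/ofac"


def build_ofac_request(address: str, chain_id: int = 8453) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"address": address, "chainId": chain_id}).encode("utf-8")
    return urllib.request.Request(
        OFAC_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed, not documented here
    )


def screen_wallet(address: str, chain_id: int = 8453, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body.

    Network and HTTP errors propagate as exceptions. Callers should treat
    any exception as "not screened", never as a pass.
    """
    request = build_ofac_request(address, chain_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the rate-limited endpoint.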

Either way, if you want to verify a verdict came from us unaltered before acting on it, the Ed25519 recipe — pinned key, copy-paste Python and JS — is at Verify a PaladinFi response.

The claim here isn't that we never break. It's that when we do, we break visible and closed — and you can check that yourself, per response, signed.

Published: 2026-06-24 · Next quarterly piece (Q3): how the three-adversary review process is structured.