Customer Service is the GPN CTF 2026 misc challenge that builds an LCF-style higher-order-logic proof assistant from holpy, pretends to forbid axioms, and gets its guard wrong three ways. The exploit is a three-item JSON payload: declare a false constant of our own, axiomatize it, ship a one-line proof. The flag spells the lesson:
GPNCTF{Ex-Un4-Line4-VACu4-sEqui7ur-QuOdl1b3t}
Ex contradictione quodlibet — “from a contradiction, anything follows.” The flag’s mangled Latin reads ex una linea vacua sequitur quodlibet — “from one empty line, anything follows” — naming the lone allowed axiom that breaks soundness.
This is the standalone deep-dive on misc/customer-service from the GPN CTF 2026 master writeup. Full source at misc/customer-service.
The oracle
checker.py, ~115 lines, in essence:
hex_proof = input("give me your hex proof")
data = json.loads(bytes.fromhex(hex_proof).decode("utf-8"))
with theory.fresh_theory():
for imp in data.get("imports", []):
basic.load_theory(imp)
for raw in data.get("content", []):
item = items.parse_item(raw)
if item.error: continue
if item.ty == "thm":
result = monitor.check_proof(item, rewrite=False)
if result["status"] in ["OK", "ProofOK"]:
exts = item.get_extension()
report = theory.thy.checked_extend(exts)
if len(report.get_axioms()) > 1: sys.exit(1)
elif report.get_axioms() == 1 and item.ty != "thm": sys.exit(1)
thm = theory.thy.get_theorem(item.name)
if theorem_proves_false_unconditioned(thm):
win() # prints the flag
sys.exit(0)
else: sys.exit(1)
else:
exts = item.get_extension()
report = theory.thy.checked_extend(exts)
if len(report.get_axioms()) > 1: sys.exit(1)
elif report.get_axioms() == 1 and item.ty != "thm": sys.exit(1)
The win check:
def theorem_proves_false_unconditioned(thm):
concl_str = str(thm.concl).strip().lower()
is_false = (thm.concl.is_const("false") or
concl_str == "false" or concl_str == "?false")
return is_false and len(thm.assums) == 0 and len(thm.hyps) == 0
The author’s source comments include # HELP WTF IS THIS IG and # ig we addd it ?? — they’re telling you they glued this together and aren’t sure why it works. They are correct to be unsure.
Bug 1 — list == 1
if (len(report.get_axioms())) > 1: … # OK, length check
elif report.get_axioms() == 1 and item.ty != "thm": … # dead
ExtensionReport.get_axioms() returns a list of (name, info) tuples. [] == 1, [("x", t)] == 1 — both False. The elif is unreachable. So the real budget is up to one axiom per item, not zero.
Bug 2 — every thm is also an axiom
server/items.py::Theorem(Axiom) inherits get_extension() from Axiom:
def get_extension(self):
res = [extension.Theorem(self.name, Thm(self.prop))] # prf=None!
…
Then kernel/theory.py::checked_extend:
elif ext.is_theorem():
if ext.prf:
self.check_proof(ext.prf)
else: # this branch
ext_report.add_axiom(ext.name, ext.th)
self.add_theorem(ext.name, ext.th)
So even a thm item that just had its proof checked by monitor.check_proof is re-installed as an axiomatic theorem and ticks the axiom counter. The >1 bound is the entire safety net, and it allows exactly this.
Bug 3 — a homemade false
The loader runs in fresh_theory() which calls EmptyTheory():
thy.add_type_sig("bool", 0)
thy.add_type_sig("fun", 2)
thy.add_term_sig("equals", …)
thy.add_term_sig("implies", …)
thy.add_term_sig("all", …)
That’s it. No false, no not, no &. The real false constant lives in library/logic_base.json, which isn’t shipped in the installed wheel — imports: ["logic_base"] errors with No such file or directory: '/challenge/.venv/.../library/'.
theorem_proves_false_unconditioned doesn’t notice. It just calls thm.concl.is_const("false") — true for any Const("false", _). So we declare a false : bool of our own with def.ax.
The exploit
{
"imports": [],
"content": [
{ "ty": "def.ax", "name": "false", "type": "bool" },
{ "ty": "thm.ax", "name": "customer_is_right", "vars": {}, "prop": "false" },
{
"ty": "thm", "name": "pwn", "vars": {}, "prop": "false", "num_gaps": 0,
"proof": [
{ "id": "0", "rule": "theorem", "args": "customer_is_right",
"prevs": [], "th": "⊢ false" }
]
}
]
}
Walking the loop:
| # | Item | len(get_axioms()) after | >1? |
|---|---|---|---|
| 0 | def.ax false : bool | 0 (constants don’t count) | ok |
| 1 | thm.ax customer_is_right : false | 1 (axiom) | ok |
| 2 | thm pwn with theorem rule | 1 (Bug 2) | ok |
For item 2, monitor.check_proof enters the item.proof branch, parses one step, and looks it up by kernel/theory.py:339:
if seq.rule == "theorem":
res_th = self.get_theorem(seq.args) # customer_is_right
res_th is Thm(false) with empty hyps. The advertised seq.th = "⊢ false" matches, status is ProofOK. checked_extend then re-adds pwn as a no-proof theorem (Bug 2), the axiom count is 1 (Bug 1 lets it slide), get_theorem("pwn") returns Thm(false) — concl.is_const("false") ✓, hyps == () ✓, assums == () (an unconditioned false has no implication prefix) ✓.
win() opens ./flag.txt.
Running it
hex=$(python3 -c "print(open('exploit.json').read().encode().hex())")
echo "$hex" | openssl s_client -quiet \
-connect grilled-souffle-beside-roasted-tomato-juss.gpn24.ctf.kitctf.de:443
give me your hex proof✓ Proof check passed
Congratulations! You've found the flag: GPNCTF{Ex-Un4-Line4-VACu4-sEqui7ur-QuOdl1b3t}
Why it was solvable at all
In a real LCF kernel none of this would land. The two real load-bearing mistakes:
- The author’s intent was “you can ship definitions and theorems but not raw axioms” — they tried to enforce that on the report. But
checked_extenditself was designed around the assumption that aTheoremextension with a proof is fine, and a Theorem without a proof is by definition an axiom.Theorem.get_extension()always produces the second flavour — the proof never makes it pastmonitor.check_proof. So everythmaccepted by the monitor is then re-typed as an axiom by the kernel. Fixing thelist == 1typo would also lock out every valid theorem; the real fix is to thread the checked proof into the extension. - The win check uses a string-name comparison instead of comparing the conclusion term to the actual logical
falsein scope. Since the loader runs inEmptyTheory()and thelibrary/directory isn’t shipped, there is no logical false to compare to — any opaqueConst("false", bool)wins.
Defender takeaway
- Name-based equality on logical terms is a recurring bug class in proof-checker glue code.
thm.concl.is_const("false")matches on the constant’s name string, not on whether the term is the logical false defined in the theory. In a fresh theory with no logical false in scope, any opaqueConst("false", _)passes — including ones the user declared. - List-vs-int comparison (
report.get_axioms() == 1) is a Python type-system idiom that should be caught by a linter.mypyorpyrightwith strict mode flagslist == intas a type error. Adopting one in your CI would have caught this bug at PR time. - LCF-style proof kernels are only as strong as the smallest interface around them. holpy itself is well-engineered; the 115-line wrapper that exposes it over a network is the vulnerable surface. Whenever you wrap a trusted kernel in glue code, the glue inherits the kernel’s trust assumptions — and the glue is almost always weaker.
Frequently asked questions
What’s the first bug — list == 1?
ExtensionReport.get_axioms() returns a list. The comparison list == 1 is always False in Python — list and int are not equal under ==. The author wrote the check as a guard against “any axioms” but the comparison shape never fires. The actual guard becomes len(get_axioms()) > 1 — up to one axiom per item is allowed.
What’s the second bug — thm re-axiomatization?
Theorem.get_extension() inherits from Axiom.get_extension and always returns a Theorem extension with prf=None. The kernel’s checked_extend sees prf=None and re-adds the theorem as an axiom, ticking the axiom counter. So even a fully-proved theorem gets axiomatized after acceptance — the budget of one axiom (from Bug 1) covers this re-axiomatization.
What’s the third bug — is_const("false")?
Term.is_const("false") matches on the constant’s name string, not on identity with the logical false. The loader runs in EmptyTheory() which has no false constant in scope. We declare our own false : bool opaque constant via def.ax, and the win check accepts it because the name matches — even though it has nothing to do with logical falsity.
Why is the exploit only three items?
Item 0: declare false : bool (a constant declaration; doesn’t count as an axiom). Item 1: declare an axiom customer_is_right : false (one axiom, fits the budget). Item 2: a thm named pwn with proof rule theorem customer_is_right — checks out under the monitor (the axiom is exactly what’s claimed), gets re-axiomatized by Bug 2 (axiom count still 1, within budget), and wins under Bug 3 (name-equality match on the opaque false constant).
Why is imports: [] necessary?
The real false, not, and, etc. live in logic_base, which isn’t shipped with the installed wheel. Attempting imports: ["logic_base"] errors with No such file or directory. Leaving imports empty runs in EmptyTheory, which has no logical false at all — making Bug 3 exploitable.
How would a real LCF kernel resist this?
Real LCF kernels (HOL Light, Isabelle/HOL, Coq) compare terms by their type-checked structure, not by name. The win check would be thm.concl == constants.false_const, where constants.false_const is the kernel’s authoritative reference. Name shadowing wouldn’t help; declaring false : bool would create a different term that == would reject. Also, no real LCF kernel re-axiomatizes proven theorems.
Where can I find the solver?
Full source at misc/customer-service including exploit.json and solve.sh. Master writeup at /ctf-writeups/gpn-ctf-2026-writeup/.
