TL;DR
Property 2 sorry elimination is in progress. All 5 sorry in Priority_Resolution.thy have been eliminated, and the process exposed 3 latent bugs that were subsequently fixed. Sorry #1 (highest_priority_exists) failed when the design document’s suggested linorder_class.ex_max turned out not to exist in Isabelle/HOL’s standard library; the resolution combined Finite_Set‘s Max_in and Max_ge. Sorry #2–#4 encountered THE operator bound variable name clashes, a class of problem that does not exist in paper mathematics. Sorry #5 (eventual_completion) required well-founded induction via an explicit predicate definition and less_induct. The combined_safety_liveness theorem in DQuencer_Instance.thy and the full 4-file build remain ahead.
What Property 1 Taught Us, What Property 2 Confirmed
In the previous post, we designed and implemented three generic locales in Priority_Resolution.thy and their regulatory instantiation in DQuencer_Instance.thy. All four theorems in deadlock_free_locking were proven, and fair_leader_system‘s pending_monotone and starvation_bound were also complete. Five sorry remained in Priority_Resolution.thy.
| File | Theorem | Sorry Content |
|---|---|---|
| Priority_Resolution.thy | highest_priority_exists | Existence direction |
| Priority_Resolution.thy | select_highest_deterministic | THE uniqueness |
| Priority_Resolution.thy | select_highest_in_set | Set membership |
| Priority_Resolution.thy | select_highest_is_max | Maximality |
| Priority_Resolution.thy | eventual_completion | Well-founded induction |
During Property 1’s sorry elimination, we discovered a pattern: removing sorry is not mere debugging but an opportunity to improve the model structure itself. The redesign of fold to λ during the regulatory_homomorphism proof was a representative case. We expected the same pattern to repeat in Property 2.
That expectation was half right and half wrong. Sorry elimination did produce model improvements, but a larger lesson emerged elsewhere. Understanding the proof tool’s internal mechanics can matter more than proof strategy at critical moments. This observation is the core finding of this sorry elimination cycle.
The Dependency Structure of Five Sorry
The first step was mapping out the dependency relationships among the five sorry. #1 through #4 all reside in the priority_system locale. #1 (highest_priority_exists) is the keystone, and #2–#4 depend on it. #5 (eventual_completion) lives in the fair_leader_system locale and is independent of #1–#4.
Figure 1: Sorry Resolution Dependency Chain and Exposed Bugs
The order was natural: solve #1 first, and #2–#4 follow. #5 can proceed independently. But the actual work did not end with these five. Sorry removal exposed three latent bugs in the existing codebase, cascading one after another.
Sorry #1: highest_priority_exists and the Non-Existent Lemma
This is the foundational theorem of the priority_system locale. It states that in a finite non-empty set with an injective priority function, the element achieving the maximum priority exists uniquely.
lemma highest_priority_exists:
assumes "finite S" and "S ≠ {}"
shows "∃!m. m ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority m)"
The hint document (CLAUDE.md) written during the modeling phase recommended using linorder_class.ex_max. A reasonable assumption: surely the standard library has a lemma stating “a finite set has a maximum element.”
No such lemma exists.
linorder_class.ex_max is nowhere in Isabelle’s standard library. This is a textbook case of a modeling-phase conjecture breaking at the proving phase. Writing “use this lemma” in a design document and verifying that the lemma actually exists in the proof tool are entirely different activities. We had a similar experience in Property 1, but found alternatives quickly then. This time, finding the alternative took longer.

linorder_class.ex_max existed nowhere in the standard library. The path forward required searching Finite_Set for Max_in and Max_ge.
The approach shifted. After searching the Finite_Set library, we found Max_in and Max_ge. The combination of these two lemmas became the key.
Figure 2: Sorry #1 Proof Structure, Image → Max → Preimage Strategy
The strategy works as follows. Apply Max to the image set priority ` S. Max_in guarantees that Max is a member of the image set for any finite non-empty set, and Max_ge guarantees that every element is at most Max. Recovering the preimage of the image set’s Max yields the existence proof. Uniqueness follows from injectivity: if two elements both achieve maximum priority, antisymmetry implies equal priority, and priority_injective implies the elements themselves are equal.
proof -
have fin_img: "finite (priority ` S)" using assms(1) by simp
have ne_img: "priority ` S ≠ {}" using assms(2) by simp
obtain m where m_in: "m ∈ S" and m_pri: "priority m = Max (priority ` S)"
using Max_in[OF fin_img ne_img] by auto
have m_max: "∀m' ∈ S. priority m' ≤ priority m"
proof
fix m' assume "m' ∈ S"
hence "priority m' ∈ priority ` S" by simp
hence "priority m' ≤ Max (priority ` S)" using Max_ge[OF fin_img] by simp
thus "priority m' ≤ priority m" using m_pri by simp
qed
show ?thesis
proof (rule ex1I[of _ m])
show "m ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority m)"
using m_in m_max by auto
next
fix m2 assume "m2 ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority m2)"
then have m2_in: "m2 ∈ S" and m2_max: "∀m' ∈ S. priority m' ≤ priority m2" by auto
from m_max m2_in have "priority m2 ≤ priority m" by auto
from m2_max m_in have "priority m ≤ priority m2" by auto
then have "priority m = priority m2" using ‹priority m2 ≤ priority m› by auto
then show "m2 = m" using priority_injective by auto
qed
qed
The notable technique here is the use of ex1I. To prove ∃!x. P x, one must show that witness m satisfies P m (existence), and that any m2 also satisfying P m2 implies m2 = m (uniqueness). Existence uses the Max_in + Max_ge combination; uniqueness uses priority_injective. The core idea, applying Max to the image set and recovering the preimage, is the axis of the entire proof.
Sorry #2–#4: THE Operator and Variable Name Collisions
With #1 resolved, the path to #2–#4 opened. All three theorems prove properties of the select_highest function.
definition select_highest :: "'m set ⇒ 'm option" where
"select_highest S =
(if S = {} then None
else Some (THE m. m ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority m)))"
THE is Isabelle/HOL’s definite description operator. Once ∃!x. P x is proven, THE x. P x returns that unique x. Since #1 proved ∃!, the use of THE is justified.
The first attempt on #2 (select_highest_deterministic) failed. The cause was unexpectedly simple. Inside select_highest_def, the THE m. ... expression’s bound variable m collided with the free variable m in the theorem statement. Isabelle’s parser could not distinguish the two, producing an internal syntax error.
This is a problem that does not exist in paper mathematics. On paper, context disambiguates bound and free variables. Isabelle’s parser, however, identifies variables by name, so identical names at different binding levels cause confusion.
The solution for #2 was to avoid referencing THE‘s internal structure entirely. If S ≠ {}, then select_highest S necessarily returns some Some v. Uniqueness follows from Some‘s injectivity.
theorem select_highest_deterministic:
assumes fin: "finite S" and ne: "S ≠ {}"
shows "∃!m. select_highest S = Some m"
proof -
from ne obtain v where hv: "select_highest S = Some v"
unfolding select_highest_def by simp
show ?thesis
proof (rule ex1I[of _ v])
show "select_highest S = Some v" by (rule hv)
next
fix w assume "select_highest S = Some w"
with hv show "w = v" by simp
qed
qed
For #3 (select_highest_in_set) and #4 (select_highest_is_max), a different workaround was used: renaming THE m‘s bound variable to x to avoid collision with the free variable m. Applying theI'[OF uniq] extracts the property satisfied by THE x. P x, unfolding select_highest_def yields the equation m = THE x. ..., and substitution delivers the conclusion.
theorem select_highest_in_set:
assumes fin: "finite S" and sel: "select_highest S = Some m"
shows "m ∈ S"
proof -
from sel have ne: "S ≠ {}"
unfolding select_highest_def by (simp split: if_splits)
from highest_priority_exists[OF fin ne] have uniq:
"∃!x. x ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority x)" .
from theI'[OF uniq] have the_in:
"(THE x. x ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority x)) ∈ S" by auto
from sel ne have "m = (THE x. x ∈ S ∧ (∀m' ∈ S. priority m' ≤ priority x))"
unfolding select_highest_def by simp
with the_in show ?thesis by simp
qed
#4 (select_highest_is_max) follows the exact same pattern as #3. theI'[OF uniq] extracts the maximality condition, and the m = THE x. ... equation completes the substitution. Once the pattern is discovered, structurally identical problems resolve mechanically.
The lesson from these three sorry is that the bottleneck was not proof strategy but understanding the proof tool’s parsing rules. Mathematically identical proofs diverged between success and failure depending on how Isabelle’s parser handles variable names.
Sorry #5: eventual_completion and Well-Founded Induction
The final theorem of the fair_leader_system locale. It states that all pending requests are eventually processed.
theorem eventual_completion: shows "∃e_final. pending e_final = 0"
The core idea is straightforward. pending is a natural number, and starvation_bound guarantees periodic strict decrease. Natural numbers cannot decrease indefinitely, so pending must reach 0. Mathematically, this is well-founded induction.
The first attempt applied nat_less_induct and measure_induct_rule directly, but failed because the induction hypothesis shape did not match Isabelle’s expected rule signature. Isabelle’s induction rules expect goals in specific forms, and freely written propositions often do not match.
The solution was an explicit predicate definition followed by less_induct.
define P where "P n ≡ ∀epoch. pending epoch = n → (∃e_final. pending e_final = 0)" for n :: nat
P n is defined as “if pending equals n at some epoch, then pending eventually reaches 0.” less_induct provides strong induction: assume ∀m < n. P m, prove P n.
Figure 3: Sorry #5, Well-Founded Induction Structure
Base case (n = 0): if pending epoch = 0, then e_final = epoch. Trivial.
Inductive step (n > 0): since pending epoch > 0, starvation_bound yields some epoch e where pending (Suc e) < pending e. By pending_monotone, pending (Suc e) < n, enabling the induction hypothesis P (pending (Suc e)).
proof -
define P where
"P n ≡ ∀epoch. pending epoch = n → (∃e_final. pending e_final = 0)"
for n :: nat
have all_P: "⋀n. P n"
proof -
fix n show "P n"
unfolding P_def
proof (induction n rule: less_induct)
case (less n)
show "∀epoch. pending epoch = n → (∃e_final. pending e_final = 0)"
proof (intro allI impI)
fix epoch assume eq: "pending epoch = n"
show "∃e_final. pending e_final = 0"
proof (cases "n = 0")
case True thus ?thesis using eq by auto
next
case False
have pos: "pending epoch > 0" using eq False by auto
from starvation_bound[OF pos] obtain e where
e_ge: "epoch ≤ e" and e_dec: "pending (Suc e) < pending e"
by auto
have lt: "pending (Suc e) < n"
using pending_monotone[OF e_ge] eq e_dec by simp
from less.IH[OF lt] show ?thesis
unfolding P_def by blast
qed
qed
qed
qed
from all_P[of "pending 0"] show ?thesis
unfolding P_def by blast
qed
The interaction between starvation_bound and pending_monotone is what makes this work. starvation_bound provides “when the decrease happens.” pending_monotone guarantees “no increase in between.” Together, they construct the decreasing measure for well-founded induction. Since pending_monotone ensures pending e ≤ pending epoch, the chain pending (Suc e) < pending e ≤ pending epoch = n holds, making the induction hypothesis applicable.
The final line from all_P[of "pending 0"] says that since P holds for all natural numbers, it holds for pending 0 in particular. From this, ∃e_final. pending e_final = 0 follows.
Solving Sorry Exposed the Bugs
All five sorry were eliminated successfully. But as the sorry disappeared, three latent bugs surfaced in the existing code. When sorry is present, Isabelle accepts the theorem unconditionally, masking errors in code that depends on it. Removing sorry triggers a cascade of newly exposed failures. We experienced this same pattern in Property 1.
pending_monotone: Reversed calc Chain
In the fair_leader_system locale, the pending_monotone helper lemma had a reversed calc chain. The inc_induct rule’s case (step e) produces an induction hypothesis of the form pending e2 ≤ pending (Suc e). The original code attempted pending (Suc e) ≤ pending e first and pending e ≤ pending e1 second, but the actual goal was pending e2 ≤ pending e.
(* Before: direction reversed *) case (step e) then have "pending (Suc e) ≤ pending e" using non_honest_bounded by auto also have "... ≤ pending e1" using step.IH by auto finally show ?case . (* After *) case (step e) have "pending e2 ≤ pending (Suc e)" by (rule step.IH) also have "pending (Suc e) ≤ pending e" using non_honest_bounded by auto finally show ?case .
Isabelle’s calc blocks compose inequality chains sequentially: a ≤ b, also, b ≤ c, finally yields a ≤ c. When the direction is reversed, finally fails to compose. Mathematically the same inequality; mechanically, calc chain composition is order-sensitive.
starvation_bound: Missing Case Split
Applying honest_progress in the starvation_bound proof requires pending h > 0 as a precondition. After obtaining an honest leader epoch h from fair_leader, the question is whether pending h is still positive.
pending_monotone only provides pending h ≤ pending epoch. From pending epoch > 0, we cannot directly conclude pending h > 0; pending may have dropped to 0 between epoch and h.
The fix was a case split on pending h > 0 versus pending h = 0. In the positive case, honest_progress applies directly. In the zero case, pending went from positive (at epoch) to zero (at h), so a strict decrease must have occurred somewhere in between. Finding that point requires induction on n = h - epoch. This case split is mathematically obvious but expands to roughly 20 lines of Isabelle code. The omission was hidden while sorry was in place.
lock_eventually_expires: Wrong Witness
In deadlock_free_locking, the original witness for lock_eventually_expires was lock_time + timeout. The goal requires ∃t'. t' ≥ current_time ∧ ¬ lock_effective lock_time t', but there is no guarantee that lock_time + timeout ≥ current_time.
Concretely: if lock_time = 0 and timeout = 1, then with current_time = 100, the witness 1 fails 1 ≥ 100. The lock expired in the past, but the witness does not satisfy “at or after the current time.”
The fix changed the witness to lock_time + timeout + current_time. This value is always ≥ current_time and always ≥ lock_time + timeout, satisfying both conditions simultaneously.
(* Before *)
show "lock_time + timeout ≥ current_time ∧
¬ lock_effective lock_time (lock_time + timeout)"
(* After *)
show "lock_time + timeout + current_time ≥ current_time ∧
¬ lock_effective lock_time (lock_time + timeout + current_time)"
unfolding lock_effective_def by auto
This bug echoes Property 1’s experience. In constructive proofs, witness selection can diverge from mathematical intuition. “The moment a lock expires” is lock_time + timeout, but “a moment that is both after lock expiry and after the current time” requires lock_time + timeout + current_time. Isabelle mechanically checks that the witness satisfies all conditions simultaneously; human shortcuts (“obviously the expiry time is in the future”) do not pass.
Two Axes of Isabelle/HOL Proof Difficulty
With five sorry and three bugs resolved, Priority_Resolution.thy is sorry-free. The combined_safety_liveness theorem in DQuencer_Instance.thy and the full build remain ahead, but an observation has crystallized from the process so far.
Figure 4: Difficulty Taxonomy, Mathematical vs Tooling Difficulty
The distribution of difficulty across sorry elimination was not what we expected.
Easy as expected: the entire deadlock_free_locking proof. As reported in the previous post, timeout-based deadlock freedom is proven directly by witness construction. The single-resource locking structure eliminates circular waits structurally, so complex graph theory was unnecessary.
Harder than expected: interactions with the THE operator. Mathematically, THE is the natural companion of ∃!, but in Isabelle, working with THE requires correct bound variable naming, careful unfolding strategy, and precise application of the theI' rule. The variable name collision in sorry #2–#4 is a problem that does not exist in pure mathematics.
Most mathematically challenging: sorry #5’s well-founded induction. Defining an explicit predicate P and transforming the goal into a form compatible with less_induct required mathematical insight. However, once P‘s definition was correct, the rest was mechanical.
Consumed time without requiring math: the pending_monotone calc direction error and the lock_eventually_expires witness selection. These were not failures of mathematical understanding but failures of understanding Isabelle’s mechanical composition rules.
The emerging observation is that formal verification difficulty has two axes. One is mathematical difficulty (proof strategy, induction structure, auxiliary lemma selection). The other is tooling difficulty (parser rules, library availability, type inference mechanisms). The latter accumulates only through experience and is nearly impossible to predict from design documents.
We experienced the same pattern in Property 1. The multi_domain_preservation locale instantiation conflict with existing interpretations was not a mathematical problem but an Isabelle locale mechanism problem. With Property 2 repeating this pattern, we are increasingly confident that this is a structural characteristic of formal verification work itself.
Key Isabelle Standard Library Lemmas Used
The following Isabelle standard library lemmas were central to this sorry elimination cycle. This may serve as a reference for researchers attempting similar proofs.
| Lemma | Source | Role |
|---|---|---|
Max_in | Finite_Set | Max of a finite non-empty set is a member |
Max_ge | Finite_Set | Every element ≤ Max |
ex1I | HOL | Proof structure for ∃! |
theI' | HOL | ∃!x. P x ⟹ P (THE x. P x) |
less_induct | Nat | Strong induction on nat |
inc_induct | Nat | Upward induction (e1 ≤ e2) |
What Remains
Priority_Resolution.thy is sorry-free. But the overall verification is not yet complete.
The combined_safety_liveness theorem in DQuencer_Instance.thy remains. This theorem is the junction connecting Property 1’s valid_state_preservation with Property 2’s liveness guarantees. Mathematically, it should be a matter of directly invoking Property 1’s results, so we expect the difficulty to be manageable. But after the linorder_class.ex_max experience in sorry #1, our trust in “expectations” has diminished.
And then the full build. Assembling all four files into a single ROOT session. Even with zero sorry in each individual file, the full build can surface type compatibility issues, import conflicts, and library dependency problems. We encountered unexpected issues with ROOT session configuration in Property 1 as well.
The next post covers the combined_safety_liveness resolution, the full build, and the final results.
References
[1]. Paulson, L. C. (2022). Dealing with Descriptions in Isabelle/HOL: Least, Greatest, Whatever. https://lawrencecpaulson.github.io/2022/06/08/baby-descriptions.html
[2]. Wenzel, M. (2006). Structured Induction Proofs in Isabelle/Isar. LNCS 3863. https://link.springer.com/chapter/10.1007/11812289_3
[3]. Isabelle Standard Library Reference. Finite_Set Theory. https://isabelle.in.tum.de/library/HOL/HOL/Finite_Set.html
Learn More
Building the Formal Model: Priority Resolution and Byzantine Tolerance in Isabelle/HOL





