Modular Verification of Preemptive OS Kernels

Alexey Gotsman
IMDEA Software Institute
Alexey.Gotsman@imdea.org

Hongseok Yang
University of Oxford
Hongseok.Yang@cs.ox.ac.uk

Abstract

Most major OS kernels today run on multiprocessor systems and are preemptive: it is possible for a process running in the kernel mode to get descheduled. Existing modular techniques for verifying concurrent code are not directly applicable in this setting: they rely on scheduling being implemented correctly, and in a preemptive kernel, the correctness of the scheduler is interdependent with the correctness of the code it schedules. This interdependency is even stronger in mainstream kernels, such as Linux, FreeBSD or XNU, where the scheduler and processes interact in complex ways.

We propose the first logic that is able to decompose the verification of preemptive multiprocessor kernel code into verifying the scheduler and the rest of the kernel separately, even in the presence of complex interdependencies between the two components. The logic hides the manipulation of control by the scheduler when reasoning about preemptable code and soundly inherits proof rules from concurrent separation logic to verify it thread-modularly. This is achieved by establishing a novel form of refinement between an operational semantics of the real machine and an axiomatic semantics of OS processes, where the latter assumes an abstract machine with each process executing on a separate virtual CPU. The refinement is local in the sense that the logic focuses only on the relevant state of the kernel while verifying the scheduler. We illustrate the power of our logic by verifying an example scheduler, modelled on the one from Linux 2.6.11.

Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs; D.4.1 [Operating Systems]: Process Management

General Terms Languages, Theory, Verification

Keywords Verification, Concurrency, Scheduling, Modularity

1. Introduction

Developments in formal verification now allow us to consider the full verification of an operating system (OS) kernel, one of the most crucial components in any system today. Several recent projects have demonstrated that formal verification can tackle realistic OS kernels, such as a variant of the L4 microkernel [16] and Microsoft’s Hyper-V hypervisor [3]. Having dealt with relatively small microkernels, these projects nevertheless give us hope that in the future we will be able to verify the likes of kernels from today’s mainstream operating systems, such as Windows and Linux.

In this paper, we tackle one of the main challenges in realising this hope—handling kernel preemption in a multiprocessor system. Most major OS kernels are designed to run with multiple CPUs and are preemptive: it is possible for a process running in the kernel mode to get descheduled. Reasoning about such kernels is difficult for the following reasons.

First of all, in a multiprocessor system several invocations of a system call may be running concurrently in a shared address space, so reasoning about the call needs to consider all possible interactions among them. This is a notoriously difficult problem; however, we now have a number of logics [3–5, 13, 19, 22] that can reason about concurrent code. The way the logics make verification tractable is by using thread-modular reasoning principles that consider every thread of computation in isolation under some assumptions about its environment and thus avoid direct reasoning about all possible interactions.

The problem is that all these logics can verify code only under so-called interleaving semantics, expressed by the well-known operational semantics rule:

\[ \frac{C_k \rightarrow C'_k}{C_1 \parallel \ldots \parallel C_k \parallel \ldots \parallel C_n \rightarrow C_1 \parallel \ldots \parallel C'_k \parallel \ldots \parallel C_n} \]

This rule effectively assumes an abstract machine where every process \( C_k \) has its own CPU, whereas in reality, the processes are multiplexed onto available CPUs by a scheduler. Furthermore, in a preemptive kernel, the scheduler is part of the kernel being verified and its correctness is interdependent with the correctness of the rest of the kernel (which, in the following, we refer to as just the kernel). Thus, what you see in a C implementation of OS system calls and what most logics reason about is not what you execute in such a kernel. When reasoning about a system call implementation in reality, we have to consider the possibility of context-switch code getting executed at almost every program point. Upon a context switch, the state of the system call will be stored in kernel data structures and subsequently loaded for execution again, possibly on a different CPU. A bug in the scheduling code can load an incorrect state of the system call implementation upon a context switch, and a bug in the system call can corrupt the scheduler’s data structures.
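As a rough illustration of the abstract machine that the interleaving rule assumes, the following C sketch models one step of a parallel composition. The `Proc` type and `interleave_step` function are hypothetical names introduced here, not part of the paper's formalism; a process is reduced to a bare program counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process state: only a program counter. */
typedef struct {
    int pc;
} Proc;

/* One step of the interleaving rule: process k advances (C_k -> C'_k),
   every other component of the parallel composition is left unchanged. */
void interleave_step(Proc procs[], size_t n, size_t k) {
    assert(k < n);
    procs[k].pc += 1;
}
```

Note that nothing in this step function saves or restores state: that is exactly the work a real scheduler performs and that the rule abstracts away.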

It is, of course, possible to reason about the kernel together with the scheduler as a whole, using one of the available logics. However, in a mainstream kernel, where kernel preemption is enabled most of the time, such reasoning would quickly become intractable.

In this paper we propose a logic that is able to decompose the verification of safety properties of preemptive OS code into verifying the scheduler and preemptable code separately. This is the first logic that can handle interdependencies between the scheduler and the kernel present in mainstream OS kernels, such as Linux, FreeBSD and XNU. Our logic consists of two proof systems, which we call high-level and low-level. The high-level system verifies preemptable code assuming that the scheduler is implemented correctly (Section 4.3). It hides the complex manipulation of control by the scheduler, which stores program counters of processes (describing their continuations) and jumps to one of them during a context switch. In this way, the high-level proof system provides the illusion that every process has its own virtual CPU: the control moves from one program point in the process code to the next without changing its state. This illusion is justified by verifying the scheduler code separately from the kernel in the low-level proof system (Section 4.4).

A common way to simplify reasoning about program components sharing an address space, such as the scheduler and the kernel, is to introduce the notion of ownership of memory areas: only the component owning an area of memory has the right to access it. The main difficulty of decomposing the verification of the mainstream OS kernels mentioned above lies in the fact that in such kernels there is no static address space separation between data structures owned by the scheduler and the rest of the kernel: the boundary between these changes according to a protocol for transferring the ownership of memory cells and permissions to access them in a certain way. For example, when an implementation of the fork system call asks the scheduler to make a new process runnable, the scheduler usually gains the ownership of the process descriptor provided by the system call implementation. This leads to several technical challenges our logic has to deal with.

First, this setting introduces an obligation to prove that the scheduler and the kernel do not corrupt each other’s data structures. To this end, we base our proof systems on concurrent separation logic [19], which allows us to track the dynamic memory partitioning between the scheduler and the kernel and prohibit memory accesses that cross the partitioning boundary. For example, assertions in the high-level proof system talk only about the memory belonging to the kernel. These assertions are also interpreted as exclusive permissions to schedule the corresponding processes, which allows us to reason about scheduling on multiprocessors. A novel feature of the low-level proof system that allows verifying schedulers separately from the rest of the kernel is its locality: proofs about the scheduler focus only on a small relevant part of the state of processes.

Even though all of the OS verification projects carried out so far had to deal with a scheduler (see Section 7 for a discussion), to our knowledge they have not produced methods for handling practical multiprocessor schedulers with a complicated scheduler/kernel interface. We illustrate the power of our logic by verifying an example scheduler, modelled on the one from Linux 2.6.11 (Sections 2.2 and 5), which exhibits the issues mentioned above.

2. Informal development

We first explain our results informally, sketching the machine we use for formalising them (Section 2.1), illustrating the challenges of reasoning about schedulers by an example (Section 2.2) and describing the approach we take in our program logic (Section 2.3).

2.1 Example machine

To keep the presentation tractable, we formalise our results for a simple machine, defined in Section 3. Here we present it informally to the extent needed for understanding the rest of this section.

We consider a machine with multiple CPUs, identified by integers from 1 to NCPUS, communicating via the shared memory. We assume that the program the machine executes is stored separately from the heap and may not be modified; its commands are identified by labels. For simplicity we also assume that programs can synchronise using a set of built-in locks (in reality they would be implemented as spin-locks). Every CPU has a single interrupt, with its handler located at a distinguished label schedule, which a scheduler can use to trigger a context switch. There are four special-purpose registers, ip, if, ss and sp, and m general-purpose ones, gr1, . . . , grm. The ip register is the instruction pointer. The if register controls interrupts: they are disabled on the corresponding CPU when it is zero and enabled otherwise. As if affects only one CPU, we might have several instances of the scheduler code executing in parallel on different CPUs. Upon an interrupt, the CPU sets if to 0, which prevents nested interrupts. The ss register keeps the starting address of the stack, and sp points to the top of the stack, i.e., its first free slot. The stack grows upwards, so we always have ss ≤ sp.

Since we are primarily interested in interactions of components within an OS kernel, our machine does not make a distinction between the user mode and the kernel mode—all processes can potentially access all available memory and execute all commands.

The machine executes programs in a minimalist assembly-like programming language. It is described in full in Section 3; for now it suffices to say that the language includes standard commands for accessing registers and memory, and the following special ones:

• lock(l) and unlock(l) acquire and release the lock l.
• savecpuid(e) stores the identifier of the CPU executing it at the address e.
• call(l) is a call to the function that starts at the label l. It pushes the label of the next instruction in the program and the values of the general-purpose registers onto the stack, and jumps to the label l. icall(l) behaves the same as call(l), except that it also disables interrupts by modifying the if register.
• ret is the return command. It pops the return label and the saved general-purpose registers off the stack, updates the registers with the new values, and jumps to the return label. iret is a variant of ret that additionally enables interrupts.
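The call and return commands described above can be sketched in C as follows. This is only an illustration under stated assumptions: the `Context` layout, the constants `M` and `HEAP_SIZE`, the `do_*` function names, the order in which the label and registers are laid out on the stack, and the use of a boolean result to signal a crash are all choices made here rather than details taken from the formal semantics. The if register is named `iflag` because `if` is a C keyword.

```c
#include <assert.h>
#include <stdbool.h>

#define M 4            /* number of general-purpose registers (assumed) */
#define HEAP_SIZE 64   /* size of the toy heap (assumed) */

typedef struct {
    int ip;            /* instruction pointer, holding a label */
    int iflag;         /* the "if" register: 0 = interrupts disabled */
    int ss;            /* starting address of the stack */
    int sp;            /* first free stack slot; the stack grows upwards */
    int gr[M];         /* general-purpose registers gr1..grm */
} Context;

int heap[HEAP_SIZE];

/* call(l): push the label of the next instruction and gr1..grm, jump to l.
   Returns false (a crash in the machine) if the stack slots are unavailable. */
bool do_call(Context *r, int target, int next_label) {
    if (r->sp < 0 || r->sp + M >= HEAP_SIZE) return false;
    heap[r->sp] = next_label;
    for (int i = 0; i < M; i++) heap[r->sp + 1 + i] = r->gr[i];
    r->sp += M + 1;
    r->ip = target;
    return true;
}

/* icall(l): like call(l), but also disables interrupts. */
bool do_icall(Context *r, int target, int next_label) {
    if (!do_call(r, target, next_label)) return false;
    r->iflag = 0;
    return true;
}

/* ret: pop the return label and saved registers, jump to the label. */
bool do_ret(Context *r) {
    if (r->sp - (M + 1) < r->ss) return false;  /* nothing to pop */
    r->sp -= M + 1;
    r->ip = heap[r->sp];
    for (int i = 0; i < M; i++) r->gr[i] = heap[r->sp + 1 + i];
    return true;
}

/* iret: like ret, but also enables interrupts. */
bool do_iret(Context *r) {
    if (!do_ret(r)) return false;
    r->iflag = 1;
    return true;
}
```

A call followed by a matching return restores both the general-purpose registers and the stack pointer, which is what lets a scheduler resume a process from a saved stack.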

2.2 Motivating example

Figure 1 presents an implementation of the scheduler we use as a running example. We would like to be able to verify safety properties of OS processes managed by this scheduler using off-the-shelf concurrency logics, i.e., as though every process has its own virtual CPU. The scheduler uses data structures and an interface with the rest of the kernel similar to the ones in Linux 2.6.11 [2]. To concentrate on key issues of scheduler verification, we make some simplifying assumptions: we do not consider virtual memory and assume that processes are never removed and never go to sleep. We have also omitted the code for data structure initialisation.

The scheduler’s interface consists of two functions: schedule and create. The former is called as the interrupt handler or directly by a process and is responsible for switching the process running on the CPU and migrating processes between CPUs. The latter can be called by the kernel implementation of the fork system call and is responsible for inserting a newly created process into the scheduler’s data structures, thereby making it runnable. Both functions are called by processes using the icall command, which disables interrupts; thus, the scheduler routines always execute with interrupts disabled.

Programming language. Even though we formalise our results for a machine executing a minimalistic programming language, we present the example in C. We now explain how a C program, such as the one in Figure 1, is mapped to our machine.

We assume that global variables are allocated at fixed addresses in memory. Local variable declarations allocate local variables on the stack in the activation records of the corresponding procedures; these variables are then addressed via the sp register. When the variables go out of scope, they are removed from the stack by decrementing the sp register. The general-purpose registers are used to store intermediate values while computing complex expressions. We allow the ss and sp registers to be accessed directly as _ss and _sp. Function calls and returns are implemented using the call and ret commands of the machine. By default, parameters and return values are passed via the stack; in particular, a zero-filled slot for a return value is allocated on the stack before calling a function. Parameters of functions annotated with __regparam (such as create) are passed via registers. We assume macros lock, unlock, savecpuid and iret for the corresponding machine commands. We also use some library functions: e.g., remove_node deletes a node from the doubly-linked list it belongs to, and insert_node_after inserts the node given as its second argument after the list node given as its first argument.
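The two library functions named above could look roughly as follows. This is a minimal sketch, assuming a circular doubly-linked list with a dummy head node; the `Node` type and the `init_head` helper are hypothetical, and the paper does not give the functions' actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; a process descriptor would embed prev/next. */
typedef struct Node {
    struct Node *prev, *next;
} Node;

/* An empty circular list consists of a dummy head pointing to itself. */
void init_head(Node *head) {
    head->prev = head;
    head->next = head;
}

/* Insert `node` (second argument) right after `pos` (first argument). */
void insert_node_after(Node *pos, Node *node) {
    node->prev = pos;
    node->next = pos->next;
    pos->next->prev = node;
    pos->next = node;
}

/* Unlink `node` from whatever list it currently belongs to. */
void remove_node(Node *node) {
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = NULL;
    node->next = NULL;
}
```

With a dummy head, neither function needs a special case for an empty list or for the first and last elements.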

Data structures. Every process is associated with a process descriptor of type Process. Its prev and next fields are used by the scheduler to connect descriptors into doubly-linked lists of processes it manages (runqueues). The scheduler uses per-CPU runqueues with dummy head nodes pointed to by the entries in the runqueue array. These are protected by the locks in the runqueue_lock array, meaning that a runqueue can only be accessed with the corresponding lock held. The entries in the current array point to the descriptors of the processes running on the corresponding CPUs; these descriptors are not members of any runqueue. Thus, every process descriptor is either in the current array or in some runqueue. Note that every CPU always has at least one process to run: the one in the corresponding slot of the current array. Every process has its own kernel stack of a fixed size StackSize, represented by the kernel_stack field of its de-

We modelled our scheduler on an older version of the Linux kernel (from 2005) because it uses simpler data structures. Newer versions use more efficient data structures [17] that would only complicate our running example without adding anything interesting.
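Since the scheduler's data structures are described only in prose here, the following C sketch shows one way they might be declared, together with a check of the stated property that a running process is not a member of any runqueue. It is not the code of Figure 1: the `saved_sp` field, the `runqueue_head` storage, the 0-based CPU indexing, the representation of locks as integers and the helper functions are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 2
#define StackSize 128

typedef struct Process {
    struct Process *prev, *next;   /* runqueue links */
    int kernel_stack[StackSize];   /* per-process kernel stack */
    int *saved_sp;                 /* assumed: stack pointer saved on preemption */
} Process;

Process runqueue_head[NCPUS];      /* storage for the dummy head nodes (assumed) */
Process *runqueue[NCPUS];          /* per-CPU runqueues */
Process *current[NCPUS];           /* process running on each CPU */
int runqueue_lock[NCPUS];          /* stand-in for the locks protecting runqueues */

/* Make every runqueue an empty circular list around its dummy head. */
void init_runqueues(void) {
    for (int c = 0; c < NCPUS; c++) {
        runqueue[c] = &runqueue_head[c];
        runqueue[c]->prev = runqueue[c];
        runqueue[c]->next = runqueue[c];
    }
}

/* Is descriptor p linked into the runqueue of the given CPU? */
bool in_runqueue(int cpu, const Process *p) {
    for (const Process *q = runqueue[cpu]->next; q != runqueue[cpu]; q = q->next)
        if (q == p) return true;
    return false;
}

/* Descriptors in the current array must not be members of any runqueue. */
bool current_not_queued(void) {
    for (int c = 0; c < NCPUS; c++)
        for (int q = 0; q < NCPUS; q++)
            if (in_runqueue(q, current[c])) return false;
    return true;
}
```

A check such as `current_not_queued` is a run-time approximation of what the logic establishes statically as part of the scheduler's invariant.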

Figure 1. The example scheduler
Figure 2. The invariant of the stack of a preempted process. [Diagram labels: activation records; ip, gr1, ..., grm, cpu, old_process; saved_sp]

... and locks it together with the current runqueue in the order determined by the corresponding CPU identifiers, to avoid deadlocks. The function then removes one process from the victim runqueue, if it is non-empty, and inserts it into the current one. Note that two concurrent scheduler invocations executing load_balance on different CPUs may access the same runqueue. While verifying the OS, we have to ensure they synchronise their accesses correctly.
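The lock-ordering discipline mentioned above can be sketched as follows. This is a single-threaded toy model, not the paper's load_balance code: the `runqueue_locked` flags stand in for real locks, the function names are hypothetical, and acquiring the lock of the lower CPU identifier first is an assumption (the text only says that the order is determined by the identifiers).

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4

bool runqueue_locked[NCPUS];   /* toy stand-in for the runqueue_lock array */
int last_acquired[2];          /* CPU ids in the order they were last locked */

void lock_rq(int cpu) {
    assert(cpu >= 0 && cpu < NCPUS);
    assert(!runqueue_locked[cpu]);   /* a real lock would block here */
    runqueue_locked[cpu] = true;
}

void unlock_rq(int cpu) {
    assert(cpu >= 0 && cpu < NCPUS);
    assert(runqueue_locked[cpu]);
    runqueue_locked[cpu] = false;
}

/* Lock two distinct runqueues in a fixed global order (lower id first),
   regardless of which one is the caller's own and which is the victim. */
void lock_pair(int a, int b) {
    assert(a != b);
    int first = a < b ? a : b;
    int second = a < b ? b : a;
    lock_rq(first);
    lock_rq(second);
    last_acquired[0] = first;
    last_acquired[1] = second;
}

void unlock_pair(int a, int b) {
    unlock_rq(a);
    unlock_rq(b);
}
```

Because every invocation acquires the two locks in the same global order, two CPUs balancing against each other cannot each hold one lock while waiting for the other.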

The **create** function inserts the descriptor of a newly created process with the address given as its parameter into the runqueue of the current CPU. We pass the parameter via a register, as this simplifies the following treatment of the example. The descriptor must be initialised like that of a preempted process; hence, its stack must satisfy the invariant in Figure 2. To prevent deadlocks, create must be called using icall, which disables interrupts. Upon a call to create, the ownership of the descriptor is transferred from the kernel to the scheduler.

The **fork** function is not part of the scheduler. It illustrates how the rest of the kernel can use create to implement a common system call that creates a clone of the current process. This function allocates a new descriptor, copies the stack of the current process to it and initialises the stack as expected by create (Figure 2). This amounts to discarding the topmost activation record of fork and pushing a fake activation record of schedule (note that the values of registers the new process should start from have been saved on the stack upon the call to fork). Since stack slots for return values are initialised with zeros, this is what fork in the child process will return; we return 1 in the parent process.

**The need for modularity.** We could try to verify the scheduler and the rest of the kernel as a whole, modelling every CPU as a process in one of the existing program logics for concurrency [3–5, 13, 19, 22]. However, in this case our proofs would have to consider the possibility of the control-flow going from any statement in a process to the schedule function, and from there to any other process. Thus, in reasoning about a system call implementation we would end up having to reason explicitly about invariants and actions of both schedule and all other processes, making the reasoning unintuitive and, most likely, intractable. In the rest of the paper we propose a logic that avoids this pitfall.

2.3 Approach

Before presenting our logic in detail, we give an informal overview of the reasoning principles behind it.

**Modular reasoning via memory partitioning.** The first issue we have to deal with while designing the logic is how to verify the scheduler and the kernel separately, despite the fact that they share the same address space. To this end, our logic partitions the memory into two disjoint parts. The memory cells in each of the parts are owned by the corresponding component, meaning that only this component can access them. It is important to note that this partitioning does not exist in the semantics, but is enforced by proofs in the logic to enable modular reasoning about the system. Modular reasoning becomes possible because, while reasoning about one component, one does not have to consider the memory partition owned by the other, since it cannot influence the behaviour of the component. An important feature of our logic, required for handling schedulers from mainstream kernels, is that the memory partitioning is not required to be static: the logic permits ownership transfer of memory cells between the areas owned by the scheduler and the kernel according to an axiomatically defined interface. For example, in reasoning about the scheduler of Section 2.2, the logic permits the transfer of the descriptor for a new process from the kernel to the scheduler at a call to create.
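To make the idea of a dynamic partition concrete, the sketch below tracks an owner for every memory cell and rejects accesses that cross the boundary. It is only an analogy: in the logic the partition is enforced statically by proofs and does not exist at run time, and the `Owner` type, the `owner` map and both functions are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define HEAP_SIZE 16

typedef enum { KERNEL, SCHEDULER } Owner;

int heap[HEAP_SIZE];
Owner owner[HEAP_SIZE];   /* zero-initialised, so every cell starts as KERNEL */

/* A component may write a cell only if it currently owns it. */
bool write_cell(Owner who, int addr, int val) {
    if (addr < 0 || addr >= HEAP_SIZE || owner[addr] != who) return false;
    heap[addr] = val;
    return true;
}

/* Ownership transfer of the cells [start, start + len), e.g. a process
   descriptor handed from the kernel to the scheduler at a call to create. */
bool transfer(Owner from, Owner to, int start, int len) {
    if (start < 0 || len < 0 || start + len > HEAP_SIZE) return false;
    for (int i = 0; i < len; i++)
        if (owner[start + i] != from) return false;
    for (int i = 0; i < len; i++)
        owner[start + i] = to;
    return true;
}
```

After a transfer, the previous owner loses access to the cells, which mirrors the kernel giving up a descriptor once create has been called.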

As we have noted before, our logic consists of two proof systems: the high-level system (Section 4.3) for verifying the kernel and the low-level one for the scheduler (Section 4.4). These proof systems implement a form of assume-guarantee reasoning between the two components, where one component assumes that the other does not touch its memory partition and provides well-formed pieces of memory at ownership transfer points.

**Concurrent separation logic.** We use concurrent separation logic [19] as a basis for modular reasoning within a given component, i.e., either among concurrent OS processes or concurrent scheduler invocations on different CPUs. This choice was guided by the convenience of presentation; see Section 8 for a discussion of how more advanced logics can be integrated. However, the use of a version of separation logic is crucial, because we inherently rely on the *frame property* validated by the logic: the memory that is not mentioned in the assertions in a proof of a command is guaranteed not to be changed by it. While reasoning about a component, we consider only the memory partition belonging to it. Hence, we automatically know that the component cannot modify the others.
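For reference, the frame property is usually packaged in separation logic as the frame rule. The formulation below is the standard textbook one, given here as background rather than as a rule of the proof systems in this paper; \(P\), \(Q\) and \(R\) range over assertions, \(C\) over commands, and \(*\) is the separating conjunction:

\[ \frac{\{P\}\ C\ \{Q\}}{\{P * R\}\ C\ \{Q * R\}} \quad \text{provided } C \text{ does not modify variables occurring free in } R \]

Read with \(R\) describing the memory of the other component, the rule says that a proof about one component carries over unchanged to the full state.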

Concurrent separation logic achieves modular reasoning by further partitioning the memory owned by the component under consideration into disjoint process-local parts (one for each process or scheduler invocation on a given CPU) and lock-protected parts (one for each free lock). A process-local part can only be accessed by the corresponding process or scheduler invocation, and a lock-protected part only when the process holds the lock. The resulting partitioning of the system state is illustrated in Figure 3. The frame property guarantees that a process cannot access the partition of the heap belonging to another one. To reason modularly about parts of the state protected by locks, the logic associates with every lock an assertion, its lock invariant, that describes the part of the state it protects. Lock invariants restrict how processes can change the protected state, and hence, allow reasoning about them in isolation.

**Scheduler-agnostic verification of kernel code.** The high-level proof system (Section 4.3) reasons about preemptable code assuming an abstract machine where every process has its own virtual CPU. It relies on the partitioned view of memory described above to hide the state of the scheduler, with all the remaining state split among processes and locks accessible to them, as illustrated in Figure 4. We have primed process identifiers in the figure to emphasise that the virtual state of the process can be represented differently in the abstract and physical machines: for example, if a process is not running, the values of its local registers can be stored in scheduler-private data structures, rather than in CPU registers.

Apart from hiding the state of the scheduler, the high-level system also hides the complex manipulation of the control-flow performed by it: the proof system assumes that the control moves from one point in the process code to the next without changing its state, ignoring the possibility of the scheduler getting executed upon an interrupt. Explicit calls to the scheduler are treated as if they were executed atomically.

Technically, the proof system is a straightforward adaptation of concurrent separation logic, which is augmented with proof rules axiomatising the effect of scheduler routines explicitly called by processes. The novelty here is that we can use such a scheduler-agnostic logic in this context at all.

**Proving schedulers correct via logical refinement.** The use of the high-level proof system is justified by verifying the scheduler implementation using a low-level proof system (Section 4.4). What does it mean for a scheduler to be functionally correct? Intuitively, a scheduler must provide an illusion of a system where every process has its own virtual CPU with a dedicated set of registers. To formalise this, we could define a semantics of such an abstract system and prove that any behaviour of the concrete system is reproducible in the abstract one, thus establishing a refinement between the two systems. The main technical challenge we have to deal with in this paper is that for realistic OS schedulers, defining a semantics for the abstract system a scheduler implements is difficult. This is because, in reasoning about mainstream operating systems, the ownership transfer between the scheduler and the kernel can involve not only fixed memory cells, but arbitrary *logical* facts describing them, which is difficult to describe operationally (see the treatment of the desc predicate in Section 4.3).

In this paper we resolve this problem in a novel way. Instead of defining the semantics of the abstract machine operationally, we define it only axiomatically as the high-level proof system described above. As expected, the low-level proof system is used to reason about the correspondence between the concrete and the abstract system, with its assertions relating their states. However, proofs in neither of the two systems are interpreted with respect to any semantics alone: our soundness statement (Section 6) interprets a proof of the kernel in the high-level system and a proof of the scheduler in the low-level one together with respect to the semantics of the concrete machine. Thus, instead of relating sets of executions of the two systems, the soundness statement relates logical statements about the abstract system (given by high-level proofs) to logical statements about the concrete one (given by a constraint on concrete states). We call this form of establishing a correspondence between the two systems a *logical refinement*. Note that in this case the soundness statement for the logic does not yield a semantic statement of correctness for the scheduler being considered. Rather, its correctness is established indirectly by the fact that reasoning in the high-level proof system, which assumes the abstract one-CPU-per-process machine, is sound with respect to the concrete machine.

To verify the scheduler separately from the processes it manages, low-level assertions focus only on a small relevant part of the state of the kernel, which we call scheduler-visible. Namely, the assertions relate the state local to a scheduler invocation on a particular CPU in the concrete system (e.g., the region marked CPU1 in Figure 3) to parts of abstract states of some of the OS processes (e.g., the dark regions in Figure 4). The latter parts can include, e.g., the values of registers of the virtual CPU of the process, but not the process-local memory. They are used in the low-level proof system to verify that the operations performed by the scheduler in the concrete machine correctly implement the required actions in the abstract machine. These parts also function as permissions to schedule the corresponding processes, i.e., a given part can be owned by at most one scheduler invocation at a time. For example, a scheduler invocation owning the parts of process states marked in Figure 4 has a permission to schedule processes 1 and 2, but not 3. Such a permission reading is crucial for handling scheduling on multiprocessors, as it ensures that a process may not be scheduled on two CPUs at the same time.

Summary. In the following we formalise the above approach for a particular class of schedulers. Despite the formalisation being performed for this class, the technical methods we develop here can be reused in other settings (see Section 8 for a discussion). In particular, we propose the following novel ideas:

- exploiting a logic validating the frame property to hide the state of the scheduler while verifying the kernel and vice versa;
- using a logical refinement in a context where defining an abstract semantics refined by the concrete one is difficult; and
- focusing on relevant parts of the two systems related in the refinement and giving a permission interpretation to them.

3. Preliminaries

In this section, we give a formal semantics to the example machine informally presented in Section 2.1.

3.1 Storage model

Figure 5 gives a model for the set of configurations Config that can arise during an execution of the machine. A machine configuration is a triple with the components describing the values of registers of the CPUs in the machine, the state of the heap and the set of locks taken by some CPU. The configurations in which the heap or the global context is a partial function are not encountered in the semantics we define in this section. They come in handy in Sections 4 and 6 to give a semantics to the assertion language and express the soundness of our logic.

\[
\begin{aligned}
\text{Reg} &= \{ \text{ip}, \text{if}, \text{ss}, \text{sp}, \text{gr}_1, \ldots, \text{gr}_m \} & \text{Loc} &\subseteq \text{Val} \\
\text{Context} &= \text{Reg} \rightarrow \text{Val} & \text{CPUId} &= \{1, \ldots, \text{NCPUS} \} \\
\text{GContext} &= \text{CPUId} \rightarrow \text{Context} & \text{Heap} &= \text{Loc} \rightarrow \text{Val} \\
\text{Lock} &= \{ \ell_1, \ell_2, \ldots, \ell_n \} & \text{Lockset} &= \mathcal{P}(\text{Lock}) \\
\text{Config} &= \text{GContext} \times \text{Heap} \times \text{Lockset}
\end{aligned}
\]

Figure 5. The set of machine configurations Config. We assume sets Loc of valid memory addresses and Val of values, respectively.

In this paper, we use the following notation for partial functions: \( f[x : y] \) is the function that has the same value as \( f \) everywhere, except for \( x \), where it has the value \( y \); \( [\,] \) is a nowhere-defined function; \( f \uplus g \) is the union of the disjoint partial functions \( f \) and \( g \).

3.2 Commands

Programs for our machine consist of primitive commands \( c \):
Figure 6. Semantics of primitive commands. We have omitted standard definitions for \texttt{skip} and most of assignments (see [20]). We have also omitted them for \texttt{icall} and \texttt{iret}: the definitions are the same as for \texttt{call} and \texttt{ret}, but additionally modify \texttt{if}. In the figure \(\leadsto_c \top\) indicates that the command \(c\) crashes, and \(\not\leadsto_c\) means that it does not crash, but diverges. The function \([\![\cdot]\!] r\) evaluates expressions with respect to the context \(r\).

\[
\begin{aligned}
(r, h[[\![e]\!] r : u], L) &\leadsto_{x := [e]} (r[x : u], h[[\![e]\!] r : u], L) \\
(r, h, L) &\leadsto_{\texttt{assume}(b)} (r, h, L), \quad \text{if } [\![b]\!] r = \text{true} \\
(r, h, L) &\not\leadsto_{\texttt{assume}(b)}, \quad \text{if } [\![b]\!] r = \text{false} \\
(r, h, L) &\leadsto_{\texttt{lock}(\ell)} (r, h, L \cup \{\ell\}), \quad \text{if } \ell \notin L \\
(r, h, L) &\not\leadsto_{\texttt{lock}(\ell)}, \quad \text{if } \ell \in L \\
(r, h, L) &\leadsto_{\texttt{unlock}(\ell)} (r, h, L - \{\ell\}), \quad \text{if } \ell \in L \\
(r, h[[\![e]\!] r : \_], L) &\leadsto_{\texttt{savecpuid}(e)} (r, h[[\![e]\!] r : k], L) \\
(r, h[r(\text{sp}) : \_, \ldots, r(\text{sp})+m : \_], L) &\leadsto_{\texttt{call}(l)} (r[\text{ip} : l, \text{sp} : r(\text{sp})+m+1], h[r(\text{sp}) : l', r(\text{sp})+1 : r(\text{gr}_1), \ldots, r(\text{sp})+m : r(\text{gr}_m)], L) \\
(r, h[r(\text{sp})-m-1 : l, r(\text{sp})-m : g_1, \ldots, r(\text{sp})-1 : g_m], L) &\leadsto_{\texttt{ret}} (r[\text{ip} : l, \text{sp} : r(\text{sp})-m-1, \text{gr}_1 : g_1, \ldots, \text{gr}_m : g_m], h[r(\text{sp})-m-1 : l, r(\text{sp})-m : g_1, \ldots, r(\text{sp})-1 : g_m], L) \\
(r, h, L) &\leadsto_{c} \top, \quad \text{otherwise}
\end{aligned}
\]

Here \(k\) is the identifier of the CPU executing the command and \(l'\) is the label of the next command in the program.

Figure 7. Operational semantics of the machine

and uses the result of this run to update the registers of CPU \(k\) and the heap and the lockset of the machine. The next rule concerns interrupts. Upon an interrupt, the interrupt handler label \texttt{schedule} is loaded into \texttt{ip}, and the label of the command to execute after the handler returns is pushed onto the stack together with the values of the general-purpose registers. The remaining rules deal with crashes arising from erroneous execution of primitive commands, undefined command labels and a stack overflow upon an interrupt.
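The interrupt rule described above can be approximated in C as follows. This is a sketch under several assumptions rather than a transcription of Figure 7: the `Context` layout, the numeric `SCHEDULE` label, the `STACK_SIZE` bound used to detect overflow, the stack layout and the three-valued result type are all introduced here for illustration. The if register is again called `iflag` because `if` is a C keyword.

```c
#include <assert.h>
#include <stdbool.h>

#define M 4             /* number of general-purpose registers (assumed) */
#define HEAP_SIZE 64    /* size of the toy heap (assumed) */
#define STACK_SIZE 16   /* per-process stack bound (assumed) */
#define SCHEDULE 1000   /* hypothetical numeric value of the label schedule */

typedef struct {
    int ip, iflag, ss, sp;
    int gr[M];
} Context;

typedef enum { IRQ_IGNORED, IRQ_TAKEN, IRQ_CRASH } IrqResult;

int heap[HEAP_SIZE];

/* Deliver an interrupt to the CPU whose registers are in *r. */
IrqResult deliver_interrupt(Context *r) {
    if (r->iflag == 0) return IRQ_IGNORED;          /* interrupts disabled */
    if (r->sp + M >= r->ss + STACK_SIZE || r->sp + M >= HEAP_SIZE)
        return IRQ_CRASH;                           /* stack overflow */
    heap[r->sp] = r->ip;                            /* label to resume at */
    for (int i = 0; i < M; i++) heap[r->sp + 1 + i] = r->gr[i];
    r->sp += M + 1;
    r->ip = SCHEDULE;                               /* jump to the handler */
    r->iflag = 0;                                   /* no nested interrupts */
    return IRQ_TAKEN;
}
```

The frame pushed here has the same shape as the one pushed by a call, so a handler that ends with iret resumes the interrupted code with its registers restored.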

4. The logic

In this paper we consider schedulers whose interface consists of two routines: \texttt{create} and \texttt{schedule}. Like in our example scheduler (Section 2.2), \texttt{create} makes a new process runnable, and \texttt{schedule} performs a context-switch. (Our results can be extended when new scheduler routines are introduced; see Section 8 for a discussion.) Our logic thus reasons about programs of the form:

\[
C \cup \{l_c : (\texttt{iret}, l_c + 1)\} \cup S \cup \{l_s : (\texttt{iret}, l_s + 1)\} \cup K \qquad (\text{OS})
\]

where \(C\) and \(S\) are pieces of code implementing the \texttt{create} and \texttt{schedule} routines of the scheduler and \(K\) is the rest of the kernel code. Our high-level proof system is designed for proving \(K\), and the low-level system for proving \(C\) and \(S\).

We place several restrictions on programs. First, we require that \(C\) and \(S\) define primitive commands labelled \texttt{create} and \texttt{schedule}, which are meant to be the entry points for the corresponding scheduler routines. The \texttt{create} routine expects the address of the descriptor of the new process to be stored in the register \texttt{gr_1}. By our convention, \texttt{schedule} also marks the entry point of the interrupt handler. Thus, \texttt{schedule} may be invoked either directly by a process or by an interrupt. For simplicity, we assume that the scheduler data structures are properly initialised when the program starts executing.

To ensure that the scheduler routines execute with interrupts disabled, we require that \(C\) and \(S\) contain neither \texttt{icall}, nor \texttt{iret}, nor assignments accessing the \texttt{if} register. We also need to ensure that the kernel cannot affect the status of interrupts, become aware of the particular CPU it is executing on, or change the stack address. Thus, \(K\) may not contain \texttt{savecpuid}, \texttt{icall} or \texttt{iret} (except calls to the scheduler routines \texttt{schedule} and \texttt{create}), nor assignments accessing \texttt{if} or writing to \texttt{ss}. In reality, a kernel might need to disable interrupts. We discuss how our results can be extended to handle this in Section 8. Finally, we require that the kernel \(K\) and the scheduler \(C\) and \(S\) access disjoint sets of locks. This condition simplifies the soundness statement in Section 6 and can be lifted.

The core part of our logic is the low-level proof system for verifying scheduler code, which we present in Section 4.4. It extends the high-level proof system used for verifying kernel code, which, in turn, adapts concurrent separation logic to our setting. For this reason, we present the high-level system first.

4.1 Assertion language

We now present the assertion language of the high-level proof system. Assertions describe properties of a single process, as if it were running on a separate virtual CPU. The state of the process thus consists of the values of the CPU registers (its context), the heap local to the process and the locks the process has a permission to release (its lockset). Mathematically, states of a process are just elements of State defined in Section 3.3: State = Context \times Heap \times Lockset. However, unlike in the semantics of Section 3.3, a heap here can be a partial function, with its domain defining the part of the heap owned by the process. A lockset is now meant to contain only the set of locks that the process has a permission to release (in our logic such permissions can be transferred between processes).

To denote sets of process states in our logic, we use a minor extension of the assertion language of separation logic [20]. Let \(\texttt{NVar}\) and \(\texttt{CVar}\) be disjoint sets containing logical variables for values and contexts, respectively. Assertions are defined as follows:

\[
\begin{array}{rcl}
\multicolumn{3}{l}{x, y \in \texttt{NVar} \qquad \gamma \in \texttt{CVar} \qquad r \in \{\texttt{ip}, \texttt{if}, \texttt{ss}, \texttt{sp}, \texttt{gr}_1, \ldots, \texttt{gr}_m\}} \\
E & ::= & x \mid r \mid G(r) \mid \ldots \\
G & ::= & \gamma \mid [\texttt{ip} : E, \texttt{if} : E, \texttt{ss} : E, \texttt{sp} : E, \vec{\texttt{gr}} : \vec{E}] \\
\Sigma & ::= & \varepsilon \mid E \mid \Sigma\,\Sigma \\
B & ::= & E = E \mid \Sigma = \Sigma \mid G = G \mid E \leq E \mid B \land B \mid B \lor B \mid \neg B \\
P & ::= & B \mid \texttt{true} \mid P \land P \mid \neg P \mid \exists x.\, P \mid \exists \gamma.\, P \mid \texttt{emp} \mid {} \\
  &     & E \mapsto E \mid E, E \mapsto \Sigma \mid P \ast P \mid \texttt{dll}_\Lambda(E, E, E, E) \mid \texttt{locked}(l)
\end{array}
\]

Expressions \(E\) and \(B\) are similar to those in programs, except that they allow logical variables to appear and include the lookup \(G(r)\) of the value of the register \(r\) in the context \(G\). A context \(G\) is either a logical variable or a finite map from register
\[
\begin{aligned}
(r, h, L) \models_\eta B &\iff [B]_{\eta}^r = \text{true} \\
(r, h, L) \models_\eta P_1 \land P_2 &\iff (r, h, L) \models_\eta P_1 \text{ and } (r, h, L) \models_\eta P_2 \\
(r, h, L) \models_\eta \text{emp} &\iff h = [\,] \text{ and } L = \emptyset \\
(r, h, L) \models_\eta E_0 \mapsto E_1 &\iff h = [[E_0]_{\eta}^r : [E_1]_{\eta}^r] \text{ and } L = \emptyset \\
(r, h, L) \models_\eta E_0, E_1 \mapsto \Sigma &\iff \exists j \geq 0.\ \exists v_1, \ldots, v_j \in \text{Val}.\ L = \emptyset,\ j = [E_1]_{\eta}^r - [E_0]_{\eta}^r + 1, \\
&\qquad v_1 v_2 \ldots v_j = [\Sigma]_{\eta}^r,\ h = [[E_0]_{\eta}^r : v_1, \ldots, [E_1]_{\eta}^r : v_j] \\
(r, h, L) \models_\eta \text{locked}(\ell) &\iff h = [\,] \text{ and } L = \{\ell\} \\
(r, h, L) \models_\eta P_1 \ast P_2 &\iff \exists h_1, h_2, L_1, L_2.\ h = h_1 \uplus h_2,\ L = L_1 \uplus L_2, \\
&\qquad (r, h_1, L_1) \models_\eta P_1 \text{ and } (r, h_2, L_2) \models_\eta P_2
\end{aligned}
\]

Predicate \(\text{dll}_\Lambda\) is the least one satisfying the equivalence below:

\[
\text{dll}_\Lambda(E_h, E_p, E_n, E_t) \iff \exists x.\ (E_h = E_n \land E_p = E_t \land \text{emp}) \lor (E_h.\text{prev} \mapsto E_p \ast E_h.\text{next} \mapsto x \ast \Lambda(E_h) \ast \text{dll}_\Lambda(x, E_h, E_n, E_t))
\]

Figure 8. Semantics of high-level assertions. We have omitted the standard clauses for most of the first-order connectives. The function \([\cdot]_{\eta}^r\) evaluates expressions with respect to the context \(r\) and the logical variable environment \(\eta\).

labels to expressions. We denote the set of assertions defined here with \(\text{Assert}_K\). Let a logical variable environment \(\eta\) be a mapping from \(\text{NVar} \cup \text{CVar}\) to \(\text{Val} \cup \text{Context}\) that respects the types of variables. Assertions denote sets of states from State as defined by the satisfaction relation \(\models_\eta\) in Figure 8. For an environment \(\eta\) and an assertion \(P\), we denote with \([P]_{\eta}\) the set of states satisfying \(P\).

The assertions in the first line of the definition of \(P\), except \text{emp}, are connectives of first-order logic with the standard semantics. We can define the missing connectives from the given ones. The following assertions, from \text{emp} up to the \text{dll} predicate, are standard assertions of separation logic [20]. Informally, \text{emp} describes the empty heap, and \(E \mapsto E'\) the heap with only one cell at the address \(E\) containing \(E'\). The assertion \(E, E' \mapsto \Sigma\) is the generalisation of the latter to several consecutive cells at the addresses from \(E\) to \(E'\) inclusive containing the sequence of values \(\Sigma\). For a value \(u\) of a C type \(t\) taking several cells, we shorten \(E, (E + \text{sizeof}(t) - 1) \mapsto u\) to just \(E \mapsto u\). For a field \(f\) of a C structure, we use \(E.f \mapsto E'\) as a shortcut for \(E + \mathit{off} \mapsto E'\), where \(\mathit{off}\) is the offset of \(f\) in the structure. The separating conjunction \(P_1 \ast P_2\) talks about the splitting of the local state, which consists of the heap and the lockset of the process. It says that a pair \((h, L)\) can be split into two disjoint parts, such that one part \((h_1, L_1)\) satisfies \(P_1\) and the other \((h_2, L_2)\) satisfies \(P_2\).

The assertion \(\text{dll}_\Lambda(E_h, E_p, E_n, E_t)\) is an inductive predicate describing a segment of a doubly-linked list. It assumes a C structure definition with fields \text{prev} and \text{next}. Here \(E_h\) is the address of the head of the list, \(E_t\) the address of its tail, \(E_p\) the pointer in the \text{prev} field of the head node, and \(E_n\) the pointer in the \text{next} field of the tail node. The \(\Lambda\) parameter is a formula with one free logical variable describing the shape of each node in the list, excluding the \text{prev} and \text{next} fields; the logical variable defines the address of the node. For instance, a simple doubly-linked list can be expressed using \(\Lambda(x) = \text{emp}\). We included \(\text{dll}_\Lambda\) to describe the runqueues of the scheduler in our example. Predicates for other data structures can be added straightforwardly [20].
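To make the inductive reading of the list-segment predicate concrete, the following Python sketch checks whether a finite heap is exactly a segment with \(\Lambda(x) = \text{emp}\). It is an illustration only: the heap is modelled as a dictionary from addresses to values, and the field offsets `PREV` and `NEXT` are our own assumptions, not part of the logic.

```python
# Assumed node layout (hypothetical): offsets of the prev and next fields.
PREV, NEXT = 0, 1

def dll(heap, head, prev, nxt, tail):
    """Return True iff `heap` is exactly the segment dll(head, prev, nxt, tail),
    taking Lambda(x) = emp, i.e. nodes own only their prev and next cells."""
    remaining = dict(heap)  # cells not yet accounted for
    cur, cur_prev = head, prev
    while True:
        # Base case: head = next-of-tail, prev-of-head = tail, and empty heap.
        if cur == nxt and cur_prev == tail and not remaining:
            return True
        # Inductive case: the current node must own its prev and next cells.
        if cur + PREV not in remaining or cur + NEXT not in remaining:
            return False
        if remaining.pop(cur + PREV) != cur_prev:
            return False
        # Move on to the node stored in the next field.
        cur_prev, cur = cur, remaining.pop(cur + NEXT)

# A cyclic list with a head node at address 10 and two further nodes at 20 and 30.
# The cells of the head node itself (10.prev = 30, 10.next = 20) are kept separately.
rest = {20: 10, 21: 30, 30: 20, 31: 10}
is_segment = dll(rest, 20, 10, 10, 30)
```

Each loop iteration consumes two cells of a finite heap, so the check terminates; the base case is tested first but falls through to the inductive case when cells remain, which is what lets cyclic lists be described.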

Finally, the assertion \(\text{locked}(\ell)\) is specific to reasoning about concurrent programs and denotes states with an empty local heap and the lockset consisting of \(\ell\), i.e., it denotes a permission to release the lock \(\ell\). Note that \(\text{locked}(\ell) \ast \text{locked}(\ell)\) is inconsistent: acquiring the same lock twice leads to a deadlock.
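The splitting of local state by the separating conjunction can be sketched as follows. This is a minimal model under our own assumptions: a local state is a pair of a heap (a dictionary) and a lockset (a frozen set), and the helper names `star`, `locked` and `points_to` are hypothetical. The sketch builds states rather than assertions, which is enough to see why a state cannot carry the permission for the same lock twice.

```python
def star(s1, s2):
    """Combine two local states (heap, lockset); None if they are not disjoint."""
    (h1, l1), (h2, l2) = s1, s2
    if h1.keys() & h2.keys() or l1 & l2:
        return None  # overlapping cells or locks: no valid split exists
    return ({**h1, **h2}, l1 | l2)

def locked(lock):
    """The only local state satisfying locked(lock): empty heap, lockset {lock}."""
    return ({}, frozenset({lock}))

def points_to(addr, val):
    """The only local state satisfying addr |-> val: one cell, empty lockset."""
    return ({addr: val}, frozenset())

combined = star(points_to(1, 5), locked("rq"))   # a cell plus a lock permission
duplicated = star(locked("rq"), locked("rq"))    # the same permission twice
```

Here `duplicated` is `None`, mirroring the inconsistency of \(\text{locked}(\ell) \ast \text{locked}(\ell)\): the two locksets are not disjoint, so no state can be split in the required way.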

To summarise, our assertion language extends that of concurrent separation logic with expressions to denote contexts and locked assertions to keep track of permissions to release locks.

4.2 Interface parameters
As we noted in Section 2.3, our logic can be viewed as implementing a form of assume-guarantee reasoning between the scheduler and the kernel. In particular, interactions between them involve ownership transfer of memory cells at points where the control crosses the boundary between the two components. Hence, the high- and low-level proof systems have to agree on the description of the memory areas being transferred and the properties they have to satisfy. These descriptions form the specification of the interface between the scheduler and the kernel, and, correspondingly, between the two proof systems. Here we describe parameters used to formulate it. We note that the interface parameters we present here are tied to a particular class of schedulers for which we present our logic. As we argue in Section 8, our results can be carried over to schedulers with more elaborate interfaces.

Ownership transfer happens at calls to and returns from the scheduler routines \text{create} and \text{schedule}. When the kernel calls the \text{create} routine of the scheduler, the latter should get the ownership of the process descriptor supplied as the parameter. In the two proof systems, we specify this descriptor using an assertion \(\text{desc}(d, \gamma) \in \text{Assert}_K\) with two free logical variables and no register occurrences. Our intention is that it describes the descriptor of a process with the context \(\gamma\), allocated at the address \(d\). However, the user of our logic is free to choose any assertion, depending on the particular scheduler implementation being verified. As the scheduler and the kernel access disjoint sets of locks, we require that all states in \([\text{desc}(d, \gamma)]_\eta\) have an empty lockset.

We fix the piece of state transferred from the kernel to the \text{schedule} routine upon an interrupt to be the free part of the stack of the process being preempted. The parameters determining its size are the size of the stack \text{StackSize} \in \mathbb{N} and the upper bound \text{StackBound} \in \mathbb{N} on the stack usage by the kernel (excluding the scheduler). To ensure that the stack does not overflow while calling an interrupt handler, we require that \text{StackSize} - \text{StackBound} \geq m + 1, where \(m\) is the number of general-purpose registers.
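The side condition on the stack parameters is simple arithmetic, sketched below in Python. The function name and the sample numbers are our own; the reading that an interrupt pushes one return label plus the \(m\) general-purpose registers follows the description of interrupts in Section 3.

```python
def interrupt_frame_fits(stack_size, stack_bound, m):
    """Check StackSize - StackBound >= m + 1.

    An interrupt pushes the return label and the m general-purpose registers,
    i.e. m + 1 cells, on top of at most `stack_bound` cells used by the kernel.
    """
    return stack_size - stack_bound >= m + 1

# Hypothetical parameters: 64 stack cells and 15 general-purpose registers.
ok = interrupt_frame_fits(stack_size=64, stack_bound=48, m=15)       # 16 >= 16
too_tight = interrupt_frame_fits(stack_size=64, stack_bound=50, m=15)  # 14 >= 16
```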

4.3 High-level proof system
The high-level proof system reasons about the kernel code \(K\). It is obtained by adapting concurrent separation logic to our setting and adding proof rules axiomatising the effect of scheduler routines.

The judgements of the high-level proof system are of the form \(I, \Delta \vdash C\), where \(I\) is a partial mapping from locks accessible in the kernel code to their invariants (see Section 2.3) and \(\Delta : \text{Label} \rightarrow \text{Assert}_K\) is a total mapping from code labels to preconditions. The parameter \(\Delta\) in our judgement specifies local states of the process at various program points, which induce pre- and postconditions for all primitive commands in \(C\). When considering a complete system in Section 4.5, we restrict \(\Delta\) so that it is false everywhere except at labels in the kernel code. An example of a lock invariant is

\[\exists x, y.\ 10.\text{prev} \mapsto y \ast 10.\text{next} \mapsto x \ast \text{dll}_\Lambda(x, 10, 10, y),\]

where \(\Lambda(x) = \text{emp}\). It states that the lock protects a non-empty cyclic doubly-linked list with the head node at address 10. We forbid lock invariants to contain registers or free occurrences of logical variables. We consider a version of concurrent separation logic where resource invariants are allowed to be imprecise [19] at the expense of excluding the conjunction rule from the proof system [12].

The rule \text{PROG-H} for deriving the judgements is given in Figure 9. The first premise of the rule says that all assertions in \(\Delta\) have to satisfy some restrictions regarding stack usage, formulated using parameters \text{StackSize} and \text{StackBound} introduced in Section 4.2. These ensure that the interrupt handler can safely execute on the stack of the process it preempts.
\[ \forall \ell' \in \text{Label}. \ (P*\text{sp}(\text{sp}+\text{m})\implies \ell') \implies (\Delta'(\text{sp}+\text{m}+1/\text{sp})) \]

Figure 9. High-level proof system. Here \(\text{mod}(c)\) is the set of registers modified by \(c\), \(\text{free}(P)\) is the set of registers appearing in \(P\), and \(\text{notCallRet}(c)\) means that \(c\) is not one of \(\text{call}, \text{icall}, \text{ret}\) and \(\text{iret}\). Finally, \(\text{id} = [\text{ip} : \text{ip}, \text{if} : \text{if}, \text{ss} : \text{ss}, \text{sp} : \text{sp}, \vec{\text{gr}} : \vec{\text{gr}}]\).
following technical presentation. The context is required to have if set, since after the context switch is finished, the process starts executing with interrupts enabled. Note that the descriptor is not present in the postcondition: it gets transferred to the scheduler and reappears in the precondition of the implementation of create (Section 4.5). The axiom also allows us to transfer the ownership of the part of the heap given by $P$ to the newly created process, thus providing it with an initial local state. This is a typical idiom for high-level reasoning about processes in separation logics [13]. The premise of the rule correspondingly requires that, after the registers and the stack are properly initialised, the state $P$ we are transferring should establish the assertion at the label the process starts executing from. The effect of loading registers from $\gamma$ is formulated using the context id.

For the example scheduler in Section 2.2, desc$(d, \gamma)$ should describe a process descriptor with the stack initialised according to the invariant of a preempted process pictured in Figure 2:

\[
\text{desc}(d, \gamma) = d.\text{prev} \mapsto \_ \ast d.\text{next} \mapsto \_ \ast \text{desc}_0(d, \gamma),
\]

where

\[
\begin{aligned}
\text{desc}_0(d, \gamma) = {} & \gamma(\text{if}) = 1 \land \gamma(\text{ss}) = d.\text{kernel\_stack} \land 0 \leq \gamma(\text{sp}) - \gamma(\text{ss}) \leq \text{StackBound} \land {} \\
& d.\text{timeslice} \mapsto \_ \ast d.\text{saved\_sp} \mapsto (\gamma(\text{sp}) + m + 1 + \text{SCHED\_FRAME}) \ast {} \\
& \gamma(\text{sp}), (\gamma(\text{sp}) + m) \mapsto \gamma(\text{ip})\, \gamma(\vec{\text{gr}}) \ast {} \\
& (\gamma(\text{sp}) + m + 1), (\gamma(\text{ss}) + \text{StackSize} - 1) \mapsto \_
\end{aligned}
\]

and SCHED_FRAME is the size of the activation record of schedule (Figure 1). The descriptor does not include filled stack slots; they can be passed to the process directly in the precondition $P$.

As we have noted before, desc$(d, \gamma)$ can be an arbitrary logical predicate. In some cases, e.g., when it is imprecise [19], its transfer from the kernel to the scheduler is hard to express operationally when defining a semantics of the kernel separately from the implementation of the scheduler; see [12] for a discussion. The situation would be worse had we based our logic on one of the more advanced modular concurrency logics, such as deny-guarantee [4], which are needed to handle real OS code. This is because proofs of soundness for such logics do not give an operational semantics to separate components of a program. These difficulties with an operational definition of ownership transfer are a prime reason for using logical refinement in this paper.

The high-level proof system provides modern tools for modular reasoning about concurrent processes using proof rules of concurrent separation logic. The PROG-H rule of the system subsumes the usual sequential composition rule of Hoare logic, which assumes that the control flow follows the structure of the process code and ignores the possibility of scheduler code getting executed at an interrupt. The axioms SCHED and CREATE abstract the implementation of scheduler routines by treating them like atomic commands. Thus, the state and the control flow of the scheduler are completely hidden by the proof system. The soundness of such an illusion is established by verifying the scheduler code using a low-level proof system, which we describe next.

4.4 Low-level proof system

We now present the core of our logic: the low-level proof system, which is used to prove that the commands $C$ and $S$ of the OS program implement scheduling correctly. As we explained in Section 2.3, assertions of the proof system relate the states of the concrete machine and an abstract one, where every process has its own virtual CPU. The state of the concrete machine can be described using separation logic assertions introduced in Section 4.1. To describe states of the abstract machine, we extend the assertion language of Section 4.1 with an additional predicate: $P ::= \ldots \mid \text{Process}(G)$, where $G$ ranges over context expressions. We denote the set of such assertions with $\text{Assert}_S$. The Process$(G)$ predicate describes a process with the values of registers of its virtual CPU given by the context $G$.

The addition of the Process predicate changes objects described by assertions: they now denote relations defined by subsets of RelState $= \text{State} \times M(\text{Context})$, where $M(A)$ is the set of all finite multisets with elements from $A$. Relations in RelState connect the states of the concrete machine and the abstract machine with one CPU per process. As we have noted before, these relations do not describe the full state of the machines. The first component in a relation describes the local state of a scheduler invocation running on a CPU, including its context and the heap and the lockset local to it (e.g., the region marked CPU1 in Figure 3). The multiset in the second part records the scheduler-visible states of processes described by Process predicates in the assertion, i.e., parts of their local states that may be referred to by proofs about the scheduler (cf. the dark regions in Figure 4). These include the context of a process, but exclude its local heap and lockset: the latter are irrelevant for the schedulers we consider here and are therefore invisible to them. The low-level logic we present in this section is based on separation logic, hence, the invisibility of parts of process state to the scheduler automatically guarantees that it cannot access them.

Apart from keeping track of the state of a process, a Process predicate serves in the logic as an exclusive permission for the scheduler invocation owning it to schedule the corresponding process. To enforce this, the semantics of assertions defined below forbids the duplication of Process predicates: $\text{Process}(G) \not\Rightarrow \text{Process}(G) \ast \text{Process}(G)$. Furthermore, the proof obligations for the scheduler we define in Section 4.5 state that it needs a Process predicate to schedule a process. Such a permission interpretation of Process is a key feature of our logic that allows us to reason about schedulers for multiprocessors: it ensures that, at a given time, only one scheduler invocation can own a Process predicate for a process, and hence, it can be scheduled only on one CPU at a time.

We give the formal semantics of assertions using the satisfaction relation $\models_\eta$ in Figure 10, parameterised by environments $\eta$. The first two cases in the figure are the most interesting ones. Process$(G)$ relates a scheduler invocation having the empty heap and the empty lockset to a single process with the register values $G$. To be related by the separating conjunction $P \ast Q$, all parts of the state-multiset pair except the context should be split such that the first part is related by $P$ and the second by $Q$. The semantic definitions of the remaining assertions are obtained from the corresponding cases in our high-level proof system (Figure 8) either by requiring the multiset component $M$ to be empty, like in the case of emp, or by propagating $M$ to their sub-assertions, like in the case of $P \ast Q$. We denote with $[P]_\eta$ the set of states satisfying $P$.
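The permission reading of Process can be illustrated with a small Python model of the non-context part of a relational state. The sketch is ours and makes simplifying assumptions: a context is a hashable tuple of register and value pairs, the multiset of scheduler-visible contexts is a `collections.Counter`, and the function names are hypothetical.

```python
from collections import Counter

def process(ctx):
    """Heap, lockset and multiset related by Process(ctx): nothing but one process."""
    return ({}, frozenset(), Counter([ctx]))

def star(a, b):
    """Split-and-join for P * Q: heaps and locksets must be disjoint,
    multisets of process contexts are added up."""
    (h1, l1, m1), (h2, l2, m2) = a, b
    if h1.keys() & h2.keys() or l1 & l2:
        return None
    return ({**h1, **h2}, l1 | l2, m1 + m2)

# A hypothetical context with only two registers shown.
g = (("ip", 100), ("if", 1))
one = process(g)
two = star(one, one)  # Process(g) * Process(g)
```

In this model `two` records two occurrences of `g`, so it describes two processes and differs from `one`. This is why Process$(G)$ does not imply Process$(G) \ast$ Process$(G)$: the multiset component counts permissions instead of merging them.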

The judgements of the low-level proof system have the form $I, \Delta \vdash_k C$, where $k \in \text{CPUId}$, $I$ is a vector of resource invariants in $\text{Assert}_S$ for locks accessible to the scheduler, and $\Delta: \text{Label} \rightarrow \text{Assert}_S$ is a mapping from program positions to low-level assertions. When considering a complete system in Section 4.5, we restrict $\Delta$ so that it is false everywhere except at labels in the scheduler code. The intuitive meaning of the judgements is the same as in the high-level system (Section 4.3), with the component describing scheduler-visible process states unchanged during the execution of scheduler commands. The judgements thus express how the scheduler code changes the relationship between the state of the scheduler on the CPU $k$ and those of processes running on the machine. The proof rule for deriving our judgements is:

$$
\frac{I, \Delta \triangleright_k \{\Delta(l)\} \mathit{comm}(C, l) \{\Delta'(l')\}}{I, \Delta \triangleright_k C} \quad \text{PROG-L}
$$

Note that the syntactic structure of the OS program (see the beginning of Section 4) ensures that the scheduler always executes with interrupts disabled. Thus, in the rule we are able to follow the control flow of $C$. The low-level system inherits the proof rules for deriving judgements for primitive commands $I, \Delta \triangleright_k \{P\}\ c\ \{Q\}$ from Figure 9, adding the index $k$ to $\triangleright$ and ignoring the rules for $\mathit{icall}(\mathit{schedule})$ and $\mathit{icall}(\mathit{create})$. It also has a rule for $\mathit{savecpuid}$, which makes use of the index $k$:

$$
\frac{}{I, \Delta \triangleright_k \{e \mapsto \_\}\ \mathit{savecpuid}(e)\ \{e \mapsto k\}} \quad \text{CPUID}
$$

4.5 Putting the two proof systems together

The proof systems presented in Sections 4.3 and 4.4 allow us to reason about the kernel and the scheduler code. We now describe the structure of this kind in our example scheduler is the element of the current array corresponding to the current CPU. Let $J_k$ be an invariant of such data structures for CPU $k$, which is meant to be maintained when the scheduler is not running on it. Similarly to lock invariants, we forbid $J_k$ to contain free logical variables or registers, except $\mathit{ss}$. In this case we can allow $\mathit{ss}$ because we have previously required that the kernel cannot modify it. We denote with $J$ the vector of invariants $J_k$.

Consider $I_K, \Delta_K$ and $I_S, \Delta_S^k$ for all $k \in \mathit{CPUId}$, such that:

- $\mathit{dom}(I_K) \cap \mathit{dom}(I_S) = \emptyset$;
- $\forall l \notin \mathit{dom}(K).\ \Delta_K(l) = \mathit{false}$;
- $\forall l \notin \mathit{dom}(S) \cup \mathit{dom}(C).\ \Delta_S^k(l) = \mathit{false}$.

The proof rule for the program OS is as follows:

$$
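\frac{I_K, \Delta_K \vdash K \qquad \forall k \in \mathit{CPUId}.\ I_S, \Delta_S^k \vdash_k S \qquad \forall k \in \mathit{CPUId}.\ I_S, \Delta_S^k \vdash_k C \qquad \cdots}{I_K, \Delta_K \mid I_S, (\Delta_S^k)_{k \in \mathit{CPUId}} \mid J \vdash (S, C, K)}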
$$

The first three premises require us to prove the kernel and the scheduler code in their respective proof systems. The rest define pre- and postconditions for $\mathit{schedule}$ and $\mathit{create}$ by fixing the assertions at the corresponding labels. This is done using the predicate $\mathit{SchedState}_k$, which describes the state of a scheduler invocation at CPU $k$ right after it is called using $\mathit{icall}$ or before it returns by executing $\mathit{iret}$.

When $\mathit{schedule}$ is called, the stack satisfies the bound on stack usage and interrupts are disabled. The scheduler gets the ownership of the per-CPU data structure $J_k$, a part of the stack of the process being preempted (which contains the values of registers saved upon the call together with the empty slots), and a Process predicate consistent with the registers saved on the stack. The predicate certifies that, when the scheduler starts executing, the state of the preempted process in the machine corresponds to its state in the abstract machine. The $\mathit{schedule}$ routine has to re-establish the same assertion before returning. In the case when it schedules a different process, this will be done using a different Process predicate. However, since the scheduler can only get a Process predicate in the precondition of $\mathit{schedule}$ (and when a new process is created; see below), its postcondition guarantees that the process being scheduled has the same register values it had last time it was preempted. Note that the precondition of $\mathit{schedule}$ mirrors the first premise of the $\mathit{PROG-H}$ rule. Thus, the assumptions it makes about the kernel are justified by the proof of the latter in the high-level system.

The precondition of $\mathit{create}$ is similar to that of $\mathit{schedule}$, but additionally assumes a process descriptor for a new process with the address in $\mathit{gr}_1$, and a corresponding Process assertion initialised according to the information in the descriptor. This descriptor is guaranteed to be provided by the kernel by the precondition of the $\mathit{CREATE}$ rule. Adding the new Process assertion can be understood intuitively as creating a fresh virtual CPU for the new process in the abstract machine.

5. Verifying the example scheduler

We have used the logic to manually construct a proof of the example scheduler of Section 2.2, establishing the judgements about $\mathit{schedule}$ and $\mathit{create}$ required by the proof rule in Section 4.5. By the soundness theorem for our logic (presented in Section 6), this implies that any property of a piece of high-level code proved in concurrent separation logic, including memory safety and functional correctness, holds of the code when it is managed by the example scheduler.

The detailed proof is given in Appendix A. Here we present only lock and per-CPU scheduler invariants together with some informal explanations.

The invariants of runqueue locks are as follows:

$$
I(\mathit{runqueue}\_\mathit{lock}[k]) = \exists x, y. \mathit{runqueue}[k] \mapsto z * \mathit{desc}_0(z, \gamma) \ast z. \mathit{prev} \mapsto y \ast z. \mathit{next} \mapsto x \ast \mathit{dll}(x, z, z, y)
$$

where $\Lambda(d) = \exists \gamma.\ \mathit{desc}_0(d, \gamma) \ast \mathit{Process}(\gamma)$ and $\mathit{desc}_0$ is defined in Section 4.3. The per-CPU scheduler invariants are:

$$
J_k = \exists d. (d.\mathit{kernel}\_\mathit{stack}=\mathit{ss}) \land \mathit{current}[k] \mapsto d \ast d.\mathit{prev} \mapsto \_ \ast d.\mathit{next} \mapsto \_ \ast d.\mathit{timeslice} \mapsto \_ \ast d.\mathit{saved}\_\mathit{sp} \mapsto \_.
$$

According to these definitions, a runqueue for a CPU $k$ contains a list of descriptors of preempted processes together with Process predicates matching the state stored in them. When an invocation of $\mathit{schedule}$ acquires the runqueue lock and removes a node from the list, it gets the ownership of the corresponding Process predicate, which lets it schedule the process by establishing the postcondition $\mathit{SchedState}_k$ of $\mathit{schedule}$ (see Section 4.5). The descriptor of the process just scheduled, pointed to by an entry in the current array, forms the scheduler's per-CPU state and is described by $J_k$. When the process is preempted again, $\mathit{schedule}$ receives the Process predicate in its precondition $\mathit{SchedState}_k$. This predicate and the state in $J_k$ let the scheduler insert the descriptor back into the runqueue while maintaining its invariant.
6. Soundness

In this section, we explain the guarantees about the entire kernel that follow from proofs in our logic. Consider a program OS of the form introduced in Section 4. We formulate a theorem, proved in Appendix B, which describes how proofs of a scheduler and the kernel in our logic can be combined to construct an inductive invariant of the entire system. To aid understanding, we first state the theorem and explain the components used to formulate it informally. Only after this do we provide formal definitions.

**Theorem 1.** If $I_K, \Delta_K \mid I_S, (\Delta_S^k)_{k \in \text{CPUid}} \mid J \vdash (S, C, K)$, then for all environments $\eta$, the following set of configurations $R_k$ is preserved by $\text{OS}$:

$$
\text{compose}\Bigl(\bigcup_{L \subseteq \text{dom}(I_S)} \text{held}_S(L) \cap (\text{lowinv}_\eta \star_\eta \text{lock}_{L'}), \\
\bigcup_{L \subseteq \text{dom}(I_K)} \text{held}_K(L) \cap (\text{highinv}_\eta \star_k \text{high}_{L'})\Bigr)
$$

**Informal explanation.** The invariant $R$ is constructed in several steps by conjoining the descriptions of pieces of program state owned by different OS components. First, from assertions $\Delta_S^k$ and $J$ in the proof of the scheduler, we construct a predicate

$$\text{lowinv}_\eta \subseteq \text{RelConfig} \overset{\text{def}}{=} \text{Config} \times \text{M(Context)}$$

Consider $((R, h, L), M) \in \text{lowinv}_\eta$. For register values of the CPUs in the machine given by $R$, the components $h$ and $L$ describe the part of the machine state belonging to the scheduler, and $M$ the contexts of the processes it has a permission to schedule. Similarly, from assertions $\Delta_K$ in the proof of the kernel, we construct a predicate

$$\text{highinv}_\eta \subseteq \text{HighConfig} \overset{\text{def}}{=} \text{M(Context)} \times \text{Heap} \times \text{Lockset}$$

Consider $(M, h, L) \in \text{highinv}_\eta$. For any set of processes with the contexts given by $M$, the components $h$ and $L$ describe the part of the machine state belonging to these processes.

To construct the complete machine state, we also have to take into account the parts of the heap protected by free locks. Thus, for any set of free locks $L'$ accessible to the scheduler, from resource invariants $I_S$ we construct a predicate $\text{lock}_{L'} \subseteq \text{RelConfig}$ describing the state protected by the locks. A similar predicate $\text{high}_{L'} \subseteq \text{HighConfig}$ describes the state protected by a set of free locks $L'$ accessible to the kernel. The predicates $\text{lock}_{L'}$ and $\text{high}_{L'}$ are then combined with $\text{lowinv}_\eta$ and $\text{highinv}_\eta$ using operations

$$\star_\eta : \mathcal{P}(\text{RelConfig}) \times \mathcal{P}(\text{RelConfig}) \rightarrow \mathcal{P}(\text{RelConfig})$$

$$\star_k : \mathcal{P}(\text{HighConfig}) \times \mathcal{P}(\text{HighConfig}) \rightarrow \mathcal{P}(\text{HighConfig})$$

To ensure that $L'$ is indeed the set of all free locks, we require that the rest of the locks $L$ are held by intersecting the result with $\text{held}_S(L) \in \mathcal{P}(\text{RelConfig})$ or $\text{held}_K(L) \in \mathcal{P}(\text{HighConfig})$.

Finally, we connect the resulting predicates describing the states of the scheduler and the kernel using a form of a relational composition, implemented by

$$\text{compose} : \mathcal{P}(\text{RelConfig}) \times \mathcal{P}(\text{HighConfig}) \rightarrow \mathcal{P}(\text{Config})$$

The operation conjoins the heaps and locksets described by the predicates and makes sure that the scheduler-visible states of processes they describe match. The result is an invariant of the entire machine maintained by each step of the kernel or the scheduler.

We now formally define the above operations and predicates.

**Composition operations.** Each of the operations $\star_\eta$, $\star_k$ and $\text{compose}$ is obtained by lifting a partial function in $A \times B \rightarrow C$ to a function in $\mathcal{P}(A) \times \mathcal{P}(B) \rightarrow \mathcal{P}(C)$ pointwise. To define $\star_k$ we lift the operation $\bullet_k$ on HighConfig that combines the information about processes, heaps and locksets:

$$(M_1, h_1, L_1) \bullet_k (M_2, h_2, L_2) = (M_1 \sqcup M_2, h_1 \sqcup h_2, L_1 \sqcup L_2)$$

(Recall that the $\sqcup$ operation on multisets adds up the number of occurrences of each element in its operands.)

To define $\star_\eta$ we similarly lift $\bullet_\eta$ on RelConfig that combines the information about contexts, heaps, locksets and processes:

$$((R_1, h_1, L_1), M_1) \bullet_\eta ((R_2, h_2, L_2), M_2) = ((R_1 \sqcup R_2, h_1 \sqcup h_2, L_1 \sqcup L_2), M_1 \sqcup M_2)$$

Finally, to define $\text{compose}$ we lift $\circ : \text{RelConfig} \times \text{HighConfig} \rightarrow \text{Config}$ that combines heaps and locksets provided the scheduler-visible states of processes in both arguments match:

$$((R_1, h_1, L_1), M_1) \circ (M_2, h_2, L_2) = (R_1, h_1 \sqcup h_2, L_1 \sqcup L_2)$$

if both unions are defined and $M_1 = M_2$; undefined otherwise. It is this operation that carries over statements proved in the high-level proof system about the abstract machine with one virtual CPU per process to the concrete machine: the second operand $(M_2, h_2, L_2)$ represents the state owned by the processes running on the abstract machine, and the first $((R_1, h_1, L_1), M_1)$ relates the scheduler state in the concrete machine to the processes it has permissions to schedule. The components $M_1$ and $M_2$ are used to ensure that the two operands describe the same set of processes.
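As a rough illustration of how the partial operation underlying compose behaves, the following C sketch models heaps and locksets as bitmasks of owned cells and held locks, and multisets of contexts as occurrence counts. The encoding, the bound `NCTX` and all type and function names are assumptions made for this example; they are not part of the formal development.

```c
#include <stdbool.h>
#include <stdint.h>

#define NCTX 4 /* hypothetical bound on the number of distinct contexts */

/* M(Context), modelled as an occurrence count per context. */
typedef struct { unsigned count[NCTX]; } CtxMultiset;

/* Heaps and locksets are modelled as bitmasks; regs stands in for R. */
typedef struct { uint32_t regs, heap, locks; } Config;
typedef struct { Config conf; CtxMultiset procs; } RelConfig;
typedef struct { CtxMultiset procs; uint32_t heap, locks; } HighConfig;

/* Disjoint union of two bitmask-encoded sets; false means "undefined". */
static bool disjoint_union(uint32_t a, uint32_t b, uint32_t *out) {
    if (a & b) return false;
    *out = a | b;
    return true;
}

/* Multiset equality: every context occurs the same number of times. */
static bool same_multiset(const CtxMultiset *a, const CtxMultiset *b) {
    for (int i = 0; i < NCTX; i++)
        if (a->count[i] != b->count[i]) return false;
    return true;
}

/* Pointwise counterpart of compose: defined only if both operands describe
   the same processes and their heaps and locksets do not overlap. */
bool compose_one(const RelConfig *s, const HighConfig *k, Config *out) {
    uint32_t heap, locks;
    if (!same_multiset(&s->procs, &k->procs)) return false;
    if (!disjoint_union(s->conf.heap, k->heap, &heap)) return false;
    if (!disjoint_union(s->conf.locks, k->locks, &locks)) return false;
    out->regs = s->conf.regs; /* the global context comes from the scheduler side */
    out->heap = heap;
    out->locks = locks;
    return true;
}
```

In this model the result is undefined (the function returns `false`) as soon as the two sides disagree on the multiset of processes, which mirrors the side condition $M_1 = M_2$.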

**Predicate definitions.** Consider $p \subseteq \text{RelState}$ and $q \subseteq \text{State}$. Given $k \in \text{CPUid}$ and $r \in \text{Context}$, we define the following embedding operations converting states to configurations:

$$[p]_k = \{(([k : r], h, L), M) \in \text{RelConfig} \mid ((r, h, L), M) \in p\}$$

$$[q]_r = \{(\{r\}, h, L) \in \text{HighConfig} \mid \exists h'.\ \text{dom}(h') = \{r(\text{sp}), \ldots, r(\text{ss}) + \text{StackSize} - 1\} \wedge (r, h \sqcup h', L) \in q\}$$

$$[p] = \{(([\,], h, L), M) \in \text{RelConfig} \mid ((r, h, L), M) \in p\}$$

$$[q] = \{(\emptyset, h, L) \in \text{HighConfig} \mid (r, h, L) \in q\}$$

The first one tags states with CPU identifiers and is used to construct lowinv$_\eta$. The second selects the states with a given context $r$ and is used for highinv$_\eta$. For technical reasons it removes the empty slots of the process stack, which are accounted for in the scheduler state (see the definition of SchedSleep$_k$ below). The remaining two operations are used for lock$_{L'}$ and highlock$_{L'}$. As resource invariants do not restrict registers, they ignore contexts. We also need predicates defining states where the CPU is at a particular label $l$, or configurations with a particular lockset $L$:

$$\text{at}_S(l) = \{((r, h, L), M) \in \text{RelState} \mid r(\text{ip}) = l\}$$

$$\text{held}_S(L) = \{((R, h, L), M) \in \text{RelConfig}\}$$

$$\text{held}_K(L) = \{(M, h, L) \in \text{HighConfig}\}$$

The following predicate describes the state of the scheduler on CPU $k$, when a process is running on this CPU and is at label $l$:

$$\text{SchedSleep}_k(l) = J_k * \text{sp}..(\text{ss} + \text{StackSize} - 1) \mapsto \_ * \text{Process}([\text{ip} : l, \text{if} : 1, \text{ss} : \text{ss}, \text{sp} : \text{sp}, \text{gr} : \text{gr}])$$

Finally, let $\otimes_\eta$ and $\otimes_k$ be the iterated versions of $\star_\eta$ and $\star_k$.

Using the above notation, we can define the predicates from the theorem. For $L_\eta \subseteq \text{dom}(I_2)$ and $L_K \subseteq \text{dom}(I_K)$, we have:

$$\text{lowinv}_\eta = \otimes_{k \in \text{CPUid}} \bigcup_{L \in L_\eta} \text{at}(l) \circ \text{SchedSleep}_p(l) \cap \text{low}(l) \cup \bigcup_{L \in \text{labels}(K)} \text{at}(l)$$

$$\text{highinv}_\eta = \bigcup_{M \in \text{M(Context)}} \bigcup_{r \in \text{R}} \text{at}(r)$$

$$\text{lock} \circ L_\eta = \bigcup_{L \in L_\eta} \text{at}(l)$$

$$\text{highlock} \circ L_K = \bigcup_{L \in L_K} \text{at}(l)$$

The definitions follow the informal explanation given at the beginning of this section. To determine the state of the scheduler on a given CPU when defining lowinv\_η, we branch over all possible program positions \( l \) of that CPU. Depending on whether \( l \) is in the scheduler or the kernel code, we use either the assertion in the scheduler proof or the invariant \( \text{SchedSleep}_k \), describing the state of the scheduler when it is not running. Since assertions do not restrict the value of the \( ip \) register, we have to do this explicitly using \( \text{at}_S \). Note that, although assertions in the high-level proof system mention the empty slots of the process stack, the slots in fact belong to the scheduler when the process is preempted. For simplicity we choose always to count them in the scheduler state (the assertion in the scheduler proof or the scheduler invariant \( \text{SchedSleep}_k \)).

To define highinv\_η we branch over all possible finite multisets of contexts \( M \), representing processes that may run on the machine. For every context \( r \) in \( M \), the local state of the corresponding process is then determined by the assertion in the proof of the kernel at the program point \( r(ip) \), restricted to the states with the context \( r \). Note that the comprehension \( r \in M \) over a multiset \( M \) considers every duplicate of an element in the multiset separately.
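As a small illustration, take a multiset with a repeated context, \( M = \{r_1, r_1, r_2\} \), and write \( P(r) \) for the per-process predicate contributed by a context \( r \) (both \( M \) and \( P \) are placeholder names used only for this example). The iterated composition then has one factor per occurrence:

$$\otimes_{r \in M} P(r) = P(r_1) \star_k P(r_1) \star_k P(r_2)$$

so two processes that happen to have identical contexts each contribute their own local state.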

Finally, lowlock\_η and highlock\_η are straightforward combinations of resource invariants for the given sets of locks.

**Ownership transfer.** It is instructive to analyse how ownership transfer between the scheduler and the kernel is handled by our soundness statement. For example, consider a transfer of a new process descriptor \( desc(d, \gamma) \) from the kernel to the scheduler at a call to create. Since the CREATE axiom requires the descriptor in its precondition, before the kernel calls create, the state partitioning defined by \( R \) counts the descriptor as part of highinv\_η. Since the implementation of create receives the descriptor in its precondition, in the configuration immediately after the call to create, \( R \) defines it to be part of lowinv\_η. Thus, ownership transfer re-partitions program state among the parts defined in Theorem 1.

**Consequences.** Theorem 1 allows us to check invariance properties of preemptable code. For example, assume that the initial configuration satisfies \( R \). Then the soundness statement ensures that the machine cannot reach an error label \( l_e \), on any CPU, provided the assertion at this program point in all high-level proofs is false. Indeed, in this case the invariant \( R \) does not contain any states where one of the CPUs is at \( l_e \). Note that the functional correctness of an OS kernel is usually formulated as a simulation between the kernel and its specification. As an OS kernel does not usually make any assumptions about user processes, proving the simulation can be reduced to proving an invariance property relating the two (e.g., [10, 16]). Thus, Theorem 1 can also be used to justify such proofs.

7. Related work

There have been a number of OS verification projects; see [15] for a survey. To our knowledge, none of these has included the verification of a scheduler in a preemptive kernel with the realistic features we consider. A representative example is the L4.verified project [16], which verified the L4 microkernel as a whole, together with the scheduler. There, proofs about kernel components other than the scheduler had to ensure the preservation of its invariants, e.g., the preservation of its runqueue. The proof was still tractable because the kernel was running on a uniprocessor and preemption was disabled most of the time. However, such an architecture is not used by mainstream operating systems.

The closest work to ours is the one by Feng et al. [6–8], who verified an idealised scheduler without dynamic process creation. Their logic considers a uniprocessor and does not handle ownership transfer between the scheduler and processes. Like us, they have separate proof systems for the scheduler and preemptable code. However, their high-level system is non-modular in that it does not have a notion of a process-local state. Their approach to low-level reasoning and proving the soundness of the logic is also different from ours. Because Feng et al. consider a restricted scheduler and high-level proof system, they are able to avoid designing a special relational low-level logic. Instead, they view calls to and returns from the scheduler as jumps and compile proofs of the scheduler and the rest of the system into OCAP [6], a logic supporting first-class code pointers. According to our understanding, extending this approach to handle multiprocessing, ownership transfer and a modular high-level proof system would be non-trivial.

Maeda and Yonezawa have proved a simple context-switch routine using an extension of alias types [18]. Their proof expresses the disjointness of data structures belonging to the scheduler and the rest of the kernel using the tensor operator of alias types, which corresponds to our separating conjunction. However, their type system does not hide the internal data structures of the scheduler while proving the rest of the kernel, and is thus non-modular.

Yang and Hawblitzel [23] have recently proposed a kernel where most of the codebase is typechecked and therefore cannot directly access data structures belonging to the core part of the kernel, including the scheduler. However, the guarantees established by the type system do not take into account the contents of data structures, so the kernel can still subvert the scheduler by leaving them in an inconsistent state. The OS resorts to runtime checks in such cases, introducing a performance penalty. The relationship to this work is that of a trade-off: type safety guarantees are easier to get, but are not as strong as those provided by a program logic.

Refinement is a well-known approach in verification of both operating systems and general concurrent programs [1, 10, 14, 16, 21]. We advance it further by proposing a novel form of it, in which the target of the refinement is defined axiomatically and refinement relations focus only on the relevant state of the systems related. This allows us to handle systems with complex ownership transfers.

8. Discussion

In this paper we have neither verified a complete operating system nor built an automatic tool. Instead, we have proposed a proof rule that allows decomposing the verification of a preemptive OS kernel into two simpler tasks—verifying the scheduler and preemptable code separately. Such a result is relevant no matter what type of formal analysis of OS code one is performing: manual or automatic verification, or even bug-finding. Moreover, as we argued in Section 2.2, the straightforward approach of verifying the scheduler together with the rest of the kernel makes reasoning intractable; thus, a result such as ours is in fact indispensable for verifying realistic OS kernels.

The only way we could communicate the proposed reasoning principles understandably is by presenting our results in a simplified setting. Besides, we could not cover all the interesting features of mainstream OS kernels, even with regard to scheduling, in one paper. Below we list some of the limitations of our results and possible ways to lift them, which also provide avenues for future work:

- We based our logic for preemptable code on concurrent separation logic, which would not be able to handle complicated concurrency mechanisms employed in modern OS kernels. The proof of soundness of our logic follows an approach that has been applied extensively to various concurrent derivatives of separation logic [11, 12]. This leads us to believe that we can integrate more advanced logics from this class [4, 5, 22] without problems.
- Our treatment of procedure calls is naive in that it does not allow us to reason about procedures modularly. We consider this problem orthogonal to our goal and believe that our logic can be combined with more powerful logics for procedures in low-level code, such as [9].
- We have considered schedulers with only two procedures in their interface, and fixed the piece of state transferred between the scheduler and the kernel at schedule to be the empty slots of the process stack. It is straightforward to add new procedures and define their pre- and postconditions abstractly, like desc in the precondition of create. The real issue is how to restrict the ways the scheduler is allowed to change the state it receives before giving the state back to the kernel. For example, in some operating systems (e.g., XNU), schedule can receive the ownership of the whole stack of the process being preempted and may reallocate the stack when it schedules the process again, while preserving its contents. Such an interference is routinely described in combinations of separation logic and rely-guarantee [4, 5, 22] and can be integrated into our logic.
- Modern OS kernels have a number of features that break through the abstraction of a virtual CPU implemented by the scheduler. For example, they allow preemptable code to disable interrupts, e.g., to access data structures local to a particular CPU. The effects of such features can be axiomatised in the high-level logic in much the same way as we axiomatised the effect of the create routine of the scheduler. We plan to report on extensions of our logic to such features in future papers.
- Our logic is designed for proving safety properties only. Proof methods for liveness properties or the absence of deadlocks usually rely on modular methods for safety properties. Thus, our logic is a prerequisite for attacking liveness in the future.

Despite the above limitations, our logic is the first to handle patterns of interaction between the scheduler and the kernel that are present in mainstream operating systems. Even though the logic has been formalised in a particular setting, its key technical ideas are transferable and can be reused in OS verification projects.

Acknowledgements

We would like to thank Anindya Banerjee, Xinyu Feng, Boris Koepf, Mark Marron, Peter O’Hearn, Matthew Parkinson, Noam Rinetzky, Zhong Shao, Viktor Vafeiadis and Jules Villard for comments and discussions that helped improve the paper. Yang was supported by EPSRC.

References


A. Proof of the example scheduler

Below we provide a proof outline for the scheduler in Figure 1. We establish the judgements for schedule and create in the low-level proof system required by the proof rule in Section 4.5 and verify fork in the high-level proof system.

Note that despite assertions in the proof being long, all the steps in it are purely mechanical. In fact, the data structure manipulations involved are of the kind that can be handled by automatic tools based on separation logic\(^2\).

In the proof we write \( \texttt{var} \models P \) for a local C variable or procedure parameter \( \texttt{var} \) instead of

\[ \exists var.\ (\text{sp} - \mathit{off}_{\texttt{var}}) \mapsto var * P \]

where \( \mathit{off}_{\texttt{var}} \) is the offset of \( \texttt{var} \) with respect to the top of the stack in the activation record of the function where it is declared (note that here \( \texttt{var} \) is a program variable, whereas \( var \) is a logical one). In the proof of fork, \( F \) is the local state of the parent, \( \Sigma \) the contents of its stack and \( P \) the precondition of the newly created thread. In the proof of load_balance, the assertion \( Q \) describes the local state of the schedule function calling it:

\[
\begin{align*}
(cpu, \text{old_process}) & \models \exists \vec{g}, d. \text{if } d = 0 \land d.\text{kernel_stack} = \text{ss} \land \\
& \quad \text{cpu} = k \land 0 \leq \text{sp} - \text{ss} - m - s - 1 \leq \text{StackBound} \\
& \quad \text{current}[k] \rightarrow d \ast d.\text{prev} \rightarrow \ast d.\text{next} \\
& \quad \text{d.timeslice} \rightarrow \ast d.\text{saved_sp} \\
& \quad \text{(sp} - s - m - 1) \rightarrow \vec{g} \rightarrow \\
& \quad \text{Process}([l, \text{if} : 1, \text{ss} : \text{ss}, \text{sp} : \text{sp} - s - m - 1, \vec{g} : \vec{g}])
\end{align*}
\]

We abbreviate SCHED_FRAME to \( s \) and FORK_FRAME to \( f \). We assume that

\[
\text{StackSize} - \text{StackBound} \geq 2 \ast m + 2 + 4 \ast \text{sizeof(int)} + 2 \ast \text{sizeof(Process*)}
\]

so that the kernel leaves enough space on the stack for the activation records of schedule and load_balance or create.
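The bound can be checked mechanically for concrete parameters. The sketch below simply evaluates the right-hand side of the inequality; the function names and the sample values used afterwards (m = 8, a 4-unit int, 8-unit pointers) are assumptions chosen for illustration and do not come from the proof.

```c
#include <stdbool.h>
#include <stddef.h>

/* Right-hand side of the assumption:
   2*m + 2 + 4*sizeof(int) + 2*sizeof(Process*).
   The sizes are passed in explicitly so the result is platform-independent. */
size_t required_slack(size_t m, size_t int_size, size_t ptr_size) {
    return 2 * m + 2 + 4 * int_size + 2 * ptr_size;
}

/* True when StackSize - StackBound leaves enough room for the frames. */
bool stack_assumption_holds(size_t stack_size, size_t stack_bound,
                            size_t m, size_t int_size, size_t ptr_size) {
    if (stack_bound > stack_size) return false; /* avoid unsigned wrap-around */
    return stack_size - stack_bound >= required_slack(m, int_size, ptr_size);
}
```

With the sample values the required slack is 2·8 + 2 + 4·4 + 2·8 = 50, so StackSize = 4096 and StackBound = 4000 would satisfy the assumption, whereas StackBound = 4060 would not.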

\#define FORK_FRAME sizeof(Process*)
\#define SCHED_FRAME sizeof(Process*)+sizeof(int)

struct Process {
    Process *prev;
    Process *next;
    word kernel_stack[StackSize];
    word *saved_sp;
    int timeslice;
};

Lock *runqueue_lock[NCPUS];
Process *runqueue[NCPUS];
Process *current[NCPUS];

void schedule() {
    \{SchedState\}
    int cpu;
    Process *old_process;
    \{cpu, old_process \models \exists \vec{g}, d. \text{if } d = 0 \land d.\text{kernel_stack} = \text{ss} \land \\
& \quad 0 \leq \text{sp} - \text{ss} - m - s - 1 \leq \text{StackBound} \\
& \quad \text{current}[k] \rightarrow d \ast d.\text{prev} \rightarrow \ast d.\text{next} \rightarrow \ast \\
& \quad \text{d.timeslice} \rightarrow \ast d.\text{saved_sp} \\
& \quad \text{(sp} - s - m - 1) \rightarrow \vec{g} \rightarrow \\
& \quad \text{sp.} \ast (\text{ss} + \text{StackSize} - 1) \\
& \quad \ast \vec{g}
\}

Process(ip, l, if: l, s, ss, sp: sp−s−m−1, ⃗g: ⃗y) *
∀x, y, z, runqueue[k] → x ∨ desc0(z, ⃗x) *
prev → old_process ⨝ next → x ∨ dl(l, x, old_process, y)\n current[cpu] = runqueue[cpu]−next; \\
{(cpu, old_process | locked(runqueue_lock[k]) | 3l, ⃗g, ⃗y | if = 0 ∧ \\
old_process kernel_stack = ss ∧ \\
cpu = k ∧ 0 ≤ sp−s−m−s−1 ≤ StackBound ∧ \\
current[k] → old_process |} \\
old_process prev → x ∨ old_process ⨝ next → x ∨ \\
old_process timeslice → x ∨ old_process.saved_sp ⨝ next → x ∨ \\
(sp, (as+StackSize−1) −→, ⃗x).
Process(ip, l, if: l, s, ss, sp: sp−s−m−1, ⃗g: ⃗y) *
∀x, y, z, runqueue[k] → x ∨ desc0(z, ⃗x) *
prev → old_process ⨝ next → x ∨ dl(l, x, old_process, y) \n remove_node(current[cpu]); \\
{(cpu, old_process | locked(runqueue_lock[k]) | 3l, ⃗g, ⃗y | if = 0 ∧ \\
old_process kernel_stack = ss ∧ \\
cpu = k ∧ 0 ≤ sp−s−m−s−1 ≤ StackBound ∧ \\
current[k] → old_process |} \\
old_process prev → x ∨ old_process ⨝ next → x ∨ \\
old_process timeslice → x ∨ old_process.saved_sp ⨝ next → x ∨ \\
(sp, (as+StackSize−1) −→, ⃗x).
Process(ip, l, if: l, s, ss, sp: sp−s−m−1, ⃗g: ⃗y) *
∀x, y, z, runqueue[k] → x ∨ desc0(z, ⃗x) *
prev → x ∨ next → x ∨ dl(l, x, old_process, y))
old_process(saved_sp = sp; \\
{(cpu, old_process | locked(runqueue_lock[k]) | 3l, ⃗g, ⃗y | if = 0 ∧ \\
old_process kernel_stack = ss ∧ \\
cpu = k ∧ 0 ≤ sp−s−m−s−1 ≤ StackBound ∧ \\
current[k] → old_process |} \\
old_process prev → x ∨ old_process ⨝ next → x ∨ \\
old_process timeslice → x ∨ old_process.saved_sp ⨝ next → x ∨ \\
(sp, (as+StackSize−1) −→, ⃗x).
Process(ip, l, if: l, s, ss, sp: sp−s−m−1, ⃗g: ⃗y) *
∀x, y, z, runqueue[k] → x ∨ desc0(z, ⃗x) *
prev → x ∨ next → x ∨ dl(l, x, old_process, y) \n lock(runqueue_lock[cpu]); \\
{(cpu, old_process | locked(runqueue_lock[k]) | 3l, ⃗g, ⃗y | if = 0 ∧ \\
d kernel_stack = ss ∧ \\
cpu = k ∧ 0 ≤ sp−s−m−s−1 ≤ StackBound ∧ \\
current[k] → d |} \\
d.prev → x ∨ next → x ∨ dl(l, x, old_process, y) \n lock(runqueue_lock[cpu]); \\
{(cpu, old_process | locked(runqueue_lock[k]) | 3l, ⃗g, ⃗y | if = 0 ∧ \\
d kernel_stack = ss ∧ \\
cpu = k ∧ 0 ≤ sp−s−m−s−1 ≤ StackBound ∧ \\
current[k] → d |} \\
d.prev → x ∨ next → x ∨ dl(l, x, old_process, y) \n unlock(runqueue_lock[cpu]); \\
{(cpu, old_process | locked(runqueue_lock[k]) | 3l, ⃗g, ⃗y | if = 0 ∧ \\
d kernel_stack = ss ∧ \\
cpu = k ∧ 0 ≤ sp−s−m−s−1 ≤ StackBound ∧ \\
current[k] → d |} \\
d.prev → x ∨ next → x ∨ dl(l, x, old_process, y) \n // We deallocate local variables here \\
{SchedState} \\
{int} \\
}
Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, locked(runqueue_lock[cpu]) * 3x, y, z, runqueue[cpu]2] → z’ *
desc(z, z’) * z.prev → y * z.next → x * dllA(x, z, z, y) 
unlock(runqueue_lock[cpu]);
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x
if (non_empty) {11 random(0, 100) { 
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x
// We deallocate local variables here 
{cpu} |= Q * sp.(as+StackSize−1) → x
return;
}
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x
do { cpu2 = random(0, NCPUS-1); } while (cpu == cpu2);
{lock < cpu2 ∧ 0 ≤ cpu < NCPUS ∧ Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x
if (cpu < cpu2) {
lock(runqueue_lock[cpu]); lock(runqueue_lock[cpu2]);
} else {
lock(runqueue_lock[cpu2]); lock(runqueue_lock[cpu]);
}
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ locked(runqueue_lock[cpu]) * locked(runqueue_lock[cpu2]) * Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, y, z, x’, y’, z’, w. runqueue[cpu2] → z * runqueue[cpu2] → z’ *
desc(z, z’) * prev → y * next → x * dllA(x, z, z, y) 
desc(z’, z’) * prev → y’ * next → x’ * dllA(x’, z’, z’, y’) 
if (runqueue[cpu2]−next != runqueue[cpu2]) {
{lock < runqueue_lock[cpu2] + locked(runqueue_lock[cpu2]) * Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, y, z, x’, y’, z’, w. runqueue[cpu]2] → z * runqueue[cpu2] → z’ *
desc(z, z’) * prev → y * next → x * dllA(x, z, z, y) 
desc(z’, z’) * prev → y’ * next → x’ * dllA(x’, z’, z’, y’) 
proc = runqueue[cpu2]−next;
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ locked(runqueue_lock[cpu]) * locked(runqueue_lock[cpu2]) * Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, y, z, x’, y’, z’, w. runqueue[cpu]2] → z * runqueue[cpu2] → z’ *
desc(z, z’) * prev → y * next → x * dllA(x, z, z, y) 
desc(z’, z’) * prev → y’ * next → x’ * dllA(x’, z’, z’, y’) 
proc.prev → z’ * proc.next → w * 
(∃γ. desc(proc, γ) * Process(γ)) * dllA(w, proc, z’, y’)
} remove_node_after(proc);
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ locked(runqueue_lock[cpu]) * locked(runqueue_lock[cpu2]) * Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, y, z, x’, y’, z’, w. runqueue[cpu]2] → z * runqueue[cpu2] → z’ *
desc(z, z’) * prev → y * next → x * dllA(x, z, z, y) 
desc(z’, z’) * prev → y’ * next → x’ * dllA(x’, z’, z’, y’) 
proc.prev → z’ * proc.next → w * 
(∃γ. desc(proc, γ) * Process(γ)) * dllA(w, proc, z’, y’)
} insert_node_after(runqueue[cpu], proc); 
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ locked(runqueue_lock[cpu]) * locked(runqueue_lock[cpu2]) * Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, y, z, x’, y’, z’, w. runqueue[cpu]2] → z * runqueue[cpu2] → z’ *
desc(z, z’) * prev → y * next → x * dllA(x, proc, z, y) 
desc(z’, z’) * prev → y’ * next → w * 
proc.prev → z * proc.next → x * dllA(x, proc, z, y) 
(∃γ. desc(proc, γ) * Process(γ)) * dllA(w, z’, z’, y’)
} unlock(runqueue_lock[cpu]); 
{cpu, cpu2, non_empty, proc} |= 0 ≤ cpu < NCPUS ∧ locked(runqueue_lock[cpu]) * locked(runqueue_lock[cpu2]) * Q[sp−2+sizeof(int)−sizeof(Process)/sp] * 
sp.(as+StackSize−1) → x, y, z, x’, y’, z’, w. runqueue[cpu]2] → z * runqueue[cpu2] → z’ *
desc(z, z’) * prev → y * next → x * dllA(x, proc, z, y) 
desc(z’, z’) * prev → y’ * next → w * 
proc.prev → z * proc.next → x * dllA(x, proc, z, y) 
(∃γ. desc(proc, γ) * Process(γ)) * dllA(w, z’, z’, y’)
} regret
void create(Process *new_process) {
// Here we move the parameter from gr1 into
// the new_process local variable
{new_process |= ∃γ, ι(ι(ι) = 1 ∧ SchedStateι[ap−sizeof(Process)/sp] * desc(new_process, γ) * Process(γ))
int cpu;
{cpu |= γ, ι(ι(ι) = 1 ∧ SchedStateι[ap−sizeof(int)−sizeof(Process)/sp] * desc(new_process, γ) * Process(γ))
savecpu(&cpu);
{new_process, cpu} |= ∃γ, ι = k ∧ γ(ι(ι) = 1 ∧ SchedStateι[ap−sizeof(int)−sizeof(Process)/sp] * new_process.prev → x, new_process.next → x * desc(new_process, γ) * Process(γ))
new_process→timeslice = SCHED_QUANTUM;
{new_process, cpu} |= ∃γ, cpu = k ∧ γ(ι(ι) = 1 ∧ SchedStateι[ap−sizeof(int)−sizeof(Process)/sp] * new_process.prev → x, new_process.next → x * desc(new_process, γ) * Process(γ))
lock(runqueue_lock[cpu]);
{new_process, cpu} |= ∃γ, cpu = k ∧ γ(ι(ι) = 1 ∧ SchedStateι[ap−sizeof(int)−sizeof(Process)/sp] * new_process.prev → x, new_process.next → x * desc(new_process, γ) * Process(γ))
{new_process, cpu} |= ∃γ, new_process → timeslice = SCHED_QUANTUM;
{new_process, cpu} |= ∃γ, new_process.prev → x, new_process.next → x * desc(new_process, γ) * Process(γ))
lock(runqueue_lock[cpu]);
{new_process, cpu} |= ∃γ, new_process.prev → x, new_process.next → x * dllA(x, new_process, z, y) * locked(runqueue_lock[k]);
insert_node_after(runqueue[cpu], new_process);
{new_process, cpu} |= ∃γ, new_process.prev → x, new_process.next → x * dllA(x, new_process, z, y) * locked(runqueue_lock[k]);
unlock(runqueue_lock[cpu]);
{new_process, cpu} |= ∃γ, new_process.prev → x, new_process.next → x * dllA(x, new_process, z, y) * locked(runqueue_lock[k]);
unlock(runqueue_lock[cpu]);
} // We deallocate local variables here
{cpu} |= Q * sp.(as+StackSize−1) → x
}
int fork() {
{0 ≤ sp−ss ≤ StackBound−f ∧ 
ss.(ap−1) → 200sp + sp.(as+StackSize−1) → x * F * P} 
Process *new_process;
{new_process \triangleright= 0 \leq sp-ss \leq StackBound ∧
ss_.(ap-f–1) \triangleright= \Sigma0g \triangleright= sp_.(as+StackSize–1) \triangleright= _\ast F \ast P}
new_process = malloc(sizeof(Process));
{new_process \triangleright= 0 \leq sp-ss \leq StackBound ∧
ss_.(ap-f–1) \triangleright= \Sigma0g \triangleright= sp_.(as+StackSize–1) \triangleright= _\ast F \ast P
new_process.prev \rightarrow= \_\ast \text{new_process}\_\text{next} \rightarrow= \_\ast
new_process\_\text{time} \rightarrow= \_\ast \text{new_process}\_\text{saved_sp} \rightarrow= \_\ast
new_process\_\text{kernel_stack}..}
(new_process\_kernel_stack+StackSize–1) \rightarrow= ..
memcpy(new_process->kernel_stack, _ss, StackSize);
{new_process \triangleright= 0 \leq sp-ss \leq StackBound ∧
ss_.(ap-f–1) \triangleright= \Sigma0g \triangleright= sp_.(as+StackSize–1) \triangleright= _\ast F \ast P
new_process\_\text{prev} \rightarrow= \_\ast \text{new_process}\_\text{next} \rightarrow= \_\ast
new_process\_\text{time} \rightarrow= \_\ast \text{new_process}\_\text{saved_sp} \rightarrow= \_\ast
new_process\_\text{kernel_stack}..}
(new_process\_kernel_stack+sp-ss–f–1) \rightarrow= \Sigma0g
(new_process\_kernel_stack+sp-ss–f–1).
(new_process\_kernel_stack+sp-ss–f–1).

We assume \(P\) satisfies the premise of the Create rule \textsc{\_call create(new_process)};

// We deallocate local variables here
\{0 \leq sp-ss–f \leq StackBound ∧
ss_.(ap–1) \triangleright= \Sigma0g \triangleright= sp_.(as+StackSize–1) \triangleright= _\ast F\}
return 1;
}
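
The load_balance outline above takes the two runqueue locks in increasing order of CPU index. A fixed global order of this kind is a standard way to rule out deadlock between two CPUs balancing against each other. The following sketch isolates that pattern; the spinlock built on C11 `atomic_flag`, the value of `NCPUS` and the helper names `lock_pair`, `unlock_pair` and `is_held` are assumptions for illustration, not the primitives used in the proof.

```c
#include <stdatomic.h>

#define NCPUS 4 /* hypothetical number of CPUs */

/* One spinlock per runqueue, all initially free. */
static atomic_flag runqueue_lock[NCPUS] = {
    ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};

static void spin_lock(atomic_flag *l) {
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* busy-wait until the flag was previously clear */
}

static void spin_unlock(atomic_flag *l) {
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Acquire both locks, lower CPU index first. Requires cpu != cpu2. */
void lock_pair(int cpu, int cpu2) {
    int lo = cpu < cpu2 ? cpu : cpu2;
    int hi = cpu < cpu2 ? cpu2 : cpu;
    spin_lock(&runqueue_lock[lo]);
    spin_lock(&runqueue_lock[hi]);
}

/* Release order does not matter for deadlock freedom. */
void unlock_pair(int cpu, int cpu2) {
    spin_unlock(&runqueue_lock[cpu]);
    spin_unlock(&runqueue_lock[cpu2]);
}

/* Single-threaded probe: nonzero if lock i is held. It tries to take the
   lock and releases it again if the attempt succeeded. */
int is_held(int i) {
    if (atomic_flag_test_and_set(&runqueue_lock[i])) return 1;
    atomic_flag_clear(&runqueue_lock[i]);
    return 0;
}
```

Because every caller sorts the pair before locking, two CPUs that pick each other as balancing partners always contend on the same first lock and cannot each hold one lock while waiting for the other.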

B. Proof of Theorem 1

In this section we provide the proof of the soundness of our logic (Theorem 1). Note that to use the logic as a part of a verification system based on an interactive theorem prover (such as HOL4, Isabelle or Coq), our soundness proof would have to be formalised in the underlying logic of the proof assistant used. We would like to stress that the proof presented here uses only basic technical devices and is based on an approach that has been used extensively in proving the soundness of various concurrent versions of separation logic [11]. Thus, we do not foresee any major difficulties in formalising it in an interactive theorem proving system, if such a need arises.

**Auxiliary definitions.** In the following we write \(\{E(\_)\}\), where \(E\) is an expression, for the set of values of \(E\) obtained by substituting arbitrary values from the corresponding domains for the occurrences of \(\_\).

For a set \(\Sigma\), \(M \in \mathcal{M}(\text{Context})\) and an element \(\sigma \in \Sigma \cup \{\top\}\) we let \((\sigma, M) = \top\) when \(\sigma = \top\) and \((\sigma, M) = (\sigma, M)\) otherwise.

Recall the semantic domains used in this paper:

<table>
<tbody>
<tr>
<td>State</td>
<td>= Context × Heap × Lockset</td>
</tr>
<tr>
<td>RelState</td>
<td>= State × \(\mathcal{M}(\text{Context})\)</td>
</tr>
<tr>
<td>Config</td>
<td>= GContext × Heap × Lockset</td>
</tr>
<tr>
<td>RelConfig</td>
<td>= Config × \(\mathcal{M}(\text{Context})\)</td>
</tr>
<tr>
<td>HighConfig</td>
<td>= \(\mathcal{M}(\text{Context})\) × Heap × Lockset</td>
</tr>
</tbody>
</table>

Let \(p \subseteq \text{State}\), \(q \subseteq \text{RelState}\), \(l \in \text{Label}\), \(k \in \text{CPUid}\), \(\ell \in \text{Lock}\), \(R \in \text{GContext}\) and \(M \in \mathcal{M}(\text{Context})\). In our proof, we use the following definitions:

\[
\begin{aligned}
\text{at}_K(l) &= \{ (r, h, L) \in \text{State} \mid r(\text{ip}) = l \} \\
\text{lk}_K(\ell) &= \{ (\_, [\,], \{ \ell \}) \in \text{State} \} \\
\text{lk}_S(\ell) &= \{ ((\_, [\,], \{ \ell \}), \emptyset) \in \text{RelState} \} \\
|p|_k &= \{ ([k : r], h, L) \in \text{Config} \mid (r, h, L) \in p \} \\
\text{conf}(R) &= \{ (R, [\,], \emptyset) \in \text{Config} \} \\
\text{conf}_S(R) &= \{ ((R, [\,], \emptyset), \emptyset) \in \text{RelConfig} \} \\
\text{to}_K(l, p) &= \{ (r[\text{ip} : l], h, L) \in \text{State} \mid (r, h, L) \in p \} \\
\text{to}_S(l, q) &= \{ ((r[\text{ip} : l], h, L), M) \in \text{RelState} \mid ((r, h, L), M) \in q \}
\end{aligned}
\]

We also let \(|\top|_k = \top\).

For a set \(\Sigma\), let \(\mathcal{P}(\Sigma)^\top\) be the domain of subsets of \(\Sigma\) with a special element \(\top\). The order \(\subseteq\) in the domain \(\mathcal{P}(\Sigma)^\top\) is subset inclusion with \(\top\) being the greatest element. For \(\sigma \in \Sigma \cup \{\top\}\) we denote with \(\{\sigma\}\) the singleton set \(\{\sigma\}\), if \(\sigma \in \Sigma\), and \(\top\), if \(\sigma = \top\). Thus, \(\{\sigma\}\) is in \(\mathcal{P}(\Sigma)^\top\).

If \(\Sigma\) has a partial operation \(\ast : \Sigma \times \Sigma \rightarrow \Sigma\) defined on it, we can lift \(\ast\) to \(\mathcal{P}(\Sigma)^\top\) pointwise: for all \(p, q \in \mathcal{P}(\Sigma)^\top\)

\[
p \ast q = \{ \sigma \ast \eta \mid \sigma \in p, \eta \in q, \sigma \ast \eta \text{ is defined}\}
\]
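
A minimal executable sketch of this lifting, assuming \(\Sigma\) is the set of 4-bit masks with disjoint union as the partial operation \(\ast\). The type `PSet`, the treatment of \(\top\) as absorbing, and all function names are assumptions made for the example rather than definitions taken from the proof.

```c
#include <stdbool.h>
#include <string.h>

#define NELEM 16 /* Sigma = 4-bit masks 0..15 (hypothetical toy domain) */

/* An element of P(Sigma)^T: either the special element top, or a subset. */
typedef struct { bool top; bool member[NELEM]; } PSet;

/* Partial operation on Sigma: disjoint union, undefined on overlap. */
static bool star(unsigned a, unsigned b, unsigned *out) {
    if (a & b) return false;
    *out = a | b;
    return true;
}

/* Pointwise lifting: collect sigma * eta over all pairs where it is
   defined. Assumption of this sketch: top absorbs everything. */
PSet lift_star(const PSet *p, const PSet *q) {
    PSet r;
    memset(&r, 0, sizeof r);
    if (p->top || q->top) { r.top = true; return r; }
    for (unsigned s = 0; s < NELEM; s++)
        for (unsigned e = 0; e < NELEM; e++) {
            unsigned u;
            if (p->member[s] && q->member[e] && star(s, e, &u))
                r.member[u] = true;
        }
    return r;
}

/* Helper building the singleton set {sigma}. */
PSet singleton(unsigned sigma) {
    PSet r;
    memset(&r, 0, sizeof r);
    r.member[sigma] = true;
    return r;
}
```

For p = {0001, 0010} and q = {0010, 0100}, the lifted result contains 0011, 0101 and 0110; the pair (0010, 0010) contributes nothing because the operation is undefined on it.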

We now define three such partial operations

\[
\begin{aligned}
\ast_K &: \text{State} \times \text{State} \rightarrow \text{State} \\
\ast_S &: \text{RelState} \times \text{RelState} \rightarrow \text{RelState} \\
\ast_C &: \text{Config} \times \text{Config} \rightarrow \text{Config}
\end{aligned}
\]

with the first two interpreting the \(\ast\) connectives in the high- and low-level proof systems, respectively.

For \((r_1, h_1, L_1), (r_2, h_2, L_2) \in \text{State}\) we let

\[
(r_1, h_1, L_1) \ast_K (r_2, h_2, L_2) = (r_1, h_1 \uplus h_2, L_1 \uplus L_2)
\]

if \(r_1 = r_2\); undefined otherwise.

For \(((r_1, h_1, L_1), M_1), ((r_2, h_2, L_2), M_2) \in \text{RelState}\) we let\(^3\)

\[
((r_1, h_1, L_1), M_1) \ast_S ((r_2, h_2, L_2), M_2) =
((r_1, h_1 \uplus h_2, L_1 \uplus L_2), M_1 \uplus M_2)
\]

if \(r_1 = r_2\); undefined otherwise.

For \((R_1, h_1, L_1), (R_2, h_2, L_2) \in \text{Config}\) we let

\[
(R_1, h_1, L_1) \ast_C (R_2, h_2, L_2) = (R_1 \uplus R_2, h_1 \uplus h_2, L_1 \uplus L_2)
\]

We lift the definitions above to the corresponding domains.

It is convenient for us to reformulate the semantics of primitive commands in Figure 6 in terms of transformers

\[
f_c^k : \text{Label} \times \text{Label} \times \text{State} \rightarrow \mathcal{P}(\text{State})^\top, \quad k \in \text{CPUId},
\]

for \(c \in \text{PComm}\), defined as follows: \(f_c^k(l, l', (r, h, L)) = \top\), if \(k, (r, h, L), l, l' \sim_c \top\), and

\[
f_c^k(l, l', (r, h, L)) = \{ (r'[\text{ip} : l''], h', L') \mid k, (r, h, L), l, l' \sim_c ((r', h', L'), l'') \}
\]

otherwise. We extend the transformers to

\[ f_c^k : \text{Label} \times \text{Label} \times \text{RelState} \to \mathcal{P}(\text{RelState})^\top, \quad k \in \text{CPUId}, \]

as follows:

\[ f_c^k(l, l', ((r, h, L), M)) = (f_c^k(l, l', (r, h, L)), M) \] (1)

\(^3\) Recall that the \(\uplus\) operation on multisets adds up the number of occurrences of each element in its operands.

We lift the above transformers to \( \mathcal{P}(\text{State})^\top \) and \( \mathcal{P}(\text{RelState})^\top \) pointwise. For example, for \( p \in \mathcal{P}(\text{State})^\top \) we let

\[ f_c^k(l, l', p) = \bigcup\{ f_c^k(l, l', (r, h, L)) \mid (r, h, L) \in p \}, \quad \text{if } p \neq \top; \]
\[ f_c^k(l, l', p) = \top, \quad \text{if } p = \top. \]

The transformers thus defined satisfy the property of \textit{locality} with respect to the operations \( \ast_K \) and \( \ast_S \):

\[ f_c^k(l, l', \{(r, h_1, L_1)\} \ast_K \{(\_, h_2, L_2)\}) \subseteq f_c^k(l, l', \{(r, h_1, L_1)\}) \ast_K \{(\_, h_2, L_2)\} \] (2)

\[ f_c^k(l, l', \{((r, h_1, L_1), M_1)\} \ast_S \{((\_, h_2, L_2), M_2)\}) \subseteq f_c^k(l, l', \{((r, h_1, L_1), M_1)\}) \ast_S \{((\_, h_2, L_2), M_2)\} \] (3)
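The locality property can be checked on toy commands with a minimal Python sketch. Everything here is an assumption for illustration: states are triples of a register tuple, a heap (frozenset of address/value pairs) and a lockset; `TOP` stands in for \(\top\) and is treated as absorbing under composition; `store` and `alloc_if_free` are hypothetical commands, not ones from Figure 6. Locality is read as "running the command on a composed state yields no more than running it on the small state and then adding the frame".

```python
TOP = object()  # stand-in for ⊤

def star_K(s1, s2):
    """Partial composition of states (r, h, L); None when undefined."""
    (r1, h1, L1), (r2, h2, L2) = s1, s2
    if r1 != r2 or {a for a, _ in h1} & {a for a, _ in h2} or L1 & L2:
        return None
    return (r1, h1 | h2, L1 | L2)

def star(p, q):
    """Pointwise lifting of star_K to sets of states with ⊤."""
    if p is TOP or q is TOP:
        return TOP
    pairs = (star_K(s, e) for s in p for e in q)
    return frozenset(x for x in pairs if x is not None)

def lift(f, p):
    """Pointwise lifting of a state transformer f to sets of states with ⊤."""
    if p is TOP:
        return TOP
    out = set()
    for s in p:
        res = f(s)
        if res is TOP:
            return TOP
        out |= res
    return frozenset(out)

def leq(a, b):
    """Order on the domain: subset inclusion with ⊤ as greatest element."""
    if b is TOP:
        return True
    if a is TOP:
        return False
    return a <= b

def store(addr, val):
    """Hypothetical command: write val to addr, faulting (⊤) if addr is unallocated."""
    def f(state):
        r, h, L = state
        heap = dict(h)
        if addr not in heap:
            return TOP
        heap[addr] = val
        return frozenset({(r, frozenset(heap.items()), L)})
    return f

def alloc_if_free(addr):
    """Hypothetical non-local command: its effect depends on cells it does not own."""
    def f(state):
        r, h, L = state
        heap = dict(h)
        heap.setdefault(addr, 0)
        return frozenset({(r, frozenset(heap.items()), L)})
    return f

def is_local(f, p, q):
    return leq(lift(f, star(p, q)), star(lift(f, p), q))

r = (("ip", 0),)
small = frozenset({(r, frozenset({(1, 0)}), frozenset())})
frame = frozenset({(r, frozenset({(2, 5)}), frozenset())})
```

In this sketch `store(2, 42)` still counts as local for `small` and `frame`: it faults on the small state, and \(\top\) dominates every result. `alloc_if_free(2)` fails the check because it behaves differently depending on whether the frame already owns address 2.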

Finally, consider a process descriptor \( \text{desc}(d, \gamma) \) with free logical variables \( d \) and \( \gamma \) and an interpretation of logical variables \( \eta \). We define \( \text{desc}_\eta : \text{Val} \times \text{Context} \to \mathcal{P}(\text{State}) \) as follows: for \( u \in \text{Val} \) and \( r \in \text{Context} \) we let \( \text{desc}_\eta(u, r) = [\![\text{desc}(d, \gamma)]\!]_{\eta[d : u, \gamma : r]} \).

\textbf{Semantic proofs.} To prove Theorem 1, we translate a syntactic proof in our logic (Section 4.5) into a semantic form. Namely, given an interpretation of logical variables \( \eta \), a \textit{semantic proof} [11, 12] of the OS program is defined as a tuple \( (G_S, G_K, I_S, I_K, J) \), where

- \( G_k^S : \text{Label} \to \mathcal{P}(\text{RelState}) \), \( k \in \text{CPUId} \);
- \( G_K : \text{Label} \to \mathcal{P}(\text{State}) \);
- \( I_S : \text{Lock} \rightharpoonup \mathcal{P}(\text{RelState}) \);
- \( I_K : \text{Lock} \rightharpoonup \mathcal{P}(\text{State}) \);
- \( J_k \in \mathcal{P}(\text{RelState}) \), \( k \in \text{CPUId} \)

such that

\[ \text{dom}(I_K) \cap \text{dom}(I_S) = \emptyset \] (4)

\[ \forall l \not\in \text{dom}(S) \cup \text{dom}(C) \cup \{ I_s, I_c \} \to G_k^S(l) = \emptyset \] (5)

\[ \forall l \not\in \text{dom}(S) \cup \text{dom}(C) \cup \{ I_s, I_c \} \to G_k^S(l) = \emptyset \] (6)

\[ G_k^S(\text{schedule}) = \{ \text{SchedState}(\gamma) \} \cap \text{ats} \] (7)

\[ G_k^S(\text{create}) = \{ \exists \gamma, \gamma(\text{if}) = 1 \land \text{SchedState}(\gamma) \cap \text{ats}(\text{create}) \} \] (8)

\[ G_k^S(\text{lock}) = \{ \exists \gamma, \gamma(\text{if}) = 1 \land \text{SchedState}(\gamma) \cap \text{ats}(\text{lock}) \} \] (9)

for all \( l \in \text{Label} \) and \( (r, h, L) \in G_k(L) \).

\[ 0 \leq r(s_p) - r(s(s)) \leq \text{StackBound} \land \]
\[ \text{dom}(h) \subseteq \{ r(s), r(s(s)), r(s(s)) \} \} \land \]
\[ \forall h'. (\forall u \not\in \{ r(s), r(s(s)), r(s(s)) \} \} \land \]
\[ h(u) = h'(u) \Rightarrow (r, h', L) \in G_k(l); \] (10)

and for all \( l \in \text{labels}(\text{OS}), l' \in \text{next}(\text{OS}, l), c = \text{comm}(\text{OS}, l) \) and \( k \in \text{CPUUid} \), we have:

- if \( c \) is not \textit{lock} or \textit{unlock}, and \( l \in \text{labels}(S) \cup \text{labels}(C) \),

\[ f_c^k(l, l', G_k^S(l)) \neq \top \land {} \]
\[ \forall ((r, h, L), M) \in f_c^k(l, l', G_k^S(l)).\ ((r, h, L), M) \in G_k^S(r(\text{ip})) \] (11)

- if \( c \) is not \textit{lock}, \textit{unlock} or \textit{icall}, and \( l \in \text{labels}(K) \),

\[ f_c^k(l, l', G_K(l)) \neq \top \land {} \]
\[ \forall (r, h, L) \in f_c^k(l, l', G_K(l)).\ (r, h, L) \in G_K(r(\text{ip})) \] (12)

- if \( c \) is \textit{lock}(\(\ell\)) and \( l \in \text{labels}(S) \cup \text{labels}(C) \),

\[ \text{tos}(l', G_k^S(l) \ast_S I_S(\ell)) \subseteq G_k^S(l') \] (13)

- if \( c \) is \textit{lock}(\(\ell\)) and \( l \in \text{labels}(K) \),

\[ \text{tok}(l', G_K(l) \ast_K I_K(\ell)) \subseteq G_K(l') \] (14)

- if \( c \) is \textit{unlock}(\(\ell\)) and \( l \in \text{labels}(S) \cup \text{labels}(C) \),

\[ \text{tos}(l', G_k^S(l)) \subseteq G_k^S(l') \ast_S I_S(\ell) \] (15)

- if \( c \) is \textit{unlock}(\(\ell\)) and \( l \in \text{labels}(K) \),

\[ \text{tok}(l', G_K(l)) \subseteq G_K(l') \ast_K I_K(\ell) \] (16)

- if \( c \) is \textit{icall}(schedule), then

\[ \text{tok}(l', G_K(l)) \subseteq G_K(l') \] (17)

- if \( c \) is \textit{icall}(create), then for some high-level assertions \( P, Q \in \text{Assert}_K \) such that \( \text{free}(P) \cap \text{Reg} = \emptyset \) we have

\[ G_k(l) \subseteq \{ \exists \gamma, \gamma(\text{if}) = 1 \land \text{desc}(\gamma, \gamma) \} \cap \text{at}(l) \] (18)

\[ G_k(l) \subseteq \{ \exists \gamma, \gamma(\text{if}) = 1 \land \text{desc}(\gamma, \gamma) \} \cap \text{at}(l) \] (19)

and for all \( r \in \text{Context} \)

\[ \{(r, r(s)\ldots(r(s)+\text{StackBound}-1); \} \subseteq \text{G}_k(r(ip)) \] (20)

Conditions (4)–(20) are semantic counterparts of the axioms in the high-level and low-level proof systems.

The following lemma shows that a syntactic proof can be converted into a semantic one.

\textbf{Lemma 1.} Given a proof \( J_K, \Delta_K \vdash J_S, \{ \Delta_S \} \) \( \vdash \) \( J \vdash (S, C, K) \) and an interpretation \( \eta \), there exists a semantic proof \( (G_S, G_K, I_S, I_K, J) \) such that for all \( l \in \text{Label} \) and \( k \in \text{CPUUid} \) we have

\[ G_k(l) = \{ \exists \gamma, \gamma(\text{if}) = 1 \land \text{desc}(\gamma, \gamma) \} \cap \text{at}(l) \]

We omit the straightforward proof of the lemma and proceed to prove the main soundness theorem.

\textbf{Proof of Theorem 1.} Let us fix an interpretation \( \eta \). We first apply Lemma 1 to construct a semantic proof \( (G_S, G_K, I_S, I_K, J) \) from the given syntactic one. Assume now that \( \sigma \in \mathcal{R} \) and \( \sigma \rightarrow_{\text{OS}} \sigma' \) for some \( \sigma' \in \text{Config} \cup \{ \top \} \). We need to show that \( \sigma' \in \mathcal{R} \).

Let the command in \( \sigma \rightarrow_{\text{OS}} \sigma' \) be executed by thread \( k \). We can thus assume

\[ \sigma = (R[k : r], h, L), \quad R(k) \text{ is undefined}, \quad r(ip) = l \]
\[ c = \text{comm}(\text{OS}, l), \quad l' \in \text{next}(\text{OS}, l) \]

Recall that \( \mathcal{R} \) is defined as

\[ \text{compose}(\cup_{L \cup L'} = \text{dom}(I_K) \land \text{lowinv}_\eta \cup \text{lowlock}_L), \]
\[ \cup_{L \cup L'} = \text{dom}(I_K) \land \text{highinv}_\eta \cup \text{highlock}_L) \]
Hence, there exist
\[
 h_1, h_2 \in \text{Heap}, \quad L_1 \subseteq \text{dom}(I_S), \quad L_2 \subseteq \text{dom}(I_K)
\]
such that
\[
 ((R[k : r], h_1, L_1), M) \in \text{lowinv}_\eta \ast_S \text{lowlock}_{\text{dom}(I_S) - L_1} \tag{21}
\]
\[
 (M, h_2, L_2) \in \text{highinv}_\eta \ast_K \text{highlock}_{\text{dom}(I_K) - L_2} \tag{22}
\]
\[
h = h_1 \uplus h_2, \quad L = L_1 \uplus L_2 \tag{23}
\]
We now consider several cases of how \( \sigma' \) may be obtained.

**Case 1.** \( \sigma' \) is obtained by applying the fourth rule in Figure 7. This case is impossible, since by (21) and the definition of \( \text{lowinv}_\eta \) we have \( l = r(\text{ip}) \in \text{labels}(\text{OS}) \).

**Case 2.** \( \sigma' \) is obtained by applying the first or the third rule in Figure 7, with the command executed by the scheduler and different from lock, unlock or iret. In this case \( l \in \text{labels}(S) \uplus \text{labels}(C) \) and
\[
 \{ \sigma' \} \subseteq \text{conf}(R) \circ \text{conf}_{\text{SchedSleep}_k(l)} \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
From (21), for some \( h_5 \in \text{Heap}, L_5 \in \text{Lockset} \) and \( M_5 \in \mathcal{M}(\text{Context}) \) we have
\[
 (R[k : h_5, L_5, M_5]) \in G_{k_5}^k(\{ \sigma' \}) \cap \text{dom}(I_s) - \{ h_5, L_5, M_5 \}
\]
and
\[
 (R[k : r], h_1, L_1), M \in \text{lowinv}_\eta \circ \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
We have:
\[
 \{ (l', \{ r, h, L \} ) \} \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
\[
 \{ \sigma' \} \subseteq \text{conf}(R) \circ \text{conf}_{\text{SchedSleep}_k(l)} \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
\[
 (R[k : r'], h', L'), M \in \text{lowinv}_\eta \circ \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
From (11), \( f_c^k(l, l', G_k^S(l)) \neq \top \), hence, \( \sigma' \neq \top \). Let \( \sigma' = (R[k : r'], h', L') \). Then by (11), we have
\[
 (R[k : r'], h', L'), M \in \text{conf}(R) \circ \text{conf}_{\text{SchedSleep}_k(l)} \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Then from (22) we get \( \sigma' \in \mathcal{R} \).

**Case 3.** \( \sigma' \) is obtained by applying the first rule in Figure 7, with the scheduler executing lock. In this case
\[
 l \in \text{labels}(S) \uplus \text{labels}(C), \quad c = \text{lock}(\ell), \quad \ell \notin L
\]
\[
 \{ \sigma' \} = (R[k : r[ip] \ell], h, L \uplus \{ \ell \})
\]
From (21), for some \( h_5, L_5, M_5 \) we have
\[
 (R[k : r], h_1, L_1), M \in G_{k_5}^k(\{ \sigma' \}) \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
and
\[
 (R[k : h_5, L_5, M_5]) \in \text{lowinv}_\eta \circ \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
\[
 \{ \sigma' \} \subseteq \text{conf}(R) \circ \text{conf}_{\text{SchedSleep}_k(l)} \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Then
\[
 (R[i[p] : l'], h_1, L_1 \uplus \{ \ell \}, M) \in \text{tos}(l', G_{k_5}^k(l) \cap \text{SchedSleep}_k(l)) \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
By (13), this implies
\[
 (R[i[p] : l'], h_1, L_1 \uplus \{ \ell \}, M) \in G_{k_5}^k(l') \cap \text{SchedSleep}_k(l) \cap \text{ats}_k(l)
\]
From this and (5) we get \( l' \in \text{dom}(S) \uplus \text{dom}(C) \uplus \{ l_1, l_2 \} \). Hence, by (27)
\[
 (R[k : r[ip] : l'], h_1, L_1 \uplus \{ \ell \}, M) \in \text{lowinv}_\eta \circ \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Then from (22) and the definition of compose we get \( \sigma' \in \mathcal{R} \).

**Case 4.** \( \sigma' \) is obtained by applying the first or the third rule in Figure 7, with the scheduler executing unlock. In this case \( l \in \text{labels}(S) \uplus \text{labels}(C), c = \text{unlock}(\ell) \) and (24) holds.

From (21), there exist \( h_5, L_5, M_5 \) satisfying (25) and (26).

From (25) we then get
\[
 (R[i[p] : l'], h_1, L_1 \uplus \{ \ell \}, M) \in \text{tos}(l', G_{k_5}^k(l) \cap \text{SchedSleep}_k(l)) \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Then by (15)
\[
 (R[i[p] : l'], h_1, L_1 \uplus \{ \ell \}, M) \in \text{conf}(R) \circ \text{conf}_{\text{SchedSleep}_k(l)} \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Hence, \( \ell \in L_1 \), which means that \( \sigma' \neq \top \). Then \( \sigma' = (R[k : r[\text{ip} : l']], h, L - \{ \ell \}) \). The above also implies
\[
 (R[i[p] : l'], h_1, L_1 \uplus \{ \ell \}, M) \in G_{k_5}^k(l') \cap \text{SchedSleep}_k(l) \cap \text{ats}_k(l)
\]
From this and (5) we get \( l' \in \text{dom}(S) \uplus \text{dom}(C) \uplus \{ l_1, l_2 \} \). Hence, by (26)
\[
 (R[k : r[ip] : l'], h_1, L_1 \uplus \{ \ell \}) \in \text{lowinv}_\eta \circ \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Then from (22) and the definition of compose we get \( \sigma' \in \mathcal{R} \).

**Case 5.** \( \sigma' \) is obtained by applying the first or the third rule in Figure 7, with the command executed by the kernel and different from lock, unlock or icall. In this case \( l \in \text{labels}(K) \) and (24) holds.

From (21), there exist \( h_5, L_5, M_5 \) satisfying (26) such that
\[
 (R[k : r], h_1, L_1), M \in \text{conf}(R) \circ \text{conf}_{\text{SchedSleep}_k(l)} \cap \text{ats}_k(l) \cap \text{lowlock}_{\text{dom}(I_s) - L_1 - L_2}
\]
Then for some \( h'_5 \in \text{Heap} \) and
\[
 h_0 \in \{ [r(a)p]...[r(a)s]+StackSize-1 : \}
\]
we have \(h_1 = h'_1 \uplus h_0\) and
\[
((r, h'_1, L_1), M) \in J_k \uplus s\{((\_; h_5, L_5), \{r\} \uplus M_2)\}
\]
Let \(h'_2 = h_2 \uplus h_0\), then
\[
h = h'_1 \uplus h'_2, \quad L = L_1 \uplus L_2 \tag{29}
\]
Note that from the above it follows that \(r \in M\). Then by (22) and (10) for some \(h_k \in \text{Heap}\) and \(L_k \in \text{Lockset}\) we have
\[
(r, h'_2, L_2) \in G_k(l) * k\{\_; h_k, L_k\} \tag{30}
\]
and
\[
(M - \{r\}, h_k, L_k) \in \bigcup_{r'' \in M - \{r\}} [[\Delta_k(r''(\_ip))]_{h''_1} * k \text{highlock}_{\text{dom}(h_k) - L_2} \tag{31}
\]
We have:
\[
\{\sigma'\} \subseteq \text{conf}(R) * c [f_k(l, l', (r, h, L))]_k \tag{24}
\]
\[
\text{conf}(R) * c [f_k(l, l', (r, h, h'_1, L_2))]_{k} \tag{29}
\]
\[
\text{conf}(R) * c [f_k(l, l', \text{G}_k(l)) * k\{\_; h'_1, h_k, L_1 \uplus L_2\}]_{k} \tag{30}
\]
\[
\text{conf}(R) * c [f_k(l, l', \text{G}_k(l)) * k\{\_; h'_1, h_k, L_1 \uplus L_2\}]_{k} \tag{2}
\]
By (12), \(f_c^k(l, l', G_K(l)) \neq \top\), hence, \(\sigma' \neq \top\). Let \(\sigma' = (R[k : r'], h', L)\). Then by (12)
\[
(R[k : r', h', L]) \in \text{conf}(R) * c [G_k(r''(\_ip)) * k\{\_; h'_1, h_k, L_1 \uplus L_2\}]_k
\]
Using (10), we conclude that for some \(h'_2 \in \text{Heap}\) and \(h'_0 \in [r'(\_sp) .. r'(\_ss) + \text{StackSize} - 1] : \_\_\_
\]
we have \(h' = h'_2 \uplus h'_0 \uplus h'_1\) and
\[
((r', h'_2, L_2) \in [[\Delta_k(r''(\_ip))]_{h''_1} * k\{\_; h'_1, h_k, L_1 \uplus L_2\}_l \tag{31}
\]
From this and (4) we get \(r''(\_ip) \in \text{dom}(k)\). Let \(M' = (M - \{r\}) \cup \{r''\}\). Then by (31) this implies
\[
(M', h'_2, L_2) \in \text{highinv}_k * k \text{highlock}_{\text{dom}(h_k) - L_2} \tag{32}
\]
Let \(h'' = h'_1 \uplus h'_0\). Then from (28) we get
\[
((r', h''_1, L_1), M') \in [\text{SchedSleep}_k(r''(\_ip))]_{h''_1} \uplus \text{ats}(r''(\_ip)) * s\{\_; h_5, L_5, M_5\}
\]
Since \(r''(\_ip) \in \text{dom}(k), \) together with (26), this implies
\[
((R[k : r'], h''_1, L_1), M') \in \text{lowinv}_k * k \text{lowlock}_{\text{dom}(h_k) - L_1}
\]
By the definition of compose, from this and (32) we get \(\sigma' \in \mathcal{R}\).

**Case 6.** \(\sigma'\) is obtained by applying the first rule in Figure 7, with the kernel executing lock. In this case
\[
l \in \text{labels}(K), \quad c = \text{lock}(\ell), \quad \ell \notin L, \quad \sigma' = (R[k : r[\text{ip} : l']], h, L \uplus \{\ell\})
\]
Like in Case 5, there exist \(h_5, L_5, h'_1, h_0, h'_2\) satisfying the conditions stated there. Additionally, from (22) for some \(h_k, L_k\) we get
\[
(r, h'_2, L_2) \in G_k(l) * k \text{I}_k (l) * k\{\_; h_k, L_k\}
\]
and
\[
(M - \{r\}, h_k, L_k) \in \bigcup_{r'' \in M - \{r\}} [[\Delta_k(r''(\_ip))]_{h''_1} * k \text{highlock}_{\text{dom}(h_k) - L_2} \cup \{l\}] \tag{33}
\]
This implies
\[
(r\_ip : l], h'_{L_2}, L_2 \cup \{l\}) \in \text{to}_{k}(l, (G_k(l) * k \text{I}_k (l) * k \{h_k, L_k\}) * k\{\_; h_k, L_k\})
\]
Hence, by (14)
\[
(r\_ip : l], h'_{L_2}, L_2 \cup \{l\}) \in G_k(l) * k\{\_; h_k, L_k\}
\]
Then from (10) it follows that
\[
((r\_ip : l], h'_{L_2}, L_2 \cup \{l\}) \in \bigcup_{r'' \in M - \{r\}} [[\Delta_k(r''(\_ip))]_{h''_1} * k \{h_k, L_k\}
\]
From this and (4) we get \(l' \in \text{dom}(k)\). Let \(M' = (M - \{r\}) \cup \{r\_ip : l']\). Then by (33) we get
\[
(M', h'_2, L_2 \cup \{l\}) \in \text{highinv}_k * k \text{highlock}_{\text{dom}(h_k) - L_2} \tag{34}
\]
From (28) we get
\[
((r\_ip : l], h'_1, L_1), M') \in \bigcup_{r'' \in M - \{r\}} [[\text{SchedSleep}_k(r''(\_ip))]_{h''_1} \uplus \text{ats}(r''(\_ip)) * s\{\_; h_5, L_5, M_5\}
\]
Since \(l' \in \text{dom}(k)\), together with (26), this implies
\[
((R[k : r\_ip : l'], h'_1, L_1), M') \in \text{lowinv}_k * k \text{lowlock}_{\text{dom}(h_k) - L_1}
\]
By the definition of compose, from this and (32) we get \(\sigma' \in \mathcal{R}\).

**Case 7.** \(\sigma'\) is obtained by applying the first or the third rule in Figure 7, with the kernel executing unlock. In this case \(l \in \text{labels}(K)\), \(c = \text{unlock}(\ell)\) and (24) holds.
Like in Case 5, there exist \(h_5, L_5, h'_1, h_0, h'_2, h_k, L_k\) satisfying the conditions stated there. Then using (30), we get
\[
(r\_ip : l], h'_{L_2}, L_2 \cup \{l\}) \in \text{to}_{k}(l, G_k(l) * k\{\_; h_k, L_k\}
\]
Hence, by (16)
\[
(r\_ip : l], h'_{L_2}, L_2 \in G_k(l') * k \text{I}_k (l) * k \{h_k, L_k\}
\]
Hence, \(\ell \in L_2\), which means that \(\sigma' \neq \top\). Then \(\sigma' = (R[k : r[\text{ip} : l']], h, L - \{\ell\})\). The above also implies
\[
(r\_ip : l'], h'_2, L_2 - \{l\}) \in G_k(l') * k \text{I}_k (l) * k\{\_; h_k, L_k\}
\]
Then from (10) it follows that
\[
((r\_ip : l'], h'_2, L_2 - \{l\}) \in \bigcup_{r'' \in M - \{r\}} [[\Delta_k(r''(\_ip))]_{h''_1} * k \text{I}_k (l) * k\{\_; h_k, L_k\}
\]
From this and (4) we get \(l' \in \text{dom}(k)\). Let \(M' = (M - \{r\}) \cup \{r\_ip : l']\). Then by (31) we get
\[
(M', h'_2, L_2 - \{l\}) \in \text{highinv}_k * k \text{highlock}_{\text{dom}(h_k) - L_2 - \{l\}}
\]
Like in the previous case, from (28) and (26) we can establish (35). Together with the last inclusion, this implies \(\sigma' \in \mathcal{R}\).

**Case 8.** \(\sigma'\) is obtained by applying the first or the third rule in Figure 7, with the kernel executing \textit{icall}(schedule). In this case \(l \in \text{labels}(K)\), \(c = \text{icall}(\text{schedule})\) and (24) holds.
Like in Case 5, there exist \(h_5, L_5, h'_1, h_0, h'_2, h_k, L_k\) satisfying the conditions stated there. From (30) we then get
\[
(r\_ip : l], h'_{L_2}, L_2 \in \text{to}_{k}(l', G_k(l) * k\{\_; h_k, L_k\}
\]
By (17), this implies 
\[(r[sp : l'], h', L'_2) \in G_K(l') \ast K\{(h_k, L_K)\}\]

Then using (10) we get 
\[((r[sp : l'], h, L_2) \in {[\Delta K(l'_1)]_{r[sp : l']} \ast K\{(h, L_K)\}}\]

Let \(M' = (M - \{r\} \cup \{r[sp : l']\}\). Then by (31) we have 
\[(M', h_2, L_2) \in \text{highinv}_r \ast S \text{lockdom}(L_k) - L_2\]
\[(36)\]

From (28) we get \(\text{dom}(h) \supseteq \{r(\text{sp}), \ldots, r(\text{sp})+m+1\}\), which implies that \(\sigma' \neq \top\). Then \(\sigma' = (R[k : r''], h''_1 \uplus h_2, L)\), where 
\[r'' = r[\text{ip} : \text{schedule}, \text{sp} : r(\text{sp})+m+1, \text{if} : 0]\]

and 
\[h''_1 = h_1[r(\text{sp}) : l', r(\text{sp})+1 : r(\text{gr}_1), \ldots, r(\text{sp})+m : r(\text{gr}_m)]\]

From (28) we also get 
\[((r, h_1, L_1), M') \in \text{J}_K \ast S \}
\[((\{r[sp], (r[sp]+StackSize-1) \cup h_2, L_2\}, [r[sp : l']]) \cup M_3)\]
\[(37)\]

Hence, 
\[(r''', h''_2, L_1), M') \in \text{J}_K \ast S \}
\[((\{r''[sp] - m - 1, r''[sp] - 1 : r'(gr_1) \ldots r'(gr_m), r''[sp]), r''[sp] + 1 + StackSize - 1], h_2, L_2), [r[sp : l']]) \cup M_3)\]

From (30) and (10) we get \(0 \leq r(\text{sp}) - r(\text{ss}) \leq \text{StackBound}\), so that \(0 \leq r''(\text{sp}) - r''(\text{ss}) - m - 1 \leq \text{StackBound}\). Besides, the form of the OS program ensures that \(r(\text{if}) = 1\). Thus, 
\[((r'', h''_2, L_1), M') \in \}
\[\{r[sp : l'] \in \text{lowinv}_r \ast S \text{lockdom}(L_k) - L_1\}\]

By the definition of compose, from this and (36) we get \(\sigma' \in \mathcal{R}\).

**Case 9.** \(\sigma'\) is obtained by applying the second or the last rule in Figure 7, i.e., by executing an interrupt. This case is virtually identical to the previous one and is omitted.

**Case 10.** \(\sigma'\) is obtained by applying the first or the third rule in Figure 7, with the scheduler executing \textit{iret} at \(l_s\) or \(l_c\). In this case 
\[l \in \{l_s, l_c\}, \quad l' \in \{l_s+1, l_c+1\}, \quad c = \text{iret}\]

and (24) holds.

From (21), there exist \(h_s, L_s, M_s\) satisfying \((25)\) and \((26)\). Then from \((25)\), \((7)\) and \((9)\) we get 
\[((r, h_1, L_1), M) \in \{r[sp : l'] \in \text{SchedState}_{\emptyset} \cap \}
\[(\text{ats}(l_s) \cup \text{ats}(l_c))) \ast S \{\{h_2, L_2, M_2\}\}\]
\[(38)\]

Hence, \(\text{dom}(h_1) \supseteq \{r(\text{sp}) - m - 1, \ldots, r(\text{sp}) - 1\}\) and \(\sigma' \neq \top\). Let 
\[l'' = h_1(r(\text{sp}) - m - 1), \quad g_1 = h_1(r(\text{sp}) - m), \quad \ldots, \quad g_m = h_1(r(\text{sp}) - 1)\]

Then \(\sigma' = (R[k : r'], h, L)\), where 
\[r' = r[\text{ip} : l'', \text{sp} : r(\text{sp}) - m - 1, \text{gr}_1 : g_1, \ldots, \text{gr}_m : g_m, \text{if} : 1]\]

From (38) we now obtain 
\[((r[sp : l'], h_1, L_1), M) \in \}
\[\{\text{SchedState}_{\emptyset} \cap \text{ats}(l'_2)\} \ast S \{\{h_2, L_2, M_2\}\}\]

Hence, 
\[((r', h_1, L_1), M) \in \}
\[\text{ats}(l''_2) \cap \{\{r[sp] \cup \text{StackSize} - 1 : h_2, L_2\},
\{r' \cup M_3\}\} \ast S \mathbb{J}_n\]

which is equivalent to 
\[((r', h_1, L_1), M) \in \}
\[\{\text{SchedSleep}_{\emptyset}(l''_2) \cap \text{ats}(l''_2)\} \ast S \{\{h_2, L_2, M_2\}\}\]

Note that \(r' \in M\). Hence, from (22) and (4) we get \(l'' \in \text{labels}(K)\).

By (26) we then have 
\[((R[k : r'], h_1, L_1), M) \in \text{lowinv}_r \ast S \text{lockdom}(L_k) - L_1\]

From (22) and the definition of compose we get \(\sigma' \in \mathcal{R}\).
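The register and stack updates performed by \textit{icall} in Case 8 and undone by \textit{iret} in Case 10 can be sketched in Python. This is an illustration under stated assumptions, not the semantics of Figure 7: register contexts and heaps are dicts, `M_REGS` is a hypothetical value for the number \(m\) of general-purpose registers, the string `"fault"` stands in for \(\top\), and the fault condition is read as "one of the \(m+1\) stack cells being accessed is unallocated".

```python
M_REGS = 2       # hypothetical number m of general-purpose registers gr1..grm
TOP = "fault"    # stand-in for the error outcome ⊤

def icall(r, h, handler, ret_label):
    """Save ret_label and gr1..grm at sp..sp+m, then jump to handler with if = 0."""
    sp = r["sp"]
    if any(a not in h for a in range(sp, sp + M_REGS + 1)):
        return TOP
    h2 = dict(h)
    h2[sp] = ret_label
    for i in range(1, M_REGS + 1):
        h2[sp + i] = r[f"gr{i}"]
    r2 = dict(r, ip=handler, sp=sp + M_REGS + 1)
    r2["if"] = 0  # interrupts disabled while the handler runs
    return r2, h2

def iret(r, h):
    """Restore ip and gr1..grm from sp-m-1..sp-1, pop the frame, set if = 1."""
    sp = r["sp"]
    base = sp - M_REGS - 1
    if any(a not in h for a in range(base, sp)):
        return TOP
    r2 = dict(r, ip=h[base], sp=base)
    for i in range(1, M_REGS + 1):
        r2[f"gr{i}"] = h[base + i]
    r2["if"] = 1  # interrupts re-enabled on return
    return r2, h

r0 = {"ip": 10, "sp": 100, "if": 1, "gr1": 7, "gr2": 8}
h0 = {100: 0, 101: 0, 102: 0}
r1, h1 = icall(r0, h0, "schedule", 11)
r1["gr1"] = 0            # the handler may clobber general-purpose registers
r2, h2 = iret(r1, h1)    # resumes at label 11 with the saved registers
```

The round trip shows why the proof tracks the saved frame explicitly: the values restored by `iret` come from the heap cells written by `icall`, not from the registers at the time of return.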

**Case 11.** \(\sigma'\) is obtained by applying the first or the third rule in Figure 7, with the kernel executing \textit{icall}(create). In this case \(l \in \text{labels}(K)\), \(c = \text{icall}(\text{create})\) and (24) holds.

Like in Case 5, there exist \(h_s, L_s, M_s, h'_2, h_2, h'_2, L_K\) satisfying the conditions stated there. Then from (30) and (18) we get 
\((r, h'_2, L_2) \in \{\{h_k, L_K\}\} \ast K\)
\[\{37, \gamma(\text{if}) = 1 \wedge \text{desc}(gr_1, \gamma) \ast P + Q\}\]

Hence, there exists \(r'\) such that \(r'[\text{if}] = 1\) and 
\((r, h'_2, L_2) \in \text{desc}(u, r') \ast K\{P\}' \ast K\{Q\}' \ast K\{\{h_k, L_K\}\}

where \(u = r(\text{gr}_1)\) and \(\eta' = \eta[\gamma : r']\). Since \(\text{free}(P) \cap \text{Reg} = \emptyset\) and \(\text{free}(\text{desc}(d, \gamma)) \cap \text{Reg} = \emptyset\), we have 
\[r[sp : l', h'_2, L_2) \in \text{desc}(u, r') \ast K\]
\[\{P\}' \ast K\text{tok}(l', [Q]' \ast K\{\{h_k, L_K\}\}

Using (19), we then get 
\[r[sp : l', h'_2, L_2) \in \text{desc}(u, r') \ast K\]
\[\{P\}' \ast K\text{G}_k(l') \ast K\{\{h_k, L_K\}\}

According to (10), this implies 
\[\}
\[\{\text{desc}_k(u, r') \ast K\{P\}' \ast K\{\Delta K(l'_1)\}_r[sp : l'] \ast K\{\emptyset, h, L_K\}\}

Then for some \(h''_2, h_d \in \text{Heap}\) such that \(h'_2 = h''_2 \uplus h_d\) we have 
\[(r[\text{ip} : l'], h_d, \emptyset) \in \text{desc}_\eta(u, r')\]
(recall that all states from \(\text{desc}_\eta(u, r')\) have an empty lockset; see Section 4.2) and 
\[\}
\[\{r[sp : l'], h'_2, L_2) \in \]
\[\{P\}' \ast K\{\Delta K(l'_1)\}_r[sp : l'] \ast K\{\emptyset, h, L_K\}\]

Then from (20) and (10) we get 
\[\}
\[\{r[sp : l', r'), h'_2, L_2) \in \]
\[\{\Delta K(r''[sp])\}_r'[sp] \ast K\{\Delta K(l'_1)\}_r[sp : l'] \ast K\{\emptyset, h, L_K\}\]

Let \(M' = (M - \{r\} \cup \{r[sp : l'], r'\}\). Then by (31) we have 
\[(M', h''_2, L_2) \in \text{highinv}_r \ast S \text{lockdom}(L_k) - L_2\]
\[(39)\]

Like in Case 8, we can assume that 
\[\sigma' = (R[k : r''], h''_2 \cup h_2, L) = (R[k : r''], h''_2 \cup h_2 \cup h'_2, L)\]
where
\[
r'' = r[\text{ip} : \text{create}, \text{sp} : r(\text{sp})+m+1, \text{if} : 0]
\]
and \(h''_1\) is defined by (37). Let \(h'''_1 = h'_1 \sqcup h_d\). Then from (28) we get
\[
((r, h'''_1, L_1), M') \in J_k \ast S(\text{desc}_q(u, r') \times \{\emptyset\}) \ast S
\]
\[
\lbrace \langle\langle \text{sp} \rangle, (r(\text{sp})+m) : l' r(\text{gr}_1) \ldots r(\text{gr}_m),
(r(\text{sp})+m+1) \ldots (r(\text{sp})+\text{StackSize}-1) : \rangle \sqcup h_S, L_S),
\{r[\text{ip} : l'], r' \} \sqcup M_S \rbrace
\]
Similarly to how it was done in Case 8, using (8) we now establish
\[
((r'', h''', L_1), M') \in G^*_S(\text{create}) \ast S \lbrace \langle\langle \_ , h_S, L_S \rangle, M_S \rbrace
\]
Together with (26), this implies
\[
((R[k : r''], h''', L_1), M') \in \text{lowinv}_\eta \ast S \text{lowlock}_{\text{dom}(L_S)-L_1}
\]
By the definition of compose, from this and (39) we get \(\sigma' \in \mathcal{R}\). \(\square\)