Initial commit: XOmB exokernel foundation

Core kernel infrastructure:
- Multiboot2 boot with GRUB, long mode setup, higher-half kernel
- Serial port output for debugging
- Unified boot info abstraction for future UEFI support

Memory management:
- Physical frame allocator with bitmap tracking
- Page table manipulation via recursive mapping (PML4[510])
- Support for 4KB, 2MB, and 1GB page mappings
- TLB invalidation and proper NXE support

Build system:
- Cargo-based build with custom x86_64 target
- Makefile for QEMU and Bochs testing
- GRUB ISO generation for multiboot2 boot
2026-07-01 04:54:09 +02:00 · 2025-12-26 14:20:39 -05:00
commit c6fa2895e2
28 changed files with 4146 additions and 0 deletions
--- a/docs/MAIN.md
+++ b/docs/MAIN.md
@@ -0,0 +1,27 @@
+XOmB is an exokernel.
+
+An exokernel is a type of kernel that provides only a very minimal abstraction over the hardware. Rather than managing hardware resources itself, it securely multiplexes them among 'Library OSes' that run in userspace. These library OSes provide the implementations necessary to drive devices for the needs of the applications that are linked to them.
+
+The role of the privileged kernel is just what is necessary to secure access to resources. All responsibility for resource management and decisions based around how to effectively use those resources is relegated to the library OSes and the needs of each application.
+
+In XOmB, the main mechanism to secure access to resources is the paging system. Resources are memory mapped whenever possible, and access to them is then given to applications, through their library OSes, via the paging system. For instance, access to the network card might be given by mapping its register space onto a region of memory and then placing that region into the memory space of a library OS. If an application only needs to read that memory space, it could have those pages mapped read-only into its virtual address space.
+
+The exokernel will make extensive use of any optimization or technique available to it in order to use virtual memory as efficiently as possible when providing access to, and access control of, such resources. For example, on x86-64, the kernel would certainly make use of 'superpages', which are large virtual memory mappings made by page table entries marked as terminating at a higher level of the page table hierarchy. Each such page covers a multiple of the normal page size (2MB or 1GB instead of 4KB). It might even be beneficial to use gigabyte-sized pages to very efficiently map out large linear address spaces for application use.
+
+Updating access controls (or revoking access) will be as simple as updating the page table entries themselves and flushing any cache or TLB entry that might still reference them. Multiplexing a scarce resource might entail clearing access in one page table entry while granting it in another entry within a different process's virtual address space, thus atomically swapping such access. The atomicity still depends, though, on our ability to flush the relevant caches in time. Therefore, it is still rather important to consider the scheduling of these actions when they are requested by each application.
+
+The main kernel maintains the root page table. The root page table is effectively set at the start and not changed. The entries within the page table structure maintain the state of the system: the current process or processes and the current resources. Each process is effectively represented as a page table entry at a slightly lower level. So, if we assume a five-level paging system, like on modern x86-64, the root page table is created and maintained by the kernel, and the kernel maps into it, as a page table entry, a pointer to the root page table of a process. The process, in turn, owns its page table and maps in resources via kernel primitive functions and system calls. Once resources are securely provisioned to the process, the process can effectively do anything the page tables permit. Resources are, like processes, represented by slightly lower-level page table structures that can be mapped into the process page table (which in turn can be placed into the system address space via the kernel's root page table).
+
+To summarize the kernel actions we need to make this kernel operate:
+
+- Create a process which is basically a virtual address space (which is represented as a page table entry and is the main data structure that represents a process or process group)
+- Allocate a resource (which is represented as a page table entry)
+- Attach a resource to a virtual address space (link a page table entry in a 'process' to the page table entry serving as the root of a resource)
+- Update the access of a resource (update a page table entry in a 'process')
+- Atomically swap access to a resource (update two 'process' structures by nulling a resource entry while adding it to another)
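The five actions listed above can be exercised in a small userspace model before any real page tables exist. The sketch below is only an illustration: `Model`, `Access`, and every method name are hypothetical, and a `HashMap` keyed by a virtual slot stands in for the page table entries the kernel would actually edit.

```rust
use std::collections::HashMap;

/// Access a process holds on an attached resource (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Access {
    ReadOnly,
    ReadWrite,
}

/// Toy stand-in for the kernel state. Each process is a map from a virtual
/// slot to (resource id, access); real entries would live in page tables.
#[derive(Default)]
pub struct Model {
    processes: Vec<HashMap<u64, (usize, Access)>>,
    resources: usize,
}

impl Model {
    /// Action 1: create a process (an empty virtual address space).
    pub fn create_process(&mut self) -> usize {
        self.processes.push(HashMap::new());
        self.processes.len() - 1
    }

    /// Action 2: allocate a resource and return its id.
    pub fn allocate_resource(&mut self) -> usize {
        self.resources += 1;
        self.resources - 1
    }

    /// Action 3: attach a resource at `slot`; fails if the resource or
    /// process is unknown, or if the slot is already occupied.
    pub fn attach(&mut self, pid: usize, slot: u64, res: usize, access: Access) -> bool {
        if res >= self.resources {
            return false;
        }
        if let Some(p) = self.processes.get_mut(pid) {
            if !p.contains_key(&slot) {
                p.insert(slot, (res, access));
                return true;
            }
        }
        false
    }

    /// Action 4: change the access of an already-attached resource.
    pub fn update_access(&mut self, pid: usize, slot: u64, access: Access) -> bool {
        match self.processes.get_mut(pid).and_then(|p| p.get_mut(&slot)) {
            Some(entry) => {
                entry.1 = access;
                true
            }
            None => false,
        }
    }

    /// Action 5: move the entry at `slot` from one process to another,
    /// so that at most one of them holds it at any time.
    pub fn swap_access(&mut self, from: usize, to: usize, slot: u64) -> bool {
        let n = self.processes.len();
        if from >= n || to >= n || from == to || self.processes[to].contains_key(&slot) {
            return false;
        }
        match self.processes[from].remove(&slot) {
            Some(entry) => {
                self.processes[to].insert(slot, entry);
                true
            }
            None => false,
        }
    }

    /// Inspect what a process currently has mapped at `slot`.
    pub fn lookup(&self, pid: usize, slot: u64) -> Option<(usize, Access)> {
        self.processes.get(pid).and_then(|p| p.get(&slot).copied())
    }
}
```

A model like this is mainly useful for stating the invariants the real operations must keep, for example that after `swap_access` succeeds the source process no longer sees the resource.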
+
+These operations need to be verifiable in our kernel. Applications rely on these operations being secure.
+
+The unanswered questions so far have to do with scheduling and preemption.
+
+For stage 1, we will have a non-preemptive single process kernel to sidestep solving these problems.
--- a/docs/MEMORY.md
+++ b/docs/MEMORY.md
@@ -0,0 +1,5 @@
+The XOmB exokernel multiplexes hardware resources mostly through the use of the virtual memory system. So, a hardware resource might be memory mapped into a certain memory range and then that memory range is, when access is granted, mapped into the virtual address space of a process.
+
+This means that the kernel needs to allocate memory pages in order to allocate the page table entries. So, it maintains the data structure responsible for keeping track of free pages of physical memory. Applications may request specific physical pages to be allocated. Applications may also create resources, which are represented by page table entries. In our case, the kernel is a PML4, a process is a PML3 (PDPT), and thus a resource is either a PML2 (PD) or a first-level page table (PT). An application or library OS (libOS) can do this by asking the kernel to allocate a resource with a particular physical page to be used as its relative page table root.
+
+The kernel should be somewhat aware of hardware memory mapped ranges and allocate resources for those that can be securely provisioned when asked by a user application via a library OS. The kernel does not create those resources itself, however, or have any logic that is specific to any hardware except what it needs for debugging itself. An 'init' process will eventually exist that will allocate some of those initial resources that it can pass off to driving libOSes and applications that run later.
--- a/docs/tasks/STAGE_1.md
+++ b/docs/tasks/STAGE_1.md
@@ -0,0 +1,180 @@
+# Stage 1: Non-Preemptive Single Process Kernel
+
+This document outlines the work required to build the initial XOmB exokernel as described in docs/MAIN.md. Stage 1 focuses on a non-preemptive, single-process kernel to establish the core mechanisms without solving scheduling and preemption problems.
+
+## Goals
+
+- Establish the kernel's page table as the root of the system
+- Implement the five core kernel actions (for a single process)
+- Demonstrate resource allocation and access control via paging
+- Provide a foundation for Library OS development
+
+## Core Kernel Actions to Implement
+
+### 1. Create a Process (Virtual Address Space)
+
+A process is represented as a page table entry at a level below the kernel's root page table. On x86-64 with 5-level paging, this means:
+
+- Kernel owns PML5 (or PML4 on 4-level systems)
+- A process is a PML4 (or PML3) entry that the kernel maps into its root
+
+**Tasks:**
+- [ ] Define the process data structure (essentially a page table root + metadata)
+- [ ] Implement process creation (allocate page table, initialize entries)
+- [ ] Map the process into the kernel's address space
+
+### 2. Allocate a Resource
+
+Resources are memory-mapped regions represented as page table structures that can be attached to processes.
+
+**Tasks:**
+- [ ] Define resource types (physical memory regions, device MMIO, etc.)
+- [ ] Implement resource allocation (create page table entries representing the resource)
+- [ ] Track allocated resources (ownership, reference counting?)
+
+### 3. Attach a Resource to a Virtual Address Space
+
+Link a resource's page table entry into a process's page table at a specified virtual address.
+
+**Tasks:**
+- [ ] Implement resource attachment (map resource page table into process page table)
+- [ ] Handle alignment requirements (superpages: 2MB, 1GB)
+- [ ] Set appropriate access flags (read, write, execute, user/supervisor)
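The alignment task above comes down to simple arithmetic: a mapping can use a 2MB or 1GB page only if both the virtual and the physical address are aligned to that size and the region is at least that long. A minimal sketch, assuming hypothetical names `PageSize`, `is_aligned`, and `largest_page`:

```rust
/// The three page sizes available with x86-64 4-level paging.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PageSize {
    Size4K,
    Size2M,
    Size1G,
}

impl PageSize {
    /// Size of one page of this kind, in bytes.
    pub const fn bytes(self) -> u64 {
        match self {
            PageSize::Size4K => 4 << 10,
            PageSize::Size2M => 2 << 20,
            PageSize::Size1G => 1 << 30,
        }
    }
}

/// True if `addr` is a multiple of the given page size.
pub fn is_aligned(addr: u64, size: PageSize) -> bool {
    addr % size.bytes() == 0
}

/// Largest page size usable for a mapping that starts at `vaddr` -> `paddr`
/// and covers at least `len` bytes, or `None` if even 4KB does not fit.
pub fn largest_page(vaddr: u64, paddr: u64, len: u64) -> Option<PageSize> {
    [PageSize::Size1G, PageSize::Size2M, PageSize::Size4K]
        .iter()
        .copied()
        .find(|&s| is_aligned(vaddr, s) && is_aligned(paddr, s) && len >= s.bytes())
}
```

An attach routine could call `largest_page` repeatedly while walking a region, falling back to smaller pages at unaligned edges.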
+
+### 4. Update Resource Access
+
+Modify the access permissions of an already-attached resource.
+
+**Tasks:**
+- [ ] Implement permission updates (modify page table entry flags)
+- [ ] Handle TLB invalidation (invlpg, or full flush)
+- [ ] Consider cache coherency implications
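A permission update is a rewrite of the flag bits of one entry, followed by invalidation when the old entry could have been cached. The sketch below uses the architectural x86-64 bit positions, but the constant and function names are assumptions, and the actual `invlpg` is left out because it is privileged and cannot run outside the kernel.

```rust
/// x86-64 page table entry flag bits (names are this sketch's own).
pub const PRESENT: u64 = 1 << 0;
pub const WRITABLE: u64 = 1 << 1;
pub const USER: u64 = 1 << 2;
pub const HUGE_PAGE: u64 = 1 << 7;
pub const NO_EXECUTE: u64 = 1 << 63; // only honored when EFER.NXE is set

/// Bits 12..=51 of an entry hold the physical frame address.
pub const ADDR_MASK: u64 = 0x000F_FFFF_FFFF_F000;

/// Return `entry` with its flag bits replaced by `flags`,
/// keeping the physical address untouched.
pub fn with_flags(entry: u64, flags: u64) -> u64 {
    (entry & ADDR_MASK) | (flags & !ADDR_MASK)
}

/// A changed entry needs invalidation only if the old value was present,
/// since x86 does not cache translations for not-present entries.
pub fn needs_tlb_flush(old: u64, new: u64) -> bool {
    (old & PRESENT) != 0 && old != new
}
```

In the kernel, a `true` result from `needs_tlb_flush` would be followed by `invlpg` on the affected virtual address, or by a full flush when many entries change at once.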
+
+### 5. Atomically Swap Resource Access
+
+Transfer a resource from one process to another atomically.
+
+**Tasks:**
+- [ ] Implement atomic swap (null one entry while setting another)
+- [ ] Handle TLB/cache synchronization
+- [ ] Note: In single-process Stage 1, this may be simplified
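One possible ordering for the swap is: clear the source entry, invalidate stale translations for the old owner, then publish the entry to the destination, so that the two processes never hold access at the same time. The sketch below models the two entries as `AtomicU64` values; `transfer_entry` and its `invalidate` callback are hypothetical, and kernel code would use `core::sync::atomic` rather than `std`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Move a page table entry from `from` to `to`.
/// Returns false if the source is empty or the destination is occupied.
/// `invalidate` stands in for the TLB flush of the old owner's mapping.
pub fn transfer_entry(from: &AtomicU64, to: &AtomicU64, invalidate: impl FnOnce()) -> bool {
    // Step 1: atomically take the entry away from the old owner.
    let entry = from.swap(0, Ordering::AcqRel);
    if entry == 0 {
        return false;
    }
    // Step 2: stale translations must be gone before anyone else gains access.
    invalidate();
    // Step 3: publish to the new owner, but only into an empty slot.
    if to
        .compare_exchange(0, entry, Ordering::AcqRel, Ordering::Acquire)
        .is_err()
    {
        // Destination was occupied: hand the entry back to the old owner.
        from.store(entry, Ordering::Release);
        return false;
    }
    true
}
```

Between steps 1 and 3 neither process can reach the resource, which is the safe direction for the window to fall. How long that window lasts depends on how quickly the flush completes.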
+
+## Infrastructure Required
+
+### Physical Memory Management
+
+**Tasks:**
+- [ ] Parse memory map from bootloader (multiboot2/UEFI)
+- [ ] Implement physical frame allocator
+- [ ] Track free/used physical pages
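The commit message mentions bitmap tracking for the frame allocator. A minimal sketch of that idea follows, with one bit per 4KB frame and every frame starting as used until a usable region from the boot memory map frees it. `FrameBitmap` and its methods are hypothetical names, and `Vec` is used only so the sketch runs in userspace; a kernel would back the bitmap with static or early-boot memory.

```rust
/// Size of a physical frame in bytes (4KB).
pub const FRAME_SIZE: u64 = 4096;

/// One bit per frame: 1 = used, 0 = free.
pub struct FrameBitmap {
    bits: Vec<u64>,
    frames: usize,
}

impl FrameBitmap {
    /// Track `frames` frames, all initially marked used.
    pub fn new(frames: usize) -> Self {
        FrameBitmap { bits: vec![u64::MAX; (frames + 63) / 64], frames }
    }

    fn mark_free(&mut self, frame: usize) {
        // Out-of-range frames are ignored, so padding bits stay "used".
        if frame < self.frames {
            self.bits[frame / 64] &= !(1u64 << (frame % 64));
        }
    }

    /// Free every whole frame inside the usable region [base, base + len).
    pub fn free_region(&mut self, base: u64, len: u64) {
        let first = (base + FRAME_SIZE - 1) / FRAME_SIZE; // round start up
        let end = (base + len) / FRAME_SIZE; // round end down
        for frame in first..end {
            self.mark_free(frame as usize);
        }
    }

    /// Allocate the lowest free frame and return its physical address.
    pub fn alloc(&mut self) -> Option<u64> {
        for (i, word) in self.bits.iter_mut().enumerate() {
            if *word != u64::MAX {
                let bit = (!*word).trailing_zeros() as usize;
                *word |= 1u64 << bit;
                return Some(((i * 64 + bit) as u64) * FRAME_SIZE);
            }
        }
        None
    }

    /// Return a previously allocated frame to the free pool.
    pub fn free(&mut self, addr: u64) {
        self.mark_free((addr / FRAME_SIZE) as usize);
    }
}
```

Starting from "all used" means reserved or unreported ranges can never be handed out by accident; only regions the bootloader reports as available become allocatable.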
+
+### Page Table Management
+
+**Tasks:**
+- [ ] Implement page table creation and manipulation
+- [ ] Support for 4KB, 2MB, and 1GB pages (superpages)
+- [x] Kernel mapping strategy: higher-half at 0xFFFFFFFF80000000 with recursive mapping at PML4[510]
+
+### System Call Interface
+
+**Tasks:**
+- [ ] Define syscall mechanism (syscall/sysret instructions)
+- [ ] Implement syscall handler
+- [ ] Define initial syscall ABI for the five core actions
+
+### Initial Process Loading
+
+**Tasks:**
+- [ ] Define executable format (ELF?)
+- [ ] Load initial process from boot module or embedded binary
+- [ ] Transfer control to user mode
+
+## Design Decisions
+
+### Memory Layout
+
+- **Higher-half kernel**: Kernel mapped at 0xFFFFFFFF80000000 (top 2GB, required for kernel code model)
+- **4-level paging**: Using standard x86-64 4-level paging (PML4 → PDPT → PD → PT). 5-level paging (LA57) is a future consideration.
+- **Self-referencing page table**: PML4[510] points to the PML4 itself, enabling recursive page table access
+- **Kernel mapping**: PML4[511] maps the kernel's higher-half address space
+
+#### PML4 Layout
+
+| Index | Purpose | Virtual Address Range |
+|-------|---------|----------------------|
+| 0 | Identity map (boot only) | 0x0000_0000_0000_0000 - 0x0000_007F_FFFF_FFFF |
+| 510 | Recursive mapping | 0xFFFF_FF00_0000_0000 - 0xFFFF_FF7F_FFFF_FFFF |
+| 511 | Kernel higher-half | 0xFFFF_FF80_0000_0000 - 0xFFFF_FFFF_FFFF_FFFF |
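The ranges in the table follow from the fact that each PML4 entry covers 2^39 bytes (512GB) and that bit 47 is sign-extended into bits 48 to 63. A small sketch that reproduces the table rows, with hypothetical helper names:

```rust
/// Bytes of virtual address space covered by one PML4 entry (512GB).
pub const PML4_SLOT_SIZE: u64 = 1 << 39;

/// Sign-extend bit 47 to produce a canonical 48-bit x86-64 address.
pub const fn canonical(addr: u64) -> u64 {
    if (addr & (1 << 47)) != 0 {
        addr | 0xFFFF_0000_0000_0000
    } else {
        addr
    }
}

/// Inclusive virtual address range covered by PML4 entry `index` (0..=511).
pub const fn pml4_slot_range(index: u64) -> (u64, u64) {
    let start = canonical((index & 0x1FF) << 39);
    // Add (size - 1) rather than size so the last slot does not overflow.
    (start, start + (PML4_SLOT_SIZE - 1))
}

/// PML4 index that a virtual address falls into.
pub const fn pml4_index(vaddr: u64) -> u64 {
    (vaddr >> 39) & 0x1FF
}
```

For example, `pml4_index(0xFFFF_FFFF_8000_0000)` is 511, which confirms that the higher-half kernel base lies in the last slot.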
+
+#### Recursive Mapping Implications
+
+With PML4[510] as the self-reference entry:
+- **Recursive region base**: 0xFFFF_FF00_0000_0000
+- **PML4 accessible at**: 0xFFFF_FF7F_BFDF_E000
+- **Any page table** can be accessed by constructing the appropriate virtual address
+
+The recursive mapping formula for accessing page table entries:
+```
+PML4:       0xFFFFFF7FBFDFE000 + (pml4_idx * 8)
+PDPT:       0xFFFFFF7FBFC00000 + (pml4_idx * 0x1000) + (pdpt_idx * 8)
+PD:         0xFFFFFF7F80000000 + (pml4_idx * 0x200000) + (pdpt_idx * 0x1000) + (pd_idx * 8)
+PT:         0xFFFFFF0000000000 + (pml4_idx * 0x40000000) + (pdpt_idx * 0x200000) + (pd_idx * 0x1000) + (pt_idx * 8)
+```
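The four formulas above are all the same construction: place the recursive index (510) in the leading index positions, shift the real indices down by one level, and use the last index times 8 as the byte offset. A sketch in Rust, where the function names are assumptions and each index is expected to be below 512:

```rust
/// Index of the self-referencing PML4 entry.
pub const RECURSIVE_INDEX: u64 = 510;

/// Build a canonical virtual address from four 9-bit table indices
/// and a byte offset within the final 4KB page.
const fn from_indices(l4: u64, l3: u64, l2: u64, l1: u64, offset: u64) -> u64 {
    let addr = (l4 << 39) | (l3 << 30) | (l2 << 21) | (l1 << 12) | offset;
    // Sign-extend bit 47 into the upper 16 bits.
    if (addr & (1 << 47)) != 0 {
        addr | 0xFFFF_0000_0000_0000
    } else {
        addr
    }
}

/// Address of entry `pml4_idx` in the PML4 itself.
pub const fn pml4_entry_addr(pml4_idx: u64) -> u64 {
    let r = RECURSIVE_INDEX;
    from_indices(r, r, r, r, pml4_idx * 8)
}

/// Address of entry `pdpt_idx` in the PDPT reached through `pml4_idx`.
pub const fn pdpt_entry_addr(pml4_idx: u64, pdpt_idx: u64) -> u64 {
    let r = RECURSIVE_INDEX;
    from_indices(r, r, r, pml4_idx, pdpt_idx * 8)
}

/// Address of entry `pd_idx` in the PD reached through the given indices.
pub const fn pd_entry_addr(pml4_idx: u64, pdpt_idx: u64, pd_idx: u64) -> u64 {
    let r = RECURSIVE_INDEX;
    from_indices(r, r, pml4_idx, pdpt_idx, pd_idx * 8)
}

/// Address of entry `pt_idx` in the PT reached through the given indices.
pub const fn pt_entry_addr(pml4_idx: u64, pdpt_idx: u64, pd_idx: u64, pt_idx: u64) -> u64 {
    from_indices(RECURSIVE_INDEX, pml4_idx, pdpt_idx, pd_idx, pt_idx * 8)
}
```

With all indices zero, these functions return exactly the four base addresses listed in the formulas, which makes them a convenient check on the constants.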
+
+## Open Questions
+
+The following questions need to be answered before or during implementation:
+
+### Resource Model
+
+1. **Resource granularity**: What's the minimum resource size? A single 4KB page, or always aligned to superpages?
+
+2. **Device resources in Stage 1**: Do we need device MMIO support, or just physical memory? For a minimal kernel, memory-only may suffice.
+
+3. **Resource metadata**: Where do we store resource metadata (size, type, owner)? Separate structures, or encoded in page table entries?
+
+### Process Model
+
+4. **Process metadata location**: Where does process state live? In kernel memory, or in a reserved area of the process's own address space?
+
+5. **Initial process origin**: Is the first process loaded from a multiboot module, embedded in the kernel, or loaded from a filesystem?
+
+### Stage 1 Scope
+
+6. **User mode in Stage 1?**: Does Stage 1 require actual user-mode execution, or can we demonstrate the mechanisms with kernel-mode "processes" first?
+
+7. **Serial/console output from processes**: How do processes output debug information? Direct serial access? Kernel-provided syscall?
+
+## Boot Code Status
+
+The boot assembly (`src/boot/multiboot2_header.asm`) and linker script (`linker-multiboot2.ld`) have been updated:
+
+1. **[DONE] Identity map first 1GB** - PML4[0] → PDPT_LOW → PD (512 x 2MB pages)
+2. **[DONE] Map kernel at higher-half** - PML4[511] → PDPT_HIGH → PD at 0xFFFFFFFF80000000
+3. **[DONE] Set up recursive mapping** - PML4[510] = physical address of PML4 | flags
+4. **[DONE] Jump to higher-half** - Boot code transitions to higher-half stack and calls Rust entry point
+5. **[TODO] Unmap identity mapping** - Remove PML4[0] once running in higher-half (can be done in Rust)
+
+## Suggested Implementation Order
+
+1. ~~Update boot code for higher-half + recursive mapping~~ **[DONE]**
+2. ~~Update linker script for higher-half kernel~~ **[DONE]**
+3. Physical memory allocator (using boot memory map)
+4. Page table manipulation primitives (using recursive mapping)
+5. Process creation (allocate the process's top-level table, which is a PML3/PDPT under the 4-level layout above, and map it into kernel space)
+6. Resource allocation (physical memory regions as page table structures)
+7. Resource attachment (map into process address space)
+8. Permission updates and TLB management
+9. System call interface
+10. Initial process loading and user-mode transition
+11. Atomic resource swapping
+
+## Success Criteria
+
+Stage 1 is complete when:
+
+- A single process can be created with its own virtual address space
+- Physical memory resources can be allocated and mapped into the process
+- Access permissions can be set and modified
+- The process can execute code in user mode
+- Basic syscalls allow the process to request resources from the kernel