The XOmB exokernel multiplexes hardware resources mostly through the virtual memory system. A hardware resource might be memory mapped into a certain memory range, and that range is then, when access is granted, mapped into the virtual address space of a process.
This means that the kernel needs to allocate memory pages in order to allocate the page table entries, so it maintains the data structure responsible for keeping track of free pages of physical memory. Applications may request specific physical pages to be allocated. Applications may also create resources, which are represented by page table entries. In our case, the kernel is a PML4, a process is a PML3, and thus a resource is either a PML2 (PD) or a first-level page table (PT). An application or library OS (libOS) can create a resource by asking the kernel to allocate it with a particular physical page to be used as its relative page table root.
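To make the hierarchy concrete, here is a minimal sketch of how much virtual address space each kind of table covers. It assumes standard x86-64 4-level paging with 4 KiB pages and 512 entries per table, which the text implies but does not state. The names PAGE_SIZE, ENTRIES, table_span, and split_indices are illustrative and not part of XOmB.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages
ENTRIES = 512      # assumption: 512 entries per table (9 index bits per level)


def table_span(level):
    """Bytes of virtual address space covered by one whole table at `level`
    (1 = PT, 2 = PD/PML2, 3 = PML3, 4 = PML4)."""
    return PAGE_SIZE * ENTRIES ** level


def split_indices(va):
    """Split a virtual address into (PML4, PML3, PD, PT, page offset)."""
    offset = va & (PAGE_SIZE - 1)
    pt = (va >> 12) & 0x1FF
    pd = (va >> 21) & 0x1FF
    pml3 = (va >> 30) & 0x1FF
    pml4 = (va >> 39) & 0x1FF
    return pml4, pml3, pd, pt, offset


print(table_span(3) // 2**30, "GiB covered by a Process (PML3)")
print(table_span(2) // 2**30, "GiB covered by a PD Resource")
print(table_span(1) // 2**20, "MiB covered by a PT Resource")
```

Under these assumptions, a PT Resource spans 2 MiB, a PD Resource spans 1 GiB, and a Process spans 512 GiB.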
The kernel should be somewhat aware of hardware memory mapped ranges and allocate resources for those that can be securely provisioned when asked by a user application via a library OS. The kernel does not create those resources itself, however, or have any logic that is specific to any hardware except what it needs for debugging itself. An 'init' process will eventually exist that will allocate some of those initial resources that it can pass off to driving libOSes and applications that run later.
The kernel has the following flags for page table entries. Some are hardware-defined and others are maintained specifically by the kernel:
- ReadOnly - When set, this page cannot be written to. This is generally an existing hardware-defined flag.
- NoExecute - When set, this page cannot be executed as though it were runnable code. This is generally an existing hardware-defined flag, but not always available on all systems.
- Present - Whether or not there is meant to be an actual page mapped in at the given moment. When it is clear, the page is 'not present' and therefore will fault on any attempt to access. One use of this is to have an otherwise valid entry, but one where the page data has been saved elsewhere and needs to be restored when it faults. This is generally an existing hardware-defined flag.
- Owner - When set for a root table of a Resource or Process within the Process userspace address space, the current process owns the Resource or is the parent of the Process. This grants extra ability to the calling Process when maintaining the Resource or Process that simple grantees of the Resource do not have. See RESOURCE and PROCESS files for more information about how a Resource or Process is maintained.
- Grant - When set for an entry that points to a Resource, it denotes that the Resource is on Grant and not owned. When the kernel walks the page table structure, if it hits an entry that is marked as a Grant, that calling Process cannot affect that Resource in any way.
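The Owner and Grant flags combine into the ownership rule that the mapping calls described later rely on. The sketch below is only a model of that rule, not kernel code: each entry is represented as a set of flag names, and the function name is_owned is hypothetical. It reads "not seeing a Grant flag before getting to the point specified" as excluding the target entry itself, which is an interpretation.

```python
def is_owned(path_flags):
    """Return True if the calling Process owns the target of a walk.

    `path_flags` holds the flag names found on each entry traversed on the
    way to the target entry, not including the target entry itself.
    """
    # The root of the Process always counts as marked Owner, so ownership
    # is established when the walk starts. Only a Grant can void it.
    for flags in path_flags:
        if "Grant" in flags:
            return False
    return True


# Walking through an owned Resource: the caller may modify entries inside it.
print(is_owned([{"Present", "Owner"}, {"Present"}]))   # True
# Walking through a granted Resource: the caller may not.
print(is_owned([{"Present", "Grant"}, {"Present"}]))   # False
```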
The only flags likely to not already exist in hardware are the Owner and Grant flags, which the kernel reserves. Library OSes can then define other flags of their own (although there may not be any bits left). The kernel will facilitate setting any non-reserved bits in page table entries for the benefit of library OSes. For instance, a library OS will likely want to implement copy-on-write behavior and may want to introduce a CopyOnWrite flag for this purpose. When a fault occurs on a read-only page, the library OS can then decide to ask the kernel to map in a particular physical page. The kernel will zero this page.
When flags are passed to a related kernel system call, they are supplied in this order, where the first item is the least significant bit: Present, ReadOnly, NoExecute. To mark a page as NoExecute and Present, the flags field would equal 5. For a Present and ReadOnly page, the flags field equals 3. The Owner bit cannot be specified, as only the kernel uses this flag. The flags field can also specify user-defined bits, which occupy bit 16 (counting from 0) and up. So, if a library OS wanted to create a CopyOnWrite bit as its first user-defined bit, specifying it along with the Present bit would mean sending hex 0x10001 (decimal 65537) for the flags field. The kernel sets the user-defined bits in its own order by mapping them to available bits in the hardware page table entries (PTEs). A system call fails if its flags field has bits set that the kernel does not expect or, more specifically, user-defined bits that cannot be accommodated; the mapping calls report this as INVALID_FLAGS.
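The encoding can be sketched as plain bit constants. This assumes the stated rule that user-defined bits begin at bit 16. The constant names, the user_flag helper, and the choice of CopyOnWrite as the first user-defined bit are illustrative.

```python
# System-defined flags, least significant bit first, in the documented order.
PRESENT = 1 << 0
READ_ONLY = 1 << 1
NO_EXECUTE = 1 << 2

USER_BIT_BASE = 16  # assumption taken from the text: user bits start at bit 16


def user_flag(n):
    """Return the flags-field value for the n-th user-defined bit (0-based)."""
    if n < 0:
        raise ValueError("user-defined bit index must be non-negative")
    return 1 << (USER_BIT_BASE + n)


# Hypothetical libOS-defined flag occupying the first user-defined bit.
COPY_ON_WRITE = user_flag(0)

print(NO_EXECUTE | PRESENT)           # 5
print(PRESENT | READ_ONLY)            # 3
print(hex(PRESENT | COPY_ON_WRITE))   # 0x10001
```

Where the kernel places a user-defined bit in the hardware PTE is its own business. The caller only ever sees the positions in the flags field.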
The kernel system calls related to page allocation and mapping have an enumerated return value to indicate the error or 0 if successful. The MapperError error codes are as follows:
- 0: SUCCESS - Success! It performed the specified action.
- 1: INVALID_FLAGS - Invalid flags were specified.
- 2: NOT_FREE - The physical page requested is already allocated elsewhere.
- 3: NOT_ALLOCATED - The physical page cannot be freed because it was not allocated.
- 4: INVALID_SOURCE - The given virtual address is not a valid virtual address for the current Process.
- 5: INVALID_TARGET - The given virtual address to map to is not a valid virtual address for the current Process.
- 6: NOT_EMPTY - The physical page is a page table structure that represents a Resource or Process and it is not empty.
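A library OS might mirror these codes as an enumeration so that raw return values can be checked by name. The following is a sketch. The numeric values come from the list above, while the check helper and its use of RuntimeError are illustrative choices.

```python
from enum import IntEnum


class MapperError(IntEnum):
    SUCCESS = 0
    INVALID_FLAGS = 1
    NOT_FREE = 2
    NOT_ALLOCATED = 3
    INVALID_SOURCE = 4
    INVALID_TARGET = 5
    NOT_EMPTY = 6


def check(raw):
    """Convert a raw system call return value and raise unless it is SUCCESS."""
    err = MapperError(raw)
    if err is not MapperError.SUCCESS:
        raise RuntimeError(f"mapping call failed: {err.name}")
    return err


print(MapperError(2).name)  # NOT_FREE
```

Because SUCCESS is 0, a plain truthiness test on the return value is true on failure, not on success.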
The kernel provides these basic physical memory primitive functions:
- ALLOC_PAGE(physicalAddress, virtualAddress, flags) -> MapperError - Allocates the given physical page to the given virtual address of the Process, if there is no page currently mapped there. virtualAddress points to the specific page table entry to modify, using the recursive entry (PML4[510]) to do so. The kernel will mark that physical page allocated. This returns NOT_FREE if that physical page is already allocated. Sets the flags on the leaf PML1 page table entry. The flags field is a set of flags that are not in any hardware-specific order; they are defined by the system, as described in the aforementioned structure. The flags field must indicate the Present flag; otherwise the call fails and returns INVALID_FLAGS. If the given virtualAddress is not aligned to a page table entry, the page table structure does not exist, or it is not one that is owned by the current Process, then it fails with an INVALID_TARGET error. Ownership is decided by whether or not there is an Owner flag when walking the page table structure while not seeing a Grant flag before getting to the point specified by the given virtualAddress. The root of the process is always considered marked Owner. Therefore, you can only allocate pages into a Resource or child Process that is 'owned' by the calling Process.
- REMAP_PAGE(virtualAddress, targetAddress) -> MapperError - Atomically moves the physical page that is allocated to the page table entry denoted by the given virtualAddress to being mapped into the empty page table entry given by targetAddress. Fails by returning INVALID_SOURCE if the virtualAddress is not aligned to a page table entry, is in a nonexistent page table structure, is unmapped, or is not owned by the current Process. Ownership is known by finding an Owner flag set when walking the page table structure. The root of the process is always considered marked Owner. Ownership is always void if a Grant flag is found when walking the structure instead. Does nothing if virtualAddress and targetAddress are effectively the same. Fails by returning INVALID_TARGET if the targetAddress is not aligned to a page table entry, is within a page table structure that does not exist, or is not one that is owned by the current Process. It keeps the same flags.
- CHMOD_PAGE(virtualAddress, flags) -> MapperError - Sets new flags on the given virtual address page table entry. Must be aligned to a page. Userspace processes can use the recursive page index PML4[510] in order to target an inner page table entry. Fails by returning INVALID_SOURCE if the page table entry to update is not in userspace. Fails by returning INVALID_FLAGS if the flags provided cannot be accommodated or specify flags that do not exist. Fails by returning INVALID_SOURCE if virtualAddress is not aligned to a page table entry, is within a nonexistent page table structure, or the page table structure is not owned by the calling Process. Ownership is known by finding an Owner flag set when walking the page table structure. The root of the process is always considered marked Owner. Ownership is always void if a Grant flag is found when walking the structure instead.
- UNMAP_PAGE(virtualAddress) -> MapperError - Clears the page mapped at the page table entry at the given virtualAddress. This frees that physical page while voiding out that page table entry within the Process. Fails by returning NOT_ALLOCATED if the physical page is not one that is in the allocatable space tracked by the page allocator. Fails by returning INVALID_SOURCE if the virtualAddress is unmapped, or if it or its page table structure is not owned by the calling Process. Ownership is known by finding an Owner flag set when walking the page table structure. The root of the process is always considered marked Owner. Ownership is always void if a Grant flag is found when walking the structure instead. If the page table entry in question has the Owner flag, this is a Resource or Process. For a Grant entry, this will clear the page table entry and decrement the reference count. For an Owner entry, this call will fail with NOT_EMPTY unless the page table is empty. This call will also fail if the Owner entry is set but the reference count for the Resource is not zero. The calling Process must revoke all grantees of this Resource to free the page. To then ultimately free a Resource, the owner must free all pages within the Resource, including the page table structure itself.
- MAP_ZERO(virtualAddress, flags) -> MapperError - Maps in a kernel-maintained 'zero' page, which is a page that is prewritten with zeros. The kernel forcibly maps this in read-only. Therefore, this fails by returning INVALID_FLAGS if flags does not indicate the ReadOnly and Present flags. This fails with INVALID_TARGET if the virtualAddress is not aligned to a page table entry, is indicating a page table structure that does not exist, or is not owned by the calling Process. Ownership is known by finding an Owner flag set when walking the page table structure. The root of the process is always considered marked Owner. Ownership is always void if a Grant flag is found when walking the structure instead.
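Several of these calls take a virtualAddress that names a page table entry through the recursive entry PML4[510]. As a sketch of how a library OS could compute such an address, the function below assumes x86-64 4-level paging with 8-byte entries and 48-bit canonical addresses. The name entry_address and its level parameter are hypothetical helpers, not XOmB system calls.

```python
RECURSIVE_INDEX = 510  # PML4 slot that points back at the PML4 itself


def entry_address(va, level=1):
    """Virtual address of the page table entry that translates `va`.

    level 1 is the leaf PT entry, 2 the PD entry, 3 the PML3 entry,
    and 4 the PML4 entry.
    """
    if not 1 <= level <= 4:
        raise ValueError("level must be between 1 and 4")
    indices = (va >> 12) & ((1 << 36) - 1)    # four 9-bit table indices
    remaining = indices >> (9 * (level - 1))  # drop indices below the entry
    addr = remaining << 3                     # each entry is 8 bytes wide
    for i in range(level):
        # Each pass through the recursive slot consumes one level of the walk.
        addr |= RECURSIVE_INDEX << (39 - 9 * i)
    if addr & (1 << 47):
        addr |= 0xFFFF << 48                  # sign-extend to canonical form
    return addr


print(hex(entry_address(0x1000)))        # leaf entry for the second page
print(hex(entry_address(0x0, level=4)))  # first entry of the PML4 itself
```

An address produced this way is aligned to a page table entry, which is the alignment the calls above check before walking the structure.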
The kernel's physical page bitmap, which shows which pages are allocated and which are not, is a readable structure at a well-known virtual address. All library OSes can read the bitmap to find a free page. Therefore, a library OS can implement its own page allocator, although it must cooperate with other page allocators on the system.
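A library OS allocator could scan the bitmap along these lines. The layout is an assumption, since the text does not specify it: one bit per 4 KiB page, a set bit meaning allocated, and bit i of byte j tracking page j * 8 + i. The function name find_free_page is illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def find_free_page(bitmap, start_page=0):
    """Return the physical address of the first free page at or after
    `start_page`, or None if every tracked page is allocated."""
    for page in range(start_page, len(bitmap) * 8):
        byte, bit = divmod(page, 8)
        if not bitmap[byte] & (1 << bit):  # clear bit: page is free
            return page * PAGE_SIZE
    return None


# Pages 0 through 10 allocated, page 11 free.
snapshot = bytes([0xFF, 0b00000111])
print(hex(find_free_page(snapshot)))  # 0xb000
```

Because other allocators read the same bitmap, a page that looks free may be claimed before the ALLOC_PAGE call is made. A NOT_FREE result then signals that the scan should resume from the next page.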