Updates system design docs.

wilkie
2025-12-30 02:26:30 -05:00
parent 3da042e53b
commit ac18e25ddc
3 changed files with 33 additions and 9 deletions

@@ -24,13 +24,14 @@ The kernel system calls related to page allocation and mapping have an enumerate
* `3`: `NOT_ALLOCATED` - The physical page cannot be freed because it was not allocated.
* `4`: `INVALID_SOURCE` - The given virtual address is not a valid virtual address for the current Process.
* `5`: `INVALID_TARGET` - The given virtual address to map to is not a valid virtual address for the current Process.
* `6`: `NOT_EMPTY` - The physical page is a page table structure that represents a Resource or Process and it is not empty.
The kernel provides these basic physical memory primitive functions:
* `ALLOC_PAGE(physicalAddress, virtualAddress, flags) -> MapperError` - Allocates the given physical page to the given virtual address of the Process, if there is no page currently mapped there. `virtualAddress` points to the specific page table entry to modify using the recursive entry (`PML4[510]`) to do so. The kernel will mark that physical page allocated. This returns `NOT_FREE` if that physical page is already allocated. Sets the flags on the leaf PML1 page table entry. The `flags` field is a set of flags that are not in any hardware-specific order. They are defined by the system. See the aforementioned structure. The `flags` field must include the `Present` flag; otherwise the call fails with `INVALID_FLAGS`. If the given `virtualAddress` is not aligned to a page table entry, the page table structure does not exist, or it is not one that is owned by the current Process, then it fails with an `INVALID_TARGET` error. Ownership is decided by whether or not there is an `Owner` flag when walking the page table structure while not seeing a `Grant` flag before getting to the point specified by the given `virtualAddress`. The root of the process is always considered marked `Owner`. Therefore, you can only allocate pages into a Resource or child Process that is 'owned' by the calling Process.
* `REMAP_PAGE(virtualAddress, targetAddress) -> MapperError` - Atomically moves the physical page that is allocated to the page table entry denoted by the given `virtualAddress` to being mapped into the empty page table entry given by `targetAddress`. Fails by returning `INVALID_SOURCE` if the `virtualAddress` is not aligned to a page table entry, is in a non-existing page table structure, is unmapped, or not owned by the current Process. Ownership is known by finding an `Owner` flag set when walking the page table structure. The root of the process is always considered marked `Owner`. Ownership is always void if a `Grant` flag is found when walking the structure instead. Does nothing if `virtualAddress` and `targetAddress` are effectively the same. Fails by returning `INVALID_TARGET` if the `targetAddress` is not aligned to a page table entry, is within a page table structure that does not exist, or it is not one that is owned by the current Process. It keeps the same flags.
* `CHMOD_PAGE(virtualAddress, flags) -> MapperError` - Sets new flags on the given virtual address page table entry. Must be aligned to a page. Userspace processes can use the recursive page index `PML4[510]` in order to target an inner page table entry. Fails by returning `INVALID_SOURCE` if the page table entry to update is not in userspace. Fails by returning `INVALID_FLAGS` if the `flags` provided cannot be accommodated or specify flags that do not exist. Fails by returning `INVALID_SOURCE` if `virtualAddress` is not aligned to a page table entry, is within a nonexistent page table structure, or the page table structure is not owned by the calling Process. Ownership is known by finding an `Owner` flag set when walking the page table structure. The root of the process is always considered marked `Owner`. Ownership is always void if a `Grant` flag is found when walking the structure instead.
* `UNMAP_PAGE(virtualAddress) -> MapperError` - Clears the page mapped at the page table entry at the given `virtualAddress`. This frees that physical page while voiding out that page table entry within the Process. Fails by returning `NOT_ALLOCATED` if the physical page is not one that is in the allocatable space tracked by the page allocator. Fails by returning `INVALID_SOURCE` if the virtualAddress is unmapped or not owned by the current Process. Fails by returning `INVALID_SOURCE` if the page table structure is not owned by the calling Process. Ownership is known by finding an `Owner` flag set when walking the page table structure. The root of the process is always considered marked `Owner`. Ownership is always void if a `Grant` flag is found when walking the structure instead.
* `MAP_ZERO(virtualAddress, flags) -> MapperError` - Maps in a kernel maintained 'zero' page which is a page that is prewritten with 0s. The kernel forcibly maps this in read-only. Therefore, this fails by returning `INVALID_FLAGS` if `flags` does not indicate the `ReadOnly` and `Present` flags. This fails with `INVALID_TARGET` if the `virtualAddress` is not aligned to a page table entry, is indicating a page table structure that does not exist, or is not owned by the calling Process. Ownership is known by finding an `Owner` flag set when walking the page table structure. The root of the process is always considered marked `Owner`. Ownership is always void if a `Grant` flag is found when walking the structure instead.
* `UNMAP_PAGE(virtualAddress) -> MapperError` - Clears the page mapped at the page table entry at the given `virtualAddress`. This frees that physical page while voiding out that page table entry within the Process. Fails by returning `NOT_ALLOCATED` if the physical page is not one that is in the allocatable space tracked by the page allocator. Fails by returning `INVALID_SOURCE` if the virtualAddress is unmapped or not owned by the current Process. Fails by returning `INVALID_SOURCE` if the page table structure is not owned by the calling Process. Ownership is known by finding an `Owner` flag set when walking the page table structure. The root of the process is always considered marked `Owner`. Ownership is always void if a `Grant` flag is found when walking the structure instead. If the page table entry in question has the `Owner` flag, this is a Resource or Process. For a `Grant` entry, this will clear the page table entry and decrement the reference count. For an `Owner` entry, this call will fail with `NOT_EMPTY` unless the page table is empty. This call will fail if the `Owner` entry is set but the reference count for the Resource is not zero. The calling Process must revoke all grantees of this Resource to free the page. To then ultimately free a Resource, the owner must free all pages within the Resource including the page table structure itself.
* `MAP_ZERO(virtualAddress, flags) -> MapperError` - Maps in a kernel maintained 'zero' page which is a page that is prewritten with zeros. The kernel forcibly maps this in read-only. Therefore, this fails by returning `INVALID_FLAGS` if `flags` does not indicate the `ReadOnly` and `Present` flags. This fails with `INVALID_TARGET` if the `virtualAddress` is not aligned to a page table entry, is indicating a page table structure that does not exist, or is not owned by the calling Process. Ownership is known by finding an `Owner` flag set when walking the page table structure. The root of the process is always considered marked `Owner`. Ownership is always void if a `Grant` flag is found when walking the structure instead.
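Several of the primitives above take a `virtualAddress` that points at a page table entry through the recursive entry `PML4[510]`. A minimal sketch of how such an address could be computed is shown here, assuming standard x86-64 4-level paging with 4 KiB pages and 8-byte entries. The helper names `canonical` and `pte_address` are hypothetical and are not part of the kernel interface.

```python
PAGE_SHIFT = 12                 # 4 KiB pages (assumption: standard x86-64 paging)
RECURSIVE_INDEX = 510           # from the text: PML4[510]
ADDR_MASK_48 = (1 << 48) - 1    # 48-bit virtual addresses


def canonical(addr):
    """Sign-extend bit 47 into bits 48..63 to form a canonical address."""
    addr &= ADDR_MASK_48
    if addr & (1 << 47):
        addr |= 0xFFFF << 48
    return addr


def pte_address(virtual_address, recursive_index=RECURSIVE_INDEX):
    """Virtual address of the PML1 entry that maps `virtual_address`.

    Going through the recursive slot shifts every table index down one
    level, so the 36-bit page number becomes the index of an 8-byte entry.
    """
    page_number = (virtual_address & ADDR_MASK_48) >> PAGE_SHIFT
    return canonical((recursive_index << 39) | (page_number << 3))
```

Under these assumptions, `pte_address(0x1000)` evaluates to `0xFFFFFF0000000008`, the second 8-byte entry in the recursive window. Entries of inner tables (PML2, PML3) would be reached by passing through the recursive slot more than once, which this sketch does not cover.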
The kernel physical page bitmap that shows which pages are allocated and which ones are not is a readable structure at a well-known virtual address. All library OSes can read the bitmap to find a free page. Therefore, a library OS can implement its own page allocator, although it must cooperate with other page allocators on the system.
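As an illustration of how a library OS might scan that bitmap, here is a minimal sketch. The bitmap layout is an assumption made for the example (one bit per 4 KiB page, a set bit meaning allocated, bit `i` of byte `j` describing page `8 * j + i`), and `find_free_page` is a hypothetical helper rather than a kernel function.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB physical pages


def find_free_page(bitmap, start_page=0):
    """Return the physical address of the first free page, or None.

    `bitmap` is a bytes-like view of the kernel's page bitmap.
    Assumed layout: a set bit means "allocated", least significant bit first.
    """
    for page in range(start_page, len(bitmap) * 8):
        byte_index, bit_index = divmod(page, 8)
        if not (bitmap[byte_index] >> bit_index) & 1:
            return page * PAGE_SIZE
    return None
```

Because another allocator may claim the same page between the scan and the call, a cooperative allocator would presumably treat a `NOT_FREE` result from `ALLOC_PAGE` as a signal to rescan from the next page rather than as a fatal error.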

@@ -36,7 +36,7 @@ The kernel offers several primitive functions to facilitate the creation and cont
* `ALLOC_PROCESS(physicalAddress, processAddress) -> ProcessError | processId` - Spawns a child process that will be mapped to the provided userspace address of the calling Process. Returns the `processId`, which is a virtual address in kernel space that, via the recursive page table entry (`PML4[510]`), points to the root page table of the new process. The `processAddress` needs to be a valid page table entry to attach the root page table for the child Process using the recursive entry `PML4[510]` to do so. Fails with `INVALID_TARGET` if the page table structure does not exist or is not owned by the calling Process. This fails with `NO_ROOM` if there are no available processes left in the system because the kernel's process map is full. This fails with `NOT_FREE` if the `physicalAddress` specified is not actually free.
* `YIELD_PROCESS(processId) -> ProcessError` - Cooperatively yields to the given Process. The `onYield` upcall of the target Process will be passed the calling Process `processId`. Effectively, this swaps the current root page table (PML4 via CR3) to the one for the given process. The process has to deal with restoring its own state. There's no context being stored because the kernel is effectively stateless except for maintaining access to all address spaces. On success, this function never returns. If the `processId` is in any way invalid, it will return with the `INVALID_TARGET` error.
* `FREE_PROCESS(processId, processAddress) -> ProcessError` - Yields to a child process that was previously allocated via an earlier `ALLOC_PROCESS` call to allow it to deal with closing itself. It calls the `onFree` upcall while passing the current Process `processId`. The calling Process might expect that the target Process yield back. Fails with `INVALID_SOURCE` if the calling Process is not the parent of the given Process (that is, `processAddress` does not point to the same physical page as the PML4 within the address space rooted at `processId`). On success, the Process is effectively freed and can no longer be the target of a yield.
* `FREE_PROCESS(processId, processAddress) -> ProcessError` - Yields to a child process that was previously allocated via an earlier `ALLOC_PROCESS` call to allow it to deal with closing itself. It calls the `onFree` upcall while passing the current Process `processId`. The calling Process might expect that the target Process yield back. Fails with `INVALID_SOURCE` if the calling Process is not the parent of the given Process (that is, `processAddress` does not point to the same physical page as the PML4 within the address space rooted at `processId`). On success, the Process is effectively freed and can no longer be the target of a yield. The parent Process effectively frees a child Process on its own by deallocating all pages of the child Process.
The kernel has the following check (which is identical to the one for checking if a Process owns a Resource) of whether or not the calling Process is the parent of the given Process, which it checks on the Process free operation:

@@ -18,14 +18,13 @@ The kernel system calls related to resource allocation and mapping have an enume
The kernel offers several primitive functions to facilitate the creation and control of resources:
* `ALLOC_RESOURCE(physicalAddress, resourceAddress) -> ResourceError | resourceId` - Creates an empty Resource with an implied depth using the given unallocated `physicalAddress` as the page. The `resourceAddress` is the virtual address of the userspace page table entry (PTE) to write the Resource root using the `PML4[510]` recursive route. The `depth` is thus determined by the `resourceAddress` by virtue of the type of page table structure the PTE would be written to. If the PTE is written to a PML3 in the Process, then the Resource must be a PML2 root and have a depth of 2. The `depth` then determines the size of the virtual address space the Resource will represent. The kernel will apply the `Owner` flag on the Resource when mapping it to the Process address space at the given `resourceAddress`. The kernel will essentially create the root page table for the Resource and nothing else while mapping that into the Process address space. Given that this succeeds, it will return the `resourceId` for the Resource, which is the virtual address in the kernel that points to kernel-space memory for the root page table of that Resource. **Note**: This does not return `0` on success. Instead, it returns a `resourceId` which, compared as an unsigned value, is always larger than `0xff`. This fails with `INVALID_TARGET` if the `resourceAddress` points to a page table structure that does not exist or is not owned by the calling Process. This fails with `NO_ROOM` if we have hit the Resource cap or there's no room for it. This fails with `NOT_FREE` if the provided `physicalAddress` is not actually free.
* `FREE_RESOURCE(resourceId, resourceAddress) -> ResourceError` - Frees an existing Resource via the resource id. Fails with `INVALID_SOURCE` if the Resource does not exist. A grantee of a Resource can also call this which will unmap the Resource. Resources are reference counted. When the owner frees the Resource, it is revoked from any Process that had been granted it.
* `GRANT_RESOURCE(resourceId, resourceAddress, processId, targetResourceAddress, flags) -> ResourceError` - Attaches a shared copy of the Resource to the given Process. This will effectively add a page table entry signified by the `targetResourceAddress` in the target Process with the given `flags` and also give it the `Grant` flag. If any `flags` are invalid, it will fail with the `INVALID_FLAGS` error. Resources are reference counted. This will increment the reference count. The Resource can be unmapped from the target Process by calling `REVOKE_RESOURCE` with the same arguments. The Resource will be forcibly revoked if the owning Process calls `FREE_RESOURCE` on this Resource or otherwise has this Resource deallocated at cleanup. Fails with `INVALID_SOURCE` if the Resource does not exist or is not owned by the calling Process. Fails with `INVALID_TARGET` if the given Process does not have the `targetResourceAddress` free, that is it is already occupied in its own page table. Fails with `INVALID_TARGET` if the given Process provided by `processId` does not exist. Fails with `INVALID_SOURCE` if the `resourceAddress` is not the same Resource as indicated by `resourceId`.
* `GRANT_RESOURCE(resourceId, resourceAddress, processId, targetResourceAddress, flags) -> ResourceError` - Attaches a shared copy of the Resource to the given Process. This will effectively add a page table entry signified by the `targetResourceAddress` in the target Process with the given `flags` and also give it the `Grant` flag. If any `flags` are invalid, it will fail with the `INVALID_FLAGS` error. Resources are reference counted. This will increment the reference count. The Resource can be unmapped from the target Process by calling `REVOKE_RESOURCE` with the same arguments. Fails with `INVALID_SOURCE` if the Resource does not exist or is not owned by the calling Process. Fails with `INVALID_TARGET` if the given Process does not have the `targetResourceAddress` free, that is it is already occupied in its own page table. Fails with `INVALID_TARGET` if the given Process provided by `processId` does not exist. Fails with `INVALID_SOURCE` if the `resourceAddress` is not the same Resource as indicated by `resourceId`.
* `REVOKE_RESOURCE(resourceId, resourceAddress, processId, targetResourceAddress) -> ResourceError` - Detaches a Resource from a grantee that had been granted this Resource via `GRANT_RESOURCE` earlier. This unmaps the page table root for this Resource from the target Process given by `processId`. This fails with `INVALID_SOURCE` if the Resource does not exist or is not owned by the current Process. Fails with `INVALID_TARGET` if the given Process does not have the `targetResourceAddress` mapped in or if the `targetResourceAddress` is not the Resource indicated by `resourceId`.
* `CHOWN_RESOURCE(resourceId, resourceAddress, processId, targetResourceAddress) -> ResourceError` - Updates the owner of the Resource to the given Process. Atomically swaps the `Owner` flags on each Process's mapped in Resource. Fails with `INVALID_SOURCE` if the Resource does not exist or is not owned by the calling Process. Fails with `INVALID_SOURCE` if the given Process does not have the Resource mapped in. Fails with `INVALID_TARGET` if the given Process provided by `processId` does not exist. Fails with `INVALID_TARGET` if the given Resource provided by `targetResourceAddress` is not mapped into the given Process provided by `processId` or if that Resource is not the same Resource as indicated by `resourceId`.
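The relationship between where a Resource is attached, its implied `depth`, and the size of the address space it represents can be worked through with a little arithmetic. The following is a minimal sketch, assuming 4 KiB pages and 512-entry tables. Both function names are hypothetical.

```python
PAGE_SIZE = 4096         # assumption: 4 KiB pages
ENTRIES_PER_TABLE = 512  # assumption: 512 entries per page table


def depth_for_table_level(table_level):
    """Depth of a Resource whose root is written into a table at `table_level`.

    A PTE written into a PML3 (level 3) attaches a PML2 root, so depth is 2.
    """
    if table_level not in (2, 3, 4):
        raise ValueError("a Resource root is attached to a PML2, PML3, or PML4 entry")
    return table_level - 1


def resource_span_bytes(depth):
    """Bytes of virtual address space covered by a Resource of `depth`."""
    return PAGE_SIZE * ENTRIES_PER_TABLE ** depth
```

Under these assumptions, depth 1 covers 2 MiB, depth 2 covers 1 GiB, and depth 3 covers 512 GiB.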
The kernel creates a Resource by using this procedure:
The kernel creates a Resource by using this procedure (`ALLOC_RESOURCE`):
* Allocate a root page table by allocating a single physical page.
* Allocate a root page table for the Resource using the given `physicalAddress` and fail with `NOT_FREE` if that physical page is not free.
* Map this into the kernel memory space. The kernel maintains all created resources in a single root PML3. So, it can manage up to 512 3-level resources. It would be rare to create a 3-level resource (512 GiB of virtual space), so we can expect mostly 1 GiB Resources which are rooted at PML2 and 2 MiB resources rooted at the PML1 level. The kernel can keep 256M second-level (1 GiB) resources mapped in at a time. The identifier for a Resource is, then, the virtual address that uses the recursive mapping (PML4[510]) to point to the root table of the Resource. Whenever a Process gives the Resource id, the kernel can securely check that the physical page pointed to by that virtual address is also present within the Process's root page table.
* The kernel places a record in its hash table for mapping the physical address of that root page to the virtual address which serves as that Resource id. This is so it may look it up in the event that it needs to deallocate the Resource forcibly at any point. Normally, the application will always deallocate the Resource at some point before it ends execution. However, the program may crash.
* Then, it can map this Resource page table into the Process so that the Process now owns the Resource. It does this by mapping the new resource into the position requested by the Process via `resourceAddress`. The virtual address given by `resourceAddress` is effectively pointing to the same physical page that the kernel's own copy of the Resource is mapped to, which is known as `resourceId`. Therefore, this provides the kernel with a constant-time check for ownership: the `resourceId` is a virtual address which is pointing to the same physical address as the virtual address of `resourceAddress`. It should map the resource in with the `Owner` flag since it was the calling Process that created the Resource. (See `docs/MEMORY.md` for information about these flags)
@@ -33,7 +32,31 @@ The kernel creates a Resource by using this procedure:
The kernel has the following check of ownership, which it checks on every Resource operation:
* `is_mapped(resourceId, resourceAddress) -> boolean` - This looks at the `resourceId` and ensures that it is a virtual address that runs through the kernel's Resource map. Let's say that we maintain a mapping of all resources on `PML4[508]`, so the 508th index of the root page table points to a PML3 that contains, as leaves of the tree, the root page tables of all known Resource objects. We can then tell very easily if `resourceId` is a virtual address that uses the recursive entry (`PML4[510]`) to point to the physical page of the Resource root. That physical page must be the same one pointed to by the given `resourceAddress`. We also verify that the `resourceAddress` is not in higher memory, which is always owned by privileged kernel code. If all of these hold true, the current calling Process owns the given Resource.
* `is_mapped(resourceId, resourceAddress) -> boolean` - This looks at the `resourceId` and ensures that it is a virtual address that runs through the kernel's Resource map. Let's say that we maintain a mapping of all resources on `PML4[508]`, so the 508th index of the root page table points to a PML3 that contains, as leaves of the tree, the root page tables of all known Resource objects. We can then tell very easily if `resourceId` is a virtual address that uses the recursive entry (`PML4[510]`) to point to the physical page of the Resource root. That physical page must be the same one pointed to by the page table entry given `resourceAddress`. We also verify that the `resourceAddress` is not in higher memory, which is always owned by privileged kernel code. If all of these hold true, the current calling Process owns the given Resource.
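The core of this check, that `resourceId` and `resourceAddress` must resolve to the same physical page, can be modelled in a few lines. This is only a sketch: the two dictionaries stand in for the kernel's Resource map and the calling Process's page table entries, and their shapes are assumptions made for the example. The higher-memory check is left out of the sketch.

```python
def is_mapped(resource_id, resource_address, kernel_resource_map, process_entries):
    """Return True if both addresses resolve to the same Resource root page.

    kernel_resource_map: resourceId -> physical page of the Resource root
                         (stand-in for the map hanging off PML4[508]).
    process_entries:     page table entry address -> physical page it points to
                         (stand-in for the calling Process's page tables).
    """
    root_page = kernel_resource_map.get(resource_id)
    if root_page is None:
        return False  # not an id that runs through the kernel's Resource map
    return process_entries.get(resource_address) == root_page
```

In the real kernel both lookups would be page table reads rather than dictionary lookups, which is what keeps the check constant-time.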
The kernel grants a Resource to another Process by using this procedure:
The kernel grants a Resource to another Process by using this procedure (`GRANT_RESOURCE`):
* Validates that the `resourceId` matches the mapped in `resourceAddress` and otherwise fails with `INVALID_SOURCE`.
* Validates that the current Process owns the given Resource.
* Validates that the `processId` is a valid process otherwise fails with `INVALID_TARGET`.
* Validates that the `targetResourceAddress` is pointing to a valid and empty page table entry that is owned by the target Process. Otherwise fails with `INVALID_TARGET`.
* Writes the page table entry in the target Process to point to the same physical address as `resourceAddress` but without the `Owner` flag and with the `Grant` flag.
* Returns `SUCCESS` to the original calling Process.
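The grant steps above can be sketched as a small in-memory model. Everything here is illustrative: the `Kernel` and `Entry` classes, the string-valued error enum, and the dictionary-based page tables are assumptions for the sketch, and the `INVALID_FLAGS` check is omitted. Any target address not yet present in the target's table is treated as a valid, empty entry.

```python
from dataclasses import dataclass, field
from enum import Enum


class ResourceError(Enum):
    # Names come from the text; the numeric codes are not assumed here.
    SUCCESS = "SUCCESS"
    INVALID_SOURCE = "INVALID_SOURCE"
    INVALID_TARGET = "INVALID_TARGET"


OWNER, GRANT = "Owner", "Grant"


@dataclass
class Entry:
    physical_page: int
    flags: frozenset


@dataclass
class Kernel:
    resources: dict = field(default_factory=dict)  # resourceId -> root physical page
    processes: dict = field(default_factory=dict)  # processId -> {address: Entry}
    refcounts: dict = field(default_factory=dict)  # resourceId -> grant count

    def grant_resource(self, caller, resource_id, resource_address,
                       process_id, target_address, flags=frozenset()):
        root = self.resources.get(resource_id)
        own = self.processes[caller].get(resource_address)
        # resourceId must match the entry mapped at resourceAddress.
        if root is None or own is None or own.physical_page != root:
            return ResourceError.INVALID_SOURCE
        # The caller must own the Resource.
        if OWNER not in own.flags:
            return ResourceError.INVALID_SOURCE
        # The target Process must exist and the target entry must be empty.
        target = self.processes.get(process_id)
        if target is None or target_address in target:
            return ResourceError.INVALID_TARGET
        # Same physical page, never Owner, always Grant.
        target[target_address] = Entry(root, (frozenset(flags) - {OWNER}) | {GRANT})
        self.refcounts[resource_id] = self.refcounts.get(resource_id, 0) + 1
        return ResourceError.SUCCESS
```

A grantee that tries to grant the Resource onward is rejected with `INVALID_SOURCE` in this model, because its own entry carries `Grant` rather than `Owner`.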
The kernel revokes a Resource from another Process by using this procedure (`REVOKE_RESOURCE`):
* Validates that the `resourceId` matches the mapped in `resourceAddress` and otherwise fails with `INVALID_SOURCE`.
* Validates that the current Process owns the given Resource.
* Validates that the `processId` is a valid process otherwise fails with `INVALID_TARGET`.
* Validates that the `targetResourceAddress` is pointing to a valid page table entry that is owned by the target Process and points to the given Resource. Otherwise fails with `INVALID_TARGET`.
* Voids the page table entry in the target Process.
* Returns `SUCCESS` to the original calling Process.
The kernel changes ownership of a Resource by using this procedure (`CHOWN_RESOURCE`):
* Validates that the `resourceId` matches the mapped in `resourceAddress` and otherwise fails with `INVALID_SOURCE`.
* Validates that the current Process owns the given Resource.
* Validates that the `processId` is a valid process otherwise fails with `INVALID_TARGET`.
* Validates that the `targetResourceAddress` is pointing to a valid page table entry that is owned by the target Process and points to the given Resource. Otherwise fails with `INVALID_TARGET`.
* Atomically swaps the page table entries. Failing atomicity, it can set the `Grant` bit and clear the `Owner` in the calling Process and then set the `Owner` bit and clear the `Grant` bit on the target Process.
* Returns `SUCCESS` to the original calling Process.
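The non-atomic fallback for the swap can be illustrated with plain bit operations. The bit positions chosen for `OWNER` and `GRANT` are hypothetical, since the flag layout is defined elsewhere, and `swap_ownership` is a name invented for this sketch.

```python
OWNER = 1 << 0  # hypothetical bit position
GRANT = 1 << 1  # hypothetical bit position


def swap_ownership(caller_flags, target_flags):
    """Return the (caller, target) flag words after a change of owner.

    The caller is demoted first and the target promoted second, so at no
    point in the sequence do two entries carry the Owner bit.
    """
    caller_flags = (caller_flags | GRANT) & ~OWNER
    target_flags = (target_flags | OWNER) & ~GRANT
    return caller_flags, target_flags
```

All other flag bits on both entries pass through unchanged. With this ordering there is a brief window in which neither entry is marked `Owner`, which seems safer than a window with two owners, though that trade-off is a reading of the fallback rather than something the procedure states.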