Build System:
=============
1) Pass configuration parameters to scons from a configuration file.
   (Platform name, CPP defines.)
2) Build files in a separate directory. - DONE
3) Build files in subdirectories depending on CPP defines for the files
   in those subdirectories.

CML2 will mostly cover (1) and (3).

What to do next:
1) Define a more interesting linker script.
2) Add uart code to get immediate printfs. (For both loader + payload image)
3) Try linking with the loader and loading the payload image with the loader.

What to do next:
4) Define a KIP structure and page.
5) Implement a page table section.
6) Implement routines to modify page tables.
7) Define a platform memory description. (Peripherals, memory pool)
8) Define a mapper.
9) Map platform peripheral memory.
10) Implement a memory allocator.
11) Implement a page fault handler for kernel page faults.
12) Implement arch-specific cache and tlb operations.

The Big TODOs:
--------------
1) A memory manager.
2) System calls and IPC calls.
3) Scheduler/Context switching.
4) Virtual memory support for v5 and v6.

The key approach:
-----------------
Progressing in minimal steps at a time.

Progress Log:
-------------
1) Don't forget to modify the control register to enable high vectors.
2) Investigate ROM/System bits, access permissions and domains.
3) Don't forget to map the page tables region as unbufferable/uncacheable.

---- Done ----
1) Enable caches/wb after jumping to virtual memory. See if this helps.
2) Flush caches/wb right after jumping, before any data references.
   See if this also helps.
*) Don't forget to modify the control register to enable high vectors.

---- Done ----
1) Test printascii and printk.
*) Don't forget to modify the control register to enable high vectors.

Was not accessing the right uart offset, which caused precise external
aborts.

---- Done ----
1) Implement a boot memory allocator.
2) Allocate memory for secondary-level page tables -> at least enough to
   describe kernel memory.

   For example:

       first level table ------>>> section
                         ------>>> section
                         ------>>>>>>>>>>>>>>> second level table
                         ------>>> section        ---------->>>>>>>>>>>>> Timer Page
                                                  ---------->>>>>>>>>>>>> UART Page
                                                  ---------->>>>>>>>>>>>> VIC Page

   Here, one secondary table would be enough to map the devices with page
   granularity. This would be the table that describes the first IO region.

3) Add add/remove mapping functions with page-size granularity.
*) Don't forget to modify the control register to enable high vectors.

---- Done ----
1) Add a remove_mapping() function to remove the early-boot one-to-one
   mapping etc. (This might internally handle both section and page
   mappings depending on the given size.)
2) Test kmalloc/alloc_page. Implement a solution for their dependency on
   each other.

---- Done ----
1) Make sure to build 3 main executables, for kmalloc/alloc_page/memcache.
2) Make sure parameters can be passed as arguments to each test.
3) Write python scripts that look for the expected output.
   (Determine criteria other than init_state == end_state.)

---- Done ----
Reading python script output by hand works well for now.

1) Must allocate PGDs from a memcache? i.e. a cache with 10 pgds (175K)
   should suffice.
2) Must allocate PMDs from a memcache? A cache with ~100 pmds should
   suffice (100K). These must be allocated as unmapped/uncached.
   (Use __alloc_page()?) Don't think they need a phys_to_ptab kind of
   conversion. They can have regular virtual offsets as long as they're
   mapped with the correct settings. (APs and c/b bits)
3) Implement tcbs, user-space thread spawning and the scheduler.
   -> Tough one!
   - tcbs must keep track of the pgd_ptr of each task, which in turn are
     connected to their pmds.
   - Each new pgd must copy all kernel-space pmds.
   - Each pgd has distinct pmds for the userspace area and the same pmds
     for the kernel area.

How to initiate a svc task and switch back and forth with the kernel:
- Load the task.
- The kernel is aware of where it is loaded.
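The section-vs-page split behind the add/remove mapping functions can be sketched as a size calculation. A hedged sketch: the constants match ARM's 1 MB sections and 4 KB small pages, but map_granularity() is an invented helper for illustration, not the project's real add_mapping().

```c
#include <stdint.h>

#define SECTION_SIZE 0x100000u  /* 1 MB: one first-level (pgd) entry */
#define SMALL_PAGE   0x1000u    /* 4 KB: one second-level (pmd) entry */

/* Split a mapping request into section-granularity and page-granularity
 * entries. Returns how many second-level tables must be allocated
 * (at most one here, since the remainder fits a single 1 MB region). */
static unsigned map_granularity(uint32_t size,
                                unsigned *sections, unsigned *pages)
{
    *sections = size / SECTION_SIZE;
    *pages = (size % SECTION_SIZE) / SMALL_PAGE;
    return *pages ? 1u : 0u;
}
```

So mapping the three device pages above (timer, uart, vic) costs a single second-level table, as the notes observe.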
TODO:
- Load the qemu/insight tutorial to wikia.
- Try the loader on a board and/or qemu.
- Add bootdesc to the kernel image, and try reading it and discovering
  inittask from it.

NEXT:
- Try jumping to inittask from init.c.
- Create a syscall page.

Done the inittask jumping. About to create syscalls.

TODO:
Link the inittask with l4lib. This will reveal what exactly it does in
userspace to call the system calls, i.e. what offset it jumps to and what
it fills its registers with. Then use this information to implement the
first system call. E.g. it would call a dummy system call by reading the
kip. The system call could then dump the state of all registers to see
what it passed to the kernel. We need to do all this checking because it
seems there's no syscall page but just the kip.

Hints: See user.S. Implemented in the kernel, it ensures sp contains the
syscall offset; this is then checked in the syscall. Both being
implemented in the kernel means this can be changed without breaking
users. E.g. it could be the direct vector address in the kip (like
0xFFFFFF00), they could all be swi's; the SVC and USR LRs determine where
they came from and where to return.

Wrote the syscall page. Test each system call from userspace, i.e. detect
what is called etc.

TODO:
Done all of the above. A variety of system calls can be called. However,
it fails in various ways. Must fix this. qemu works up to some 4 system
calls; it is unknown why any more cause weird exceptions. Real hardware
fails in any setup, with an Invalid CPU state. Needs investigating.

TODO:
Fixed everything; context wasn't being saved upon context switch, so the
kernel corrupted the registers.

TODO:
Fixed everything that wasn't there. Added irqs and the scheduler. Now we
need to sort out per-process kernel context:

        ^
        |  Userspace
        V
       ...
     sp_svc
        ^
        |
        |
        |   4 KB Page
        |
        |
        V
       tcb

I think every process must have a unique virtual page for its stack +
ktcb. If the same virtual page is used (but with different physical
pages), then I can't keep track of ktcbs when they aren't runnable.

Now the paths:

-> USR running. (USR Mode)
-> System call occurs. (USR Mode -> SVC Mode)
   - Save user context (in the ktcb), load sp_svc, continue.
-> IRQ occurs. ((USR Mode, SVC Mode, or IRQ Mode) -> IRQ Mode)
   - If from USR Mode, save user context into ktcb->user_context.
   - If from SVC Mode, save kernel context into ktcb->kernel_context.
   - If from IRQ Mode, save irq context on the irq stack.
-> Scheduling occurs. (IRQ Mode -> (USR Mode or SVC Mode))
   - Restore user context.

CHANGES:
Forget the above design. Each interrupter saves the context of the
interruptee on its own stack: e.g. svc saves usr; irq saves svc or usr.
The one exception: because irqs are reentrant, irqs change to svc mode
and save context on the svc stack. Only upon a context switch is the
context restored from the stack and pushed into the ktcb context frame.
This way, at any one time a non-runnable thread could have svc or usr
context in its frame, depending on the mode it was interrupted and
blocked in.

TODO:
- 8-byte aligned stack in the irq handler.
- What do those cpsr_fcxt flags mean? When are they needed?

Done:
- Tie up jump_usr(). Also check whether calling void schedule(void)
  pushes anything to the stack. It really shouldn't. (But I'm sure it
  pushes r0-r3, r12, lr.) Need to rewind those.
- Is the current macro correct? Check.

TODO:
- 8-byte aligned stack in the irq handler.
- What do those cpsr_fcxt flags mean? When are they needed?
- Limit irq nesting so that it never overflows the irq stack.

Things to do next:
------------------
- Add new tasks. - Done (Added a compile-time roottask)
- Add multi-page task support. (With data/text/bss sections transferred
  to bootdesc.)
- Implement all 9 system calls.
- Remove malformed linked lists from the allocators.
- Add a vfs task.
- Add the 6 major POSIX calls. (open/close/read/write/creat/seek)
- Add microwindows/Keyboard/Mouse/CLCD support.
- Add a FAT32? filesystem.
- Add a device driver framework. (Bus support, device probe/matching)
- Add ethernet + the lwip stack.
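The "unique virtual page for stack + ktcb" layout is what makes a cheap current macro possible: since the ktcb sits at the bottom of the 4 KB kernel stack page, masking any in-page sp value down to the page base recovers it. A minimal sketch; the ktcb fields and names are illustrative, not the project's real layout.

```c
#include <stdint.h>

#define KSTACK_PAGE 0x1000u   /* the 4 KB stack + ktcb page */

/* Illustrative ktcb: lives at the bottom of the page; sp_svc grows
 * down from the top of the same page towards it. */
struct ktcb {
    uint32_t context[17];     /* r0-r15 + spsr of the blocked thread */
};

/* Recover the current ktcb from any kernel sp within the page. */
static inline struct ktcb *ktcb_from_sp(uintptr_t sp)
{
    return (struct ktcb *)(sp & ~(uintptr_t)(KSTACK_PAGE - 1));
}
```

This only works because each process has a distinct virtual page: with a shared virtual page backed by different physical pages, the mask would always yield the same address and non-runnable ktcbs could not be found.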
Things to do right now:
-----------------------
readelf is broken for finding --lma-start-end and --find-firstpage.
Fix those. - Fixed in half an hour.

More quick TODOs:
-----------------
- Pass variable arguments to every syscall, with the right args.

Previous stuff is done. Now to add:
------------------------------------
- I am implementing the IPC system call. Currently utcbs are copied from
  one thread to another (not even compiled, but the feature is there),
  but no synchronisation is implemented.
- TODO: Add waitqueues for ipc operations. E.g. a thread that does
  ipc_send() waits for the thread that does ipc_receive() to wake it up,
  or vice versa.
- It looks like every ipc instance between 2 unique threads requires a
  unique waitqueue.
- If there are n threads, there could be a maximum of n(n-1)/2 ipc
  instances (e.g. a mesh) if simultaneous ipcs were allowed. Wait, they
  are not allowed, so for n threads there can be at most n/2 ipc
  instances, which is better for waitqueue complexity.

Done:
- Some hash tables to keep ipc rendezvous.
- Not working with lists yet.

TODO:
- At least wait_event() must be a macro, and the sleep condition must be
  checked after the spinlock is acquired (and before the sleepers++, in
  case the condition is the sleepers count). Currently there is a race
  here.
- Figure out the problem with lists.

TODO:
- MAPPINGS: There must be an update_mappings() function that updates all
  new physical-to-virtual mappings that are common to every process. For
  example, any common kernel mapping that can be accessed by any process
  requires this. Currently add_mapping() adds the mapping for the
  current process only.

We need a storyline for the initialisation and interaction of the
critical server tasks after the microkernel has finished initialising.

The Storyline:
--------------
Microkernel initialises itself.
Microkernel has filled in the page_map array.
Microkernel has filled in the physmem descriptor.
Microkernel reads bootdesc.
Microkernel allocates, maps, starts mm0.   (Memory manager)
Microkernel allocates, maps, starts name0. (Naming server)
Microkernel allocates, maps, starts pm0.   (Process manager)

== Servers Start ==
name0 waiting for start message from mm0.
pm0 waiting for start message from mm0.
mm0 invokes request_bootdesc on Microkernel.
mm0 invokes request_pagemap on Microkernel.
mm0 initialises the page allocator.
mm0 initialises the page_map arrays.

== mm0 in full control of name0 and pm0 address spaces, and can serve
   memory requests. ==

mm0 starts pm0.
pm0 invokes request_procdesc on Microkernel. (Learn what processes are
running, and their relationships)

== pm0 in full control of name0 and pm0 process information, and can
   serve process requests. ==

mm0 starts name0.
name0 initialises its naming service.
name0 waiting for advertise_method_list.

== Method Advertise Stage ==
mm0 invokes advertise_method_list on name0. (Tell who can invoke what
method on mm0)
pm0 invokes advertise_method_list on name0. (Tell who can invoke what
method on pm0)

== name0 in full control of what all servers can invoke on each other. ==

== Method Request Stage ==
pm0 invokes request_method_list on name0. (Learn what methods pm0 can
invoke on whom)
pm0 initialises its remote method array.
mm0 invokes request_method_list on name0. (Learn what methods mm0 can
invoke on whom)
mm0 initialises its remote method array.

== All servers in full awareness of what methods they can invoke on
   other servers. ==

Remote methods:
---------------
Remote methods can pass up to an architecture-defined number of bytes
to/from each other without copying and without setting up a connection.
Alternatively, for larger data sizes, a connection (a shared memory
area) is set up, and a *single structure* of arbitrary size can be
read/written. Remote method invocation semantics typically forbid using
data types any more complex than a single raw structure of arbitrary
size. This greatly simplifies the communication semantics and data
transfer requirements.
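The two transfer modes can be sketched as a message layout. A hedged illustration: names such as rmi_msg and RMI_INLINE_BYTES are invented here, not the project's real API, and the inline payload size is an assumption standing in for "the architecture-defined number of bytes" (e.g. a handful of registers).

```c
#include <stdint.h>

#define RMI_INLINE_BYTES 24u   /* assumed arch-defined inline payload */

/* An invocation either carries its arguments inline (no copy, no
 * connection setup) or points at a single structure of arbitrary size
 * inside a pre-established shared memory connection. */
struct rmi_msg {
    uint32_t method_id;
    uint32_t len;                  /* argument size in bytes */
    union {
        uint8_t inline_data[RMI_INLINE_BYTES];
        void   *shared_struct;     /* lives in the shared memory area */
    } u;
};

/* Small arguments avoid connection setup entirely. */
static int rmi_needs_connection(uint32_t len)
{
    return len > RMI_INLINE_BYTES;
}
```

Restricting the large case to one raw structure keeps the design free of marshalling and type discovery, as argued above.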
For a readily set-up connection, the remote method invocation cost is
the cost of a single ipc (i.e. a context switch to the server and back
to the client); also, the shared memory is currently uncached. RMI in
the usual case does *not* copy data. There's no object-level support,
e.g. no marshalling, no dynamic type discovery or any other
object-oriented programming bloat. Objects *are* policy anyway. There's
no support for network-transparent communication either. If all these
things were supported, the system would end up becoming large and
complex like Chorus or Sprite, and guaranteed to fail.

Context Switch between servers:
-------------------------------
It is the intention that context switches between the critical servers
have minimal overhead. For example, the single-address-space linux
kernel can switch between kernel threads quite cheaply, because there's
no need to change page table mappings and therefore no cache/tlb
thrashing. The idea is to link and run critical servers in
non-overlapping address space areas (e.g. simply link each at its
physical address), so that switching between them needs no page table
change.

Quick todo:
-----------
1) Fix the page allocator as a library. - Done
2) Reduce the physmem descriptor's fields. - Done
3) Export memcache as a library. - Done
4) Fix the problem of unmapping multiply-mapped pages.
5) Fix the problem of assuming a free() always occurs from kernel
   virtual addresses.
4-5) Hint: Set up the physical page array with mapping information.
6) Sort out how to do virt-to-phys and phys-to-virt in tasks. The kernel
   virt-to-phys must be replaced.
7) Refactor ipc_send()/ipc_recv() and add_mapping()/add_mapping_pgd().
   Remove some gotos.

The revised storyline, with no naming yet:
------------------------------------------
Microkernel initialises itself.
Microkernel has filled in the page_map array.
Microkernel has filled in the physmem descriptor.
Microkernel reads bootdesc. (3 tasks: mm0, pm0, task3)
Microkernel allocates, maps, starts mm0. (Memory manager)
Microkernel allocates, maps, starts pm0. (Process manager)

== Servers Start ==
pm0 waiting for start message from mm0.
mm0 invokes request_bootdesc on Microkernel.
mm0 invokes request_pagemap on Microkernel.
mm0 initialises the page allocator.
mm0 initialises the memory bank and page descriptors.
mm0 sets up its own address space temporarily.

== mm0 is somewhat initialised and can serve memory requests. ==

mm0 starts pm0.
pm0 invokes request_procdesc on Microkernel. (Learn what processes are
running, and their relationships)
pm0 sets up task_desc structures for the running tasks.

== pm0 is somewhat initialised and can serve task-related requests. ==

pm0 calls a mock-up execute() to demonstrate demand paging on the third
task's execution.

Long Term TODOs:
----------------
- Finish inittask (pager/task manager).
- Start on the fs server (vfs + a filesystem).
- Finish: fork, execve, mmap, open, close, create, read, write.

Current todo:
=============
- Use shmat/shmget/shmdt to map block device areas to FS0 and start
  implementing the VFS.

todo:
- Generate 4 vmfiles: env, stack, data, bss.
- Fill in env as a private file. As faults occur on env, simply map the
  file to the process.
- Create an empty data, bss and stack file.
  As faults occur on real data, copy-on-write onto the proc->data file,
  by creating shadows.
  As faults occur on devzero, copy-on-write onto the proc->stack file,
  by creating shadows.
  As faults occur on bss, copy-on-write onto the proc->bss file, by
  creating shadows.

FORK:
If a fork occurs, copy all vmas into the new task. Find all RW and
VM_PRIVATE regions; all RW shadows are eligible. Create a fork file for
each RW/VM_PRIVATE region, e.g.:

    task->fork->data
    task->fork->stack
    task->fork->bss

All RW/PRIVATE shadows become RO, with task->fork owners rather than
their original owners (e.g. proc->data, proc->stack etc.). All pages
under a shadow are moved onto those files. Increase the file refcount
for forker tasks. As faults occur on fork->stack/bss/data, copy-on-write
onto proc->stack/bss/data, by making the shadows RW again and copying
the faulted pages from the fork files onto the proc->x files.
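The shadow bookkeeping above can be modelled in a few lines. A toy sketch only: vmfile/vmpage here are invented for illustration, hold a single tiny page each, and are far simpler than the real vm objects; the point is the RO-on-fork, copy-and-RW-on-fault protocol.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE 16   /* tiny page size for the model */

struct vmpage { uint8_t data[MODEL_PAGE]; int rw; };
struct vmfile { struct vmpage page; };   /* one-page "file" */

/* fork: the fork file takes over the RW/PRIVATE shadow, read-only. */
static void cow_fork(const struct vmfile *owner, struct vmfile *forkfile)
{
    forkfile->page = owner->page;
    forkfile->page.rw = 0;        /* all RW/PRIVATE shadows become RO */
}

/* write fault on fork->x: copy the faulted page onto the task's own
 * proc->x file and make the shadow RW again. */
static void cow_fault(const struct vmfile *forkfile, struct vmfile *proc)
{
    proc->page = forkfile->page;  /* copy the faulted page */
    proc->page.rw = 1;            /* shadow is RW again */
}
```

After a fault the task writes to its private copy, while the fork file keeps the pristine RO page for any other forker holding a refcount.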