Upon fork, child was created in a new space but as a copy of any
cloned thread in the parent space. This was due to the search of forker thread
by its space id (which is shared among many cloned threads).
Now fixed.
modified: src/api/thread.c
modified: tasks/mm0/src/task.c
- KIP's pointer to UTCB seems to work with existing l4lib ipc functions.
- Works up to clone()
- In clone we mmap() the same UTCB on each new thread - excessive.
- Generally during page fault handling, cloned threads may fault on the same page
multiple times even though a single handling would be enough for all of them.
Need to detect and handle this.
- Now libl4 has no references to utcb page or shmat etc.
- Pager does not deal with special case utcb page allocation.
It instead allocates a shared page from shm memory pool.
- All tasks working to original standard.
Next:
- Add per-thread utcb allocation from the kernel
- Add larger register file for standard ipc
- Add long ipc (up to 1Kb)
Previously a so-called utcb shared page was used for transfering
data between posix services. This was a special shmat/get/dt case
allocating from its own virtual pool. Now the term utcb is renamed
as a shared page and integrated with the shm* handling routines.
Generic l4 threads will use long-ipc and not this method. Posix
services will continue to communicate on a shared page for now.
modified: tasks/libl4/include/l4lib/ipcdefs.h
modified: tasks/libl4/src/init.c
new file: tasks/libposix/init.c
modified: tasks/mm0/include/shm.h
modified: tasks/mm0/include/task.h
deleted: tasks/mm0/include/utcb.h
modified: tasks/mm0/main.c
modified: tasks/mm0/src/boot.c
modified: tasks/mm0/src/clone.c
modified: tasks/mm0/src/execve.c
modified: tasks/mm0/src/exit.c
modified: tasks/mm0/src/init.c
modified: tasks/mm0/src/shm.c
modified: tasks/mm0/src/task.c
deleted: tasks/mm0/src/utcb.c
deleted: tools/l4-qemu
- Directory creation, file read/write is OK.
- Cannot reuse old task's fds. They are not recycled for some reason.
- Problems with fork/clone/exit. They fail for a reason.
Increased inode block pointers to 40. The current maximum allowed (and checked).
Updates to file size after every file write ensures subsequent writes can
correctly operate using updated file size information (i.e. not try to add
more pages that are already present). We cannot do this inside write() because
directory writes rely on byte-granularity updates on file buffers, whereas
file updates are by page-granularity (currently).
It turned out we used one version of kmalloc for malloc() and another for kfree()!
Now fixed.
Added parent-child relationship to tasks. Need to polish handling CLONE_PARENT and THREAD.
When sys_munmap() splits a vma, the new vma had no copy of the objects
in the original vma. Now we fixed that using a vma_copy_links() function
which can also be used as part of fork().
- Fixed an important bug with shadow object handling.
When a shadow is dropped, if there are references left
to it, both the object in front and dropped object becomes
a shadow of the original object underneath. We had thought
of this case but had not increase the shadow count.
- Added a test mechanism that tests the number of objects,
vmfiles, shadows etc. by first counting them and trying to
reach the same number by other means, i.e. per-object-shadow counts.
It discovered a plethora of bugs.
- Added new set of functions to register objects, files and tasks
globally with the pager, these functions introduce a refcount as
well as adding structures to linked lists.
- fork/exit now seems to work stably i.e. no negative shadow counts etc.
- Added cleaner allocation of shm addresses by moving the allocation to do_mmap().
- Added deletion routine for all objects: shadow, vm_file of type vfs_file, shm_file, etc.
- Need to make sure objects get deleted properly after exit().
- Currently we allow a single, unique virtual address for each shm segment.
- Updated sleeping paths such that a task is atomically put into
a runqueue and made RUNNABLE, or removed from a runqueue and made SLEEPING.
- Modified vma dropping sources to handle both copy_on_write() and exit() cases
in a common function.
- Added the first infrastructure to have a pager to suspend a task and wait for
suspend completion from the scheduler.
Now all system calls can simply return their final values and they
will be sent to client parties from a single location. Should have had this
simple cleanup a long time ago.
- Fixed do_mmap() so that it returns mapped address, and various bugs.
- A child seems to fork with new setup, but with incorrect return value.
Need to use and test exregs() for fork + clone.
- Shmat searches an unmapped area if input arg is invalid, do_mmap()
should do this.
- Added mutex_trylock()
- Implemented most of exchange_registers()
- thread_control() now needs a lock for operations that can modify thread context.
- thread_start() does not initialise scheduler flags, now done in thread_create.
TODO:
- Fork/clone'ed threads should retain their context in tcb, not syscall stack.
- exchange_registers() calls in userspace need cleaning up.
For clone, file descriptor and vm area structures need to be
separate from the tcb and reached via a pointer so that they
can be shared among multiple tcbs.
- Added automatic utcb map/prefaulting of forked tasks for fs0
so that it does not need to explicitly request those tasks from mm0.
Eliminating fs0 requests to mm0 reduce deadlock possibilities.
- Replaced kmalloc with a public malloc implementation because of a bug in kmalloc.
- Fixed a kfree bug. default_release_pages was trying to free page_array pages.
- Adding prefaulting of fs0 to avoid page fault deadlocks.
- Fixed a bug that a vmo page_cache equivalence would simply drop a link to
an original vmo, even if the vmo could have more pages outside the page cache,
or if the vmo was not a shadow vmo.
- Fixed a bug with page allocator where recursion would corrupt global variables.
- Now going to fix or re-write a simpler page allocator that works.
Removed some commented out code.
Removed excessive printfs.
Fixed spid not initialising for mm0
Fixed some faults with fs0.
TODO:
- Need to store vfs files in a separate list.
- Need to define vnum as a vfs-file-specific data, i.e. in priv_data field of vm_file.
- Need to then fix vfs_receive_sys_open.
- fixed is_err(x), was evaluating x twice, resulting in calling a
function x twice.
- Divided task initialisation into multiple parts.
- MM0 now creates a tcb for itself and maintains memory regions of its own.
- MM0's tcb is used for mmapping other tasks' regions. MM0 mmaps and prefaults
those regions, instead of the typical mmap() and fault approach used by
non-pager tasks.
For example there's an internal shmget_shmat() path to map in other tasks'
shm utcbs. Those mappings are then prefaulted into mm0's address space using
the default fault handling path.
- FS0 now reads task data into its utcb from mm0 via a syscall.
FS0 shmat()s to utcbs of other tasks, e.g. mm0 and test0.
FS0 then crashes, that is to be fixed and where this commit is left last.
Next issues: For every read fault, the fault must traverse the
vma's object stack until the page is found. The problem was that
we were only searching the first object, that object was a writable
shadow, and the shadow didn't have the read-only page, and the 0
return value was interpreted with IS_ERR() and failed, so address
0 was mapped into the location, and QEMU blew off.
The svc images must be pushed to boot file list in correct order
otherwise if test0 starts earlier than fs0, it gets fs0's predefined
thread and space id (since that's the first unallocated one) and
fs0 fails to initalise. In the future we can pre-allocate ids from
the kernel but current temporary fix is simple enough to use.