Create a syscall for drivers to be able to ask the kernel for a VMA that
maps a MMIO area. Also expose vm_flags via j6 table style include file
and new flags.h header.
We started actually running up against the page boundary for kernel
stacks and thus double-faulting on page faults from kernel space. So I
finally added IST stacks. Note that we currently just
increment/decrement the IST entry by a page when we enter the handler to
avoid clobbering on re-entry, but this means:
* these handlers need to be able to operate with only a page of stack
* kernel stacks always have to be >1 pages
* the amount of nesting possible is tied to the kernel stack size.
These seem fine for now, but we should maybe find a way to use something
besides g_kernel_stacks to set up the IST stacks if/when this becomes an
issue.
This is a minor refactor including:
- Removing old commented-out syscall_dispatch function
- Removing IA32_EFER syscall-enable flag setting (this is done by the
bootloader now)
- Moving much logging from inside process/thread syscalls to the 'task'
log area, allowing for turning the 'syscall' area down to info by
default.
This also prompted a change of the process initialization protocol to
allow handles to get typed, and changing to marking them as just
self/other handls. This also means exposing the object type enum to
userspace.
The previous frame allocator involved a lot of splitting and merging
linked lists and lost all information about frames while they were
allocated. The new allocator is based on an array of descriptor
structures and a bitmap. Each memory map region of allocatable memory
becomes one or more descriptors, each mapping up to 1GiB of physical
memory. The descriptors implement two levels of a bitmap tree, and have
a pointer into the large contiguous bitmap to track individual pages.
To enable setting sections as NX or read-only, the boot program loader
now loads programs as lists of sections, and the kernel args are updated
accordingly. The kernel's loader now just takes a program pointer to
iterate the sections. Also enable NX in IA32_EFER in the bootloader.
There was previously no good way to block log-display tasks, either the
fb driver or the kernel log task. Now the system object has a signal
(j6_signal_system_has_log) that gets asserted when the log is written
to.
The UEFI spec specifically calls out memory types with the high bit set
as being available for OS loaders' custom use. However, it seems many
UEFI firmware implementations don't handle this well. (Virtualbox, and
the firmware on my Intel NUC and Dell XPS laptop to name a few.)
So sadly since we can't rely on this feature of UEFI in all cases, we
can't use it at all. Instead, treat _all_ memory tagged as EfiLoaderData
as possibly containing data that's been passed to the OS by the
bootloader and don't free it yet.
This will need to be followed up with a change that copies anything we
need to save and frees this memory.
See: https://github.com/kiznit/rainbow-os/blob/master/boot/machine/efi/README.md
The endpoint receive syscalls can block and then write to userspace
memory. Since the current address space may be different after blocking,
make sure to only actually write to the user memory after returning to
the syscall handler - pass values that are on the syscall handler stack
deeper into the kernel.
If there's no video, do as we did before, otherwise route logs to the fb
driver instead. (Need to clean this up to just have a log consumer
general interface?) Also added a "scrollback" class to fb driver and
updated the system_get_log syscall.
Moved old PSF parsing code from kernel, and switched to embedding whole
PSF instead of just glyph data to make font class the same code paths
for both cases.
Move process init from each process needing a main.s with _start to
crt0.s in libc. Also change to a sysv-like initial stack with a
j6-specific array of initialization values after the program arguments.
Create a new framebuffer driver. Also hackily passing frame buffer size
in the list of init handles to all processes and mapping the framebuffer
into all processes. Changed bootloader passing frame buffer as a module
to its own struct.
In order to implement capabilities on system resources like IRQs so that
they may be restricted to drivers only, add a new 'system' kobject type,
and move the bind_irq functionality from endpoint to system.
Also fix some stack bugs passing the initial handles to a program.
- Add a tag field to all endpoint messages, which doubles as a
notification field
- Add a endpoint_bind_irq syscall to enable an endpoint to listen for
interrupt notifications. This mechanism needs to change.
- Add a temporary copy of the serial port code to nulldrv, and let it
take responsibility for COM2
Remove ELF and initrd loading from the kernel. The bootloader now loads
the initial programs, as it does with the kernel. Other files that were
in the initrd are now on the ESP, and non-program files are just passed
as modules.
The vm_space allow() functionality was a bit janky; using VMAs for all
regions would be a lot cleaner. To that end, this change:
- Adds a "static array" ctor to kutil::vector for setting the kernel
address space's VMA list. This way a kernel heap VMA can be created
without the heap already existing.
- Splits vm_area into different subclasses depending on desired behavior
- Splits out the concept of vm_mapper which maps vm_areas to vm_spaces,
so that some kinds of VMA can be inherently single-space
- Implements VMA resizing so that userspace can grow allocations.
- Obsolete page_table_indices is removed
Also, the following bugs were fixed:
- kutil::map iterators on empty maps no longer break
- memory::page_count was doing page-align, not page-count
See: Github bug #242
See: [frobozz blog post](https://jsix.dev/posts/frobozz/)
Tags:
Finished the VMA kobject and added the related syscalls. Processes can
now allocate memory! Other changes in this commit:
- stop using g_frame_allocator and add frame_allocator::get()
- make sure to release all handles in the process dtor
- fix kutil::map::iterator never comparing to end()
vm_space and page_table continue to take over duties from
page_manager:
- creation and deletion of address spaces / pml4s
- cross-address-space copies for endpoints
- taking over pml4 ownership from process
Also fixed the bug where the wrong process was being set in the cpu
data.
To solve: now the kernel process has its own vm_space which is not
g_kernel_space.
The "fake" stdout channel is now being passed in the new j6_process_init
structure to processes, and nulldrv now uses it to print a message to
the console.
Multiple changes regarding channels. Mainly channels are now stream
based and can handle partial reads or writes. Channels now use the
kernel buffers area with the related buffer_cache. Added a fake stdout
stream channel and kernel task to read its contents to the screen in
preparation for handing channels as stdin/stdout to processes.
We were previously allocating kernel stacks as large objects on the
heap. Now keep track of areas of the kernel stack area that are in use,
and allocate them from there. Also required actually implementing
vm_space::commit(). This still needs more work.
Add a system call to assert signals on a given object, only within the
range of user-settable signals. Also made object_wait return
immediately if any of the given signals are already set.
Add the channel object for sending messages between threads. Currently
no good of passing channels to other threads, but global variables in a
single process work. Currently channels are slow and do double copies,
need to refine more.
Tags: ipc
Implement the syscalls necessary for threads to create other threads in
their same process. This involved rearranging a number of syscalls, as
well as implementing object_wait and a basic implementation of a
process' list of handles.
Re-implent the concept of processes as separate from threads, and as a
kobject API object. Also improve scheduler::prune which was doing some
unnecessary iterations.
Create a clock class which can be queried for current timestamp in
nanoseconds. Also implements a simple HPET class as one possible clock
source.
Tags: time
The scheduler again tracks remaining timeslice. Timeslices are bigger,
but once a process uses all of its timeslice, it's demoted and
replenished at the next priority. The scheduler also tracks the last
time a process ran, and promotes it if it's been starved for twice its
full timeslice.
TODO: replenish a small amount of timeslice each time a process is run,
so that more interactive processes keep their priorities.
There were a few lingering bugs due to places where 510/511 were
hard-coded as the kernel-space PML4 entries. These are now constants
defined in kernel_memory.h instead.
Tags: boot memory paging
The bootloader was previously just passing the memory map as a module,
but the memory map is important enough to want a direct pointer, instead
of having to search the modules.
The `kernel_offset` and `page_offset` had already been updated with
previous bootloader changes, but `kernel_max_heap` had not. Also, make
all the constants `constexpr` instead of `static const` that would live
in multiple TUs.
Set up initial page tables for both the offset-mapped area and the
loaded kernel code and data.
* Got rid of the `loaded_elf` struct - the loader now runs after the
initial PML4 is created and maps the ELF sections itself.
* Copied in the `page_table` and `page_table_indices` from the kernel,
still need to clean this up and extract it into shared code.
* Added `page_table_cache` to the kernel args to pass along free pages
that can be used for initial page tables.
Tags: paging
Adding this now because I'm sure I'll forget later. Make sure to
annotate the entrypoint function pointer as `__attribute__((sysv_abi))`
so that it's not called via ms abi like the rest of the loader.
- The old kernel_args structure is now mostly represented as a series of
'modules' or memory ranges, tagged with a type. An arbitrary number
can be passed to the kernel
- Update bootloader to allocate space for the args header and 10 module
descriptors