Changing the SFINAE/enable_if strategy from a type to a constexpr
function means that it can be defined in other scopes than the functions
themselves, because of function overloading. This lets us put everything
into the kutil::bitfields namespace, and make bitfields out of enums in
other namespaces. Also took the chance to clean up the implementation a
bit.
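
A minimal sketch of the opt-in pattern described above, not the actual
kutil code: the fallback `is_bitfield` template returns false, and an
enum opts in with a plain overload declared beside it, which
argument-dependent lookup finds from inside the operator template. The
`bitfields` and `audio` namespaces and the `flags` enum are hypothetical
stand-ins.

```cpp
#include <cassert>
#include <type_traits>

namespace bitfields {

// Fallback overload: a type is not a bitfield unless it opts in.
template <typename E>
constexpr bool is_bitfield(E) { return false; }

// Only participates in overload resolution when is_bitfield(E{}) is true.
// The unqualified call lets argument-dependent lookup find an overload
// declared next to the enum, in whatever namespace that is.
template <typename E>
constexpr typename std::enable_if<is_bitfield(E{}), E>::type
operator|(E a, E b)
{
    return static_cast<E>(
        static_cast<typename std::underlying_type<E>::type>(a) |
        static_cast<typename std::underlying_type<E>::type>(b));
}

} // namespace bitfields

namespace audio { // hypothetical client namespace

enum class flags : unsigned { none = 0, mute = 1, loop = 2 };

// Opting in is one ordinary function overload beside the enum.
constexpr bool is_bitfield(flags) { return true; }

} // namespace audio

using namespace bitfields; // make the operators visible at the call site

static_assert((audio::flags::mute | audio::flags::loop) ==
              static_cast<audio::flags>(3), "flags combine bitwise");
```

With a trait type, the specialization would have to live in the trait's
own namespace; with a function, the opt-in can sit next to the enum.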
Now that the other CPUs have been brought up, add support for scheduling
tasks on them. The scheduler now maintains separate ready/blocked lists
per CPU, and CPUs will attempt to balance load via periodic work
stealing.
Other changes as a result of this:
- The device manager no longer creates a local APIC object, but instead
just gathers relevant info from the ACPI tables. Each CPU creates its
own local APIC object. This also spurred the APIC timer calibration to
become a static value, as all APICs are assumed to be symmetrical.
- Fixed a bug where the scheduler was popping the current task off of
its ready list, even though the current task is never on the ready
list (except that the idle task was initially set up as both current
and ready). This was causing the lists to get into bad states. Now a
task can only ever be current or in a ready or blocked list.
- Got rid of the unused static process::s_processes list of all
processes, instead of trying to synchronize it via locks.
- Added spinlocks for synchronization to the scheduler and logger
objects.
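
A rough hosted-C++ sketch of per-CPU ready lists with periodic work
stealing, under stated assumptions: `scheduler_sketch`, `run_queue`,
`steal_for` and the "take half the surplus from the busiest CPU" policy
are all hypothetical, and the `spinlock` class is a generic
test-and-set lock rather than the kernel's own.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Minimal test-and-set spinlock, usable with std::lock_guard.
class spinlock {
public:
    void lock() { while (m_flag.test_and_set(std::memory_order_acquire)) {} }
    void unlock() { m_flag.clear(std::memory_order_release); }
private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};

struct task { int id; };

struct run_queue {
    spinlock lock;
    std::deque<task *> ready;
};

class scheduler_sketch {
public:
    explicit scheduler_sketch(std::size_t cpus) : m_queues(cpus) {}

    void add(std::size_t cpu, task *t) {
        assert(cpu < m_queues.size());
        std::lock_guard<spinlock> guard(m_queues[cpu].lock);
        m_queues[cpu].ready.push_back(t);
    }

    std::size_t ready_count(std::size_t cpu) {
        std::lock_guard<spinlock> guard(m_queues[cpu].lock);
        return m_queues[cpu].ready.size();
    }

    // Called periodically by `thief`: take half the surplus of the busiest
    // other CPU. Only one lock is held at a time, so lock order never matters.
    std::size_t steal_for(std::size_t thief) {
        const std::size_t mine = ready_count(thief);
        std::size_t victim = thief, most = mine;
        for (std::size_t i = 0; i < m_queues.size(); ++i) {
            const std::size_t n = (i == thief) ? 0 : ready_count(i);
            if (n > most) { most = n; victim = i; }
        }
        if (victim == thief) return 0;

        std::vector<task *> stolen;
        {
            std::lock_guard<spinlock> guard(m_queues[victim].lock);
            std::deque<task *> &q = m_queues[victim].ready;
            // Re-check under the lock; the queue may have shrunk meanwhile.
            const std::size_t want = q.size() > mine ? (q.size() - mine) / 2 : 0;
            for (std::size_t k = 0; k < want; ++k) {
                stolen.push_back(q.back());
                q.pop_back();
            }
        }
        std::lock_guard<spinlock> guard(m_queues[thief].lock);
        for (task *t : stolen) m_queues[thief].ready.push_back(t);
        return stolen.size();
    }

private:
    std::vector<run_queue> m_queues;
};
```

The running task is deliberately absent from every queue here, matching
the rule that a task is either current or on exactly one list.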
KVM didn't like setting all the CR4 bits we wanted at once. I suspect
that means real hardware won't either. Delay the setting of the rest of
CR4 until after the CPU is in long mode - only set PAE and PGE from real
mode.
Because the firmware can set the APIC ids to whatever it wants, add a
sequential index to each cpu_data structure that jsix will use for its
main identifier, or for indexing into arrays, etc.
More and more places in the kernel init code are taking addresses from
the bootloader and translating them to offset-mapped addresses. The
bootloader can do this, so it should.
In order to avoid cyclic dependencies in the case of page faults while
bringing up an AP, pre-allocate the cpu_data structure and related CPU
control structures, and pass them to the AP startup code.
This also changes the following:
- cpu_early_init() was split out of cpu_init() to allow early
usage of current_cpu() on the BSP before we're ready for the rest of
cpu_init(). (These functions were also renamed to follow the preferred
area_action naming style.)
- isr_handler now zeroes out the IST entry for its vector instead of
trying to increment the IST stack pointer
- the IST stacks are allocated outside of cpu_init, to also help reduce
stack pressure and chance of page faults before APs are ready
- share stack areas between AP idle threads so we only waste 1K per
additional AP for the unused idle stack
Since SYSCALL/SYSRET rely on MSRs to control their function, split out
syscall_enable() into syscall_initialize() and syscall_enable(), the
latter being called on all CPUs. This affects not just syscalls but also
the kernel_to_user_trampoline.
Additionally, do away with the max syscalls, and just make a single page
of syscall pointers and name pointers. Max syscalls was fragile and
needed to be kept in sync in multiple places.
This very large commit is mainly focused on getting the APs started and
to a state where they're waiting to have work scheduled. (Actually
scheduling on them is for another commit.)
To do this, a bunch of major changes were needed:
- Moving a lot of the CPU initialization (including for the BSP) to
init_cpu(). This includes setting up IST stacks, writing MSRs, and
creating the cpu_data structure. For the APs, this also creates and
installs the GDT and TSS, and installs the global IDT.
- Creating the AP startup code, which tries to be as position
independent as possible. It's copied from its location to 0x8000 for
AP startup, and some of it is fixed at that address. The AP startup
code jumps from real mode to long mode with paging in one swell foop.
- Adding limited IPI capability to the lapic class. This will need to
improve.
- Renaming cpu/cpu.* to cpu/cpu_id.* because it was just annoying in GDB
and really isn't anything but cpu_id anymore.
- Moved all the GDT, TSS, and IDT code into their own files and made
them classes instead of a mess of free functions.
- Got rid of bsp_cpu_data everywhere. Now always call the new
current_cpu() to get the current CPU's cpu_data.
- Device manager keeps a list of APIC ids now. This should go somewhere
else eventually, device_manager needs to be refactored away.
- Moved some more things (notably the g_kernel_stacks vma) to the
pre-constructor setup in memory_bootstrap. That whole file is in bad
need of a refactor.
Instead of always mapping the framebuffer at an arbitrary location and
reporting that location to userspace, send the physical address so
drivers can call system_map_mmio().
In preparation for moving things to the init process, move process
loading out of the scheduler. memory_bootstrap now has a
load_simple_process function for mapping an args::program into memory,
and the stack setup has been simplified (though all the initv values are
still being added by the kernel - this needs rework) and normalized to
use the thread::add_thunk_user code path.
This makes the job of the kernel easier when marking module pages as
used in the frame allocator. This will also help when sending modules
over to the init process.
The previous frame allocator involved a lot of splitting and merging
linked lists and lost all information about frames while they were
allocated. The new allocator is based on an array of descriptor
structures and a bitmap. Each memory map region of allocatable memory
becomes one or more descriptors, each mapping up to 1GiB of physical
memory. The descriptors implement two levels of a bitmap tree, and have
a pointer into the large contiguous bitmap to track individual pages.
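
A sketch of what one such descriptor could look like, assuming 4 KiB
frames and 64-bit words (64 * 64 * 64 frames is exactly 1 GiB). The
`frame_block` name, its field names, and the "set bit means free"
convention are hypothetical; the real descriptor layout may differ.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

constexpr unsigned fanout = 64; // bits per 64-bit word

// Index of the lowest set bit; callers guarantee v != 0.
inline unsigned lowest_set(u64 v) {
    assert(v != 0);
    unsigned i = 0;
    while (!(v & 1)) { v >>= 1; ++i; }
    return i;
}

// One descriptor: 64 * 64 * 64 frames of 4 KiB each, i.e. up to 1 GiB.
struct frame_block {
    u64 level0 = 0;           // bit i set: level1[i] has a set bit
    u64 level1[fanout] = {};  // bit j of level1[i] set: bitmap word i*64+j has a free frame
    u64 *bitmap = nullptr;    // fanout * fanout words; set bit = free frame

    void mark_free(unsigned frame) {
        assert(bitmap && frame < fanout * fanout * fanout);
        const unsigned word = frame / fanout;
        bitmap[word] |= 1ull << (frame % fanout);
        level1[word / fanout] |= 1ull << (word % fanout);
        level0 |= 1ull << (word / fanout);
    }

    // Finds the lowest free frame in three word lookups, with no list walking.
    bool allocate(unsigned &frame) {
        if (!level0) return false;
        const unsigned i = lowest_set(level0);
        const unsigned j = lowest_set(level1[i]);
        const unsigned word = i * fanout + j;
        const unsigned k = lowest_set(bitmap[word]);
        frame = word * fanout + k;

        // Clear the frame's bit, then propagate "now empty" up the tree.
        bitmap[word] &= ~(1ull << k);
        if (!bitmap[word]) level1[i] &= ~(1ull << j);
        if (!level1[i]) level0 &= ~(1ull << i);
        return true;
    }
};
```

Because the bitmap stays in place while frames are in use, nothing
about a frame is lost when it is allocated, unlike the old list-based
allocator.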
To enable setting sections as NX or read-only, the boot program loader
now loads programs as lists of sections, and the kernel args are updated
accordingly. The kernel's loader now just takes a program pointer to
iterate the sections. Also enable NX in IA32_EFER in the bootloader.
There was previously no good way to block log-display tasks, either the
fb driver or the kernel log task. Now the system object has a signal
(j6_signal_system_has_log) that gets asserted when the log is written
to.
In order to allow the bootloader to do preliminary CPUID validation
while UEFI is still handling displaying information to the user, split
most of the kernel's CPUID handling into a library to be used by both
kernel and boot.
Several changes were needed to make this work:
- Update the page_table::flags to understand memory caching types
- Set up the PAT MSR to add the WC option
- Make page-offset area mapped as WT
- Add all the MTRR and PAT MSRs, and log the MTRRs for verification
- Add a vm_area flag for write_combining
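
For reference, a small sketch of the PAT arithmetic involved. The
memory type encodings, the IA32_PAT MSR number (0x277) and its reset
value are architectural; the helper names are hypothetical, and which
PAT entry the kernel actually repurposes for WC is not stated here, so
the usage below simply picks entry 4 as an example.

```cpp
#include <cassert>
#include <cstdint>

// Memory type encodings used in IA32_PAT entries.
enum class pat_type : std::uint8_t {
    uc = 0x00, wc = 0x01, wt = 0x04, wp = 0x05, wb = 0x06, uc_minus = 0x07
};

constexpr std::uint32_t msr_ia32_pat = 0x277;
constexpr std::uint64_t pat_reset_value = 0x0007040600070406ull;

// Return `pat` with entry `index` (0-7) replaced by `type`.
constexpr std::uint64_t pat_set(std::uint64_t pat, unsigned index, pat_type type) {
    return (pat & ~(0xffull << (index * 8))) |
           (static_cast<std::uint64_t>(type) << (index * 8));
}

// A page selects a PAT entry with three page-table bits: PAT, PCD, PWT.
constexpr unsigned pat_index(bool pat, bool pcd, bool pwt) {
    return (pat ? 4u : 0u) | (pcd ? 2u : 0u) | (pwt ? 1u : 0u);
}

// Example only: put WC in entry 4, selected by the PAT bit alone.
static_assert(pat_set(pat_reset_value, 4, pat_type::wc) == 0x0007040100070406ull,
              "entry 4 becomes WC, the other entries are untouched");
```

Note that the reset value already has WT in entry 1, so a WT mapping
only needs the PWT bit set.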
If there's no video, do as we did before; otherwise route logs to the fb
driver instead. (Need to clean this up to just have a log consumer
general interface?) Also added a "scrollback" class to fb driver and
updated the system_get_log syscall.
Moved the old PSF parsing code out of the kernel, and switched to
embedding the whole PSF instead of just the glyph data so that the font
class uses the same code paths for both cases.
Create a new framebuffer driver. Also hackily passing frame buffer size
in the list of init handles to all processes and mapping the framebuffer
into all processes. Changed the bootloader to pass the frame buffer in
its own struct instead of as a module.
- Add a tag field to all endpoint messages, which doubles as a
notification field
- Add an endpoint_bind_irq syscall to enable an endpoint to listen for
interrupt notifications. This mechanism needs to change.
- Add a temporary copy of the serial port code to nulldrv, and let it
take responsibility for COM2
Remove ELF and initrd loading from the kernel. The bootloader now loads
the initial programs, as it does with the kernel. Other files that were
in the initrd are now on the ESP, and non-program files are just passed
as modules.
The "fake" stdout channel is now being passed in the new j6_process_init
structure to processes, and nulldrv now uses it to print a message to
the console.
Multiple changes regarding channels. Mainly channels are now stream
based and can handle partial reads or writes. Channels now use the
kernel buffers area with the related buffer_cache. Added a fake stdout
stream channel and kernel task to read its contents to the screen in
preparation for handing channels as stdin/stdout to processes.
The scheduler singleton was getting constructed twice, once at static
time and then again in main(). Make the singleton a pointer so we only
construct it once.
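
A minimal illustration of the pointer-based singleton, assuming a
much-simplified `scheduler` class (the `exists()` helper is
hypothetical). The static member is only a pointer, so nothing is
constructed at static-initialization time.

```cpp
#include <cassert>

class scheduler {
public:
    scheduler() {
        assert(!s_instance);   // a second construction is a bug
        s_instance = this;
    }

    static scheduler & get() { assert(s_instance); return *s_instance; }
    static bool exists() { return s_instance != nullptr; }

private:
    static scheduler *s_instance; // a pointer: no constructor runs for it
};

scheduler *scheduler::s_instance = nullptr;
```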
Create a clock class which can be queried for current timestamp in
nanoseconds. Also implements a simple HPET class as one possible clock
source.
Tags: time
Instead of many timer interrupts and decrementing a process' remaining
quanta, change to setting a single timer for when a process should be
preempted. If it uses its whole timeslice, demote it. If it uses less
than half before blocking, promote it. Determine timeslice based on
priority as well.
This change also required changing the apic timer interface to be purely
interval (in microseconds) based instead of its previous interval/tick
hybrid.
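
The promote/demote rule can be sketched as below. This is an
illustration only: the priority range, the base quantum, and the
"lower priority gets a longer slice" formula are hypothetical, since
the message does not say how the timeslice is derived from priority.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned lowest_priority = 7;          // hypothetical; 0 is highest
constexpr std::uint64_t base_quantum_us = 1000;  // hypothetical

struct sched_task { unsigned priority; };

// Hypothetical policy: lower-priority tasks run less often but longer.
constexpr std::uint64_t timeslice_us(unsigned priority) {
    return base_quantum_us * (priority + 1);
}

enum class stop_reason { preempted, blocked };

// Called when a task stops running; `used_us` is how long it ran.
inline void account(sched_task &t, std::uint64_t used_us, stop_reason why) {
    assert(t.priority <= lowest_priority);
    const std::uint64_t slice = timeslice_us(t.priority);
    if (why == stop_reason::preempted && used_us >= slice) {
        if (t.priority < lowest_priority) ++t.priority;   // demote
    } else if (why == stop_reason::blocked && used_us < slice / 2) {
        if (t.priority > 0) --t.priority;                 // promote
    }
}
```

The single preemption timer would then be armed with
`timeslice_us(t.priority)` each time a task is switched in.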
Look up the global constructor list that the linker outputs, and run
them all. Required creation of the `kutil::no_construct` template for
objects that are constructed before the global constructors are run.
Also split the `memory_initialize` function into two - one for just
those objects that need to happen before the global ctors, and one
after.
Tags: memory c++
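
A sketch of both pieces, under assumptions: a union wrapper is one
common way to get storage for a `T` without its constructor running
automatically, and may or may not match `kutil::no_construct` exactly.
`run_ctors`, `early_heap`, `g_heap_storage` and `init_early_heap` are
hypothetical names, and the constructor list is passed in explicitly
rather than taken from linker symbols.

```cpp
#include <cassert>
#include <new>

// Wrapping T in a union suppresses its automatic construction and destruction.
template <typename T>
union no_construct {
    T value;
    no_construct() {}
    ~no_construct() {}
};

// Run a list of constructor functions, like the array the linker emits.
using global_ctor = void (*)();

inline void run_ctors(const global_ctor *begin, const global_ctor *end) {
    for (const global_ctor *c = begin; c != end; ++c) (*c)();
}

// Hypothetical early-boot object.
struct early_heap {
    static int constructed;
    int blocks;
    early_heap() : blocks(16) { ++constructed; }
};
int early_heap::constructed = 0;

// early_heap's constructor does not run for this global.
no_construct<early_heap> g_heap_storage;

// Hypothetical pre-constructor setup: build the object in place, by hand.
inline early_heap & init_early_heap() {
    return *new (&g_heap_storage.value) early_heap;
}
```

Since the wrapper's own constructor is empty, running the global
constructor list later does not disturb an object that was already
built by hand.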
Many kernel objects had to keep a hold of references to allocators in
order to pass them on down the call chain. Remove those explicit
references and use `operator new`, `operator delete`, and define new
`kalloc` and `kfree`.
Also remove `slab_allocator` and replace it with a new mixin for slab
allocation, `slab_allocated`, that overrides `operator new` and
`operator delete` for its subclass.
Remove some no longer used related headers, `buddy_allocator.h` and
`address_manager.h`
Tags: memory
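
A hosted-C++ sketch of the mixin idea using CRTP. It is not the
kernel's `slab_allocated`: the fixed pool of `N` slots, the
`free_slots()` helper, throwing `std::bad_alloc` on exhaustion, and the
`message` type are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <typename T, std::size_t N = 16>
class slab_allocated {
public:
    static void * operator new(std::size_t size) {
        assert(size == sizeof(T)); // only serves objects of exactly type T
        (void)size;
        node *&head = free_list();
        if (!head) throw std::bad_alloc();
        node *n = head;
        head = n->next;
        return n;
    }

    static void operator delete(void *p) noexcept {
        if (!p) return;
        node *&head = free_list();
        head = ::new (p) node{head}; // push the slot back on the free list
    }

    static std::size_t free_slots() {
        std::size_t count = 0;
        for (node *n = free_list(); n; n = n->next) ++count;
        return count;
    }

private:
    struct node { node *next; };

    // The body is only instantiated on first use, when T is complete,
    // so sizeof(T) and alignas(T) are safe here.
    static node *& free_list() {
        union slot { node link; alignas(T) unsigned char bytes[sizeof(T)]; };
        static slot slots[N];
        static node *head = [] {
            node *h = nullptr;
            for (slot &s : slots) { s.link.next = h; h = &s.link; }
            return h;
        }();
        return head;
    }
};

// Hypothetical user: plain `new` and `delete` now go through the slab.
struct message : slab_allocated<message, 4> {
    int payload[8];
};
```

Callers no longer need an allocator reference; `new message` and
`delete msg` pick up the class-level operators automatically.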
GDB works far better now with QEMU's `-S` flag. No longer does it
complain about changing the target from 32 to 64 bits. Get rid of the
old `waiting` loop and `sleep` call in the GDB config for the kernel.
Tags: debugging
The `kernel_main()` had a lot change out from under it with the
bootloader changes. This change brings most of it back in line with the
new kernel arguments.
Tags: pml4 paging boot
Created a new `memory_initialize()` function that uses the new-style
kernel args structure from the new bootloader.
Additionally:
* Fixed a hard-coded interrupt EOI address that didn't work with new
memory locations
* Make the `page_manager::fault_handler()` automatically grant pages
in the kernel heap
Tags: boot page fault
At some point, `init_console()` ended up not being before the first
usage of some `log::` functions, which were jumping off into garbage.
Tags: initialization boot
- The old kernel_args structure is now mostly represented as a series of
'modules' or memory ranges, tagged with a type. An arbitrary number
can be passed to the kernel
- Update bootloader to allocate space for the args header and 10 module
descriptors
Introduces the cpu_features.inc table to enumerate the CPU features that
j6 cares about. Features in this table marked CPU_FEATURE_REQ are
considered required, and the boot process will log an error and halt
when any of these features are not supported. This should save me from
banging my head against the wall like I did last night with the missing
pdpe1gb feature.
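
A sketch of the table-driven check using an X-macro. The macro shape,
the `cpu_feature` struct and `first_missing_required` are hypothetical
and simpler than a real cpu_features.inc would need to be: every entry
here is a bit in EDX of CPUID leaf 0x80000001, which is where syscall
(11), nx (20), pdpe1gb (26) and rdtscp (27) are reported.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the contents of an .inc table: name, EDX bit.
#define CPU_FEATURE_TABLE(REQ, OPT) \
    REQ(syscall, 11) \
    REQ(nx,      20) \
    REQ(pdpe1gb, 26) \
    OPT(rdtscp,  27)

struct cpu_feature { const char *name; unsigned bit; bool required; };

// Expand the same table into an array of descriptors.
#define AS_REQUIRED(name, bit) { #name, bit, true },
#define AS_OPTIONAL(name, bit) { #name, bit, false },
constexpr cpu_feature cpu_features[] = {
    CPU_FEATURE_TABLE(AS_REQUIRED, AS_OPTIONAL)
};
#undef AS_REQUIRED
#undef AS_OPTIONAL

// Returns the first required feature missing from `edx`, or nullptr.
inline const cpu_feature * first_missing_required(std::uint32_t edx) {
    for (const cpu_feature &f : cpu_features)
        if (f.required && !(edx & (1u << f.bit))) return &f;
    return nullptr;
}
```

The boot code would log the returned feature's name and halt when the
result is not null.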
This commit makes several fundamental changes to memory handling:
- the frame allocator is now only an allocator for free frames, and does
not track used frames.
- the frame allocator now stores its free list inside the free frames
themselves, as a hybrid stack/span model.
- This has the implication that all frames must currently fit within
the offset area.
- kutil has a new allocator interface, which is the only allowed way for
any code outside of src/kernel to allocate. Code under src/kernel
_may_ use new/delete, but should prefer the allocator interface.
- the heap manager has become heap_allocator, which is merely an
implementation of kutil::allocator which doles out sections of a given
address range.
- the heap manager now only writes block headers when necessary,
avoiding page faults until they're actually needed
- page_manager now has a page fault handler, which checks with the
address_manager to see if the address is known, and provides a frame
mapping if it is, allowing heap manager to work with its entire
address size from the start. (Currently 32GiB.)
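
The hybrid stack/span free list can be sketched as follows, assuming
4 KiB frames. `frame_stack` and `free_span` are hypothetical names, and
ordinary pointers stand in for offset-mapped addresses. The header
lives in the first frame of each free span, which is why every free
frame must be reachable through the offset area.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

constexpr std::size_t frame_size = 4096;

// Header written into the first frame of every free span.
struct free_span {
    free_span *next;
    std::size_t count;
};

class frame_stack {
public:
    // `first` must be a writable address of the span's first frame.
    void free(void *first, std::size_t count) {
        assert(first && count > 0);
        m_top = new (first) free_span{m_top, count};
    }

    // Take up to `want` contiguous frames from the top span.
    // Returns the number of frames taken and sets `out` to the first one.
    std::size_t allocate(std::size_t want, void *&out) {
        if (!m_top || want == 0) return 0;
        free_span *span = m_top;
        if (want >= span->count) {
            // Consume the whole span, header frame included.
            const std::size_t n = span->count;
            m_top = span->next;
            out = span;
            return n;
        }
        // Carve frames off the end so the header stays where it is.
        span->count -= want;
        out = reinterpret_cast<unsigned char *>(span) + span->count * frame_size;
        return want;
    }

private:
    free_span *m_top = nullptr;
};
```

No memory outside the free frames themselves is needed for
bookkeeping, at the cost of knowing nothing about a frame once it has
been handed out.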