The kernel log levels are now numerically reversed so that more-verbose
levels can be added to the end. Replaced 'debug' with 'verbose', and
added new 'spam' level.
Rearranging of the ISR vectors for eventual TPR priority. Also removed
excess IRQs - if we need to support more than 64 IRQ vectors, we can add
some more back in.
Also clean up the legacy PIC init/masking code.
Previously, when adding a new thread, we only ever added it to the
current CPU and relied on work stealing to balance the CPUs. This commit
has the scheduler schedule new tasks round-robin across CPUs in hopes of
having to steal fewer tasks.
Also adds the run_queue.prev pointer for debugging what task was just
running on the given CPU.
The printf library I have been using, while useful, has way more than I
need in it, and had comparably huge stack space requirements. This
change adds a new util::format() which is a replacement for snprintf,
but with only the features used by kernel logging.
The logger has been changed to use it, as well as the few instances of
snprintf in the interrupt handling code before calling kassert.
Also part of this change: the logger's (now vestigial) immediate output
handling code is removed, as well as the "sequence" field on log
message headers.
The isr_prelude (and its IRQ equivalent) had been pushing RIP and RBP in
order to create a fake stack frame. This was in an effor to make GDB
display tracebacks more reliably, but GDB has other reasons for being
finnicky about stack frames, so this was just wasted. This commit gets
rid of it to make looking at the stack clearer.
In the beginning of the interrupt handler, we had previously checked if
the current handler had grabbed an IST stack from the IDT/TSS. If it
was, it saved this value and set it to 0 in the IDT, then restored it at
the end.
Now this is an atomic action. This is unlikely to make a difference
unless the interrupt handler is itself interrupted by an exception
before being able to swap the IDT value, but such a situation is now
impossible.
Previously, the CPU control registers were being set in a number of
different ways. Now, since the APs' need this to be set in the CPU
initialization code, always do it there. This removes some of the
settings from the bootloader, and some unused ones from smp.s.
Additionally, the control registers' flags are now enums in cpu.h and
manipulated via util::bitset.
In bsp_early_init(), the BSP cpu_data's rsp0 was getting initialized to
the _value_ at the idle_stack_end symbol, instead of its address. I
don't believe this was causing any actual harm, but it was a red herring
when debugging.
The cpu::cpu_id class no longer looks up all known features in the
constructor, but instead provides access to the map of supported
features as a bitset from the verify() method. It also exposes the
brand_name() method instead of loading the brand name string in the
constructor and storing it as part of the object.
This bug has been making me tear my hair out for weeks. When creating
the idle thread for each CPU, we were previously sharing stack areas
with other CPUs' idle threads in an effort to save memory. However, this
caused stack corruption that was very hard to track down. The kernel
stacks are in a vm_area_guarded to better detect this exact kind of
issue, but splitting stacks like this skirts that protection. It's not
worth saving a few KiB per CPU.
This commit contains a number of related mailbox issues:
- Add extra parameters to mailbox_respond_receive to allow both the
number of bytes/handles passed in, and the size of the byte/handle
buffers to be passed in.
- Don't delete mailbox messages on receipt if the caller is waiting on
reply
- Correctly pass status messages along with a mailbox::replyer object
- Actually wake the calling thread in the mailbox::replyer dtor
- Make sure to release locks _before_ calling thread::wake() on blocked
threads, as that may cause them to be scheduled ahead of the current
thread.
This commit adds a new flag, j6_channel_block, and a new flags param to
the channel_receive syscall. When the block flag is specified, the
caller will block waiting for data on the channel if the channel is
empty.
It seems more common to want to sleep for a duration than to sleep to a
specific time. Change the implementation to not make the process look up
the current time first. (Plus, there's no current syscall to do so)
When waking another thread, if that thread has a more urgent priority
than the current thread on the same CPU, send that CPU an IPI to tell it
to run its scheduler.
Related changes in this commit:
- Addition of the ipiSchedule isr (vector 0xe4) and its handler in
isr_handler().
- Change the APIC's send_ipi* functions to take an isr enum and not an
int for their vector parameter
- Thread TCBs now contain a pointer to their current CPU's cpu_data
structure
- Add the maybe_schedule() call to the scheduler, which sends the
schedule IPI to the given thread's CPU only when that CPU is running a
less-urgent thread.
- Move the locking of a run queue lock earlier in schedule() instead of
taking the lock in steal_work() and again in schedule().
The new logger event object for making get_entry() block when no logs
are available was consuming the event's notification even if the thread
did not need to block. This was causing excessive blocking - if multiple
logs had been added since the last call to get_entry(), only one would
be returned, and the next call would block until yet another log was
added.
Now only call event::wait() to block the calling thread if there are no
logs available.
Added profiler.h which defines classes and macros for defining profiler
objects. Also added gdb command j6prof for printing profile data. Added
the syscall_profiles profiler class and auto wrapping of syscalls with
profile objects.
Other changes in this commit:
- Made the gdb command `j6threads` argument for specifying a CPU
optional. Without an argument, it loops through all CPUs.
- Switched to -mcmodel=kernel for kernel code, which makes `call`
instructions easier to follow when debugging / looking at disassembly.
A few changes to the panic handler's display:
- Change rdi and rsi to match other general-purpose registers. (They
were previously blue, matching the stack/base pointer registers.)
- Change the ordering of r8-r15 to be column-major instead of row-major.
I find myself wanting to read down the columns to find the register
I'm looking for, and rax-rdx are already this way.
- Make the flags register yellow, matching the ss and cs registers
- Comment out the call to print_rip() call, as it's only occasionally
helpful and can cause the panic handler to page fault.
When debugging, or in panic callstacks, the BSP idle thread used to be
reported as `_kernel_start`, because it was just the loop at the end of
that assembly function. Now, wrap that loop in a separate symbol called
`bsp_idle` to make it clearer that the cpu is in the idle thread.
Three issues that caused build breaks when regenerating the build
directory after the previous commits:
- system.def was including endpoint.def
- syscalls/vm_area.cpp was including j6/signals.h
- util/util.h was missing an include of stddef.h
The new mailbox kernel object API offers asynchronous message-based IPC
for sending data and handles between threads, as opposed to endpoint's
synchronous model.
In preparation for the new mailbox IPC model, blocking threads needed an
overhaul. The `wait_on_*` and `wake_on_*` methods are gone, and the
`block()` and `wake()` calls on threads now pass a value between the
waker and the blocked thread.
As part of this change, the concept of signals on the base kobject class
was removed, along with the queue of blocked threads waiting on any
given object. Signals are now exclusively the domain of the event object
type, and the new wait_queue utility class helps manage waiting threads
when an object does actually need this functionality. In some cases (eg,
logger) an event object is used instead of the lower-level wait_queue.
Since this change has a lot of ramifications, this large commit includes
the following additional changes:
- The j6_object_wait, j6_object_wait_many, and j6_thread_pause syscalls
have been removed.
- The j6_event_clear syscall has been removed - events are "cleared" by
reading them now. A new j6_event_wait syscall has been added to read
events.
- The generic close() method on kobject has been removed.
- The on_no_handles() method on kobject now deletes the object by
default, and needs to be overridden by classes that should not be.
- The j6_system_bind_irq syscall now takes an event handle, as well as a
signal that the IRQ should set on the event. IRQs will cause a waiting
thread to be woken with the appropriate bit set.
- Threads waking due to timeout is simplified to just having a
wake_timeout() accessor that returns a timestamp.
- The new wait_queue uses util::deque, which caused the disovery of two
bugs in the deque implementation: empty deques could still have a
single array allocated and thus return true for empty(), and new
arrays getting allocated were not being zeroed first.
- Exposed a new erase() method on util::map that takes a node pointer
instead of a key, skipping lookup.
When displaying a set of user regs, also display memory around the
current rip from those user regs. This helps find or rule out memory
corruption errors causing invalid code to run.
This change introduces test_runner, which runs unit or integration tests
and then tells the kernel to exit QEMU with a status code indicating the
number of failed tests.
The test_runner program is not loaded by default. Use the test manifest
to enable it:
./configure --manifest=assets/manifests/test.yml
A number of tests from the old src/tests have moved over. More to come,
as well as moving code from testapp before getting rid of it.
The test.sh script has been repurposed to be a "headless" version of
qemu.sh for running tests, and it exits with the appropriate exit code.
(Though ./qemu.sh gained the ability to exit with the correct exit code
as well.) Exit codes from kernel panics have been updated so that the
bash scripts should exit with code 127.
This has always been on the todo list, but it finally bit me. srv.init
re-uses load addresses when loading multiple programs, and collision
between reused addresses was causing corruption without the TLB flush.
Now srv.init also doesn't increment its load address for sections when
loading a single program either, since unmapping pages actually works.
The new "noreturn" option tag on syscall methods causes those methods to
be generated with [[noreturn]] / _Noreturn to avoid clang complaining
that other functions marked noreturn, like exit(), because it can't tell
that the syscall never returns.
This new libc is mostly from scratch, with *printf() functions provided
by Marco Paland and Eyal Rozenberg's tiny printf library, and malloc and
friends provided by dlmalloc.
The great header shift: It didn't make sense to regenerate headers for
the same module for every target (boot/kernel/user) it appeared in. And
now that core headers are out of src/include, this was going to cause
problems for the new libc changes I've been working on. So I went back
to re-design how module headers work.
Pre-requisites:
- A module's public headers should all be available in one location, not
tied to target.
- No accidental includes. Another module should not be able to include
anything (creating an implicit dependency) from a module without
declaring an explicit dependency.
- Exception to the previous: libc's headers should be available to all,
at least for the freestanding headers.
New system:
- A new "public_headers" property of module declares all public headers
that should be available to dependant modules
- All public headers (after possible processing) are installed relative
to build/include/<module> with the same path as their source
- This also means no "include" dir in modules is necessary. If a header
should be included as <j6/types.h> then its source should be
src/libraries/j6/j6/types.h - this caused the most churn as all public
header sources moved one directory up.
- The "includes" property of a module is local only to that module now,
it does not create any implicit public interface
Other changes:
- The bonnibel concept of sources changed: instead of sources having
actions, they themselves are an instance of a (sub)class of Source,
which provides all the necessary information itself.
- Along with the above, rule names were standardized into <type>.<ext>,
eg "compile.cpp" or "parse.cog"
- cog and cogflags variables moved from per-target scope to global scope
in the build files.
- libc gained a more dynamic .module file
The manifest can now supply a list of boot flags, including "test".
Those get turned into the bootproto::args::flags field by the
bootloader. The kernel takes those and uses the test flag to control
enabling syscalls with the new "test" attribute, like the new
test_finish syscall, which lets automated tests call back to the kernel
to shut down the system.
If the thread waiting is the current thread, it should have the result
when it wakes. Might as well return it, so that syscalls that know
they're putting the current thread to sleep can get the result easily.
The "handle" tag on syscall parameters causes syscall_verify.cpp to pass
the resulting object as a obj::handle* instead of directly as an object
pointer. Now the handle tag is supported directly on the syscall itself
as well, causing the "self" object to be passed as a handle pointer.
The event object was missing any syscalls. Furthermore, kobject had an
old object_signal implementation (the syscall itself no longer exists),
which was removed. User code should only be able to set signals on
events.
Two minor issues: scheduler::prune wasn't formatted correctly, and
j6/caps.h was not using the ull prefix when shifting 64 bit numbers.
(It's doubtful an object would get more than 32 caps any time soon, but
better to be correct.)
The cpu.cpp/smp.cpp cleanup out of kernel_main missed an important call:
kernel_main never called smp::ready() to unblock the APs waiting for the
scheduler to be ready.
The return of slab_allocated! Now after the kutil/util/kernel giant
cleanup, this belongs squarely in the kernel, and works much better
there. Slabs are allocated via a bump pointer into a new kernel VMA,
instead of using kalloc() or allocating pages directly.
Adding the util::deque container, implemented with the util::linked_list
of arrays of items.
Also, use the deque for a kobject's blocked thread list to maintain
order instead of a vector using remove_swap().
The new zero_ok flag is similar to optional for reference parameters,
but only in cases where there is a length parameter. If that parameter
is a reference parameter itself and is null, or it is non-null and
contains a non-zero length, or there is no length parameter, then the
main parameter may not be null.