Commit Graph

526 Commits

Author SHA1 Message Date
Justin C. Miller
9fbbd8b954 [kernel] Update kernel binary's header structure
The kernel's file header has not been verified for a long time. This
change returns file verification to the bootloader to make sure the ELF
loaded in position 0 is actually the kernel.
2021-05-28 14:44:13 -07:00
Justin C. Miller
910fde3b2c [all] Rename kernel::args to kernel::init
The kernel::args namespace is really the protocol for initializing the
kernel from the bootloader. Also, the header struct in that namespace
isn't actually a header, but a collection of parameters. This change
renames the namespace to kernel::init and the struct to args.
2021-05-28 12:34:46 -07:00
Justin C. Miller
0ae489f49d [build] Update to using pb 3
Updating the build to the new version of bonnibel. This also includes
some updates to make sure things keep working with LLVM 11.
2021-04-07 23:05:58 -07:00
Justin C. Miller
e05e05b13a [kernel] Set process in cpu_early_init
Update the cpu data to point to the fake kernel process in
cpu_early_init so there can never be a race condition where the current
process may not be set.
2021-04-07 23:04:37 -07:00
Justin C. Miller
19a799656a [kernel] Protect against null ctor
This should never happen, but if there's a null in the ctors list, don't
just blindly call it.
2021-04-07 23:01:26 -07:00
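
The null check described in 19a799656a can be sketched roughly as follows. This is a minimal illustration, not the kernel's code: the `ctor_fn` type and the `run_ctors` helper are hypothetical names, and the real kernel presumably walks a linker-provided array rather than an explicit pointer/count pair.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical signature for a global constructor entry.
using ctor_fn = void (*)();

// Calls every non-null entry in the list and returns how many were called.
inline std::size_t run_ctors(ctor_fn const *list, std::size_t count) {
    assert(list != nullptr || count == 0);
    std::size_t called = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (!list[i])
            continue; // should never happen, but don't blindly call a null
        list[i]();
        ++called;
    }
    return called;
}
```

Returning the number of constructors actually called is only a convenience for testing the sketch.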
Justin C. Miller
6a41446185 [kernel] Make IDT per-cpu, not global
Since we modify IST entries while handling interrupts, the IDT cannot be
a global data structure. Allocate new ones for each CPU.
2021-02-19 21:51:25 -08:00
Justin C. Miller
2d6987341c [kernel] Make sure not to log from AP idle threads
The idle threads for the APs have intentionally tiny stacks. Logging is
currently an absolute hog of stack space, so avoid logging on the idle
stacks as much as possible.

Eventually we should instead just reclaim the physical pages used by
most of the stack instead of making them tiny.
2021-02-19 21:47:46 -08:00
Justin C. Miller
f9a967caf7 [kutil] Make enum bitfields usable in other scopes
Changing the SFINAE/enable_if strategy from a type to a constexpr
function means that it can be defined in other scopes than the functions
themselves, because of function overloading. This lets us put everything
into the kutil::bitfields namespace, and make bitfields out of enums in
other namespaces. Also took the chance to clean up the implementation a
bit.
2021-02-19 20:42:49 -08:00
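
One way to read the strategy in f9a967caf7 is sketched below: a constexpr function acts as the opt-in switch, so an enum in any namespace can enable the operators by providing an overload that is found through argument-dependent lookup. The names `bitfields`, `is_bitfield`, `has`, and `example::perms` are assumptions made for this illustration and may not match kutil.

```cpp
#include <cassert>
#include <type_traits>

namespace bitfields {

// Fallback: an enum is not a bitfield unless it opts in with an overload.
template <typename E>
constexpr bool is_bitfield(E) { return false; }

// Only participates in overload resolution when is_bitfield(E) is true.
template <typename E>
constexpr typename std::enable_if<is_bitfield(E{}), E>::type
operator|(E a, E b) {
    using U = typename std::underlying_type<E>::type;
    return static_cast<E>(static_cast<U>(a) | static_cast<U>(b));
}

// True when every bit of `flag` is set in `set`.
template <typename E>
constexpr typename std::enable_if<is_bitfield(E{}), bool>::type
has(E set, E flag) {
    using U = typename std::underlying_type<E>::type;
    return (static_cast<U>(set) & static_cast<U>(flag)) == static_cast<U>(flag);
}

} // namespace bitfields

namespace example {

// Hypothetical enum living outside the bitfields namespace.
enum class perms : unsigned { none = 0, read = 1, write = 2, exec = 4 };

// Opt in from the enum's own namespace; ADL finds this overload.
constexpr bool is_bitfield(perms) { return true; }

} // namespace example
```

Callers bring the operators into scope with `using namespace bitfields;` and can then write `example::perms::read | example::perms::write`.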
Justin C. Miller
b6772ac2ea [kernel] Fix #DF when building with -O3
I had failed to specify in inline asm that an input variable was the
same as the output variable.
2021-02-17 00:22:22 -08:00
Justin C. Miller
f0025dbc47 [kernel] Schedule threads on other CPUs
Now that the other CPUs have been brought up, add support for scheduling
tasks on them. The scheduler now maintains separate ready/blocked lists
per CPU, and CPUs will attempt to balance load via periodic work
stealing.

Other changes as a result of this:
- The device manager no longer creates a local APIC object, but instead
  just gathers relevant info from the ACPI tables. Each CPU creates its
  own local APIC object. This also spurred the APIC timer calibration to
  become a static value, as all APICs are assumed to be symmetrical.
- Fixed a bug where the scheduler was popping the current task off of
  its ready list, even though the current task is never on the ready
  list (except that the idle task was first set up as both current and
  ready). This was causing the lists to get into bad states. Now a task
  can only ever be current or in a ready or blocked list.
- Got rid of the unused static process::s_processes list of all
  processes, instead of trying to synchronize it via locks.
- Added spinlocks for synchronization to the scheduler and logger
  objects.
2021-02-15 12:56:22 -08:00
Justin C. Miller
2a347942bc [kernel] Fix SMP boot on KVM
KVM didn't like setting all the CR4 bits we wanted at once. I suspect
that means real hardware won't either. Delay the setting of the rest of
CR4 until after the CPU is in long mode - only set PAE and PGE from real
mode.
2021-02-13 01:45:17 -08:00
Justin C. Miller
36da65e15b [kernel] Add index to cpu_data
Because the firmware can set the APIC ids to whatever it wants, add a
sequential index to each cpu_data structure that jsix will use for its
main identifier, or for indexing into arrays, etc.
2021-02-11 00:00:34 -08:00
Justin C. Miller
8c0d52d0fe [kernel] Add spinlocks to vm_space, frame_allocator
Also updated spinlock interface to be an object, and added a scoped lock
object that uses it as well.
2021-02-10 23:57:51 -08:00
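
A spinlock object paired with a scoped lock, as described in 8c0d52d0fe, might look something like the sketch below. It uses `std::atomic_flag` from the standard library for illustration; the kernel's own implementation and method names (`acquire`, `try_acquire`, `release` here are assumptions) may differ.

```cpp
#include <atomic>
#include <cassert>

// Minimal test-and-set spinlock.
class spinlock {
public:
    void acquire() {
        while (m_flag.test_and_set(std::memory_order_acquire)) {
            // spin until the holder clears the flag
        }
    }

    // Returns true if the lock was free and is now held by the caller.
    bool try_acquire() {
        return !m_flag.test_and_set(std::memory_order_acquire);
    }

    void release() { m_flag.clear(std::memory_order_release); }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};

// RAII guard: acquires in the constructor, releases in the destructor.
class scoped_lock {
public:
    explicit scoped_lock(spinlock &lock) : m_lock(lock) { m_lock.acquire(); }
    ~scoped_lock() { m_lock.release(); }

    scoped_lock(const scoped_lock &) = delete;
    scoped_lock & operator=(const scoped_lock &) = delete;

private:
    spinlock &m_lock;
};
```

The scoped object guarantees the lock is released on every exit path from the enclosing block, including early returns.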
Justin C. Miller
793bba95b5 [boot] Do address virtualization in the bootloader
More and more places in the kernel init code are taking addresses from
the bootloader and translating them to offset-mapped addresses. The
bootloader can do this, so it should.
2021-02-10 01:23:50 -08:00
Justin C. Miller
2d4a65c654 [kernel] Pre-allocate cpu_data and pass to APs
In order to avoid cyclic dependencies in the case of page faults while
bringing up an AP, pre-allocate the cpu_data structure and related CPU
control structures, and pass them to the AP startup code.

This also changes the following:
- cpu_early_init() was split out of cpu_init() to allow early
  usage of current_cpu() on the BSP before we're ready for the rest of
  cpu_init(). (These functions were also renamed to follow the preferred
  area_action naming style.)
- isr_handler now zeroes out the IST entry for its vector instead of
  trying to increment the IST stack pointer
- the IST stacks are allocated outside of cpu_init, to also help reduce
  stack pressure and chance of page faults before APs are ready
- share stack areas between AP idle threads so we only waste 1K per
  additional AP for the unused idle stack
2021-02-10 15:44:07 -08:00
Justin C. Miller
872f178d94 [kernel] Update syscall MSRs for all CPUs
Since SYSCALL/SYSRET rely on MSRs to control their function, split out
syscall_enable() into syscall_initialize() and syscall_enable(), the
latter being called on all CPUs. This affects not just syscalls but also
the kernel_to_user_trampoline.

Additionally, do away with the max syscalls, and just make a single page
of syscall pointers and name pointers. Max syscalls was fragile and
needed to be kept in sync in multiple places.
2021-02-10 15:25:17 -08:00
Justin C. Miller
70d6094f46 [kernel] Add fake preludes to isr handler to trick GDB
By adding more debug information to the symbols and adding function
frame preludes to the isr handler assembly functions, GDB sees them as
valid locations for stack frames, and can display backtraces through
interrupts.
2021-02-10 01:10:26 -08:00
Justin C. Miller
31289436f5 [kernel] Use PAUSE in spinwait
Using PAUSE in a tight loop allows other logical cores on the same
physical core to make use of more of the core's resources.
2021-02-07 23:52:06 -08:00
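
The idea in 31289436f5 can be illustrated with a small spin-wait helper. This is a sketch under assumptions: `cpu_relax` and `spin_until` are hypothetical names, and the PAUSE instruction is emitted through the GCC/Clang builtin `__builtin_ia32_pause` only when compiling for x86; on other targets the helper does nothing.

```cpp
#include <cassert>

// Hint to the CPU that this is a spin-wait loop.
inline void cpu_relax() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause(); // emits the PAUSE instruction
#endif
}

// Polls `done` until it returns true, relaxing the CPU between polls.
// Returns the number of polls that returned false.
template <typename Pred>
unsigned long spin_until(Pred done) {
    unsigned long polls = 0;
    while (!done()) {
        cpu_relax();
        ++polls;
    }
    return polls;
}
```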
Justin C. Miller
72787c0652 [kernel] Make sure all vma types have (virtual) dtors
2021-02-07 23:45:07 -08:00
Justin C. Miller
c88170f6e0 [kernel] Start all other processors in the system
This very large commit is mainly focused on getting the APs started and
to a state where they're waiting to have work scheduled. (Actually
scheduling on them is for another commit.)

To do this, a bunch of major changes were needed:

- Moving a lot of the CPU initialization (including for the BSP) to
  init_cpu(). This includes setting up IST stacks, writing MSRs, and
  creating the cpu_data structure. For the APs, this also creates and
  installs the GDT and TSS, and installs the global IDT.

- Creating the AP startup code, which tries to be as position
  independent as possible. It's copied from its location to 0x8000 for
  AP startup, and some of it is fixed at that address. The AP startup
  code jumps from real mode to long mode with paging in one swell foop.

- Adding limited IPI capability to the lapic class. This will need to
  improve.

- Renaming cpu/cpu.* to cpu/cpu_id.* because it was just annoying in GDB
  and really isn't anything but cpu_id anymore.

- Moved all the GDT, TSS, and IDT code into their own files and made
  them classes instead of a mess of free functions.

- Got rid of bsp_cpu_data everywhere. Now always call the new
  current_cpu() to get the current CPU's cpu_data.

- Device manager keeps a list of APIC ids now. This should go somewhere
  else eventually, device_manager needs to be refactored away.

- Moved some more things (notably the g_kernel_stacks vma) to the
  pre-constructor setup in memory_bootstrap. That whole file is in bad
  need of a refactor.
2021-02-07 23:44:28 -08:00
Justin C. Miller
eb8a3c0e09 [kernel] Fix frame allocator next-block bug
The frame allocator was causing page faults when exhausting the first
(well, last, because it starts from the end) block of free pages. Turns
out it was just incrementing instead of decrementing and thus running
off the end.
2021-02-06 00:06:29 -08:00
Justin C. Miller
335bc01185 [kernel] Fix page_tree growth bug
The logic was inverted in contains(), meaning that new parents were
never being created, and the same level-0 block was just getting reused.
2021-02-05 23:47:29 -08:00
Justin C. Miller
b3861decc3 [kernel] Pass the fb phys addr to userspace
Instead of always mapping the framebuffer at an arbitrary location, and
so reporting that to userspace, send the physical address so drivers can
call system_map_mmio().
2021-02-04 19:56:41 -08:00
Justin C. Miller
b3f59acf7e [kernel] Make sure to virtualize ACPI table pointers
Probably due to old UEFI page tables going away, some systems failed to
load ACPI tables at their physical location. Make sure to translate them
to kernel offset-mapped addresses.
2021-02-04 19:47:17 -08:00
Justin C. Miller
4f8e35e409 [kernel] system_get_log should take a void*
Since it's not just text that's being returned in the buffer, switch the
argument from a char* to a void*.
2021-02-04 19:44:28 -08:00
Justin C. Miller
b898949ffc [kernel] Create system_map_mmio syscall
Create a syscall for drivers to be able to ask the kernel for a VMA that
maps a MMIO area. Also expose vm_flags via j6 table style include file
and new flags.h header.
2021-02-04 19:42:45 -08:00
Justin C. Miller
2244764777 [kernel] Set process stack pointer correctly
The rsp returned by initialize_main_user_stack() needs to be put into
the cpu data area, not just put into the stack (the stack only fills in
rbp).
2021-02-03 17:01:19 -08:00
Justin C. Miller
41eb45402e [kernel] Start process handles at 1
The 0 index was still sometimes not handled properly in the hash table.
Also 0 is sometimes indicative of an error. Let's just start handles
from 1 to avoid those issues.
2021-02-03 16:55:14 -08:00
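
The handle numbering described in 41eb45402e could be sketched as below. The `handle_table` class, `handle_t`, and `invalid_handle` are hypothetical names for this illustration, and `std::unordered_map` stands in for the kernel's own hash table.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using handle_t = std::uint32_t;

// 0 is reserved so it can signal "no handle" or an error.
constexpr handle_t invalid_handle = 0;

template <typename T>
class handle_table {
public:
    // Stores the object and returns its new handle (never 0).
    handle_t add(T *obj) {
        assert(obj);
        handle_t h = m_next++;
        m_objects[h] = obj;
        return h;
    }

    // Returns the object for a handle, or nullptr if it is unknown.
    T * lookup(handle_t h) const {
        auto it = m_objects.find(h);
        return it == m_objects.end() ? nullptr : it->second;
    }

private:
    handle_t m_next = 1; // handles start at 1, not 0
    std::unordered_map<handle_t, T *> m_objects;
};
```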
Justin C. Miller
4985b2d1f4 [kernel] fix frame_allocator::allocate return value
frame_allocator::allocate was returning the passed-in desired amount of
pages, instead of the actual number allocated.
2021-02-02 18:36:28 -08:00
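
The contract fixed in 4985b2d1f4 is that the allocator reports how many frames it actually handed out, which can be fewer than requested. A minimal sketch of that contract follows; `frame_block`, `frame_size`, and this free-function form of `allocate` are assumptions for illustration and not the kernel's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical run of contiguous free frames.
struct frame_block {
    std::uintptr_t base;  // physical address of the first free frame
    std::size_t count;    // number of free frames in the run
};

constexpr std::size_t frame_size = 0x1000;

// Takes up to `wanted` frames from the block, stores the start address in
// *address, and returns the number of frames actually allocated.
inline std::size_t allocate(frame_block &block, std::size_t wanted,
                            std::uintptr_t *address) {
    assert(address);
    std::size_t n = wanted < block.count ? wanted : block.count;
    *address = block.base;
    block.base += n * frame_size;
    block.count -= n;
    return n; // the actual amount, not `wanted`
}
```

A caller that needs more frames than were returned is expected to call again for the remainder.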
Justin C. Miller
68a2250886 [kernel] Use IST for kernel stacks for NMI, #DF, #PF
We started actually running up against the page boundary for kernel
stacks and thus double-faulting on page faults from kernel space. So I
finally added IST stacks. Note that we currently just
increment/decrement the IST entry by a page when we enter the handler to
avoid clobbering on re-entry, but this means:

* these handlers need to be able to operate with only a page of stack
* kernel stacks always have to be >1 pages
* the amount of nesting possible is tied to the kernel stack size.

These seem fine for now, but we should maybe find a way to use something
besides g_kernel_stacks to set up the IST stacks if/when this becomes an
issue.
2021-02-02 18:36:11 -08:00
Justin C. Miller
634a1c5f6a [kernel] Implement VMA page tracking
The previous method of VMA page tracking relied on the VMA always being
mapped at least into one space and just kept track of pages in the
spaces' page tables. This had a number of drawbacks, and the mapper
system was too complex without much benefit.

Now make VMAs themselves keep track of spaces that they're a part of,
and make them responsible for knowing what page goes where. This
simplifies most types of VMA greatly. The new vm_area_open (nee
vm_area_shared, but there is now no reason for most VMAs to be
explicitly shareable) adds a 64-ary radix tree for tracking allocated
pages.

The page_tree cannot yet handle taking pages away, but this isn't
something jsix can do yet anyway.
2021-01-31 22:18:44 -08:00
Justin C. Miller
c364e30240 [kutil] Flag static allocated vectors
kutil::vector can take a static area of memory as its initial memory,
but the case was never handled where it outgrew that memory and had to
reallocate. Steal the high bit from the capacity value to indicate the
current memory should not be kfree()'d. Also added checks in the heap
allocator to make sure pointers look valid.
2021-01-31 20:54:19 -08:00
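
The high-bit trick from c364e30240 can be sketched with a simplified, int-only vector. This is an illustration under assumptions: `int_vector` and its members are hypothetical, and `std::malloc`/`std::free` stand in for the kernel's heap allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

class int_vector {
    // The top bit of the capacity marks memory the vector does not own.
    static constexpr std::size_t static_flag = ~(~std::size_t{0} >> 1);

public:
    // Starts out using a caller-provided static buffer.
    int_vector(int *buffer, std::size_t capacity)
        : m_data(buffer), m_size(0), m_capacity(capacity | static_flag) {}

    ~int_vector() { if (!is_static()) std::free(m_data); }

    int_vector(const int_vector &) = delete;
    int_vector & operator=(const int_vector &) = delete;

    bool is_static() const { return (m_capacity & static_flag) != 0; }
    std::size_t capacity() const { return m_capacity & ~static_flag; }
    std::size_t size() const { return m_size; }
    int operator[](std::size_t i) const { return m_data[i]; }

    void push_back(int value) {
        if (m_size == capacity())
            grow(capacity() ? capacity() * 2 : 4);
        m_data[m_size++] = value;
    }

private:
    void grow(std::size_t new_capacity) {
        int *fresh = static_cast<int *>(std::malloc(new_capacity * sizeof(int)));
        assert(fresh);
        if (m_size)
            std::memcpy(fresh, m_data, m_size * sizeof(int));
        if (!is_static())
            std::free(m_data); // never free the borrowed static buffer
        m_data = fresh;
        m_capacity = new_capacity; // flag cleared: memory is now heap-owned
    }

    int *m_data;
    std::size_t m_size;
    std::size_t m_capacity;
};
```

Storing the flag in the capacity keeps the object the same size as before, at the cost of halving the maximum representable capacity.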
Justin C. Miller
c3dd65457d [kernel] Move 'table' includes to j6/tables
Move all table-style include files that are part of the public kernel
interface to the j6/tables include path
2021-01-28 18:42:42 -08:00
Justin C. Miller
3aa909b917 [kernel] Split loading from scheduler
In preparation for moving things to the init process, move process
loading out of the scheduler. memory_bootstrap now has a
load_simple_process function for mapping an args::program into memory,
and the stack setup has been simplified (though all the initv values are
still being added by the kernel - this needs rework) and normalized to
use the thread::add_thunk_user code path.
2021-01-28 18:26:24 -08:00
Justin C. Miller
35d8d2ab2d [kernel] Add vm_space::allocate
Refactored out vm_space::handle_fault's allocation code into a separate
vm_space::allocate function, and reimplemented handle_fault in terms of
the new function.
2021-01-28 01:08:06 -08:00
Justin C. Miller
e3ebaeb2c8 [kernel] Add new vm_area_fixed
Add a new vm_area type, vm_area_fixed, which is sharable but not
allocatable. Useful for mapping things like MMIO to process spaces.
2021-01-28 01:05:21 -08:00
Justin C. Miller
71dc332dae [kernel] Make default_priority naming consistent
The naming was default_pri in process, but default_priority in
scheduler. Normalize to the longer name.
2021-01-28 01:01:40 -08:00
Justin C. Miller
211a3c2358 [kernel] Clean up syscall code
This is a minor refactor including:
- Removing old commented-out syscall_dispatch function
- Removing IA32_EFER syscall-enable flag setting (this is done by the
  bootloader now)
- Moving much logging from inside process/thread syscalls to the 'task'
  log area, allowing for turning the 'syscall' area down to info by
  default.
2021-01-23 20:37:20 -08:00
Justin C. Miller
16b9d4fd8b [kernel] Have process_start syscall take a list of handles
This also prompted a change of the process initialization protocol to
allow handles to get typed, and changing to marking them as just
self/other handles. This also means exposing the object type enum to
userspace.
2021-01-23 20:36:27 -08:00
Justin C. Miller
c0f304559f [boot] Send module addresses as physical
This makes the job of the kernel easier when marking module pages as
used in the frame allocator. This will also help when sending modules
over to the init process.
2021-01-23 20:30:09 -08:00
Justin C. Miller
0df93eaa98 [kernel] Added the process_kill syscall
Added process_kill, and also cleaned up all the disparate types being
used for thread/process exit codes. (Now all int32_t.)
2021-01-22 00:38:46 -08:00
Justin C. Miller
aae18fd035 [boot][kernel] Replace frame allocator with bitmap-based one
The previous frame allocator involved a lot of splitting and merging
linked lists and lost all information about frames while they were
allocated. The new allocator is based on an array of descriptor
structures and a bitmap. Each memory map region of allocatable memory
becomes one or more descriptors, each mapping up to 1GiB of physical
memory. The descriptors implement two levels of a bitmap tree, and have
a pointer into the large contiguous bitmap to track individual pages.
2021-01-22 00:16:01 -08:00
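
As a rough illustration of the bitmap idea in aae18fd035, the sketch below tracks frames with one bit each. It is deliberately much simpler than what the commit describes: there are no descriptors and no two-level tree, only a flat bitmap. The `frame_bitmap` class and the convention that a set bit means "free" are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Flat bitmap over `Frames` page frames; a set bit means the frame is free.
template <std::size_t Frames>
class frame_bitmap {
    static_assert(Frames % 64 == 0, "Frames must be a multiple of 64");

public:
    frame_bitmap() {
        for (auto &word : m_words)
            word = ~std::uint64_t{0}; // everything starts out free
    }

    // Returns the index of a newly allocated frame, or Frames if none is free.
    std::size_t allocate() {
        for (std::size_t w = 0; w < Frames / 64; ++w) {
            if (!m_words[w])
                continue; // whole word in use, skip it
            for (unsigned bit = 0; bit < 64; ++bit) {
                std::uint64_t mask = std::uint64_t{1} << bit;
                if (m_words[w] & mask) {
                    m_words[w] &= ~mask;
                    return w * 64 + bit;
                }
            }
        }
        return Frames;
    }

    void release(std::size_t frame) {
        assert(frame < Frames);
        m_words[frame / 64] |= std::uint64_t{1} << (frame % 64);
    }

    bool is_free(std::size_t frame) const {
        assert(frame < Frames);
        return (m_words[frame / 64] >> (frame % 64)) & 1;
    }

private:
    std::uint64_t m_words[Frames / 64];
};
```

Unlike a free list, the state of any frame can be queried by index even while the frame is allocated.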
Justin C. Miller
452457412b [kernel] Add process_create syscall
New syscall creates a process (and thus a new virtual address space) but
does not create any threads in it.
2021-01-20 18:39:14 -08:00
Justin C. Miller
0ae2f935af [kernel] Remove old fake stdout channel/task
This was useful for testing channels, but it just gets in the way now.
2021-01-20 01:30:33 -08:00
Justin C. Miller
3282a3ae34 [kernel] Split out sched log area
To keep the task log area useful, scheduler updates on processes now go
to the new sched log area.
2021-01-20 01:29:18 -08:00
Justin C. Miller
cb612c36ea [boot][kernel] Split programs into sections
To enable setting sections as NX or read-only, the boot program loader
now loads programs as lists of sections, and the kernel args are updated
accordingly. The kernel's loader now just takes a program pointer to
iterate the sections. Also enable NX in IA32_EFER in the bootloader.
2021-01-20 01:25:47 -08:00
Justin C. Miller
847d7ab38d [kernel] Add a 'log available' signal to block on
There was previously no good way to block log-display tasks, either the
fb driver or the kernel log task. Now the system object has a signal
(j6_signal_system_has_log) that gets asserted when the log is written
to.
2021-01-18 19:12:49 -08:00
Justin C. Miller
99ef9166ae [kernel] Lower APIC calibration timer
Now that the spinwait bug is fixed, the raised time for APIC calibration
can be put back to a lower value. It was previously raised thinking more
time would get a more accurate result -- but accuracy was not the issue.
2021-01-18 18:25:44 -08:00
Justin C. Miller
0305830e32 [kernel] fix thread_create handle bug
thread_create was setting the handle it returned to be that of the
parent process, not the thread it created.
2021-01-18 18:24:18 -08:00
Justin C. Miller
9f342dff49 [kernel] fix err_insufficient bug in endpoint
The endpoint syscalls endpoint_recv and endpoint_sendrecv gained new
local stack variables for calling into possibly blocking endpoint
functions, but the len variable was being initialized to 0 instead of
the incoming buffer size.
2021-01-18 18:22:32 -08:00