This is the second of two big changes to clean up includes throughout
the project. Since I've started using clangd with Neovim and using
VSCode's intellisense, my former strategy of copying all header files
into place in `build/include` means that the real files don't show up in
`compile_commands.json` and so display many include errors when viewing
those header files in those tools.
That setup was mostly predicated on a desire to keep directory depths
small, but really I don't think paths like `src/libraries/j6/j6` are
much better than `src/libraries/j6/include/j6`, and the latter doesn't
have the aforementioned issues, and is clearer to the casual observer as
well.
Some additional changes:
- Added a new module flag `copy_headers` for behavior similar to the old
style, but placing headers in `$module_dir/include` instead of the
global `build/include`. This was needed for external projects that
don't follow the same source/headers folder structure - in this case,
`zstd`.
- There is no longer an associated `headers.*.ninja` for each
`module.*.ninja` file, as only parsed headers need to be listed; this
functionality has been moved back into the module's ninja file.
This is the first of two rather big changes to clean up includes
throughout the project. In this commit, the implicit semi-dependency on
libc that bonnibel adds to every module is removed. Previously, I was
sloppy with includes of libc headers and include directory order. Now,
the freestanding headers from libc are split out into libc_free, and an
implicit real dependency is added onto this module, unless `no_libc` is
set to `True`. The full libc needs to be explicitly specified as a
dependency to be used.
Several things needed to change in order to do this:
- Many places use `memset` or `memcpy` that cannot depend on libc. The
kernel has basic implementations of them itself for this reason. Now
those functions are moved into the lower-level `j6/memutils.h`, and
libc merely references them. Other modules are now free to reference
those functions from libj6 instead.
- The kernel's `assert.h` was renamed kassert.h (matching its `kassert`
function) so that the new `util/assert.h` can use `__has_include` to
detect it and make sure the `assert` macro is usable in libutil code.
- Several implementation header files under `__libj6/` also moved under
the new libc_free.
- A new `include_phase` property has been added to modules for Bonnibel,
which can be "normal" (default) or "late" which uses `-idirafter`
instead of `-I` for includes.
- Since `<utility>` and `<new>` are not freestanding, implementations of
`remove_reference`, `forward`, `move`, and `swap` were added to the
`util` namespace to replace those from `std`, and `util/new.h` was
added to declare `operator new` and `operator delete`.
These are some changes I made to debug tooling while tracking down the
bugfix in the previous commit.
Each `scripts/debug_*_alloc.gdb` script has gdb output a `*_allocs.txt`
file, which in turn can be parsed by the `scripts/parse_*_allocs.py`
script to find errors.
The upcoming futex syscalls will be easier to use (and to auto verify)
if passed a pointer instead of an address, this allows for changing a
`Primitive` to a `PrimitiveRef` by adding a `*` to the end.
The debugcon logger is now separate from logger::output, and is instead
a kernel-internal thread that watches for logs and prints them to the
deubcon device.
Make build_symbol_table.py output statistics on the symbol table it
builds, and emit warnings for zero-length symbols. Also added lengths to
several functions defined in asm that this uncovered.
The symbol table needs to be passed to the panic handler very early in
the kernel, loading it in init is far less useful. Return it to the boot
directory and remove it from the initrd.
Previously, to add a custom linker script, a module would need to modify
its variables after the fact to add to ldflags. Now module constructors
take a new keyword `ld_script` and handle the ldflags and dependencies
properly.
CDB seemed to be too simple for the needs of init, and squashfs is too
laden with design choices to work around Linux's APIs. This commit adds
creation of an initrd image of a new format I've called `j6romfs`.
Note that this commit currently does not work! The initrd-reading code
still needs to be added.
The initrd image is now created by the build system, loaded by the
bootloader, and passed to srv.init, which loads it (but doesn't do
anything with it yet, so this is actually a functional regression).
This simplifies a lot of the modules code between boot and init as well:
Gone are the many subclasses of module and all the data being inline
with the module structs, except for any loaded files. Now the only
modules loaded and passed will be the initrd, and any devices only the
bootloader has knowledge of, like the UEFI framebuffer.
A new compressed initrd format for srv.init to load drivers, services,
and data from, instead of every file getting loaded by the bootloader.
This will allow for less memory allocated by the bootloader and passed
to init if not every driver or data file is loaded.
Loading, passing, and using the new initrd will be done in a coming
commit.
This new class makes it easier for user programs to spawn threads. This
change also includes support for .hh files in modules, to differentiate
headers that are C++-only in system libraries.
After hitting 700 commits last week, I thought it'd be a good time to
update the project status in the Readme. This change also pulls out the
toolchain scripts that moved to jsix-os/toolchain, and updates the
Readme about that as well.
This new libc is mostly from scratch, with *printf() functions provided
by Marco Paland and Eyal Rozenberg's tiny printf library, and malloc and
friends provided by dlmalloc.
The great header shift: It didn't make sense to regenerate headers for
the same module for every target (boot/kernel/user) it appeared in. And
now that core headers are out of src/include, this was going to cause
problems for the new libc changes I've been working on. So I went back
to re-design how module headers work.
Pre-requisites:
- A module's public headers should all be available in one location, not
tied to target.
- No accidental includes. Another module should not be able to include
anything (creating an implicit dependency) from a module without
declaring an explicit dependency.
- Exception to the previous: libc's headers should be available to all,
at least for the freestanding headers.
New system:
- A new "public_headers" property of module declares all public headers
that should be available to dependant modules
- All public headers (after possible processing) are installed relative
to build/include/<module> with the same path as their source
- This also means no "include" dir in modules is necessary. If a header
should be included as <j6/types.h> then its source should be
src/libraries/j6/j6/types.h - this caused the most churn as all public
header sources moved one directory up.
- The "includes" property of a module is local only to that module now,
it does not create any implicit public interface
Other changes:
- The bonnibel concept of sources changed: instead of sources having
actions, they themselves are an instance of a (sub)class of Source,
which provides all the necessary information itself.
- Along with the above, rule names were standardized into <type>.<ext>,
eg "compile.cpp" or "parse.cog"
- cog and cogflags variables moved from per-target scope to global scope
in the build files.
- libc gained a more dynamic .module file
The manifest can now supply a list of boot flags, including "test".
Those get turned into the bootproto::args::flags field by the
bootloader. The kernel takes those and uses the test flag to control
enabling syscalls with the new "test" attribute, like the new
test_finish syscall, which lets automated tests call back to the kernel
to shut down the system.
The new zero_ok flag is similar to optional for reference parameters,
but only in cases where there is a length parameter. If that parameter
is a reference parameter itself and is null, or it is non-null and
contains a non-zero length, or there is no length parameter, then the
main parameter may not be null.
The main point of this change is to allow "global" capabilities defined
on the base object type. The example here is the clone capability on all
objects, which governs the ability to clone a handle.
Related changes in this commit:
- Renamed `kobject` to `object` as far as the syscall interface is
concerned. `kobject` is the cname, but j6_cap_kobject_clone feels
clunky.
- The above change made me realize that the "object <type>" syntax for
specifying object references was also clunky, so now it's "ref <type>"
- Having to add `.object` on everywhere to access objects in
interface.exposes or object.super was cumbersome, so those properties
now return object types directly, instead of ObjectRef.
- syscall_verify.cpp.cog now generates code to check capabilities on
handles if they're specified in the definition, even when not passing
an object to the implementation function.
Added the handle_clone syscall which allows for cloning a handle with
a subset of the original handle's capabilities.
Related changes:
- srv.init now calls handle_clone on its system handle, and load_program
was changed to allow this second system handle to be passed to loaded
programs instead. However, as drv.uart is still a driver AND a log
reader, this new handle is not actually passed yet.
- The definition parser was using a set for the cap list, which meant
the order (and thus values) of caps was not static.
- Some code in objects/handle.h was made more explicit about what bits
meant what.
This change finally adds capabilities to handles. Included changes:
- j6_handle_t is now again 64 bits, with the highest 8 bits being a type
code, and the next highest 24 bits being the capability mask, so that
programs can check type/caps without calling the kernel.
- The definitions grammar now includes a `capabilities [ ]` section on
objects, to list what capabilities are relevant.
- j6/caps.h is auto-generated from object capability lists
- init_libj6 again sets __handle_self and __handle_sys, this is a bit
of a hack.
- A new syscall, j6_handle_list, will return the list of existing
handles owned by the calling process.
- syscall_verify.cpp.cog now actually checks that the needed
capabilities exist on handles before allowing the call.
The compile_commands.json database is used by clangd to understand the
compilation settings of the project. Keep this up to date by making
ninja update it as part of the build, and have it depend on all build
and module files.
This commit contains a couple large, interdependent changes:
- In preparation for capability checking, the _syscall_verify_*
functions now load most handles passed in, and verify that they exist
and are of the correct type. Lists and out-handles are not converted
to objects.
- Also in preparation for capability checking, the internal
representation of handles has changed. j6_handle_t is now 32 bits, and
a new j6_cap_t (also 32 bits) is added. Handles of a process are now a
util::map<j6_handle_t, handle> where handle is a new struct containing
the id, capabilities, and object pointer.
- The kernel object definition DSL gained a few changes to support auto
generating the handle -> object conversion in the _syscall_verify_*
functions, mostly knowing the object type, and an optional "cname"
attribute on objects where their names differ from C++ code.
(Specifically vma/vm_area)
- Kernel object code and other code under kernel/objects is now in a new
obj:: namespace, because fuck you <cstdlib> for putting "system" in
the global namespace. Why even have that header then?
- Kernel object types constructed with the construct_handle helper now
have a creation_caps static member to declare what capabilities a
newly created object's handle should have.
Since we have a DSL for specifying syscalls, we can create a verificaton
method for each syscall that can cover most argument (and eventually
capability) verification instead of doing it piecemeal in each syscall
implementation, which can be more error-prone.
Now a new _syscall_verify_* function exists for every syscall, which
calls the real implementation. The syscall table for the syscall handler
now maps to these verify functions.
Other changes:
- Updated the definition grammar to allow options to have a "key:value"
style, to eventually support capabilities.
- Added an "optional" option for parameters that says a syscall will
accept a null value.
- Some bonnibel fixes, as definition file changes weren't always
properly causing updates in the build dep graph.
- The syscall implementation function signatures are no longer exposed
in syscall.h. Also, the unused syscall enum has been removed.
A structure, system_config, which is dynamically defined by the
definitions/sysconf.yaml config, is now mapped into every user address
space. The kernel fills this with information about itself and the
running machine.
User programs access this through the new j6_sysconf fake syscall in
libj6.
See: Github bug #242
See: [frobozz blog post](https://jsix.dev/posts/frobozz/)
Tags:
The last commit was a bandaid, but this needed a real fix. Now we create
a .parse_deps.phony file in every module build dir that implicitly
depends on that module's dependencies' .parse_deps.phony files, as well
as order-only depends on any cog-parsed output for that module. This
causes the cog files to get generated first if they never have been, but
still leaves real header dependency tracking to clang's depfile
generation.
While bonnibel already had the concept of a manifest, which controls
what goes into the built disk image, the bootloader still had filenames
hard-coded. Now bonnibel creates a 'jsix_boot.dat' file that tells the
bootloader what it should load.
Changes include:
- Modules have two new fields: location and description. location is
their intended directory on the EFI boot volume. description is
self-explanatory, and is used in log messages.
- New class, boot::bootconfig, implements reading of jsix_boot.dat
- New header, bootproto/bootconfig.h, specifies flags used in the
manifest and jsix_boot.dat
- New python module, bonnibel/manifest.py, encapsulates reading of the
manifest and writing jsix_boot.dat
- Syntax of the manifest changed slightly, including adding flags
- Boot and Kernel target ccflags unified a bit (this was partly due to
trying to get enum_bitfields to work in boot)
- util::counted gained operator+= and new free function util::read<T>
This is a rather large commit that is widely focused on cleaning things
out of the 'junk drawer' that is src/include. Most notably, several
things that were put in there because they needed somewhere where both
the kernel, boot, and init could read them have been moved to a new lib,
'bootproto'.
- Moved kernel_args.h and init_args.h to bootproto as kernel.h and
init.h, respectively.
- Moved counted.h and pointer_manipulation.h into util, renaming the
latter to util/pointers.h.
- Created a new src/include/arch for very arch-dependent definitions,
and moved some kernel_memory.h constants like frame size, page table
entry count, etc to arch/amd64/memory.h. Also created arch/memory.h
which detects platform and includes the former.
- Got rid of kernel_memory.h entirely in favor of a new, cog-based
approach. The new definitions/memory_layout.csv lists memory regions
in descending order from the top of memory, their sizes, and whether
they are shared outside the kernel (ie, boot needs to know them). The
new header bootproto/memory.h exposes the addresses of the shared
regions, while the kernel's memory.h gains the start and size of all
the regions. Also renamed the badly-named page-offset area the linear
area.
- The python build scripts got a few new features: the ability to parse
the csv mentioned above in a new memory.py module; the ability to add
dependencies to existing source files (The list of files that I had to
pull out of the main list just to add them with the dependency on
memory.h was getting too large. So I put them back into the sources
list, and added the dependency post-hoc.); and the ability to
reference 'source_root', 'build_root', and 'module_root' variables in
.module files.
- Some utility functions that were in the kernel's memory.h got moved to
util/pointers.h and util/misc.h, and misc.h's byteswap was renamed
byteswap32 to be more specific.
Bonnibel used to copy any asset into the root - now a path can be
specified for assets, and debug output goes into .debug so GDB will pick
it up automatically.
Overall, I believe TOML to be a superior configuration format than YAML
in many situations, but it gets ugly quickly when nesting data
structures. The build configs were fine in TOML, but the manifest (and
my future plans for it) got unwieldy. I also did not want different
formats for each kind of configuration on top of also having a custom
DSL for interface definitions, so I've switched all the TOML to YAML.
Also of note is that this change actually adds structure to the manifest
file, which was little more than a CSV previously.
This change adds a new interface DSL for specifying objects (with
methods) and interfaces (that expose objects, and optionally have their
own methods).
Significant changes:
- Add the new scripts/definitions Python module to parse the DSL
- Add the new definitions directory containing DSL definition files
- Use cog to generate syscall-related code in kernel and libj6
- Unify ordering of pointer + length pairs in interfaces
This change moves Bonnibel from a separate project into the jsix tree,
and alters the project configuration to be jsix-specific. (I stopped
using bonnibel for any other projects, so it's far easier to make it a
custom generator for jsix.) The build system now also uses actual python
code in `*.module` files to configure modules instead of TOML files.
Target configs (boot, kernel-mode, user-mode) now moved to separate TOML
files under `configs/` and can inherit from one another.
Created the framework for using different loadable panic handlers,
loaded by the bootloader. Initial panic handler is panic.serial, which
contains its own serial driver and stacktrace code.
Other related changes:
- Asserts are now based on the NMI handler - panic handlers get
installed as the NMI interrupt handler
- Changed symbol table generation: now use nm's own demangling and
sorting, and include symbol size in the table
- Move the linker script argument out of the kernel target, and into the
kernel's specific module, so that other programs (ie, panic handlers)
can use the kernel target as well
- Some asm changes to boot.s to help GDB see stack frames - but this
might not actually be all that useful
- Renamed user_rsp to just rsp in cpu_state - everything in there is
describing the 'user' state
Update the build_sysroot to build an LLVM 11 sysroot, dealing with
LLVM's change in structure. Also re-add peru.yaml for getting a
pre-built sysroot via peru.
See: https://github.com/buildinspace/peru
For the fb driver to have a font before loading from disk is available,
create a hard-coded font as a byte array.
To create this, added a new scripts/psf_to_cpp.py which also refactored
out much of scripts/parse_font.py into a new shared module
scripts/fontpsf.py.
Create a new framebuffer driver. Also hackily passing frame buffer size
in the list of init handles to all processes and mapping the framebuffer
into all processes. Changed bootloader passing frame buffer as a module
to its own struct.
On my laptop, a few symbol names raise errors as invalid when attempting
to demangle them. This should not stop the rest of the symbol table from
building.
Remove ELF and initrd loading from the kernel. The bootloader now loads
the initial programs, as it does with the kernel. Other files that were
in the initrd are now on the ESP, and non-program files are just passed
as modules.
vm_space no longer relies on page_manager to map pages during a page
fault. Other changes that come with this commit:
- C++ standard has been changed to C++17
- enum bitfield operators became constexpr
- enum bifrield operators can take a mix of ints and enum arguments
- added page table flags enum instead of relying on ints
- remove page_table::unmap_table and page_table::unmap_pages