Linux kernel 6.15: A closer look at new features
Linux 6.15 once again improves performance, offers standardized firmware management and tidies up x86 support.
Linus Torvalds released Linux kernel 6.15 last week. In addition to the usual new and adapted drivers and many small corrections, the new kernel is mainly dedicated to optimization and decluttering. Some new features stand out and are worth a closer look.
Better performance
The “Translation Lookaside Buffer” (TLB) is a small, fast cache in the CPU that helps to translate virtual addresses into physical addresses quickly. It stores the most recently used address translations to avoid accesses to the page table, a kernel-managed table in memory. When a virtual address is used, the CPU first checks the TLB. If there is no entry there, the hardware walks the page table to find the physical memory location.
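The lookup order described above can be sketched as a toy model. This is only an illustration, not how hardware or the kernel implements it: the class name `ToyTLB`, the 4 KiB page size, the capacity of four entries and the LRU eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyTLB:
    """Small LRU cache of virtual-page -> physical-frame translations."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # dict standing in for the kernel-managed table
        self.capacity = capacity
        self.entries = OrderedDict()  # cached translations, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address for vaddr, checking the cache first."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # the slow "page walk"
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the oldest entry
        return self.entries[vpn] * PAGE_SIZE + offset

    def invalidate(self, vpn):
        """Drop a cached translation so the next access reloads it."""
        self.entries.pop(vpn, None)


page_table = {0: 7, 1: 3}
tlb = ToyTLB(page_table)
first = tlb.translate(0x10)   # miss: loaded from the page table
second = tlb.translate(0x20)  # hit: same page, served from the cache
```

If the page table entry for page 0 changed afterwards, the cached copy would be stale until `tlb.invalidate(0)` is called, which is the problem the following paragraphs deal with.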
If the page table changes, stored TLB entries may be out of date. The kernel must therefore explicitly delete them from the TLB so that they are reloaded the next time they are used. To do this, it uses CPU instructions such as INVLPG (x86 in general) or INVLPGA (for AMD), as well as coordinated TLB shootdowns, in which other processors are requested via interrupts to delete their TLB entries. This is necessary because in a multiprocessor system each CPU has its own TLB, which can hold stale entries.
These operations are particularly expensive on systems with many CPUs or when many TLB entries have to be deleted. During a TLB shootdown, the kernel uses so-called “inter-processor interrupts” (IPIs) to ask other CPUs to delete their TLB entries, and it has to wait until all affected processors have received and processed the interrupts. In addition, only one TLB entry at a time can be sent for deletion in an IPI.
With Linux 6.15, a patch for AMD processors contributed by Meta provides a much more efficient solution. Linux now uses the AMD-specific CPU instruction INVLPGB. The instruction has been available since the Zen 3 architecture, which AMD introduced in November 2020. The highlight: INVLPGB does not have to wait for interrupts to be processed, nor is it limited to deleting individual entries. Multiple TLB entries can be removed in a single broadcast. Batching several TLB entries into one deletion and eliminating the waiting time should provide a performance boost.
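The difference between the two approaches can be made tangible with a deliberately simplified counting model. It follows the description in this article (one entry per interrupt, one interrupt per remote CPU) and says nothing about real latencies or about how the kernel actually batches work; the function names and the representation of each TLB as a Python set are assumptions for illustration.

```python
def ipi_shootdown(tlbs, pages, initiator=0):
    """Model of the IPI path: one interrupt per remote CPU and per page."""
    interrupts = 0
    for page in pages:
        for cpu, tlb in enumerate(tlbs):
            tlb.discard(page)
            if cpu != initiator:
                interrupts += 1  # the remote CPU must take and acknowledge an IPI
    return interrupts


def broadcast_shootdown(tlbs, pages):
    """Model of a broadcast invalidation: one request covers the whole batch."""
    for tlb in tlbs:
        tlb.difference_update(pages)
    return 1  # a single broadcast, no interrupts on the remote CPUs


def make_tlbs(cpus, cached_pages):
    """Give every CPU its own copy of the same cached pages."""
    return [set(cached_pages) for _ in range(cpus)]


stale = [1, 2]
ipi_cost = ipi_shootdown(make_tlbs(4, {1, 2, 3}), stale)
broadcast_cost = broadcast_shootdown(make_tlbs(4, {1, 2, 3}), stale)
```

In this model the IPI path scales with the number of remote CPUs times the number of pages, while the broadcast path stays at a single request regardless of either.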
Standardized firmware management
Linux 6.15 introduces the new fwctl (Firmware Control) subsystem. It provides a standardized and secure interface for managing firmware directly from user space. Until now, there has been no standardized, device-independent method for configuring and provisioning firmware or for diagnosing firmware errors.
Modern hardware often contains complex components, each with its own firmware. As there has been no standardized method to date, every vendor has been doing its own thing. Ultimately, this led to a proliferation of fragmented and insecure solutions. fwctl aims to change this.
Security and compatibility
fwctl is a kernel subsystem that makes it possible to communicate with the firmware of a device via remote procedure calls (RPCs), for example to transfer new firmware. This communication takes place via a standardized interface that is based on ioctls and is accessible via /dev/fwctl/fwctlX device files. This can be used to perform various tasks, such as reading debug information, configuring devices or provisioning hardware.
fwctl implements security mechanisms that specifically control access to sensitive functions. It defines different access levels (scopes) for RPCs to control how deeply a caller may reach into the firmware. For example, full debug access requires special privileges and marks the kernel as “tainted” (literally “dirty” or “unclean”). A “tainted” kernel generally signals that the kernel is in a modified or potentially insecure state, for example due to unsupported drivers or deep debug access.
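Whether a running kernel is tainted can be checked from user space: the kernel exposes the taint state as a decimal bitmask in /proc/sys/kernel/tainted. The following sketch decodes such a mask. The helper names are made up for this example, and the bit number used for the fwctl taint (19) is an assumption that should be verified against the kernel's tainted-kernels documentation for the version in use.

```python
from pathlib import Path

# Assumption: bit position of the fwctl taint flag; check the kernel docs.
ASSUMED_TAINT_FWCTL_BIT = 19


def set_bits(mask):
    """Return the positions of all bits that are set in the taint mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def is_bit_set(mask, bit):
    """True if the given taint bit is set in the mask."""
    return bool((mask >> bit) & 1)


def read_taint_mask(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask (Linux only)."""
    return int(Path(path).read_text().strip())


# Example with a made-up mask instead of the live system value.
example_mask = (1 << ASSUMED_TAINT_FWCTL_BIT) | 1
example_bits = set_bits(example_mask)
```

On a Linux system, `set_bits(read_taint_mask())` returns an empty list for an untainted kernel.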
In addition, fwctl ensures compatibility with existing kernel subsystems such as RDMA, DRM or VFIO. This spares applications from having to access device-specific interfaces that may change in the future.
Focus on mounts
fanotify (file access notify) was introduced with kernel 2.6.36 as an interface for real-time virus scanners. It was originally used to monitor, report and block file system accesses. Over the years, fanotify has evolved into a jack of all trades. For example, systemd has relied on fanotify for its read-ahead, which reads file contents into RAM in advance to speed up access to them.
Linux 6.15 adds yet another capability. fanotify now lets a process register to be notified when file systems are mounted and unmounted. Instead of polling /proc/<PID>/mountinfo as before, which is expensive, the process can now sit back and wait to be told about changes to the mounts. The kernel knows when a mount is added or removed; after all, it performs the mounting and unmounting itself. A process that constantly asks for this information creates considerable overhead.
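To show why polling is costly, here is a minimal sketch of the old approach: take two snapshots of a mountinfo file, parse them and compare the mount IDs. The function names and the sample lines are invented for illustration; the field layout (mount ID first, mount point fifth, file system type after the lone "-" separator) follows the documented /proc/<PID>/mountinfo format.

```python
def parse_mountinfo(text):
    """Map mount ID -> (mount point, file system type)."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # skip blank or malformed lines
        sep = fields.index("-")  # separator after the optional fields
        mounts[int(fields[0])] = (fields[4], fields[sep + 1])
    return mounts


def diff_mounts(before, after):
    """Return (attached, detached) mount IDs between two snapshots."""
    attached = sorted(after.keys() - before.keys())
    detached = sorted(before.keys() - after.keys())
    return attached, detached


# Invented sample snapshots; a real poller would re-read
# /proc/self/mountinfo in a loop and repeat this comparison every time.
snapshot_1 = "22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw\n"
snapshot_2 = snapshot_1 + "40 22 0:35 / /mnt/data rw,nosuid - tmpfs tmpfs rw,size=1024k\n"

attached, detached = diff_mounts(parse_mountinfo(snapshot_1),
                                 parse_mountinfo(snapshot_2))
```

Every polling round has to read and parse the complete table even if nothing changed, which is exactly the work that a notification from the kernel makes unnecessary.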
The kernel introduces the notification masks FAN_MNT_ATTACH and FAN_MNT_DETACH for fanotify to monitor the mounting and unmounting of a file system. The registered process receives the mount ID of the affected file system in the notification. This ID can then be used to obtain more detailed information about the mount via the system calls listmount() and statmount(). The two system calls were first introduced in Linux 6.8 and were extended again in 6.11. This time, too, the developers have extended the statmount() system call, which now also returns information about the ID mappings applied to the mount.
ID-mapped from ID-mapped
The new kernel incarnation also takes “ID-mapped mounts” to a new level. Linux 6.15 now also allows the creation of ID-mapped mounts from existing ID-mapped mounts. Previously, this only worked for file systems that were not subject to ID remapping.
As ID remapping has already been tweaked several times, ID-mapped mounts and the associated overhead have become established. This was not quite foreseeable when ID-mapped mounts debuted in kernel 5.12.