Linux kernel gets "blue screens" with QR code

Linux distributions recently learned to output information via QR codes when a hopeless error occurs. The Linux kernel is now getting a similar function.

Penguin sits helplessly in front of a computer that displays a QR code and "Kernel Panic"

(Image: created with AI in Bing Designer by heise online / dmk)

By Thorsten Leemhuis
This article was originally published in German and has been automatically translated.

According to current plans, Linux kernel 6.12 will be able to display a QR code with error information when it deliberately stops working altogether because of an extremely critical error. This situation is known in the Unix world as a "kernel panic" and in Windows as a "blue screen" or "blue screen of death".

An example of a QR code that will be displayed in the future in the event of a kernel panic.

(Image: Jocelyn Falempe)

The code developed by Jocelyn Falempe to display the QR code in the event of a panic is the first feature in the official kernel written in the Rust programming language that is relevant to a wider range of users. At least potentially: Due to several hurdles, such QR codes will probably not be seen in mainstream Linux distributions for at least six months; it is also still uncertain how many of the graphics cards commonly used in x86 systems the feature will work with by then.

Fedora Linux 42 should include this function next year. It complements the QR error codes that Linux distributions have been able to display for six months with the help of Systemd. According to numerous reports, this approach at operating system level already seems to do what the Linux kernel is only now learning to do. In reality, however, Systemd is powerless in the event of a genuine kernel panic; in the situations it does handle, on the other hand, it can react more flexibly.

The smallest of three obstacles that will prevent users from seeing QR code panics of the kernel any time soon: the program code responsible for generating the QR code is written in Rust. In order to be able to use it, distributions must therefore activate "Rust for Linux" support when building their kernels. Until now, many have not done this because it requires specific compiler versions that must be neither too old nor too new, and the required versions change every now and then.

This will no longer be necessary from Linux 6.11, which is expected to be released on September 16 or 23: from this kernel version onwards, Rust for Linux will support the Rust compiler 1.78.0 or newer in the medium or long term. However, even if this complication is dropped, Rust support will retain its "Experimental" status. Many distributors are therefore only likely to activate it once they are sufficiently keen on some kernel feature that requires Rust.
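The version rule described above amounts to a simple minimum check. The following is a minimal sketch, assuming the compiler version is available as a string like the ones `rustc --version` prints; the helper names are hypothetical and not part of any kernel tooling, and the kernel's own build system performs its checks in its own way.

```python
import re

# Minimum Rust compiler version the article names for Linux 6.11 and later.
MIN_RUSTC = (1, 78, 0)


def parse_rustc_version(text):
    """Extract (major, minor, patch) from a string such as 'rustc 1.80.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def rustc_is_new_enough(version_text, minimum=MIN_RUSTC):
    """Return True if the reported compiler version is at least the minimum."""
    # Tuples compare element by element, so (1, 100, 0) > (1, 78, 0) holds.
    return parse_rustc_version(version_text) >= minimum


if __name__ == "__main__":
    for sample in ("rustc 1.77.2", "rustc 1.78.0", "rustc 1.81.0"):
        print(sample, "->", rustc_is_new_enough(sample))
```

Comparing integer tuples rather than strings avoids the classic mistake of sorting "1.100.0" before "1.78.0".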

The code for displaying QR codes in kernel panics is based on the "drm_panic" feature, which Linux has offered since version 6.10 and which also comes from Jocelyn Falempe. This allows the kernel to take the graphics output away from the desktop interface in the event of a critical error in order to display the text of a kernel panic undisturbed. Previously, it was not able to do this; if a desktop interface was running, the image usually simply froze without the kernel being able to name the cause and output information for debugging.

The problem: The graphics driver used must explicitly support drm_panic. Apart from the generic kernel driver Simpledrm, only a few rather exotic drivers currently support this. At least for the GeForce driver Nouveau, which is common in the x86 world, Falempe published patches to support drm_panic back in May, but these are not yet on their way into the official kernel. There is no sign of corresponding adaptations for the Amdgpu and i915 drivers, which are also frequently used, but Falempe wants to ensure that these are also included.

Falempe has recently removed a third obstacle: In order to be able to activate drm_panic when building the kernel, users must deactivate the CONFIG_VT_CONSOLE option in Linux 6.11. However, Linux then does not display any kernel messages at startup and cannot open an emergency login on its own in single-user mode. Some recently proposed changes eliminate the conflict between the two features; as things stand, they will also be included in Linux 6.12, which should be released on one of the last two Mondays in November or the first Monday in December.
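The conflict described above can be checked against the text of a kernel `.config` file. The following is a minimal sketch under stated assumptions: the option name `CONFIG_DRM_PANIC` is assumed here (the article only names `CONFIG_VT_CONSOLE`), the function names are hypothetical, and the real dependency rules in the kernel's Kconfig files may be more nuanced than this simplified check.

```python
def parse_kconfig(text):
    """Parse kernel .config text into a dict of option name -> value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Comment lines such as '# CONFIG_FOO is not set' mean "disabled",
            # so the option is simply left out of the dict.
            continue
        name, _, value = line.partition("=")
        options[name] = value.strip('"')
    return options


def drm_panic_blockers(options):
    """List reasons why drm_panic would be unavailable under the 6.11 rule."""
    blockers = []
    # Assumed option name for the drm_panic feature.
    if options.get("CONFIG_DRM_PANIC") != "y":
        blockers.append("drm_panic is not enabled")
    # Conflict described in the article for Linux 6.11.
    if options.get("CONFIG_VT_CONSOLE") == "y":
        blockers.append("CONFIG_VT_CONSOLE is enabled")
    return blockers


if __name__ == "__main__":
    sample = "CONFIG_DRM_PANIC=y\n# CONFIG_VT_CONSOLE is not set\n"
    print(drm_panic_blockers(parse_kconfig(sample)))
```

On a running system, the same functions could be fed the contents of a configuration file shipped by the distribution, where one is provided.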

The QR code panics also require a web service that receives the data contained in the QR code via URL and decodes it. This is necessary because the kernel compresses the information with Zlib in order to fit all the details that are typically important for the analysis into the QR code, including a backtrace and the location in the kernel code where the error occurred.

Falempe has published the source code for such a web service. However, operating such a service naturally incurs costs that someone has to cover. Therefore, when building a kernel with QR code panic, the URL of the web service to be used must be explicitly defined; this then flows into the QR code so that any scanner can find its way there independently. It remains to be seen whether each distribution will operate such a web service itself or whether someone trustworthy will set up one that is accessible to all.
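The round trip between kernel and web service can be illustrated with a small sketch. It only demonstrates the principle of compressing text with zlib and carrying it in a URL; the base URL, the query parameter `z`, and the use of URL-safe Base64 are assumptions made for this example and do not reflect the actual encoding used by the kernel or by Falempe's service.

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical service address; a real kernel build has its own URL configured.
BASE_URL = "https://panic.example.org/report"


def build_panic_url(panic_text, base_url=BASE_URL):
    """Compress the panic text and append it to the service URL."""
    compressed = zlib.compress(panic_text.encode("utf-8"), 9)
    # URL-safe Base64 is an assumption of this sketch, not the kernel's format.
    payload = base64.urlsafe_b64encode(compressed).decode("ascii")
    return base_url + "?" + urlencode({"z": payload})


def decode_panic_url(url):
    """What a decoding service would do: extract, decode, and decompress."""
    query = parse_qs(urlsplit(url).query)
    compressed = base64.urlsafe_b64decode(query["z"][0])
    return zlib.decompress(compressed).decode("utf-8")


if __name__ == "__main__":
    text = "Kernel panic - not syncing: Attempted to kill init!\n" * 20
    url = build_panic_url(text)
    print(len(text), "characters of text,", len(url), "characters of URL")
    print(decode_panic_url(url) == text)
```

Because the service address is part of the encoded URL, any generic QR scanner can hand the payload to the right decoder without knowing anything about the format.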

The "blue screen of death" function of Systemd 255, which was released last December, does not need such a service because the text it contains is not normally compressed. This approach is aimed at other error situations anyway. One of these is that when booting, a Systemd contained in the Initramfs (sometimes also called Initrd) does not find the volume with the root file system. If the system configuration then prohibits the provision of an emergency shell for troubleshooting, the system and service manager can go neither forward nor back.

In such situations, the "systemd-bsod" service introduced with version 255 could then display a "blue screen of death" with a helpful error message. The alternative would be to give up and hand responsibility back to the kernel. However, the kernel would then be faced with the same dilemma or a subsequent problem. In such a case, it would therefore simply give up with errors such as "Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)" or "Kernel panic - not syncing: Attempted to kill init!".

Scenarios like this, coupled with the name of the Systemd service and superficial reporting, are the reason why the impression has been created in many places that Linux can now do blue screens. If the term "Linux" refers only to the kernel, this is wrong; if, on the other hand, it refers to operating systems built with the kernel and an up-to-date Systemd, this is halfway correct.

The conventional display of a kernel panic.

(Image: c't)

In the end, however, the distinction is important: As a userspace program, Systemd cannot handle real kernel panics at all, because userspace no longer comes into play with such panics; instead, the kernel only outputs some information for debugging in order to then deliberately and completely stop operation. This is in everyone's interest because malware, for example, could otherwise possibly use the situation to its advantage; it is also possible that something has gotten so badly mixed up that data or hardware could be damaged. This is the difference between a "kernel panic" and a "kernel oops". In the latter case, the kernel has also noticed a major error, but has intercepted it in such a way that it believes it can continue to work correctly for the most part. However, subsequent errors cannot be completely ruled out.

In the event of a panic, the kernel no longer responds to keyboard input due to the severity of the situation. To allow users to debug, it then displays as much of the probably relevant information as possible in a compact format; as on Windows, this looks rather cryptic and is limited by the screen space.

Compressed text in a QR code could therefore possibly be used to transport even more information, if the developers wanted to. But because screen space is limited and the kernel may no longer respond to input, users have to choose between classic output and QR code. This can be done via boot parameters or changes to the /sys/module/drm/parameters/panic_screen file.
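Switching the panic output at runtime comes down to writing a value into the parameter file named above. The following is a minimal sketch, assuming the mode names `user`, `kmsg`, and `qr_code`, which the article does not list; the helper functions are hypothetical, writing to the real file requires root privileges, and the boot parameter form simply follows the usual `module.parameter=value` convention.

```python
from pathlib import Path

# Parameter file named in the article.
PANIC_SCREEN_PARAM = Path("/sys/module/drm/parameters/panic_screen")

# Assumed mode names; check the documentation of the running kernel.
KNOWN_MODES = {"user", "kmsg", "qr_code"}


def read_panic_screen(param=PANIC_SCREEN_PARAM):
    """Return the currently selected panic screen mode."""
    return Path(param).read_text().strip()


def set_panic_screen(mode, param=PANIC_SCREEN_PARAM):
    """Select a panic screen mode by writing it to the parameter file."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown panic screen mode: {mode!r}")
    Path(param).write_text(mode + "\n")


def boot_parameter(mode):
    """Build the equivalent kernel command line argument."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown panic screen mode: {mode!r}")
    return f"drm.panic_screen={mode}"


if __name__ == "__main__":
    print(boot_parameter("qr_code"))
    if PANIC_SCREEN_PARAM.exists():
        print("current mode:", read_panic_screen())
```

The `param` argument exists so the functions can be tried against an ordinary file before touching the real parameter.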

If desired, distributions can also configure the solution so that, by default, only a vague indication of a critical error appears, prompting a reboot. This is currently the plan for Fedora Linux 42, which is expected in April 2025 and will use drm_panic according to current plans; Falempe is once again a driving force behind this.

Because the kernel stops working entirely in a panic, it makes sense that Systemd can also output more details about a problem via QR code in the hopeless situations it handles itself. After all, if Systemd can still display these, the kernel is still fully intact. This allows Systemd-Bsod to react much more flexibly to the situation and, in principle, even interact with the user. Currently, the two solutions differ visually, making it easy to tell them apart; in any case, distributions can configure background colors other than blue for both approaches if they wish.

(mma)