Understanding kernel panic in Linux
What is a kernel panic?
A kernel panic during startup is one of several Linux boot issues. In basic terms, it is a situation in which the kernel cannot finish starting up, so the system fails to boot. During the boot process, the kernel does not work alone: the boot loader loads both the kernel (vmlinuz) and an initramfs image into RAM, and the kernel uses the initramfs as a temporary root filesystem until it can mount the real one. If the initramfs is corrupted or deleted because of recent OS patching, updates, or other causes, the kernel cannot mount the root filesystem and we face a kernel panic.
When a Linux system boots, GRUB is loaded after the Master Boot Record (MBR) step. The kernel must be in RAM to start the OS, but it resides on the hard disk (/boot/vmlinuz), and the root filesystem is not yet mounted on /. The drivers needed to reach that filesystem (storage, LVM, RAID, and filesystem modules) are themselves stored on it. To overcome this, GRUB reads the kernel and the initramfs/initrd from /boot and loads both into RAM. The kernel unpacks the initramfs, loads the required drivers from it, mounts the real root filesystem on / (initially in read-only mode), and the process continues.
This process emphasizes the importance of initramfs/initrd in the Linux boot process.
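On a running system, the kernel command line that GRUB passed at boot can be read from /proc/cmdline. It typically records which kernel image was loaded (BOOT_IMAGE=) and which device is mounted as / (root=). The following is a minimal sketch for pulling one parameter out of that line; the helper name cmdline_param is hypothetical and not part of any standard tool.

```shell
# cmdline_param: print the value of one KEY=VALUE parameter from a kernel
# command line string. Hypothetical helper, for illustration only.
#   $1 = parameter name (for example: root)
#   $2 = command line text (defaults to the contents of /proc/cmdline)
cmdline_param() {
    key="$1"
    line="${2:-$(cat /proc/cmdline)}"
    for word in $line; do
        case "$word" in
            "$key"=*)
                # Strip the "KEY=" prefix and print the rest.
                printf '%s\n' "${word#"$key"=}"
                return 0
                ;;
        esac
    done
    # Parameter not present on the command line.
    return 1
}
```

Running cmdline_param root with no second argument reads the live /proc/cmdline and prints the device the kernel was told to mount as /.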
Figure: Boot process mechanism
Why does a kernel panic occur?
Kernel panics occur:
If the initramfs file gets corrupted.
If initramfs is not created properly for the specified kernel. Every kernel version has its own corresponding initramfs.
If the installed kernel is not supported or not installed correctly.
If recent patches have some flaws.
If a module has been installed from an online or other source, but the initrd image has not been rebuilt to include the newly installed module.
How to troubleshoot?
The first thing to do after seeing a kernel panic error is not to panic, because you now know which image file the error relates to.
Step 1: Boot the system normally with your given kernel version.
Press Enter or any key to display the boot menu, then select the rescue entry.
In RHEL 6 or earlier versions, we do not have this option, but in RHEL 7 and onwards, we have a built-in rescue image. This image boots your OS normally.
Step 2: Go to /boot and list all files. You will see that there is no initramfs file for your kernel. There is one initramfs file for the rescue image, which you used to boot the system, and another for kdump.
The initramfs for the kernel is missing.
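The check in Step 2 can also be scripted. The sketch below assumes the RHEL naming convention, in which the kernel file vmlinuz-<version> pairs with initramfs-<version>.img in the same directory. The function name missing_initramfs is hypothetical.

```shell
# missing_initramfs: list kernel versions in a boot directory that have no
# matching (non-empty) initramfs image. Hypothetical helper; assumes the
# RHEL layout vmlinuz-<version> / initramfs-<version>.img.
#   $1 = boot directory (defaults to /boot)
missing_initramfs() {
    bootdir="${1:-/boot}"
    for kernel in "$bootdir"/vmlinuz-*; do
        # If the glob matched nothing, the literal pattern remains; skip it.
        [ -e "$kernel" ] || continue
        # Derive the version by removing the directory and "vmlinuz-" prefix.
        version="${kernel##*/vmlinuz-}"
        # -s is true only if the file exists and is larger than zero bytes.
        if [ ! -s "$bootdir/initramfs-$version.img" ]; then
            printf '%s\n' "$version"
        fi
    done
}
```

Calling missing_initramfs with no argument inspects /boot and prints one kernel version per line. No output means every kernel found has an image.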
Step 3: You will need to create a new initramfs file that corresponds to your kernel version.
Step 4: First, check your kernel version:
# uname -r
Step 5: Next, run the dracut command:
# dracut -f <initrd-image> <kernel-version>
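As a concrete illustration of Step 5, the sketch below builds the dracut command line for a given kernel version, assuming the RHEL image naming /boot/initramfs-<version>.img. The function name dracut_command is hypothetical. It only prints the command, so it can be reviewed before being run as root.

```shell
# dracut_command: print the dracut invocation that regenerates the initramfs
# for one kernel version. Hypothetical helper; it prints the command and
# does not run it.
#   $1 = kernel version (defaults to the running kernel, from uname -r)
dracut_command() {
    version="${1:-$(uname -r)}"
    # -f tells dracut to overwrite the image if it already exists.
    printf 'dracut -f /boot/initramfs-%s.img %s\n' "$version" "$version"
}
```

When the system was booted from the rescue entry, uname -r reports the version of the rescue kernel, which may differ from the kernel whose image is missing. In that case, pass the intended version explicitly, for example one taken from the vmlinuz-* file names in /boot.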
Step 6: List the /boot directory contents again. The initramfs file for the kernel is now created.
Step 7: Now, when you boot normally, your machine starts without a kernel panic error.
Step 8: In some cases, after booting from the rescue image, you cannot create a new initramfs file because a file with that name already exists (for example, a corrupted one).
Step 9: In that case, recreate the initramfs image with the mkinitrd or dracut command, forcing it to overwrite the existing file.
Step 10: Check your kernel version first using the uname -r command.
Run the mkinitrd command with the --force option and your kernel specification:
# mkinitrd --force <initrd-image> <kernel-version>
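Because --force (like dracut -f) overwrites the existing image, it can be prudent to keep a copy first. This is an added precaution rather than part of the procedure above. The function name backup_initramfs and the .bak suffix are assumptions made for illustration.

```shell
# backup_initramfs: copy an existing initramfs image to <image>.bak before
# it is overwritten. Hypothetical helper; the .bak suffix is an arbitrary
# choice.
#   $1 = path to the initramfs image
backup_initramfs() {
    image="$1"
    if [ ! -e "$image" ]; then
        echo "nothing to back up: $image" >&2
        return 1
    fi
    # -p preserves mode and timestamps (and ownership where permitted).
    # On success, print the path of the backup copy.
    cp -p "$image" "$image.bak" && printf '%s\n' "$image.bak"
}
```

The copy lives next to the original, so check that /boot has enough free space before using it.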
To Summarize:
A kernel panic is a critical error that occurs in the Linux kernel when it encounters a situation from which it cannot recover. This can happen for a variety of reasons, such as hardware failure, a software bug, a corrupted file system, or a misconfigured kernel module. When a kernel panic occurs, the Linux kernel detects the error and immediately stops all processes and activities. It then prints a message on the console indicating the cause of the panic, along with diagnostic information such as a stack trace, register values, and system memory usage.
The purpose of the kernel panic is to prevent further damage to the system by stopping any processes that may be running incorrectly or causing the error. It also provides valuable information to system administrators or developers to diagnose and fix the issue.