Linux Kdump for system panics
The kdump procedure
The received warning means that the kdump operation might fail and that the crashkernel parameter should be configured correctly. The kdump procedure works as follows:
1. The normal kernel is booted with crashkernel=… as a kernel option, reserving some memory for the kdump kernel. The memory reserved by the crashkernel parameter is not available to the normal kernel during regular operation. It is reserved for later use by the kdump kernel.
2. The system panics.
3. The kdump kernel is booted using kexec; it uses the memory area that was reserved with the crashkernel parameter.
4. The normal kernel’s memory is captured into a vmcore.
Note: Not reserving enough memory for the kdump kernel can lead to the kdump operation failing.
Configuring crashkernel
RHEL6.0 and RHEL6.1 kernels
RAM size | crashkernel parameter | RAM / crashkernel factor |
---|---|---|
>0GB | 128MB | 15 |
>2GB | 256MB | 23 |
>6GB | 512MB | 15 |
>8GB | 768MB | 31 |
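The table above can be expressed as a small lookup, which may be handy when preparing many RHEL6.0/6.1 hosts. This is only a sketch: recommend_crashkernel is a hypothetical helper name (not a RHEL tool), and it assumes the RAM size is passed as a whole number of GB.

```shell
#!/bin/sh
# Hypothetical helper: map a RAM size (whole GB) to the crashkernel value
# listed in the RHEL6.0/6.1 table. The thresholds are strict ">" comparisons,
# exactly as in the table.
recommend_crashkernel() {
    ram_gb=$1
    if [ "$ram_gb" -gt 8 ]; then
        echo "crashkernel=768M"
    elif [ "$ram_gb" -gt 6 ]; then
        echo "crashkernel=512M"
    elif [ "$ram_gb" -gt 2 ]; then
        echo "crashkernel=256M"
    elif [ "$ram_gb" -gt 0 ]; then
        echo "crashkernel=128M"
    else
        echo "invalid RAM size: $ram_gb" >&2
        return 1
    fi
}
```

For example, recommend_crashkernel 4 selects the ">2GB" row. The result is a starting point only; the reservation should still be verified by testing kdump.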
RHEL6.2 (and later) kernels
Starting with RHEL6.2 kernels, crashkernel=auto should be used. The kernel will automatically reserve an appropriate amount of memory for the kdump kernel. crashkernel is configured in the file /etc/grub.conf.
Keep in mind that this is an algorithmically calculated memory reservation and might not meet the needs of all systems (especially configurations with many I/O cards and loaded drivers). Always verify that the memory reserved by crashkernel=auto is sufficient for the target machine by testing kdump. If it is not, reserve more memory with the syntax crashkernel=XM (where X is the amount of memory to reserve, in megabytes).
Additionally some improvements have been made in the RHEL6.2 kernel which have reduced the overall memory requirements of kdump.
Note: It is recommended to test and verify that kdump is working on all systems after installation of all applications. The memory reserved by crashkernel=auto takes only typical RHEL configurations into account. Some hardware and larger configurations with many option cards may not work well with crashkernel=auto; in this case, crashkernel=512M or more may be a reasonable size to start with.
Prior to the 6.3GA release, crashkernel=auto only reserves memory on systems with 4GB or more physical memory. If the system has less than 4GB of memory, the memory must be reserved by explicitly requesting the reservation size, for example: crashkernel=128M. Since the 6.3GA release (kernel-2.6.32-279.el6), this limit has been lowered to 2GB.
Some environments still require manual configuration of the crashkernel option, for example if dumps to very large local filesystems are performed.
RHEL 7
crashkernel is configured in the GRUB_CMDLINE_LINUX line in /etc/default/grub:
GRUB_TIMEOUT=5
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel00/root rd.lvm.lv=rhel00/swap rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
After modifying /etc/default/grub, regenerate the GRUB2 configuration using the edited default file. If your system uses BIOS firmware, execute the following command:
# grub2-mkconfig -o /boot/grub2/grub.cfg
On a system with UEFI firmware, execute the following instead:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
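Choosing between the two output paths can be scripted. The sketch below relies on the fact that Linux exposes /sys/firmware/efi only when booted through UEFI; grub_cfg_path is a hypothetical helper name, and its optional argument exists only so the check can be exercised against another directory.

```shell
#!/bin/sh
# Hypothetical helper: print the grub.cfg path matching the firmware type.
# $1 (optional): directory whose presence indicates UEFI boot;
#                defaults to /sys/firmware/efi.
grub_cfg_path() {
    efi_dir=${1:-/sys/firmware/efi}
    if [ -d "$efi_dir" ]; then
        # UEFI firmware
        echo /boot/efi/EFI/redhat/grub.cfg
    else
        # BIOS firmware
        echo /boot/grub2/grub.cfg
    fi
}
```

With that in place, the regeneration step could be run as grub2-mkconfig -o "$(grub_cfg_path)" as root.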
Note: RHEL7 with crashkernel=auto only reserves memory on systems with 2GB or more physical memory. If the system has less than 2GB of memory, the memory must be reserved by explicitly requesting the reservation size, for example: crashkernel=128M. Some environments still require manual configuration of the crashkernel option. For example, on a 2TB Oracle database host, kdump failed after a crash until the setting was changed to crashkernel=2048M, after which it worked fine.
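The memory threshold for crashkernel=auto can be checked before relying on it. The following is a minimal sketch, assuming the physical RAM size is supplied in MB and that "2GB" means 2048 MB; auto_will_reserve is a hypothetical helper name. The second argument allows the 4GB limit of RHEL6 releases before 6.3GA to be checked the same way.

```shell
#!/bin/sh
# Hypothetical helper: succeed if crashkernel=auto is expected to reserve
# memory on a system with the given amount of RAM.
# $1: physical RAM in MB
# $2: minimum RAM in GB required by crashkernel=auto (default 2, as on RHEL7)
auto_will_reserve() {
    ram_mb=$1
    min_gb=${2:-2}
    [ "$ram_mb" -ge $((min_gb * 1024)) ]
}
```

If the check fails, set an explicit value such as crashkernel=128M instead of crashkernel=auto.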
How much memory was reserved for the kdump kernel?
This can be checked by executing cat /proc/cmdline. Even when the kernel was started with crashkernel=auto, /proc/cmdline will contain the computed value that was reserved. The reserved size in bytes can also be read from /sys/kernel/kexec_crash_size.
# cat /proc/cmdline
# cat /sys/kernel/kexec_crash_size
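The output of these two commands can be made easier to read with a little post-processing. This is a sketch only: crashkernel_arg and bytes_to_mib are hypothetical helper names, and the second one assumes that kexec_crash_size reports the reservation in bytes.

```shell
#!/bin/sh
# Hypothetical helper: print the crashkernel=... token found in a kernel
# command line string (for example the content of /proc/cmdline).
crashkernel_arg() {
    for word in $1; do
        case $word in
            crashkernel=*) echo "$word"; return 0 ;;
        esac
    done
    return 1   # no crashkernel option present
}

# Hypothetical helper: convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}
```

Typical usage would be crashkernel_arg "$(cat /proc/cmdline)" and bytes_to_mib "$(cat /sys/kernel/kexec_crash_size)".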
Where to find generated vmcores
When the hypervisor comes back up, the vmcore can be found by default under /var/crash/ (on SLES, the default is /var/log/dump):
# egrep '^path' /etc/kdump.conf
path /var/crash
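The same lookup can be wrapped so that it falls back to the default when no path line is set. This is a minimal sketch; kdump_path is a hypothetical helper name, and the /var/crash fallback is the RHEL default mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the vmcore directory configured in kdump.conf,
# or /var/crash when no uncommented "path" line exists.
# $1 (optional): configuration file, defaults to /etc/kdump.conf.
kdump_path() {
    conf=${1:-/etc/kdump.conf}
    # Take the second field of the first line whose first field is "path".
    path=$(awk '$1 == "path" { print $2; exit }' "$conf" 2>/dev/null) || path=""
    echo "${path:-/var/crash}"
}
```

For example, ls -l "$(kdump_path)" lists the dumps collected so far.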
Use the crash tool together with the matching kernel-debuginfo package to analyze the vmcore.
How to test that kdump works
# echo c > /proc/sysrq-trigger
Note: This command crashes the operating system immediately. Do not run this test on a system that is in production use.
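Before triggering the crash, it is worth confirming that a kdump kernel is actually loaded; otherwise the system panics without producing a vmcore. The kernel reports this in /sys/kernel/kexec_crash_loaded (1 means loaded). The sketch below wraps that check; kdump_ready is a hypothetical helper name, and its optional argument exists only so the check can be run against another file.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a crash (kdump) kernel is loaded.
# $1 (optional): status file, defaults to /sys/kernel/kexec_crash_loaded.
kdump_ready() {
    loaded_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$loaded_file" ] && [ "$(cat "$loaded_file")" = "1" ]
}
```

A guarded test run could then look like kdump_ready && echo c > /proc/sysrq-trigger, executed as root on a test machine.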