FATAL Out of memory error on boot - GRUB-EFI problem

    [RESOLVED] FATAL Out of memory error on boot - GRUB-EFI problem

    I'm actually kind of surprised no one else here has reported this, because it can be bad if you run into it.

    The symptom: at boot you get the "Welcome to GRUB" and "GRUB Loading" messages, followed by "Out of memory error...", and your only option is to force a power off.

    If you're lucky, a previous kernel might boot. If not, you're left booting to a live USB and chroot-ing into your install to fix your kernel. Like I said, this can be kinda bad.

    The relevant conditions when I hit this:
    4K monitor
    nVidia driver in use
    initrd.img larger than 100MB
    EFI booting

    TL;DR: there's a bug in GRUB-EFI that causes it to run out of memory if your GRUB_GFXMODE is too large. The basic solution is to set it to 640x480 in /etc/default/grub and then run update-grub.
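
    For anyone who would rather script the edit than do it by hand, here is a minimal sketch of the GRUB_GFXMODE change. The helper name set_gfxmode is made up for illustration and is not part of GRUB or any distro tooling. It only transforms text, so it can be tried on a copy of the file before touching the real one.

```python
import re

# Hypothetical helper for illustration; not part of GRUB or any distro tooling.
def set_gfxmode(config_text: str, mode: str = "640x480") -> str:
    """Return config_text with GRUB_GFXMODE set to mode."""
    new_line = f"GRUB_GFXMODE={mode}"
    active = re.compile(r"^[ \t]*GRUB_GFXMODE=.*$", re.MULTILINE)
    commented = re.compile(r"^[ \t]*#[ \t]*GRUB_GFXMODE=.*$", re.MULTILINE)

    if active.search(config_text):
        # Rewrite every active assignment so a later one cannot override it.
        return active.sub(new_line, config_text)
    if commented.search(config_text):
        # Reuse the first commented-out example line.
        return commented.sub(new_line, config_text, count=1)

    # No existing line at all: append one, keeping the file newline-terminated.
    sep = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return f"{config_text}{sep}{new_line}\n"
```

    The result still has to be written back to /etc/default/grub as root, and the change only takes effect after running sudo update-grub.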

    The other factors I listed above contribute to the problem and I had mixed results with various combinations.
    1. The nvidia driver makes initrd.img much larger
    2. The default initramfs.conf has "MODULES=" set to "most"
    3. The default initramfs.conf has "COMPRESS=" set to "zstd"
    4. The default screen resolution of my laptop is 4K, which is a very large framebuffer

    My initial encounter left all my kernels unbootable. The initrd.img files for all 3 kernels were 110MB+. Luckily I had a snapshot with an older kernel that would boot.
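
    A quick way to see whether a system is near the danger zone is to list the initrd.img sizes in /boot. The sketch below assumes the usual initrd.img-<version> naming and uses the 100MB figure from this thread as the threshold. The function names are made up for illustration, and the sizes are reported in MiB, which is close enough for a rough check.

```python
from pathlib import Path

# Hypothetical helpers for illustration only.
def initrd_sizes(boot_dir: str = "/boot") -> dict[str, float]:
    """Map each initrd.img-* file in boot_dir to its size in MiB."""
    sizes = {}
    for path in sorted(Path(boot_dir).glob("initrd.img-*")):
        if path.is_file():
            sizes[path.name] = path.stat().st_size / (1024 * 1024)
    return sizes

def oversized(sizes: dict[str, float], limit_mib: float = 100.0) -> list[str]:
    """Return the names of the images larger than limit_mib."""
    return [name for name, mib in sizes.items() if mib > limit_mib]

if __name__ == "__main__":
    sizes = initrd_sizes()
    too_big = set(oversized(sizes))
    for name, mib in sizes.items():
        flag = "  <-- over 100 MiB" if name in too_big else ""
        print(f"{name}: {mib:.1f} MiB{flag}")
```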

    My research pointed to an oversized initrd.img as the probable cause. So I changed compression to "gzip" and modules to "dep".

    My testing showed very little change when changing compression types, but setting modules to "dep" (try to load only what's needed) instead of "most" (load everything) reduced the initrd.img size by about 50MB.
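
    Before changing anything it helps to check what initramfs.conf is currently set to. This is a rough sketch that reads the KEY=VALUE lines and flags MODULES=most, the setting that made the difference in the testing above. The function names are hypothetical, and the file is assumed to be the standard one, usually at /etc/initramfs-tools/initramfs.conf on Debian and Ubuntu based systems.

```python
# Hypothetical helpers for illustration only.
def read_settings(conf_text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings

def size_warnings(settings: dict[str, str]) -> list[str]:
    """Flag the setting this thread found to inflate initrd.img."""
    warnings = []
    if settings.get("MODULES") == "most":
        warnings.append('MODULES=most includes most modules; "dep" is smaller')
    return warnings

if __name__ == "__main__":
    sample = "# sample input\nMODULES=most\nCOMPRESS=zstd\n"
    parsed = read_settings(sample)
    print(parsed)
    print(size_warnings(parsed))
```

    After editing the file, the initrd.img has to be regenerated, for example with sudo update-initramfs -u -k <version> for a single kernel.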

    This allowed the -42 and -48 kernels to boot, but the -52 initrd.img was still just over 100MB, more than twice as large as the others.

    This was where the nvidia driver came into play. I had not loaded the nvidia driver in the two older kernels. I had been reconfiguring and updating this laptop and had previously just been using the Intel video rather than nVidia.

    So I was faced with removing the nVidia driver, which was possible as I don't game on this machine. But I like having options so I decided to continue digging.

    Finally I found a comment regarding the screen size and GRUB. Apparently the 4K graphics mode eats half of the roughly 200MB of RAM in GRUB's allotment. Thus any initrd.img larger than 100MB won't load.
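
    To get a feel for why the graphics mode matters, here is some back-of-the-envelope arithmetic for one raw framebuffer. It assumes 4 bytes per pixel (32-bit color). How many buffers GRUB actually allocates is not stated in this thread, so this does not reproduce the 100MB figure. It only shows the relative cost of the two modes.

```python
def framebuffer_mib(width: int, height: int, bytes_per_pixel: int = 4) -> float:
    """Size of one raw framebuffer in MiB (assumes 32-bit color by default)."""
    return width * height * bytes_per_pixel / (1024 * 1024)

if __name__ == "__main__":
    modes = {"3840x2160": (3840, 2160), "640x480": (640, 480)}
    for label, (w, h) in modes.items():
        print(f"{label}: {framebuffer_mib(w, h):.1f} MiB per buffer")
```

    One 4K buffer works out to about 31.6 MiB against about 1.2 MiB at 640x480, a factor of 27, so every extra buffer at 4K is expensive.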

    Setting GRUB_GFXMODE=640x480 in grub's defaults seems to have resolved the issue for now. This bug is widely known and I am confident there will be a fix fairly soon. For now, I'll just have to tolerate a GIANT GRUB MENU.
    Last edited by oshunluvr; Nov 02, 2022, 07:26 AM.


    #2
    So, if I read this correctly, the evidence shows that - with the same nvidia driver - we have seen kernel bloat to the point where its size is interfering with the booting of the system (k-42 and k-48 boot, k-52 doesn't, using the current NV driver, I assume). Is there some technical reason GRUB is limited to 200MB?

    It does sort of beg the question: what was the point of moving away from a monolithic design if the kernel is now configured to simply load all the modules? Doesn't it surprise you a bit?


      #3
      I think it's been determined to be a bug in GRUB rather than intentional. I suspect no one 10 years ago thought the initrd.img would get so large. nVidia has never played well with Linux - which is why I started buying AMD cards instead. This is a laptop so I was rather stuck with nVidia. I will probably go back to the Intel chip anyway as it seems to work fine and uses less battery. We'll see when my external monitor gets here for the laptop.

      As far as the initramfs "bloat" - let's remember we're talking about MBs not GBs. Anyway, modules default to "most" because this keeps the install more transferable. In other words, if you select "dep" you may not have support for something you add to your system later, or your install might fail to boot after being moved to a different system. It seems easy enough to simply change to "dep". Test the results on a single kernel to be sure it still boots. Then if all is OK, leave it that way. My usual habit is to keep at least one old kernel that I know is bootable. This caught me off guard so I thought it was worth posting about.

      Clearly though, possibly not being able to use the nVidia driver is a show stopper for lots of people.

      I'll be surprised if there isn't some wailing about how ugly GRUB is at 640x480 on a 4K screen, LOL. I never understood the need for the 3 seconds of screen time that GRUB spends on my systems to be "artistic." I'm fine with it just "working."
      Last edited by oshunluvr; Nov 02, 2022, 07:24 AM.
