Do Libreboot computers crash under heavy load?

5 replies [Last post]
tonlee
Offline
Joined: 09/08/2014

I have a Libreboot Gigabyte GA-G41M-ES2L and a Lenovo T400. I noticed that
both would occasionally crash, turning off within a second. Both appeared to
crash under heavy load, for instance when playing YouTube videos.
Is this something Libreboot computers do? I once read Leah
saying something about a heat sensor stuck at a low trigger temperature. Is
that what causes the crashes? Because I never knew when the Libreboot
computers would crash, I stopped using them. Thank you.

jxself
Offline
Joined: 09/13/2010

I don't have that problem on my GA-G41M-ES2L. Of course, I've moved on to GNU Boot. Exactly what CPU do you have in yours?

tonlee
Offline
Joined: 09/08/2014

On the Lenovo T400, the command cat /proc/cpuinfo reports an Intel Core 2 P8600 at 2.40 GHz. I do not have the Gigabyte computer at hand.
If you ask Leah, she will tell you GNU Boot is inferior to Libreboot.

GNUser
Offline
Joined: 07/17/2013

FWIW, I have a T400 bought many years ago from Minifree.
Lately it does have a tendency to lock up or crash under heavy load, yes. BUT not always.
I don't think it has to do with Libreboot. More likely the RAM is no longer healthy, and under heavy load the system occasionally touches some of the bad memory regions, causing the crash.
Again, I didn't investigate this further.

Sometimes I can still reach xkill (by pressing Alt+F2 and typing xkill, albeit slowly) and kill the browser, which seems to bring things back to normal. But most of the time I have to reboot the machine.

Magic Banana

Offline
Joined: 07/24/2010

You can test your RAM over one full night after installing "memtest86+" (in Trisquel's repository). Memtest86+ is launched from the menu of GRUB, the bootloader, which lists the installed operating systems right after the computer switches on. You may have to press [Esc] to make GRUB's menu appear. If memtest86+'s blue screen turns red, your RAM is defective. Instead of installing memtest86+, you can test your RAM with any live ISO that includes memtest86+.

Goat_Avenger
Offline
Joined: 03/24/2020

I had a GA-G41M-ES2L running coreboot (a libre build) at one point. It frequently crashed or froze, I can't remember which. It started happening more and more frequently, until I got fed up with it; it was quite disappointing, as it was otherwise a nice setup.

In the end I concluded, I believe, that the hardware was simply old: all of these boards were old years ago, and now they are even older. That makes failing capacitors a possibility, which would mean unsteady power delivery to virtually any part of the board. I'm not a hardware guru, but that was my conclusion at the time. I wasn't able to recap the board to see if my theory was correct.

Another possibility is that certain CPUs, running without microcode updates, might snag on errata that the microcode would otherwise fix. Though I believe I tested my board with and without microcode updates at the time, and the freezes/crashes remained.

So the short answer, in my experience, is _yes_: I've had various boards running coreboot that froze on me at times, and I never figured out why. I have two identical laptop models right now, both running coreboot. One of them never gives me problems, the other has been freezing lately. So it's likely either a hardware issue or a software issue with one particular board's configuration (CPU/MOBO model, etc.).

Generally coreboot is going to run pretty solid on boards that are well maintained and have good ports, but always keep in mind that it's a community effort, and motherboards almost always have multiple revisions. So a port may work great on the revision it was developed on, while other revisions of the same board that were never tested may turn out to be outliers.

People often say to test your RAM, but having run into these kinds of problems before, I'm inclined to believe it has more to do with slightly errant motherboard revisions or general hardware deterioration. I've also had a board with RAM that failed memtest while the machine ran rock solid regardless. My guess was that certain memory had been reserved by the BIOS and simply wasn't accessible to memtest, thus it failed; but that was just a guess.

You could test your CPU with microcode updates and see if the crashes/freezes still occur. That would rule out the CPU misbehaving. After that you could test your RAM or try different RAM. And if the problems still persist, you could try to build a newer version of coreboot yourself with no blobs.
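To see whether a microcode update actually took effect, one option is to compare the revision the kernel reports before and after. Here is a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo carries "model name" and "microcode" fields; the function names are mine, purely for illustration.

```python
def parse_cpuinfo(text):
    """Return a list of dicts, one per logical CPU, from /proc/cpuinfo-style text."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates the entries of two logical CPUs.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def microcode_revisions(text):
    """Return the set of (model name, microcode revision) pairs found."""
    return {
        (cpu.get("model name", "unknown"), cpu.get("microcode", "not reported"))
        for cpu in parse_cpuinfo(text)
    }


def report(path="/proc/cpuinfo"):
    """Print one line per distinct (model, microcode revision) pair."""
    with open(path) as f:
        for model, revision in sorted(microcode_revisions(f.read())):
            print(f"{model}: microcode {revision}")
```

Calling report() once without the update and once with it should show whether the revision changed. If it did not, the update was probably never applied.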

If it's not a hardware deterioration issue, however, I suspect the following.

Try booting with the following kernel parameter: [ intel_idle.max_cstate=0 ]

I know I had one specific ThinkPad where booting with that kernel parameter stopped the system from occasionally freezing while idling (on that particular T520, any C-state above 2 was problematic, I believe). My theory is that the coreboot code didn't mesh well with that particular board's power delivery setup: once the CPU was abruptly awakened from a deep-sleep C-state, it would request more power than it could receive that quickly, and would hiccup and freeze as a result.
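After rebooting, it's worth confirming the parameter really reached the kernel. A small sketch, assuming the running kernel's command line is read from /proc/cmdline; the helper names are made up for this post, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Map each kernel parameter to its value (None for bare flags like 'quiet')."""
    params = {}
    # Simple whitespace split: does not handle quoted values with spaces.
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def max_cstate_setting(cmdline):
    """Return the intel_idle.max_cstate value as an int, or None if it is not set."""
    value = parse_cmdline(cmdline).get("intel_idle.max_cstate")
    return int(value) if value is not None else None


def check_running_kernel(path="/proc/cmdline"):
    """Print what the currently running kernel was booted with."""
    with open(path) as f:
        setting = max_cstate_setting(f.read())
    if setting is None:
        print("intel_idle.max_cstate is not on the kernel command line")
    else:
        print(f"intel_idle.max_cstate={setting}")
```

If check_running_kernel() says the parameter is missing, the bootloader configuration was likely not regenerated after the edit.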

Another set of kernel parameters you can try are the following:
[thermal.psv=80] <-- tells the kernel to start thermal throttling at the specified temperature (an integer, in degrees Celsius)
[thermal.tzp=15] <-- tells the kernel how often to poll for a temperature reading for thermal throttling, in deciseconds (in this case every 1.5 seconds).
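To judge whether 80 is a sensible threshold, it helps to watch what the thermal zones report while the machine is under load. A rough sketch, assuming the usual Linux sysfs layout where /sys/class/thermal/thermal_zone*/temp holds millidegrees Celsius; the function names are illustrative only, and thermal.tzp is taken to be in deciseconds (tenths of a second).

```python
from pathlib import Path


def tzp_to_seconds(tzp):
    """Convert a thermal.tzp value (deciseconds) to seconds."""
    return tzp / 10


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone name: degrees Celsius} for every readable thermal zone."""
    temps = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            millidegrees = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # Skip zones that refuse reads or return non-numeric data.
        temps[temp_file.parent.name] = millidegrees / 1000
    return temps


def zones_at_or_above(temps, psv):
    """List the zones whose reading has reached the given passive trip point."""
    return [zone for zone, celsius in temps.items() if celsius >= psv]
```

Running zones_at_or_above(read_zone_temps(), 80) in a loop during a YouTube video would show whether the machine gets anywhere near the threshold before it dies. If the readings stay low right up to the crash, heat is a less likely culprit.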

Since you stated you run into issues under heavy load, it's possible that you are running into thermal issues, OR that you are snagging on microcode errata for your specific CPU. If it's neither of those, it could be a C-state/power delivery issue between the board and the code running on it. If not that, a memory issue. And lastly, if not that, the hardware is simply old and deteriorating.

My apologies for the exhaustive post, but in short, yes, these sorts of things have been a common occurrence for me with coreboot, and my theory is that it's usually a problem at the intersection of the coreboot code and the hardware. Though in some cases it may have been an issue of old hardware.