Lenovo X200 anomalous CPU usage

5 replies [Last post]
jfw01
Offline
Joined: 02/01/2022

I have a Minifree X200 running Trisquel.

user@user-ThinkPad-X200:~$ uname -a
Linux user-ThinkPad-X200 5.4.0-137-generic #154+10.0trisquel11 SMP Sun Jan 15 01:27:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
user@user-ThinkPad-X200:~$

I reached a state where the GUI, Pluma and two terminal windows were running, one of them with top -d 10 sorted by CPU usage.

The overall CPU usage was 45% user and 10% system. The fan speed and heat output are consistent with this.

The top processes were Xorg 5% CPU and caja 4% CPU, with a few others, totalling maybe 15%.

How do I investigate CPU usage that does not show in top?

jfw01
Offline
Joined: 02/01/2022

This slightly silly analysis procedure:
diff <(ps -eo cputimes,pid,comm | sort -n) <(sleep 100; ps -eo cputimes,pid,comm | sort -n)
does not help. Whatever is consuming the anomalous CPU time shows up equally little in ps and in top.

I tried sorting by pid on the theory that something was forking a lot, so the anomalous cpu-time was consumed by short-lived processes. I have not caught one yet.
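One way to test the short-lived-process theory without catching a child in the act: when a parent waits for a child that has exited, the kernel adds the child's CPU time to the parent's cutime and cstime counters, which are fields 16 and 17 of /proc/<pid>/stat. A minimal sketch, assuming the forking parent actually reaps its children (orphans get charged to whichever process adopts them); the function names are made up for illustration.

```shell
# Read /proc/<pid>/stat lines on stdin and print "pid child_ticks comm",
# where child_ticks = cutime + cstime (CPU ticks of reaped children).
child_ticks_of() {
    awk '{
        line = $0
        # comm is wrapped in parentheses and may contain spaces or ")",
        # so locate the LAST ")" before counting fields
        close_at = 0
        for (i = length(line); i > 0; i--)
            if (substr(line, i, 1) == ")") { close_at = i; break }
        open_at = index(line, "(")
        pid = substr(line, 1, index(line, " ") - 1)
        comm = substr(line, open_at + 1, close_at - open_at - 1)
        # after ") " the fields start at field 3 (state), so fields 16 and 17
        # of the full line are elements 14 and 15 here
        split(substr(line, close_at + 2), f, " ")
        print pid, f[14] + f[15], comm
    }'
}

# Show the N processes (default 10) with the most reaped-child CPU time.
rank_by_child_ticks() {
    for stat in /proc/[0-9]*/stat; do
        cat "$stat" 2>/dev/null      # a process may exit while we iterate
    done | child_ticks_of | sort -k2,2 -rn | head -n "${1:-10}"
}
```

Running rank_by_child_ticks twice, some seconds apart, and comparing the second column should show which parent's child time is growing, if any.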

jfw01
Offline
Joined: 02/01/2022

I think that there was a bootrom regrade about three weeks ago, or at least there was a big update on shutdown, which is unusual.

My current silly theory is that someone has half-arsed a hypervisor that's borrowing my battery power to mine bitcoins.

I'm open, but not hopeful, for stories about how I would test that.

jfw01
Offline
Joined: 02/01/2022

The laptop came with flashrom and a bunch of other tools. I have read the bootrom and it is apparently unchanged since the last time I wrote it. Anyone who wrote a hypervisor that disguised a bootrom change would also disguise the cpu statistics, so I think that I've eliminated that theory.

I have two further anomalies.

1) If the cpu-usage anomaly does not trigger, then the baseline cpu-usage on an idle machine is about 2% plus 2% for each web browser. If there is a top running in virtual console 1 then, after a logout/login, the anomaly does not obviously trigger.

2) There appears to be cooperative tasking in the GUI user environment. A caja instance hung, and the machinery that paints the desktop background also stopped operating (I got a black background plus copies of a few closed windows, probably maintained in save-unders). During the hang, there was no obvious anomalous CPU usage. When I force-quit the caja instance, desktop painting resumed, but the CPU anomaly did not appear.

(2) could be a subset of (1) if I had a top running in a window, and the anomaly does not distinguish that from top running in a virtual console.

I'm still on the question(s): what are the ways in which significant user-mode cpu usage can be invisible to top's process list, and to ps, and how would I investigate them?
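A rough way to put a number on the invisible part is to compare the system-wide busy ticks in /proc/stat with the sum of utime + stime over the processes that are alive at that moment. This is only a sketch: the function names are invented, and a process that exits between two samples takes its ticks out of the per-process sum, so the result is an estimate rather than an exact figure.

```shell
# Busy ticks (user + nice + system) from the aggregate "cpu" line of a
# /proc/stat-format file; defaults to the live /proc/stat.
busy_ticks() {
    awk '$1 == "cpu" { print $2 + $3 + $4 }' "${1:-/proc/stat}"
}

# Sum of utime + stime over every process that is alive right now.
live_process_ticks() {
    for stat in /proc/[0-9]*/stat; do
        cat "$stat" 2>/dev/null      # a process may exit while we iterate
    done | awk '{
        sub(/^.*\) /, "")            # drop "pid (comm) ", up to the last ")"
        total += $12 + $13           # utime and stime (fields 14 and 15)
    } END { print total + 0 }'
}

# Usage sketch:
#   b0=$(busy_ticks); p0=$(live_process_ticks); sleep 10
#   b1=$(busy_ticks); p1=$(live_process_ticks)
#   echo "ticks not attributed to a live process: $(( (b1 - b0) - (p1 - p0) ))"
```

A large positive difference would be consistent with CPU time being burned by processes that start and exit between samples.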

jfw01
Offline
Joined: 02/01/2022

Somewhere along the way, I started enabling magic SysRq options, including SysRq-k. It is now my favourite way of exercising (2) above; i.e. getting to a non-triggered GUI after starting top in another virtual console.

I have a correlate of the fault: the system-wide fork count reported by vmstat -f goes up by only one or two each time I run it in the non-triggered state, but climbs at about 100 forks per second in the triggered state.
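The fork count can also be sampled without spawning vmstat each time. As far as I know, vmstat -f takes its number from the "processes" line of /proc/stat, so a small sketch (function names are my own) can read that counter directly and turn two samples into a rate.

```shell
# Total forks since boot, read from the "processes" line of a
# /proc/stat-format file; defaults to the live /proc/stat.
forks_total() {
    awk '$1 == "processes" { print $2 }' "${1:-/proc/stat}"
}

# Average forks per second over an interval in whole seconds (default 10).
# The two command substitutions themselves add a couple of forks.
fork_rate() {
    interval="${1:-10}"
    before=$(forks_total)
    sleep "$interval"
    after=$(forks_total)
    echo $(( (after - before) / interval ))
}
```

On the figures above, fork_rate should print a value near 0 in the non-triggered state and near 100 in the triggered state.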

So now I have the question: how would I do system-wide monitoring to ask which process is calling (some variant of) fork() a lot?
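A crude polling approach is to take two snapshots of the process table and count, per parent PID, the PIDs that are new in the second one. This is a sketch under assumptions: the function names are invented, and a child that is born and dies between the two snapshots is missed entirely, so at 100 forks per second it only catches a sample of the children. A kernel-level tracer would be more reliable, but this needs nothing beyond ps and awk.

```shell
# Count, per parent PID, the PIDs present in "after" but not in "before".
# Both files hold "pid ppid" lines. Output: "count ppid", busiest first.
new_children_by_parent() {
    awk 'NR == FNR { seen[$1] = 1; next }     # first file: remember old PIDs
         !($1 in seen) { count[$2]++ }        # second file: tally new PIDs
         END { for (p in count) print count[p], p }' "$1" "$2" | sort -rn
}

# Take two ps snapshots N seconds apart (default 1) and tally the newcomers.
# The second ps itself shows up as one new child of the calling shell.
sample_new_children() {
    before=$(mktemp)
    after=$(mktemp)
    ps -e -o pid= -o ppid= > "$before"
    sleep "${1:-1}"
    ps -e -o pid= -o ppid= > "$after"
    new_children_by_parent "$before" "$after"
    rm -f "$before" "$after"
}
```

Repeating sample_new_children a few times in the triggered state and looking up the most frequent parent PID with ps -p should point at the forking process, if its children live long enough to be seen.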

Based on past experience, I expect no reply to this question. That being the case, in about a week, I expect that I will upgrade to Trisquel 11, which may suppress the fault.

jfw01
Offline
Joined: 02/01/2022

The fault is no longer reproducible in Trisquel 11.