General protection faults in Xorg

5 replies
Avron

I am a translator!

Offline
Joined: 08/18/2020

My desktop freezes from time to time, and the logs usually show a "general protection fault" in Xorg with a call stack referring to nouveau functions. I have occasionally had the same with compton instead of Xorg (a slightly different call stack, but still nouveau). I tried the default Trisquel kernel, linux-libre-lts and linux-libre; the result is the same. I also see no way to reproduce the problem on purpose. Now I am looking for a way to live with it or avoid it.

Is there a way to trigger a clean reboot automatically upon a general protection fault? (At least, this would save me time).
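One possible approach, sketched here under the assumption that the fault is logged as a kernel oops (which a call stack in nouveau functions suggests): Linux can treat every oops as a panic and reboot a few seconds after a panic. The file name and the 10-second delay below are arbitrary choices, and a reboot triggered this way is abrupt rather than clean, since nothing is shut down in order.

```shell
# Write the settings to a file in the current directory first, for review.
# The file name is a hypothetical choice; any *.conf name works in /etc/sysctl.d.
conf=99-reboot-on-oops.conf

cat > "$conf" <<'EOF'
# Treat any kernel oops (a general protection fault in kernel code is one) as a panic.
kernel.panic_on_oops = 1
# Reboot automatically 10 seconds after a panic (0 would mean: wait forever).
kernel.panic = 10
EOF

cat "$conf"

# To activate after review (needs root):
#   sudo install -m 644 99-reboot-on-oops.conf /etc/sysctl.d/
#   sudo sysctl --system
```

If the freeze turns out not to produce an oops at all, these settings will have no effect, so it is worth checking the logs for the word "Oops" or a "[#1]" marker next to the fault first.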

How can I keep the desktop from launching Xorg by default?

If I connect my laptop to the monitor and run programmes on the desktop, displayed on the laptop's X display, I would not need to run Xorg on the desktop and could perhaps avoid the general protection faults. If I ssh -X to my desktop from the laptop, set DISPLAY to the laptop and start a programme from the terminal, will that work the same as before?

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

As far as I understand https://man.openbsd.org/ssh#X11_FORWARDING X11 forwarding requires Xorg running on both systems. Do you have an Intel processor with integrated graphics that you could use instead of the NVidia card?

Avron

> Do you have an Intel processor with integrated graphics that you could use instead of the NVidia card?

No, it is AMD, on a D8 board. There is an ASPEED GPU on the mainboard; I previously tried to connect a monitor to it, but the monitor reported no signal.

> As far as I understand https://man.openbsd.org/ssh#X11_FORWARDING X11 forwarding requires Xorg running on both systems.

Thanks for pointing there. Are you deducing that from "ssh creates a “proxy” X server on the server machine for forwarding the connections over the encrypted channel."?

I just remembered that I have an Olinuxino Lime2 running Parabola (it has issues with disks, so I stopped using it and forgot about it, but it is still running). I just did ssh -X to it and started emacs, and the emacs window appears on my desktop. (I did not set the DISPLAY variable, as the man page advises, unlike what I wrote before.) Here is what "ps -ef" shows:

[david@parabola ~]$ ps -ef 
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Jun16 ?        00:00:03 /sbin/init
root         2     0  0 Jun16 ?        00:00:05 [kthreadd]
root         3     2  0 Jun16 ?        00:00:00 [rcu_gp]
root         4     2  0 Jun16 ?        00:00:00 [rcu_par_gp]
root         8     2  0 Jun16 ?        00:00:00 [mm_percpu_wq]
root         9     2  0 Jun16 ?        00:00:00 [rcu_tasks_kthre]
root        10     2  0 Jun16 ?        00:00:00 [rcu_tasks_trace]
root        11     2  0 Jun16 ?        00:00:06 [ksoftirqd/0]
root        12     2  0 Jun16 ?        00:06:25 [rcu_preempt]
root        13     2  0 Jun16 ?        00:01:01 [migration/0]
root        15     2  0 Jun16 ?        00:00:00 [cpuhp/0]
root        16     2  0 Jun16 ?        00:00:00 [cpuhp/1]
root        17     2  0 Jun16 ?        00:00:46 [migration/1]
root        18     2  0 Jun16 ?        00:00:11 [ksoftirqd/1]
root        21     2  0 Jun16 ?        00:00:00 [kdevtmpfs]
root        22     2  0 Jun16 ?        00:00:00 [netns]
root        24     2  0 Jun16 ?        00:00:00 [kauditd]
root        25     2  0 Jun16 ?        00:00:09 [khungtaskd]
root        26     2  0 Jun16 ?        00:00:00 [oom_reaper]
root        27     2  0 Jun16 ?        00:00:00 [writeback]
root        28     2  0 Jun16 ?        00:08:21 [kcompactd0]
root        29     2  0 Jun16 ?        00:00:00 [ksmd]
root        83     2  0 Jun16 ?        00:00:00 [kintegrityd]
root        84     2  0 Jun16 ?        00:00:00 [kblockd]
root        85     2  0 Jun16 ?        00:00:00 [blkcg_punt_bio]
root        86     2  0 Jun16 ?        00:00:00 [ata_sff]
root        87     2  0 Jun16 ?        00:00:00 [edac-poller]
root        88     2  0 Jun16 ?        00:00:00 [devfreq_wq]
root        89     2  0 Jun16 ?        00:00:00 [watchdogd]
root        90     2  0 Jun16 ?        00:02:58 [kworker/0:1H-mmc_complete]
root        91     2  0 Jun16 ?        00:00:00 [rpciod]
root        92     2  0 Jun16 ?        00:00:00 [kworker/u5:0]
root        93     2  0 Jun16 ?        00:00:00 [xprtiod]
root        96     2  0 Jun16 ?        00:00:00 [kswapd0]
root        97     2  0 Jun16 ?        00:00:00 [nfsiod]
root        99     2  0 Jun16 ?        00:00:00 [kthrotld]
root       101     2  0 Jun16 ?        00:00:00 [kvub300c]
root       102     2  0 Jun16 ?        00:00:00 [kvub300p]
root       103     2  0 Jun16 ?        00:00:00 [kvub300d]
root       104     2  0 Jun16 ?        00:00:00 [irq/38-sunxi-mm]
root       105     2  0 Jun16 ?        00:00:00 [irq/71-1c0f000.]
root       106     2  0 Jun16 ?        00:00:00 [iscsi_eh]
root       107     2  0 Jun16 ?        00:00:00 [iscsi_destroy]
root       109     2  0 Jun16 ?        00:00:00 [uas]
root       110     2  0 Jun16 ?        00:00:00 [mmc_complete]
root       111     2  0 Jun16 ?        00:02:45 [kworker/1:1H-kblockd]
root       112     2  0 Jun16 ?        00:00:00 [irq/102-axp20x_]
root       114     2  0 Jun16 ?        00:00:00 [ipv6_addrconf]
root       123     2  0 Jun16 ?        00:00:00 [kstrp]
root       124     2  0 Jun16 ?        00:00:00 [zswap-shrink]
root       125     2  0 Jun16 ?        00:00:00 [scsi_eh_0]
root       126     2  0 Jun16 ?        00:00:00 [scsi_tmf_0]
root       158     2  0 Jun16 ?        00:00:50 [jbd2/mmcblk0p1-]
root       159     2  0 Jun16 ?        00:00:00 [ext4-rsv-conver]
root       605     1  0 Jun16 ?        00:00:01 /usr/bin/udevd
root       814     2  0 Jun16 ?        00:01:09 [irq/53-sun4i_gp]
root       835     2  0 Jun16 ?        00:00:00 [gp]
root       839     2  0 Jun16 ?        00:00:00 [pp]
root       886     2  0 Jun16 ?        00:00:00 [stmmac_wq]
root       887     2  0 Jun16 ?        00:00:00 [cec-sun4i]
root       888     2  0 Jun16 ?        00:00:00 [card1-crtc0]
root       890     2  0 Jun16 ?        00:00:00 [card1-crtc1]
dbus      1236     1  0 Jun16 ?        00:00:04 /usr/bin/dbus-daemon --system
root      1277     1  0 Jun16 ?        00:00:00 elogind-daemon
root      1840     1  0 Jun16 ?        00:28:09 /usr/bin/NetworkManager --pid-file /run/NetworkManager/NetworkManager.pid
ntp       1942     1  0 Jun16 ?        00:20:02 /usr/bin/ntpd -p /run/ntpd.pid -g -u ntp:ntp
root      1974     1  0 Jun16 ?        00:00:00 sshd: /usr/bin/sshd [listener] 0 of 10-100 startups
root      2054     1  0 Jun16 ?        00:00:00 supervise-daemon agetty.tty1 --start --pidfile /run/agetty.tty1.pid --respawn-period 60 /sbin/agetty -- tty1 38400 linux
root      2055  2054  0 Jun16 tty1     00:00:00 /sbin/agetty tty1 38400 linux
root      2178     1  0 Jun16 ?        00:00:00 supervise-daemon agetty.tty2 --start --pidfile /run/agetty.tty2.pid --respawn-period 60 /sbin/agetty -- tty2 38400 linux
root      2179  2178  0 Jun16 tty2     00:00:00 /sbin/agetty tty2 38400 linux
root      2209     1  0 Jun16 ?        00:00:00 supervise-daemon agetty.tty3 --start --pidfile /run/agetty.tty3.pid --respawn-period 60 /sbin/agetty -- tty3 38400 linux
root      2210  2209  0 Jun16 tty3     00:00:00 /sbin/agetty tty3 38400 linux
root      2239     1  0 Jun16 ?        00:00:00 supervise-daemon agetty.tty4 --start --pidfile /run/agetty.tty4.pid --respawn-period 60 /sbin/agetty -- tty4 38400 linux
root      2240  2239  0 Jun16 tty4     00:00:00 /sbin/agetty tty4 38400 linux
root      2267     1  0 Jun16 ?        00:00:00 supervise-daemon agetty.tty5 --start --pidfile /run/agetty.tty5.pid --respawn-period 60 /sbin/agetty -- tty5 38400 linux
root      2268  2267  0 Jun16 tty5     00:00:00 /sbin/agetty tty5 38400 linux
root      2295     1  0 Jun16 ?        00:00:00 supervise-daemon agetty.tty6 --start --pidfile /run/agetty.tty6.pid --respawn-period 60 /sbin/agetty -- tty6 38400 linux
root      2296  2295  0 Jun16 tty6     00:00:00 /sbin/agetty tty6 38400 linux
root      2362     2  0 Jun16 ?        00:00:19 [kworker/u4:1-events_unbound]
root     11438     2  4 13:01 ?        00:07:44 [kworker/0:1-events_freezable_power_]
root     11466     2  0 13:55 ?        00:00:00 [kworker/1:2-mm_percpu_wq]
root     11517     2  1 15:37 ?        00:00:08 [kworker/0:0-events]
root     11520     2  0 15:39 ?        00:00:00 [kworker/0:2H]
root     11533     2  0 15:39 ?        00:00:00 [kworker/1:2H]
root     11540     2  0 15:39 ?        00:00:00 [kworker/1:1-mm_percpu_wq]
root     11557     2  0 15:41 ?        00:00:00 [kworker/u4:2-events_unbound]
root     11562  1974  0 15:41 ?        00:00:00 sshd: david [priv]
david    11564 11562  0 15:41 ?        00:00:01 sshd: david@pts/0
david    11565 11564  0 15:41 pts/0    00:00:00 -bash
david    11577     1  0 15:41 pts/0    00:00:00 dbus-launch --autolaunch=3c48d6b121110597b37cb5fc00000025 --binary-syntax --close-stderr
david    11578     1  0 15:41 ?        00:00:00 /usr/bin/dbus-daemon --syslog-only --fork --print-pid 5 --print-address 7 --session
root     11587     2  0 15:42 ?        00:00:00 [kworker/0:2-mm_percpu_wq]
root     11627     2  0 15:49 ?        00:00:00 [kworker/1:0]
root     11628     2  0 15:50 ?        00:00:00 [kworker/1:0H]
david    11633 11565  0 15:50 pts/0    00:00:00 ps -ef
[david@parabola ~]$ 

Could the "proxy X" be hidden in an sshd process, in one of the kworker tasks or in a dbus process? (I am looking at processes whose start time is close to that of the sshd processes, which is about when I did ssh -X.)
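For what it is worth, no separate X server process is expected in that list. As far as the OpenSSH documentation goes, the "proxy" is sshd itself: it listens on a local TCP port and sets DISPLAY in the session to point at it, while the real X server is the one on the machine where ssh -X was typed. With the default X11DisplayOffset of 10, DISPLAY is typically localhost:10.0, and display number N conventionally maps to TCP port 6000+N. A small sketch for checking this from inside the ssh -X session (the function name is made up for illustration):

```shell
# Map an X11 DISPLAY value to the TCP port it implies (6000 + display number).
x11_port_for_display() {
    x11_number="${1##*:}"             # drop the host part: "localhost:10.0" -> "10.0"
    x11_number="${x11_number%%.*}"    # drop the screen part: "10.0" -> "10"
    echo $((6000 + x11_number))
}

x11_port_for_display "localhost:10.0"   # prints 6010

# Inside the ssh -X session on the remote machine:
#   echo "$DISPLAY"
#   ss -ltn | grep ":$(x11_port_for_display "$DISPLAY") "
# A listening socket on that port, with no Xorg in "ps -ef", would be
# consistent with sshd acting as the proxy.
```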

(Side note: there is no systemd. That is not a personal choice; I used the only available Parabola image that happened to work with my board revision.)

If the "proxy" X server is not running a real display, perhaps it won't run nouveau and that could avoid the freezes?

I could give it a try but I don't know how to start my desktop in text mode.

Magic Banana

I indeed do not see Xorg in the list of processes. Don't you actually run Wayland?

Do you use sysvinit? If so, to keep Xorg from starting automatically, you can remove the symbolic link to the Xorg init script from /etc/rc2.d (probably; the exact directory is distribution-dependent).

Avron

> Don't you actually run Wayland?

No, it is not even installed.

> Do you use sysvinit?

I have OpenRC on Parabola, but anyway, it is now configured to start in text mode; I use startx if I want the X server.

Using "sudo systemctl set-default multi-user.target", my desktop starts in text mode. On my laptop, using ssh -X or ssh -Y towards my desktop, I can launch abrowser on my desktop with its display on my laptop. The two machines are connected over Ethernet.
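For reference, the commands involved can be sketched as follows, assuming a systemd-based system such as Trisquel. The function names are made up for illustration; the systemctl subcommands themselves are standard.

```shell
# Show the target the system boots into by default.
show_boot_target()  { systemctl get-default; }

# Boot to a text console from now on (no display manager, hence no Xorg).
boot_to_text()      { sudo systemctl set-default multi-user.target; }

# Revert to booting into the graphical login.
boot_to_graphical() { sudo systemctl set-default graphical.target; }

# Start the graphical session once, without changing the default.
start_gui_once()    { sudo systemctl isolate graphical.target; }
```

Nothing runs until one of the functions is called, so the block can be pasted into a shell safely; boot_to_graphical undoes the change if text mode turns out not to help.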

However, scrolling in an abrowser window makes ssh on the laptop take between 50% and 100% of the CPU. If I ask abrowser to open multiple tabs at once (say, 5), I simply lose control of it.

In the past, I used X terminals to connect to a server and, as I recall, they had rather low-end CPUs, so I am quite surprised that my X200 cannot properly handle the display of an abrowser window.

I suppose ssh is not the only way to have a remote display. As the two machines are connected to the same Ethernet switch, I do not expect security to matter much there. I tried setting the DISPLAY variable, but then, if I try emacs, I always get "Display x200.mydomain:0 unavailable, simulating -nw", even though "ping x200.mydomain" works.
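A likely explanation, offered as a guess: current X servers are usually started with -nolisten tcp, so nothing answers X connections from the network even though ping works, and access control (xhost or xauth) would also have to admit the remote machine. Display :0 corresponds to TCP port 6000. A quick check to run on the laptop, as a sketch (the function name is made up):

```shell
# Run on the machine whose screen should show the windows (here: the laptop).
# Succeeds if some process is listening on TCP port 6000, i.e. display :0.
x_listens_on_tcp() {
    ss -ltn 2>/dev/null | grep -q ':6000 '
}

if x_listens_on_tcp; then
    echo "X server is listening on TCP port 6000"
else
    echo "Nothing is listening on TCP port 6000 (X was probably started with -nolisten tcp)"
fi
```

If nothing is listening, how to enable TCP depends on how X is started (display manager or startx), so that part is left out here.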

Magic Banana

> Using "sudo systemctl set-default multi-user.target", my desktop starts in text mode.

I now understand that you want Trisquel (with systemd) to display graphical applications running on a remote Parabola (with OpenRC) system. I had not understood that before. Sorry.

> I suppose ssh is not the only way to have a remote display.

Indeed. VNC *not* tunneled through an SSH connection (which is OK, since you are staying within your local network) is likely faster. On Trisquel, a VNC client is installed by default.