The Angry Red Button of Doom and Chronic Catastrophic Crashes
I apologise if any part of this posting comes across as grumpy or ungrateful. I know most everyone involved in developing software for GNU/Linux is doing their best, and that most of them are unpaid. I appreciate your work, and your ethical commitment, I really do. But I am beyond frustration here. Where to start?
An angry red button just popped up on my taskbar. When I left-clicked on it, it opened a little pop-up menu, which had an "open preferences" option that opened the "software sources" applet. So I'm assuming from that, and the other menu items ("install update" etc), that it was associated with the software update system. There was an error message at the top of the menu, something about "unmet dependencies". It wouldn't let me copy'n'paste it, and when I opened GEdit and tried to type it in manually, the menu closed every time I clicked on the GEdit window. Very irritating. So I'm sorry, I can't tell you exactly what the error message said.
The red button has now disappeared. It happened that I was running apt-get update and upgrade in a terminal when I noticed the red button, so maybe that has something to do with why it's disappeared? The terminal update seemed to run normally, no errors, just some surplus packages and some held back:
The following packages were automatically installed and are no longer required:
libseccomp2 tor tor-geoipdb torsocks xul-ext-torproxy
Use 'apt-get autoremove' to remove them.
The following packages have been kept back:
linux-headers-lowlatency linux-image-lowlatency linux-lowlatency
My system has been *very* unstable recently, with the OS completely hanging after a few hours of work, almost every time I use it. To the point where even CTRL-ALT-F1 does nothing, or sometimes it does take me to the virtual terminal, and I can enter my username, but it times out before it can even prompt me for a password.
So, three things.
1) Is there any way to fix the software running behind the angry red button so that it pops up its error messages in a proper dialogue box, which can be cut'n'pasted in a forum post?
2) Will the latest bunch of updates restore the stability of my OS? Do I need to do anything other than run apt-get autoremove and apt-get dist-upgrade?
3) Is Trisquel 7 main just too heavy now for my 5 year old "netbook" hardware (1GB RAM, 1.6 GHz)? Would switching to Trisquel Mini help? Would using a "lighter" desktop system than GNOME help?
I have tried so many things to try to improve performance:
* run updates from the terminal every time I boot up, then reboot if any were installed, before I start work
* use a decent-sized swap partition
* upgrade to Belenos so I'm using more recent versions of software
* put /home on a separate partition
* give my OS partition 16GB to work with
* only install extra apps if I really need to use them (eg Mumble)
* never allow flash, and installed HTML5 Everywhere, Privacy Badger, and most recently NoScript, in case the problem is Flash and Javascript crap overloading the browser and/or filling the RAM with junk
* participate in this forum, and report the problems I encounter, in as much detail as I can, to help the developers find bugs and fix them
None of this has fixed the problem. The OS (not just GNOME, the *OS*) still crashes almost every day. In fact, Trisquel is crashing on this machine more often than I used to get the "blue screen of death" on Windows 98, on a Compaq laptop bought in 2001. This is woeful performance. I need to use this computer to get some work done. I can't spend all day with my head under the hood tinkering around with the software. If I don't find a fix soon, I'm going to be seriously tempted to try some non-libre distros and see if that helps.
BTW While I was typing this in GEdit, letters started not appearing when I typed them, or different letters from what I typed, or other weird behaviour. Then GEdit closed, and I lost a few paragraphs of work. It's like someone has remote access to my desktop and is $%&#ing with me for fun. I know that sounds paranoid but as Nirvana put it "just because you're paranoid, don't mean they're not after you"... But I disconnected my wireless internet, and used the hardware switch to turn wireless networking off, and it's still happening, so it looks like it's a GEdit bug *sigh*.
> I apologise if any part of this posting comes across as grumpy or
> ungrateful. I know most everyone involved in developing software for
> GNU/Linux is doing their best, and that most of them are unpaid. I appreciate
> your work, and your ethical commitment, I really do. But I am beyond
> frustration here. Where to start?
It's okay- your post is neither grumpy nor ungrateful. You have a seriously
f*cked up system.
> So, three things.
> 1) Is there any way to fix the software running behind the angry red button
> so that it pops up its error messages in a proper dialogue box, which can be
> cut'n'pasted in a forum post?
There is one sure-fire way to fix the problem- don't use graphical package
management. However, that's probably not the answer you were looking for, so no.
> 2) Will the latest bunch of updates restore the stability of my OS? Do I
> need to do anything other than run apt-get autoremove and apt-get
> dist-upgrade?
The full procedure looks like this:
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get autoremove
$ sudo apt-get clean
But no, I don't think new package versions will sort this out. Your system
looks beyond repair.
> 3) Is Trisquel 7 main just too heavy now for my 5 year old "netbook"
> hardware (1GB RAM, 1.6 GHZ)? Would switching to Trisquel Mini help? Would
> using a "lighter" desktop system than GNOME help?
Probably, yes. The Trisquel DE (based on the monstrously bloated GNOME) uses at
least 500 MB of RAM even when nothing's running- that's half your RAM gone.
Many of the problems you describe are symptoms of the system constantly
swapping. There are probably a whole bunch of cryptically broken things under
the hood in your system, and it would take a *lot* of time for you to diagnose
them all, report them to us, and then fix them, so I would just recommend a
re-install. Install using the Trisquel netinstall CD (don't install a desktop!),
then install xfce4 and xfce4-goodies to install everyone's favourite GTK-based
lightweight desktop environment. Xfce usually uses just under 200 MB RAM when
ticking over, giving you significantly more to play with. Also, it's a lot
snappier when it comes to launching programs and doing things. Give it a go.
> I have tried so many things to try to improve performance:
>
> * run updates from the terminal every time I boot up, then reboot if any
> were installed, before I start work
Rebooting really shouldn't be necessary unless system-critical packages
(such as a new linux, a new grub, or something like that) were updated.
Otherwise, updated packages should Just Work.
> * use a decent-sized swap partition
You want more than a decent-sized swap partition- you want a *hefty* swap
partition. When you re-install, set it to at least double or quadruple your
RAM, if you have the disk space to spare.
> * upgrade to Belenos so I'm using more recent versions of software
You should be doing this anyway, regardless of any problems you may have. Why
use Trisquel 6?
> * put /home on a separate partition
There aren't going to be any significant speed advantages from doing this.
Unrelated, I have 4GB RAM and a 10GB swap. The rest is one partition for /.
> * give my OS partition 16GB to work with
It could run out, you never know. Put everything on one partition and you won't
have to deal with resizing partitions when space runs out.
> * only install extra apps if I really need to use them (eg Mumble)
This is just general good practice.
Also, install the wicd-gtk package, you have to keep connected to the
Internet, don't you? :D
ifconfig, iwconfig, wpa_supplicant, and dhclient are all I need ;)
I also tend to think that 1GB of RAM may be too little for your usage (well, for the Web, where pages have become far heavier in recent years). That said, a larger swap is no solution: the system becomes unbearably slow as soon as you *start* using the swap. The real solution, if you really are running short of RAM (which can easily be verified from the "System monitor" in the "System settings"), is to add RAM. Your motherboard will hopefully accept more RAM, and buying one additional GB is very cheap.
If problems occur even when the RAM is not full, then it may be a hardware problem. Like lembas, I would advise you to run memtest86+.
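If you would rather check RAM and swap from a terminal than from the System monitor, here is a minimal sketch. It assumes the usual /proc/meminfo fields (MemTotal, MemFree, Buffers, Cached, SwapTotal, SwapFree); the function name mem_summary is made up for this example, and counting buffers and cache as "not used" is only a rough approximation.

```shell
#!/bin/sh
# mem_summary (hypothetical name): print RAM and swap usage in MB.
# Reads a /proc/meminfo-style file; defaults to /proc/meminfo itself.
mem_summary() {
    awk '
        /^MemTotal:/  { total   = $2 }   # values are in kB
        /^MemFree:/   { free    = $2 }
        /^Buffers:/   { buffers = $2 }
        /^Cached:/    { cached  = $2 }
        /^SwapTotal:/ { stotal  = $2 }
        /^SwapFree:/  { sfree   = $2 }
        END {
            # Assumption: buffers and page cache count as reclaimable.
            used = total - free - buffers - cached
            printf "RAM used: %d of %d MB\n", used / 1024, total / 1024
            printf "Swap used: %d of %d MB\n", (stotal - sfree) / 1024, stotal / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Only run against the live system where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    mem_summary /proc/meminfo
fi
```

If "Swap used" is well above zero while the desktop is idle, that would fit the constant-swapping theory. If the hangs happen with plenty of RAM free, hardware becomes the more likely suspect.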
>> There is one sure-fire way to fix the problem- don't use graphical package management. However, that's probably not the answer you were looking for, so no. <<
Not really :p Especially since I wasn't using graphical package management when the issue occurred. As I said in OP, I was using 'sudo apt-get update && sudo apt-get upgrade' when the angry red button popped up on my taskbar. I'm going to assume this is a GNOME problem.
>> But no, I don't think new package versions will sort this out. Your system looks beyond repair. <<
I suspected as much, but thanks for the confirmation. The situation reminds me of when I was still using Ubuntu, and I would hope against hope that every new version would fix the performance problems with my system. Usually they did fix some issues, but made the overall performance worse.
>> Probably, yes. The Trisquel DE (based on the monstrously bloated GNOME) uses at least 500 MB of RAM even when nothing's running- that's half your RAM gone. Many of the problems you describe are symptoms of the system constantly swapping. <<
This is exactly what I needed to know. Thank you so much. So my choices are basically:
* buy a new computer
OR
* try a more lightweight desktop
I've used LXDE before and found it ok, and same with Enlightenment, although the distros I tried that bundled Enlightenment by default (eg Bodhi) were bleeding edge (and non-free). What do folks think about Mate or Cinnamon? Anything else I should consider?
Moxalt suggests XFCE, which I haven't tried for years, but I remember finding it clunky and limited. Having said that, many of my criticisms (eg awkward controls for customization) apply equally to GNOME on Trisquel.
>> There are probably a whole bunch of cryptically broken things under the hood in your system, and it would take a *lot* of time for you to diagnose them all, report them to us, and then fix them, so I would just recommend a re-install. <<
This was a fresh install a few weeks ago, which I did in the hopes it would fix the problem. It hasn't so I think I need to move on to one of the options above.
>> Rebooting really shouldn't be necessary unless system-critical packages (such as a new linux, a new grub, or something like that) were updated. Otherwise, updated packages should Just Work. <<
In theory, sure. On a headless system, certainly. But GUI systems are complex enough to create race condition bugs, and I've read it's a good idea to at least log out and log in, if not reboot, after any update. Just ruling out possible causes.
>> You want more than a decent-sized swap partition- you want a *hefty* swap partition. When you re-install, set it to at least double or quadruple your RAM, if you have the disk space to spare. <<
I think I used double on this install. You said swapping is part of the problem, and MagicBanana confirms this, so it probably didn't help :(
>> Why use Trisquel 6? <<
Because that's what I was using since I left Ubuntu, and in theory it should be supported for a few more years yet.
>> It could run out, you never know. <<
With a 16GB partition? I've always used a separate file partition on this system (I used to use the Windows partition until I got rid of that) so that I can access the files from multiple installs. The only time I've ever run out of space on an OS partition was on my other laptop, because of the log cancer problem.
>> Put everything on one partition and you won't have to deal with resizing partitions when space runs out. <<
Sure, but you also have to copy the entire /home contents to another drive if you want to reinstall or try a different distro. No thanks.
>>> * only install extra apps if I really need to use them (eg Mumble) <<<
>> This is just general good practice. <<
Sure, but on a reliable system, I often install new applications just to try them out, install games etc. I know that can introduce race condition bugs, so my point was that I haven't done that on this install. Again, just ruling things out.
The full procedure looks like this:
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get autoremove
$ sudo apt-get clean
Would it be possible to make a script out of these?
Like "do this command, allow sudo, confirm "Y" (yes), and when done, do the same with the following..." etc.
It's easier and faster to just launch a script.
An automation option would be nice too (like every day at whatever o'clock).
So, do you think it's possible?
Sure. You could define the following alias in your ~/.bashrc file:
alias upgrade='sudo apt-get update && sudo apt-get dist-upgrade && sudo apt-get autoremove && sudo apt-get clean'
Then you would just have to type 'upgrade' (or whatever name you prefer to pick) in a console to run the four commands.
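If you want an actual script rather than an alias, here is a minimal sketch. The file name upgrade.sh, the function name and the "--run" switch are all made up for this example. It uses apt-get's -y option to answer "yes" automatically, and by default it only prints the commands so you can see what would happen first.

```shell
#!/bin/sh
# upgrade.sh (hypothetical name): the four apt-get steps in one script.
# Without arguments it only prints the commands; "--run" executes them.
# Run it as root for real use, e.g. "sudo sh upgrade.sh --run".

upgrade_steps() {
    # $1 is a command prefix: "echo" for a dry run, empty to really run.
    # Each step only starts if the previous one succeeded.
    $1 apt-get update &&
    $1 apt-get -y dist-upgrade &&
    $1 apt-get -y autoremove &&
    $1 apt-get clean
}

if [ "${1:-}" = "--run" ]; then
    upgrade_steps ""
else
    upgrade_steps echo
fi
```

For the "every day at whatever o'clock" part, a line in root's crontab (sudo crontab -e) along the lines of "0 3 * * * /usr/local/sbin/upgrade.sh --run" would run it at 03:00, assuming you saved the script at that path and made it executable. Bear in mind that an unattended dist-upgrade with -y will also go ahead with removals you might have wanted to review.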
Sweet! Thanks oysterboy!
On 29/10 07:58, moxalt wrote:
> > I apologise if any part of this posting comes across as grumpy or
> > ungrateful. I know most everyone involved in developing software for
> > GNU/Linux is doing their best, and that most of them are unpaid. I appreciate
> > your work, and your ethical commitment, I really do. But I am beyond
> > frustration here. Where to start?
>
> It's okay- your post is neither grumpy nor ungrateful. You have a seriously
> f*cked up system.
[snip]
> Probably, yes. The Trisquel DE (based on the monstrously bloated GNOME) uses at
> least 500 MB of RAM even when nothing's running- that's half your RAM gone.
> Many of the problems you describe are symptoms of the system constantly
> swapping. There are probably a whole bunch of cryptically broken things under
> the hood in your system, and it would take a *lot* of time for you to diagnose
> them all, report them to us, and then fix them, so I would just recommend a
> re-install. Install using the Trisquel netinstall CD (don't install a desktop!),
> then install xfce4 and xfce4-goodies to install everyone's favourite GTK-based
> lightweight desktop environment. Xfce usually uses just under 200 MB RAM when
> ticking over, giving you significantly more to play with. Also, it's a lot
> snappier when it comes to launching programs and doing things. Give it a go.
[snip]
Wouldn't using Trisquel Mini be a simpler answer to this? From what I
remember LXDE is more lightweight than XFCE.
Trisquel Mini also looks pretty darn good.
--
In regular expressions you must backslash { (count)
Regarding the hangs, you could test your memory. Boot from Trisquel live media and you're offered that option. Leave it running over night or so.
If that doesn't help and all your peripherals are properly seated in their sockets, then we get into the twilight zone. Another cause could be dust buildup. Or perhaps a dying motherboard. (visually inspect the capacitors to see if they've burst and are leaking, see below)
Thanks for the suggestion Lembas, but I think I will try using a more lightweight desktop, and any other software/ configuration suggestions anyone has. If that doesn't work, and the hardware is just borked, then it's beyond being worth reconditioning, and it's time to recycle it. I can get a USB WiFi card for the $50 Lifebook I just bought so I can use that as a replacement at home (now that we've found the cause of the log cancer), and put up with the second-hand HP Slate tablet I bought last year for when I'm travelling.
I personally recommend using Geany (still full of features, but
with spell checking, so expect some disk space usage) or Mousepad (very
minimal text editor for XFCE, but with no spell checking, but give it a
try) instead of Gedit, and please leave Gedit untouched if you're using
GNOME.
I suggest this because there is a bug with some GNOME dependent
packages that makes the CPU usage spike to almost 100%. It happens every
time you give them focus, and also happens when the current focus inside
the window is an editable text field. This bug seems to be related to
IBus, which is the default input method if the system is using GNOME for
the current session.
This is why I switched to XFCE, although I had a lot of problems to
solve due to my switch and my strange notebook's hardware. Again,
non-free software dependent crapware. This is why I dream of buying just
Respects Your Freedom certified products from now on (even if they're
not that powerful) instead of buying the hardware next door or the
latest one in terms of power. My financial status just has to stabilize
a little more, and I'll be able to make my dream come true. :D
Also, you also use Mumble! That's great! I've been using Mumble for
almost two years. :D
Thanks for the suggestions. I'm going to try out a few different lightweight desktops, until I find one I like.
>> Also, you also use Mumble! That's great! I've been using Mumble for almost two years. :D <<
Cool. I only just started, as the Mumble client only started receiving sound through my built-in mic when I upgraded to Belenos. Would you (and others) be keen for a Trisquel chat session on Mumble sometime?
Could I ask if you are dual booting perhaps? Has anyone thought to ask that? Or running a VirtualBox? Kernel failure? Did you use the "USB stick and a Live Usb maker" to install the system?
Perhaps this is a hardware fault in your CPU? Maybe first try to disable cache on the CPU from the BIOS.
Are you overclocking your computer? Go back to stock timings and frequencies and see if stability returns? I'd also suggest checking that the motherboard and PSU are up to spec?
If none of this helps, perhaps think of formatting and reinstalling.
Unfortunately, if the system freezes, the triggering event may not get logged or the log entries may not be written to disk so /var/log/kern.log would be of no use, then.
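Still, if the hangs come from running out of RAM, the kernel's OOM killer sometimes gets a line into the log before everything locks up, so it is worth a look. A minimal sketch, assuming the log lives at /var/log/kern.log and that the messages contain wording like "invoked oom-killer", "Out of memory" or "Killed process" (the exact wording varies between kernel versions); the function name oom_lines is made up for this example.

```shell
#!/bin/sh
# oom_lines (hypothetical name): show kernel log lines that suggest the
# OOM killer stepped in. Pass another log file as the first argument.
oom_lines() {
    # The patterns are assumptions about typical kernel message wording.
    grep -iE 'out of memory|oom-killer|killed process' "${1:-/var/log/kern.log}"
}

# Only look at the live log if it exists and is readable by this user.
if [ -r /var/log/kern.log ]; then
    oom_lines /var/log/kern.log || echo "no OOM killer messages found"
fi
```

If the file is not readable by your user, run the script with sudo. No matches does not prove anything either way, for the reason given above.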
ALSO:
Some possibilities:
Bad memory is a common cause of problems.
bad power-supplies can also cause random freezes and crashes.
low-quality motherboard. either due to shoddy manufacturing or due to bad/dodgy parts (e.g. a sub-standard or cheap version of a NIC that claims to be a particular brand/model but isn't - the manufacturer's Windows driver may compensate for its inadequacies but the linux driver believes it is an XYZ device because that's what it claims to be)
ditto for expansion cards
Are there any common patterns to the crashes? For example:
does it happen more often when you do certain things or run particular programs (if so, what are they?)
or after you've visited certain websites (e.g. badly written javascript code can leak memory)
or at certain times of day (when?)
or when other equipment is being operated nearby (e.g. a fridge motor turning on - a good UPS can protect against transient voltage fluctuations).
You should not give up D:
Thanks MeNoMore. As with your advice about logrotate, I really appreciate your probing questions and detailed descriptions. I'm sorry if my questions in the log cancer thread came across as sarcastic, they were honestly just clarifying questions, although I admit that comment was a bit terse compared to my usual novellas ;)
I just want to clarify again that the PC we're discussing here is an old Acer "netbook" laptop. If you click on my username, you can see the full model info, including lspci output.
>> Could I ask if you are dual booting perhaps? <<
Not any more. As I said in my reply to Moxalt, I recently did a fresh install of Belenos, and when I did that I blew away the NTFS partitions and all my experimental dual-boot partitions, leaving only a 4GB partition at the end of the drive, which I used for swap. I created two new 16GB OS partitions at the start of the drive, and installed Belenos to the first one. The second one is still empty. The rest of the drive is a /home partition. All three partitions are Ext4.
>> Or running a VirtualBox? <<
I've had a go running VirtualBox in the past, but never got it to work. I haven't even installed it in this install.
>> Kernel failure? <<
Not as far as I can tell. How could I test this? Thinking about what Moxalt said about swapping, I don't think the kernel is actually crashing at all.
>> Did you use the "USB stick and a Live Usb maker" to install the system? <<
Yes. The system has no optical disk drive so I had no choice but to use USB, and I used the USB maker in Trisquel System Settings to make the installation media.
>> Perhaps this is a hardware fault in your CPU? Maybe first try to disable cache on the CPU from the BIOS. <<
Again, is there a way to test this? What can I expect to happen if I go into BIOS and disable cache?
>> Are you overclocking your computer? <<
No. Is that wise with a laptop? Overheating is a serious enough problem over a few hours of work when things are running normally.
>> I'd also suggest checking that the motherboard and PSU are up to spec? <<
Can you clarify what you mean by "up to spec"?
>> If none of this helps, perhaps think of formatting and reinstalling. <<
Been there, done that, didn't help.
>> Bad memory is a common cause of problems. <<
OK, I will run a memtest overnight.
>> bad power-supplies can also cause random freezes and crashes. <<
How can I test that?
>> low-quality motherboard. either due to shoddy manufacturing or due to bad/dodgy parts (e.g. a sub-standard or cheap version of a NIC that claims to be a particular brand/model but isn't - the manufacturer's Windows driver may compensate for its inadequacies but the linux driver believes it is an XYZ device because that's what it claims to be)
ditto for expansion cards <<
What can I say? This laptop is 5 years old. There were definitely times in the past when it ran much more reliably than it does now. It could be that I'm using overly bloated software. It could be that the hardware is wearing out. What I'm asking for here, I guess, is suggestions on what tests I can use to rule possible causes out, so I can diagnose the exact problem(s) by a process of elimination.
>> Are there any common patterns to the crashes? <<
Pretty much always happens when I'm browsing the web. I currently use ABrowser, but before reinstalling the same thing would happen with IceCat.
>> or after you've visited certain websites (e.g. badly written javascript code can leak memory) <<
I suspected this might be the problem, which is why I installed NoScript. Didn't help.
>> or at certain times of day (when?)
or when other equipment is being operated nearby (e.g. a fridge motor turning on - a good UPS can protect against transient voltage fluctuations). <<
Hmm. I do live in an old house, with old wiring, in an old neighbourhood, with old wiring. Brown-outs could be an issue. My flatmates don't have any problems with their computers though.
>> You should not give up D: <<
I'm certainly not giving up on GNU/Linux, and I'll try everything the friendly, helpful folks in the Trisquel community can suggest before I give up Trisquel. I have considered trying something like Parabola though. If I understand the Arch family correctly, this essentially allows me to build up a custom distro specifically for my hardware and my needs?
PS, do not fret, lol. This is online; people interpret things how they see them, and not how they were meant to be said.
Do not feel bad, I am here to help, not judge.
Just a quick follow-up, I have installed the E17 Enlightenment desktop using Synaptic, and it is working brilliantly. The performance is streets ahead of GNOME, no lag at all yet, let alone hangs or crashes, although I haven't fully put it through its paces yet. Better still, it actually looks nicer and works better than GNOME, in almost every respect. I haven't used LXDE for a while, but it's far better than what I remember of that too.
I will look at a few more featherweight desktops over the next few weeks (LXDE, XFCE, Mate, and any others the community here recommends), but I strongly recommend when preparations begin for Trisquel 8 that a version with E17 as the default desktop is considered as part of the release. I would certainly use and recommend it.
Kernel failure test
> Put "ctrlaltdel hard" in your /etc/rc.local file. When the system locks up, try Ctrl-Alt-Del. If it still does nothing, you know for sure that the kernel is no longer running; you have a hardware or driver failure.
CPU
> There are many tools that specialise in testing different components. No all-in-one tool currently comes to mind. (Also, if one exists and does not internally use any of the well-established tools, that would go against the principle of modularity, which is one of the fundamentals of Unix philosophy.)
For RAM testing, I recommend Memtest86+. You have to boot into it rather than your primary system (pretty obvious if you consider the role of memory to a running OS).
For hard drive testing, you can try:
smartmontools to check the hardware "health" state of your drive,
Testdisk if you need to recover partition structure.
I don't recall any generic tools for testing components such as CPU or a graphics card other than actually monitoring the CPU usage.
As a general tip, I recommend using a specialized live distribution such as SystemRescueCd, that is released specifically as a tool for resolving major system problems.
Disabling Cache?
> If you have a cache to clear, then it is obviously not disabled.
On the rare occasions that I notice bash's cache of things that it has found in the path, it's not because it's helpful, it's because it's annoying.
I'm sure that this caching made sense back in the good old days when disks were slow and memory was expensive and limited and so you couldn't cache much - caching a path is cheaper than caching all the disk blocks necessary to find a command. But these days it provides no noticeable benefit and causes more problems than it solves. It's a misfeature, verging on being a bug.
If you are scared, then maybe you can just clear the hashed executables before the prompt gets drawn:
PROMPT_COMMAND='hash -r'
From: help hash:
hash: hash [-lr] [-p pathname] [-dt] [name ...]
    Remember or display program locations.

    Determine and remember the full pathname of each command NAME. If
    no arguments are given, information about remembered commands is displayed.

    Options:
      -d            forget the remembered location of each NAME
      -l            display in a format that may be reused as input
      -p pathname   use PATHNAME is the full pathname of NAME
      -r            forget all remembered locations
      -t            print the remembered location of each NAME, preceding
                    each location with the corresponding NAME if multiple
                    NAMEs are given
    Arguments:
      NAME          Each NAME is searched for in $PATH and added to the list
                    of remembered commands.

    Exit Status:
    Returns success unless NAME is not found or an invalid option is given.
You can ALSO force bash to do a new path lookup in case a command in the hash table does not exist anymore.
shopt -s checkhash
From bash's manpage:
checkhash
If set, bash checks that a command found in the hash table exists before trying to execute it. If a hashed command no longer exists, a normal path search is performed.
Example:
[blabla]$ PATH=$HOME/bin:$PATH
[blabla]$ hash -r
[blabla]$ cat bin/which
#!/bin/bash
echo "my which"
[blabla]$
[blabla]$ shopt -s checkhash
[blabla]$ which
my which
[blabla]$ mv bin/which bin/dis.which
[blabla]$ which which
/usr/bin/which
(Sorry, lol blahblablah, I am sure this is information overload D:)
Abrowser crashing, refer to: http://trisquel.info/en/forum/abrowser-very-frequent-crashes
Maybe try: http://www.techrepublic.com/blog/10-things/10-web-browsers-for-the-linux-operating-system/
More info, look around here? http://linux-kernel.2935.n7.nabble.com/
Do you know how to command-line install everything from your Wi-Fi to your GUI?
>> I am sure this is information overload <<
Yeah, kind of :P But all the same, I did ask a lot of questions, and I appreciate your effort to be thorough. As it turns out, switching from GNOME to E17 (Enlightenment) seems to have solved the lag/ freeze problem, along with a number of others (eg the weird ownCloud error message on login). For the first time in as long as I can remember, I was able to shut down my system normally after an all-day session. I will run a memtest overnight just out of curiosity though.
There are a couple of weird little issues in E17:
* Whether the sound controls are set to 'lock sliders' or not, only one volume slider goes up and down, and it has no effect on the actual volume.
* Sometimes some of the taskbar gadgets seem to freeze up, eg the clock won't open up to display the calendar when left-clicked.
It's going to take me a while to get used to how everything works, but I really like the way I can easily drag windows between desktops (either from the window itself or from the page), and slide between desktops by holding the mouse against the side of the screen etc. I also really like the fact that when I initiate the launch of an application on one desktop, it opens there no matter what. In GNOME, it just opens in whatever desktop I happen to be on at the time.