Resume brings system instability and hard freezes

Bug #490893 reported by jhfhlkjlj
This bug affects 2 people
Affects: pm-utils (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Binary package hint: pm-utils

Some people also affected by bug 417842 have this problem, though it might be unrelated.

Suspend/resume worked just fine in Jaunty. In Karmic, the suspend and resume themselves also work, but after resuming, certain parts of the system start to fail, and eventually the system freezes. The magic SysRq keys sometimes bring me back to the login screen, but then the computer freezes completely and a hard reboot is needed.

This computer normally uses an NVIDIA card with the 185 drivers, but my card died last night. As such, I have experienced this problem with the restricted drivers + compiz as well as with my current setup of Intel onboard graphics + metacity. Therefore, it cannot be a bug specific to nvidia or compiz. I'm making a blind guess that the problem lies with pm-utils.

This is reproducible 95 percent of the time. It's not a matter of if the system will become completely unstable, though, it's _when_ it will. Sometimes it's after five minutes, and sometimes it's after a few hours. Leaving my computer on 24/7 from bootup presents no problems whatsoever.

Let me know if more information is needed.

ProblemType: Bug
Architecture: i386
Date: Tue Dec 1 11:08:37 2009
DistroRelease: Ubuntu 9.10
Package: pm-utils 1.2.5-2ubuntu7
PackageArchitecture: all
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-15.50-generic
SourcePackage: pm-utils
Uname: Linux 2.6.31-15-generic i686

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

I will give a case scenario that happened today:

I resumed my computer when I woke up and it worked fine. I came back to it a few hours later and started working. About five minutes in, gnome-system-monitor crashed. The computer asked if I wanted to reload the applet, and I told it to. The panels then crashed, and the computer completely locked up.

Revision history for this message
eigenman (eigenman96) wrote :

One thing that is probably relevant here is the set of quirks that your computer has enabled for putting your video card to sleep. You can list them with the command "lshal | grep power_management", but it's probably better to upload the full output of lshal to this bug.

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

Here you go. Specifically for the power manager:

  power_management.acpi.linux.version = '20090521' (string)
  power_management.can_hibernate = true (bool)
  power_management.can_suspend = true (bool)
  power_management.can_suspend_hybrid = false (bool)
  power_management.is_powersave_set = false (bool)
  power_management.quirk.dpms_on = true (bool)
  power_management.quirk.dpms_suspend = true (bool)
  power_management.quirk.vbe_post = true (bool)
  power_management.quirk.vbemode_restore = true (bool)
  power_management.quirk.vbestate_restore = true (bool)
  power_management.quirk.vga_mode_3 = true (bool)
  power_management.type = 'acpi' (string)
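
For anyone comparing quirk sets between machines, a listing like the one above can be reduced to just the enabled quirks. This is a minimal sketch, not part of the original report: the function name enabled_quirks is made up for illustration, and it assumes only the "key = value (type)" layout shown above.

```shell
# Print the names of the video quirks that lshal reports as enabled.
# Reads lshal-style text on stdin, one property per line.
# enabled_quirks is a hypothetical helper name, not part of HAL or pm-utils.
enabled_quirks() {
    sed -n 's/^[[:space:]]*power_management\.quirk\.\([a-z0-9_]*\) = true .*$/\1/p'
}
```

It could be used as "lshal | enabled_quirks", which prints one quirk name per line.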

Revision history for this message
eigenman (eigenman96) wrote :

One thing you can do is to try to enable or disable various quirks to see if it will make your system more stable. On my laptop I found that disabling all the quirks made my computer functional, although I haven't had the time to play with all the options yet. To disable them you can edit /usr/share/hal/information/10freedesktop/99-video-quirk-default.fdi (not the best place, but it was the quickest way for me). I gave a set of instructions for bug #469734.
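
As a possible alternative to editing the .fdi file, the pm-suspend man page in pm-utils 1.2.x lists --quirk-* command-line options (for example --quirk-vbe-post), which may make it easier to test one combination at a time. The sketch below only converts HAL-style quirk names into that option style; quirk_flags is a made-up helper name, and the assumption that each HAL quirk maps to an option by swapping underscores for dashes should be checked against "man pm-suspend" on your release.

```shell
# Turn HAL quirk names (one per line on stdin, e.g. "vbe_post") into
# pm-suspend style options (e.g. "--quirk-vbe-post").
# quirk_flags is a hypothetical helper name; the name mapping is an
# assumption, so verify each option in "man pm-suspend" before using it.
quirk_flags() {
    sed -e '/^$/d' -e 's/_/-/g' -e 's/^/--quirk-/'
}
```

A test run with two quirks might then look like: sudo pm-suspend $(printf 'vbe_post\nvga_mode_3\n' | quirk_flags)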

(There used to be a wiki page on Ubuntu's website about debugging suspend, which seems to have disappeared. I think it suggested which quirks to enable and in what order.)

Hope this helps, and maybe someone with more experience can give a less vague/hackish suggestion.

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

Hi, eigenman.

The default.fdi file I found was not in the specified path but rather in /usr/share/hal/fdi/information/10freedesktop.

I guess I'll get started testing. If anyone else would be willing to do this (as I really am just kind of poking the unknown beast here), that'd be great. I'll start with can_suspend_hybrid and see what happens....

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

Tried a few, and each time I resumed, it would work for a few minutes. Another case scenario:

I resumed and started random programs. I had ScummVM running (Monkey Island 2) and it worked fine. I then tried to edit the default.fdi file again for the next test, but gedit wouldn't open. The window borders were there, but the actual body of the program would not appear. I tried to open a terminal, but the same thing happened: borders, but nothing inside. I tried to shut down the computer via the applet, but the shutdown timer dialog also would not draw. Accessing the interactive shutdown menu worked fine, so I shut down and rebooted.

Are there any particular quirks that I should be trying? I've only done a few, but they all seem to yield the same results. I'd say which ones I changed, but the text file listing the ones I edited did not save (another strange effect that happened after resume).

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

If this helps, I have attached the output of:

sync; echo 1 > /sys/power/pm_trace; pm-suspend

as dmesg.txt
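
For context, the kernel's suspend debugging notes (Documentation/power/s2ram.txt) describe pm_trace as a tool for resumes that hang: the kernel stores a hash of the last trace point in the real-time clock, and after a reboot the result shows up in dmesg on lines containing "Magic number" and "hash matches". Because the RTC is overwritten, the system clock will be wrong afterward. The sketch below pulls those lines out of a saved log; pm_trace_matches is a made-up helper name, and the file name is the attachment mentioned above.

```shell
# Show the pm_trace results recorded in a saved dmesg log.
# pm_trace_matches is a hypothetical helper name; the first argument is
# the path to the saved log file.
pm_trace_matches() {
    grep -e 'Magic number' -e 'hash matches' "$1"
}
```

For example: pm_trace_matches dmesg.txt. If nothing is printed, the log contains no trace result.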

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

I have been going without suspend for a while.

I decided to try it again last night, and upon waking the computer up, I was presented with a blank screen with only a cursor. I used SysRq + K to kill my X session, but it never reloaded itself. The screen just went completely black, and a hard reset was required.

Revision history for this message
Chow Loong Jin (hyperair) wrote :

Reassign to nvidia-graphics-drivers, since that's probably the cause of X not working properly after resuming.

affects: pm-utils (Ubuntu) → nvidia-graphics-drivers (Ubuntu)
Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

I disabled the restricted drivers, rebooted, and tried to suspend. The same thing happened.

Reassigning to pm-utils until we have a better lead.

affects: nvidia-graphics-drivers (Ubuntu) → pm-utils (Ubuntu)
Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

I did a fresh install of Lucid when I switched from Karmic. I still have the problem, but far less frequently. Instead of a 100 percent chance of the desktop being screwed up, it's more like 10 percent.

When this happens, I cannot even switch to tty1. The system is completely and utterly powerless. The only thing I haven't tried is ssh-ing into it, but I really don't think that will do any good unless I know what to look for in there (though I would bet that I would be unable to anyway!)

summary: - [Karmic] Resume brings system instability and hard freezes
+ Resume brings system instability and hard freezes
Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

Those who are subscribed, can you still reproduce? Are you on Karmic/Lucid? Please confirm if you can.

Revision history for this message
eigenman (eigenman96) wrote :

I'm not sure if my issues were exactly the same as this one. Suspend/hibernate on my laptop usually works (it has improved since upgrading to Lucid); however, it still flakes out in various ways even now. I haven't been able to find the cause to file a bug report yet.

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

pm-suspend says everything was a success, no errors.

I am attaching a truncated version of my /var/log/kern.log. I guess one should ignore the massive amounts of "Buffer I/O error on device sr0, logical block 0" errors.

I suspended my computer at night on May 5, and resumed it this morning around 8.

Possible lines of note:

Line 16: ACPI handle has no context!
Lines 109 onwards.
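
To make a log like this easier to read without losing the line references above, the sr0 noise could be filtered out while keeping each remaining line's original line number. A minimal sketch; kern_log_without_sr0 is a made-up helper name, and the argument is whatever copy of kern.log is being inspected.

```shell
# Print a kernel log without the repeated sr0 buffer errors, prefixing each
# remaining line with its line number in the original file (grep -n).
# kern_log_without_sr0 is a hypothetical helper name.
kern_log_without_sr0() {
    grep -n -v 'Buffer I/O error on device sr0' "$1"
}
```

For example: kern_log_without_sr0 /var/log/kern.log | less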

Revision history for this message
jhfhlkjlj (fdsuufijjejejejej-deactivatedaccount) wrote :

Wow, blast from the past. This bug has been fixed for me for a long time, and the report is too broad to be of any use. I'm going to close this bug. Anyone with this issue should raise a new bug report with up-to-date information.

Changed in pm-utils (Ubuntu):
status: New → Invalid