gdm starts too early, X.org/VTs fail

Bug #502838 reported by Martin Pitt
This bug affects 11 people
Affects: gdm (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Martin Pitt

Bug Description

Binary package hint: gdm

Since the gdm upstart rule was changed to start very early (graphics-device-added fb0 or drm-device-added card0), I only get a working startup in about one out of ten cases:

 (1) standard 80x25 VT appears for some seconds
 (2) KMS gets switched on, and I see a big text VT for a fraction of a second
 (3) X.org starts up

But in 9 out of 10 cases I just see (1), then lots of flickering, and eventually an X.org "low graphics" dialog (which is utterly distorted when the laptop runs standalone with internal screen, and looks fine when I have the external screen attached, for some curious reason). In this case the VTs are broken as well ("frequency not supported" monitor message, and otherwise black).

It seems that KMS and X.org start at roughly the same time, and thus fight each other. Can we introduce a signal that fires only when KMS has actually finished its setup, instead of when it begins setup?

Martin Pitt (pitti)
Changed in gdm (Ubuntu):
assignee: nobody → Scott James Remnant (scott)
tags: added: ubuntu-boot
Changed in gdm (Ubuntu):
importance: Undecided → High
tags: added: regression-potential
Scott James Remnant (Canonical) (canonical-scott) wrote :

If you have "graphics-device-added fb0" you have a very old version of the rules?!

Can you attach your gdm.conf for me? The current version looks like:

start on (filesystem
          and (graphics-device-added fb0 PRIMARY_DEVICE_FOR_DISPLAY=1
               or drm-device-added card0 PRIMARY_DEVICE_FOR_DISPLAY=1
               or stopped udevtrigger))

Changed in gdm (Ubuntu):
status: New → Incomplete
João Pinto (joaopinto) wrote :

I am experiencing the same issue; I never get to case (2).

Martin Pitt (pitti) wrote : Re: [Bug 502838] Re: gdm starts too early, races with KMS, X.org/VTs fail

 status new

Hello Scott,

Scott James Remnant [2010-01-04 17:34 -0000]:
> If you have "graphics-device-added fb0" you have a very old version of
> the rules?!
>
> Can you attach your gdm.conf for me. The current version looks like:
>
> start on (filesystem
> and (graphics-device-added fb0 PRIMARY_DEVICE_FOR_DISPLAY=1
> or drm-device-added card0 PRIMARY_DEVICE_FOR_DISPLAY=1
> or stopped udevtrigger))

That's indeed what I have. I wasn't quoting verbatim, just sketching
the structure. My gdm.conf is unmodified.

Changed in gdm (Ubuntu):
status: Incomplete → New
Scott James Remnant (Canonical) (canonical-scott) wrote :

On Mon, 2010-01-04 at 18:29 +0000, Martin Pitt wrote:

> > start on (filesystem
> > and (graphics-device-added fb0 PRIMARY_DEVICE_FOR_DISPLAY=1
> > or drm-device-added card0 PRIMARY_DEVICE_FOR_DISPLAY=1
> > or stopped udevtrigger))
>
> That's indeed what I have. I wasn't quoting verbatim, just sketching
> the structure. My gdm.conf is unmodified.
>
Those _are_ the signals that the KMS driver is ready.

gdm or X must need something else. Don't suppose you can strace it or
something to find out what it's failing on?

Or do you have an X.org log from a failure?

Scott
--
Scott James Remnant
<email address hidden>

Martin Pitt (pitti) wrote : Re: gdm starts too early, races with KMS, X.org/VTs fail

So I watched it more closely now, and I think you are right: the KMS race might well be a red herring. Sometimes I do see the big text console appear with the 25 "ureadahead-other terminated with status 4" messages before gdm/X start up, so KMS was ready at that time. I suppose the broken text terminals ("frequency not supported") are due to the vesa driver then?

When such a failure occurs, I don't have a /var/log/Xorg.0.log, just a (rather uninteresting) Xorg.failsafe.log with VESA stuff. dmesg has no errors either and just shows that DRM is up and the screen modes are set properly.

/var/log/gdm/:0.log shows that KMS/DRM indeed worked:

[ 3.243116] (II) intel(0): Output DVI1 using initial mode 1280x1024
[ 3.292017] (II) intel(0): [DRI2] Setup complete
[ 3.292039] (**) intel(0): Kernel mode setting active, disabling FBC.

There is no error message in the end, just

[ 36.783732] (II) intel(0): Modeline "1280x1024"x0.0 108.00 1280 1328 1440 1688 1024 1025 1028 1066 +hsync +vsync (64.0 kHz)
[ 119.962615] (II) "Video Bus": Close
[ 120.012305] (II) "Video Bus": Close
[ 120.052313] (II) "Power Button": Close
[ 120.082316] (II) "Sleep Button": Close
[ 120.132396] (II) "HID 05f3:0007": Close
[ 120.192504] (II) "HID 05f3:0007": Close
[ 120.262437] (II) "Logitech USB-PS/2 Optical Mouse": Close
[ 120.312431] (II) "AT Translated Set 2 keyboard": Close
[ 120.372361] (II) "DualPoint Stick": Close
[ 120.522372] (II) "Dell WMI hotkeys": Close
[ 120.572430] (II) "Macintosh mouse button emulation": Close
 ddxSigGiveUp: Closing log

Given the timestamps, that's a normal shutdown; I played a bit with the failsafe options, so two minutes sounds realistic. What I wonder about is why the server with the intel driver was up for so long, because what I saw and used was definitely the VESA driver.

Nothing else interesting in :0-{slave,greeter}.log unfortunately. I'll try an strace and some more debugging.

Changed in gdm (Ubuntu):
status: New → Incomplete
summary: - gdm starts too early, races with KMS, X.org/VTs fail
+ gdm starts too early, X.org/VTs fail
lokað (lokad) wrote :

I seem to have the same issue.
But I have the VT for a longer time, some messages about readahead and /home being clean, then gibberish.
But I do not even get readable output on my attached second monitor, only gibberish on my netbook's monitor. The system is not unresponsive, as the content changes when I press Esc. I did not try anything else for safety reasons. Ctrl-Alt-Del reboots instantly.

If I boot to maintenance mode and start gdm manually everything works fine.

This seemed to first happen when I attached my external monitor for the first time. Right after the installation of Alpha1 everything was fine. But now it does not matter whether it is connected or not.

Machine: Asus EEE 1000H with Intel GMA.

I have no logs of these incidents, the root fs seems to be mounted ro this early.

Martin Pitt (pitti) wrote :

Argh, this is an utter pain to debug, but that's what I found out:

 * When this happens, gdm only writes a "failsave.log" with a single number in it. No other logs.

 * Wrapping gdm-binary into strace in the upstart script introduces enough slowdown to make gdm startup succeed. Yay heisenbug.

 * Similar to starting gdm by hand or other slowdown, it works when there's no ureadahead pack, i.e. when reprofiling. (This might explain why it works for me sometimes.)

 * The failsafe X.org server is on VT2 (!). This means that it started to launch an X.org server at vt7 first (due to 05_initial_server_on_vt7.patch), and then started another (failsafe) one on the next free VT, which apparently was 2. As I already said, all other VTs are dysfunctional (black screen and "frequency not supported").

Is there a possibility to add more verbosity to upstart event/job processing? Can this be single-stepped somehow?

Kai Jauch (kaijauch) wrote :

I redirected gdm's output to a logfile in /etc/init/gdm.conf (since it wasn't logging anything anywhere when it failed), this is what I get when it fails to start:

** (gdm-binary:1261): WARNING **: Couldn't connect to system bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
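
For anyone wanting to reproduce this, the redirection described above can be sketched roughly as follows. This is only an illustration, not the exact change used here: the log path is a made-up example, any arguments on the stock exec line should be kept, and the log file must live on a filesystem that is already writable when the job runs.

```
# /etc/init/gdm.conf (excerpt, debugging only)
# Assumption: the job normally runs gdm-binary from a plain exec line.
# Wrapping it in a script stanza lets the shell redirect stdout/stderr.
script
    # /var/log/gdm-upstart-debug.log is an arbitrary, hypothetical path
    exec gdm-binary >> /var/log/gdm-upstart-debug.log 2>&1
end script
```

Remove the redirection again once the failure has been captured, since it bypasses gdm's normal logging.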

Kai Jauch (kaijauch) wrote :

According to dbus.conf and gdm.conf they both start at roughly the same time if no remote filesystems are to be mounted.
Sometimes dbus is fast enough and sets itself up before gdm, sometimes gdm is faster. When gdm starts and dbus is not yet ready, it just bails out (see gdm-2.29.4/daemon/main.c).

Robert Hooker (sarvatt) wrote :

Removing /lib/udev/rules.d/40-xserver-xorg-video-intel.rules seems to have fixed the problem here.

Its contents:

# do not edit this file, it will be overwritten on update

# Jesse Barnes on <email address hidden>:
# You'll get three events, one when the error is detected, one before the
# reset and one after. Each has a different environment variable set; the
# initial error has ERROR=1, the pre-reset event has RESET=1 and the
# post-reset event has ERROR=0.

DRIVER=="i915, "ACTION=="change", ENV{ERROR}==1, PROGRAM="/usr/share/apport/apport-gpu-error-intel.py"

/usr/share/apport/apport-gpu-error-intel.py is also not currently being installed with xserver-xorg-video-intel. I brought all these issues up with bryyce and tjaalton yesterday on IRC in #ubuntu-x.

Kai Jauch (kaijauch) wrote :

Robert, removing /lib/udev/rules.d/40-xserver-xorg-video-intel.rules didn't fix it for me. Same thing as before, gdm-binary unable to connect to the system bus and bailing out because of it.

Robert Hooker (sarvatt) wrote :

Ahh sorry about the noise then, I assumed it worked because I didn't get dropped to the login prompt for 6 boots after removing it and that was pretty much unheard of. I should play the lottery today :) Thanks for digging into it more though, seems like you have identified the problem. Do you have any suggestions on how to get it working? I'm not very literate with upstart, but can I just make the start) section in gdm.conf require a stopped dbus? Or have dbus.conf emit something and gdm wait for that?

Martin Pitt (pitti) wrote :

Robert, nice catch! Indeed that makes absolute sense. I committed the waiting for D-Bus

  http://bazaar.launchpad.net/~ubuntu-desktop/gdm/ubuntu/revision/186

and booted three times in a row, now it starts perfectly again.
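
A sketch of what waiting for D-Bus could look like, assuming the change simply adds upstart's "started dbus" event to the start condition quoted earlier in this report. The exact wording here is an assumption; the revision linked above is the authoritative version.

```
# /etc/init/gdm.conf (excerpt)
# Assumption: the D-Bus job is named "dbus", so upstart emits
# "started dbus" once the system bus daemon is running.
start on (filesystem
          and started dbus
          and (graphics-device-added fb0 PRIMARY_DEVICE_FOR_DISPLAY=1
               or drm-device-added card0 PRIMARY_DEVICE_FOR_DISPLAY=1
               or stopped udevtrigger))
```

With a condition of this shape, gdm is not started until the filesystem, the system bus, and one of the graphics events have all occurred, so gdm-binary no longer races against the creation of /var/run/dbus/system_bus_socket.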

Changed in gdm (Ubuntu):
assignee: Scott James Remnant (scott) → Martin Pitt (pitti)
status: Incomplete → In Progress
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gdm - 2.29.4-0ubuntu2

---------------
gdm (2.29.4-0ubuntu2) lucid; urgency=low

  * debian/gdm.upstart: Wait for D-Bus to be ready, to avoid failure if gdm
    starts too early. Thanks to Robert Hooker! (LP: #502838)
 -- Martin Pitt <email address hidden> Sat, 09 Jan 2010 17:34:43 +0100

Changed in gdm (Ubuntu):
status: Fix Committed → Fix Released
Christopher (soft-kristal) wrote :

Still a problem in -10.

Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 502838] Re: gdm starts too early, X.org/VTs fail

On Sat, 2010-01-09 at 16:33 +0000, Martin Pitt wrote:

> Robert, nice catch! Indeed that makes absolute sense. I committed the
> waiting for D-Bus
>
> http://bazaar.launchpad.net/~ubuntu-desktop/gdm/ubuntu/revision/186
>
> and booted three times in a row, now it starts perfectly again.
>
That makes sense ;-)

The rule came from old-gdm originally, which wasn't so dependent on
D-Bus!

Scott
--
Scott James Remnant
<email address hidden>
