Comment 159 for bug 625239

Seth (bugs-sehe) wrote : Re: [Bug 625239] Re: X starts on wrong tty because gdm starts before nvidia driver is ready

On 09/15/2010 09:17 AM, Steve Langasek wrote:
> So far the only test on maverick from someone who's confirmed to see this
> bug on lucid has failed to reproduce the problem there, even /without/ the
> new gdm package. Nevertheless, I think there's enough information here to
> mark this as a duplicate of 615549 and target that bug for SRU.
>
Count me in. I just went the 'IKEA way' and added a reboot loop section
to my rc.local that logs all conceivable related info per boot[1]. The
script used has been attached. Per boot, a number of logfiles, various
ps listings, package versions and the upstart config are saved in a
tarball.
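
For readers without the attachment, here is a rough sketch of what such
a per-boot collector could look like. It is not the attached script: the
function name collect_boot_info, the file name ps_aux.txt and the
epoch-based tarball name are assumptions made for illustration, and the
package-version and upstart-config parts are left out.

```shell
# Hypothetical per-boot collector; NOT the attached script.
# Usage: collect_boot_info OUT_DIR [LOG_FILE...]
collect_boot_info() {
    out_dir=$1; shift
    stamp=$(date +%s)                 # assumed naming: gdmdebug_<epoch>.tar
    work=$(mktemp -d) || return 1
    mkdir -p "$work/tmp" "$work/var/log"

    # Process listing; tolerate a missing or failing ps.
    ps aux > "$work/tmp/ps_aux.txt" 2>/dev/null || true

    # Copy whichever of the requested log files are readable.
    for f in "$@"; do
        if [ -r "$f" ]; then cp "$f" "$work/var/log/"; fi
    done

    # Pack everything into one tarball and print its path.
    tarball="$out_dir/gdmdebug_$stamp.tar"
    tar cf "$tarball" -C "$work" tmp var && rm -rf "$work"
    echo "$tarball"
}
```

Called as "collect_boot_info /root /var/log/daemon.log /var/log/syslog",
this would leave one tarball per boot that can be queried with tar xOf.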

This resulted in 79 tarballs, which I intend to attach later insofar as
they contain the info Martin/Steve have been asking for.

I tested 79 boots (_real_ boots) in all, in three batches:

--------------------------------------------------
(A) with plain vanilla lucid config and package: 40 boots. 15 of these
boots landed on tty2 rather than tty7:

    root@lucid:~/first# for a in gdmdebug_1*; do tar xOf $a tmp/ |
    grep usr/bin/X | head -1; done | cut -c31-34 | sort | uniq -c
         15 tty2
         25 tty7
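
The fixed column range in cut -c31-34 depends on the exact width of the
saved ps listing. A slightly more tolerant variant is sketched below. It
assumes the listing is in ps-aux style with the TTY as its own
whitespace-separated field; the function name first_x_tty is made up for
this illustration.

```shell
# Print the tty of the first X server line read from stdin.
# Assumes ps-aux-style input where the TTY is a separate field, e.g. "tty7".
first_x_tty() {
    awk '/usr\/bin\/X/ {
        # Only the first matching line is inspected, like "head -1".
        for (i = 1; i <= NF; i++)
            if ($i ~ /^tty[0-9]+$/) { print $i; break }
        exit
    }'
}

# Possible use against the tarballs:
#   for a in gdmdebug_1*; do tar xOf "$a" tmp/ | first_x_tty; done | sort | uniq -c
```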

All tty2 occurrences coincide with one or more (1 _or_ 2) messages in
the logs about re-attempting to launch X:

    root@lucid:~/first# for a in gdmdebug_1*; do tar xOf $a var/log/daemon.log |
    grep 'display lasted' | wc -l; done | sort | uniq -c
         25 0
          6 1
          9 2

Interestingly, of the failing cases (37.5% of total boots), 60% (9 of
15) required 3 attempts, against 40% (6 of 15) requiring two attempts to
launch X. Ah, well.
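
The shares can be recomputed from the two tallies above (15 tty2 boots
out of 40; 6 boots with one 'display lasted' message and 9 with two). A
small sketch for the cross-check, where the helper name pct is only for
illustration:

```shell
# Percentage of PART in TOTAL, printed with one decimal place.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

pct 15 40   # boots that landed on tty2                       -> 37.5
pct 9 15    # failing boots with two messages (3 attempts)    -> 60.0
pct 6 15    # failing boots with one message (2 attempts)     -> 40.0
```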

--------------------------------------------------
(B) with plain vanilla packages but maverick-style startup condition
(from bug 615549): 23 out of 23 boots landed on tty7 just fine.

    root@lucid:~/martinfix# for a in gdmdebug_1*; do tar xOf $a tmp/ |
    grep usr/bin/X | head -1; done | cut -c31-34 | sort | uniq -c
         23 tty7

--------------------------------------------------
(C) In an effort to really torture-test this by reducing the boot time
to the absolute minimum, I switched off as many services as I could.
Resulting boot time (1 sample) was around 7.20s. Obviously the maverick
fresh install was even faster than that.

Happily, 16 out of 16 boots landed X on tty7:

    root@lucid:~/martin_minimal# for a in gdmdebug_1*; do tar xOf $a tmp/ |
    grep usr/bin/X | head -1; done | cut -c31-34 | sort | uniq -c
         16 tty7

All three batches were running the standard lucid package of gdm (manual
modifications done to gdm.conf in B and C):

    root@lucid:~# for a in */gdmdebug_*; do tar xOf $a | grep ^Version;
    done | uniq -c
         79 Version: 2.30.2.is.2.30.0-0ubuntu3

Hence, I would consider the fix in #615549 valid for this bug. Note that
it needs to be back-ported to lucid!
Finally, I think I will report the premature creation of
/var/run/gdm/firstserver.stamp, as well as the broken handling of gdm
start failure, as bugs against gdm.

Thanks to everyone for thinking along, and especially to Martin(?) for
coming up with the root cause.

[1] quote:

    #!/bin/bash
    #Not: set -e

...

    sleep 10
    chvt 1; echo "INFO: auto logging debug info"
    bash /etc/gdm_debug.sh

    stop gdm
    echo "WARNING: purging logs and rebooting in 4 seconds"
    sleep 4
    rm -fv /var/log/Xorg.* /var/log/gdm/* /var/run/gdm/firstserver.stamp
    rm /var/log/{syslog,user.log,kern.log,daemon.log}
    touch /var/log/{syslog,user.log,kern.log,daemon.log}
    reboot
    exit 0