[apport] polytopes crashed with SIGILL in _mesa_x86_64_transform_points4_perspective()

Bug #87661 reported by TylerMD
Affects          Status         Importance   Assigned to   Milestone
Mesa             Fix Released   Medium
mesa (Debian)    Fix Released   Unknown
mesa (Ubuntu)    Fix Released   Medium       Unassigned

Bug Description

Binary package hint: xscreensaver

Polytopes crashes every time I run it.

The first time:
In Herd 3, with xubuntu-desktop installed, I left the update manager running. When I came back and dismissed the polytopes screen saver by entering my password, it crashed and left GTK looking "ugly": GTK apps appeared unthemed, like GTK for Windows with no skin at all, and the window decoration (xfwin) had its colors reversed, showing blue instead of yellow.

Note: the no-skin problem and the color-reverse problem occurred just the first time.

ProblemType: Crash
Date: Sat Feb 24 22:08:14 2007
DistroRelease: Ubuntu 7.04
ExecutablePath: /usr/lib/xscreensaver/polytopes
Package: xscreensaver-gl 4.24-5ubuntu2
ProcCmdline: polytopes -root
ProcCwd: /home/manuel
ProcEnviron:
 PATH=/usr/lib/xscreensaver:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 LANG=es_ES.UTF-8
 SHELL=/bin/bash
Signal: 4
SourcePackage: xscreensaver
StacktraceTop:
 _mesa_x86_64_transform_points4_perspective ()
 ?? () from /usr/lib/dri/i915_dri.so
 _tnl_run_pipeline () from /usr/lib/dri/i915_dri.so
 _tnl_flush_vtx () from /usr/lib/dri/i915_dri.so
 ?? ()
Uname: Linux manuel-desktop 2.6.20-8-generic #2 SMP Tue Feb 13 01:14:41 UTC 2007 x86_64 GNU/Linux
UserGroups: adm admin audio cdrom dialout dip floppy lpadmin netdev plugdev powerdev scanner video

Revision history for this message
TylerMD (manuelmorales) wrote :
Revision history for this message
Daniel Holbach (dholbach) wrote :

Thanks for your bug report.

Changed in xscreensaver:
importance: Undecided → Medium
Revision history for this message
Apport retracing service (apport) wrote : Symbolic stack trace

StacktraceTop:?? ()
?? ()
?? ()

Revision history for this message
Apport retracing service (apport) wrote : Symbolic threaded stack trace
Revision history for this message
Sebastien Bacher (seb128) wrote :

Thanks for your bug report. Please try to obtain a backtrace http://wiki.ubuntu.com/DebuggingProgramCrash and attach the file to the bug report. This will greatly help us in tracking down your problem.

Changed in xscreensaver:
status: Unconfirmed → Needs Info
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for xscreensaver (Ubuntu) because there has been no activity for 60 days.]

Revision history for this message
Michael Dudalev (dudalev) wrote :

I confirm this bug, backtrace is the same.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Michael, please reopen the bug if you can provide a complete backtrace.

Revision history for this message
Michael Dudalev (dudalev) wrote :

The bug is really more serious than a failing (and innocent) xscreensaver hack.
backtrace:
#0 0x00002ade19df2582 in _mesa_x86_64_transform_points4_perspective () from /usr/lib/dri/i915_dri.so
#1 0x00002ade19d85909 in ?? () from /usr/lib/dri/i915_dri.so
#2 0x00002ade19d7bd7b in _tnl_run_pipeline () from /usr/lib/dri/i915_dri.so
#3 0x00002ade19d7c2a4 in _tnl_draw_prims () from /usr/lib/dri/i915_dri.so
#4 0x00002ade19d7454e in vbo_exec_vtx_flush () from /usr/lib/dri/i915_dri.so
#5 0x00002ade19d700fd in vbo_exec_FlushVertices () from /usr/lib/dri/i915_dri.so
#6 0x00002ade19cfbe2e in _mesa_Flush () from /usr/lib/dri/i915_dri.so
#7 0x000000000040486a in draw_polytopes (mi=0x7fff93e3fd30) at polytopes.c:3064
#8 0x000000000040738b in xlockmore_screenhack (dpy=0x63fb30, window=73400322, want_writable_colors=<value optimized out>,
    want_uniform_colors=0, want_smooth_colors=0, want_bright_colors=0, event_mask=66, hack_init=0x404ec0 <init_polytopes>,
    hack_draw=0x4046d0 <draw_polytopes>, hack_reshape=0x404e70 <reshape_polytopes>, hack_handle_events=0x404ce0 <polytopes_handle_event>,
    hack_free=0) at xlockmore.c:444
#9 0x0000000000404fc8 in screenhack (dpy=0x6c75c0, window=6922368) at ./../xlockmore.h:158
#10 0x0000000000405a81 in main (argc=1, argv=0x7fff93e404c8) at ./../screenhack.c:679
#11 0x00002ade18810b44 in __libc_start_main () from /lib/libc.so.6
#12 0x0000000000403199 in _start ()

But the point is not the backtrace; it is the code inside the driver:
(gdb) disassemble
Dump of assembler code for function _mesa_x86_64_transform_points4_perspective:
0x00002ade19df2530 <_mesa_x86_64_transform_points4_perspective+0>: mov 0x10(%rdx),%ecx
0x00002ade19df2533 <_mesa_x86_64_transform_points4_perspective+3>: movzbl 0x14(%rdx),%eax
0x00002ade19df2537 <_mesa_x86_64_transform_points4_perspective+7>: mov %ecx,0x10(%rdi)
0x00002ade19df253a <_mesa_x86_64_transform_points4_perspective+10>: movl $0x4,0x18(%rdi)
0x00002ade19df2541 <_mesa_x86_64_transform_points4_perspective+17>: orl $0xf,0x1c(%rdi)
0x00002ade19df2545 <_mesa_x86_64_transform_points4_perspective+21>: test %ecx,%ecx
0x00002ade19df2547 <_mesa_x86_64_transform_points4_perspective+23>: xchg %ax,%ax
0x00002ade19df254a <_mesa_x86_64_transform_points4_perspective+26>: je 0x2ade19df25b3 <_mesa_x86_64_transform_points4_perspective+131>
0x00002ade19df254c <_mesa_x86_64_transform_points4_perspective+28>: mov 0x8(%rdx),%rdx
0x00002ade19df2550 <_mesa_x86_64_transform_points4_perspective+32>: mov 0x8(%rdi),%rdi
0x00002ade19df2554 <_mesa_x86_64_transform_points4_perspective+36>: movd (%rsi),%mm0
0x00002ade19df2557 <_mesa_x86_64_transform_points4_perspective+39>: pxor %mm7,%mm7
0x00002ade19df255a <_mesa_x86_64_transform_points4_perspective+42>: punpckldq 0x14(%rsi),%mm0
0x00002ade19df255e <_mesa_x86_64_transform_points4_perspective+46>: movq 0x20(%rsi),%mm2
0x00002ade19df2562 <_mesa_x86_64_transform_points4_perspective+50>: prefetch (%rdx)
0x00002ade19df2565 <_mesa_x86_64_transform_points4_perspective+53>: movd 0x28(%rsi),%mm1
0x00002ade19df2569 <_mesa_x86_64_transform_points4_perspective+57>: xchg %a...

Revision history for this message
Michael Dudalev (dudalev) wrote :

invalid opcode (for Intel CPUs) in amd64 binary version

Changed in mesa:
status: Invalid → New
Revision history for this message
Mikael Gerdin (mgerdin) wrote :

I filed a bug against mesa, this issue still exists in git head on freedesktop.org.

Revision history for this message
Mikael Gerdin (mgerdin) wrote :

Upstream bug # changed to non-duplicate

Revision history for this message
Mikael Gerdin (mgerdin) wrote :

The comment on the mesa bugzilla that the build target should be changed to one that does not use assembly-optimizations is probably the only thing we can do to fix this.
Maybe this should be reported and fixed in debian first and then synced to ubuntu?

Changed in mesa:
status: Unknown → Confirmed
Revision history for this message
Mikael Gerdin (mgerdin) wrote :

This patch disables the usage of assembly optimizations on amd64 builds of mesa.
"It works for me"

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Yes, this should be reported and fixed in Debian, if it can be reproduced with Debian unstable. Once fixed there, we should make an SRU for Hardy.

Mikael, is your patch the minimal patch for turning off 3dnow optimizations?

Changed in mesa:
status: New → Confirmed
Revision history for this message
Mikael Gerdin (mgerdin) wrote :

Tormod:
I think that there is a nicer way to work around this by patching
debian/scripts/choose-configs
to include a special case for amd64 that sets DRI_CONFIGS, and probably SWX11_GLU_CONFIGS, to the unoptimised versions. However, I am not very comfortable with makefile syntax or the exact build procedure of the Debian mesa package, so I'm not sure that would suffice.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Does it help to use the environment variable MESA_NO_ASM? Setting it (for instance in /etc/environment) would be an easier workaround than rebuilding everything.

From what I understand (correct me if I am wrong):
The configs makefile magic does what it should do, and as can be verified by the hardy build logs, on i386 the mesa package is compiled with:
 -DUSE_X86_ASM -DUSE_MMX_ASM -DUSE_3DNOW_ASM -DUSE_SSE_ASM
and on amd64 it is compiled with:
 -DUSE_X86_64_ASM
and for instance on ia64 none of this is used.

The mesa x86 code carefully checks for these different flags while building, and probes for processor support when running, but the x86-64 code doesn't use for instance USE_3DNOW_ASM flags and just assumes an AMD64 processor from AMD with all corresponding instructions available.

If we turn off USE_X86_64_ASM in the build, mesa will work on the Intel x86-64 processors, but the AMD64 processors will now run slower, without these optimizations.
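
A minimal way to try the workaround on a single program before touching /etc/environment, assuming only that the installed Mesa honours MESA_NO_ASM as described above. The helper name run_without_mesa_asm is made up for this sketch:

```shell
# Run one command with MESA_NO_ASM=1 in its environment only.
# The calling shell's environment is left unchanged.
run_without_mesa_asm() {
    env MESA_NO_ASM=1 "$@"
}

# Hypothetical usage, with the path taken from the original report:
#   run_without_mesa_asm /usr/lib/xscreensaver/polytopes
```

If the screen saver no longer dies with SIGILL when started this way, the assembly code path is the likely culprit on that machine.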

Revision history for this message
Mikael Gerdin (mgerdin) wrote :

Setting MESA_NO_ASM stops the crashes for me. I agree that this is a much nicer workaround, but it would be even better if mesa could do run-time checks as it does on IA32.
Perhaps mesa's postinst-script could check cpu vendor and add MESA_NO_ASM=1 to /etc/environment?

Revision history for this message
plusplus (strosset) wrote :

export MESA_NO_ASM=1 also solves the problem here.
Thank you very much for this very useful tip !

Revision history for this message
Tormod Volden (tormodvolden) wrote :

The best solution would be to set this conditionally at run time, depending on the contents of /proc/cpuinfo. I suggest putting this in a file /etc/X11/Xsession.d/65mesa-check-x86-64, which will be sourced at X session startup, and shipping it with a mesa package. That way we can just drop it once it gets fixed in a newer mesa version.

I don't have 64bit hardware and I am not too familiar with the variants, but I guess something ("x64" but not "AMD") would be correct? Can someone wise chime in here, or can those of you who need "MESA_NO_ASM" please post your /proc/cpuinfo?

Revision history for this message
Mikael Gerdin (mgerdin) wrote :

I think the best solution is to just do a:
if grep -q -i genuineintel /proc/cpuinfo; then
That would be true iff the host CPU was made by Intel.
I'll attach my cpuinfo anyway.
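
For comparing the cpuinfo files posted here, the vendor test can be sketched as a small function. This is only an illustration: cpu_vendor is a hypothetical helper name, and it assumes the usual "vendor_id : value" line layout of /proc/cpuinfo on x86:

```shell
# Print the vendor_id of the first CPU listed in a cpuinfo-style file.
# $1: path to the file (normally /proc/cpuinfo).
cpu_vendor() {
    # Split on the colon and any spaces after it; stop at the first match.
    awk -F': *' '/^vendor_id/ { print $2; exit }' "$1"
}
```

A check such as [ "$(cpu_vendor /proc/cpuinfo)" = GenuineIntel ] is then the case-sensitive equivalent of the grep above.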

Revision history for this message
plusplus (strosset) wrote :

Here is my /proc/cpuinfo

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Well, we don't want this to kick in on all Intel processors (like the 32 bit ones), so maybe we have to check something like "uname -m" as well.

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Or to do it the other way, is there for instance something in "flags" for AMD64 processors that we can probe for? I found a /proc/cpuinfo for an "AuthenticAMD" and it has flags like 3dnowext and 3dnow (and uname -m gives x86_64):

if [ "$(uname -m)" = x86_64 ] && ! grep -q "^flags.*3dnow" /proc/cpuinfo; then
    MESA_NO_ASM=1
    export MESA_NO_ASM
fi

or alternatively:

if [ "$(uname -m)" = x86_64 ] && ! grep -q "^vendor_id.*AuthenticAMD" /proc/cpuinfo; then

Revision history for this message
Mikael Gerdin (mgerdin) wrote :

By checking `uname -m` we only detect if the _kernel_ is x86-64.
This would be a problem if someone was running a 64-bit kernel and a 32-bit userland because they would get MESA_NO_ASM exported without needing it.
I think it would be better to do
if [ `dpkg --print-architecture` = "amd64" ]
Because that would tell us if we were running in an amd64 userland and that's what we really want to test for.
I think checking for "3dnow" in flags is preferable because AMD is not the only CPU vendor using that technology. None of the others have built any 64-bit chips yet, but the flags check would probably be more future-proof.
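
Putting the two suggestions together, an Xsession.d-style check might look like the sketch below. It is not the file that was later attached to this report. The function name needs_mesa_no_asm is made up, and the file is written in POSIX sh because Xsession.d snippets are sourced rather than executed. One deliberate difference from the pattern quoted earlier: it matches "3dnow" as a whole word, since newer kernels can list a separate "3dnowprefetch" flag on CPUs without 3DNow! proper.

```shell
# Return 0 (true) when MESA_NO_ASM should be set.
# $1: userland architecture, e.g. the output of dpkg --print-architecture
# $2: path to a cpuinfo-style file (normally /proc/cpuinfo)
needs_mesa_no_asm() {
    # Only the amd64 userland uses the x86-64 assembly paths.
    [ "$1" = amd64 ] || return 1
    # Without a readable cpuinfo we cannot decide; change nothing.
    [ -r "$2" ] || return 1
    # True when no "3dnow" flag appears as a separate word on a flags line.
    ! grep -Eq '^flags[[:space:]]*:.*[[:space:]]3dnow([[:space:]]|$)' "$2"
}

# If dpkg is missing, the architecture is empty and nothing is exported.
if needs_mesa_no_asm "$(dpkg --print-architecture 2>/dev/null)" /proc/cpuinfo; then
    MESA_NO_ASM=1
    export MESA_NO_ASM
fi
```

Passing the architecture and the cpuinfo path as arguments keeps the decision testable against the cpuinfo files attached to this report without needing the hardware.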

Revision history for this message
Tormod Volden (tormodvolden) wrote :

You're probably right, I didn't think about such combinations. It could be even more complicated for all I know; I don't know how mesa chooses its code path in all these cases. Anyway, here's a proposed Xsession.d file that we can test and get reviewed.

Revision history for this message
Tormod Volden (tormodvolden) wrote :
Changed in mesa:
status: Unknown → Confirmed
Revision history for this message
Tormod Volden (tormodvolden) wrote :
Revision history for this message
Tormod Volden (tormodvolden) wrote :

Test packages with the above debdiff can be found in my PPA, built on hardy. Thinking about it, we could have checked DEB_BUILD_ARCH in debian/rules and only added the hook in the amd64 packages... And this workaround might not catch things like remote X etc, but it will be good enough for most people.

Revision history for this message
Daniel Holbach (dholbach) wrote : Sponsor Request

Bryce: can you please take a look at it?

Revision history for this message
Bryce Harrington (bryce) wrote :

The debdiff looks good to me, I've uploaded to intrepid.

Changed in mesa:
status: Confirmed → Fix Released
Changed in mesa:
status: Confirmed → Fix Released
Revision history for this message
Tormod Volden (tormodvolden) wrote :

The real fix is in mesa master now: http://cgit.freedesktop.org/mesa/mesa/commit/?id=2b8d8989fb6f9c36baf166fc715182a1407ebadb
Maybe we can cherry-pick it to our mesa 7.2?

Changed in mesa:
status: Confirmed → Fix Released
Changed in mesa:
importance: Unknown → Medium
Changed in mesa:
importance: Medium → Unknown
Changed in mesa:
importance: Unknown → Medium