Comment 112 for bug 330824

Theodore Ts'o (tytso) wrote :

Rocko,

Thanks for your kernel panic log. This doesn't prove anything, but 14 seconds before the oops involving ext4_delete_inode, there was a recursive fault in the X server, which was apparently running the Nvidia proprietary driver.

So I hate to ask this, but (a) how many other people who have been having this problem are running with the proprietary Nvidia driver? And (b) would it be possible for people who can easily reproduce this, if they are using the Nvidia proprietary driver, to see if the problem goes away when you shut down the X server and run the rm -rf from either a VT console or an ssh login?

It's *possible* that the Nvidia driver is an innocent victim, and not the cause, but I don't use the proprietary closed-source Nvidia driver, and I'm having a devil of a time reproducing the problem. So if we can take the closed-source Nvidia driver out of the equation, it would be useful to see where that leaves us.
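For anyone unsure how to answer (a) from a saved oops log, here is a rough Python sketch that pulls the taint flags and the module names out of log lines shaped like the ones below. The function names are hypothetical, only the two flags that appear in this log are decoded, and the parsing assumes the "Tainted:" field and the "symbol+0xNN/0xNN [module]" layout seen here. Treat it as an illustration, not a general oops parser.

```python
import re

# Meanings for the two taint letters that appear in the log below.
# Other taint letters exist; they are deliberately not decoded here.
TAINT_MEANINGS = {
    "P": "a proprietary module has been loaded",
    "D": "the kernel has already oopsed or hit a BUG",
}

# Matches trace entries such as "_nv020907rm+0x14/0x42 [nvidia]".
MODULE_RE = re.compile(r"\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def taint_flags(line):
    """Return the taint letters that follow 'Tainted:' on an oops header line."""
    _, found, rest = line.partition("Tainted:")
    if not found:
        return []
    flags = []
    for token in rest.split():
        # Flags are upper-case letters; the kernel version that follows
        # starts with a digit, so stop at the first non-letter token.
        if token.isalpha() and token.isupper():
            flags.extend(token)
        else:
            break
    return flags


def modules_in_trace(lines):
    """Return the sorted set of module names named in bracketed trace entries."""
    modules = set()
    for line in lines:
        modules.update(MODULE_RE.findall(line))
    return sorted(modules)


if __name__ == "__main__":
    sample = [
        "[85627.288225] Pid: 3678, comm: Xorg Tainted: P D 2.6.28-11-generic #42-Ubuntu",
        "[<ffffffffa0099e26>] _nv020907rm+0x14/0x42 [nvidia]",
        "[<ffffffff802e8f6f>] ? __fput+0xcf/0x1f0",
    ]
    for flag in taint_flags(sample[0]):
        print(flag, "-", TAINT_MEANINGS.get(flag, "not decoded in this sketch"))
    print("modules in trace:", modules_in_trace(sample))
```

A "P" flag together with [nvidia] frames in the call trace is the combination to report. On a running system, /proc/sys/kernel/tainted exposes the same information as a bitmask, where bit 0 corresponds to the "P" flag.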

Apr 25 17:10:08 10.1.1.11 [85627.288225] Pid: 3678, comm: Xorg Tainted: P D 2.6.28-11-generic #42-Ubuntu
Apr 25 17:10:08 10.1.1.11 [85627.288228] RIP: 0010:[<ffffffffa0099e26>]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa0099e26>] _nv020907rm+0x14/0x42 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.288379] RSP: 0018:ffff88011e9d19b0 EFLAGS: 00010202
Apr 25 17:10:08 10.1.1.11 [85627.288381] RAX: 64943378215a9b68 RBX: ffff8801074d2b90 RCX: 0000000000000000
Apr 25 17:10:08 10.1.1.11 [85627.288383] RDX: ffff8801074d2b90 RSI: ffff88010b582f70 RDI: e001208000000000
Apr 25 17:10:08 10.1.1.11 [85627.288385] RBP: ffff88010b582f68 R08: 0000000000000001 R09: ffffffffa0a2c330
Apr 25 17:10:08 10.1.1.11 [85627.288387] R10: 0000000000000001 R11: 0000000000011bf2 R12: ffff88010b582f70
Apr 25 17:10:08 10.1.1.11 [85627.288389] R13: 00000000e0012080 R14: ffff88010b582f9c R15: 0000000000000000
Apr 25 17:10:08 10.1.1.11 [85627.288392] FS: 0000000000000000(0000) GS:ffffffff80aa3000(0000) knlGS:0000000000000000
Apr 25 17:10:08 10.1.1.11 [85627.288394] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 25 17:10:08 10.1.1.11 [85627.288396] CR2: 00007f4fa66e0620 CR3: 0000000000201000 CR4: 00000000000006a0
Apr 25 17:10:08 10.1.1.11 [85627.288399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 25 17:10:08 10.1.1.11 [85627.288402] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 25 17:10:08 10.1.1.11 [85627.288404] Process Xorg (pid: 3678, threadinfo ffff88011e9d0000, task ffff88010b424320)
Apr 25 17:10:08 10.1.1.11 [85627.288406] Stack:
Apr 25 17:10:08 10.1.1.11 [85627.288409] ffffffffa00d3fc1
Apr 25 17:10:08 10.1.1.11 00000000c1d00001
Apr 25 17:10:08 10.1.1.11 00000000e0012080
Apr 25 17:10:08 10.1.1.11 ffff88010b582f90
Apr 25 17:10:08 10.1.1.11
Apr 25 17:10:08 10.1.1.11 [85627.288415]
Apr 25 17:10:08 10.1.1.11 ffffffffa00d3d69
Apr 25 17:10:08 10.1.1.11 ffff88010b582fa0
Apr 25 17:10:08 10.1.1.11 00000000c1d00001
Apr 25 17:10:08 10.1.1.11 0000000000000000
Apr 25 17:10:08 10.1.1.11
Apr 25 17:10:08 10.1.1.11 [85627.288421]
Apr 25 17:10:08 10.1.1.11 000000000100cb01
Apr 25 17:10:08 10.1.1.11 0000000000000000
Apr 25 17:10:08 10.1.1.11 ffffffffa03fe7b3
Apr 25 17:10:08 10.1.1.11 ffff880112cb5660
Apr 25 17:10:08 10.1.1.11
Apr 25 17:10:08 10.1.1.11 [85627.288430] Call Trace:
Apr 25 17:10:08 10.1.1.11 [85627.288433]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa00d3fc1>] ? _nv019426rm+0x5c/0x91 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.288526]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa00d3d69>] ? _nv019519rm+0x39/0x78 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.288617]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa03fe7b3>] ? _nv003729rm+0x61/0x220 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.288747]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa03fc819>] ? _nv003740rm+0x96/0x20b [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.288872]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa03fc4ab>] ? _nv003744rm+0x3e/0x316 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.288996]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa0478407>] ? _nv003714rm+0xe0/0x121 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289124]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa047a1c7>] ? rm_free_unused_clients+0x69/0xb7 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289249]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa054cd7f>] ? nv_kern_ctl_close+0x6f/0x100 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289352]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa054f1fb>] ? nv_kern_close+0x2eb/0x3c0 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289452]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802e8f6f>] ? __fput+0xcf/0x1f0
Apr 25 17:10:08 10.1.1.11 [85627.289458]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802e90ad>] ? fput+0x1d/0x30
Apr 25 17:10:08 10.1.1.11 [85627.289462]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802e553b>] ? filp_close+0x5b/0x90
Apr 25 17:10:08 10.1.1.11 [85627.289466]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802530ad>] ? put_files_struct+0x7d/0xe0
Apr 25 17:10:08 10.1.1.11 [85627.289471]
Apr 25 17:10:08 10.1.1.11 [<ffffffff8025315f>] ? exit_files+0x4f/0x60
Apr 25 17:10:08 10.1.1.11 [85627.289475]
Apr 25 17:10:08 10.1.1.11 [<ffffffff80254f51>] ? do_exit+0x1b1/0x3b0
Apr 25 17:10:08 10.1.1.11 [85627.289480]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa054d1cd>] ? nv_kern_ioctl+0x15d/0x490 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289581]
Apr 25 17:10:08 10.1.1.11 [<ffffffff8069f81e>] ? oops_end+0xbe/0xc0
Apr 25 17:10:08 10.1.1.11 [85627.289586]
Apr 25 17:10:08 10.1.1.11 [<ffffffff80215cbe>] ? die+0x5e/0x90
Apr 25 17:10:08 10.1.1.11 [85627.289591]
Apr 25 17:10:08 10.1.1.11 [<ffffffff8069f4c8>] ? do_general_protection+0x158/0x160
Apr 25 17:10:08 10.1.1.11 [85627.289594]
Apr 25 17:10:08 10.1.1.11 [<ffffffff8069e96a>] ? error_exit+0x0/0x70
Apr 25 17:10:08 10.1.1.11 [85627.289598]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa054d1cd>] ? nv_kern_ioctl+0x15d/0x490 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289698]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802e31c4>] ? __kmalloc+0x74/0x110
Apr 25 17:10:08 10.1.1.11 [85627.289705]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa054d1cd>] ? nv_kern_ioctl+0x15d/0x490 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289805]
Apr 25 17:10:08 10.1.1.11 [<ffffffffa054d53c>] ? nv_kern_unlocked_ioctl+0x1c/0x20 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.289905]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802f62d1>] ? vfs_ioctl+0x31/0xa0
Apr 25 17:10:08 10.1.1.11 [85627.289910]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802f6685>] ? do_vfs_ioctl+0x75/0x230
Apr 25 17:10:08 10.1.1.11 [85627.289913]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802f68d9>] ? sys_ioctl+0x99/0xa0
Apr 25 17:10:08 10.1.1.11 [85627.289917]
Apr 25 17:10:08 10.1.1.11 [<ffffffff802e82a0>] ? sys_read+0x50/0x90
Apr 25 17:10:08 10.1.1.11 [85627.289920]
Apr 25 17:10:08 10.1.1.11 [<ffffffff8021253a>] ? system_call_fastpath+0x16/0x1b
Apr 25 17:10:08 10.1.1.11 [85627.289924] Code:
     ....
Apr 25 17:10:08 10.1.1.11
Apr 25 17:10:08 10.1.1.11 [85627.289993] RIP
Apr 25 17:10:08 10.1.1.11 [<ffffffffa0099e26>] _nv020907rm+0x14/0x42 [nvidia]
Apr 25 17:10:08 10.1.1.11 [85627.290079] RSP <ffff88011e9d19b0>
Apr 25 17:10:08 10.1.1.11 [85627.290173] ---[ end trace 3dc5288f733b0548 ]---
Apr 25 17:10:08 10.1.1.11 [85627.290175] Fixing recursive fault but reboot is needed!
Apr 25 17:10:08 10.1.1.11 [85627.300924] general protection fault: 0000 [#6]
Apr 25 17:10:08 10.1.1.11 SMP
Apr 25 17:10:08 10.1.1.11