Comment 13 for bug 342096

Bart de Koning (bratdaking) wrote :

Hi all, I just ran into this problem too, on Ubuntu 9.04 UNR installed fresh from a USB stick, with sda1 as / (ext2) and an SD card for /home (ext4). After my battery ran empty, the SD card failed on resume.
The problem is evidently a lost partition table. Output of 'sudo fdisk -l':

Disk /dev/mmcblk0: 8201 MB, 8201961472 bytes
4 heads, 16 sectors/track, 250304 cylinders
Units = cylinders of 64 * 512 = 32768 bytes
Disk identifier: 0x00000000

Disk /dev/mmcblk0 doesn't contain a valid partition table
(output translated from Dutch)
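
As far as I know, fdisk reports a missing partition table when the first sector of the disk does not end in the MBR boot signature (bytes 0x55 0xAA at offsets 510 and 511). Below is a minimal sketch of that check. It runs against small image files rather than the real /dev/mmcblk0, and the function names and image files are my own, made up for illustration.

```python
import os
import tempfile

SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # bytes 510-511 of a valid MBR sector


def has_mbr_signature(path):
    """Return True if the first sector of `path` ends with the MBR boot signature."""
    with open(path, "rb") as dev:
        sector = dev.read(SECTOR_SIZE)
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def make_image(path, with_signature):
    """Write a one-sector test image (hypothetical helper, for the demo only)."""
    sector = bytearray(SECTOR_SIZE)
    if with_signature:
        sector[510:512] = BOOT_SIGNATURE
    with open(path, "wb") as img:
        img.write(sector)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        intact = os.path.join(tmp, "intact.img")
        wiped = os.path.join(tmp, "wiped.img")
        make_image(intact, with_signature=True)
        make_image(wiped, with_signature=False)
        print("intact image has signature:", has_mbr_signature(intact))
        print("wiped image has signature:", has_mbr_signature(wiped))
```

Pointing has_mbr_signature at the real device would need root, and it only tells you whether the signature is there, not whether the partition entries themselves are sane.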

Googling around, I found that this problem affects people on other systems as well, especially in the OpenMoko and OLPC communities; see:
http://dev.laptop.org/ticket/6532
http://docs.openmoko.org/trac/ticket/1802

As far as I understand, the resume path does something tricky: it powers the SD card back up and checks whether the card is still in the slot, but it does the check too soon, before power is fully restored. The check fails, so the kernel removes the disk from the system even though the card is still inserted and mounted, and the partition table ends up being overwritten:

***
dsaxena in http://dev.laptop.org/ticket/6532:
Upon coming out of resume, the SD code, with CONFIG_MMC_UNSAFE_SUSPEND enabled, checks to see if there is a card plugged into the system and whether that card is the same as the one that was plugged into the system at suspend time. This is accomplished by reading the card ID of the device and for some reason, very possibly #1339, we fail this detection. In this case, the kernel removes the old device from the system and in this execution path, the partition information for this device is zeroed.

Even though the device is removed, the device is still mounted and upon unmount, ext2 syncs the superblock, even if the file system is sync'd beforehand. The superblock is block 0 of the partition and the block layer adds to this the partition start offset before submitting the write to the lower layers. As the partition information has already been zeroed out, we end up writing to block 0 of the disk itself, overwriting the partition table and the geometry information. I've verified this by both gathering debug output and 'dd' + 'hexdump' of corrupted and uncorrupted media.
***
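
To make the offset arithmetic in that explanation concrete, here is a small sketch of how a partition-relative sector maps to a sector on the disk. This is only an illustration of the description quoted above, not the kernel code; the Partition class, the function name and the start sector of 16 are all made up.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    start_sector: int  # offset of the partition on the disk, in sectors


def disk_sector(partition, relative_sector):
    """Map a partition-relative sector to an absolute sector on the disk."""
    return partition.start_sector + relative_sector


# Hypothetical layout: the partition begins at sector 16 of the card.
intact = Partition(start_sector=16)
# After the failed card check, the partition information has been zeroed.
zeroed = Partition(start_sector=0)

# A write to the first sector of the partition stays inside the partition...
print("intact partition info ->", disk_sector(intact, 0))
# ...but with a zeroed start offset the same write lands on sector 0 of the
# disk, which is where the partition table lives.
print("zeroed partition info ->", disk_sector(zeroed, 0))
```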

The proposed fix was to add a delay of 400 ms to the resume process. Unfortunately I am not really a programmer, so I do not yet have a clue how to do that.

In the meantime, here is another (dirty) workaround I found, from polarbaer in http://docs.openmoko.org/trac/ticket/1802:

***
I have the same problem with a 8GB SanDisk card, even when the 2008.8 is installed on the micro-SD. Since I have a Qtopia in the flash (to have a stable phone) I backed up the partition table via
dd if=/dev/mmcblk0 of=/home/root/backup.img bs=512 count=1
and put the restore-command
dd of=/dev/mmcblk0 if=/home/root/backup.img bs=512 count=1
to any bootup-script in Qtopia, in my case to the top of:
/etc/init.d/mountall.sh
so it fixes itself once I boot my Qtopia.

Not a fix, but at least a very bloody workaround.
***
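
For anyone who prefers a script over raw dd, the two commands in that workaround amount to copying the first 512-byte sector to a file and writing it back later. A rough sketch of the same idea follows. The function names are my own, and the demo runs on a scratch image file instead of the real /dev/mmcblk0, since writing to the wrong device destroys data.

```python
import os
import tempfile

SECTOR_SIZE = 512


def backup_first_sector(device, backup):
    """Copy the first sector of `device` to `backup` (like the first dd command)."""
    with open(device, "rb") as src:
        sector = src.read(SECTOR_SIZE)
    if len(sector) != SECTOR_SIZE:
        raise ValueError("could not read a full sector from " + device)
    with open(backup, "wb") as dst:
        dst.write(sector)


def restore_first_sector(device, backup):
    """Write `backup` over the first sector of `device` (like the second dd command)."""
    with open(backup, "rb") as src:
        sector = src.read(SECTOR_SIZE)
    if len(sector) != SECTOR_SIZE:
        raise ValueError("backup is shorter than one sector")
    # "r+b" overwrites in place without truncating the rest of the device.
    with open(device, "r+b") as dst:
        dst.seek(0)
        dst.write(sector)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        image = os.path.join(tmp, "card.img")      # stands in for the SD card
        backup = os.path.join(tmp, "backup.img")
        original = bytes(range(256)) * 2 + b"\xff" * (3 * SECTOR_SIZE)
        with open(image, "wb") as img:
            img.write(original)

        backup_first_sector(image, backup)
        # Simulate the corruption: the first sector gets overwritten.
        with open(image, "r+b") as img:
            img.write(b"\x00" * SECTOR_SIZE)
        restore_first_sector(image, backup)

        with open(image, "rb") as img:
            print("image restored:", img.read() == original)
```

The backup has to be taken while the partition table is still intact, and it should live on a different disk than the card. In my setup /home is on the SD card itself, so a backup stored under /home would become unreachable together with the partition table.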