I should also note that the kernel is not lying; these files are visibly present in sysfs:
(initramfs) ls /sys/devices/virtual/block/md0/md
dev-sda1 safe_mode_delay resync_start raid_disks
reshape_position new_dev component_size layout
array_state metadata_version chunk_size level
(initramfs) ls /sys/devices/virtual/block/md1/md
dev-sda5 safe_mode_delay resync_start raid_disks
reshape_position new_dev component_size layout
array_state metadata_version chunk_size level
(initramfs)
Note the dev-sda1 in the md0/md directory in sysfs, and the dev-sda5 in the md1/md directory. These are the ones it complains about on insertion:
[ 35.023792] WARNING: at /build/buildd/linux-2.6.28/fs/sysfs/dir.c:462 sysfs_add_one+0x4c/0x50()
[ 35.023794] sysfs: duplicate filename 'dev-sda1' can not be created
[...]
[ 35.074528] WARNING: at /build/buildd/linux-2.6.28/fs/sysfs/dir.c:462 sysfs_add_one+0x4c/0x50()
[ 35.074529] sysfs: duplicate filename 'dev-sda5' can not be created
Whatever registered this directory seems to have done it properly; it has the appropriate links etc. internally:
(initramfs) ls -l /sys/devices/virtual/block/md0/md/dev-sda1
lrwxrwxrwx 1 0 0 0 block -> ../../../../../pci0000:00/0000:00:01.1/host0/target0:0:0/0:0:0:0/block/sda/sda1
-rw-r--r-- 1 0 0 4096 size
-rw-r--r-- 1 0 0 4096 offset
-rw-r--r-- 1 0 0 4096 slot
-rw-r--r-- 1 0 0 4096 errors
-rw-r--r-- 1 0 0 4096 state
OK, so where do these come from? They are made by bind_rdev_to_array() and undone by unbind_rdev_from_array(). From the logs we can see that the kernel is basically making, unmaking, and remaking the array to degrade it:
[ 3.371474] md: bind<sda1>
[ 3.381990] md: bind<sda5>
[...]
[ 35.003029] md: md0 stopped.
[ 35.003043] md: unbind<sda1>
[ 35.020198] md: export_rdev(sda1)
[ 35.023745] md: bind<sda1>
[ 35.023787] ------------[ cut here ]------------
[ 35.023792] WARNING: at /build/buildd/linux-2.6.28/fs/sysfs/dir.c:462 sysfs_add_one+0x4c/0x50()
[ 35.023794] sysfs: duplicate filename 'dev-sda1' can not be created
If we look at the unbind_rdev_from_array() call, it uses delayed work to remove the actual entries:
static void unbind_rdev_from_array(mdk_rdev_t * rdev)
{
[...]
        synchronize_rcu();
        INIT_WORK(&rdev->del_work, md_delayed_delete);
        kobject_get(&rdev->kobj);
        schedule_work(&rdev->del_work);
}
And it appears that this is what finally removes the objects:
static void md_delayed_delete(struct work_struct *ws)
{
        mdk_rdev_t *rdev = container_of(ws, mdk_rdev_t, del_work);
        kobject_del(&rdev->kobj);
        kobject_put(&rdev->kobj);
}
So if this was not waited for appropriately, we might well sometimes get back to binding the new one before the old entry has been removed. This being a race would also fit with the transient nature of the issue.
Will patch this to wait for the pending work and see if that resolves the issue or not.
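To make the suspected ordering problem concrete, here is a minimal userspace sketch of it. This is not kernel code and not the patch itself: struct dir and the model_* functions are hypothetical stand-ins for the sysfs directory, bind_rdev_to_array(), unbind_rdev_from_array() and "wait for the pending work", and the work queue is modelled as a simple flag that only an explicit flush acts on.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8
#define NAME_LEN    32

/* Hypothetical stand-in for a sysfs directory such as md0/md. */
struct dir {
    char entry[MAX_ENTRIES][NAME_LEN];  /* "" marks a free slot */
    int  del_pending[MAX_ENTRIES];      /* 1 = removal queued, not yet run */
};

/* Return the slot holding name, or -1 if it is not present. */
static int find(const struct dir *d, const char *name)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (strcmp(d->entry[i], name) == 0)
            return i;
    return -1;
}

/* Models the bind step: refuses a duplicate name, as sysfs does. */
static int model_bind(struct dir *d, const char *name)
{
    assert(name[0] != '\0');            /* "" is reserved for free slots */
    if (find(d, name) >= 0)
        return -1;                      /* duplicate filename */
    int slot = find(d, "");
    if (slot < 0)
        return -2;                      /* directory full */
    snprintf(d->entry[slot], NAME_LEN, "%s", name);
    return 0;
}

/* Models the unbind step: the removal is only queued, not performed. */
static void model_unbind(struct dir *d, const char *name)
{
    int slot = find(d, name);
    if (slot >= 0)
        d->del_pending[slot] = 1;
}

/* Models waiting for the pending work: performs every queued removal. */
static void model_flush(struct dir *d)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (d->del_pending[i]) {
            d->entry[i][0] = '\0';
            d->del_pending[i] = 0;
        }
    }
}

/* Bind, unbind, then rebind the same name, optionally flushing first. */
static int model_rebind(int flush_first)
{
    struct dir d = {0};
    model_bind(&d, "dev-sda1");
    model_unbind(&d, "dev-sda1");
    if (flush_first)
        model_flush(&d);
    return model_bind(&d, "dev-sda1");
}
```

In this model, model_rebind(0) returns -1 because the old dev-sda1 entry is still present when the rebind happens, which corresponds to the duplicate filename warning, while model_rebind(1) returns 0. In the real kernel the deferred removal runs on its own schedule, so the unflushed case would only fail some of the time.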