Comment 45 for bug 269651

James Westby (james-w) wrote :

Hi,

Thanks for the information, it has been very helpful. I can reproduce the
problem with your method.

It seems that /var/run/ConsoleKit/database is a red herring. I replaced the
function that writes that file and does the rename seen at the end of the
strace with an empty function, and I still get the crash. So the crash is not
in writing the file, but in whatever it does after that.

Your steps to reproduce suggest this is an issue with removing sessions.

remove_session_for_cookie appears in the stacktrace, and contains a
call to ck_manager_dump just before calling g_object_unref which
is in the stacktrace one level lower than the remove_session_for_cookie
call.

The g_object_unref is of "orig_session", which has this comment where
it is retrieved:

        /* Must keep a reference to the session in the manager until
         * all events for seats are cleared. So don't remove
         * or steal the session from the master list until
         * it is removed from all seats. Otherwise, event logging
         * for seat removals doesn't work.
         */

The g_object_unref calls ck_session_finalize which in turn calls
session_remove_activity_watch, which ends up at file_monitor_remove_notify.
The notify is not NULL as was previously thought, as that would lead to
a segfault much earlier.

file_monitor_remove_notify first looks up the watch in its global list of watches, and
finds it. It then steals it out of the hash. It then removes this watch from the list of
watches for the same path. If that list is now empty, which it is in this case, it calls
file_monitor_remove_watch with the watch.
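
The sequence above can be sketched in plain C. This is only an illustration of the
control flow as described here, not the ConsoleKit code: the real implementation uses
GLib hash tables, and every name below (toy_monitor, toy_watch, toy_notify and the
toy_* functions) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NOTIFIES 8

/* Hypothetical per-path watch; ConsoleKit's real struct differs. */
typedef struct {
    const char *path;
    int n_notifies;   /* notifies still registered for this path */
    int released;     /* set once the path-level watch is torn down */
} toy_watch;

/* Hypothetical notify entry, keyed by id (0 marks a free slot). */
typedef struct {
    unsigned   id;
    toy_watch *watch;
} toy_notify;

typedef struct {
    toy_notify notifies[TOY_MAX_NOTIFIES];
} toy_monitor;

/* Register a notify for a watch. Returns 1 on success, 0 if the table is full. */
int toy_add_notify(toy_monitor *m, unsigned id, toy_watch *w)
{
    assert(m != NULL && w != NULL && id != 0);
    for (size_t i = 0; i < TOY_MAX_NOTIFIES; i++) {
        if (m->notifies[i].id == 0) {
            m->notifies[i].id = id;
            m->notifies[i].watch = w;
            w->n_notifies++;
            return 1;
        }
    }
    return 0;
}

/* Stand-in for file_monitor_remove_watch. The sketch simply refuses a
 * NULL watch; it does not model how the real code ends up with one. */
void toy_remove_watch(toy_watch *w)
{
    assert(w != NULL);
    w->released = 1;
}

/* Stand-in for file_monitor_remove_notify: look up, steal, and tear the
 * watch down when no notify is left. Returns 1 if the id was found. */
int toy_remove_notify(toy_monitor *m, unsigned id)
{
    assert(m != NULL);
    for (size_t i = 0; id != 0 && i < TOY_MAX_NOTIFIES; i++) {
        toy_notify *n = &m->notifies[i];
        if (n->id == id) {
            toy_watch *w = n->watch;
            n->id = 0;                 /* "steal" the entry from the table */
            n->watch = NULL;
            if (--w->n_notifies == 0)  /* last notify for this path */
                toy_remove_watch(w);
            return 1;
        }
    }
    return 0;
}
```

With two notifies on the same path, only the second removal reaches toy_remove_watch,
which is the call that matters for this bug.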

The stacktrace shows the watch is NULL in this case.

I think the problem is due to inotify causing things to happen in a separate thread.

The inotify response function makes sure to use idle_add to instruct the main thread
to act on the information, except if the inotify event has IN_IGNORED, indicating the
watch was removed (either with inotify_rm_watch or because the file was deleted).
In that case the code will call file_monitor_remove_watch from the other thread.
This is the same function that we see causing issues in the stacktrace.
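
A rough model of that dispatch decision, with the threads left out so that it stays
short. The toy_dispatcher type and both functions are hypothetical; in the real code
the deferral is done with GLib idle callbacks rather than an array. The second function
shows one possible direction for a fix (deferring IN_IGNORED as well), which is an
assumption for illustration and not a tested patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/inotify.h>   /* IN_IGNORED, IN_MODIFY; Linux only */

#define TOY_QUEUE_MAX 16

/* Hypothetical bookkeeping: which events wait for the main thread and
 * how many were acted on directly by the inotify reader thread. */
typedef struct {
    uint32_t pending[TOY_QUEUE_MAX];
    size_t   n_pending;
    int      handled_in_reader;
} toy_dispatcher;

/* Behaviour as described above: everything is deferred to the main
 * thread except IN_IGNORED, which is handled where it was read. */
void toy_dispatch_as_described(toy_dispatcher *d, uint32_t mask)
{
    assert(d != NULL && d->n_pending < TOY_QUEUE_MAX);
    if (mask & IN_IGNORED)
        d->handled_in_reader++;            /* remove_watch would run here */
    else
        d->pending[d->n_pending++] = mask; /* idle_add equivalent */
}

/* Possible alternative (assumption): defer IN_IGNORED too, so that watch
 * teardown only ever happens in the main thread. */
void toy_dispatch_all_deferred(toy_dispatcher *d, uint32_t mask)
{
    assert(d != NULL && d->n_pending < TOY_QUEUE_MAX);
    d->pending[d->n_pending++] = mask;
}
```

The point of the comparison is only that the first variant lets two threads touch the
watch bookkeeping, while the second keeps it in one place.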

As it is /dev/tty9 or similar that is being watched, it is unlikely to have been removed,
and so I believe we can assume it is inotify_rm_watch that is causing the event.
The only caller of this is monitor_release_watch, whose only caller (that isn't tearing
down everything) is file_monitor_remove_watch.

I'm not clear why this doesn't lead to an infinite loop, except that an IN_IGNORED may
not be generated for every call to inotify_rm_watch.
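
Whether inotify_rm_watch produces an IN_IGNORED event can be checked outside ConsoleKit
with a small standalone test along these lines. The function name and its return
convention are invented for this sketch; the inotify and poll calls are the standard
Linux ones.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Adds a watch on `path`, removes it again, then reads the queue.
 * Returns 1 if IN_IGNORED arrived for that watch, 0 if nothing of the
 * kind arrived within timeout_ms, and -1 if inotify could not be set up. */
int rm_watch_reports_ignored(const char *path, int timeout_ms)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int result = 0;

    assert(path != NULL);

    int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (fd < 0)
        return -1;

    int wd = inotify_add_watch(fd, path, IN_MODIFY);
    if (wd < 0 || inotify_rm_watch(fd, wd) < 0) {
        close(fd);
        return -1;
    }

    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    while (result == 0 && poll(&pfd, 1, timeout_ms) > 0) {
        ssize_t len = read(fd, buf, sizeof buf);
        if (len <= 0)
            break;
        /* A single read may return several variable-length events. */
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *) p;
            if (ev->wd == wd && (ev->mask & IN_IGNORED))
                result = 1;
            p += sizeof(struct inotify_event) + ev->len;
        }
    }

    close(fd);
    return result;
}
```

Calling it as rm_watch_reports_ignored("/dev/tty9", 1000) on the affected machine would
show whether the event is delivered there.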

I am going to try and debug this a bit further and test some patches to fix it.

Thanks,

James