Comment 3 for bug 485976

Revision history for this message
ChrisW (chris-simplistix) wrote :

Okay, this happened again...

On zeoraid2:

10:41 - CRITICAL: Storage zeo1 degraded. Reason: inconsistent OIDs

On zeoraid1:

11:22 - ERROR: lastTransaction() raised exception: RAID is inconsistent and was closed.

This is the packed storage once more.

Now, thinking about this more, I leave the clients connected to zeoraid1, and yet the initial error here happens on zeoraid2, which should be idle...

So, here's the event log from zeoraid1:
------
2009-11-25T10:40:33 INFO ZEO.StorageServer (13221) new connection ('client', 57615): <ManagedServerConnection ('client', 57615)>
------
2009-11-25T10:40:33 INFO ZEO.StorageServer (13221) disconnected
------
2009-11-25T10:40:38 INFO ZEO.StorageServer (13221) new connection ('client', 57617): <ManagedServerConnection ('client', 57617)>
------
2009-11-25T10:40:38 INFO ZEO.StorageServer (13221) disconnected
------
2009-11-25T11:22:32 INFO ZEO.StorageServer (13221) new connection ('client', 58926): <ManagedServerConnection ('client', 58926)>
------
2009-11-25T11:22:32 INFO ZEO.zrpc.Connection(S) (client:58926) received handshake 'Z303'
------
2009-11-25T11:22:32 INFO ZEO.StorageServer (13221/client:48809) disconnected
------
2009-11-25T11:22:32 INFO ZEO.ClientStorage [('zeo1', 6011)] Disconnected from storage: "('zeo1', 6011)"
------
2009-11-25T11:22:32 INFO ZEO.ClientStorage [('zeo2', 6011)] Disconnected from storage: "('zeo2', 6011)"
------
2009-11-25T11:22:33 ERROR ZEO.zrpc.Connection(S) (client:58926) lastTransaction() raised exception: RAID is inconsistent and was closed.

Likewise from zeoraid2:
------
2009-11-25T10:40:33 INFO ZEO.StorageServer (1670) new connection ('client', 57616): <ManagedServerConnection ('client', 57616)>
------
2009-11-25T10:40:33 INFO ZEO.zrpc.Connection(S) (client:57616) received handshake 'Z303'
------
2009-11-25T10:40:38 INFO ZEO.StorageServer (1670) new connection ('client', 57618): <ManagedServerConnection ('client', 57618)>
------
2009-11-25T10:40:38 INFO ZEO.zrpc.Connection(S) (client:57618) received handshake 'Z303'
------
2009-11-25T10:41:36 CRITICAL gocept.zeoraid Storage zeo1 degraded. Reason: inconsistent OIDs
------
2009-11-25T10:41:36 INFO ZEO.ClientStorage [('zeo1', 6011)] Disconnected from storage: "('zeo1', 6011)"
------
2009-11-25T11:22:32 INFO ZEO.StorageServer (1670) new connection ('client', 58927): <ManagedServerConnection ('client', 58927)>
------
2009-11-25T11:22:33 INFO ZEO.zrpc.Connection(S) (client:58927) received handshake 'Z303'
------
2009-11-25T11:22:33 INFO ZEO.StorageServer (1670) tid to old for invq 252843237928668467 < 252847970190423022
------

So, the questions I have:

- Why is the client connecting to both zeoraid1 and zeoraid2 at the same time?

- What causes the disconnections on zeoraid1 at 10:40:33 and 10:40:38?

- Which connection is the client using, based on this logging? (if you need debug logging stuff, lemme know which timeframe for)

- What triggers the inconsistent OIDs on zeoraid2 at 10:41, and how does this lead to inconsistent raid on zeoraid2 at 11:22?

Request:

- please can you un-mark this bug as a duplicate

- can you add some logging to show what connection is in use, or let me know what I should grep for to find this in the debug logs...

cheers,

Chris