Comment 7 for bug 485976

ChrisW (chris-simplistix) wrote : Re: [Bug 485976] Re: back end randomly transitioned to failed

Christian Theune wrote:
>> - Which connection is the client using, based on this logging? (if you
>> need debug logging stuff, lemme know which timeframe for)
>
> I'd say that in the first time frame (10:40-10:42) ZR1 is being
> connected to by the Zope server but not used and ZR2 is being connected
> to by the batch processor and actually doing some serious business.

That's odd, since ZR1 is listed first on all clients...
Why would ZR1 not be used?

>> - What triggers the inconsistent OIDs on zeoraid2 at 10:41, and how
>> does this lead to inconsistent raid on zeoraid2 at 11:22?
>
> You're asking about zeoraid1 at 11:22, right?

Nope, zeoraid2, at 10:41, as I put in a previous message:
2009-11-25T10:41:36 CRITICAL gocept.zeoraid Storage zeo1 degraded.
Reason: inconsistent OIDs
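
For anyone wanting to pull these events out of a log file, here is a minimal Python sketch. It assumes each entry sits on a single line in the shape quoted above (timestamp, level, logger name, message); the regular expression and the find_degraded name are illustrative only and are not part of gocept.zeoraid, and the real log layout may differ.

```python
import re
from datetime import datetime

# Pattern modelled on the single entry quoted above; the real layout may differ.
DEGRADED = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>\w+)\s+gocept\.zeoraid\s+"
    r"Storage\s+(?P<storage>\S+)\s+degraded\.\s+"
    r"Reason:\s+(?P<reason>.*)$"
)


def find_degraded(lines):
    """Yield (timestamp, storage, reason) for each 'degraded' entry."""
    for line in lines:
        match = DEGRADED.match(line.strip())
        if match:
            yield (
                datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S"),
                match["storage"],
                match["reason"],
            )


# The one entry quoted in this comment, joined onto a single line.
sample = [
    "2009-11-25T10:41:36 CRITICAL gocept.zeoraid Storage zeo1 degraded. "
    "Reason: inconsistent OIDs",
]
events = list(find_degraded(sample))
```

Running the same function over the logs of both ZEORaid servers would give a timeline of when each back end was marked degraded and why, which can then be lined up against the pack schedule.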

> I'm not yet sure what makes the OIDs inconsistent. I have the feeling
> that it's the packing.

...but packing only happens once a week, and not on a Wednesday.

> However, I wonder what happened before that
> transaction in ZR2 that came out with inconsistent OIDs.

Anything I can do to help you find out?

>> - can you add some logging to show what connection is in use, or let
>> me know what I should grep for to find...
>
> Not quite sure what you mean. The connection is displayed: at some point
> a client connects and gets a storage server id assigned. All later
> calls have that id in them.

That would be the "what I should grep for" answer then ;-)

Chris