buildbot is letting production breaking changes through

Bug #645860 reported by Robert Collins
This bug affects 1 person
Affects            Status       Importance    Assigned to    Milestone
Launchpad itself   Won't Fix    High          Unassigned

Bug Description

For approximately the last week we have been in a state where db-stable will fail to deploy, and we'll be running around fixing things on release day: a terrible place to be.

I think we have to -urgently-:
 - stop letting lp_lucid_db bless revisions
 - fix lp_db and have it bless revisions

We will still get 2.6 coverage from the lp_lucid builder, and we'll get 2.5 coverage from lp_db, giving us approximately full coverage except for the rarer direct landings on db-devel. And those will be safe for 2.5.

Gary Poster (gary) wrote :

The "critical" value has a particular meaning (https://wiki.canonical.com/Launchpad/PolicyandProcess/DefinitionofCriticalPolicy) which I do not think is appropriate. I am downgrading it to "high".

The meaning of "high" has been diluted by a variety of causes IMO, which means that I understand the desire to call this critical. However, I don't think the answer is to call very-high priority bugs critical bugs, except possibly as a way of kicking.

Moreover, I don't want to commission this work until LOSAs say that they do not believe we will be on Lucid for the next release. I will continue to pursue this.

Changed in launchpad-foundations:
status: New → Triaged
importance: Critical → High

Gary Poster (gary) wrote :

In discussion with mthaddon, LOSAs believe that we will be fully on Lucid/PG 8.4 by the end of next week.

Robert Collins (lifeless) wrote :

That definition is for use in the context of escalating problems on the production website; according to Francis, a 'critical bug' is one which we 'must work on now'. Regardless of its relevance, under that policy this issue is critical: "PQM or buildbot down, unplanned 1 working day" - if you consider buildbot failing to do what it's meant to do 'down'. [I do]

I took the following into account when saying 'critical':
 - it may cause a last-minute, 4-hour delay in the release process
 - it has already incurred repeated significant delays in delivering CPs for OOPSes (and general improvements)

I was actually specifically intending that this bug jump the queue and be a priority interrupt, which understandably is disruptive and should happen very rarely.

As the DB servers won't be moving to Lucid until after the release (AIUI), waiting for that just invites the first condition to trigger, and we'll all pay the price if that happens. Either that, or the release manager probably needs to do a test run on prod-devel 9 hours (two runs + fuzz time) ahead of trying to release.

Changed in launchpad-foundations:
status: Triaged → Won't Fix