inconsistent handling of branches with '_' in their name

Bug #95109 reported by John A Meinel
10
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Michael Hudson-Doyle

Bug Description

This is somewhat related to bug 5575

The web interface will prevent you from using a name with an underscore, but it is still possible when using automatic naming through 'bzr register-branch'.

I have several branches that work this way, such as:
https://code.launchpad.net/~jameinel/bzr/iter_changes_unicode_unknowns

My default naming convention is to use an underscore, and so when I 'bzr register-branch' it just registers the same url with LP.

This creates a branch with an underscore in its Launchpad name. Which seems to work fine, even though it violates some hidden naming constraint.

The only time I run into that constraint is when I want to update the branch status from "New" to "Merged", because the +edit page will complain if you use underscores.

My preferred "fix" for this, would be to just allow "_" as a part of a name. I don't really understand why it would not be allowed. But for consistency sake, the other possibilities are to have the auto-name include dashes, or to fail the registration (with an understandable error) if the final name would include underscores.

Oh, and it is also possible to create a branch with that sort of name when pushing via sftp.

bzr push sftp://bazaar.launchpad.net/~jameinel/+junk/test_branch

Gives me
https://code.launchpad.net/~jameinel/+junk/test_branch

Tags: lp-code
Revision history for this message
James Henstridge (jamesh) wrote :

The database constraints don't match the constraint in the Python code.

The Python code uses our standard valid_name() constraint of ^[a-z0-9][a-z0-9\+\.\-]*$

The database constraint uses a valid_branch_name() constraint of ^(?i)[a-z0-9][a-z0-9+\.\-@_]+$ (i.e. same as above but allows upper case, "@" and "_").

One of these constraints is incorrect.

Revision history for this message
James Henstridge (jamesh) wrote :

I wonder if the database constraint is left over from the old Arch days?

Revision history for this message
Jonathan Lange (jml) wrote :

We recently discussed this at a Launchpad-Bazaar meeting. This is an edited summary:

As a general rule, we don't allow underscore in Launchpad ids that appear in URLs because they're hard to type, read out and so forth. If there's a good reason that we need underscores in branch names, and thus in URLs, then we can do so and it's not a big deal.

So, for the time being, no underscores. This means tightening the database constraints, migrating existing data and ensuring that the LP plugin and the codehosting server give helpful messages when users try to create branches with underscores. Alternatively, we could rewrite paths that contain '_' to use '-' instead.

More generally, it would be good to have a conversion function that handles underscores as well as other characters that aren't allowed on Launchpad.

Revision history for this message
John A Meinel (jameinel) wrote :

Yeah, Steve mentioned in London that it was because '_' is hard to communicate to people, and was felt like a better ui would use '-'.

I'm okay with that, but I think it is something that needs a bit of documentation and helpful errors. So that when someone does something that seems natural to them, and it fails, we have a clear reason as to why. Rather than just "'_' is not allowed in names." Which causes more frustration.

Having heard the reasons, I'm actually pretty okay with it, but not everyone has Steve to talk to.

Revision history for this message
Andrew Bennetts (spiv) wrote :

> Having heard the reasons, I'm actually pretty okay with it, but not everyone has Steve to talk to.

Maybe the real bug is that there's only one Steve ;)

Tim Penhey (thumper)
Changed in launchpad-bazaar:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
David Allouche (ddaa) wrote :

Is that related 124441?

In my opinion, the broader problem is that we do not our input (name, url, etc.) validation code in the various front-ends we provide (webapp, xmlrpc, sftp/bzr+ssh).

Revision history for this message
David Allouche (ddaa) wrote :

... the broader problem is that we do not REUSE our input [...] validation code...

Revision history for this message
Robert Collins (lifeless) wrote :

Just saw another user hit this today.

Assuming the approved solution is to ban creating branches with _, can we please make that change a critical priority? Fixing the existing branches is less important than stopping new users running into this hard-to-understand issue.

Revision history for this message
Jonathan Lange (jml) wrote :

I don't think it's urgent enough to be cherrypicked in, but it's certainly worth doing for 1.1.8.

Changed in launchpad-bazaar:
assignee: nobody → jml
importance: Medium → High
status: Confirmed → Triaged
Jonathan Lange (jml)
Changed in launchpad-bazaar:
assignee: jml → nobody
Revision history for this message
Tim Penhey (thumper) wrote :

David, is this fixed as part of the 1.1.7 rollout?

Changed in launchpad-bazaar:
assignee: nobody → ddaa
status: Triaged → Confirmed
Revision history for this message
David Allouche (ddaa) wrote :

This is not fixed in 1.1.7 as far as I know.

Revision history for this message
David Allouche (ddaa) wrote :

In my opinion, the fix should have two parts:

1. Fix sftp/bzr+ssh server and xmlrcp to IBranch['name'].validate: so people cannot register new branches with bad names.
2. Modify IBranch['name'].validate so it won't error out on invalid names that have not been modified: so users do not get errors for fields they have not changed.

David Allouche (ddaa)
Changed in launchpad-bazaar:
assignee: ddaa → nobody
Tim Penhey (thumper)
Changed in launchpad-bazaar:
assignee: nobody → jml
Jonathan Lange (jml)
Changed in launchpad-bazaar:
assignee: jml → mwhudson
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

More fun: you can register/upload a branch with upper case character in it's name, but the webapp doesn't allow this and codebrowse breaks for them.

I don't know how possible ddaa's 2. is.

Revision history for this message
David Allouche (ddaa) wrote :

Now that we have a process to communicate changes to users, it might be simpler to just fix the production data and not do point 2 at all.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Actually, it is quite easy, this sort of thing works nicely:

class BranchNameField(TextLine):
    def _validate(self, value):
        if self.context is not None and self.context.name == value:
            return
        super(BranchNameField, self)._validate(value)

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Not going to make 1.1.10.

Changed in launchpad-bazaar:
milestone: 1.1.10 → 1.1.11
Revision history for this message
Tim Penhey (thumper) wrote :

OK, here is the definitive answer:

'_' is allowed in branch names. On the whole this will allow things to work as expected. People do use underscores in branch names on their local machines, and to quietly change them would cause confusion and mayhem and an end to order as we know it.

The UI validators should be changed to allow underscores in branch names, and we should use this validator consistently in all locations.

Revision history for this message
James Henstridge (jamesh) wrote :

We'll also need to think about what to do with codebrowse then. Turbogears/cherrypy craziness "normalises" dashes to underscores in URL components, and we were relying on the fact that underscores were not allowed in branch names to unambiguously undo the mapping.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Thanks for the follow up. As we're not going to limit what the user can do, it doesn't seem so important to me to fix this asap, so I'll push it into 1.1.12.

It's not just _ though, it's also @ and uppercase characters (I should have mentioned this on the phone, I guess). What about them?

WRT codebrowse, I still think the answer is "dump cherrypy".

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Move to 1.1.12.

(I tried to do this before but got caught out by the usual bugtask editing usability gotcha...)

Changed in launchpad-bazaar:
milestone: 1.1.11 → 1.1.12
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Branch in review.

Changed in launchpad-bazaar:
status: Confirmed → In Progress
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

RF launchpad/devel revision 5241

Changed in launchpad-bazaar:
status: In Progress → Fix Committed
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Rolled out with 1.1.12.

Changed in launchpad-bazaar:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.