Comment 2 for bug 1202395

Mark Sapiro (msapiro) wrote :

This is complicated. It is not clear that this is a bug, and if it is a bug, it is not clear that the bug is in sync_members.

The problem occurs in the statements

    s = email.Utils.formataddr((name, addr)).encode(enc, 'replace')

when name contains non-ascii. The first issue is that the job of email.Utils.formataddr() is to take a name and address pair and return a string (e.g. 'name <addr>') suitable for inclusion in a To:, From:, Cc:, etc. header of an email message. Headers are not allowed to contain non-ascii, so it could be argued that if name contains non-ascii, the result returned by email.Utils.formataddr() should be RFC 2047 encoded so that it doesn't contain non-ascii.
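To illustrate the RFC 2047 point, here is a minimal sketch that encodes the display name before passing it to formataddr(). It uses email.header.Header and the lowercase email.utils module name; the name and address are made-up values, and this is only an illustration, not what sync_members currently does.

```python
from email.header import Header, decode_header
from email.utils import formataddr

# Hypothetical example values, not taken from the bug report.
name = u'Jos\xe9'              # display name containing a non-ascii character
addr = 'jose@example.com'

# RFC 2047 encode the display name so the header value is pure ascii.
encoded_name = Header(name, 'utf-8').encode()
header_value = formataddr((encoded_name, addr))
print(header_value)

# Round trip: decode the encoded word back to the original name.
raw, charset = decode_header(encoded_name)[0]
decoded_name = raw.decode(charset)
```

The resulting string contains only ascii, so a later encode() of it cannot trip over the name.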

Ignoring that, the next issue is that Python's default encoding is ascii regardless of locale. Thus, when we try to encode() the byte string returned by email.Utils.formataddr(), Python must first decode it to unicode, and it does this using the ascii codec, which throws the exception (a UnicodeDecodeError) as soon as it meets a non-ascii byte. Removing the encode() as the suggested patch does avoids this, but is not, I think, the best way to fix this.
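The implicit decode can be spelled out as two explicit steps. The following is a sketch only: the byte string is a made-up stand-in for what formataddr() returned, and both its UTF-8 encoding and the iso-8859-1 value of enc are assumptions.

```python
# Hypothetical UTF-8 byte string, standing in for what formataddr() returned.
raw = b'Jos\xc3\xa9 <jose@example.com>'
enc = 'iso-8859-1'   # assumed output encoding

# Step 1 of the implicit conversion: decode with the default ascii codec.
try:
    raw.decode('ascii')
    failed = False
except UnicodeDecodeError:
    failed = True     # the exception described above

# Decoding with the encoding the bytes are really in makes encode() safe.
s = raw.decode('utf-8').encode(enc, 'replace')

# If the target charset lacks the character, 'replace' substitutes '?'.
q = raw.decode('utf-8').encode('ascii', 'replace')
```

Changing the default encoding has the effect of making the first step use a codec that can handle the bytes, which is why the encode() no longer fails.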

I think the proper fix is to make your Python locale-aware by editing the /usr/lib/pythonv.v/site.py module and changing the first

    if 0:

in the definition of setencoding() to

    if 1:

This will not only fix this issue with sync_members, it will also fix the garbled output from list_members -f and probably other cases of non-ascii being replaced with '?' in the command line scripts.

Another way to do this is to add

    import sys
    sys.setdefaultencoding('utf-8')

to the sitecustomize.py module (/etc/pythonv.v/sitecustomize.py on Debian).