Comment 8 for bug 581207

M. Vefa Bicakci (mvb) wrote :

Please do not get offended, but I think you misunderstood the
problem. As I explained above in my first post, because the Turkish
language (and hence the Turkish locale) has different capitalization
rules regarding "i", even if the desktop entries are composed of
ASCII characters, we will get problems when we try to process
"i" or "I" characters. (In Turkish: "ı" <-> "I" and "i" <-> "İ".)

How?

Let's say we are processing "X-AppInstall-Ignore". If we try to use
lower() on this string in a non-Turkish locale, we would get
"x-appinstall-ignore". However, if we do this in a Turkish locale,
then we would get "x-appInstall-Ignore". Notice that the "I"
characters stay the same. This is because we can't represent
the small dot-less i in ASCII.
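
The behaviour described above can be modelled with a short sketch. This is only an illustration, not the C library's actual implementation: the names TURKISH_LOWER, turkish_lower and turkish_locale_lower_ascii are hypothetical, and the rule "keep the original character when the lowercase form is not ASCII" is an assumption taken from the explanation above.

```python
# Hypothetical model of lower() under Turkish rules; not the C library's code.
TURKISH_LOWER = {"I": "\u0131", "\u0130": "i"}  # "I" -> "ı", "İ" -> "i"


def turkish_lower(text):
    """Lowercase text using the Turkish mapping for "I" and "İ"."""
    return "".join(TURKISH_LOWER.get(ch, ch.lower()) for ch in text)


def turkish_locale_lower_ascii(text):
    """Model the case described above: a lowercase form is used only if it
    is still ASCII; otherwise the character is left unchanged."""
    out = []
    for ch in text:
        lowered = TURKISH_LOWER.get(ch, ch.lower())
        if all(ord(c) < 128 for c in lowered):
            out.append(lowered)
        else:
            out.append(ch)  # "ı" cannot be represented, so "I" stays "I"
    return "".join(out)


print(turkish_lower("X-AppInstall-Ignore"))               # x-appınstall-ıgnore
print(turkish_locale_lower_ascii("X-AppInstall-Ignore"))  # x-appInstall-Ignore
```

Neither result equals "x-appinstall-ignore", so a comparison against the expected lowercase key fails in both cases.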

So, the problem isn't whether a string contains Unicode data.
The problem is that we are trying to apply Turkish capitalization
rules to a string that contains English/ASCII data. As I noted
above, the Turkish capitalization rules for "i" differ from those
of English, and because of this, we do not get the string we
expect.

I am going to attach a small Python script to illustrate this problem.
Please run it on your system to see the effects of the Turkish locale.
As you will see, the "i" and "I" characters are not capitalized or made
lowercase properly.

To my knowledge, the only work-around to this problem is to define
ASCII-only upper and lower functions and use them whenever the
data we are operating on is English.
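
A minimal sketch of such ASCII-only functions follows. It is not the attached script; the names ascii_lower and ascii_upper are hypothetical, and the code assumes Python 3, where str.maketrans is available. The translation tables touch only the 26 ASCII letters, so the result does not depend on the active locale.

```python
import string

# Translation tables that map only A-Z <-> a-z; every other character,
# including "ı" and "İ", passes through unchanged.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def ascii_lower(text):
    """Lowercase only the ASCII letters of text, ignoring the locale."""
    return text.translate(_ASCII_LOWER)


def ascii_upper(text):
    """Uppercase only the ASCII letters of text, ignoring the locale."""
    return text.translate(_ASCII_UPPER)


print(ascii_lower("X-AppInstall-Ignore"))  # x-appinstall-ignore
print(ascii_upper("X-AppInstall-Ignore"))  # X-APPINSTALL-IGNORE
```

These would be used in place of lower() and upper() wherever the data is known to be English, such as desktop entry keys.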

I am sorry; I think I repeated myself a bit, but I really want to get
the message across. Please let me know if you have any questions.

Regards,

M. Vefa Bicakci