Comment 7 for bug 368626

Facundo Batista (facundo) wrote :

There are three cases to address:

a) The user has a filename that SD cannot convert to Unicode (decoding it with the user's filesystem encoding).

b) SD receives from the server a Unicode filename that it cannot convert to bytes (encoding it with the user's filesystem encoding).

c) SD receives from the server a filename that it cannot save locally because of filesystem restrictions.

In case a), we convert it by replacing the "high" bytes with %XX escapes (as in the example), which yields a name that will not collide with other similar names.

Example:

    Client 1 (using utf8): "Hola \xff"
    Server: u'Hola %FF' ("Hola %FF")
    Client 2 (using utf8): "Hola %FF" ("Hola %FF")
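
The conversion for case a) could be sketched roughly as follows. This is only an illustration in Python 3, not the actual SD code: the function names are hypothetical, and the rule "every byte >= 0x80 becomes %XX" is inferred from the example above.

```python
def escape_high_bytes(raw: bytes) -> str:
    """Replace every byte >= 0x80 with an uppercase %XX escape (assumed rule)."""
    return "".join(chr(b) if b < 0x80 else "%{:02X}".format(b) for b in raw)


def local_to_server_name(raw: bytes, fs_encoding: str = "utf-8") -> str:
    """Hypothetical helper: turn a local byte filename into a Unicode name."""
    try:
        # Normal path: the name decodes cleanly with the filesystem encoding.
        return raw.decode(fs_encoding)
    except UnicodeDecodeError:
        # Case a): the name is not valid in that encoding, so escape it.
        return escape_high_bytes(raw)


print(local_to_server_name(b"Hola \xff"))  # Hola %FF
```

A name that decodes cleanly goes through unchanged, so only the undecodable names end up escaped.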

In case b), we encode to bytes safely (using UTF-8) and then apply the same conversion as before.

Example:

    Client 1 (using utf8): "Pi: \xcf\x80"
    Server: u"Pi: \u03c0" ("Pi: π")
    Client 2 (using latin1): "Pi: %CF%80"
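
A rough sketch of case b), again in Python 3 and with hypothetical function names: try the filesystem encoding first, and if that fails, encode as UTF-8 and escape the high bytes the same way as in case a).

```python
def escape_high_bytes(raw: bytes) -> str:
    """Replace every byte >= 0x80 with an uppercase %XX escape (assumed rule)."""
    return "".join(chr(b) if b < 0x80 else "%{:02X}".format(b) for b in raw)


def server_to_local_name(name: str, fs_encoding: str) -> bytes:
    """Hypothetical helper: turn a Unicode server name into local bytes."""
    try:
        # Normal path: the filesystem encoding can represent the name.
        return name.encode(fs_encoding)
    except UnicodeEncodeError:
        # Case b): fall back to UTF-8, which can encode any Unicode name,
        # then escape the resulting high bytes so the result is plain ASCII.
        return escape_high_bytes(name.encode("utf-8")).encode("ascii")


print(server_to_local_name("Pi: \u03c0", "latin-1"))  # b'Pi: %CF%80'
```

With a UTF-8 filesystem encoding the same server name is simply encoded, giving the original bytes of Client 1.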

In case c), we translate it using a local table, which will depend on the Ubuntu One client prepared for that filesystem.

Example:

    Client 1 (ext3 filesystem): "*.txt"
    Server: u"*.txt"
    Client 2 (ntfs filesystem): "star.txt"
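
The table lookup for case c) might look something like this. The table name and contents are assumptions: only the "*" to "star" entry comes from the example above, and a real table for a given filesystem would need more entries.

```python
# Hypothetical per-filesystem table; only "*" -> "star" is taken from the
# example, everything else about a real table is unknown here.
NTFS_TABLE = {"*": "star"}


def translate_for_filesystem(name: str, table: dict) -> str:
    """Replace each character the filesystem rejects with its table entry."""
    return "".join(table.get(ch, ch) for ch in name)


print(translate_for_filesystem("*.txt", NTFS_TABLE))  # star.txt
```

A client for a filesystem without such restrictions would just use an empty table and leave the name untouched.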

All cases are handled at today's Unicode boundary: locally (filesystem and metadata) we have bytes, and in the protocol and on the server we use Unicode.

The translations are not reversible, and are done only once: if a translation was needed, a "server_name" field is filled in the local metadata with the server-side name, and the path uses the local name (no matter which one is the original).

Code that compares server and local names (e.g. Sync.merge_directory()) needs to use this server_name when available.
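
To illustrate the server_name rule, here is a minimal sketch. The LocalEntry class and both helpers are hypothetical and do not reflect the real metadata layout; only the "server_name" field name comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalEntry:
    """Hypothetical metadata record for one file."""
    local_name: str                    # name used in the local path
    server_name: Optional[str] = None  # set only if a translation was needed


def make_entry(server_name: str, local_name: str) -> LocalEntry:
    """Record server_name only when it differs from the local name."""
    if server_name == local_name:
        return LocalEntry(local_name)
    return LocalEntry(local_name, server_name)


def name_for_comparison(entry: LocalEntry) -> str:
    """Name to compare against the server listing."""
    if entry.server_name is not None:
        return entry.server_name
    return entry.local_name


entry = make_entry("*.txt", "star.txt")
print(name_for_comparison(entry))  # *.txt
```

Since the translations cannot be reversed, the comparison relies on the stored field and never tries to recompute the server name from the local one.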