Changes between Version 59 and Version 60 of DeveloperGuidelines/CodeConventions


Timestamp:
01/26/16 12:08:30
Author:
Dominic König

  • DeveloperGuidelines/CodeConventions

    v59 v60  
    92 92 In Python 2, the str() constructor encodes unicode objects with the default ASCII codec, and raises a UnicodeEncodeError for unicode objects that contain non-ASCII characters. To prevent that, we must implement safe ways of converting unicode into str, ''enforcing'' UTF-8 encoding.
    93 93
    94    Additionally, indices in str objects count byte-wise, not character-wise, which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Therefore, for any substring or character operations we must safely ''decode'' the str into a unicode object, ''assuming'' UTF-8 encoding.
       94 Additionally, indices in str objects count byte-wise, not character-wise, which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Further, in Python 2, str.lower() and str.upper() may not work correctly for some unicode characters (e.g. "Ẽ".lower() gives "Ẽ" again instead of "ẽ"), depending on the server locale setting. Therefore, for any substring or character operations we must safely ''decode'' the str into a unicode object, ''assuming'' UTF-8 encoding.
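
The byte-wise versus character-wise difference described above can be illustrated with a minimal sketch. The helper names `to_text` and `to_utf8` are hypothetical and not taken from this guideline, and the sketch is written to run on Python 3, where `bytes` plays the role of the Python 2 str and `str` plays the role of the Python 2 unicode object. It is one possible way to show the idea, not the project's implementation.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: to_text() and to_utf8() are hypothetical names.
# On Python 3, bytes corresponds to the Python 2 str type and str
# corresponds to the Python 2 unicode type.

def to_utf8(value):
    """Encode text as a UTF-8 byte string (byte strings pass through)."""
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")

def to_text(value):
    """Decode a byte string into text, assuming UTF-8 (text passes through)."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value

# U+1EBC (capital E with tilde) followed by "xample"
text = u"\u1ebcxample"
encoded = to_utf8(text)

# The first character occupies three bytes in UTF-8, so the byte string
# is longer than the text it represents.
byte_length = len(encoded)            # 9 bytes
char_length = len(to_text(encoded))   # 7 characters

# Slicing the byte string cuts through the multi-byte character ...
broken = encoded[:2]
try:
    broken.decode("utf-8")
    slice_is_valid = True
except UnicodeDecodeError:
    slice_is_valid = False

# ... whereas decoding first gives character-wise indexing and case mapping.
prefix = to_text(encoded)[:2]
lowered = to_text(encoded).lower()
```

Decoding before any substring or case operation, and encoding again only at the output boundary, keeps multi-byte characters intact regardless of the server locale.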
    95 95
    96 96 ==== Unicode-Guideline ====