Changes between Version 59 and Version 60 of DeveloperGuidelines/CodeConventions
- Timestamp:
- 01/26/16 12:08:30 (9 years ago)
DeveloperGuidelines/CodeConventions
Lines 92-96, v59 → v60:

92 (unmodified): The str() constructor in Python 2 assumes that its argument is ASCII-encoded, and raises an exception for unicode-objects that contain non-ASCII characters. To prevent that, we must implement safe ways for converting unicode into str, ''enforcing'' UTF-8 encoding.

94 (v59, removed): Additionally, indices in str objects count byte-wise, not character-wise - which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Therefore, for any substring- or character-operations we must safely ''decode'' the str into a unicode object, ''assuming'' UTF-8 encoding.

94 (v60, added): Additionally, indices in str objects count byte-wise, not character-wise - which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Further, in Python 2, str.lower() and str.upper() may not work correctly for some unicode characters (e.g. "Ẽ".lower() gives "Ẽ" again - instead of "ẽ"), depending on the server locale setting. Therefore, for any substring- or character-operations we must safely ''decode'' the str into a unicode object, ''assuming'' UTF-8 encoding.

96 (unmodified): ==== Unicode-Guideline ====
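The byte-wise indexing and case-conversion problems described in the modified paragraph can be illustrated with a minimal sketch. It is written to run under both Python 2 and Python 3 by working on the UTF-8 encoded form explicitly. The helper name `safe_lower` is hypothetical and is not taken from the guideline; it only shows the "decode first, assuming UTF-8" approach.

```python
# "Ẽẽ" written with escapes so the source file itself stays ASCII
text = u"\u1ebc\u1ebd"

# UTF-8 encoded form: a str in Python 2, a bytes object in Python 3
encoded = text.encode("utf-8")

# Lengths differ: each of these characters takes 3 bytes in UTF-8
byte_len = len(encoded)
char_len = len(text)

# Slicing the encoded form by index can cut through a multi-byte
# sequence, leaving bytes that are no longer valid UTF-8
broken = encoded[:1]
try:
    broken.decode("utf-8")
    slice_ok = True
except UnicodeDecodeError:
    slice_ok = False


def safe_lower(s):
    """Hypothetical helper: lower-case a UTF-8 encoded byte string by
    decoding it first (assuming UTF-8), then re-encoding the result."""
    if isinstance(s, bytes):
        return s.decode("utf-8").lower().encode("utf-8")
    return s.lower()
```

Decoding before any substring or case operation keeps the operation character-wise and independent of the server locale; re-encoding afterwards returns the same type the caller passed in.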