= Python 2/3 Compatibility =
[[TOC]]

This guideline documents coding conventions to achieve hybrid Python-2.7/Python-3.5 compatibility in Sahana.

It's a working document that will be extended as we move towards full Python-3 compatibility.

== Syntax ==
=== No print statements ===

Don't use the print statement anywhere:
{{{#!python
print "example"                       # deprecated
}}}

You shouldn't use the print-function either, because it can clash with uWSGI:
{{{#!python
print("example")                      # not good
}}}

...but it can be tolerated in CLI scripts which don't run in the WSGI environment.

The best option for debug output is to use sys.stderr.write:
{{{#!python
import sys
sys.stderr.write("example\n")         # better
}}}

...or the logger (as it can be configured globally to write to a log file instead of the system console):
{{{#!python
current.log.debug("example")          # even better
}}}

=== Use as-syntax for catching exceptions ===

When catching exceptions in a variable, don't use the comma-syntax:
{{{#!python
try:
    ...
except Exception, e:                  # deprecated
    ...
}}}

Instead, use the as-keyword:
{{{#!python
try:
    ...
except Exception as e:                # new standard
    ...
}}}

=== Raise Exception Instances ===

Python-3 no longer supports the {{{raise E, V, T}}} syntax - the new syntax is {{{raise E(V).with_traceback(T)}}}.

However, the traceback is rarely required - and where E is an Exception class (rather than a string), we can use the {{{raise E(V)}}} syntax which works in all Python versions.

{{{#!python
# Works in Py2, but not in Py3:
raise SyntaxError, "Error Message"

# Works in all Python versions, hence our Standard:
raise SyntaxError("Error Message")
}}}

If a traceback object must be passed (which is rarely needed), then we must use the PY2 constant to implement alternative statements.

=== No exec statements ===

In Python-3, the {{{exec}}} statement has become a function {{{exec()}}}.
Python-2.7 accepts the function-syntax as well, so we use it throughout:
{{{#!python
# Deprecated:
exec pyexpr

# New Standard:
exec(pyexpr)
}}}

NB the Python-2.7 {{{exec}}} statement also accepts a 3-tuple as parameter, {{{exec(expr, globals, locals)}}}, which is equivalent to calling the {{{exec()}}} function with three arguments in Python-3.

=== No cmp-parameter in sort/sorted ===

Python-3 no longer supports the {{{cmp}}} parameter for {{{x.sort()}}} and {{{sorted(x)}}}. We use the {{{key}}} parameter instead.

For locale-sensitive sorting, use the s3compat alternative {{{sorted_locale}}}:
{{{#!python
# Python-2 pattern, not working in Python-3:
x = sorted(x, cmp=locale.strcoll)

# Python-3 pattern, not working with unicode in Python-2:
x = sorted(x, key=locale.strxfrm)

# Compatible pattern:
from s3compat import sorted_locale
x = sorted_locale(x)
}}}

== Usage ==
=== No implicit package-relative imports ===

Python-3 does not search for modules relative to the current module in the same package - unless explicitly indicated by leading {{{.}}} or {{{..}}} in the module path.

{{{#!python
from s3datetime import s3_format_datetime        # inside modules/s3, not working in Python-3
}}}

Python-2.7 would search relative to the current module, but it also supports the explicit-relative syntax. We therefore use only explicit paths in imports.

To import a module in the same package (e.g.
within s3), either use explicit-relative syntax:
{{{#!python
from .s3datetime import s3_format_datetime       # inside modules/s3, preferred variant
}}}

...or an absolute path relative to modules (or the global python path):
{{{#!python
from s3.s3datetime import s3_format_datetime     # inside modules/s3, acceptable alternative
}}}

Outside of modules/s3, you should always import from the top-level of the s3 package (because the package structure may change over time):
{{{#!python
from s3 import s3_format_datetime                # outside modules/s3
}}}

=== Alternative Imports ===

As the locations and names of some libraries have changed in Python-3, we use the compatibility module ({{{modules/s3compat.py}}}) to implement suitable alternatives.

Similarly, the compat module provides alternatives for other objects such as types, functions and certain common patterns.

Where an object is provided by modules/s3compat.py, it '''MUST''' be imported from there if used.

The following objects are provided by s3compat:

==== Constants ====
||= '''Name''' =||= '''Type''' =||= '''!Comments/Caveats''' =||
||PY2||boolean||Constant indicating whether we're currently running on Py2, should only be used if alternatives cannot be generalized||

==== Libraries ====
||= '''Name''' =||= '''Type''' =||= '''!Comments/Caveats''' =||
||Cookie||module||maps to http.cookies in Py3||
||pickle||module||replaces cPickle in Py2||
||urlparse||module||maps to urllib.parse in Py3||
||urllib2||module||maps to urllib.request in Py3, which contains only part of Py2's urllib2 - some urllib2 objects therefore need to be imported separately (see below)||

==== Functions ====
||= '''Name''' =||= '''Type''' =||= '''!Comments/Caveats''' =||
||reduce||function|| ||
||reload||function|| ||
||sorted_locale||lambda i||locale-sensitive sorting, {{{sorted(i, cmp=locale.strcoll)}}} in Py2, {{{sorted(i, key=locale.strxfrm)}}} in Py3||
||name2codepoint||dict||mapping of HTML entity names to code points, from htmlentitydefs (Py2) or html.entities (Py3)||
||unichr||function|| ||
||urlencode||function||replaces urllib.urlencode in Py2||
||urllib_quote||function||replaces urllib.quote in Py2||
||urlopen||function||replaces urllib.urlopen and urllib2.urlopen in Py2||
||xrange||function||maps to {{{range}}} in Py3, since {{{xrange}}} no longer exists (but {{{range}}} behaves like it)||
||zip_longest||function||replaces itertools.izip_longest in Py2 (which has been renamed to zip_longest in Py3)||

==== Types ====
||= '''Name''' =||= '''Type''' =||= '''!Comments/Caveats''' =||
||basestring||type|| ||
||long||type||same as {{{int}}} in Py3, so can occasionally lead to redundancy||
||unicodeT||type||for type checking, instead of {{{unicode}}} in Py2, maps to {{{str}}} in Py3||
||!ClassType||type||replaces {{{types.ClassType}}} (old-style classes) in Py2, maps to {{{type}}} in Py3 (old-style classes no longer exist)||

==== Type Tuples for isinstance() ====
||= '''Name''' =||= '''Type''' =||= '''!Comments/Caveats''' =||
||CLASS_TYPES||tuple||maps to tuple of all known class types: {{{(type,types.ClassType)}}} in Py2, just {{{(type,)}}} in Py3||
||INTEGER_TYPES||tuple||maps to tuple of all known integer types: {{{(int,long)}}} in Py2, {{{(int,)}}} in Py3||
||STRING_TYPES||tuple||maps to tuple of all known string types: {{{(str,unicode)}}} in Py2, {{{(str,)}}} in Py3||

==== Other Classes and Exceptions ====
||= '''Name''' =||= '''Type''' =||= '''!Comments/Caveats''' =||
||HTTPError||Exception||replaces urllib2.HTTPError in Py2, '''NB''' HTTPError is a subclass of URLError, so must be caught first in order to differentiate||
||HTMLParser||class||replaces HTMLParser.HTMLParser in Py2||
||StringIO||class/function||maps to cStringIO.StringIO in Py2 (which is a function rather than a class), so can't use this for type checking||
||BytesIO||class/function||for binary data streams, same as StringIO in Py2, but different in Py3||
||URLError||Exception||replaces urllib2.URLError in Py2||

=== No dict.iteritems, iterkeys or itervalues ===
Python-3 no longer has {{{dict.iteritems()}}}. We use {{{dict.items()}}} instead:
{{{#!python
# Deprecated:
for x, y in d.iteritems():
    ...

# Compatible:
for x, y in d.items():
    ...
}}}

'''NB''' In Python-2, {{{dict.items()}}} returns a list of tuples, but in Python-3 it returns a dict view object that is sensitive to changes of the dict. If a list is required, or the dict is changed inside the loop, the result of {{{dict.items()}}} must be converted into a list explicitly.

The same applies to {{{dict.iterkeys()}}} (use {{{dict.keys()}}} instead) and {{{dict.itervalues()}}} (use {{{dict.values()}}} instead).

=== No dict.has_key ===

The {{{dict.has_key()}}} method has been removed in Python-3 in favor of the {{{x in y}}} pattern, which is also available (and equivalent) in Python-2.7:
{{{#!python
# Deprecated:
if d.has_key(k):
    ...

# New Standard:
if k in d:
    ...
}}}

=== Map, Filter and Zip return iterators ===

In Python-3, the {{{map()}}}, {{{filter()}}} and {{{zip()}}} functions return lazy iterator objects rather than lists.

This is fine when we want to iterate over the result, especially when there is a chance to break out of the loop early. But where lists are required, the return value must be converted explicitly using the list constructor.

{{{#!python
# This is fine:
for item in map(func, values):
    ...

# This could be wrong:
result = map(func, values)

# This is better:
result = list(map(func, values))

# This could be even better:
result = [func(v) for v in values]
}}}

'''NB''' For building a list from a single iterable argument, we prefer list comprehensions over {{{map()}}} or {{{filter()}}} for readability and speed.

'''NB''' Iterators can only be iterated over once (unlike lists, tuples or sets), and they cannot be accessed by index.

=== Don't unicode.encode ===

Since there is no difference between {{{unicode}}} and {{{str}}} in Python-3, using the {{{encode()}}} method will produce {{{bytes}}} rather than {{{str}}}.
A {{{bytes}}} object differs from a string in that it is an array of integers rather than an array of characters. Passing it to a later {{{str()}}} or {{{s3_str()}}} will also give a distorted result.

{{{#!python
if isinstance(x, unicodeT):   # unicodeT maps to str in Py3
    x = x.encode("utf-8")     # x becomes a bytes-object in Py3, unlike in Py2 where it becomes a str

str(x)                        # thus, in Py3, this results in something like "b'example'" instead of the expected "example"
x[1]                          # is "x" in Py2, but 120 (an integer!) in Py3
}}}

If you just want to encode a potential {{{unicode}}} instance as a utf-8 encoded {{{str}}}, use {{{s3_str}}} rather than {{{unicode.encode}}}:
{{{#!python
# Do this instead:
x = s3_str(x)
}}}

If you need to exclude non-string types from the conversion, you can keep the type-check:
{{{#!python
if isinstance(x, unicodeT):
    x = s3_str(x)
}}}

=== Don't str.decode either ===

Similarly, the {{{str}}} type has no {{{decode}}} method in Python-3. To convert a utf-8 encoded {{{str}}} to {{{unicode}}} in Py2, use {{{s3_unicode}}}.

NB Do not attempt to utf-8-decode UI strings on the client side either when the server runs Python-3.

=== next(i) not i.next() ===

In Python-3, the {{{i.next}}} method of iterators has been renamed to {{{i.__next__}}}, and should not be called explicitly but via the built-in {{{next(i)}}} function.

This function is also available in Python-2.7 (where it calls {{{i.next()}}} instead), so we generally use the next function:
{{{#!python
i = iter(l)

# Deprecated:
item = i.next()

# Forward+backward-compatible way to do it:
item = next(i)
}}}

=== Cannot use LazyT as sorting key ===

This may be a temporary issue: Web2py's {{{lazyT}}} does not define a {{{__lt__}}} method, which is what Python-3 uses for sorting, but it does define a {{{__cmp__}}} method, which is what Python-2.7 uses, so sorting works there.

Consequently, sorting a list of {{{lazyT}}} does not work in Python-3.
As a workaround, wrap the {{{lazyT}}} in {{{s3_str}}} when sorting or using it as sorting key.

{{{#!python
a = [T("quick"), T("lazy"), T("fox")]

# This will crash with a TypeError in Py3:
a.sort()

# This will work both in Py2 and Py3:
a.sort(key=lambda item: s3_str(item))
}}}

=== XML with XML declaration must be bytes ===

Any XML that contains an XML declaration must be UTF-8 encoded {{{bytes}}}. In Python-2, {{{bytes}}} is synonymous with {{{str}}} - but in Python-3 it is a different data type.

 - {{{S3XML.tostring}}} and {{{S3Resource.export_xml}}} will always return {{{bytes}}}
 - if string operations must be performed on such XML, use {{{s3_str}}} to convert it

=== BytesIO ===

Certain libraries expect a file-like object to represent binary data. Whilst in Python-2 this can be handled with {{{StringIO}}}, Python-3 requires {{{BytesIO}}} instead.

 - {{{zipfile}}} expects and returns {{{bytes}}}

=== Sorting requires type consistency ===

When sorting iterables with {{{i.sort()}}} or {{{sorted(i)}}}, all elements must be mutually comparable, which in practice means they must have the same type - otherwise it will raise a {{{TypeError}}} in Python-3.

This is particularly relevant when the iterable can contain {{{None}}}. In such a case, use a key function to deal with None:
{{{#!python
l = [4,2,7,None]

# Works in Py2, but raises a TypeError in Py3:
l.sort()

# Works in both:
l.sort(key=lambda item: item if item is not None else -float('inf'))
}}}

=== Turkish letters İ and ı ===

In Turkish, the letters {{{I}}} and {{{i}}} are not an upper/lowercase pair. Instead, there are two pairs {{{(İ, i)}}} and {{{(I, ı)}}}, i.e. one with and one without the dot above.

According to the Unicode spec, the lowercase counterpart of {{{İ}}} is a sequence of two unicode characters, namely the {{{i}}} (with the dot) and the code point U+0307 which means "with dot above". The latter is there to preserve the information about the dot for the conversion back to uppercase.
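
The two-character sequence can be made visible with the standard {{{unicodedata}}} module. This is a minimal sketch, assuming Python-3; the helper name {{{codepoint_names}}} is made up for illustration and is not part of s3compat:

{{{#!python
import unicodedata

def codepoint_names(s):
    """ Hypothetical helper: list the Unicode names of all characters in s """
    return [unicodedata.name(c) for c in s]

# u"\u0130" is the dotted capital I; on Python-3 the lowercase form
# consists of two code points:
# ['LATIN SMALL LETTER I', 'COMBINING DOT ABOVE']
names = codepoint_names(u"\u0130".lower())
}}}
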
Python-2 did not implement the mapping to U+0307, so it converted the letters like this:
{{{#!python
# Actually wrong in both cases, but consistently so:
>>> u"İ".lower().upper()
u'I'
>>> u"ı".upper().lower()
u'i'
# NB with utf-8-encoded str, Python-2 doesn't "İ".lower() at all!
>>> print "İ".lower()
İ
}}}

Python-3, where all str are unicode, does implement the mapping to U+0307, so the behavior is different:
{{{#!python
>>> "İ".lower().upper()
'İ'
# But: same inconsistency as Py2 with the dot-less lowercase ı
>>> "ı".upper().lower()
'i'
}}}

Critically, the U+0307 character changes the string length (it's an extra character!):
{{{#!python
# Python-2
>>> len(u"İ".lower())
1
# Python-3
>>> len("İ".lower())
2
}}}

This is just something to keep in mind - an actual forward/backward compatibility pattern must be developed for the specific use-case. Neither the Python-2 nor the Python-3 behavior is particularly helpful for generalization, the Turkish I's always need special treatment.

NB ECMAScript implements the same behavior as Python-3, so there is cross-platform consistency, at least.
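
As one possible illustration of such a use-case specific pattern: where lowercased strings must have the same length under Python-2 and Python-3, and losing the dotted/dotless distinction is acceptable, the combining character can be removed after lowercasing. This is only a sketch under those assumptions; the helper name {{{lower_without_dot}}} is hypothetical and not part of s3compat, and Turkish-aware code would still need its own handling.

{{{#!python
COMBINING_DOT_ABOVE = u"\u0307"

def lower_without_dot(s):
    """
        Hypothetical helper (not part of s3compat): lowercase a unicode
        string and drop any U+0307, so that the result has the same
        length in Py2 and Py3

        NB this also removes a U+0307 that was already in the input
    """
    return s.lower().replace(COMBINING_DOT_ABOVE, u"")

dotted = lower_without_dot(u"\u0130")   # u"\u0130" is the dotted capital I
length = len(dotted)                    # 1 in both Py2 and Py3
}}}
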