= Code Conventions = [[TOC]] These conventions should be followed in all code. NOTE: These coding conventions are mandatory for code to be accepted for the [wiki:ucore Stable] series! This facilitates a team of developers working on the same code base. If each developer were to use their own style, there would be jarring changes in the code as someone read through it. This slows comprehension and can also cause spurious differences in changesets as developers alter style in frivolous ways on the code they touch. Our style guide aims to make the code base appear as if it was written by one precise punctilious programmer, rather than a a cacophony of competing coders. As with many aspects of coding and writing, there can be differences of opinion about the best style. These issues distract from the goal of writing code that works. Most project contributors have a slightly different personal coding style, but we all use the Sahana Eden style when contributing to the Sahana Eden project. == Python == === Code Style === * http://www.python.org/dev/peps/pep-0008/ * http://code.google.com/p/soc/wiki/PythonStyleGuide * Limit line length to 80 characters * Use " " for strings, UNLESS the string contains a ", in which case use ' (see also [#Quotes Quotes]) * Global variables should be avoided - use response.s3 to store them. ==== PEP8 Script ==== Use static/scripts/tools/pep8.py to check for [http://www.python.org/dev/peps/pep-0008/ PEP8] compliance. Execute the following in your eden directory: {{{ python static/scripts/tools/pep8.py yourfile.py }}} === Quotes === - for string constants in '''Python''' use '''double-quotes''' {{{" "}}} - UNLESS the string contains " - for string constants in '''!JavaScript''' use '''single-quotes''' {{{' '}}} - UNLESS the string contains a ' - for attributes in '''HTML''' use use '''single-quotes''' {{{' '}}} - for triple-quoted strings (such as '''Docstrings''' or Inline-XML) use {{{""" """}}} with '''double-quotes''', except: - for '''Inline-!JavaScript''' or JSON inside Python code use {{{''' '''}}} with '''single-quotes''' Note though that this: {{{ #!python inline_javascript = '''sometext="%s";''' % sometext }}} is a '''potential bug''' - the !JavaScript breaks when {{{sometext}}} contains a {{{"}}}. A safer way to do that is: {{{ #!python inline_javascript = '''sometext=%s;''' % json.dumps(sometext) }}} - for XML attributes use double-quotes {{{" "}}} (inner quotes must be escaped anyway) === Naming conventions === * Function names for the global namespace should start with an "s3_" prefix (if they are not module-specific) or with the module prefix plus underscore. * Names for Eden-specific methods in subclasses of web2py classes also start with "s3_". * should have the suffix "S3" * Names for Eden-specific subclasses of web2py classes and all classes which over-ride existing classes start with "S3", * All other classes defined in Eden end with "S3" (e.g. AuthS3 vs. S3Resource). * CamelCase is for class names only * Constant names are all-uppercase and with underscores as word separator * everything else (including table names and field names) should be all-lowercase with underscores as word separator * Names should be obvious and explicit, but not overly verbose (i.e. as long as they need to be to not make people think). They shouldn't require someone to look in another file or solve a puzzle to figure out what they mean, but shouldn't take too long to write. Avoid inventing new acronyms. e.g. bad: {{{ s3_prm_lkp, method_that_allows_us_to_create_a_gis_layer }}}, good: {{{ s3_personnel_search, create_gis_layer }}} === Comments and Docstrings === * All files, classes and functions should have docstrings which allow to auto-generate API documentation using [http://epydoc.sourceforge.net epydoc] Suggested style: {{{ """ This function is just an example @param: None """ }}} === Internationalisation === * All user-visible strings should be Internationalised: {{{T("My string")}}} * Remember to {{{str()}}} them before concatenating with strings (such as HTML tags): {{{"
" + str(T("My String")) + "
"}}} * All labels should be Internationalised in {{{controllers/module.py def resource()}}}: {{{table.field.label = T("My Label")}}} * DeveloperGuidelinesInternationalisation === Unicode === ==== Background ==== As a universal requirement, Sahana must support any unicode character in user-strings (i.e. from the database, forms or translations). Python 'unicode' objects are tuples of 4-byte codes from the unicode table (each code representing a character), which can be used to store strings containing any unicode characters. Such 'unicode' objects are not printable, though, i.e. they are not generally understood outside of the Python VM. When writing to interfaces, unicode-objects must be encoded as strings of printable characters, which Python represents as 'str' objects. The most common character encoding that covers all unicode characters is UTF-8. The str() constructor in Python 2 assumes that its argument is ASCII-encoded, and raises an exception for unicode-objects that contain non-ASCII characters. To prevent that, we must implement safe ways for converting unicode into str, enforcing UTF-8 encoding. Additionally, indices in str objects count byte-wise, not character-wise - which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Therefore, for any substring- or character-operations we must safely convert str into unicode, assuming UTF-8 encoding. ==== Unicode-Guideline ==== 1) All functions dealing with user-strings should be designed to accept both str and unicode, while safely handling strings with non-ASCII characters. For unicode-safe conversions, we use s3_unicode(s) and s3_str(s), instead of unicode(s) and str(s). 2) Where we receive str input, we assume utf-8 encoding. Most common encodings are subsets of utf-8 so that this is the safest assumption we can make. 3) Before indexing, splitting, slicing or iterating over a user-string, we always convert it into a unicode using: {{{ s = s3_unicode(s) }}} 4) We assume that any (external) function we call may attempt to convert input by calling str() - so we generally deliver all strings as utf-8 encoded str to prevent UnicodeDecodeErrors. This can be done by: {{{ s = s3_str(s) }}} or: {{{ s= s3_unicode(s).encode("utf-8") }}} 5) System-strings (like table or field names, attribute names, etc.) should never contain non-ASCII characters, so that they safely pass through str(). 6) In reading XML, we follow the encoding specified in the XML declaration rather than making assumptions about the encoding. For all other sources, we assume utf-8 (see 2). In exports, we always write utf-8. 7) All code is utf-8 encoded, so that all string constants are automatically utf-8 encoded str. We do not use u"..." for string constants. === Tools === * Some automated bug analysis / code quality checking tools - * [http://www.logilab.org/857 PyLint] * gives detailed report * code quality score tells exact impact of the changes made * http://www.logilab.org/card/pylint_tutorial * [http://pychecker.sourceforge.net/ PyChecker] recommended in [http://code.google.com/p/soc/wiki/PythonStyleGuide PythonStyleGuide] === Exceptions === * Do not use naked excepts. Yes there are lots already in the code base. Do not make the situation worse. They are being gradually fixed. * If you do use a naked except, the exception should be re-raised. * database errors are the typical tricky case as we don't want to create dependencies on database modules that aren't used. You can catch BaseException and check the exception class name, or you can provide and reuse standard tuples of exception classes in the configuration that map to the same generic database error. * try blocks should be minimised, so that there is less chance of catching the same exception type from a place that wasn't expected, (and then handling it incorrectly). To do this, move code before the exception raising line before the try block, and code after the exception raising line into an else block. * It's good practice to clean up resources in finally blocks. * avoid raising errors inside exception handlers. * don't ignore exceptions. At the very least, log them, or raise them in debug mode. Instead of this: {{{ try: # this could raise an error we aren't prepared for file_path = get_file_path() f = file(file_path, 'w') # this could also raise an exception f.write('Please send the helicopter to %s' % disaster_location) f.flush() except: pass # Argh! by ignoring the exception, now lots of things could be messed # up and the rest of the code could break in strange ways. # We will have a hard time figuring out the reason as the original # exception is now lost. # Worse, maybe the user doesn't notice and people die on a mountain. }}} Do this: {{{ # now we see any errors here separately file_path = get_file_path() try: # the try block is minimised f = file(file_path, 'w') except IOError, exception: # e.g. a permissions violation if session.s3.debug: # re-raise in development raise else: # logging is good logging.error("%s: %s" % (file_path, exception)) # email reports are great as the developer doesn't have # to log into the machine to read them. # this mustn't raise exceptions (tricky) send_error_email_to_devs(exception, locals(), globals(), request) if not exception_has_been_totally_handled_and_no_further_consequences(): # let the user know so they can take action if possible. raise HTTPError(500, "Cannot queue your message because: %s" % exception) else: f.write('Please send the helicopter to %s' % disaster_location) f.flush() finally: # finalise resources f.close() }}} When should you raise Exceptions, anyway? If it is not possible to continue with the respective processing because there can not be any plausible result, then the necessary consequences highly depend on the cause: - You should never raise an exception for '''Validation Errors''' - any invalid user input '''must be caught and properly reported''' back to the client (simply because users can't do anything about an exception, so it would render the functionality unusable - whilst a proper error message may help the user to correct their input). Failure to catch validation errors is always a bug. - '''Configuration Errors''', wherever possible, should at most issue a warning (if the invalid setting is obviously a mistake), and else get ignored and fall back to reasonable defaults. Only if there is no possible fallback action, then Configuration Errors should be treated as Runtime Errors. - '''Runtime Errors''' may raise exceptions if unrecoverable (i.e. if there's no reasonable handling possible at the respective level). However, if possible and in production mode, they should be caught at a higher level and properly reported back to the client in order to explain the reason and allow correction. Remember logging! - '''Programming Errors''' (=bugs) ''must'' lead to exceptions in order to attract immediate attention of the programmer to the bug and help them debugging. They should never be caught unless you intend to implement a plausible fallback action (and even then a warning may be appropriate in most cases). Eden is an unsupervised server system, so you must have very good reasons to let it crash. Remember that any uncaught exception leads to a HTTP 500 status with error ticket - and an HTTP 500 status in Eden is a bug by definition. Users can't do anything about error tickets, so the only legitimate reason to raise an uncaught exception in Eden is that you are 100% sure that the error condition leading to the exception is caused by a bug. === Nulls === Nulls make code much more complicated by introducing special cases that require special null-checking logic in every client, making code harder to understand and maintain. They often and easily cause hard-to-diagnose errors. * Avoid returning nulls wherever possible. Unfortunately the design of python and web2py make this very difficult. E.g. if creating a data structure from many objects, instead of returning nulls from each object and requiring special checking before adding to the structure, pass the data structure to the objects and let them fill in as necessary. * Check for nulls and handle them properly unless you are sure you aren't going to get them. * Unit tests should check that nulls aren't returned. * if you must return a None at the end of the function, do so explicitly, to make the intention clear, rather than falling off the end. * Database fields should generally be non-nullable. If there is the chance of unknown data in certain fields, it indicates that the table may be too big, and those fields may need splitting out into separate tables. It may seem reasonable to allow unknown data, but it places the burden of null-checking on every client of the table. If any client doesn't don't do it (the default), then there is a risk of undefined behaviour. By making them non-nullable, we get a level of code validation from the database layer. == !JavaScript == * Use ' not " where possible * Terminate all appropriate lines with ; (minifies better) * Take care to not have a trailing comma in arrays (IE8 and lower barf and hence so does Closure) * Avoid use of the ternary operator (many users unfamiliar with this and it has 2 statements in 1 lines which makes debugging harder) * http://javascript.crockford.com/code.html == HTML == ids & classes should be named ina way that they don't conflict & should be semantic (what they are for) rather than presentational: * http://woork.blogspot.co.uk/2008/11/css-coding-semantic-approach-in-naming.html * http://sixrevisions.com/css/css-tips/css-tip-2-structural-naming-convention-in-css/ ---- DeveloperGuidelines