Version 64 (modified by Dominic König, 8 years ago) ( diff )
--

Code Conventions

These conventions should be followed in all code.

NOTE: These coding conventions are mandatory for code to be accepted for the Stable series!

This makes it easier for a team of developers to work on the same code base. If each developer used their own style, there would be jarring changes in the code as someone reads through it. This slows comprehension and can also cause spurious differences in changesets as developers make frivolous style changes to the code they touch.

Our style guide aims to make the code base appear as if it were written by one precise punctilious programmer, rather than a cacophony of competing coders. As with many aspects of coding and writing, there can be differences of opinion about the best style. These issues distract from the goal of writing code that works. Most project contributors have a slightly different personal coding style, but we all use the Sahana Eden style when contributing to the Sahana Eden project.

Python

Code Style

http://www.python.org/dev/peps/pep-0008/
http://code.google.com/p/soc/wiki/PythonStyleGuide
- Limit line length to 80 characters
Use " " for strings, UNLESS the string contains a ", in which case use ' (see also Quotes)
Global variables should be avoided - use response.s3 to store them.

PEP8 Script

Use static/scripts/tools/pep8.py to check for PEP8 compliance.

Execute the following in your eden directory:

python static/scripts/tools/pep8.py yourfile.py

Print Statements

For compatibility with WSGI as well as newer web2py (>=2.15), using the print statement in web application sources is strongly discouraged.

CLI scripts containing print statements will even crash since gluon/shell.py now future-imports the Python-3 print_function.

For everything that is to be executed in the restricted environment (i.e. all server code):

use sys.stderr.write()
better yet (and mandatory for any permanent debug/error messages): use current.log.*, as it can be turned on/off centrally, and routed to a log file
if absolutely necessary, add the future-import to the file and use the print-function

For CLI scripts:

use sys.stderr.write() for status/error messages, or sys.stdout.write() for results
alternatively, add the future-import to the script and use the print-function

Using sys.std*.write() (remember that write expects a string/buffer, so other types must be converted explicitly):

import sys

# For status/error/debug messages:
sys.stderr.write("Here is a status message\n")

# For results:
sys.stdout.write("%s\n" % result)

Using the logger:

from gluon import current

# Use this for any permanent message output in server scripts:
current.log.error("Here is an error message")

If neither of the above is possible, import+use the print-function:

# This must be the first statement in the file:
from __future__ import print_function

# Then use the print-function:
print("The print-function", "accepts", "multiple arguments")

# Remember that server code must write to stderr, not stdout:
print("Direct output to stderr", file=sys.stderr)

Quotes

for string constants in Python use double-quotes " " - UNLESS the string contains "
for string constants in JavaScript use single-quotes ' ' - UNLESS the string contains a '
for attributes in HTML use single-quotes ' '

for triple-quoted strings (such as Docstrings or Inline-XML) use """ """ with double-quotes, except:
for Inline-JavaScript or JSON inside Python code use ''' ''' with single-quotes

Note though that this:

inline_javascript = '''sometext="%s";''' % sometext

is a potential bug - the JavaScript breaks when sometext contains a ".

A safer way to do that is:

inline_javascript = '''sometext=%s;''' % json.dumps(sometext)

for XML attributes use double-quotes " " (inner quotes must be escaped anyway)
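
The quoting problem described above can be reproduced without a browser. The following is a minimal sketch, in which the value of sometext is just a sample chosen to contain a double quote:

```python
import json

# Sample value containing a double quote (illustrative only)
sometext = 'He said "hello"'

# Naive interpolation: the embedded double quotes end the JavaScript
# string literal early, so the generated statement is broken.
unsafe_js = '''sometext="%s";''' % sometext

# json.dumps() adds the surrounding quotes and escapes the inner ones.
safe_js = '''sometext=%s;''' % json.dumps(sometext)

# The escaped literal can be parsed back into the original value
round_trip = json.loads(safe_js[len("sometext="):-1])
```

Note that json.dumps() supplies the surrounding quotes itself, which is why the safer template contains %s without quotes around it.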

Naming conventions

Function names for the global namespace should start with an "s3_" prefix (if they are not module-specific) or with the module prefix plus underscore.
Names for Eden-specific methods in subclasses of web2py classes also start with "s3_".
Names for Eden-specific subclasses of web2py classes and all classes which override existing classes start with "S3".
All other classes defined in Eden end with "S3" (e.g. AuthS3 vs. S3Resource).
CamelCase is for class names only
Constant names are all-uppercase and with underscores as word separator
everything else (including table names and field names) should be all-lowercase with underscores as word separator
Names should be obvious and explicit, but not overly verbose (i.e. as long as they need to be to not make people think). They shouldn't require someone to look in another file or solve a puzzle to figure out what they mean, but shouldn't take too long to write. Avoid inventing new acronyms. e.g. bad: s3_prm_lkp, method_that_allows_us_to_create_a_gis_layer , good: s3_personnel_search, create_gis_layer

Comments and Docstrings

All files, classes and functions should have docstrings that allow API documentation to be auto-generated using epydoc

Suggested style:

"""
    This function is just an example
    @param: None
"""

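For a function that does take arguments, a docstring in the same style might look like the sketch below. The function and its parameters are hypothetical and exist only for illustration; the @param and @return fields are standard epydoc fields.

```python
def create_gis_layer(name, visible=True):
    """
        Build a plain dict describing a map layer (hypothetical example)

        @param name: the layer name
        @param visible: whether the layer is shown by default
        @return: a dict with the layer settings
    """
    return {"name": name, "visible": visible}
```
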
Internationalisation

All user-visible strings should be Internationalised: T("My string")
- Remember to str() them before concatenating with strings (such as HTML tags): "<p>" + str(T("My String")) + "</p>"
All labels should be Internationalised, e.g. in controllers/module.py, inside def resource(): table.field.label = T("My Label")
DeveloperGuidelinesInternationalisation

Unicode

Background

As a universal requirement, Sahana must support any unicode character in user-strings (i.e. from the database, forms or translations).

Python 'unicode' objects are tuples of 4-byte codes from the unicode table (each code representing a character), which can be used to store strings containing any unicode characters.

Such 'unicode' objects are not printable, though, i.e. they are not generally understood outside of the Python VM. When writing to interfaces, unicode-objects must be encoded as strings of printable characters, which Python represents as 'str' objects. The most common character encoding that covers all unicode characters is UTF-8.

The str() constructor in Python 2 assumes that its argument is ASCII-encoded, and raises an exception for unicode-objects that contain non-ASCII characters. To prevent that, we must implement safe ways to encode unicode into str, enforcing UTF-8 encoding.

Additionally, indices in str objects count byte-wise, not character-wise - which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Further, in Python 2, str.lower() and str.upper() may not work correctly for some unicode characters (e.g. "Ẽ".lower() gives "Ẽ" again - instead of "ẽ"), depending on the server locale setting. Therefore, for any substring- or character-operations we must safely decode the str into a unicode object, assuming UTF-8 encoding.
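
The difference between byte-wise and character-wise indexing can be seen in a few lines. This is only a sketch: the sample text is arbitrary, and the u"\u1ebc" escape (the character "Ẽ") is used here purely to keep the example independent of the file encoding and of the Python version.

```python
# Sample text: the character U+1EBC followed by "den" (4 characters)
text = u"\u1ebcden"
encoded = text.encode("utf-8")   # the same text as UTF-8 encoded bytes

char_count = len(text)           # counts characters
byte_count = len(encoded)        # counts bytes (U+1EBC takes 3 bytes)

# Slicing the encoded bytes can cut a multi-byte character in half
try:
    encoded[:2].decode("utf-8")
    slice_is_valid = True
except UnicodeDecodeError:
    slice_is_valid = False

# Decoding first makes slicing and case conversion character-wise
first_char = encoded.decode("utf-8")[:1]
lowered = first_char.lower()
```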

Unicode-Guideline

1) All functions dealing with user-strings should be designed to accept both str and unicode, while safely handling strings with non-ASCII characters. For unicode-safe conversions, we use s3_unicode(s) and s3_str(s), instead of unicode(s) and str(s).

2) Where we receive str input, we assume utf-8 encoding. Most common encodings are subsets of utf-8 so that this is the safest assumption we can make.

3) Before indexing, splitting, slicing or iterating over a user-string, we always convert it into a unicode using s3_unicode, e.g.:

    s_slice = s3_unicode(s)[:4]

4) To ensure correct behavior of lower() and upper() conversion, we convert into unicode first, using s3_unicode, e.g.:

    s_lowercase = s3_unicode(s).lower().encode("utf-8")

5) We assume that any (external) function we call may attempt to convert input by calling str() - so we generally deliver all strings as utf-8 encoded str to prevent UnicodeDecodeErrors. This can be done by:

    s = s3_str(s)

or:

    s = s3_unicode(s).encode("utf-8")

6) System-strings (like table or field names, attribute names, etc.) should never contain non-ASCII characters, so that they safely pass through str().

7) In reading XML, we follow the encoding specified in the XML declaration rather than making assumptions about the encoding. For all other sources, we assume utf-8 (see 2). In exports, we always write utf-8.

8) All code is utf-8 encoded, so that all string constants are automatically utf-8 encoded str. We do not use u"..." for string constants.
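
To illustrate guidelines 2) to 5), here is a rough sketch of what such conversion helpers could look like. It is not the actual implementation of s3_unicode and s3_str: the names to_unicode and to_utf8 are hypothetical stand-ins, and the sketch only covers the simple cases (byte strings, unicode objects, and other objects with a plain text representation).

```python
text_type = type(u"")   # unicode in Python 2, str in Python 3

def to_unicode(s):
    """
        Hypothetical stand-in for s3_unicode

        @param s: a byte string, unicode object or any other object
        @return: a unicode object, assuming UTF-8 for byte strings
    """
    if isinstance(s, text_type):
        return s
    if isinstance(s, bytes):
        return s.decode("utf-8")
    return text_type(s)

def to_utf8(s):
    """
        Hypothetical stand-in for s3_str

        @param s: a byte string, unicode object or any other object
        @return: the UTF-8 encoded byte string
    """
    return to_unicode(s).encode("utf-8")

# UTF-8 encoded input: the character U+1EBC followed by "DEN"
raw = b"\xe1\xba\xbcDEN"

s_slice = to_unicode(raw)[:2]                           # character-wise slice
s_lowercase = to_unicode(raw).lower().encode("utf-8")   # safe lowercasing
```

Both helpers accept either string type, so callers do not need to know which one they were given.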

Tools

Some automated bug analysis / code quality checking tools -
- PyLint
  - gives detailed report
  - code quality score tells exact impact of the changes made
  - http://www.logilab.org/card/pylint_tutorial
- PyChecker recommended in PythonStyleGuide

Exceptions

Do not use naked excepts. Yes, there are lots already in the code base. Do not make the situation worse. They are being gradually fixed.
- If you do use a naked except, the exception should be re-raised.
- Database errors are the typical tricky case as we don't want to create dependencies on database modules that aren't used. You can catch BaseException and check the exception class name, or you can provide and reuse standard tuples of exception classes in the configuration that map to the same generic database error.
Try blocks should be minimised, so that there is less chance of catching the same exception type from an unexpected place (and then handling it incorrectly). To do this, move the code that precedes the exception-raising line to before the try block, and the code that follows it into an else block.
It's good practice to clean up resources in finally blocks.
Avoid raising errors inside exception handlers.
Don't ignore exceptions. At the very least, log them, or raise them in debug mode.

Instead of this:

try:
    # this could raise an error we aren't prepared for
    file_path = get_file_path()
    f = file(file_path, 'w')
    # this could also raise an exception
    f.write('Please send the helicopter to %s' % disaster_location)
    f.flush()
except:
    pass
    # Argh! by ignoring the exception, now lots of things could be messed
    # up and the rest of the code could break in strange ways.
    # We will have a hard time figuring out the reason as the original 
    # exception is now lost.
    # Worse, maybe the user doesn't notice and people die on a mountain.

Do this:

import logging

# now we see any errors here separately
file_path = get_file_path()
# initialise f, so that the finally block is safe even if opening fails
f = None
try:
    # the try block is minimised
    f = open(file_path, "w")
except IOError as exception:
    # e.g. a permissions violation
    if session.s3.debug:
        # re-raise in development
        raise
    else:
        # logging is good
        logging.error("%s: %s" % (file_path, exception))

        # email reports are great as the developer doesn't have
        # to log into the machine to read them.
        # this mustn't raise exceptions (tricky)
        send_error_email_to_devs(exception, locals(), globals(), request)

    if not exception_has_been_totally_handled_and_no_further_consequences():
        # let the user know so they can take action if possible.
        raise HTTPError(500, "Cannot queue your message because: %s" % exception)
else:
    f.write("Please send the helicopter to %s" % disaster_location)
    f.flush()
finally:
    # finalise resources (f is still None if the file could not be opened)
    if f is not None:
        f.close()

When should you raise Exceptions, anyway?

If it is not possible to continue with the respective processing because there cannot be any plausible result, then the necessary consequences depend heavily on the cause:

You should never raise an exception for Validation Errors - any invalid user input must be caught and properly reported back to the client (simply because users can't do anything about an exception, so it would render the functionality unusable - whilst a proper error message may help the user to correct their input). Failure to catch validation errors is always a bug.
Configuration Errors, wherever possible, should at most issue a warning (if the invalid setting is obviously a mistake), and else get ignored and fall back to reasonable defaults. Only if there is no possible fallback action, then Configuration Errors should be treated as Runtime Errors.
Runtime Errors may raise exceptions if unrecoverable (i.e. if there's no reasonable handling possible at the respective level). However, if possible and in production mode, they should be caught at a higher level and properly reported back to the client in order to explain the reason and allow correction. Remember logging!
Programming Errors (=bugs) must lead to exceptions in order to attract the immediate attention of the programmer to the bug and help them debug it. They should never be caught unless you intend to implement a plausible fallback action (and even then a warning may be appropriate in most cases).

Eden is an unsupervised server system, so you must have very good reasons to let it crash.

Remember that any uncaught exception leads to an HTTP 500 status with error ticket - and an HTTP 500 status in Eden is a bug by definition. Users can't do anything about error tickets, so the only legitimate reason to raise an uncaught exception in Eden is that you are 100% sure that the error condition leading to the exception is caused by a bug.
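
The distinction between validation errors and programming errors can be sketched as follows. Both functions are hypothetical; the (value, error) return tuple is loosely modelled on the tuples returned by web2py validators, and the age limits are arbitrary.

```python
def validate_age(value):
    """
        Validate user input without raising (hypothetical helper)

        @param value: the raw form value
        @return: tuple (value, error), where error is None on success
    """
    try:
        age = int(value)
    except (TypeError, ValueError):
        # Validation error: report it back to the user, never raise
        return (value, "Please enter a whole number")
    if not 0 <= age <= 150:
        return (value, "Please enter an age between 0 and 150")
    return (age, None)

def age_group(age):
    """
        Classify an age that has already been validated

        @param age: the age as an integer
        @return: "minor" or "adult"
    """
    if not isinstance(age, int):
        # Programming error: callers must validate first, so fail loudly
        raise TypeError("age_group() requires a validated integer")
    if age < 18:
        return "minor"
    return "adult"
```

Invalid input such as validate_age("abc") produces an error message for the user, whereas passing an unvalidated string to age_group() is a bug in the calling code and raises immediately.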

Nulls

Nulls make code much more complicated by introducing special cases that require special null-checking logic in every client, making code harder to understand and maintain. They often and easily cause hard-to-diagnose errors.

Avoid returning nulls wherever possible. Unfortunately the design of Python and web2py makes this very difficult. E.g. if creating a data structure from many objects, instead of returning nulls from each object and requiring special checking before adding to the structure, pass the data structure to the objects and let them fill in as necessary.
Check for nulls and handle them properly unless you are sure you aren't going to get them.
Unit tests should check that nulls aren't returned.
If you must return a None at the end of the function, do so explicitly, to make the intention clear, rather than falling off the end.
Database fields should generally be non-nullable. If there is the chance of unknown data in certain fields, it indicates that the table may be too big, and those fields may need splitting out into separate tables. It may seem reasonable to allow unknown data, but it places the burden of null-checking on every client of the table. If any client doesn't do it (the default), then there is a risk of undefined behaviour. By making them non-nullable, we get a level of code validation from the database layer.
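
The advice to pass the data structure to the objects, rather than collecting possibly-null return values, might look like this in practice. The Shelter class and its attributes are hypothetical and serve only as an illustration.

```python
class Shelter(object):
    """ Hypothetical record which may or may not have a phone number """

    def __init__(self, name, phone=""):
        self.name = name
        self.phone = phone

    def add_contacts(self, contacts):
        """
            Add this shelter's phone number to contacts, if there is one

            @param contacts: dict {name: phone} to fill in
        """
        if self.phone:
            contacts[self.name] = self.phone

def collect_contacts(shelters):
    """
        Build the contacts dict without any null-checking in the caller

        @param shelters: list of Shelter instances
        @return: dict {name: phone}
    """
    contacts = {}
    for shelter in shelters:
        shelter.add_contacts(contacts)
    return contacts
```

Each object decides for itself whether it has anything to contribute, so the collecting code never has to handle a None.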

JavaScript

Use ' not " where possible
Terminate all appropriate lines with ; (minifies better)
Take care to not have a trailing comma in arrays (IE8 and lower barf and hence so does Closure)
Avoid use of the ternary operator (many users are unfamiliar with it, and it puts 2 statements in 1 line, which makes debugging harder)

http://javascript.crockford.com/code.html

HTML

IDs and classes should be named in a way that they don't conflict, and should be semantic (describing what they are for) rather than presentational.

DeveloperGuidelines
