Code Conventions
These conventions should be followed in all code.
NOTE: These coding conventions are mandatory for code to be accepted for the Stable series!
A common style facilitates a team of developers working on the same code base. If each developer were to use their own style, there would be jarring changes in the code as someone reads through it. This slows comprehension and can also cause spurious differences in changesets as developers alter style in frivolous ways on the code they touch.
Our style guide aims to make the code base appear as if it were written by one precise punctilious programmer, rather than a cacophony of competing coders. As with many aspects of coding and writing, there can be differences of opinion about the best style. These issues distract from the goal of writing code that works. Most project contributors have a slightly different personal coding style, but we all use the Sahana Eden style when contributing to the Sahana Eden project.
Python
Code Style
- http://www.python.org/dev/peps/pep-0008/
- http://code.google.com/p/soc/wiki/PythonStyleGuide
- Limit line length to 80 characters
- Use " " for strings, UNLESS the string contains a ", in which case use ' (see also Quotes)
- Global variables should be avoided - use response.s3 to store them.
PEP8 Script
Use static/scripts/tools/pep8.py to check for PEP8 compliance.
Execute the following in your eden directory:
python static/scripts/tools/pep8.py yourfile.py
Quotes
- for string constants in Python use double-quotes " ", UNLESS the string contains a "
- for string constants in JavaScript use single-quotes ' ', UNLESS the string contains a '
- for attributes in HTML use single-quotes ' '
- for triple-quoted strings (such as Docstrings or Inline-XML) use """ """ with double-quotes, except:
  - for Inline-JavaScript or JSON inside Python code use ''' ''' with single-quotes
  Note though that this:
  inline_javascript = '''sometext="%s";''' % sometext
  is a potential bug - the JavaScript breaks when sometext contains a ".
  A safer way to do that is:
  inline_javascript = '''sometext=%s;''' % json.dumps(sometext)
- for XML attributes use double-quotes " " (inner quotes must be escaped anyway)
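To make the pitfall above concrete, the following minimal sketch compares both approaches for a value that contains a double quote. The input value and the variable names unsafe_js and safe_js are illustrative only.

```python
import json

# Illustrative input that contains double quotes
sometext = 'He said "hello"'

# Naive interpolation: the embedded quote terminates the JavaScript
# string literal early, so the resulting statement is invalid
unsafe_js = '''sometext="%s";''' % sometext

# json.dumps adds the surrounding quotes itself and escapes the inner ones
safe_js = '''sometext=%s;''' % json.dumps(sometext)
```

Note that the safe variant drops the quotes from the template, because json.dumps already emits them.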
Naming conventions
- Function names for the global namespace should start with an "s3_" prefix (if they are not module-specific) or with the module prefix plus underscore.
- Names for Eden-specific methods in subclasses of web2py classes also start with "s3_".
- Method names in Eden-specific classes do not need to (and thus should not) be prefixed
- Names for Eden-specific subclasses of web2py classes and all classes which over-ride existing classes should have the suffix "S3".
- S3Model subclasses in s3db should start with the all-uppercase module prefix, e.g. DVRCaseModel, PRImageModel (because the universal S3 prefix makes it harder to avoid name collisions between modules)
- S3Method subclasses in s3db must start with the all-lowercase module prefix plus underscore, e.g. pr_Contacts (otherwise S3Model won't export them)
- All other classes defined in Eden start with "S3", e.g. S3Resource (a genuine Eden class) vs. AuthS3 (an Eden subclass of a web2py class).
- Class names start with an uppercase letter.
- CamelCase is for class names only.
- Constant names are all-uppercase and with underscores as word separator.
- everything else (including table names and field names) should be all-lowercase with underscores as word separator.
- Names should be obvious and explicit, but not overly verbose (i.e. as long as they need to be to not make people think). They shouldn't require someone to look in another file or solve a puzzle to figure out what they mean, but shouldn't take too long to write. Avoid inventing new acronyms, e.g.:
  - bad: s3_prm_lkp, method_that_allows_us_to_create_a_gis_layer
  - good: s3_personnel_search, create_gis_layer
- Names should not be Python keywords or standard library names
- Variable names should not start with an _underscore (a trailing underscore_ is acceptable in local scope, though)
- Indexes in names should be avoided (e.g. file1, file2), except where they designate new versions of classes/functions (e.g. S3AddPersonWidget2)
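As a rough illustration of how several of these rules combine, here is a minimal sketch. All names in it (MAX_SEARCH_RESULTS, the sample data, S3ExampleWidget) are hypothetical and chosen only to show the naming patterns; they are not existing Eden code.

```python
# Constant: all-uppercase with underscores as word separator
MAX_SEARCH_RESULTS = 2

# Function in the global namespace that is not module-specific: "s3_" prefix,
# all-lowercase with underscores
def s3_personnel_search(name, limit=MAX_SEARCH_RESULTS):
    """ Return up to limit sample names starting with name (hypothetical) """
    people = ("alice", "alina", "alistair", "bob")
    matches = [person for person in people if person.startswith(name)]
    return matches[:limit]

# Class defined in Eden (not a subclass of a web2py class): CamelCase,
# starting with "S3"
class S3ExampleWidget(object):

    # Method in an Eden-specific class: no prefix
    def render(self):
        # Python string in double-quotes, HTML attribute in single-quotes
        return "<div class='example-widget'></div>"
```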
Comments and Docstrings
- All files, classes and functions should have docstrings that allow API documentation to be auto-generated using epydoc
Suggested style:
"""
    This function is just an example

    @param: None
"""
- Add comments so others can understand your code when they try to find bugs or improve/extend it
- Comments in code should explain intent (or clarify techniques), not repeat the code (do not explain what is self-explaining)
Poor:
# Set free to True
free = True
Good:
# Indicate that the position is now free
free = True
- Commented-out code should have a comment explaining why it is commented out
Unclear:
#result = somefunction(somearg)
result = 0
Better:
# Not working yet, @todo: fix somefunction
#result = somefunction(somearg)
result = 0
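For a function that does take parameters, a docstring in the suggested style might look like the following sketch. The function s3_example_area is hypothetical; the @param and @return fields are standard epydoc fields.

```python
def s3_example_area(width, height):
    """
        Compute the area of a rectangle (hypothetical example function)

        @param width: the width of the rectangle
        @param height: the height of the rectangle

        @return: the area, i.e. width multiplied by height
    """

    # Explain intent, not mechanics: negative sizes indicate a caller bug,
    # so they are reported loudly rather than silently corrected
    if width < 0 or height < 0:
        raise ValueError("width and height must not be negative")

    return width * height
```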
Internationalisation
- All user-visible strings should be Internationalised: T("My string")
  - Remember to str() them before concatenating with strings (such as HTML tags): "<p>" + str(T("My String")) + "</p>"
- All labels should be Internationalised in controllers/module.py def resource(): table.field.label = T("My Label")
- DeveloperGuidelinesInternationalisation
Unicode
Background
As a universal requirement, Sahana must support any unicode character in user-strings (i.e. from the database, forms or translations).
Python 'unicode' objects are sequences of code points from the unicode table (each code point representing a character, stored internally as 2-byte or 4-byte units depending on how Python 2 was built), which can be used to store strings containing any unicode characters.
Such 'unicode' objects are not printable, though, i.e. they are not generally understood outside of the Python VM. When writing to interfaces, unicode-objects must be encoded as strings of printable characters, which Python represents as 'str' objects. The most common character encoding that covers all unicode characters is UTF-8.
The str() constructor in Python 2 assumes that its argument is ASCII-encoded, and raises an exception for unicode-objects that contain non-ASCII characters. To prevent that, we must implement safe ways to encode unicode into str, enforcing UTF-8 encoding.
Additionally, indices in str objects count byte-wise, not character-wise - which can lead to invalid characters when extracting substrings from UTF-8 encoded strings. Further, in Python 2, str.lower() and str.upper() may not work correctly for some unicode characters (e.g. "Ẽ".lower() gives "Ẽ" again - instead of "ẽ"), depending on the server locale setting. Therefore, for any substring- or character-operations we must safely decode the str into a unicode object, assuming UTF-8 encoding.
Unicode-Guideline
1) All functions dealing with user-strings should be designed to accept both str and unicode, while safely handling strings with non-ASCII characters. For unicode-safe conversions, we use s3_unicode(s) and s3_str(s), instead of unicode(s) and str(s).
2) Where we receive str input, we assume utf-8 encoding. ASCII is a strict subset of utf-8, and utf-8 can represent every unicode character, so this is the safest assumption we can make.
3) Before indexing, splitting, slicing or iterating over a user-string, we always convert it into a unicode using s3_unicode, e.g.:
s_slice = s3_unicode(s)[:4]
4) To ensure correct behavior of lower() and upper() conversion, we convert into unicode first, using s3_unicode, e.g.:
s_lowercase = s3_unicode(s).lower().encode("utf-8")
5) We assume that any (external) function we call may attempt to convert input by calling str() - so we generally deliver all strings as utf-8 encoded str to prevent UnicodeDecodeErrors. This can be done by:
s = s3_str(s)
or:
s = s3_unicode(s).encode("utf-8")
6) System-strings (like table or field names, attribute names, etc.) should never contain non-ASCII characters, so that they safely pass through str().
7) In reading XML, we follow the encoding specified in the XML declaration rather than making assumptions about the encoding. For all other sources, we assume utf-8 (see 2). In exports, we always write utf-8.
8) All code is utf-8 encoded, so that all string constants are automatically utf-8 encoded str. We do not use u"..." for string constants.
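The reasoning behind guidelines 3) to 5) can be sketched as follows. This is an illustration only, written in Python 3 terms, where bytes plays the role of the Python 2 str type and str plays the role of unicode. The helpers to_text and to_utf8 are hypothetical stand-ins for s3_unicode and s3_str, not their actual implementation.

```python
def to_text(s, encoding="utf-8"):
    """ Hypothetical stand-in for s3_unicode: decode bytes, pass text through """
    if isinstance(s, bytes):
        return s.decode(encoding)
    return str(s)

def to_utf8(s):
    """ Hypothetical stand-in for s3_str: always deliver utf-8 encoded bytes """
    return to_text(s).encode("utf-8")

# "\u1ebc" is the character E-with-tilde: 3 bytes in utf-8, followed by "a"
raw = "\u1ebca".encode("utf-8")

# Byte-wise slicing cuts the first character apart and leaves invalid utf-8
broken = raw[:2]

# Character-wise slicing after decoding keeps the character intact
first_char = to_text(raw)[:1]

# Case conversion on the decoded text is unicode-aware; re-encode afterwards
lowered = to_utf8(to_text(raw).lower())
```

Decoding once at the boundary and encoding again only when handing the string to external code keeps all substring and case operations character-wise.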
Tools
- Some automated bug analysis / code quality checking tools:
  - PyLint
    - gives detailed report
    - code quality score tells exact impact of the changes made
    - http://www.logilab.org/card/pylint_tutorial
  - PyChecker (recommended in PythonStyleGuide)
Exceptions
- Do not use naked excepts. Yes there are lots already in the code base. Do not make the situation worse. They are being gradually fixed.
- If you do use a naked except, the exception should be re-raised.
- database errors are the typical tricky case as we don't want to create dependencies on database modules that aren't used. You can catch BaseException and check the exception class name, or you can provide and reuse standard tuples of exception classes in the configuration that map to the same generic database error.
- try blocks should be minimised, so that there is less chance of catching the same exception type from a place that wasn't expected (and then handling it incorrectly). To do this, move the code that precedes the exception-raising line to before the try block, and the code that follows it into an else block.
- It's good practice to clean up resources in finally blocks.
- avoid raising errors inside exception handlers.
- don't ignore exceptions. At the very least, log them, or raise them in debug mode.
Instead of this:
try:
    # this could raise an error we aren't prepared for
    file_path = get_file_path()
    f = open(file_path, "w")
    # this could also raise an exception
    f.write("Please send the helicopter to %s" % disaster_location)
    f.flush()
except:
    pass
    # Argh! by ignoring the exception, now lots of things could be messed
    # up and the rest of the code could break in strange ways.
    # We will have a hard time figuring out the reason as the original
    # exception is now lost.
    # Worse, maybe the user doesn't notice and people die on a mountain.
Do this:
# now we see any errors here separately
file_path = get_file_path()
try:
    # the try block is minimised
    f = open(file_path, "w")
except IOError as exception:
    # e.g. a permissions violation
    if session.s3.debug:
        # re-raise in development
        raise
    else:
        # logging is good
        logging.error("%s: %s" % (file_path, exception))
        # email reports are great as the developer doesn't have
        # to log into the machine to read them.
        # this mustn't raise exceptions (tricky)
        send_error_email_to_devs(exception, locals(), globals(), request)
        if not exception_has_been_totally_handled_and_no_further_consequences():
            # let the user know so they can take action if possible.
            raise HTTPError(500, "Cannot queue your message because: %s" % exception)
else:
    # f only exists if open() succeeded, so it is closed here
    # rather than in a finally clause of the outer try
    try:
        f.write("Please send the helicopter to %s" % disaster_location)
        f.flush()
    finally:
        # finalise resources
        f.close()
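The suggestion above to reuse a standard tuple of exception classes for database errors could look roughly like the following sketch. It uses sqlite3 from the standard library purely as an example adapter; DB_ERRORS and table_is_readable are hypothetical names, and in a real deployment the tuple would come from the configuration rather than being defined next to the calling code.

```python
import logging
import sqlite3

# Hypothetical configuration value: the exception classes that map to a
# generic "database error" for the adapter in use. Other adapters would
# contribute their own classes, so calling code needs no adapter imports.
DB_ERRORS = (sqlite3.Error,)

def table_is_readable(connection, tablename):
    """
        Check whether a table can be queried

        @param connection: an open DB-API connection
        @param tablename: the table name (a trusted system-string)

        @return: True if the query succeeded, otherwise False
    """
    query = "SELECT 1 FROM %s LIMIT 1" % tablename
    try:
        # Only the call that can raise is inside the try block
        connection.execute(query)
    except DB_ERRORS as exception:
        # Never ignore the exception: log it at the very least
        logging.error("%s: %s", query, exception)
        return False
    else:
        return True
```

Because the except clause names only the configured classes, unrelated errors (for example a NameError caused by a typo) still surface instead of being swallowed.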
When should you raise Exceptions, anyway?
If it is not possible to continue with the respective processing because there can not be any plausible result, then the necessary consequences highly depend on the cause:
- You should never raise an exception for Validation Errors - any invalid user input must be caught and properly reported back to the client (simply because users can't do anything about an exception, so it would render the functionality unusable - whilst a proper error message may help the user to correct their input). Failure to catch validation errors is always a bug.
- Configuration Errors, wherever possible, should at most issue a warning (if the invalid setting is obviously a mistake), and else get ignored and fall back to reasonable defaults. Only if there is no possible fallback action, then Configuration Errors should be treated as Runtime Errors.
- Runtime Errors may raise exceptions if unrecoverable (i.e. if there's no reasonable handling possible at the respective level). However, if possible and in production mode, they should be caught at a higher level and properly reported back to the client in order to explain the reason and allow correction. Remember logging!
- Programming Errors (=bugs) must lead to exceptions in order to attract immediate attention of the programmer to the bug and help them debugging. They should never be caught unless you intend to implement a plausible fallback action (and even then a warning may be appropriate in most cases).
Eden is an unsupervised server system, so you must have very good reasons to let it crash.
Remember that any uncaught exception leads to a HTTP 500 status with error ticket - and an HTTP 500 status in Eden is a bug by definition. Users can't do anything about error tickets, so the only legitimate reason to raise an uncaught exception in Eden is that you are 100% sure that the error condition leading to the exception is caused by a bug.
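The difference between validation errors and programming errors can be sketched as follows. Both functions are hypothetical; the (value, error) return pair is loosely modelled on the convention used by web2py validators, and the specific limits and messages are illustrative.

```python
def validate_age(value):
    """
        Hypothetical validator: invalid user input is reported back as an
        error message, never raised as an exception

        @param value: the raw user input

        @return: tuple (value, error), where error is None if input is valid
    """
    try:
        age = int(value)
    except (TypeError, ValueError):
        return (value, "Please enter a whole number")
    if not 0 <= age <= 150:
        return (value, "Please enter an age between 0 and 150")
    return (age, None)

def average_age(ages):
    """
        Hypothetical helper whose callers guarantee validated, non-empty input

        @param ages: non-empty list of integers

        @return: the average as float
    """
    if not ages:
        # Reaching this point means the calling code has a bug, so raise
        # in order to attract the programmer's attention
        raise ValueError("average_age called without any ages")
    return sum(ages) / float(len(ages))
```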
Print Statements
For compatibility with WSGI as well as newer web2py (>=2.15), using the print statement in web application sources is strongly discouraged.
CLI scripts containing print statements will even crash since gluon/shell.py now future-imports the print()-function.
For everything that is to be executed in the restricted environment (i.e. all server code):
- for messages that are relevant during system setup (e.g. 1st run, module import failures etc.), use sys.stderr.write()
- for run-time debug/error messages: use current.log.*, as it can be controlled centrally, and routed to a log file when web2py isn't console-run
- for temporary debug output in code under development, use sys.stderr.write()
- if absolutely necessary (re-think your design!), add the future-import to the file and use the print-function
For CLI scripts:
- use sys.stderr.write() for status/error messages, or sys.stdout.write() for results
- alternatively, add the future-import to the script and use the print-function
Using sys.std*.write() (remember that write expects string/buffer, so you must convert any parameters explicitly):
import sys

# For status/error/debug messages:
sys.stderr.write("Here is a status message\n")

# For results:
sys.stdout.write("%s\n" % result)
Using the logger:
from gluon import current

# Use this for any permanent message output in server code:
current.log.error("Here is an error message")
current.log.debug("And this is a debug message")
If neither of the above is possible, import+use the print-function:
# This must be the first statement in the file:
from __future__ import print_function

# Needed for the file=sys.stderr argument below:
import sys

# Then use the print-function:
print("The print-function", "accepts", "multiple arguments")

# Remember that server code must write to stderr, not stdout:
print("Direct output to stderr", file=sys.stderr)
Nulls
Nulls make code much more complicated by introducing special cases that require special null-checking logic in every client, making code harder to understand and maintain. They often and easily cause hard-to-diagnose errors.
- Avoid returning nulls wherever possible. Unfortunately the design of python and web2py make this very difficult. E.g. if creating a data structure from many objects, instead of returning nulls from each object and requiring special checking before adding to the structure, pass the data structure to the objects and let them fill in as necessary.
- Check for nulls and handle them properly unless you are sure you aren't going to get them.
- Unit tests should check that nulls aren't returned.
- if you must return a None at the end of the function, do so explicitly, to make the intention clear, rather than falling off the end.
- Database fields should generally be non-nullable. If there is the chance of unknown data in certain fields, it indicates that the table may be too big, and those fields may need splitting out into separate tables. It may seem reasonable to allow unknown data, but it places the burden of null-checking on every client of the table. If any client doesn't do it (the default), then there is a risk of undefined behaviour. By making them non-nullable, we get a level of code validation from the database layer.
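The advice above to pass the data structure to the objects, rather than collecting possibly-null return values, might be sketched like this. The Shelter class and collect_markers function are hypothetical examples, not Eden code.

```python
class Shelter(object):
    """ Hypothetical object that may or may not know its location """

    def __init__(self, name, lat=None, lon=None):
        self.name = name
        self.lat = lat
        self.lon = lon

    def add_marker(self, markers):
        """
            Append a marker for this shelter, if its location is known

            @param markers: the list to fill in
        """
        if self.lat is not None and self.lon is not None:
            markers.append((self.name, self.lat, self.lon))

def collect_markers(shelters):
    """
        Build the list of map markers for a number of shelters

        @param shelters: iterable of Shelter objects

        @return: list of (name, lat, lon) tuples, possibly empty, never None
    """
    markers = []
    for shelter in shelters:
        # No null-check needed here: each object fills in what it has
        shelter.add_marker(markers)
    return markers
```

The null-handling lives in exactly one place (add_marker), and callers of collect_markers always receive a list.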
JavaScript
- Use ' not " where possible
- Terminate all appropriate lines with ; (minifies better)
- Take care to not have a trailing comma in arrays (IE8 and lower barf and hence so does Closure)
- Avoid use of the ternary operator (many users are unfamiliar with it, and it puts 2 statements on 1 line, which makes debugging harder)
HTML
ids & classes should be named in a way that they don't conflict, and should be semantic (what they are for) rather than presentational:
- http://woork.blogspot.co.uk/2008/11/css-coding-semantic-approach-in-naming.html
- http://sixrevisions.com/css/css-tips/css-tip-2-structural-naming-convention-in-css/