Writing code which runs fast.
NB Performance is not everything, and you should not optimize any code without actually having a problem with it. In most cases, response times of around 600ms are totally acceptable (or minor compared with the average download times on-site).
There are more things to consider besides performance, e.g. usability and maintainability of code. New contributors should be able to easily understand the code, and bugs should be easy to find.
- If a specific inner-loop routine cannot be optimised in Python, then consider writing a C routine for this use case.
When to load the scripts:
"Slow post-load response is more harmful to user satisfaction than slow page load times, according to current HCI research."
- Optimize the models, throw away what we don't need. Every field counts.
References (joins) are especially problematic for performance, as they execute implicit DB requests. The more complex the references are, the slower the model loads.
- Function definitions in models do _NOT_ harm - the functions are not executed when the model is loaded, but just compiled - pre-compilation of the whole application gives a speed-up of just 10ms (compare to the total execution times!).
In contrast to that, module-level commands (which are executed at time of loading of the model, e.g. CRUD string definitions) slow it down. Suggestion: put them into a "config" function for that model, and call only as needed.
- Avoid _implicit_ redirects (that is, redirects without user interaction, e.g. as in open_module), although some redirects cannot be avoided.
A redirect simply doubles the response time (executes a new request and thus loads it all again).
- Be careful with Ajax - it might work nicely in local environments, but in real-world deployments it has proven to be unreliable and slow.
- Python runs very fast as opposed to DB queries, so it can be much faster to retrieve 100 rows in one query and then drop 95 of them in the Python code, than to retrieve 5 rows in 5 queries (=do not loop over queries, better loop over the results).
- Consider keeping configurations which are read from the DB frequently but written rarely in configuration files which are written out from the DB (like the CSS from themes).
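The suggestion above about moving module-level commands into a "config" function could look roughly like the following minimal sketch. The names crud_strings and mytable_config are hypothetical, and a plain dict stands in for wherever the application keeps such settings; the point is only that the setup runs on first use rather than every time the model is loaded.

```python
# Hypothetical sketch: defer per-table setup until it is actually needed.
crud_strings = {}                 # stand-in for the application's settings store
calls = {"mytable_config": 0}     # only here to show how often the setup runs


def mytable_config():
    """Define the CRUD strings for 'mytable' on first use only."""
    if "mytable" in crud_strings:
        return crud_strings["mytable"]
    calls["mytable_config"] += 1
    crud_strings["mytable"] = {
        "title_create": "Add Record",
        "title_list": "List Records",
    }
    return crud_strings["mytable"]


# Loading the module costs nothing extra; a controller that needs the
# strings calls the function, and repeated calls reuse the stored result.
first = mytable_config()
second = mytable_config()
```

Requests which never touch this table never pay for the setup at all.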
NB These results vary from case to case, so use the Profiler (pass the argument -F profiler.log when running web2py.py) to see how they behave in your case...
for i in xrange(0, len(rows)):
    row = rows[i]
runs much faster than:
for row in rows:
(0.05 vs. 0.001 seconds in one test case, a 2x improvement in another and a slight slowdown in a third).
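Since the results differ so much between test cases, it may be worth timing both loop styles on your own data. The following is a minimal sketch using the standard timeit module; it assumes a plain list as a stand-in for the rows (so the numbers will not match those quoted above) and uses range instead of xrange so that it also runs under Python 3.

```python
import timeit

rows = list(range(10000))  # stand-in data; replace with your own rows


def by_index():
    total = 0
    for i in range(len(rows)):
        row = rows[i]
        total += row
    return total


def by_iteration():
    total = 0
    for row in rows:
        total += row
    return total


# Both variants must produce the same result before timing is meaningful.
assert by_index() == by_iteration()

t_index = timeit.timeit(by_index, number=100)
t_iter = timeit.timeit(by_iteration, number=100)
print("by index: %.4fs, by iteration: %.4fs" % (t_index, t_iter))
```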
value = db(table.id == id).select(table.field, limitby=(0, 1)).first()
runs 1.5x faster than:
value = table[id].field
(0.012 vs. 0.007 seconds in a test case)
NB If only expecting one record then the limitby provides a big speedup!
is significantly slower than
y = dict.get("x", None)
if y:
is as fast as
if "x" in dict:
    y = dict["x"]
Also, calling keys() slows things down.
if a in dict.keys()
is ~25% slower than:
if a in dict
Another thing is that the profiler showed that there is extensive use of isinstance. So I tried to find an alternative, which would be:
if type(x) is dict
In fact, this is ~30% faster than isinstance, but it won't find subclasses. So, if you test for:
if isinstance(x, dict)
and want Storages to match, then you cannot replace isinstance.
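The difference in matching behaviour can be checked directly. In this sketch the Storage class is a simplified stand-in (an assumption for illustration, not web2py's actual implementation); all that matters is that it subclasses dict.

```python
class Storage(dict):
    """Simplified stand-in for a dict subclass such as web2py's Storage."""
    pass


plain = {"a": 1}
stored = Storage(a=1)

exact_plain = type(plain) is dict        # exact type match
exact_stored = type(stored) is dict      # a subclass is not an exact match
inst_plain = isinstance(plain, dict)
inst_stored = isinstance(stored, dict)   # subclasses do match here

print(exact_plain, exact_stored, inst_plain, inst_stored)
```

So the exact type comparison is only a safe replacement where subclasses are never expected.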
A real killer is hasattr(). I ran 5 million loops of
if "a" in dict:
versus
if hasattr(dict, "a"):
which took 4.5s vs. 12s.
Hence - for dicts, avoid hasattr to test for containment.
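To repeat this kind of measurement yourself, something like the following sketch may help. AttrDict is a hypothetical dict with attribute access, written only so that all three membership tests are meaningful on the same object; absolute timings depend on the Python version (in Python 3, keys() returns a view rather than building a list, so its overhead is likely smaller than the figure quoted above).

```python
import timeit


class AttrDict(dict):
    """Hypothetical dict with attribute access, for illustration only."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


d = AttrDict(a=1)

# All three tests agree for this object...
assert ("a" in d) and ("a" in d.keys()) and hasattr(d, "a")
assert ("b" not in d) and not hasattr(d, "b")

# ...so only their cost differs.
n = 200000
timings = {
    '"a" in d': timeit.timeit(lambda: "a" in d, number=n),
    '"a" in d.keys()': timeit.timeit(lambda: "a" in d.keys(), number=n),
    'hasattr(d, "a")': timeit.timeit(lambda: hasattr(d, "a"), number=n),
}
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("%-18s %.4fs" % (label, seconds))
```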
Golden Rules for DB Queries
These "rules" might seem a matter of course; however, sometimes you need to take a second look at your code:
- Insert a temporary print >> sys.stderr, self.query into web2py's select() function and take a look at what it says.
One complex query is usually more efficient than multiple simple queries (and gives the DB server a chance to optimize):
Instead of:
records = db(db.mytable.name == name).select()
for r in records:
    other_records = db(db.othertable.code == r.code).select()
use:
rows = db((db.mytable.name == name) & (db.othertable.code == db.mytable.code)).select()
for row in rows:
    mytable_record = row.mytable
    othertable_record = row.othertable
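The difference between looping over queries and running a single join can be made visible by counting statements. This sketch uses Python's built-in sqlite3 module instead of the web2py DAL (an assumption made so that it runs standalone); the table names mirror the example, while the label column, the sample data and the run() helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, code TEXT);
    CREATE TABLE othertable (id INTEGER PRIMARY KEY, code TEXT, label TEXT);
    INSERT INTO mytable (name, code) VALUES ('x', 'A');
    INSERT INTO mytable (name, code) VALUES ('x', 'B');
    INSERT INTO mytable (name, code) VALUES ('y', 'C');
    INSERT INTO othertable (code, label) VALUES ('A', 'one');
    INSERT INTO othertable (code, label) VALUES ('B', 'two');
    INSERT INTO othertable (code, label) VALUES ('C', 'three');
""")

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one statement, count it, and return all rows."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Variant 1: one extra query per record of the first result set.
counter["queries"] = 0
looped = []
for (code,) in run("SELECT code FROM mytable WHERE name = ?", ("x",)):
    looped += run("SELECT label FROM othertable WHERE code = ?", (code,))
loop_queries = counter["queries"]

# Variant 2: a single join returns the same labels in one round trip.
counter["queries"] = 0
joined = run(
    "SELECT othertable.label FROM mytable "
    "JOIN othertable ON othertable.code = mytable.code "
    "WHERE mytable.name = ?",
    ("x",),
)
join_queries = counter["queries"]

print("loop: %d queries, join: %d query" % (loop_queries, join_queries))
```

The loop variant grows by one query for every matching record, whereas the join stays at one.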
Limit your Query
Ask exactly for what you expect - if you expect only one result, then limit the search by limitby:
Instead of:
db(db.mytable.id == id).select().first()
use:
db(db.mytable.id == id).select(limitby=(0,1)).first()
If you need only certain fields of a record, then don't ask for all:
Instead of:
my_value = db(db.mytable.id == id).select(limitby=(0,1)).first().value
use:
my_value = db(db.mytable.id == id).select(db.mytable.value, limitby=(0,1)).first().value
Don't ask twice…
...for the same record. Check further down in your code whether you need the same record again later:
Instead of:
my_value = db(db.mytable.id == id).select(db.mytable.value, limitby=(0,1)).first().value
...
other_value = db(db.mytable.id == id).select(db.mytable.other_value, limitby=(0,1)).first().other_value
use:
row = db(db.mytable.id == id).select(db.mytable.value, db.mytable.other_value, limitby=(0,1)).first()
if row:
    my_value = row.value
    other_value = row.other_value
Don't loop over Queries
...if you can avoid it:
Instead of:
for id in ids:
    my_record = db(db.mytable.id == id).select().first()
    ...
use:
records = db(db.mytable.id.belongs(ids)).select()
for record in records:
    ...
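At the SQL level, a belongs() query corresponds roughly to an IN clause. The sketch below illustrates that with the standard sqlite3 module rather than the web2py DAL (an assumption, so that it runs on its own); the value column and the sample data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO mytable (id, value) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)

ids = [1, 3, 4]

records = []
if ids:  # skip the query entirely when there is nothing to look up
    # One placeholder per id keeps the values out of the SQL string itself.
    placeholders = ", ".join("?" for _ in ids)
    sql = "SELECT id, value FROM mytable WHERE id IN (%s) ORDER BY id" % placeholders
    records = conn.execute(sql, ids).fetchall()

for record in records:
    print(record)
```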
Sometimes it is not as easy to see as in the above example - it could be hidden:
Instead of:
for x in y:
    id = some_function(x)
    if id:
        record = db(db.mytable.id == id).select()
use:
# collect the ids themselves (not the items of y), skipping empty results
ids = [id for id in map(some_function, y) if id]
if ids:
    records = db(db.mytable.id.belongs(ids)).select()
    for record in records:
        ...
Or more complex:
Instead of:
for x in y:
    if x.a == some_value:
        record = db(db.mytable.id == x.key).select()
        ...<branch 1>
    else:
        record = db(db.othertable.id == x.other_key).select()
        ...<branch 2>
use:
# collect the keys for each branch, skipping empty ones
ids1 = [x.key for x in y if x.a == some_value and x.key]
ids2 = [x.other_key for x in y if x.a != some_value and x.other_key]
if ids1:
    records = db(db.mytable.id.belongs(ids1)).select()
    for record in records:
        ...<branch 1>
if ids2:
    records = db(db.othertable.id.belongs(ids2)).select()
    for record in records:
        ...<branch 2>
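The id-collection step for the two branches can be tried out without a database. In this sketch the Item records and their values are hypothetical; note that a filter over y would return the items of y rather than their keys, so the id lists are built explicitly here.

```python
from collections import namedtuple

# Hypothetical input records; only the attributes used in the text exist.
Item = namedtuple("Item", ["a", "key", "other_key"])
some_value = "primary"
y = [
    Item("primary", 1, None),
    Item("secondary", None, 7),
    Item("primary", 2, None),
    Item("secondary", None, None),  # no usable id: skipped
]

# One list of ids per branch, so each branch needs only a single query.
ids1 = [x.key for x in y if x.a == some_value and x.key]
ids2 = [x.other_key for x in y if x.a != some_value and x.other_key]

print(ids1, ids2)
```

Each list can then be passed to one belongs() query per branch.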
- We have timings within the Selenium Functional Tests
- Web2Py can use cProfile:
web2py.py -F profiler.log
- or if running as service, edit
profiler_filename = 'profiler.log'
- YSlow plugin for Firebug: http://developer.yahoo.com/yslow/
- You can also use Pylot to test the application's behavior under load, and get more reliable results (+ in a nicer report form).
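For profiling a single routine outside of a web2py request, cProfile can also be used directly. This is a minimal standalone sketch (Python 3), independent of web2py's own profiler option; slow_sum is a made-up function that exists only so that something shows up in the report.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive function so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100000)
profiler.disable()

# Render the ten most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```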
HTTP Packet Sizes & Uplinks: