Changes between Initial Version and Version 1 of BluePrintLazyRepresentation


Timestamp:
12/18/12 20:28:20 (12 years ago)
Author:
Dominic König
  • BluePrintLazyRepresentation

    v1 v1  
     1= !BluePrint: Lazy Representation =
     2[[TOC]]
     3
     4== Introduction ==
     5
     6For many output formats, data need to be represented as strings (or even HTML elements). In the Table definition, you can define a representation method for every Field:
     7{{{
     8    Field("xy", "reference my_table",
     9          ...
     10          represent = my_representation_function,
     11          ...
     12          ),
     13}}}
     14
     15These representation functions receive the field value as parameter, and can perform additional lookups to render this value as string (or HTML).
     16
     17The problem is that, where many records are to be rendered, these functions are called many times and can therefore become a major performance bottleneck - especially if they perform additional database lookups. For the export of 50 records from a table that contains 5 fields with additional representation lookups, you would need 1 query to retrieve the 50 records, and then 250 additional queries to render them in the output format.
     18
     19In an XML export, where you have 1000 records, this would mean 1 query to retrieve the records - and 5000 additional queries to render the field representations.
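The arithmetic behind these numbers can be written out as a short sketch. The function name is hypothetical, and it assumes that every field with a representation lookup costs exactly one query per record:

```python
def queries_without_bulk(records, lookup_fields):
    """Total DB queries when every representation is looked up individually."""
    # One query to retrieve the records, plus one representation lookup
    # per record and per field that needs one.
    return 1 + records * lookup_fields


print(queries_without_bulk(50, 5))    # 251
print(queries_without_bulk(1000, 5))  # 5001
```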
     20
     21== Description ==
     22
     23To overcome the bottleneck arising from representation lookup, output formatters need to be able to perform representation lookups in bulk.
     24
     25To achieve this, the output formatter would collect all values for a field from all records in the output, and then call a special bulk-representation function which performs optimized DB lookups to render all values in as few queries as possible (ideally, a single query). In other words, bulk representation functions reduce the number of DB queries in output formatting from 1 query per field and record to 1 query per field.
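The difference between the two lookup patterns can be illustrated with a minimal sketch. It uses the standard library `sqlite3` module instead of the framework's DAL, and the table contents, the function names and the `"-"` fallback are assumptions made for illustration only:

```python
import sqlite3

# In-memory stand-in for the referenced table
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO my_table (id, name) VALUES (?, ?)",
               [(1, "Alpha"), (2, "Beta"), (3, "Gamma")])


def represent_single(value):
    """One query per value: the pattern that becomes a bottleneck."""
    row = db.execute("SELECT name FROM my_table WHERE id = ?",
                     (value,)).fetchone()
    return row[0] if row else "-"


def represent_bulk(values):
    """One query for all values of a field, returns {value: representation}."""
    unique = sorted(set(values))
    if not unique:
        return {}
    placeholders = ", ".join("?" for _ in unique)
    query = "SELECT id, name FROM my_table WHERE id IN (%s)" % placeholders
    found = dict(db.execute(query, unique).fetchall())
    # Keys that could not be resolved fall back to a default label
    return {v: found.get(v, "-") for v in unique}


# The formatter collects the values of one field from all records first,
# then resolves them together
field_values = [1, 2, 2, 3, 1, 4]
labels = represent_bulk(field_values)
print([labels[v] for v in field_values])
```

Here the six field values are resolved with a single `SELECT ... IN (...)` query, where `represent_single()` would have needed six.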
     26
     27Additionally, the bulk representation method should be lazy, i.e. only perform DB lookups when absolutely necessary and strictly avoid repeated DB lookups for the same value (within the same request).
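One way to picture the lazy behaviour is a small per-request cache in front of the lookup. The class below is a hypothetical sketch, not the actual implementation; `fetch` stands in for whatever performs the DB query, and `"-"` is an assumed default for unresolvable keys:

```python
class LazyLookup(object):
    """Hypothetical helper that resolves each value at most once."""

    def __init__(self, fetch, default="-"):
        self.fetch = fetch      # callable: list of keys -> {key: label}
        self.default = default  # label for keys that cannot be resolved
        self.cache = {}         # labels resolved so far in this request
        self.queries = 0        # number of fetch() calls, for benchmarking

    def bulk(self, values):
        # Only keys that have not been seen before trigger a lookup
        missing = [v for v in set(values) if v not in self.cache]
        if missing:
            found = self.fetch(missing)
            self.queries += 1
            for v in missing:
                # Cache unresolvable keys too, so they are not retried
                self.cache[v] = found.get(v, self.default)
        return {v: self.cache[v] for v in values}

    def __call__(self, value):
        # The single-value representation uses the same mechanism
        return self.bulk([value])[value]


names = {1: "Alpha", 2: "Beta"}
lookup = LazyLookup(lambda keys: {k: names[k] for k in keys if k in names})
lookup.bulk([1, 2, 2])  # first lookup: one fetch for keys 1 and 2
lookup(1)               # served from the cache, no fetch
lookup(3)               # unknown key: one more fetch
lookup(3)               # cached default, no fetch
print(lookup.queries)   # 2
```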
     28
     29== Use-Cases ==
     30
     31The two most prominent use-cases are:
     32
     33  - data tables
     34  - XML exports
     35
     36Data tables typically render only a limited number of records (with server-side pagination). However, even with only 50 records per page, the field representation can turn into a major bottleneck.
     37
     38XML Exports are an even bigger problem as they are usually not paginated and thus can contain thousands of records (=tens of thousands of field representations).
     39
     40== Requirements ==
     41
     421) Bulk representation functions must be available (configurable) per Field.
     432) They should not be separate from single-value representations but use the same lazy lookup mechanism.
     443) Ideally, bulk representations do not introduce a new hook, but utilize the existing Field.represent hook.
     454) Ideally, we need only a few individual representation functions, since most representations follow the same pattern anyway.
     465) Standard representation of foreign keys would fall back to the name field in the referenced record.
     476) Bulk representation functions should add only minimal overhead during model loading.
     48
     49== Design ==
     50
     51The Field.represent hook can be set to a callable class instance:
     52
     53{{{
     54class MyRepresentation(object):
     55
     56    def __call__(self, value, row=None):
     57        # represent-code goes here
     58        ...
     59        return represent_str
     60
     61...
     62
     63    Field("xy", "reference my_table",
     64          ...
     65          represent = MyRepresentation(),
     66          ...
     67         )
     68}}}
     69
     70Besides the __call__() method, this class would define a bulk() method like:
     71
     72{{{
     73class MyRepresentation(object):
     74
     75    def __call__(self, value, row=None):
     76        # represent-code goes here
     77        ...
     78        return represent_str
     79
     80    def bulk(self, values, rows=None):
     81        # represent-code goes here
     82        ...
     83        return {values[0]:represent_str[0],
     84                values[1]:represent_str[1],
     85                ...
     86               }
     87}}}
     88
     89The bulk()-method would perform optimized DB lookups for the list of values it receives, and return a dict of {value:representation}.
     90
     91Output formatters (such as S3Resource.extract and S3Resource.export_tree) would then check whether the bulk()-method is available and use it instead of the single-value representation.
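The check an output formatter performs could look roughly like the following sketch. The helper `represent_values` and the two example representations are hypothetical; only the idea of testing for a `bulk()` method comes from the design above:

```python
def represent_values(represent, values):
    """Return {value: representation}, using bulk() when it is available."""
    bulk = getattr(represent, "bulk", None)
    if callable(bulk):
        # Bulk-capable representation: resolve all values in one call
        return bulk(values)
    # Plain callable: fall back to one call per distinct value
    return {v: represent(v) for v in set(values)}


class UpperRepresentation(object):
    """Example representation with both a single-value and a bulk method."""

    def __call__(self, value, row=None):
        return str(value).upper()

    def bulk(self, values, rows=None):
        return {v: str(v).upper() for v in values}


def plain_represent(value):
    """Example of a conventional represent function without bulk()."""
    return "#%s" % value


print(represent_values(UpperRepresentation(), ["a", "b"]))
print(represent_values(plain_represent, [1, 2, 2]))
```

Because the fallback still works with conventional functions, existing `represent` settings would not need to change for formatters that adopt this check.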
     92
     93== Implementation ==
     94
     95Defining an individual bulk representation class for every Field would be too much effort, so this calls for a base class that already covers the standard case:
     96
     97{{{
     98    Field("xy", "reference my_table",
     99          ...
     100          represent = S3Represent(lookup="my_table"),
     101          ...
     102         )
     103}}}
     104
     105S3Represent is defined in s3fields.py, and is therefore available in every model (controllers need to use s3base.S3Represent).
     106
     107The base class takes the following configuration parameters:
     108
     109||'''Parameter'''||'''Default'''||'''Description'''||'''Comments'''||
     110||lookup||None||Name of the referenced table||for foreign keys||
     111||key||"id"||Name of the primary key in the referenced table||for foreign keys||
     112||fields||["name"]||Fields to lookup from the referenced table||for foreign keys||
     113||labels||"%(name)s"||String template to render the representation||can also be a callable receiving the Row||
     114||options||None||a dict with field options||for option lists, overrides lookup||
     115||translate||False||translate each label using T()||for foreign keys||
     116||linkto||None||URL to link the label to, with [id] as placeholder for the foreign key||for foreign keys, renders each label as A()||
     117||multiple||False||indicate that this is a list: type||values are expected to always be lists||
     118||default||current.messages.UNKNOWN_OPT||the default for unresolvable keys||||
     119||none||current.messages.NONE||the default for None-values (or empty lists for list: types)||||
     120
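How the `labels`, `options`, `default` and `none` parameters might interact can be sketched in isolation. The helper functions below are hypothetical and use a plain dict in place of a Row; the two constants are stand-ins for `current.messages.UNKNOWN_OPT` and `current.messages.NONE`, whose actual values are not given here:

```python
UNKNOWN_OPT = "Unknown"  # stand-in for current.messages.UNKNOWN_OPT
NONE = "-"               # stand-in for current.messages.NONE


def render_label(labels, row):
    """Render one row with a string template or a callable."""
    if callable(labels):
        return labels(row)
    return labels % row


def represent_option(value, options, default=UNKNOWN_OPT, none=NONE):
    """Represent a value from an option list without any DB lookup."""
    if value is None:
        return none
    return options.get(value, default)


row = {"id": 4, "name": "Example Org"}
print(render_label("%(name)s", row))
print(render_label(lambda r: "%(name)s (#%(id)s)" % r, row))

status_opts = {1: "Open", 2: "Closed"}
print(represent_option(2, status_opts))
print(represent_option(9, status_opts))
print(represent_option(None, status_opts))
```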
     121S3Represent can be subclassed to meet specific requirements. Usually, the subclass would override some of these methods:
     122
     123{{{
     124    def lookup_rows(self, key, values, fields=[]):
     125        """
     126            Lookup all rows referenced by values (in foreign key representations).
     127
     128            @param key: the key Field
     129            @param values: the values
     130            @param fields: the fields to retrieve
     131        """
     132}}}
     133
     134This method should be overridden if additional fields and/or joins are required for the represent_row function.
     135
     136  - ''For testing/benchmarking, lookup_rows() should increment self.queries for each query performed.''
     137
     138{{{
     139    def represent_row(self, row):
     140        """
     141            Represent the referenced row (in foreign key representations).
     142
     143            @param row: the row
     144        """
     145}}}
     146
     147This function receives each row retrieved by lookup_rows() and should return the string representation. It should '''not''' perform any additional DB lookups. It should return a lazyT if self.translate is True.
     148
     149{{{
     150    def link(self, k, v):
     151        """
     152            Represent a (key, value) as hypertext link.
     153
     154                - Typically, k is a foreign key value, and v the representation of the
     155                  referenced record, and the link shall open a read view of the referenced
     156                  record.
     157
     158                - In the base class, the linkto-parameter expects a URL (as string) with "[id]"
     159                  as placeholder for the key.
     160
     161            @param k: the key
     162            @param v: the representation of the key
     163        """
     164}}}
     165
     166This function can be overridden to implement specific link construction mechanisms. It should '''not''' perform any additional DB lookups.
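The placeholder substitution described in the docstring can be sketched as follows. This is a simplified, hypothetical version that returns a plain HTML string rather than an A() helper object, and the URL in the example is made up:

```python
from html import escape


def link(linkto, k, v):
    """Render the representation v as a link to the record with key k."""
    # Replace the "[id]" placeholder with the key; no DB lookup is needed
    url = linkto.replace("[id]", str(k))
    return '<a href="%s">%s</a>' % (escape(url, quote=True), escape(str(v)))


print(link("/app/org/organisation/[id]", 4, "Example Org"))
# <a href="/app/org/organisation/4">Example Org</a>
```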
     167
     168----
     169BluePrint