Version 24 (modified by 12 years ago) ( diff ) | ,
---|
S3Report
Table of Contents
S3Report is a REST method handler for data analysis.
Functionality
S3Report can generate contingency tables (a.k.a. pivot tables) from S3Resources.
Dependencies
S3Report uses pyvttbl to generate contingency tables. A modified version of pyvttbl
is integrated into the S3 framework.
Pyvttbl requires the following Python modules:
- numpy
- matplotlib
Caveats
S3Report supports grouping and aggregation by virtual fields - however, using this without care can lead to severe scalability problems:
With large numbers of records involved it can take a long time to compute all virtual field values - and especially where this involves additional database lookups it may quickly exhaust any available bandwidth. Example calculation:
Records in the table | Number of virtual fields with DB lookups | Concurrent users | Total number of queries |
1,000 | 1 | 1 | 1,001 |
1,000 | 5 | 1 | 5,001 |
30,000 | 5 | 1 | 150,001 |
30,000 | 5 | 5 | 750,005 |
30,000 | 5 | 100 | 15,000,100 |
Unless you're running on a super-computer, such a scenario would take your site beyond any reasonable response times - and users may end up waiting forever to get their report (make sure your site has a reasonable timeout for requests).
Without virtual fields, you have a very different scenario:
Records in the table | Number of virtual fields with DB lookups | Concurrent users | Total number of queries |
1.000 | 0 | 1 | 1 |
30,000 | 0 | 100 | 100 |
Therefore, it is strongly recommended to minimize (better: avoid) the use of virtual fields which include additional DB lookup as rows, cols or fact parameter in reports wherever large numbers of records can be expected.
URL Methods
S3Report responds to the /report
URL method, for all resources (generic method).
The following parameters are accepted:
Parameter Explanation required? rows the name of the field to be used for the table rows yes cols the name of the field to be used for the table columns no (if no cols are specified, all instance values appear in 1 column) fact the name of the field to be used for the instance values no (falls back to "name" if present, otherwise to "id") aggregate the aggregation function no (default: list) table whether to hide the pivot table (table=0) or show it (table=1) no chart render as chart by default, format: chart=<type>:<dimension> no
rows, cols and fact support the same options as list_fields:
- fields in the table
- virtual fields in the table
- fields/virtual fields in tables linked by foreign keys ($-notation)
The chart
option accepts:
- type: piechart|barchart|breakdown
- dimension: rows|cols
S3Report supports a number of aggregation functions. The following functions have been tested so far:
Function Explanation list a comma-separated list of all instance values count the number of instance values sum the sum of all instance values avg the average (mean) of all instance values
Examples
- http://localhost:8000/eden/org/organisation/report?rows=country&cols=sector_id&fact=name
- http://localhost:8000/eden/pr/person/report?rows=age_group&cols=gender&fact=id&aggregate=count
$-notation for references:
reduced parameter list:
- http://localhost:8000/eden/org/office/report?rows=type&aggregate=count
- http://localhost:8000/eden/org/organisation/report?rows=sector_id
Report Options
Table configurations can contain a report_options
item, which is a Storage object. This object is used to configure reports and report options form.
The rows
, cols
, facts
and aggregate
items are lists of available values for the user to select from.
The defaults
property is a Storage object that contains the default values for the report. It can contain a value for rows
, cols
, fact
, aggregate
and totals
(as described in URL Methods).
The search
property is a list of S3Search widgets that will allow the report to be filtered. If no search
property is specified, no filter form will be available.
Here is an example of a report_options
item:
report_options=Storage( rows=report_fields, cols=report_fields, fact=report_fields, defaults=Storage( rows="project_id", cols="name", fact="time_actual", aggregate="sum", totals=True ), search=[ S3SearchOptionsWidget( name="project_id", label = T("Project"), field = "project_id", ), ], )
All items in report_options
and defaults
are optional.
If fact and methods are configured like:
report_options = Storage( fact = [field1, field2, field3], methods = [method1, method2], ... )
then S3Report will automatically provide a selection of all (mathematically) meaningful field-method-combinations. Some of these combinations may not be relevant for the use-case, though. To specify field-method-combinations instead, you can configure it like:
report_options = Storage( fact = [ (field1, method1), (field2, method1), (field2, method2), (field3, method1), ], ... )
Optionally, you can define a custom label for each combination (which will be used in the select list):
report_options = Storage( fact = [ (field1, method1, T("My Report A")), (field2, method1, T("My Report B")), (field2, method2, T("My Report C")), (field3, method1, T("My Report D")), ], ... )
Of course, you can also specify custom labels for the fields where required:
report_options = Storage( fact = [ ((T("My Label For Field1"), field1), method1, T("My Report A")), (field2, method1, T("My Report B")), (field2, method2, T("My Report C")), (field3, method1, T("My Report D")), ], ... )
Further Development
The following enhancements are currently under development:
- Support for JSON, CSV and XLS exports