
S3Anonymize

S3Anonymize is a tool to remove sensitive information from a record (and related records) based on configurable rules. It is primarily intended for person data, but can be re-used for any type of record.

Important Notes

  • S3Anonymize removes the specified record details permanently and irrevocably (it essentially overwrites them), even if Sahana is otherwise configured to archive data instead of deleting it
  • S3Anonymize requires both UPDATE and DELETE permission for the target master record, but no particular permission for any related records (except DELETE for deletion)
  • S3Anonymize will audit "anonymize" for the target master record (not for the entire cascade)

Configuring Rules

Rules are configured using s3db.configure for the target table. The rules format looks like this:

s3db.configure("pr_person",
               anonymize = {# A name and title for the rule set:
                            "name": "default",
                            "title": "Names, IDs, Reference Numbers, Contact Information, Addresses",

                            # Rules for cleaning up fields in the master record:
                            "fields": {"first_name": ("set", "-"),      # Set field to this value
                                       "last_name": ("set", "-"),
                                       "pe_label": anonymous_id,        # Callable returning a new field value
                                       "date_of_birth": obscure_dob,
                                       "comments": "remove",            # Set field value to None
                                       },

                            # Rules for related records:
                            "cascade": [("dvr_case", {"key": "person_id",               # Foreign key in the related table
                                                      "match": "id",                    # Match this key of the parent table

                                                      # Field rules for the related table
                                                      "fields": {"comments": "remove",
                                                                 },
                                                      }),

                                        ("pr_contact", {"key": "pe_id",
                                                        "match": "pe_id",
                                                        "fields": {"contact_description": "remove",
                                                                   "value": ("set", ""),
                                                                   "comments": "remove",
                                                                   },

                                                        "delete": True,                 # Delete the related records after cleanup (default False)
                                                        }),
                                        ],
                            },
              )
  • in cascading rules, the key+match properties can be replaced by a lookup property to configure a callable with the signature lookup(table, rows, tablename) that returns a set of relevant record IDs in the related table
  • cascading rules can be nested (selection rules refer to the table under which the cascade is listed, not to the outermost master table)
  • standard field rules are:
    • "remove" sets the field value to None
    • "reset" sets the field value to the field default
    • ("set", value) sets the field value to the specified value
  • field rules can also be callables with the signature rule(master_id, field, current_value) that return the new value for the field
  • field rules must produce valid records (i.e. the resulting value must pass database constraints and validators)
  • after applying field rules, S3Anonymize will execute update_super and onaccept like any other CRUD method
  • records in related tables will additionally be deleted if "delete": True is specified (which makes sense if the field rules remove all useful information from those records anyway)
  • if cascading records are to be deleted, this will additionally execute ondelete (as last step)
  • the master record itself is not automatically deletable (so that the user can verify the result before deleting it manually)
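
The rule set above references two callables, anonymous_id and obscure_dob, without showing them. A minimal sketch of what such field rules could look like, following the rule(master_id, field, current_value) signature described above; the function bodies, the "NN" label format and the choice to keep only the year of birth are assumptions for illustration, not the actual Sahana implementation:

```python
import datetime


def anonymous_id(master_id, field, current_value):
    """Replace the ID label with a placeholder derived from the record ID."""
    # Deriving the label from the (unique) record ID keeps labels distinct,
    # which matters if the field has a uniqueness constraint (assumption)
    return "NN%06d" % master_id


def obscure_dob(master_id, field, current_value):
    """Keep only the year of birth, resetting month and day to January 1."""
    if current_value is None:
        # Nothing to obscure
        return None
    return current_value.replace(month=1, day=1)


# Direct calls with the same three arguments a field rule receives
# (the field argument is not used by these two rules, so None is passed)
print(anonymous_id(42, None, "A-2018-0815"))
print(obscure_dob(42, None, datetime.date(1985, 7, 23)))
```

Both functions return values of the same type as the original field value, so the result can still pass the field's validators, as required above.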

Instead of a single set of rules, it is possible to configure multiple rule sets as a list:

s3db.configure("pr_person",
               anonymize = [{...first rule set...}, {...second rule set...}],
               )

...each with its own name and title. These rule sets will later be selectable in the GUI, so that the user can choose to only remove some, but not other data from the record (see screenshot below).
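
As an illustration of the list form, the following sketch spells out two rule sets using only the rule formats described above. The second rule set (its name "contacts", its title and the way the data is split between the two sets) is a hypothetical example, and the s3db.configure call is shown as a comment because it only works inside a Sahana model or controller:

```python
# Two independent rule sets; each has its own name and title
rule_sets = [
    # First rule set: identity of the person
    {"name": "default",
     "title": "Names",
     "fields": {"first_name": ("set", "-"),
                "last_name": ("set", "-"),
                },
     },
    # Second rule set (hypothetical): comments and contact details
    {"name": "contacts",
     "title": "Comments and Contact Information",
     "fields": {"comments": "remove",
                },
     "cascade": [("pr_contact", {"key": "pe_id",
                                 "match": "pe_id",
                                 "fields": {"value": ("set", ""),
                                            "comments": "remove",
                                            },
                                 "delete": True,
                                 }),
                 ],
     },
]

# Inside Sahana, the list would be passed like a single rule set:
# s3db.configure("pr_person", anonymize = rule_sets)

for rule_set in rule_sets:
    print(rule_set["name"], "-", rule_set["title"])
```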

GUI and REST Method

S3AnonymizeWidget

For embedding in the GUI, S3Anonymize comes with a special widget class S3AnonymizeWidget and a UI script (s3.ui.anonymize.js).

S3AnonymizeWidget produces an action button/link (with a hidden dialog) that can be embedded in the record view (e.g. in postp in place of the delete-button):

def postp(r, output):

    if r.record and not r.component and r.method in (None, "update", "read") and isinstance(output, dict):

        buttons = output.get("buttons") or {}

        from s3 import S3AnonymizeWidget
        buttons["delete_btn"] = S3AnonymizeWidget.widget(r, _class="action-btn anonymize-btn")

        output["buttons"] = buttons

    return output

The _class parameter can be used to control the appearance of the link. The widget function will automatically embed the UI dialog and script, and authorize the link.

Clicking on the link brings up a dialog in which the user can choose all or some of the configured rule sets, then confirm the action and submit the form.

Back-end Function

S3Anonymize implements the S3Method interface and can thus be configured as REST method for a resource using s3db.set_method.

Apart from that, S3Anonymize comes with a generic cascade() method that can be used to implement other anonymize/cleanup routines:

S3Anonymize.cascade(table, record_ids, rules)

...where:

  • table is the target Table
  • record_ids is a set or list of record IDs to anonymize
  • rules is a single dict of rules as described above

This function returns nothing, but raises an exception in case of an error.

Important: S3Anonymize.cascade() does not check any permissions itself, so permission checks must be implemented by the caller.
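
One way a caller could handle this is to check permissions for every target record before invoking the cascade. The following is a minimal sketch under stated assumptions: anonymize_if_permitted, NotPermitted and the has_permission(method, table, record_id) callable are hypothetical names, not part of Sahana. In a real deployment, has_permission would wrap the framework's permission API and cascade would be S3Anonymize.cascade; both are passed in as arguments here so that the sketch can run without Sahana:

```python
class NotPermitted(Exception):
    """Raised if the user lacks permission for a target record (hypothetical)."""


def anonymize_if_permitted(table, record_ids, rules, has_permission, cascade):
    """
    Run cascade(table, record_ids, rules) only if the user has both
    update and delete permission for every target record.
    """
    record_ids = set(record_ids)
    denied = sorted(record_id for record_id in record_ids
                    if not (has_permission("update", table, record_id) and
                            has_permission("delete", table, record_id)))
    if denied:
        # Refuse the whole batch rather than anonymizing it partially
        raise NotPermitted("No permission to anonymize records %s" % denied)
    cascade(table, record_ids, rules)


# Stand-ins for demonstration only
calls = []

def fake_cascade(table, record_ids, rules):
    calls.append((table, sorted(record_ids)))

def allow_below_10(method, table, record_id):
    return record_id < 10

anonymize_if_permitted("pr_person", [1, 2], {"name": "default"},
                       allow_below_10, fake_cascade)
```

Requiring both update and delete permission mirrors what S3Anonymize itself requires for the target master record; rejecting the whole batch on the first denied record is a design choice of this sketch.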

Last modified on 04/03/18 19:18:26
