|Version 4 (modified by 13 years ago) ( diff ),
BluePrint for automated database migration across schema changes
Caution to GSoC students considering this project
This is a difficult project, involving raw database operations on live user data, where the results must be carefully validated to avoid unrecoverable data loss. It will require knowledge of SQL and database administration, code generation or an equivalent means of specifying the process of converting a database, providing a UI when human assistance is needed to specify the conversion. It will involve dealing with source control tools, continuous integration, and automated testing. It is recommended only for advanced students.
Far too often, people make changes to the database schema or to the data with which the database is initially populated. This causes pain and suffering for anyone with a live Eden instance who needs to upgrade their Eden software.
Web2py handles some simple schema changes, but the ones that cause problems are precisely the ones it can't handle, such as renaming a field or table, or adding a required non-null field, or requiring the creation of new records from current ones (e.g. if a single foreign key reference is removed from a table and is placed in a new relationship table), or where the data themselves are modified.
We should provide a way of easing the process of migrating a database to a new revision.
- Track database schema and initial database content changes in Eden on each commit to trunk. (This could be done as part of the continuous integration tests that are launched on each commit: Run Eden from scratch, which creates the database and adds initial data, then compare this with the result of the run from the last commit.)
- Identify the differences between the old and new schemas, and old and new initial contents.
- Determine which changes will be handled by Web2py without assistance.
- For more significant changes, attempt to figure out how to convert from old to new schema. For instance:
- Attempt to match renamed tables by their fields, and renamed fields by their characteristics. Attempt to match fields moved out to new tables by names and characteristics.
- Determine if type changes might be automated, or partly automated (e.g. if a field's type changes from text to datetime).
- Provide a gui that shows the schema changes and suggested fixups. Allow the user to:
- Confirm or alter matching of renamed elements.
- Say how new fields should be populated from old ones.
- Provide a migration script.
- Record the operations needed to convert a populated database to match the new schema and new initial contents across each revision.
- Provide a way for a sysadmin who has updated their copy of Eden to run the set of conversions from their previous revision to the new revision.