|Version 3 (modified by 11 years ago) ( diff ),|
BluePrint for automated database migration across schema changes
Caution to GSoC students considering this project
This is a difficult project, involving raw database operations on live user data, where the results must be carefully validated to avoid unrecoverable data loss. It will require knowledge of SQL and database administration, code generation or an equivalent means of specifying the process of converting a database, providing a UI when human assistance is needed to specify the conversion. It will involve dealing with source control tools, continuous integration, and automated testing. It is recommended only for advanced students.
Far too often, people make changes to the database schema or to the data with which the database is initially populated. This causes pain and suffering for anyone with a live Eden instance who needs to upgrade their Eden software.
Web2py handles some simple schema changes, but the ones that cause problems are precisely the ones it can't handle, such as renaming a field or table, or adding a required non-null field, or requiring the creation of new records from current ones (e.g. if a single foreign key reference is removed from a table and is placed in a new relationship table), or where the data themselves are modified.
We should provide a way of easing the process of migrating a database to a new revision.
Detect database schema and initial database content changes in Eden on each commit to trunk. This could be done as part of the continuous integration tests that are launched on each commit: Run Eden from scratch, which creates the database and adds initial data, then compare this with the result of the run from the last commit. Identify what changed and present a gui to the culprit so they can match up what they renamed (the most "popular" way to break backward compatibility), or say how new fields should be populated from old ones, or (failing a simple fixup) provide a migration script. Record the operations needed to convert a populated database to match the new schema and new initial contents across each revision. Provide a way for a sysadmin who has updated their copy of Eden to run the set of conversions from their previous revision to the new revision.