[wiki:Haiti]
----
== Data Migration & Consistency ==

We want to be able to migrate data from Prod to Dev so that Testers / Documenters have real data to play with.

Easy as a one-off:
{{{
/etc/init.d/apache2 stop
rm -rf /home/haiti/dev/databases
cp -ar /home/haiti/prod/databases /home/haiti/dev
/etc/init.d/apache2 start
}}}

Maintaining this should be possible via [wiki:DataMigration#WebServices Web Services].

We also want to be able to migrate from the current SQLite backend to [wiki:InstallationGuidelinesMySQL MySQL].

We often need to use these same tricks to deal with database consistency issues.

=== CSV ===

Beware DB locks!
 * Don't keep the Web2Py shell open longer than necessary.
 * Consider going into Maintenance Mode on the webserver & disabling the Web2Py cron in {{{/etc/crontab}}}.

Export from Prod:
{{{
cd /home/haiti/web2py
python web2py.py -S prod -M -N
db.export_to_csv_file(open('db.csv','wb'))
Ctrl+D
}}}

Remove reference tables from the CSV file (we don't want to import duplicates!) - a sketch for doing this programmatically appears at the end of this section.

NB Do CSV file edits in a text editor, not MS Excel (MS Excel makes the dumps unusable!).

NB Be careful of file permissions!
 * All files in the databases folder should be {{{chown www-data:www-data}}}.
 * If the 1st run of the model happens in the Python shell, then they will be owned by root...

Import into Dev:
{{{
cd /home/haiti/web2py
python web2py.py -S dev -M -N
db.import_from_csv_file(open('db.csv','rb'))
db.commit()
Ctrl+D
}}}
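Stripping reference tables out of a large dump by hand is error-prone. Below is a minimal sketch of doing it programmatically; it assumes the layout written by web2py's {{{db.export_to_csv_file()}}}, where each table's section starts with a {{{TABLE <tablename>}}} line, and the table names in {{{SKIP}}} are hypothetical examples - substitute whichever reference tables the models define:
{{{
# filter_dump.py -- copy a web2py CSV dump, skipping named tables.
# A sketch: assumes each table section starts with a 'TABLE <name>' line.
SKIP = set(['gis_feature_class', 'auth_group'])  # hypothetical examples

def filter_dump(src, dst, skip=SKIP):
    out = open(dst, 'wb')
    copying = True
    for line in open(src, 'rb'):
        if line.startswith('TABLE '):
            # New table section: copy it only if it isn't in the skip list
            copying = line[6:].strip() not in skip
        if copying:
            out.write(line)
    out.close()

filter_dump('db.csv', 'db_filtered.csv')
}}}
The filtered file can then be imported into Dev as above.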
==== Changing a Column type in SQLite live ====

SQLite handles live migrations pretty well; however, if there is existing data then a conversion from string to integer can cause problems. For a development system the easiest solution is simply to remove all {{{databases/*}}} & restart, but this isn't possible for a live system:
{{{
vim /home/haiti/prod/models/00_db.py
migrate = True
cd /home/haiti/prod
bzr pull
cd /home/haiti/web2py
python web2py.py -S prod -M
db(db.or_organisation.id > 0).select().export_to_csv_file(open('orgs.csv','wb'))
db.or_organisation.drop()
db.commit()
Ctrl+D
python web2py.py -S prod -M
db.or_organisation.import_from_csv_file(open('orgs.csv','rb'))
db.commit()
Ctrl+D
/etc/init.d/apache2 force-reload
vim /home/haiti/prod/models/00_db.py
migrate = False
}}}

If the system is being heavily used, we need to put up a holding page during this time to prevent data entry.

To switch maintenance on:
{{{
cd /home/haiti
./maintenance.on
}}}

To switch maintenance off:
{{{
cd /home/haiti
./maintenance.off
}}}

==== Changing a set of Lookup Options live ====

We can easily amend the {{{module_resource_opts}}} dictionary live in the code (usually {{{models/module.py}}}), however we need to migrate the existing data (after backing it up, of course):
{{{
cd /home/haiti/prod
bzr pull
cd /home/haiti/web2py
python web2py.py -S prod -M
db(db.or_organisation.id > 0).select().export_to_csv_file(open('orgs.csv','wb'))
db(db.or_organisation.type == 4).update(type='')
db(db.or_organisation.type == 5).update(type='')
db.commit()
Ctrl+D
/etc/init.d/apache2 force-reload
}}}

If the system is being heavily used, we need to put up a holding page during this time to prevent data entry, using the same {{{maintenance.on}}} / {{{maintenance.off}}} scripts as above.

==== Dealing with database corruption ====

1. We had problems with the or_organisation table - queries wouldn't complete. The tracebacks included:
{{{
ValueError: invalid literal for int() with base 10: '01/14/10 1'
}}}
By commenting out each field in turn in the model, we discovered that it was the timestamp column which contained this data. (Note the file was also {{{chown root:root}}}!) To fix this we needed to:
 * Delete the column from the model:
{{{
vim /home/haiti/prod/models/05_or.py
resource = 'organisation'
table = module + '_' + resource
db.define_table(table, timestamp, deletion_status, # uuidstamp
}}}
 * Export as CSV:
{{{
cd /home/haiti/web2py
python web2py.py -S prod -M -N
db.or_organisation.export_to_csv_file(open('prod_orgs.csv','wb'))
Ctrl+D
}}}
 * Drop the table:
{{{
cd /home/haiti/web2py
python web2py.py -S prod -M -N
db.or_organisation.drop()
db.commit()
Ctrl+D
}}}
 * Rebuild the table, ensuring that it is owned by www-data, by letting the model run under Apache:
   * http://haiti.sahanafoundation.org
 * Reimport from CSV:
{{{
cd /home/haiti/web2py
python web2py.py -S prod -M -N
db.or_organisation.import_from_csv_file(open('prod_orgs.csv','rb'))
db.commit()
Ctrl+D
}}}

2. The rms_sms_request table couldn't be recreated. Since this part of the system is under active development, it was necessary to drop the table as part of the [wiki:ReleaseManagement release process].
 * Normally we can re-populate the tables from their source feeds simply using:
{{{
cd /home/haiti/web2py
python web2py.py -S prod -M -N
feature_class_id = db(db.gis_feature_class.name == 'SMS').select().first().id
db(db.gis_location.feature_class_id == feature_class_id).delete()
db.rms_tweet_request.truncate()
db.rms_sms_request.truncate()
db.commit()
Ctrl+D
}}}
 * However this didn't complete fully, as there was still an associated view file in {{{databases/}}} & hence when the model tried to create the table:
{{{
SyntaxError: invalid table name: rms_sms_request
}}}
To resolve this:
 * Take the site down (I did this the ugly way!):
{{{
vim /etc/crontab
/etc/init.d/apache2 stop
}}}
 * Back up all data:
{{{
cd /home/haiti/prod
mkdir databases2
cp -ar databases/* databases2
rm -rf databases
}}}
 * Copy the data down locally to be able to analyse it easily, and export as CSV:
{{{
tar cvf prod_db.tar databases2
gzip -9 prod_db.tar
sz prod_db.tar.gz
}}}
 * Allow the model to recreate the databases (owned as www-data):
   * http://haiti.sahanafoundation.org
 * Restore the data:
   * Edit the CSV file using a text editor (not MS Excel!)
   * Remove reference tables & initial records which are created by {{{models/zzz_1st_run.py}}} (NB This didn't seem to work properly on this occasion!)
   * Import CSV segments to see which parts work:
{{{
cd /home/haiti/web2py
python web2py.py -S prod -M -N
db.import_from_csv_file(open('prod_part1.csv','rb'))
db.commit()
Ctrl+D
}}}
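When corruption is suspected, it can save time to ask SQLite itself about the state of the file before starting a full export/drop/reimport cycle. A minimal sketch, assuming the application uses web2py's default database file at {{{databases/storage.sqlite}}} (adjust the path to match the deployment):
{{{
# Quick corruption check on the live database file.
import sqlite3

db = sqlite3.connect('/home/haiti/prod/databases/storage.sqlite')
# 'PRAGMA integrity_check' returns the single row ('ok',) for a healthy
# file, otherwise one row per problem found
for row in db.execute('PRAGMA integrity_check;'):
    print row[0]
db.close()
}}}
If this reports problems in a specific table, that table is the one to export, drop & rebuild as above.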
=== Web Services ===

Better to avoid locks & do without SSH login:
 * Currently this needs to be done resource by resource (JOINed tables can be done together as Components of the main resource).
 * Referenced tables need to be done before the tables which refer to them: migrate the locations first (using the XML interface), then you can migrate the orgs and offices together (through the Component JOIN):
   * http://localhost:8000/sahana/gis/location/create?format=xml&fetchurl=http://haiti.sahanafoundation.org/prod/gis/location.xml
   * http://localhost:8000/sahana/pr/person/create?format=xml&fetchurl=http://haiti.sahanafoundation.org/prod/pr/person.xml
   * http://localhost:8000/sahana/or/organisation/create?format=xml&fetchurl=http://haiti.sahanafoundation.org/prod/or/organisation.xml
 * This should pull in all components, so there is no need to do:
   * http://localhost:8000/sahana/or/office/create?format=xml&fetchurl=http://haiti.sahanafoundation.org/prod/or/office.xml
   * http://localhost:8000/sahana/or/contact/create?format=xml&fetchurl=http://haiti.sahanafoundation.org/prod/or/contact.xml
 * See BluePrintSynchronisation.
 * Ideally we want to develop a way of having all tables loaded together into RAM, refs fixed up & then imported into the db together.

=== Notes ===

1) SQLite seems to treat emails with different capitalizations as different entities; MySQL does not, which can cause a problem during import if the data is not clean.[[BR]]
2) Use of the INNODB engine for the db can lead to Integrity Errors during the first run and import -> we recommend MyISAM for now.
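Regarding note 1: before importing into MySQL it can help to normalize the email addresses so that case-variant duplicates surface early. A minimal sketch to paste into the Web2Py shell ({{{python web2py.py -S prod -M}}}) before exporting, assuming the default web2py {{{auth_user}}} table - adjust the table/field names to match the actual models:
{{{
# A sketch: lowercase emails & flag case-variant duplicates before a
# MySQL import. Assumes web2py's default auth_user table.
seen = set()
for user in db().select(db.auth_user.ALL):
    email = (user.email or '').strip().lower()
    if email and email in seen:
        # two accounts collide once case is ignored - resolve by hand
        print 'duplicate after lowercasing:', email
    else:
        seen.add(email)
        user.update_record(email=email)
db.commit()
}}}

----
[wiki:Haiti]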