Changes between Version 38 and Version 39 of BluePrint/Synchronisation


Timestamp:
09/08/09 06:40:35 (16 years ago)
Author:
Fran Boon
Comment:

Clean-up

  • BluePrint/Synchronisation

    v38 v39  
    55= Blueprint for Synchronization =
    66
    7 We need to implement a system performing automatic Sahana synchronization. This Synchronization will be between any Sahana servers (PHP and Py). Our focus should be on !SahanaPy but it should be compatible with SahanaPHP.
    8 Currently !SahanaPy data exporting module exports data in CSV (web based: not for autonomous process). We can add support for XML and JSON.
    9 XML exporting will ensure compatibility with PHP version Sahana. JSON is modern, futuristic, light over HTTP and reliable, so using JSON for data synchronization looks promising. These should strictly adhere to XSD standards set approved by W3C.
     7We need to implement a system performing automatic synchronization between Sahana instances.
    108
    11 Another important point is to use UUID information of each of sync activity. We must include UUID while exporting data. Current UUID of export module of PHP version includes 'Instance ID' to make it clear which install instance it belongs to. Similar approach should also be adopted in !SahanaPy.
     9For synchronisation with other systems, we should be able to talk in open standards, such as [wiki:BluePrintEDXL EDXL]
     10
     11Currently the !SahanaPy data export module exports data as CSV (web-based, so not suitable for an autonomous process). We can add support for XML and JSON. JSON is a modern, lightweight alternative to XML which is appropriate to our low-bandwidth operating environment. XML export should be done using XSLT stylesheets so that it's easy to export in different formats.
     12
     13Each syncable record has a UUID field to uniquely identify it across instances.
    1214
    1315Automatic synchronization is different from the manual data export / import module present in Sahana. The automatic process should run continuously as a daemon.
    1416
    1517Currently we use a database dump for exporting, which is definitely not an optimal way to synchronize databases. A paper written by Leslie Klieb ( http://hasanatkazmi.googlepages.com/DistributedDisconnectedDatabases.pdf ) discusses various approaches. In the light of this research, we can implement synchronization as follows:
    16  * we need to put time stamp as additional attribute in each table of database (tables which has data like names of missing people etc, we do not need to sync internally required tables which an instance of Sahana installation uses for saving internal information). This time stamp and UUID of Sahana Instance together can represent a unique attribute. This time stamp attribute MUST be added to SahanaPHP for  making intelligent database synchronization.
     18 * we need to add a time stamp as an additional attribute in each table of the database (only tables which hold data such as names of missing people; we do not need to sync the tables which a Sahana installation uses for saving its own internal information).
    1719
    18 There is a desire already for data deleted from Sahana to stay available but with a deleted flag. This would then not be visible during normal DB queries, but is accessible for audit purposes if required. We can make this a reusable field in {{{models/00_db.py}}} & then add it to each table definition (well, all real, syncable data - no need for internal settings). For this to be accomplished in SahanaPHP, we MUST put another attribute: delete flag (alongside time stamp). Delete flag will be Boolean represented if tuple has been deleted or not.
     20Data deleted from Sahana should stay available but with a deleted flag. Such data would then not be visible during normal DB queries, but would remain accessible for audit purposes if required. We can make this a reusable field in {{{models/00_db.py}}} and then add it to each table definition (that is, all real, syncable data; no need for internal settings). The delete flag will be a Boolean indicating whether the tuple has been deleted.
    1921
    20 When new tuple is added: new date is entered, when tuple is updated: date is modified to present one. if tuple is deleted, we set delete flag as true for that tuple (and do not delete it for real)
     22When a new tuple is added, the current date is entered; when a tuple is updated, the date is changed to the present one. If a tuple is deleted, we set its delete flag to true (and do not delete it for real).
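The time stamp and delete flag rules above can be sketched as follows. This is only a minimal illustration using plain dictionaries rather than the web2py DAL; the field names `uuid`, `modified_on` and `deleted`, and all function names, are assumptions made for this example and are not taken from the Sahana code.

```python
import uuid
from datetime import datetime, timezone


def utc_now():
    """Current time in UTC; injectable so callers can supply a fixed clock."""
    return datetime.now(timezone.utc)


def create_record(table, data, clock=utc_now):
    """Add a new tuple: assign a UUID, stamp it, and mark it as not deleted."""
    record = dict(data)
    record["uuid"] = str(uuid.uuid4())
    record["modified_on"] = clock()
    record["deleted"] = False
    table[record["uuid"]] = record
    return record


def update_record(table, record_uuid, changes, clock=utc_now):
    """Update a tuple and move its time stamp to the present."""
    record = table[record_uuid]
    record.update(changes)
    record["modified_on"] = clock()
    return record


def delete_record(table, record_uuid, clock=utc_now):
    """Soft delete: set the flag and keep the tuple for audit and sync."""
    record = table[record_uuid]
    record["deleted"] = True
    record["modified_on"] = clock()
    return record


def visible_records(table):
    """What a normal query would see: everything not flagged as deleted."""
    return [r for r in table.values() if not r["deleted"]]
```

Stamping the deletion time as well means a deletion travels to other instances in the same way as any other change.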
    2123 
    2224Now take two instances of Sahana, A and B. A calls B over JSON-RPC (or XML-RPC), passing its own (A's) UUID. B looks in the synchronization table (in B's database) for the last time data was sent from B to A, then creates JSON/XML containing only those entries/tuples modified after that date and returns them to A. It also sends the tuples deleted after that date.
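B's side of this exchange could look roughly like the sketch below. It is not an actual Sahana API: `sync_log` stands in for the synchronization table, and the record fields (`uuid`, `modified_on`, `deleted`) and function names are assumptions made for illustration. The RPC transport itself is left out.

```python
import json
from datetime import datetime, timezone


def changes_since_last_sync(table, sync_log, peer_uuid, now):
    """Return the tuples changed since data was last sent to this peer.

    sync_log maps a peer's UUID to the time of the last transfer.
    Soft-deleted tuples are included, so deletions are sent as well.
    """
    last_sent = sync_log.get(peer_uuid)  # None on first contact: send all
    changed = [
        record for record in table.values()
        if last_sent is None or record["modified_on"] > last_sent
    ]
    # Assumption: the transfer succeeds. A real service would only record
    # the new time once the peer has confirmed receipt.
    sync_log[peer_uuid] = now
    return changed


def to_json(records):
    """Serialize records, converting time stamps to ISO 8601 strings."""
    return json.dumps(
        [dict(r, modified_on=r["modified_on"].isoformat()) for r in records]
    )
```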
     
    2426Now each machine either updates existing tuples or inserts new tuples in the relevant tables. It also deletes every tuple which the other machine has deleted, if and only if it has not updated that tuple in its own database after the deletion on the other machine.
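The receiving side of the merge can be sketched as below. The deletion rule follows the paragraph above. The rule for conflicting updates (the newer time stamp wins) is an assumption of this sketch, since the text does not say how such conflicts are resolved. Field and function names are again hypothetical.

```python
def apply_remote_changes(table, incoming):
    """Merge tuples received from another instance into the local table."""
    for remote in incoming:
        key = remote["uuid"]
        local = table.get(key)
        if local is None:
            # Unknown tuple: insert it, keeping its delete flag as sent.
            table[key] = dict(remote)
        elif remote["deleted"]:
            # Accept the deletion only if we have not updated the tuple
            # locally after it was deleted on the other machine.
            if local["modified_on"] <= remote["modified_on"]:
                table[key] = dict(remote)
        elif remote["modified_on"] > local["modified_on"]:
            # Assumption: for ordinary updates the newer time stamp wins.
            table[key] = dict(remote)
    return table
```

Comparing time stamps across machines only works if their clocks are reasonably in step, so a real implementation would need to account for clock differences.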
    2527
    26 An important outcome of this implementation can also be used in manual data exporting modules of Sahana (both versions). We can let the user select the age of data which they want to export (i.e. export data form a starting date to b date). Moreover, we can easily set these web services to call its own exposed web service rather them directly communicating with database layer.
     28An important outcome of this implementation is that it can also be used in the manual data export modules of Sahana. We can let users select the age of the data which they want to export (i.e. export data from a start date to an end date). Moreover, we can easily have these modules call the exposed web service rather than communicating directly with the database layer.
    2729
    28 Now As it is quite literal after reading last paragraph that this cannot be accomplished over standard web site based architecture so we need to make daemon (or service ) which will continuously run in the background basically doing two tasks:
     30As the last paragraph makes clear, this cannot be accomplished with a standard web-site-based architecture, so we need to make a daemon (or service) which will run continuously in the background, doing two tasks:
    2931 * 1) It must find (processing in a loop) other Sahana servers on the network that have data
    3032 * 2) It must expose a service to the network, telling servers as they enter the network that it has new data
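The first of these two tasks might be structured like the loop below. This is a sketch only: `discover_peers` and `sync_with` are hypothetical callables supplied by the caller (discovery could, for instance, be backed by ZeroConf), and the announcing side of the service is not shown.

```python
import threading


def run_sync_loop(discover_peers, sync_with, stop_event, interval=60.0):
    """Repeatedly look for other servers and synchronize with each one.

    discover_peers() returns an iterable of peer identifiers.
    sync_with(peer) performs one synchronization round with that peer.
    Both are placeholders for whatever discovery and transport are used.
    """
    while not stop_event.is_set():
        for peer in discover_peers():
            try:
                sync_with(peer)
            except Exception as exc:
                # One unreachable peer should not stop the daemon.
                print("sync with %s failed: %s" % (peer, exc))
        # Sleep until the next round, waking early if asked to stop.
        stop_event.wait(interval)
```

Running the loop in a `threading.Thread` with a shared `threading.Event` lets the host process stop it cleanly on shutdown.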
    3133
    32 This process needs to be autonomous and servers must be able to find each other without specifying IP. This can be accomplished by using ZeroConfig.
     34This process needs to be autonomous and servers must be able to find each other without specifying IP. This can be accomplished by using [wiki:BluePrintZeroConf ZeroConf].
    3335So we need to step outside the domain of web2py for this task. We can hook our software into the web2py startup sequence so that this service starts automatically as the server goes online.
    34 
    35 For this to work with PHP version, we MUST make port this software with PHP version and we MUST must expose web services in PHP version for doing sync. We must find someone in PHP developers who can do it.
    3636
    3737We can always ship this with !PortablePython, eliminating the need to install Python on end machines (as XAMPP does for PHP and MySQL).