Installation on PostgreSQL
Table of Contents
Replacing the simple SQLite database with PostgreSQL allows greater scalability & the possibility of using PostGIS for spatial storage.
Main install docs are here: InstallationGuidelines/Linux/Server/CherokeePostgreSQL
This page provides some supplementary notes
Installation
It is assumed that you are running Debian 'Squeeze'. 'Lenny' works fine for Eden, but the main point is to use PostGIS where Squeeze or Wheezy are better...
(For Wheezy, replace '8.4' with '9.1')
apt-get install -y postgresql-8.4 python-psycopg2
If you are using postgres-9.3, use psycopg2 as the driver instead of the web2py default pg8000.
Enable automatic DB maintenance:
vim /etc/postgresql/8.4/main/postgresql.conf # Enable remote access listen_addresses = '*' port = 5432 track_counts = on autovacuum = on vim /etc/postgresql/8.4/main/pg_hba.conf host all all my.ip.add.ress/24 md5 /etc/init.d/postgresql restart vim /etc/iptables.rules -A INPUT -p tcp --dport 5432 -j ACCEPT reboot
NB Don't do remote access as 'postgres' - to do so means can't simply access psql via 'su postgres'
To log-in as user postgres (Debian/Ubuntu):
sudo su postgres
Tuning
Adjust postgresql.conf for performance relative to resources available
For small/medium sized servers (the majority of Sahana Eden installs) then we provide 2 scripts to configure appropriately depending on the amount of RAM available:
- https://github.com/sahana/eden_deploy/blob/master/manual/install-eden-cherokee-postgis.sh#L745
pg512 pg1024
For larger servers, then more manual tuning is useful to be able to fully utilise the hardware.
Some relevant URLs:
- http://pgtune.leopard.in.ua (But the RAM we have available is normally shared by UWSGI, so we should be a little more conservative)
- http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
- https://www.enterprisedb.com/docs/en/9.5/asguide/EDB_Postgres_Advanced_Server_Guide.1.12.html
- http://linuxfinances.info/info/quickstart.html
- https://blog.2ndquadrant.com/basics-of-tuning-checkpoints/
- https://www.depesz.com/2010/11/03/checkpoint_completion_target/
Manual adjustment of available settings is discussed here:
vim /etc/sysctl.conf # Increase Shared Memory available for PostgreSQL # 512Mb kernel.shmmax = 279134208 # 1024Mb (may need more) #kernel.shmmax = 552992768 kernel.shmall = 2097152 sysctl -w kernel.shmmax=279134208 sysctl -w kernel.shmall=2097152 vim /etc/postgresql/9.6/main/postgresql.conf # max_connections = 100 # This default is high for typical Sahana deployments with 4 UWSGI workers for each of production, test and demo (+ mules + superuser reserved connections) # - NB Typically we want workers limited to nproc + 1 (Number of CPUs: nproc or lscpu) # If running out of connections for clients, adjust upward _or_ add a connection pooling layer like pgpool max_connections = 20 # shared_buffers = 128MB # Ideal is if all frequently accessed indexes and table rows can fit in here - this is often unrealistic # setting too high relative to system resources can be detrimental to the system. # At the high end this should be 10-25% of available memory (ie that which is not being used by the OS or other processes). # Values suggested are quite conservative, so you can push them up a bit if there are resources available for it # but absolutely no more than 1/3 available memory. # 512Mb Mb system suggest starting at default: shared_buffers = 128MB # 1024 Mb system suggest starting at: shared_buffers = 256MB # 8096Mb system suggest starting at: #shared_buffers = 1024MB #work_mem = 4MB # This is a per sort setting, so setting this too high with a high number of max_connections can be bad. # If the database load turns out to have a lot of sorting and not a lot of separate connections, you should increase this # and drop the max_connections. # 512Mb system suggest starting at the default: work_mem = 4MB # 1024Mb system suggest starting at: #work_mem = 8MB # 8096Mb system suggest starting at: #work_mem = 32MB # maintenance_work_mem = 64MB # As you get more memory in the system, moving this up can be quite useful if you notice vacuum processes taking forever. # 512Mb/1024 Mb system suggest starting at the default # 8096Mb system suggest starting at: #maintenance_work_mem = 128MB min_wal_size = 1GB max_wal_size = 2GB # Improve write performance #checkpoint_completion_target = 0.5 checkpoint_completion_target = 0.7 # effective_cache_size = 4GB # 75% of available memory # 512Mb system suggest starting at: effective_cache_size = 256MB # 1024Mb system suggest starting at: #effective_cache_size = 512MB # 8096Mb system suggest starting at default: #effective_cache_size = 4GB /etc/init.d/postgresql restart
To check the impact of your settings:
- https://github.com/jfcoz/postgresqltuner
- https://www.postgresql.org/docs/9.3/static/monitoring-stats.html
- https://blog.codeship.com/tuning-postgresql-with-pgbench/
- pgtop:
apt-get install -y pgtop su postgres pg_top
Install Support Tools
(Optional) Install the Admin Pack for enhanced intrumentation:
apt-get install postgresql-contrib-8.4 su postgres psql -d postgres -f /usr/share/postgresql/8.4/contrib/adminpack.sql
(Optional) Install PHPPgAdmin if you want a web-based management tool:
apt-get install phppgadmin
Configure PostgreSQL for Sahana Eden
su postgres createuser -s -P sahana createdb -O sahana -E UTF8 sahana -T template0
If using PostGIS (recommended) then:
apt-get install -y postgresql-8.4-postgis su postgres createlang plpgsql -d sahana psql -d sahana -f /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql psql -d sahana -f /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql
NOTE: you must install PostGIS first
Ubuntu users (Lucid/PostgreSQL 8.4/Post-GIS 1.4) - the paths are slightly different
psql -d sahana -f /usr/share/postgresql/8.4/contrib/postgis.sql psql -d sahana -f /usr/share/postgresql/8.4/contrib/spatial_ref_sys.sql
Install Python support for PostgreSQL
If not done as-per above instructions:
apt-get install python-psycopg2
Configure Sahana Eden to use PostgreSQL
vim models/000_config.py settings.database.db_type = "postgres" settings.database.port = "" # use default settings.database.database = "sahana" settings.database.username = "sahana" settings.database.password = "mypassword" settings.database.pool_size = 30
'Connection pools' are used to reuse connections. How many pools should we use?
whatever you choose it always starts with one and grows with the number of concurrent requests up to the value of pools (the max number of concurrent pools). So pools should be the max number of concurrent requests you expect. That is the max value of: <number of requests/second> / <average time per request in seconds>
Add Geometry column to gis_location
Tell Sahana that the DB is spatially-enabled so that routines can make use of this:
vim models/000_config.py deployment_settings.gis.spatialdb = True
The rest of this section is DEPRECATED (now done by Spatial DAL)
If using PostGIS, then once the tables have been created:
cat << EOF > "/tmp/geometry.sql" UPDATE public.gis_location SET wkt = 'POINT (' || lon || ' ' || lat || ')' WHERE gis_feature_type = 1; SELECT AddGeometryColumn( 'public', 'gis_location', 'the_geom', 4326, 'GEOMETRY', 2 ); GRANT ALL ON geometry_columns TO sahana; UPDATE public.gis_location SET the_geom = ST_SetSRID(ST_GeomFromText(wkt), 4326); EOF su -c - postgres "psql -q -d sahana -f /tmp/geometry.sql"
(if using Windows can use the Query Editor that comes with pgAdmin III)
Ensure that future entries auto-populate properly:
cat << EOF > "/tmp/autopopulate.sql" CREATE OR REPLACE FUNCTION s3_update_geometry() RETURNS "trigger" AS \$$ DECLARE BEGIN if (NEW.wkt != '') then NEW.the_geom = SetSRID(GeomFromText(NEW.wkt), 4326); end if; RETURN NEW; END; \$$ LANGUAGE 'plpgsql' VOLATILE; ALTER FUNCTION s3_update_geometry() OWNER TO sahana; CREATE TRIGGER s3_locations_update BEFORE INSERT ON gis_location FOR EACH ROW EXECUTE PROCEDURE s3_update_geometry(); EOF su -c - postgres "psql -q -d sahana -f /tmp/autopopulate.sql"
Backups
su postgres pg_dump -c sahana > /tmp/sahana.sql pg_dump -Fc gis > /tmp/gis.dmp
Maintenance
Connect to Database (or use pgAdmin III GUI's PSQL Console on plugins menu):
su postgres psql
or
psql -d my_spatial_db -U username
List Databases:
\l
Connect to a Database:
\c sahana
List Tables:
\d
Describe Table structure:
\d tablename select field/* from tablename where field = 'value';
If you need to build a fresh database:
rm -f /home/web2py/applications/eden/databases/* su postgres pkill -f 'postgres: sahana sahana' dropdb sahana createdb -O sahana -E UTF8 sahana
To restore data from backup, clear the databases folder, settings.base.prepopulate = 0 & settings.base.migrate=True, run a migration & then restore the data:
su postgres psql sahana < /tmp/sahana.sql createdb gis psql gis -f /usr/share/postgresql-8.4/contrib/postgis.sql /usr/share/postgresql-8.4-postgis/utils/new_postgis_restore.pl -f /tmp/gis.dmp | psql gis
If wanting to be specific about which tables to restore, you can use:
pg_restore -t table | psql
If you want to exclude a table, there is no equivalent argument, so you can sue the following procedure:
pg_restore -l $pgdump_file | grep -v "TABLE DATA public table_to_exclude" > restore.pgdump.list
Remote Access
Say that you want to provide access for an external reporting tool.
Add the user account:
su postgres psql CREATE USER reporter WITH PASSWORD 'mypassword'; \q exit su -c - postgres "psql -q -d sahana -c 'GRANT CONNECT ON DATABASE sahana TO reporter;'" su -c - postgres "psql -q -d sahana -c 'GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporter;'" su -c - postgres "psql -q -d sahana -c 'ALTER DEFAULT PRIVILEGES FOR USER reporter IN SCHEMA public GRANT SELECT ON TABLES TO reporter;'"
Allow remote access:
vim /etc/postgresql/9.4/main/postgresql.conf listen_addresses = '*' vim /etc/postgresql/9.4/main/pg_hba.conf host sahana reporter x.x.x.x/32 md5 /etc/init.d/postgresql restart
NB Also remember any network firewll, e.g. on Amazon EC2, amend the Security Group.
Upgrades
Use pgupgrade.
/etc/init.d/postgresql stop pg_upgrade -b /usr/lib/postgresql/8.4/bin -B /usr/lib/postgresql/9.6/bin -d /var/lib/postgresql/8.4/main -D /var/lib/postgresql/9.6/main /etc/init.d/postgresql start
Troubleshooting
Locales
View current Collation settings:
su postgres psql show lc_collate; \q exit
Problem:
createdb: database creation failed: ERROR: encoding UTF8 does not match server's locale en_US DETAIL: The server's LC_CTYPE setting requires encoding LATIN1.
Solution:
# Ensure all locales are available apt-get install locales-all # enable en_US.UTF8 sed -i 's|# en_US.UTF-8 UTF-8|en_US.UTF-8 UTF-8|' /etc/locale.gen locale-gen # Rebuild the PostgreSQL cluster pg_dropcluster 8.4 main pg_createcluster --locale=en_US.UTF-8 8.4 main /etc/init.d/postgresql restart # Recreate 'sahana' user echo "CREATE USER sahana WITH PASSWORD 'mypassword';" > /tmp/pgpass.sql su -c - postgres "psql -q -d template1 -f /tmp/pgpass.sql" rm -f /tmp/pgpass.sql
Problem:
- Trying to import a LATIN1 Shapefile into a UTF-8 table:
Unable to convert data value to UTF-8 (iconv reports "Invalid or incomplete multibyte or wide character"). Current encoding is "UTF-8". Try "LATIN1" (Western European), or one of the values described at http://www.postgresql.org/docs/current/static/multibyte.html.
Solution:
- Export in correct encoding
shp2pgsql -s 4326 -W LATIN1 -I mylatin1shapefile.shp public.gis
Problem:
- Want to store data in a LATIN1 table (although this is probably not a good idea!)
Solution:
# View available locales locale -a # Activate LATIN1 locale dpkg-reconfigure locales # select en_US ISO-8859-1, OK # Check available locale -a su -c - postgres "createdb -O gis -E LATIN1 mylatin1db -T template0 --lc-collate=en_US.ISO-8859-1 --lc-ctype=en_US.ISO-8859-1"
Performance
Use a C locale instead of UTF-8 if you don't need UTF-8 (much faster with LIKE queries)
- http://wiki.postgresql.org/wiki/Performance_Optimization
- http://wiki.postgresql.org/wiki/SlowQueryQuestions
- http://www.revsys.com/writings/postgresql-performance.html
- Vacuum the database
- http://www.postgresql.org/docs/8.4/static/runtime-config-logging.html
Log all queries which take longer than 500ms:
vim /etc/rsyslog.conf local0.* /var/log/postgresql.log /etc/init.d/rsyslogd restart vim /etc/postgresql/8.4/main/postgresql.conf log_destination = 'syslog' syslog_facility = 'LOCAL0' syslog_ident = 'postgres' log_min_duration_statement = 500 /etc/init.d/postgresql restart tail -n 45 -f /var/log/postgresql.log
Disable logging again:
vim /etc/postgresql/8.4/main/postgresql.conf log_min_duration_statement = -1 /etc/init.d/postgresql restart
Log all queries which take longer than 500ms & do an 'EXPLAIN' on those queries (not working)
- http://www.depesz.com/index.php/2008/11/23/waiting-for-84-auto-explain/
LOAD 'auto_explain'; set explain.log_min_duration = 500;
pgBadger is a simple and useful tool for analysing your logs:
PgAdmin4 on Windows
If you get 'application server could not be contacted' then delete C:\Users\{Your_User}\AppData\Roaming\pgAdmin
folder
See Also
- PostGIS Installation
- Setting up Web2Py with PostgreSQL on Ubuntu: http://www.web2pyslices.com/main/slices/take_slice/14