wiki:InstallationGuidelines/PostgreSQL

Version 79 (modified by Fran Boon, 7 years ago) ( diff )

--

Installation on PostgreSQL

Replacing the simple SQLite database with PostgreSQL allows greater scalability & the possibility of using PostGIS for spatial storage.

Main install docs are here: InstallationGuidelines/Linux/Server/CherokeePostgreSQL

This page provides some supplementary notes

Installation

It is assumed that you are running Debian 'Squeeze'. 'Lenny' works fine for Eden, but the main point is to use PostGIS where Squeeze or Wheezy are better...

(For Wheezy, replace '8.4' with '9.1')

apt-get install -y postgresql-8.4 python-psycopg2

If you are using postgres-9.3, use psycopg2 as the driver instead of the web2py default pg8000.

Enable automatic DB maintenance:

vim /etc/postgresql/8.4/main/postgresql.conf
# Enable remote access
listen_addresses = '*'
port = 5432
track_counts = on
autovacuum = on

vim /etc/postgresql/8.4/main/pg_hba.conf
host    all         all         my.ip.add.ress/24     md5

/etc/init.d/postgresql restart

vim /etc/iptables.rules
-A INPUT -p tcp --dport 5432 -j ACCEPT

reboot

NB Don't do remote access as 'postgres' - to do so means can't simply access psql via 'su postgres'

To log-in as user postgres (Debian/Ubuntu):

sudo su postgres

Tuning

Adjust postgresql.conf for performance relative to resources available

vim /etc/sysctl.conf
# Increase Shared Memory available for PostgreSQL
# 512Mb
kernel.shmmax = 279134208
# 1024Mb (may need more)
#kernel.shmmax = 552992768
kernel.shmall = 2097152

sysctl -w kernel.shmmax=279134208
sysctl -w kernel.shmall=2097152

vim /etc/postgresql/8.4/main/postgresql.conf

# This default is probably ok
# adjust upward _or_ add a connection pooling layer like pgpool if running out of connections for clients
max_connections = 100

# shared_buffers = 24MB
# Ideal is if all frequently accessed indexes and table rows can fit in here - this is often unrealistic
# setting too high relative to system resources can be detrimental to the system.
# At the high end this should be 10-25% of available memory (ie that which is not being used by the OS or other processes).
# Values suggested are quite conservative, so you can push them up a bit if there are resources available for it
# but absolutely no more than 1/3 available memory.
# 512Mb system suggest starting at:
shared_buffers = 56MB
# 1024Mb system suggest starting at:
#shared_buffers = 160MB

# effective_cache_size = 128MB
# 75% of available memory
# 512Mb system suggest starting at:
effective_cache_size = 256MB
# 1024Mb system suggest starting at:
#effective_cache_size = 512MB

#work_mem = 1MB
# This is a per sort setting, so setting this too high with a high number of max_connections can be bad.
# If the database load turns out to have a lot of sorting and not a lot of separate connections, you should increase this
# and drop the max_connections.
# 512Mb system suggest starting at:
work_mem = 2MB
# 1024Mb system suggest starting at:
#work_mem = 4MB

# As you get more memory in the system, moving this up can be quite useful if you notice vacuum processes taking forever.
maintenance_work_mem = 16MB

/etc/init.d/postgresql restart

Install Support Tools

(Optional) Install the Admin Pack for enhanced intrumentation:

apt-get install postgresql-contrib-8.4
su postgres
psql -d postgres -f /usr/share/postgresql/8.4/contrib/adminpack.sql

(Optional) Install PHPPgAdmin if you want a web-based management tool:

apt-get install phppgadmin

Configure PostgreSQL for Sahana Eden

su postgres
createuser -s -P sahana

createdb -O sahana -E UTF8 sahana -T template0

If using PostGIS (recommended) then:

apt-get install -y postgresql-8.4-postgis
su postgres
createlang plpgsql -d sahana
psql -d sahana -f /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql
psql -d sahana -f /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql

NOTE: you must install PostGIS first

Ubuntu users (Lucid/PostgreSQL 8.4/Post-GIS 1.4) - the paths are slightly different

psql -d sahana -f /usr/share/postgresql/8.4/contrib/postgis.sql
psql -d sahana -f /usr/share/postgresql/8.4/contrib/spatial_ref_sys.sql

Install Python support for PostgreSQL

If not done as-per above instructions:

apt-get install python-psycopg2

Configure Sahana Eden to use PostgreSQL

vim models/000_config.py
settings.database.db_type = "postgres"
settings.database.port = "" # use default
settings.database.database = "sahana"
settings.database.username = "sahana"
settings.database.password = "mypassword"
settings.database.pool_size = 30

'Connection pools' are used to reuse connections. How many pools should we use?

whatever you choose it always starts with one and grows with the number of concurrent requests up to the value of pools (the max number of concurrent pools).
So pools should be the max number of concurrent requests you expect. That is the max value of:
<number of requests/second> / <average time per request in seconds> 

Add Geometry column to gis_location

Tell Sahana that the DB is spatially-enabled so that routines can make use of this:

vim models/000_config.py
deployment_settings.gis.spatialdb = True

The rest of this section is DEPRECATED (now done by Spatial DAL)

If using PostGIS, then once the tables have been created:

cat << EOF > "/tmp/geometry.sql"
UPDATE public.gis_location SET wkt = 'POINT (' || lon || ' ' || lat || ')' WHERE gis_feature_type = 1;
SELECT AddGeometryColumn( 'public', 'gis_location', 'the_geom', 4326, 'GEOMETRY', 2 );
GRANT ALL ON geometry_columns TO sahana;
UPDATE public.gis_location SET the_geom = ST_SetSRID(ST_GeomFromText(wkt), 4326);
EOF
su -c - postgres "psql -q -d sahana -f /tmp/geometry.sql"

(if using Windows can use the Query Editor that comes with pgAdmin III)

Ensure that future entries auto-populate properly:

cat << EOF > "/tmp/autopopulate.sql"
CREATE OR REPLACE FUNCTION s3_update_geometry()
  RETURNS "trigger" AS \$$
  DECLARE
  BEGIN
    if (NEW.wkt != '') then
        NEW.the_geom = SetSRID(GeomFromText(NEW.wkt), 4326);
        end if;

    RETURN NEW;
  END;
\$$  LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION s3_update_geometry() OWNER TO sahana;
CREATE TRIGGER s3_locations_update
  BEFORE INSERT
  ON gis_location
  FOR EACH ROW
  EXECUTE PROCEDURE s3_update_geometry();

EOF
su -c - postgres "psql -q -d sahana -f /tmp/autopopulate.sql"

Backups

su postgres
pg_dump -c sahana > /tmp/sahana.sql
pg_dump -Fc gis > /tmp/gis.dmp

Maintenance

Connect to Database (or use pgAdmin III GUI's PSQL Console on plugins menu):

su postgres
psql

or

psql -d my_spatial_db -U username

List Databases:

\l

Connect to a Database:

\c sahana

List Tables:

\d

Describe Table structure:

\d tablename
select field/* from tablename where field = 'value';

If you need to build a fresh database:

rm -f /home/web2py/applications/eden/databases/*
su postgres
pkill -f 'postgres: sahana sahana'
dropdb sahana
createdb -O sahana -E UTF8 sahana

To restore data from backup, clear the databases folder, settings.base.prepopulate = 0 & settings.base.migrate=True, run a migration & then restore the data:

su postgres
psql sahana < /tmp/sahana.sql
createdb gis
psql gis -f /usr/share/postgresql-8.4/contrib/postgis.sql
/usr/share/postgresql-8.4-postgis/utils/new_postgis_restore.pl -f /tmp/gis.dmp | psql gis

If wanting to be specific about which tables to restore, you can use:

pg_restore -t table | psql

If you want to exclude a table, there is no equivalent argument, so you can sue the following procedure:

pg_restore -l $pgdump_file  | grep -v "TABLE DATA public table_to_exclude"  > restore.pgdump.list

Remote Access

Say that you want to provide access for an external reporting tool.

Add the user account:

su postgres
psql
CREATE USER reporter WITH PASSWORD 'mypassword';
\q
exit
su -c - postgres "psql -q -d sahana -c 'GRANT CONNECT ON DATABASE sahana TO reporter;'"
su -c - postgres "psql -q -d sahana -c 'GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporter;'"
su -c - postgres "psql -q -d sahana -c 'ALTER DEFAULT PRIVILEGES FOR USER reporter IN SCHEMA public GRANT SELECT ON TABLES TO reporter;'"

Allow remote access:

vim /etc/postgresql/9.4/main/postgresql.conf
listen_addresses = '*'

vim /etc/postgresql/9.4/main/pg_hba.conf
host    sahana          reporter        x.x.x.x/32          md5

/etc/init.d/postgresql restart

NB Also remember any network firewll, e.g. on Amazon EC2, amend the Security Group.

Troubleshooting

Locales

View current Collation settings:

su postgres
psql
show lc_collate;
\q
exit

Problem:

createdb: database creation failed: ERROR:  encoding UTF8 does not match server's locale en_US
DETAIL:  The server's LC_CTYPE setting requires encoding LATIN1.

Solution:

# Ensure all locales are available
apt-get install locales-all
# enable en_US.UTF8
sed -i 's|# en_US.UTF-8 UTF-8|en_US.UTF-8 UTF-8|' /etc/locale.gen
locale-gen
# Rebuild the PostgreSQL cluster
pg_dropcluster 8.4 main
pg_createcluster --locale=en_US.UTF-8 8.4 main
/etc/init.d/postgresql restart
# Recreate 'sahana' user
echo "CREATE USER sahana WITH PASSWORD 'mypassword';" > /tmp/pgpass.sql
su -c - postgres "psql -q -d template1 -f /tmp/pgpass.sql"
rm -f /tmp/pgpass.sql

Problem:

  • Trying to import a LATIN1 Shapefile into a UTF-8 table:
    Unable to convert data value to UTF-8 (iconv reports "Invalid or incomplete multibyte or wide character"). Current encoding is "UTF-8". Try "LATIN1" (Western European), or one of the values described at http://www.postgresql.org/docs/current/static/multibyte.html.
    

Solution:

  • Export in correct encoding
    shp2pgsql -s 4326 -W LATIN1 -I mylatin1shapefile.shp public.gis
    

Problem:

  • Want to store data in a LATIN1 table (although this is probably not a good idea!)

Solution:

# View available locales
locale -a
# Activate LATIN1 locale
dpkg-reconfigure locales
# select en_US ISO-8859-1, OK
# Check available
locale -a
su -c - postgres "createdb -O gis -E LATIN1 mylatin1db -T template0 --lc-collate=en_US.ISO-8859-1 --lc-ctype=en_US.ISO-8859-1"

Performance

Use a C locale instead of UTF-8 if you don't need UTF-8 (much faster with LIKE queries)

Log all queries which take longer than 500ms:

vim /etc/rsyslog.conf
local0.*    /var/log/postgresql.log

/etc/init.d/rsyslogd restart

vim /etc/postgresql/8.4/main/postgresql.conf
log_destination = 'syslog'
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
log_min_duration_statement = 500

/etc/init.d/postgresql restart

tail -n 45 -f /var/log/postgresql.log

Disable logging again:

vim /etc/postgresql/8.4/main/postgresql.conf
log_min_duration_statement = -1

/etc/init.d/postgresql restart

Log all queries which take longer than 500ms & do an 'EXPLAIN' on those queries (not working)

pgBadger is a simple and useful tool for analysing your logs:

PgAdmin4 on Windows

If you get 'application server could not be contacted' then delete C:\Users\{Your_User}\AppData\Roaming\pgAdmin folder

See Also


InstallationGuidelines

Note: See TracWiki for help on using the wiki.