Mikko Kortelainen

OpenStreetMap Nominatim Server for Geocoding

Here's how to install the OpenStreetMap Nominatim service on your own server. It can be used to geocode and reverse-geocode addresses and map coordinates. You also get a web interface that loads map tiles from openstreetmap.org while the geocoding requests themselves are served by your own server.

I was faced with thousands of geocoding requests per hour, and with such a high volume it is not wise to burden the global OSM servers. Even with aggressive caching it would have potentially been too much.

I will use Ubuntu 14.04 LTS as the platform: just a basic install with an ssh server. We will install Apache to serve HTTP requests. Make sure you have enough disk space and RAM to hold the data and serve it efficiently. I used the Finland extract, which was about a 200 MB download. The resulting database was 26 GB after importing, indexing and adding the Wikipedia data; the Wikipedia data probably took more disk space than the OSM data itself. My server has 4 GB of RAM, which seems to be enough for this small dataset.

1   Software requirements

PostgreSQL with PostGIS extension:

apt-get install \
  postgresql postgis postgresql-contrib \
  postgresql-server-dev-9.3 postgresql-9.3-postgis-2.1 \
  postgresql-doc-9.3 postgis-doc

Apache with PHP5:

apt-get install \
  apache2 php5 php-pear php5-pgsql php5-json php-db

Git and various tools:

apt-get install \
  wget git autoconf-archive build-essential \
  automake gcc proj-bin

Tool for handling OpenStreetMap data:

apt-get install osmosis

Needed libraries:

apt-get install \
  libxml2-dev libgeos-dev libpq-dev libbz2-dev libtool \
  automake libproj-dev libboost-dev libboost-system-dev \
  libboost-filesystem-dev libboost-thread-dev \
  libgeos-c1 libgeos++-dev lua5.2 liblua5.2-dev \
  libprotobuf-c0-dev protobuf-c-compiler

2   Kernel tuning

We must increase the kernel shared memory limits. We'll also reduce swappiness and make the kernel memory overcommit a bit more conservative. Make sure you have enough swap space; having more swap than physical RAM is a good idea.

These values were estimated for a machine with 4 GB of RAM.

sysctl -w kernel.shmmax=4404019200
sysctl -w kernel.shmall=1075200
sysctl -w vm.overcommit_memory=2
sysctl -w vm.swappiness=10

echo "kernel.shmmax=4404019200" >> /etc/sysctl.conf
echo "kernel.shmall=1075200" >> /etc/sysctl.conf
echo "vm.overcommit_memory=2" >> /etc/sysctl.conf
echo "vm.swappiness=10" >> /etc/sysctl.conf
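The two shared memory values are consistent with each other: kernel.shmmax is in bytes while kernel.shmall is in pages (4 kB on typical x86 systems), and the shmall value above covers exactly the shmmax value:

```shell
PAGE_SIZE=4096      # bytes per page; verify with: getconf PAGE_SIZE
SHMALL=1075200      # kernel.shmall, measured in pages
echo $((SHMALL * PAGE_SIZE))    # 4404019200 bytes, matching kernel.shmmax
```

If you size these for a different machine, keep that relationship: pick shmmax in bytes, then set shmall to shmmax divided by the page size.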

3   PostgreSQL tuning

The following directives were set in /etc/postgresql/9.3/main/postgresql.conf:

shared_buffers                = 2048MB
work_mem                      = 50MB
maintenance_work_mem          = 1024MB
fsync                         = off    # dangerous!
synchronous_commit            = off
full_page_writes              = off    # dangerous!
checkpoint_segments           = 100
checkpoint_timeout            = 10min
checkpoint_completion_target  = 0.9
effective_cache_size          = 2048MB

The buffer and memory values depend on your available RAM, so set sane values accordingly. These are good for 4 GB of RAM.
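As a rough sanity check on checkpoint_segments: each WAL segment is 16 MB in a default PostgreSQL 9.3 build, so 100 segments let about 1.6 GB of write-ahead log accumulate between checkpoints, which is what makes the bulk import comfortable:

```shell
CHECKPOINT_SEGMENTS=100
SEGMENT_MB=16            # default WAL segment size in PostgreSQL 9.3
echo "$((CHECKPOINT_SEGMENTS * SEGMENT_MB)) MB of WAL between checkpoints"
```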

There are two truly dangerous settings here. Turning fsync off causes PostgreSQL to report successful commits before the disk has confirmed that the data was actually written. Turning full_page_writes off exposes the database to partially written pages in case of a power failure. These options are used here only for the duration of the initial import, because they make the import much faster. Both must be turned back on after the import to ensure database consistency.

Turning synchronous_commit off also jeopardizes the durability of committed transactions: the server reports a commit as successful before the transaction has been flushed to disk. Unlike the two previous directives, it does not compromise consistency. It just means that after a power failure, some recent transactions that were already reported to the client as committed may in fact be rolled back; the database will still be in a consistent state. We'll leave it off because it speeds up queries, and we don't really care about durability in this database, since we can always rebuild it from scratch. But if you have other important databases in the same database cluster, I would recommend turning it back on as well, since the setting is cluster-wide.

The maintenance_work_mem will also be reduced to a lower value later, after the import.

Restart PostgreSQL to apply changes:

pg_ctlcluster 9.3 main restart

4   Dedicated user

We'll create a dedicated user, "nominatim", and install the software under its home directory.

useradd -c 'OpenStreetMap Nominatim' -m -s /bin/bash nominatim

5   Database users

We will make "nominatim" a database user, as well as "www-data", the system user created by the apache2 package.

Nominatim will be a superuser and www-data will be a regular one.

This must be done as a database administrator. The "postgres" user by default is one (root is not). Change user:

su - postgres

Create database users:

createuser -sdRe nominatim
createuser -SDRe www-data
exit

6   Download and Compile Nominatim

Su to user nominatim:

su - nominatim

Set some environment variables; we'll use these later. They are the download locations for the data files and updates. The OSM_LATEST and OSM_UPDATES values below are for Finland; customize them to your needs.

Also set BASE_URL to point to your server and install directory. The trailing slash appears to be important.

WIKIPEDIA_ARTICLES="http://www.nominatim.org/data/wikipedia_article.sql.bin"
WIKIPEDIA_REDIRECTS="http://www.nominatim.org/data/wikipedia_redirect.sql.bin"
OSM_LATEST="http://download.geofabrik.de/europe/finland-latest.osm.pbf"
OSM_UPDATES="http://download.geofabrik.de/europe/finland-updates"
BASE_URL="http://maps.example.org/nominatim/"

You can browse available areas at download.geofabrik.de.
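The trailing slash on BASE_URL seems to matter because the site's page names are appended directly to it; without the slash the last path component gets mangled. A quick illustration (search.php is one of the Nominatim site's pages):

```shell
BASE_URL="http://maps.example.org/nominatim/"
echo "${BASE_URL}search.php"    # http://maps.example.org/nominatim/search.php
BAD="http://maps.example.org/nominatim"
echo "${BAD}search.php"         # http://maps.example.org/nominatimsearch.php (broken)
```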

6.1   Clone the Repository from Github

We will use the release 2.3 branch:

git clone --recursive \
  https://github.com/twain47/Nominatim.git \
  --branch release_2.3

cd Nominatim

The --recursive option is needed to clone everything, because the repository contains submodules.

6.2   Compile Nominatim

Simply:

./autogen.sh
./configure
make

If there were no errors, we are good. In case of missing libraries, check that you installed all requirements earlier.

6.3   Configure Local Settings

Some required settings should go to ~/Nominatim/settings/local.php:

cat >settings/local.php <<EOF
<?php
// Paths
@define('CONST_Postgresql_Version', '9.3');
@define('CONST_Postgis_Version', '2.1');
@define('CONST_Website_BaseURL', '$BASE_URL');

// Update process
@define('CONST_Replication_Url', '$OSM_UPDATES');
@define('CONST_Replication_MaxInterval', '86400');
@define('CONST_Replication_Update_Interval', '86400');
@define('CONST_Replication_Recheck_Interval', '900');
?>
EOF

The last four definitions are only required if you are going to do incremental updates.
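The interval values are given in seconds: 86400 is one day and 900 is fifteen minutes.

```shell
echo $((24 * 60 * 60))   # 86400, the update and max intervals (one day)
echo $((15 * 60))        # 900, the recheck interval (fifteen minutes)
```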

7   Download Data

OpenStreetMap data (about 200 MB for Finland):

wget -O data/latest.osm.pbf $OSM_LATEST

The OSM data is all that is strictly required; the rest below are optional.

Wikipedia data (about 1.4 GB):

wget -O data/wikipedia_article.sql.bin $WIKIPEDIA_ARTICLES
wget -O data/wikipedia_redirect.sql.bin $WIKIPEDIA_REDIRECTS

Special phrases for country codes and names (very small):

./utils/specialphrases.php --countries >data/specialphrases_countries.sql

Special search phrases (a few megabytes):

./utils/specialphrases.php --wiki-import >data/specialphrases.sql

Next we'll import all this stuff into the database.

8   Import Data

The utils/setup.php script will create a new database called "nominatim" and import the given .pbf file into it. This will take a long time, depending on your PostgreSQL settings, available memory, disk speed and the size of the dataset. The full planet can take days to import even on modern hardware. My small dataset took a bit over two hours.

./utils/setup.php \
  --osm-file data/latest.osm.pbf \
  --all --osm2pgsql-cache 1024 2>&1 \
  | tee setup.log

The messages will be saved into setup.log in case you need to look at them later.

If you downloaded the Wikipedia data, the setup should have imported it automatically and told you about it. If you downloaded the special phrases data earlier, you can import it with these commands:

psql -d nominatim -f data/specialphrases_countries.sql
psql -d nominatim -f data/specialphrases.sql

9   Database Production Settings

Now that the import is done, it is time to configure the database to settings that are suitable for production use. The following changes were made in /etc/postgresql/9.3/main/postgresql.conf:

maintenance_work_mem          = 128MB
fsync                         = on
full_page_writes              = on

You should also set the synchronous_commit directive to on if you have other databases running on this same database cluster. See the PostgreSQL Tuning section earlier in this post.

Apply changes:

pg_ctlcluster 9.3 main restart

10   Create the Web Site

The following commands have to be run as root.

Create a directory to install the site into and set permissions:

mkdir /var/www/html/nominatim
chown nominatim:www-data /var/www/html/nominatim
chmod 755 /var/www/html/nominatim
chmod g+s /var/www/html/nominatim
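The g+s (setgid) bit makes new files created inside the directory inherit the directory's group, so anything the nominatim user writes there stays accessible to the www-data group. A quick demonstration in a scratch directory:

```shell
# setgid on a directory: new files inside inherit the directory's group
tmp=$(mktemp -d)
chmod 2755 "$tmp"        # the leading 2 is the setgid bit, same as chmod g+s
stat -c '%A' "$tmp"      # prints drwxr-sr-x; the 's' is the setgid bit
rm -r "$tmp"
```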

Ask bots to keep out:

cat >/var/www/html/nominatim/robots.txt <<'EOF'
User-agent: *
Disallow: /
EOF

10.1   Apache configuration

Edit the default site configuration file /etc/apache2/sites-enabled/000-default.conf and make it look something like this:

<VirtualHost *:80>
  ServerName maps.example.org
  ServerAdmin webmaster@example.org
  DocumentRoot /var/www/html

  ErrorLog ${APACHE_LOG_DIR}/error.log
  CustomLog ${APACHE_LOG_DIR}/access.log combined

  <Directory "/var/www/html/nominatim">
    Options FollowSymLinks MultiViews
    AddType text/html .php
  </Directory>
</VirtualHost>

Apply changes by restarting web server:

apache2ctl restart

10.2   Install the Nominatim Web Site

The installation should be done as the nominatim user:

su - nominatim
cd Nominatim

Run the setup.php with the option to create the web site:

./utils/setup.php --create-website /var/www/html/nominatim

At this point, the site is ready and you can point your browser to the base URL and try it out.

11   Enable OSM Updates

We will configure a cron job to have the database updated periodically using diffs from Geofabrik.de. As the nominatim user:

rm settings/configuration.txt
./utils/setup.php --osmosis-init

Enable hierarchical updates in the database (these were off during import to speed things up):

./utils/setup.php --create-functions --enable-diff-updates

Run once to get up to date:

./utils/update.php --import-osmosis --no-npi

There seems to be a time check in the import function that prevents another run immediately afterwards; instead it waits until the interval configured in the "Configure Local Settings" section earlier in this post has passed (one day by default).

We will add the command to crontab to be executed every Monday at 03:00. Run:

crontab -e

Add a new line to the crontab file looking like this:

00 03 * * mon /home/nominatim/Nominatim/utils/update.php --import-osmosis --no-npi >>/home/nominatim/nominatim-update.log 2>&1

A log file will be created as /home/nominatim/nominatim-update.log.

12   Wrap Up

That's it! You can get JSON or XML data out of the database with HTTP requests. A couple of examples:

Forward geocoding:
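A sketch using the example hostname from earlier; the search.php endpoint with its q and format parameters is the standard Nominatim search API, and "Helsinki" is just an example query:

```shell
# Build the forward-geocoding URL; fetch it with curl or a browser.
BASE_URL="http://maps.example.org/nominatim/"
echo "${BASE_URL}search.php?q=Helsinki&format=json"
```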

Reverse geocoding:
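Again a sketch with the example hostname; reverse.php takes lat and lon as WGS84 decimal degrees, and the coordinates here (central Helsinki) are just an example:

```shell
# Build the reverse-geocoding URL; fetch it with curl or a browser.
BASE_URL="http://maps.example.org/nominatim/"
echo "${BASE_URL}reverse.php?format=json&lat=60.1699&lon=24.9384"
```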

Check out the Nominatim documentation for details.

This post was adapted from my notes of such an installation process. Let me know in the comments section if I introduced any mistakes. Other comments are also welcome!