This guide walks you through deploying and setting up a CKAN repository on Ubuntu 16.04. CKAN is an open source dataset repository used to collect and distribute large data collections from all kinds of sources, including census data, spreadsheets, geographical data, and government and research data.
Once you have completed this guide you will have a production server running CKAN 2.7 that is ready to import a new dataset. If you only want to get started with CKAN, follow the official documentation to set up a development environment instead.
To complete this guide you will need:
- A cloud VPS or virtual machine running Ubuntu 16.04 (64-bit) with a public IP address
- A domain name pointing to your VPS ( you can use a subdomain as well )
- Some experience working with Python virtual environments and the command line shell
- Some knowledge about Apache, Nginx, and Tomcat
- A sample structured dataset in CSV or Excel format ( optional )
1. Setup the System
Let's first set up the Ubuntu machine with the required packages:
a. Update the server
$ ssh root@$REMOTE_SERVER
$ apt update && apt upgrade
b. Create a backup script
$ cat > /usr/bin/backme.sh
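#!/bin/bash -e
# Copy each file given on the command line to FILE-<epoch timestamp> and to
# FILE.last, so that "diff FILE FILE.last" shows the most recent change.
backme () {
    TIMESTAMP=$( date +'%s' )
    for SOURCE_FILE in "$@"
    do
        cp -av "$SOURCE_FILE" "${SOURCE_FILE}-${TIMESTAMP}"
        cp -a "$SOURCE_FILE" "${SOURCE_FILE}.last"
    done
}
backme "$@"
exit $?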
$ sudo chmod +x /usr/bin/backme.sh
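The backup-then-diff pattern used throughout this guide can be tried safely on a scratch file first. The following is only a sketch: backup_file is a hypothetical stand-in for backme.sh that also keeps a ".last" copy (the file name the later diff commands expect), and the scratch file contents are made up.

```shell
#!/bin/sh
# Hypothetical stand-in for backme.sh: keep a timestamped copy plus a
# ".last" copy of each file, so "diff FILE FILE.last" shows the latest change.
backup_file() {
    stamp=$(date +%s)
    for src in "$@"; do
        cp -a "$src" "${src}-${stamp}"
        cp -a "$src" "${src}.last"
    done
}

# Try it on a scratch file (made-up contents), not on a real config file.
scratch=$(mktemp -d)
printf 'ckan.site_id = default\n' > "$scratch/demo.ini"

backup_file "$scratch/demo.ini"
sed -i 's/default/mysite/' "$scratch/demo.ini"

# diff exits with status 1 when the files differ, so that is not an error here.
diff "$scratch/demo.ini" "$scratch/demo.ini.last" || true
```

The diff output lists the edited line against the saved one, which is the quick sanity check each configuration step in this guide relies on.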
c. Update the server hostname configuration. Replace "x.x.x.x" with your server IP address and yourdomain.com with the domain assigned to your server.
$ backme.sh /etc/hosts /etc/hostname
$ echo "ckanproduction.yourdomain.com" > /etc/hostname
$ echo -e "\nx.x.x.x ckanproduction.yourdomain.com ckanproduction" >> /etc/hosts
$ sudo hostname ckanproduction.yourdomain.com
$ exit
$ ssh root@$REMOTE_SERVER
d. Setup a user account to deploy CKAN
$ adduser ckanadmin    # use the interactive prompts to set up the new user and password that will manage CKAN
$ echo -e "\nckanadmin ALL=(ALL:ALL) NOPASSWD:ALL" >> /etc/sudoers
Alternatively, you may use "visudo" to set up the user permissions.
$ su - ckanadmin
$ mkdir .ssh
$ cat > .ssh/authorized_keys    # paste the public key used to access the server as ckanadmin, then press Ctrl-D
$ chmod 700 .ssh
$ exit
$ ssh ckanadmin@$REMOTE_SERVER
e. Install the software package dependencies for CKAN
$ cat > packagelist
apache2 build-essential git git-core jq libapache2-mod-rpaf libapache2-mod-wsgi libffi-dev libgeos-c1 libgeos-dev libjts-java libpq5 libpq-dev libtomcat6-java libxml2-dev libxslt1-dev links openjdk-8-jdk postgresql postgresql-9.3-postgis-2.1 python-dev python-jinja2 python-pastescript python-pip python-virtualenv python-werkzeug redis-server solr-tomcat tomcat8 unzip virtualenvwrapper zlib1g-dev postfix unzip unrar p7zip-full python-gdal
$ sudo apt install -y $( cat packagelist )
$ sudo service apache2 stop
$ sudo apt install nginx -y
$ sudo service nginx stop
Note: use apt search 'packagename*' to find the package versions available for your Ubuntu release, and update the 'packagelist' file accordingly.
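Checking the whole list at once can save a few failed install attempts. Below is a minimal sketch, assuming the package names sit in the packagelist file separated by whitespace; has_candidate and missing_packages are hypothetical helper names, and APT_CACHE is only a hook so the check can be exercised without apt.

```shell
#!/bin/sh
# Print every package named in a list file for which apt has no install candidate.
# APT_CACHE can be overridden to point at a different command (assumption: it
# accepts the same "policy PACKAGE" arguments as apt-cache).
APT_CACHE=${APT_CACHE:-apt-cache}

has_candidate() {
    # "apt-cache policy" prints nothing for unknown packages and
    # "Candidate: (none)" for packages that cannot be installed.
    $APT_CACHE policy "$1" 2>/dev/null | grep 'Candidate:' | grep -qv '(none)'
}

missing_packages() {
    # Word-split the file contents on purpose: one package name per word.
    for pkg in $(cat "$1"); do
        has_candidate "$pkg" || echo "$pkg"
    done
}

# Usage on the server:
#   missing_packages packagelist
```

Any name it prints needs to be replaced or removed in packagelist before running the install command.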
2. Download and Install the CKAN package for production
Download the CKAN package that matches your Ubuntu release from the CKAN packaging website using wget ( the xenial build is the one for Ubuntu 16.04 ), then install it with dpkg.
$ sudo wget http://packaging.ckan.org/python-ckan_2.7-xenial_amd64.deb
$ sudo dpkg -i python-ckan_2.7-xenial_amd64.deb
3. Setup a PostgreSQL database
a. Check that PostgreSQL was installed correctly and that the encoding of the databases is UTF8
$ sudo -u postgres psql -l
b. Next you’ll need to create a database user. Enter a password when prompted ( please use a different password from the example below )
$ sudo -u postgres createuser -S -D -R -P ckan_default
password: f!+hRnztXgDtKSLW9kY
c. Create a new PostgreSQL database owned by that user
$ sudo -u postgres createdb -O ckan_default ckan_default -E utf-8
database: ckan_default
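The encoding check can also be scripted instead of read by eye. This is a small sketch, assuming the unaligned, tuples-only listing produced by psql -lqtA (fields separated by "|", encoding in the third field); non_utf8_databases is a hypothetical helper name.

```shell
#!/bin/sh
# Read "psql -lqtA" output on stdin and print the name of every database
# whose encoding is not UTF8.
non_utf8_databases() {
    # Field 3 is the encoding. Continuation lines of the access-privileges
    # column contain no "|" and are skipped by the NF test.
    awk -F'|' 'NF >= 3 && $3 != "UTF8" { print $1 }'
}

# Usage on the server (prints nothing when every database is UTF8):
#   sudo -u postgres psql -lqtA | non_utf8_databases
```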
5. Setup Solr
a. Edit the Tomcat configuration file
$ sudo backme.sh /etc/tomcat8/server.xml
$ sudo sed -r -i "/Connector port=\"8080\"/ s/$/ address=\"127.0.0.1\"/" /etc/tomcat8/server.xml
$ sudo sed -r -i "/port=.8080./ s/8080/8983/g" /etc/tomcat8/server.xml
$ diff /etc/tomcat8/server.xml /etc/tomcat8/server.xml.last
$ sudo service tomcat8 restart
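To see what the two Connector edits do before touching the real file, they can be run against a scratch copy. This is a sketch only: restrict_connector is a hypothetical wrapper, and the sample Connector element is an assumption about what the stock server.xml contains.

```shell
#!/bin/sh
# Apply the two Connector edits to the file given as $1.
restrict_connector() {
    # 1. Bind the connector to localhost only. The space before "address"
    #    matters: XML requires whitespace between attributes.
    sed -r -i '/Connector port="8080"/ s/$/ address="127.0.0.1"/' "$1"
    # 2. Move the connector from port 8080 to 8983.
    sed -r -i '/port=.8080./ s/8080/8983/g' "$1"
}

# Scratch file with an assumed sample of the stock Connector element.
scratch_xml=$(mktemp)
cat > "$scratch_xml" <<'EOF'
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />
EOF

restrict_connector "$scratch_xml"
cat "$scratch_xml"
```

After the edit the first line of the element carries both the new port and the address attribute, so Solr is reachable only from the server itself.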
b. Replace the default schema.xml file with a symlink to the CKAN schema
$ sudo mv -v /etc/solr/conf/schema.xml /etc/solr/conf/schema.xml.orig.back
$ sudo ln -s /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema.xml /etc/solr/conf/schema.xml
$ sudo service tomcat8 restart
c. Change the solr_url setting in your CKAN configuration file
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/solr_url/ s/^#//" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
$ sudo ln -s /usr/bin/rotatelogs /usr/sbin/rotatelogs
$ sudo service tomcat8 restart
$ sudo service apache2 restart
$ links http://localhost:8983/solr/
To check Solr from your own machine instead, open an SSH tunnel and then use a web browser to visit http://localhost:8983/solr/
$ ssh -L 8983:localhost:8983 ckanadmin@ckanproduction.yourdomain.com
6. Update the configuration and initialize the database
a. Setup ckan.site_id ( replace tenosolution with an identifier for your own site )
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -i "/ckan.site_id/ s/default/tenosolution/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
b. Setup ckan.site_url
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.site_url/ s/\$/ http\:\/\/ckanproduction.yourdomain.com/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
c. Edit the sqlalchemy.url setting ( sqlalchemy.url = postgresql://ckan_default:pass@localhost/ckan_default ), replacing pass with the password of the ckan_default database user. The sed expression is single-quoted so that an interactive bash session does not treat the "!" in the password as a history expansion.
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -i '/sqlalchemy/ s/pass/f!+hRnztXgDtKSLW9kY/' /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
d. Initialize your CKAN database
$ sudo ckan db init
e. Restart the web servers
$ sudo service apache2 restart
$ sudo service nginx restart
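The three production.ini edits in this section can be rehearsed on a scratch file to confirm what each sed expression changes. The sketch below makes several assumptions: apply_site_settings is a hypothetical helper, the three sample lines are a guess at what the stock file contains, and the values passed in are placeholders. It will also misbehave if the password contains "/" or "&", which are special to sed.

```shell
#!/bin/sh
# Apply the site id, site URL, and database password edits to an ini file.
# $1 = ini file, $2 = site id, $3 = site URL, $4 = database password
apply_site_settings() {
    sed -i "/^ckan\.site_id/ s/default/$2/" "$1"
    # "|" is used as the sed delimiter because the URL contains slashes.
    sed -i "/^ckan\.site_url/ s|\$| $3|" "$1"
    sed -i "/^sqlalchemy\.url/ s/:pass@/:$4@/" "$1"
}

# Scratch file with assumed sample lines, not the real production.ini.
scratch_ini=$(mktemp)
cat > "$scratch_ini" <<'EOF'
ckan.site_id = default
ckan.site_url =
sqlalchemy.url = postgresql://ckan_default:pass@localhost/ckan_default
EOF

# Single quotes keep characters such as "!" in the password literal.
apply_site_settings "$scratch_ini" mysite http://ckanproduction.yourdomain.com 'example-password'
cat "$scratch_ini"
```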
7. Setting up Admin Account and Test Data
a. Activate the virtual environment
$ . /usr/lib/ckan/default/bin/activate
$ cd /usr/lib/ckan/default/src/ckan
b. Creating a sysadmin user ( Please setup a different password )
$ paster sysadmin add mycompanyadmin -c /etc/ckan/default/production.ini
password: mycompanyadmin
c. Creating test data
$ paster create-test-data -c /etc/ckan/default/production.ini
d. Setup CKAN ownership to install extensions later...
$ cd /usr/lib/ckan
$ sudo chown -R ckanadmin: default/
8. Setup ckanext-spatial
a. Install PostGIS ( adjust the version numbers to the ones available for your Ubuntu release, as noted in step 1 ):
$ cd /usr/lib/ckan/default/src/ckan
$ sudo apt-get install postgresql-9.3-postgis-2.1 python-dev libxml2-dev libxslt1-dev libgeos-c1
b. Create the necessary tables and functions in the database
$ sudo -u postgres psql -d ckan_default -f /usr/share/postgresql/9.3/contrib/postgis-2.1/postgis.sql
$ sudo -u postgres psql -d ckan_default -f /usr/share/postgresql/9.3/contrib/postgis-2.1/spatial_ref_sys.sql
c. Change the owner of spatial tables to the CKAN user
$ sudo -u postgres psql -d ckan_default -c 'ALTER VIEW geometry_columns OWNER TO ckan_default;'
$ sudo -u postgres psql -d ckan_default -c 'ALTER TABLE spatial_ref_sys OWNER TO ckan_default;'
d. Check that PostGIS was properly installed:
$ sudo -u postgres psql -d ckan_default -c "SELECT postgis_full_version()"
e. Install the extension
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/okfn/ckanext-spatial.git
$ cd ckanext-spatial
$ pip install -r pip-requirements.txt    # no sudo: sudo would run the system pip instead of the one in the active virtual environment
$ python setup.py develop
f. Configure the extension:
$ cd /usr/lib/ckan/default/src/ckanext-spatial
$ paster --plugin=ckanext-spatial spatial initdb 4326 -c /etc/ckan/default/production.ini
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ spatial_metadata spatial_query/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
$ deactivate
9. Configure the FileStore
a. Create the file store directory
$ sudo mkdir -p /var/lib/ckan/default
$ sudo chown -R www-data: /var/lib/ckan/default
$ sudo chmod u+rwx /var/lib/ckan/default
b. Enable FileStore and file uploads
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -i "/ckan.storage_path/ s/^#//" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
$ sudo chown -R www-data: /var/lib/ckan
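Uploads fail quietly when the web server user cannot write to the storage directory, so a quick check after this step can help. Below is a minimal sketch, assuming GNU stat as shipped with Ubuntu; storage_dir_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if DIR ($1) exists, is owned by OWNER ($2), and grants its
# owner read, write, and execute permission.
storage_dir_ok() (
    [ -d "$1" ] || return 1
    expected_owner=$2
    # GNU stat: %U = owner name, %a = permission bits in octal.
    set -- $(stat -c '%U %a' "$1")
    [ "$1" = "$expected_owner" ] || return 1
    case "$2" in
        7??|?7??) return 0 ;;   # owner digit is 7, with or without a special-bits digit
        *) return 1 ;;
    esac
)

# Usage on the server:
#   storage_dir_ok /var/lib/ckan/default www-data && echo "storage directory looks fine"
```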
10. Install extensions
a. Activate the virtual environment
$ . /usr/lib/ckan/default/bin/activate
a. Install ckanext-spatialUI ( http://extensions.ckan.org/extension/spatialui/ )
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/XVTSolutions/ckanext-spatialUI
$ cd ckanext-spatialUI
$ python setup.py develop
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ spatialUI/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
b. Install ckanext-pdfview ( http://extensions.ckan.org/extension/pdfview/ )
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/ckan/ckanext-pdfview.git
$ cd ckanext-pdfview
$ python setup.py develop
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ resource_proxy pdf_view/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
c. Install ckanext-officedocs ( http://extensions.ckan.org/extension/officedocs/ )
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/jqnatividad/ckanext-officedocs.git
$ cd ckanext-officedocs
$ python setup.py install
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ officedocs_view/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
d. Install ckanext-dictionary ( http://extensions.ckan.org/extension/dictionary/ )
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/cmuphillycapstone/ckanext-dictionary.git
$ cd ckanext-dictionary
$ python setup.py develop
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ dictionary/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
g. Install ckanext-geoview ( https://github.com/ckan/ckanext-geoview )
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/ckan/ckanext-geoview.git
$ cd ckanext-geoview
$ python setup.py develop
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ geo_view geojson_view wmts_view/" /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.views.default_views =/ s/$/\ geo_view geojson_view wmts_view/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
h. Install ckanext-geopusher ( https://github.com/datacats/ckanext-geopusher )
$ cd /usr/lib/ckan/default/src
$ git clone https://github.com/datacats/ckanext-geopusher.git
$ cd ckanext-geopusher
$ backme.sh requirements.txt
$ sed -r -i "/celery/ s/$/==3.1.25/" requirements.txt
$ pip install -r requirements.txt
$ python setup.py develop
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ geopusher/" /etc/ckan/default/production.ini
$ sudo sed -i "100ickanext.geoview.ol_viewer.formats = wms kml geojson wfs" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
$ sudo service apache2 reload
$ deactivate
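Every extension step appends to the ckan.plugins line, so running a step twice leaves the plugin listed twice. One way around that is to append only what is missing. This is a sketch rather than part of the original procedure: add_plugins is a hypothetical helper, and it assumes the setting sits on a single line starting with "ckan.plugins =".

```shell
#!/bin/sh
# Append each named plugin to the "ckan.plugins =" line of an ini file,
# skipping plugins that are already listed so the edit can be re-run safely.
# $1 = ini file, remaining arguments = plugin names
add_plugins() {
    ini=$1
    shift
    for plugin in "$@"; do
        # Match the plugin only as a whole word on the ckan.plugins line.
        if ! grep -Eq "^ckan\.plugins *=(.* )?${plugin}( |\$)" "$ini"; then
            sed -r -i "/^ckan\.plugins *=/ s/\$/ ${plugin}/" "$ini"
        fi
    done
}

# Usage on the server (take a backup first, as in the steps above):
#   sudo backme.sh /etc/ckan/default/production.ini
#   add_plugins /etc/ckan/default/production.ini geo_view geojson_view wmts_view
```

The function needs write access to the file, so on the server it has to run from a root shell or against a copy that is moved into place afterwards.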
11. Setting up the DataStore ( run this in a new terminal )
$ . /usr/lib/ckan/default/bin/activate
a. Enable the datastore plugin
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ datastore/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
b. Set up the database: create a database user called datastore_default and a database of the same name owned by ckan_default ( please use a different password )
$ sudo -u postgres psql -l
$ sudo -u postgres createuser -S -D -R -P -l datastore_default
password: 7DN4ta2igWVlFj
$ sudo -u postgres createdb -O ckan_default datastore_default -E utf-8
c. Setup your CKAN config
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i -e '/ckan.datastore.write_url/ s/^#//' -e '/ckan.datastore.write_url/ s/ckan_default:pass/ckan_default:f!+hRnztXgDtKSLW9kY/' /etc/ckan/default/production.ini
$ sudo sed -r -i -e '/ckan.datastore.read_url/ s/^#//' -e '/ckan.datastore.read_url/ s/datastore_default:pass/datastore_default:7DN4ta2igWVlFj/' /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
d. Set permissions. This is the command for a package installation, as used in this guide
$ sudo ckan datastore set-permissions | sudo -u postgres psql --set ON_ERROR_STOP=1
e. Setup datapusher
$ sudo backme.sh /etc/ckan/default/production.ini
$ sudo sed -r -i "/ckan.plugins =/ s/$/\ datapusher/" /etc/ckan/default/production.ini
$ sudo sed -r -i "/datapusher/ s/^#ckan/ckan/" /etc/ckan/default/production.ini
$ diff /etc/ckan/default/production.ini /etc/ckan/default/production.ini.last
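After this many sed edits it is worth confirming which settings are actually active, since a line that is still commented out is silently ignored. Below is a minimal sketch, assuming simple one-line "key = value" settings; ini_get is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of an active (uncommented) "key = value" setting.
# $1 = setting name, $2 = ini file
ini_get() {
    awk -v key="$1" '
        {
            eq = index($0, "=")
            if (eq == 0) next
            # Everything before the first "=" is the name; a leading "#"
            # makes it differ from the key, so commented lines are skipped.
            name = substr($0, 1, eq - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name != key) next
            value = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            print value
        }
    ' "$2"
}

# Usage on the server (empty output means the setting is missing or commented out):
#   ini_get ckan.datastore.write_url /etc/ckan/default/production.ini
#   ini_get ckan.datastore.read_url /etc/ckan/default/production.ini
```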
12. Restart the web services
$ sudo -i
$ service tomcat8 restart; service apache2 restart; service nginx restart
You should now be able to access your CKAN repository at http://ckanproduction.yourdomain.com and log in with the admin user created earlier.
Please refer to the CKAN docs for further details about how to manage and set up your new site.
First of all, thanks a lot for the tutorial. It's great! But I have some problems installing all the needed packages on Ubuntu:
Can you help me?
Hi Viktor,
The error says 'openjdk-7-jdk is not available'. In the packagelist file, change it to openjdk-8-jdk
Bobbin, thanks for your answer.
There were a few more problems at this step:
Package 'libtomcat6-java' has no installation candidate
Unable to locate package postgresql-9.3-postgis-2.1
Couldn't find any package by glob 'postgresql-9.3-postgis-2.1'
Couldn't find any package by regex 'postgresql-9.3-postgis-2.1'
Unable to locate package tomcat6
Unable to locate package tomcat6-common
tomcat6 is now legacy. You can either install it manually or put tomcat8 in the file instead, and remove tomcat6-common, since the common package comes with tomcat8.
Try replacing postgresql-9.3-postgis-2.1 with postgresql-10-postgis-2.4
You can use apt search to find the supported versions:
$ sudo apt search postgresql*
$ sudo apt search tomcat*
Hi!
first, it is a great tut!
i have a strange issue, my connection is blocked: solr and tomcat can't establish it :-/
i installed tomcat8 after i saw that tomcat6 is legacy :-)
tomcat9 is installed
my system is ubuntu 18.04, the rest worked for me
thank you in advance
alex
Hi Alex,
Have you checked the logs? They should give some hint as to why the connection is blocked.