1 Introduction

In this document, I intend to explain the steps that I went through to create a global network of ‘walking routes’, that is PGrouting ready network to calculate the time and distance it would take to ‘walk’ or ‘sail’ between two points on earth.

A PDF version of this document is provided here, https://www.wnvermeulen.com/empires/routing_tutorial.pdf

This projected is part of a paper with Pierre-Louis Vézina and Gunes Gokmen on the Imperial Routes of Global trade, published in Journal of Economic Growth.

I’m an autodidact in GIS methods, and while I had some basic knowledge in SQL when I started, I learned a lot on how to do things efficiently within this project. Nevertheless, I’m sure there are still a lot of inefficient parts in my code. When coding, everyone makes a continuous trade-off between (a) quick and dirty coding but potentially slow execution time and (b) slow and elegant coding with fast execution. Most of the time, I go for fast solutions only when the execution is unbearably long, or when I have epiphanies. With every block of code I learned a bit more, and am better able to implement efficient solutions early on, such that the difference between (a) and (b) becomes smaller over time.

Many solutions were adapted from online sources, in particular stackexchange. However, I did not keep track of all these sources, so references for the implementations that I used are very scarce. I’m sorry for that, since the online community has been invaluable and people deserve a huge credit for offering their solutions publicly. Perhaps my full solution posted here can be a form of payback. My implementation is also quite similar to http://orbis.stanford.edu and probably I used pieces of code, or strategies from their implementation.

I did this in PostGIS rather than the seemingly more commonly used Python with ArcGIS or QGIS. I needed routing, and at the time that I started with this, the most readily available solution seemed to be PGRouting. I use QGIS to create maps and to check that spatial manipulation in PostGIS is doing what I intent, but not to actually do any ‘real’ analysis or manipulation. The only exception, as will be noted below, is if sometimes I have to add or remove individual edges based on their specific location on the globe.

1.1 Game plan

The basic sequence is as follows

  1. create database
  2. load relevant external data, e.g. global altitude data
  3. create a globe with dots, clip by land.
  4. connect dots to a mesh, attach altitude information for land-based dots
  5. calculate distance and time-cost for walking, given altitude difference
  6. repeat the above for water, with lower resolution and different costs of traversing water based on winds.
  7. calculate optimal routes for current day cities, to each nearest city.
  8. network of routes becomes global routing network.

There is a lot of housekeeping for each, and in between, steps. Removal and creation of indices, removal and recreation of (temporary) tables, etc. As with other types of coding and data projects, over 90% of the code is housekeeping.

One issue with Postgress, is that it is not straight forward to use multiple treads to do time-consuming operations. For instance, when you have millions of rows, and a process on each row takes a few milliseconds, you’ll be waiting a long-time. However, when the operation is completely independent across rows, such an operation could be easily split between processes. I do this manually, by starting new threads in new consoles. Therefore, the script below should probably not be run in one go. Each block needs to be separately called in and waited for completion before starting the next. I looked into ways to automate multi-threading, but found it to much of a hassle to get it going so left it out.

2 Prerequisites

You’ll need an installed Postgres system, with Postgis and PGrouting installed. I followed this person’s instructions: http://www.kyngchaos.com/software/postgres/

You need a large and fast harddisk to do this. I coded everything on a 2014 iMac with 16GB of RAM, but had to buy a 512GB ssd drive to put the database on. SSD because otherwise things take way too long, and additional drive because especially the global raster on altitudes is space intensive. When I interact with the system, e.g. when loading data into the database, I only put code that I used for my system. A windows version would probably be similar.

My database runs on my local machine, not online. I use a standard set of username/password, and the database is called ‘empires’ after the main project for which this was done. Any paths to the local file system to load or store data are indicated as <path>, and need to be replaced with something suitable.

This tutorial is not an introduction to PostgreSQL and PostGIS. So, to really understand what’s going on, some prior knowledge is useful. Perhaps some minimum required knowledge is the following

  • SQL commands ends with ;. So when copying, this is how to recognize how commands break over lines.
  • when copying from the PDF version of this tutorial, things might go wrong quickly. While I use indentation for clarity of code, these indents should be spaces, not tabs. At least that’s my experience.
  • The PostgreSQL are not case sensitive. There is an old convention to have all commands and terms in capitals, e.g. SELECT * FROM cities WHERE capital = TRUE, but I’m too lazy to implement this structurally and in any decent editor with syntax highlighting the colour schemes replaced the issue of making code more readable. Yet the capitalisation remained, albeit not perfectly.
  • I cut lines to make things more readable. The character for continuation on the next line is / in python and \ in bash.

3 Setting up the database and loading data

Create database with required features, then log in using the postgres account,in Bash:

createdb -U wessel empires
psql -U postgres

then in the postgres console:

GRANT ALL PRIVILEGES ON DATABASE empires to wessel;

Get out of psql console (cntl+d) and then log back in using your own user account, in bash:

psql -U wessel -d empires

(in Postgres) Attach the required extensions to the empires database.

CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
CREATE EXTENSION pgrouting;

Given the global scale of the project, we cannot use a projection without significantly distorting distance calculations later. Therefore, we use the standard EPSG:4326 projection, which saves all data as long/lat coordinates, and apply functions where necessary that account for the spherical dimension. Take-away: load or save all external data in 4326.

3.1 Global altitude data

Global altitude data was taken from http://www.viewfinderpanoramas.org/Coverage%20map%20viewfinderpanoramas_org3.htm. I used two python files to download all the tiles and load them as a raster table in the database. This is a lot of data, around 16GB in zip files, and in the database the table is (with indices) 25GB.

Downloading

from lxml import html
import requests
import urllib2
from os import listdir
from sys import stdout


url = 'http://www.viewfinderpanoramas.org/' / 
      + 'Coverage%20map%20viewfinderpanoramas_org3.htm'

page = requests.get(url)
tree = html.fromstring(page.text)

links = tree.xpath('/html/body/map/area/@href')

# build in check for file existence
from os import listdir
from os.path import isfile, join
mypath = '<path>/dem'
files_in_dir = [ f for f in listdir(mypath) if isfile(join(mypath,f)) ]


i=0
for link in links:
    filename = link.split('/')[-1]
    if filename not in files_in_dir:
        if filename not in ['01-15.zip', '16-30.zip', 
                            '31-45.zip', '46-60.zip']:  
            stdout.write(filename+'\n')
            stdout.flush()
            file = urllib2.urlopen(link)
            with open(join(mypath,filename), 'wb') as output:
                while True:
                    data = file.read()
                    if data:
                        output.write(data)
                    else:
                        break

transferring to the database

from sys import stdout

import zipfile
from os import listdir, remove
from os.path import isfile, join, basename
from subprocess import call
import shutil


def unzip(source_filename, dest_dir):
    with zipfile.ZipFile(source_filename) as zf:
        for member in zf.infolist()[1::]:
            file = member.filename.split('/')[1::]
            filename = basename(member.filename)
            #print filename
            if not filename: continue
            zf.extract(member, dest_dir)
            shutil.move(join(dest_dir,member.filename), 
                                join(dest_dir,filename))         


mypath = '<path>/dem/'

onlyfiles = [ f for f in listdir(mypath) if isfile(join(mypath,f)) ]

i=0

for file in listdir(mypath):
    if file.endswith(".zip"):
        stdout.write(str(i)+" "+file+'\n')
        stdout.flush()
        os.makedirs(join('<path>'+'temp'))
        unzip(join(mypath,file), join('/Volumes/WD/','temp'))
        if i==0: # prepare table on first run
            command = "raster2pgsql -s 4326 -p -M -t 100x100 -N -32768" /
                 + "<path>/temp/*.hgt dem3 | psql -U wessel -d empires"
        else: # insert data
            command = "raster2pgsql -s 4326 -a -t 100x100 -N -32768" / 
                 + "<path>/temp/*.hgt dem3 | psql -U wessel -d empires"
        return_code = call(command, shell=True)  
        return_code = call("rm -R <path>/temp/" , shell=True)  
        i+=1
# vacuum and index afterwards

You could probably do it in one go, but for my development I chose to do it in steps. Keeping the raw altitude data can also become useful when creating tables for location specific maps, although in principle it could also be extracted from the table that is created here.

Housekeeping

CREATE INDEX dem3_gist_idx ON dem3 USING GIST (ST_ConvexHull(rast));

3.2 External shapefiles

These shapefiles are obtained from https://www.naturalearthdata.com/downloads/

From bash, load land mass shapefile, current day countries, lakes and rivers. (The \ signs are just for clarity and to allow for lines to break. Each command starts with shp2pgsql.)

shp2pgsql -I -s 4326 -W "LATIN1" <path>/ne_50m_land/ne_50m_land.shp \
  land | psql -d empires -U wessel
shp2pgsql -I -s 4326 -W "LATIN1" <path>/ne_50m_admin_0_countries.shp \
  countries | psql -d empires -U wessel
shp2pgsql -d -s 4326 -I -W "LATIN1" <path>/ne_10m_lakes.shp  \
  public.lakes | psql -d empires -U wessel
shp2pgsql -c -s 4326 -I -W "LATIN1" \
  <project path>/ne_10m_rivers_lake_centerlines.shp  \
  public.rivers | psql -d empires -U wessel

4 The Grid: a globe with dots

Imagine the earth. Now we would like to have dots at equally spaced distances from each other, we’ll call this the grid. This only works when keep either latitude or longitude constant. Let’s keep latitude (the vertical dimension) constant, and choose the equator (lat=0) to start. Then, given the circumspection of the earth (360 degrees, and around 44000 km), and a desired distance between two dots (e.g. 5 km), you can create a string (or ring) of dots at equally spaced distances.

Moving to the next latitude, about 5 km north, we’d need to calculate the number of required dots given that the distance required to go the full 360 degrees will have slightly reduced. I work this backwards. First I create an index table that holds the information on the latitudes that the rings of dots need to be created and I calculate the distance in longitudal degrees for each latitude. Then I have all the information to create the actual grid.


drop table if exists grid_index cascade;
create table grid_index (
id serial primary key,
latt real,
latt_d real, -- latitude geography km distance from 0 degrees
latt_d_57 real, -- latitude geography km distance from 57 degrees
long_d real -- how many degrees is 1 (or other defined) km
);
create index grid_index_latt_idx on grid_index(latt);
create index grid_index_latt_d_idx on grid_index(latt_d);
create index grid_index_long_d_idx on grid_index(long_d);

From point 0,0 on earth, looking towards the North pole, gives me the distance points that are equally spaced at 5000 meters (5km). A more precise routing mesh would require here to set the distance to something smaller, but this will also increase the disk space and computer power required to do everything below. I took 5km as reasonable resolution that still allows for sufficient detail of global altitude, yet manageable with my home system. I don’t need to go the full 90 degrees, since there is no land that high. So I stop at 81 degrees north.

insert into grid_index (latt_d) (
    select y
    from 
    generate_series(0, 
      floor(
       st_distance(
        geography(st_point(0,0)), geography(st_point(0,81)))
       )::int, 5000) as y
);

Do the same, going south. Now I can stop already at -57 degrees, because we do not care about Antarctica.

insert into grid_index (latt_d) (
    select -1*y
    from 
    generate_series(1000, 
   floor(
    st_distance(
     geography(st_point(0,0)), geography(st_point(0,-57)))
   )::int, 5000) as y
);

Then given the distances, I need to get the degrees of these points. I do this by creating a buffer from the origin point with a distance as large as defined in the above. Given the buffer, obtain the maximum altitude point, which is the number I want. However, buffers don’t work nice after some extreme distance. So I take new reference point after 57 degrees

update grid_index set 
 latt = st_ymax(st_buffer(st_point(0,0)::geography,latt_d)::geometry) 
 where latt_d>0 
 and 
  latt_d<=st_distance(st_point(0,0)::geography,st_point(0,57)::geography);
update grid_index set 
 latt = st_ymin(st_buffer(st_point(0,0)::geography,-1*latt_d)::geometry) 
 where latt_d<0;

-- update for equator only
update grid_index set latt = 0 where latt_d = 0;

update grid_index set latt_d_57 = floor(latt_d-st_distance(
 geography(st_point(0,0)), geography(st_point(0,57)))) 
 where 
 latt_d>st_distance(st_point(0,0)::geography,st_point(0,57)::geography);

update grid_index set 
 latt = 
  st_ymax(st_buffer(st_point(0,57)::geography,latt_d_57)::geometry)::real 
 where 
 latt_d>st_distance(st_point(0,0)::geography,st_point(0,57)::geography);

update grid_index set 
 long_d = 
  st_xmax(st_buffer(st_point(0,latt)::geography,5000)::geometry)::real;

The final preparatory step is to realise that we only need coordinates where there is land. The obvious solution is to go ahead with a full set of coordinates and then clip this against geometries of land. However, this turned out to be a very time-consuming and inefficient procedure, because of the huge number of dots in play. So instead, for each latitude that we have defined above, I create a line across the globe and intersect this with the land geometry, so I have vertical lines that only exist where there is land, and exactly overlay the rings of dots that will be created later. Again this is a sort of index table, where each row holds a combination of a latitude and longitudal limits that define the extent (through min and max) of the grid to be created.

-- define min and max long_d for each latt in land for each polygon. 
drop table if exists long_limits;
create table long_limits as(
SELECT 
 latt::real, 
 st_xmin(clipped_geom)::real long_min , 
 st_xmax(clipped_geom)::real long_max
 FROM (
  SELECT 
  latt, 
  gid, 
  (ST_Dump(ST_Intersection(box.geom, land.geom))).geom As clipped_geom
   FROM  (
    select 
     latt , 
     st_setsrid(
       st_makeline(st_point(-180, latt),St_point(180, latt)),
       4326) as geom 
     FROM grid_index ) as box
   INNER JOIN land
    ON ST_Intersects(box.geom, land.geom)
  )  As clipped
);
create index long_limits_latt on long_limits(latt);

Now I’m ready to create rows of x coordinates for each latitude, the grid. This is the first time that I split the code over four processes that I execute separately by opening four windows of the postgres console, based on latitudes. Each process creates its own table, after which the four tables are joined in a view. Since the grid is not my final objective, the objective is a mesh that connects each grid-point, this was my solution to deal with scale and speed issues.

First, prepare the four tables.


drop table if exists grid_1;
create table grid_1 (
    latt  real,
    long  real
);
drop table if exists grid_2;
create table grid_2 (
    latt  real,
    long  real
);
drop table if exists grid_3;
create table grid_3 (
    latt  real,
    long  real
);
drop table if exists grid_4;
create table grid_4 (
    latt  real,
    long  real
);

Then, the code for each table is identical, except for the table reference and the limits of the coordinates that it deals with. This will create 134,495,648 points. In order to make the id’s unique between the tables, set the start of the sequences of the serials manually to 0; 200,000,000; 300,000,000; 400,000,000.

I first create a full ring of dots, after which the relevant ones are filtered out using the long_limits that were defined above.

Table 1

insert into grid_1 (latt, long) (
with idx as (
    select * 
    from grid_index 
    where 
    grid_index.latt<(-57 + (57+81)/4) --the latitude limit for Table 1
    )
select 
 idx.latt, 
 generate_series((-180*10^5)::int, (180*10^5)::int, 
 floor(idx.long_d*10^5)::int)::real/(10^5) as long 
 from idx); -- this creates the longitudal coordiantes, by going from 
            -- -180 degrees to 180 degrees in steps of long_d. The 
            -- multiplication by 10^5 is to avoid computer rouding 
            -- along the way

-- filter the points above to those that are relevant, 
-- these therefore filter out all coordinates that are in oceans
drop table if exists grid_1_filtered;
create table grid_1_filtered as (
 select 
 grid_1.latt, grid_1.long 
 from 
 grid_1 
 join 
 long_limits 
 on (grid_1.latt=long_limits.latt and 
     grid_1.long >= long_limits.long_min and 
     grid_1.long<=long_limits.long_max
 )
);
--here we create the actual geospatial grid points.
alter table grid_1_filtered add column geom GEOMETRY;
update grid_1_filtered set 
  geom = st_setsrid(ST_POINT(long::real, latt::real),4326); 

-- housekeeping
drop table grid_1;
alter table grid_1_filtered rename to grid_1;
create index grid_1_geom on grid_1 using gist(geom);

-- note the adjustment of the pkey sequence, 
-- to allow for unique pkey when the four tables are combined.
create sequence grid_1_pkey_seq start with 1;
alter table grid_1 
 add column pkey INTEGER DEFAULT nextval('grid_1_pkey_seq');
alter sequence grid_1_pkey_seq owned by grid_1.pkey;
create index grid_1_pkey_idx on grid_1(pkey);
alter table grid_1 add constraint grid_1_pkey_chk CHECK(pkey<200000000);

-- adding altitude by looking up the value in dem3 raster table.
alter table grid_1 add column alt real;
update grid_1 set alt = st_value(rast, geom) 
 from dem3 where st_intersects(rast, geom);

This creates the required points starting at 57 degrees south.

A cut out from around Tasmania

A cut out from around Tasmania

Table 2

insert into grid_2 (latt, long) (
with idx as (
    select * from grid_index 
    where 
     grid_index.latt>=(-57 + (57+81)/4) AND 
     grid_index.latt<(-57 + 2*(57+81)/4)
    )
select idx.latt, generate_series((-180*10^5)::int, (180*10^5)::int,
 floor(idx.long_d*10^5)::int)::real/(10^5) as long from idx);

drop table if exists grid_2_filtered;
create table grid_2_filtered as (
 select grid_2.latt, grid_2.long 
 from  
  grid_2 
 join 
  long_limits on 
  (grid_2.latt=long_limits.latt and 
   grid_2.long >= long_limits.long_min and 
   grid_2.long<=long_limits.long_max
 )
);
alter table grid_2_filtered add column geom GEOMETRY;
update grid_2_filtered 
 set geom = st_setsrid(ST_POINT(long::real, latt::real),4326); 
drop table grid_2;
alter table grid_2_filtered rename to grid_2;
create index grid_2_geom on grid_2 using gist(geom);

create sequence grid_2_pkey_seq start with 200000001;
alter table grid_2 
 add column pkey INTEGER DEFAULT nextval('grid_2_pkey_seq');
alter sequence grid_2_pkey_seq owned by grid_2.pkey;
create index grid_2_pkey_idx on grid_2(pkey);
alter table grid_2 
 add constraint grid_2_pkey_chk CHECK(pkey>200000000 and pkey<300000000);
alter table grid_2 
 add column alt real;
update grid_2 
 set alt = st_value(rast, geom) from dem3 where st_intersects(rast, geom);

Table 3

insert into grid_3 (latt, long) (
with idx as (
 select * from grid_index 
 where grid_index.latt>=(-57 + 2*(57+81)/4) AND 
       grid_index.latt<(-57 + 3*(57+81)/4)
 )
select 
 idx.latt, 
 generate_series((-180*10^5)::int, (180*10^5)::int, 
                 floor(idx.long_d*10^5)::int)::real/(10^5) 
                 as long from idx
);

drop table if exists grid_3_filtered;
create table grid_3_filtered as (
 select grid_3.latt, grid_3.long from 
 grid_3 
 join 
 long_limits 
 on (grid_3.latt=long_limits.latt and 
     grid_3.long >= long_limits.long_min and 
     grid_3.long<=long_limits.long_max)
);
alter table grid_3_filtered add column geom GEOMETRY;
update grid_3_filtered 
 set geom = st_setsrid(ST_POINT(long::real, latt::real),4326); 
drop table grid_3;
alter table grid_3_filtered rename to grid_3;
create index grid_3_geom on grid_3 using gist(geom);

create sequence grid_3_pkey_seq start with 300000001;
alter table grid_3 
 add column pkey INTEGER DEFAULT nextval('grid_3_pkey_seq');
alter sequence grid_3_pkey_seq owned by grid_3.pkey;
create index grid_3_pkey_idx on grid_3(pkey);
alter table grid_3 
 add constraint grid_3_pkey_chk CHECK(pkey>300000000 and pkey<400000000);
alter table grid_3 add column alt real;
update grid_3 
 set alt = st_value(rast, geom) from dem3 where st_intersects(rast, geom);

and finally, Table 4

insert into grid_4 (latt, long) (
with idx as (
 select * from grid_index where grid_index.latt>=(-57 + 3*(57+81)/4)
 )
select idx.latt, 
 generate_series(
   (-180*10^5)::int, (180*10^5)::int,floor(idx.long_d*10^5)::int
  )::real/(10^5) as long from idx);

drop table if exists grid_4_filtered;
create table grid_4_filtered as (
 select grid_4.latt, grid_4.long 
 from grid_4 
 join long_limits 
 on (grid_4.latt=long_limits.latt and 
     grid_4.long >= long_limits.long_min and 
     grid_4.long<=long_limits.long_max
 )
);
alter table grid_4_filtered add column geom GEOMETRY;
update grid_4_filtered 
 set geom = st_setsrid(ST_POINT(long::real, latt::real),4326); 
drop table grid_4;
alter table grid_4_filtered rename to grid_4;
create index grid_4_geom on grid_4 using gist(geom);

create sequence grid_4_pkey_seq start with 400000001;
alter table grid_4 
  add column pkey INTEGER DEFAULT nextval('grid_4_pkey_seq');
alter sequence grid_4_pkey_seq owned by grid_4.pkey;
create index grid_4_pkey_idx on grid_4(pkey);
alter table grid_4 add constraint grid_4_pkey_chk CHECK(pkey>400000000);
alter table grid_4 add column alt real;
update grid_4 
 set alt = st_value(rast, geom) 
 from dem3 where st_intersects(rast, geom);

We can check that this is correct, by taking two sequential points and calculate their distance.

-- distance in degrees between two points:
select st_distance(p1.geom, p2.geom) dist 
 from grid_4 as p1, grid_4 as p2 
 where p1.pkey=401316438 and p2.pkey=401316439;
-- in meters
select st_distance(geography(p1.geom), geography(p2.geom)) dist 
 from grid_4 as p1, grid_4 as p2 
 where p1.pkey=401316438 and p2.pkey=401316439;

Combine the four tables in a view.

create view grid as (
select * from grid_1
union
select * from grid_2
union 
select * from grid_3
union 
select * from grid_4);

5 Connect the dots to create a mesh

The idea is to connect each dot with its nearest neighbour. This is fine when keeping latitude constant, but when going up or down it can be tricky. The grid is not rectangular, since we’re dealing with a globe.

First, prepare the mesh table. It’s called o_mesh, since I borrowed the code from somewhere in stackexchange.

drop table if exists o_mesh CASCADE;
create table o_mesh 
(
 -- id BIGSERIAL PRIMARY KEY,
 "source" int,
 target int,
 direction smallint, -- varchar(2),
 geom GEOMETRY('LINESTRING', 4326),
 dist real
);

The following is repeated for each of the four grid tables, by just replacing grid_1 with grid_2, grid_3 and grid_4 sequentially.

The idea is to connect each two points together once. This sounds quite straightforward, but what we have is a large set of dots, with no information on which two dots are closest. Moreover, the dots are only equally spaced longitudinal given a latitude, at 5km. We also know that these rings are 5km spaced from each other in latitude. But these do not line-up exactly. So we know that the maximum distance is sqrt(52+52), but this is in km, where we need it in degrees, which will vary as we move up the globe (i.e. at different latitudes). Therefore, the maximum allowed distance in degrees must be calculated for each latitude separately. (I guess people that understand spherical calculations better than I do, will have a closed form formula for this. I do not.)

We also only want each link once, so we need to limit links to only a quarter of the quadrant, pure east, pure north, and anything north-east. This is implemented at the very end.

INSERT INTO o_mesh (source, target, direction, geom, dist)

with indeks as (
select 
 idx1.*, 
 idx2.latt+0.000001 as next_latt,
 idx3.latt-0.000001 as prev_latt,
 st_distance(st_point(0,idx1.latt), st_point(idx1.long_d,idx2.latt)) 
from
  (
   select row_number() over () id1, * 
   from grid_index order by latt
  ) idx1 
  left join 
  (
   select (row_number() over () -1) id2, * 
   from grid_index order by latt
  ) idx2
  on id1=id2 
  left join 
  (
   select (row_number() over () +1) id3, * 
   from grid_index order by latt
  ) idx3
  on id1=id3
)


SELECT
source.pkey as source, target.pkey as target, (
case
when source.long = target.long AND source.latt < target.latt then 1 --'N'
when source.long < target.long AND source.latt < target.latt then 2 --'NE'
when source.long < target.long AND source.latt = target.latt then 3 --'E'
when source.long < target.long AND source.latt > target.latt then 4 --'SE'
when source.long = target.long AND source.latt > target.latt then 5 --'S'
when source.long > target.long AND source.latt > target.latt then 6 --'SW'
when source.long > target.long AND source.latt = target.latt then 7 --'W'
when source.long > target.long AND source.latt < target.latt then 8 --'NW'

else 9 --'ER'

end ) as direction,

-- the line that represent the connection
st_setsrid(
 ST_MakeLine(
  source.geom::GEOMETRY, target.geom::GEOMETRY
 ), 4326) AS geom,

-- the distance in meters between two points
st_distance(geography(source.geom), geography(target.geom)) AS dist

FROM

-- distance should match what is made in the grid.
(select 
grid_1.*, dist_limit, next_latt, prev_latt
 FROM 
 grid_1 
 left join 
 indeks on grid_1.latt=indeks.latt order by grid_1.latt 
) AS source 
JOIN 
 (select *, FROM grid_1) AS target 
ON 
-- one way links only
(
  target.long > source.long  
  AND target.latt between (source.prev_latt) and (source.next_latt) 
  OR
  (target.long = source.long
   AND target.latt > source.latt AND target.latt <= next_latt)
) 
and ST_DWithin(target.geom, source.geom, dist_limit);
A cut out from around Tasmania

A cut out from around Tasmania

The preceding, when done for each of the 4 grid tables, neglects that we need links between them as well. I fix this separately. The reason that I did not do the previous using the view has probably something to do with the fact that the indexes on the view do not work well, and having one large table with all grid points was also unwieldy.

Note that while I create an actual geospatial line that I can plot in a map, this is not required for any of the calculation. The table is nothing more than an index that indicates which two nodes (grid points) are connected, with added information on the ‘cost’ to traverse the length. This cost can be what ever we want it to be, geographical distance, time, or anything else.

-- insert where connections overlap partial grid tables.
create index grid_1_latt_idx on grid_1(latt);
create index grid_2_latt_idx on grid_2(latt);
create index grid_3_latt_idx on grid_3(latt);
create index grid_4_latt_idx on grid_4(latt);

-- fixing missing links. this is slower than it should be. there are just 
-- 64772 grid points, but the 'with'-query probably looses the index. 
-- maybe a temporary table with index on geom would be better.
drop table if exists border_grid;
create table border_grid as (
    select * from grid_1 where latt= (select max(latt) from grid_1)
    union
    select * from grid_2 
    where 
    latt= (select min(latt) from grid_2 ) 
     OR 
    latt= (select max(latt) from grid_2)
    union
    select * from grid_3 
    where 
     latt= (select min(latt) from grid_3 )
      OR 
     latt= (select max(latt) from grid_3)
    union
    select * from grid_4 
    where latt= (select min(latt) from grid_4 )
);
create index border_grid_geom_idx on border_grid using gist(geom); 


with indeks as (
select 
 idx1.*, 
 idx2.latt+0.000001 as next_latt, 
 idx3.latt-0.000001 as prev_latt, 
 st_distance(
  st_point(0,idx1.latt), st_point(idx1.long_d,idx2.latt)
 ) dist_limit 
 from 
  (select row_number() over () id1, * 
   from grid_index order by latt) as idx1 
  left join 
  (select (row_number() over () -1) id2, * 
   from grid_index order by latt) as idx2 
  on id1=id2 
  left join 
  (select (row_number() over ()+1) id3, * 
  from grid_index order by latt) as idx3 
  on id1=id3
)


INSERT INTO o_mesh (source, target, direction, geom, dist)
SELECT
source.pkey as source, target.pkey as target, (
case
when source.long = target.long AND source.latt < target.latt then 1 --'N'
when source.long < target.long AND source.latt < target.latt then 2 --'NE'
when source.long < target.long AND source.latt = target.latt then 3 --'E'
when source.long < target.long AND source.latt > target.latt then 4 --'SE'
when source.long = target.long AND source.latt > target.latt then 5 --'S'
when source.long > target.long AND source.latt > target.latt then 6 --'SW'
when source.long > target.long AND source.latt = target.latt then 7 --'W'
when source.long > target.long AND source.latt < target.latt then 8 --'NW'

else 9 --'ER'

end ) as direction,

st_setsrid(
 ST_MakeLine(source.geom::GEOMETRY, 
 target.geom::GEOMETRY), 4326) AS geom,

st_distance(geography(source.geom), geography(target.geom)) AS dist

FROM

-- distance should match what is made in the grid.
(select border_grid.*, dist_limit, next_latt , prev_latt
 FROM border_grid left join indeks 
 on border_grid.latt=indeks.latt order by border_grid.latt
) AS source 
JOIN
 (select * FROM border_grid) AS target 
ON 
-- one way links only
(
  target.long > source.long  
  AND target.latt between (source.prev_latt) and (source.next_latt) 
  OR
  (target.long = source.long
   AND target.latt > source.latt AND target.latt <= next_latt)
) 
and ST_DWithin(target.geom, source.geom, dist_limit);

drop table border_grid;

Housekeeping.

-- vacuum after entering so much data
VACUUM ANALYZE;

-- creating indices: do this after insertions.
CREATE INDEX o_mesh_source_idx ON o_mesh("source");
CREATE INDEX o_mesh_target_idx ON o_mesh("target");


-- CREATE INDEX o_mesh_geog_idx ON o_mesh using GIST(geog);
CREATE INDEX o_mesh_geom_idx ON o_mesh using GIST(geom);
alter table o_mesh add column pkey serial primary key;

-- vacuum after entering so much data
VACUUM ANALYZE;

6 Walking cost

We already have the distance between each two points, around 5km. Next is to calculate the walking time it requires to go from one point to the other. For this we take into account the altitude difference, and the direction.

First prepare the mesh table for the additional information.


alter table o_mesh add column cost_elev real;
alter table o_mesh add column source_alt real; --altitude source
alter table o_mesh add column target_alt real; -- altitude origin
alter table o_mesh add column angle real; -- angle
alter table o_mesh add column hours real; -- travel hours
alter table o_mesh add column rev_hours real; -- " reverse direction
alter table o_mesh add column "type" smallint; -- land, water, cross-over

Add information on the altitude.

update o_mesh set source_alt = alt 
 from grid where o_mesh.source=grid.pkey;
update o_mesh set target_alt = alt 
 from grid where o_mesh.target=grid.pkey;

I use Naismith’s rule https://en.wikipedia.org/wiki/Naismith%27s_rule with additions, which states the following - Allow 1 hour for every 5 kilometres (3.1 mi) forward, plus 1 hour for every 600 metres (2,000 ft) of ascent. - Aitken - Langmuir corrections – Subtract 10 minutes for every 300 meters of descent for slopes between 5 degrees and 12 degrees – add 10 minutes for every 300 meters of descent for slopes greater than 12 degrees.

Actually, this is a proper function: \[\text{velocity} = 6 * \exp ( -3.5 * | \tan (angle)+0.05|)\]

I add to this that the angle cannot be more than 12 degrees. If the slope is higher than that, a walker will lengthen its path to reach destination; making z-shapes to ascend a mountain.

So, the length of slope = sqrt(geodesic length^2 + height^2) =height/sin(angle) If angle>0.12, geodesic distance: height/tan(0.12), slope length: height/sin(0.12)

So, then cost = distance / speed = (height/sin(angle)) / (1000 * 6EXP(-3.5ABS(TAN(angle)+0.05)))

To implement this I distinguish 3 cases, no angle (flat surface), inclined surfaces up to 12 degrees, and inclination over 12 degrees. The the whole thing needs repetition for when the surface slopes down.

-- note transformation from radients to degrees and back.
update o_mesh 
 set angle = atan2((target_alt-source_alt),dist) 
 where pkey>=0E6 and pkey<4E6; -- determines range for processes

update o_mesh set hours = (
 case
 when angle=0. THEN dist/ 5036.742
 when abs(angle) <= 12.0/180.0*pi() 
  THEN (target_alt-source_alt)/sin(angle) / 
       (1000.0 * 6.0*EXP(-3.5*ABS(TAN(angle)+0.05)))
 else 
  (abs(target_alt-source_alt)/sin(12.0/180.0*pi())) / 
  (1000.0 * 6.0*EXP(-3.5*ABS(TAN(sign(angle)*12.0/180.0*pi())+0.05)))
 end
 ),
 rev_hours = ( -- only difference is the minus in the dominator 
               -- adjusting speed, distance is the same up or down.
 case
 when angle = 0 THEN dist/ 5036.742
 when abs(angle) <= 12.0/180.0*pi() 
  THEN (target_alt-source_alt)/sin(angle) / 
  (1000.0 * 6.0*EXP(-3.5*ABS(TAN(-1*angle)+0.05))) 
    -- alternative: (dist/1000)/COS(angle) 
 else (abs(target_alt-source_alt)/sin(12.0/180.0*pi())) / 
  (1000.0 * 6.0*EXP(-3.5*ABS(TAN(sign(angle)*-1*12.0/180.0*pi())+0.05)))
 end
 ),
 "type" = 1 where pkey>=0 and pkey<4E6; -- needs to be same as above
 
-- cleaning up the table
delete from o_mesh where 
source is null or 
target is null or
source_alt is null or 
target_alt is null or
dist is null;  

Next is to add time costs of crossing water-bodies, like rivers. The arbitrary assumption is 3 hours, which is quite a lot when recognising that the normal distance between two neighbouring points on a flat surface would be crossed in around 1 hour.

First prepare a table, tw for traversing water.

create table upd_tw (pkey int);

Then split the following over multiple processes by pkey to speed up the insertions.

insert into upd_tw
select pkey from o_mesh where exists(select 1 from 
 (select gid, geom from rivers
   union all
   select gid, geom from lakes
  ) as water
 where st_intersects(o_mesh.geom, water.geom))
 and pkey>=0E6 and pkey<2E6; -- determines range for processes

create index upd_tw_pkey on upd_tw(pkey);

Then update the costs in one go.

begin;
set local work_mem = '5000 MB';

-- create new table with new column filled with default values
create table o_mesh_new as 
 select source, target, direction, geom, dist, 
        pkey, cost_elev, source_alt, target_alt, 
        angle hours, rev_hours, 1 as "type", 
        0 as travers_waters, hours as hours_tw, 
        rev_hours as rev_hours_tw 
 from o_mesh;
create index o_mesh_new_pkey on o_mesh_new(pkey);

-- update new columns where necessary
-- set the crossing time for lakes and rivers here as original hours + X
update o_mesh_new set 
  travers_waters=1,
  hours_tw=hours + 3,
  rev_hours_tw=rev_hours + 3 
   where pkey in (select pkey from upd_tw);

-- replace old table with the new
drop table o_mesh;
alter table o_mesh_new rename to o_mesh;

-- recreate primary key
alter table o_mesh 
  drop column pkey, 
  add column pkey bigserial primary key;

--alter table o_mesh add constraint 
create index o_mesh_geom_idx on o_mesh using gist(geom);
create index o_mesh_source_idx on o_mesh(source);
create index o_mesh_target_idx on o_mesh(target);

end;

Housekeeping

vacuum full analyze o_mesh;

Finally, it is quite straightforward to extent the cost travelling between any two points with additional information. Say, with information from a raster on specific geology zones (jungle, desert, savannah, steppe, etc.), one could add this information by associating such data to the mesh, and adding some multiplication constant to the hours for each type of zone.

7 Water

Adding a water-based network is to a large extent identical to the land based network. There is one major simplification: dots are not spaced 5km apart but 50km. That reduces the size of the network substantially, and therefore allows it to be created quite a bit faster as well. A complication is that the date line (where -180 meets +180) needs to be dealt with structurally. This was sidestepped in the land mesh, but will still have to solved later.

The relevant grid_index_water is just a subset from grid_index, because we do not use every 5km, but every 50km. the mod function is useful for this.

drop view if exists grid_index_water;
create view grid_index_water as (
 select id, latt, latt_d latt_d_57,
  st_xmax(st_buffer(st_point(0,latt)::geography,50000)::geometry)::real   
   as long_d 
  from grid_index 
  where 
   mod(latt_d::int,50000::int)=0 or 
   mod(-latt_d::int,50000::int)=1000
);

The straightforward solution would be to create all the points. Cut out those that overlap land. In this case this is feasible, because the number of points that we deal with is much smaller. An alternative solution would have followed the one for land.

The water grid:

drop table if exists grid_water;
create table grid_water (
    latt  real,
    long  real
);
insert into grid_water (latt, long) (
with idx as (
    select * from grid_index_water 
     where grid_index_water.latt>-57 and grid_index_water.latt<81
    )
select idx.latt, generate_series((-180*10^5)::int, (180*10^5)::int,
       floor(idx.long_d*10^5)::int)::real/(10^5) as long 
 from idx);
South and Western Europe

South and Western Europe

Adding manually some points in Sea of Marmara, to facilitate navigation from Mediterranean to the Black Sea. This issue comes up because of the low resolution, but I don’t want to increase the resolution for the globe for this small water body.

insert into grid_water(latt, long) 
 VALUES (40.889, 27.624), (40.889, 28.319), (40.889, 28.984);

Then create all points, and delete those that cross land.

alter table grid_water add column geom GEOMETRY('POINT', 4326);
update grid_water 
 set geom = st_setsrid(ST_POINT(long::real, latt::real),4326); 
create index grid_water_geom on grid_water using gist(geom);
delete from grid_water using land 
 where st_intersects(grid_water.geom, land.geom);
-- this is slow because we're comparing each of 69k rows with 133k rows.
alter table grid_water add column pkey serial primary key;
South and Western Europe

South and Western Europe

Additionally, I would like to know which dots are within 200 km of a coastline. This 200 km is used in the analysis to make a distinction between crossing small water bodies, such as the Mediterranean or connect islands, but prohibit ocean crossings. These buffers will interfere with the date-line, so we need to deal with that separately.

Starting from the longitudinal limits we created for land, take 200km buffers around these, with spatial union where necessary and use this to identify all water dots within 200km of a coast line.

drop table if exists land_buffers;
create table land_buffers as (
with points_1 as (
select latt, long_min, long_max, 
   (st_dump(
     st_split(
      st_buffer(
       st_point(long_min, latt)::geography, 200000
      )::geometry('POLYGON', 4326),
      st_setsrid(st_makeline(st_point(-174, -57),
                             st_point(-174,  81)), 4326)
     )
   )).geom p1_geom, 
   (st_dump(
     st_split(
      st_buffer(
       st_point(long_max, latt)::geography, 200000
      )::geometry('POLYGON', 4326),
      st_setsrid(st_makeline(st_point(174, -57),
                             st_point(174,  81)), 4326)
     ) 
    )).geom p2_geom from long_limits
)
    select latt, long_min long, 
           p1_geom::GEOMETRY('POLYGON', 4326) geom 
    from points_1
    union
    select latt, long_min long, 
           p2_geom::GEOMETRY('POLYGON', 4326) geom 
    from points_1
);
alter table land_buffers add column id serial primary key;
create index land_buffers_geom on land_buffers using gist(geom);
-- identify buffers crossing dateline
alter table land_buffers add column dl int;

with points as (
select row_number() over (PARTITION BY id) r, * 
from (
select id, (st_dumppoints(st_envelope(geom))).geom 
 from land_buffers lb
 where 
 (lb.geom &&
  st_setsrid(st_makeline(ST_MakePoint(174,90), 
                         ST_MakePoint(174,-90)),4326)
 )
 or
 (lb.geom &&
  st_setsrid(st_makeline(ST_MakePoint(-174,90),
                         ST_MakePoint(-174,-90)),4326)
 )
) foo),
points_dl as (select 
 p1.id, 
 st_astext(p1.geom) as geom_bl, 
 sign(st_x(p1.geom)) sign, 
 st_astext(p2.geom) as geom_tr, 
 sign(st_x(p2.geom)) sign 
 from points p1 join points p2 on (p1.id=p2.id and p1.r=1 and p2.r=3) 
 where  sign(st_x(p1.geom))=-1*sign(st_x(p2.geom)) 
)
update land_buffers set dl = 1 
 from points_dl where points_dl.id = land_buffers.id; 
update land_buffers set dl = 0 where dl is null;
South and Western Europe

South and Western Europe

Then if a point falls in the buffer, it should be denoted as such, using the indicator is_lb (is land border).

-- intersects land_buffers
alter table grid_water add column is_lb int;
update grid_water set is_lb = foo.is_lb 
 from 
    (select g.pkey, 
            bool_or(st_intersects(g.geom,lb.geom))::int as is_lb 
     from grid_water g, land_buffers lb 
     where g.geom && lb.geom and lb.dl=0 group by g.pkey) as foo 
 where grid_water.pkey = foo.pkey;
South and Western Europe

South and Western Europe

While this works well generally, around the +/-180 degrees longitude this is a bit of mess. So, we use a the geography method there, and work specifically on all points in the 174 to 180 and -174 to -180 degrees range.

-- grid points to be added back
drop table if exists grid_water_dl;
create table grid_water_dl as (
 select * from grid_water where (long<-174.0 or long > 174.0) 
);
create index grid_water_dl_geom_idx on grid_water_dl using gist(geom);

Check distance of each point from land for within 200km?

alter table grid_water_dl add column is_lb int;
update grid_water_dl set is_lb = chk 
 from (
  select g.pkey, bool_or(st_dwithin(geography(g.geom), 
         geography(land.geom), 200000))::int chk 
  from grid_water_dl g, land group by pkey) as foo 
 where grid_water_dl.pkey = foo.pkey; 

Update points in grid_water with dateline fix

update grid_water 
 set is_lb = 1 where pkey in (
  select pkey from grid_water_dl where is_lb=1
 );
Alaska

Alaska

Alaska

Alaska

Then creating the mesh is the same as before, but I allow for more cross-connections. The reason is a combination of the larger resolution, which would create some large shipping detours, and the constraint of winds, which cannot be directed against at certain angels. So I need to allow for more flexibility.

drop table o_mesh_water;
create table o_mesh_water (
 "source" int,
 target int,
 direction smallint, -- varchar(2),
 geom GEOMETRY('LINESTRING', 4326),
 dist real
);

INSERT INTO o_mesh_water (source, target, direction, geom, dist)

-- grid_index_water with next and previous latt value
with indeks as (
select 
 idx1.*, 
 idx2.latt+0.000001 as next_latt,
 idx3.latt-0.000001 as prev_latt,
 st_distance(st_point(0,idx1.latt), 
             st_point(2*idx1.long_d,idx2.latt)) dist_limit 
from 
  (select  row_number() over (order by latt) id1, * 
   from grid_index_water ) idx1 
  left join 
  (select (row_number() over (order by latt) -1) id2, * 
  from grid_index_water ) idx2 
  on id1=id2 
  left join 
  (select (row_number() over (order by latt) +1) id3, * 
   from grid_index_water ) idx3 
  on id1=id3
)

SELECT
source.pkey as source, target.pkey as target, (
case
when source.long = target.long AND source.latt < target.latt then 1 --'N'
when source.long < target.long AND source.latt < target.latt then 2 --'NE'
when source.long < target.long AND source.latt = target.latt then 3 --'E'
when source.long < target.long AND source.latt > target.latt then 4 --'SE'
when source.long = target.long AND source.latt > target.latt then 5 --'S'
when source.long > target.long AND source.latt > target.latt then 6 --'SW'
when source.long > target.long AND source.latt = target.latt then 7 --'W'
when source.long > target.long AND source.latt < target.latt then 8 --'NW'

else 9 --'ER'

end ) as direction,

st_setsrid(ST_MakeLine(source.geom::GEOMETRY, 
                       target.geom::GEOMETRY), 4326) AS geom,

st_distance(geography(source.geom), geography(target.geom)) AS dist

FROM

-- distance should match what is made in the grid.
(select grid_water.*, dist_limit, next_latt , prev_latt
FROM 
 grid_water 
 left join 
 indeks 
 on grid_water.latt=indeks.latt 
 order by grid_water.latt 
) AS source 
JOIN 
 (select * FROM grid_water ) AS target 
ON 
-- one way links only
(
  ( target.long > source.long  -- next long, up or down
    AND target.latt between (source.prev_latt) and (source.next_latt)
    AND target.latt != source.latt 
    AND ST_DWithin(target.geom, source.geom, dist_limit)
  )
  OR
  ( target.long > source.long  -- next long straight
    AND target.latt = source.latt 
    AND ST_DWithin(target.geom, source.geom, dist_limit/2)
  )
  OR
  (target.long = source.long -- same long, up or down
   AND target.latt > source.latt AND target.latt <= next_latt
   AND ST_DWithin(target.geom, source.geom, dist_limit))
);

This creates a nice mesh. However, there are also some issues, as are apparent in the yellow lines on the map. Some lines cut across way more land than I’m comfortable with to ignore. However, I did not want to remove all lines that cross land, because that would remove too many lines. So I ended up just selecting those lines manually in QGIS, copying them to a separate layer, and removing them from the original. Not all is bad, however. Around Gibraltar the Atlantic is beautifully connected with the Mediterranean. But in other parts of the world additional links should be required. For instance, around Istanbul, and Denmark.

Southern Europe, North Africa

Southern Europe, North Africa

Black Sea

Black Sea

Denmark and Kattegat

Denmark and Kattegat

This is fixed manually in the following. First I copied the crossing lines to a new shapefile in QGIS, my manual selection. Then these lines are deleted using a simple spatial match on the original. This shapefile that includes these is provided at http://www.wnvermeulen.com/empires/shp/o_mesh_water_rm.zip

Load that first (in bash console)

shp2pgsql -I -s 4326 -W "LATIN1" <path>/o_mesh_water_rm.shp  \
    o_mesh_water_rm | psql -d empires -U wessel

Then remove from original table using a spatial match.

delete from o_mesh_water o using o_mesh_water_rm rm 
 where st_within(o.geom, rm.geom);

Next, I add manually connections that I want to exist but did not appear due to the constraints set in the algorithm above.


-- additional connections to connect see of marmara with black sea

insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
3 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
         ST_POINT(27.624::real, 40.889::real), 4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
         ST_POINT(28.319::real, 40.889::real),4326) limit 1) t;

insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
4 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
            ST_POINT(27.624::real, 40.889::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
            ST_POINT(27.758::real, 40.548::real),4326) limit 1) t;

insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
4 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(27.624::real, 40.889::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.351::real, 40.548::real),4326) limit 1) t;


insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
3 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.319::real, 40.889::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.984::real, 40.889::real),4326) limit 1) t;


insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
4 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.319::real, 40.889::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.351::real, 40.548::real),4326) limit 1) t;


insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
2 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.984::real, 40.889::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(29.406::real, 41.481::real),4326) limit 1) t;

insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
2 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.984::real, 40.889::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(28.810::real, 41.481::real),4326) limit 1) t;

-- baltic sea with north sea
insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
2 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(10.813::real, 55.420::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(11.119::real, 56.289::real),4326) limit 1) t;

insert into o_mesh_water (source, target, direction, geom, dist)
select source, target,
2 as direction, 
st_setsrid(ST_MakeLine(s.the_geom::GEOMETRY, 
                       t.the_geom::GEOMETRY), 4326) AS geom,
st_distance(geography(s.the_geom), 
            geography(t.the_geom)) AS dist
from 
(select id as source, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(10.813::real, 55.420::real),4326) limit 1) s,
(select id as target, the_geom from o_mesh_water_vertices_pgr o 
order by  o.the_geom <#> st_setsrid(
           ST_POINT(11.119::real, 56.289::real),4326) limit 1) t;

To keep track of the different ‘modes’ of traveling later, we create a new integer column where each number stands for a mode: 1=land, 2=water, 3=land-water.

alter table o_mesh_water add column "type" smallint;

Some housekeeping and other preparations

CREATE INDEX o_mesh_water_geom_idx ON o_mesh_water using GIST(geom);
alter table o_mesh_water add column pkey serial primary key;

The following command is a preparation for the routing commands. It creates new tables of nodes, called o_mesh_water_vertices_pgr.

select pgr_createTopology('o_mesh_water', 0.0001, 'geom', 'pkey');

Here I add the indicator on the dots within 200km of a coast.

alter table o_mesh_water alter column within200km set default FALSE;

update o_mesh_water set within200km = TRUE from
  (select o.id from o_mesh_water_vertices_pgr o , grid_water gw
  where o.the_geom=gw.geom and gw.is_lb=1) idx
where o_mesh_water.source=idx.id OR o_mesh_water.target=idx.id;

7.1 Winds

For the cost of sailing we require two pieces of information. wind direction and speed across the oceans, and information on sailing speed given the wind and sailing direction.

The wind information originally comes from the QuickSCAT satellites: https://podaac-opendap.jpl.nasa.gov/opendap/allData/quikscat/L3/wind_1deg_1mo/ We used the file: sfcWind_QuikSCAT_L2B_v20110531_199908-200910.nc. This holds daily data from 1999-2009. Assuming that sailing predominantly occurs in certain months of the year we take the average speed and direction for these months and this is the information that we work with. The assumption is similar to Pascali (2017)1

The second piece of information also comes from Pascali (2017), but the working paper version, which documents polar diagram of typical sailing ship that gives the speed of sailing given the wind speed and the directions the wind is coming from relative to the direction the ship is taking.

Both files are attached here: <windsSFC.csv>, <polardiag_vec.csv>

drop table if exists winds;
create table winds (
sfcWind numeric, lat numeric,   lon numeric, vas numeric,   
uas numeric, direction numeric, speed2 numeric
);
\copy winds FROM 'windsSFC.csv' DELIMITERS ',' CSV HEADER
alter table winds 
 add column geom GEOMETRY('POINT', 4326), 
 add id SERIAL PRIMARY KEY;
update winds set geom = st_setsrid(st_point(lon, lat), 4326);
create index winds_geom on winds using gist(geom);
alter table winds add column speed numeric;
update winds set speed = speed2*3.6; -- from m/s to km/h

drop table if exists polar;
create table polar (
 angle numeric, windspeed_raw numeric, sailspeed numeric
);
-- speeds as kmh = knots * 1.852001
\copy polar FROM 'polardiag_vec.csv' DELIMITERS ',' CSV HEADER
alter table polar add id SERIAL PRIMARY KEY;
alter table polar add column windspeed numeric;
insert into polar 
 select 180+(180-angle) angle, windspeed_raw, sailspeed 
 from polar where angle!=0;
update polar set windspeed = round(ceiling(windspeed_raw)/6)*6;
create index polar_angle_speed on polar(angle, windspeed);

First things first. The winds data, with some rendering in QGIS already allows for some awesome figures.

North Atlantic

North Atlantic