Book Review: The ESRI Guide to GIS Analysis Vol. 2: Spatial Measurements & Statistics

Title: The ESRI Guide to GIS Analysis Vol. 2: Spatial Measurements & Statistics
Author: Andy Mitchell
Publisher: ESRI Press
Year: 2005
Aimed at: GIS/Analysts/Map Designers – intermediate
Purchased from: www.wordery.com

ESRI GA V2

This textbook acts as companion text for GIS Tutorial 2: Spatial Analysis Workbook (for ArcGIS 10.3.x) where you can match up the chapters in each book. Although not a necessity, I would recommend using both texts in tandem to apply the theory and methods discussed with practical tutorials and walkthroughs using ArcGIS. This is the second book of the series and follows on from The ESRI Guide to GIS Analysis Volume 1: Geographic Patterns & Relationships.

The first chapter is, inevitably, an introduction to spatial measurements and statistics. You perform analysis to answer questions and to answer these questions you not only need data but you also need to understand the data. Are you using nominal, ordinal, interval or ratio values, or a combination of these? The type of value(s) will shape the analysis techniques and methods used to calculate the statistics. You will need to interpret the statistics, test their significance and question the results. These elements are briefly visited with the premise of getting more in-depth as the book progresses. The chapter ends with a section on ‘Understanding data distributions’ which is essentially a brief introduction to data exploratory techniques such as describing frequency distributions, spatial distributions, and the presence of outliers and how they can affect analysis.

Chapter 2 discusses measuring geographic distributions with the bulk of the chapter focused on finding the center (mean, meridian, central feature), and measuring compactness (standard distance), orientation and direction of distributions (spatial trends). These are discussed for points, lines, and areal features and also using weighted factors based on attributes. These are useful for adding statistical confidence to patterns derived from a map. Formulas and equations begin to surface and although not necessary to learn them off by heart, because the GIS does all the heavy lifting for you, it gives insight into what goes on under the hood, and knowing the underlying theory and formulas can often aid in troubleshooting and producing accurate analysis. The last section of this chapter is fundamental to the rest of the text, testing statistical significance. This allows you to measure a confidence level for your analysis using the null hypothesis, p-value, and z-score. This can be a difficult topic to comprehend and may require further reading.

The third chapter, a lengthy one, is based around using statistical analysis to identify patterns, to enhance and backup the visual analysis of the map with confidence or to find patterns not may not have been immediately obvious. The human eye will often see patterns that do not really exist, so alternatively, statistical analysis might indicate what you thought was a strong pattern was actually quite weak. The statistical analysis methods are beginning to heat up and here we are introduced to; the Kolmorogov-Smirnov test and Chi Square test for quadrat analysis in identifying patterns in areas of equal size; the nearest neighbour index for calculating the average distance between features and identifying clustering or dispersion; and the K-function as an alternative to the nearest neighbour index, each used to measure the pattern of feature locations. These are followed by measuring the spatial pattern of feature values using; the join count statistic for areas with categories; Geary’s c and Moran’s I for measuring the similarity of nearby features, and the General-G statistic for measuring the concentration of high and low values for features having continuous values. The formulas for each are presented along with testing the significance of and interpreting the results. The final section of this chapter discusses defining spatial neighbourhoods and weights when analysing patterns. There are a few things to consider such as local or regional influences, thresholds of influence, interaction between adjacent features, and the rate of regional decline of influence.

Chapter 4 is titled ‘Identifying Clusters’ with a main focus on hotspot analysis. First, we are introduced to nearest neighbour hierarchical clustering which is heavily used in crime analysis. While Chapter 3 discussed global methods for identifying patterns and returns a single statistic, this chapter focuses on local statistics to show where these patterns exist within the global setting. Geary’s c and Moran’s I both have local versions and their definition, implementation, and factors influencing the results are discussed and critiqued along with Art Getis’ and Keith Ord’s Gi* method for identifying hot and cold spots.While the methods in Chapter 3 enforced that there are patterns in the data (or not), the methods in Chapter 4 highlight where these clustered patterns are. The last section of Chapter 4 discusses using statistics with geographic data; how the very nature of geographic data affects your analysis, how geographic data is represented in a GIS affects your data analysis, the influence of the study area boundary, and GIS data and errors.

“To the extent you’re confident in the quality of your GIS data, you can be confident in the quality of your analysis results.”

The last chapter ventures away from identifying patterns and clusters and focuses on analysing geographic relationships and using statistics to analyse such. Geographic relationships and processes are used to predict where something is likely to occur and examining why things occur where they do. Chapter 5 looks at statistical methods for identifying geographical relationships with a Pearson’s correlation coefficient and Spearman’s correlation coefficient discussed and assessed. Linear regression (ordinary least squares), and geographically weighted regression are presented as methods for analysing geographic processes. These methods warrant a full text in their own right and there is a list of further reading available at the end of the chapter.

Overall Verdict: I feel that I will be referring back to this text a lot. Having recently completed a MSc in Geocomputation I wish that this had crossed my path during the course of my studies and I would highly recommend this book to anyone venturing into spatial analysis where statistics can aid and back up the analysis. Although they are littered throughout the chapters, you really do not need to get bogged down with the formulas behind the statistical analysis techniques, the most important points is that you understand what the methods are performing, their limitations, and how to assess the results and this book really is a fantastic reference for doing just that. Knowing the theory is a huge step to being able to apply the analysis techniques confidently and derive accurate reporting of your data.

Book Review: The ESRI Guide to GIS Analysis Vol. 1: Geographic Patterns & Relationships

Title: The ESRI Guide to GIS Analysis Vol. 1: Geographic Patterns & Relationships
Author: Andy Mitchell
Publisher: ESRI Press
Year: 1999
Aimed at: GIS/Analysts/Map Designers – beginner
Purchased from: www.wordery.com

GIS Analysis Vol 1

This textbook is a companion text for GIS Tutorial 2: Spatial Analysis Workbook (for ArcGIS 10.3.x) where you can match up the chapters in each book. Although not a necessity, I would recommend using both texts in tandem to apply the theory and methods discussed with practical tutorials and walkthroughs using ArcGIS.

The title of this book might lead you to believe that ArcGIS will feature heavily throughout the text but Michael F. Goodchild sets this straight in the Preface by stating that he applauds ESRI for backing this book even though it isn’t Arc eccentric. The author, Andy Mitchell, presents the material as generic GIS such that most GIS software packages should be able to utilise the techniques discussed.

Chapter 1 is a short introduction to what GIS analysis is, understanding the representation of geographic features in a GIS, and the common attributes associated with geographic features that allow for analysis. The wording is simplistic in nature and easy to follow, and acts as a good entrance to the rest of the book.

The second chapter begins to delve into the realm of visual analysis, using your brain to to discern patterns for a better understanding of the data and the area that you are mapping. Several real-life mapped examples are displayed to show how ‘mapping where things are’ aids in more focused decision making. The chapter steps through; deciding what to map, preparing your data, and making your map, with comparison figures to show you why you might perform such tasks.

Why map the most and least? Because mapping features based on quantities adds an additional level of information beyond simply mapping the locations of the features and this notion is made clear from providing some real-life examples in Chapter 3. The author then takes us down a path to understanding quantities and the importance of knowing the type of quantities that you are mapping, and this naturally leads onto the next topic of classification, why use classes? and choosing an appropriate classification method/scheme for the purpose of your data. It is important to understand how classification methods such as Natural Breaks (Jenk’s), Quantile, Equal Interval, and Standard Deviation classify your data and having a general guideline on choosing the appropriate method.

A great recurring aspect in this book is that every chapter begins with a question and Chapter 4’s is ‘Why Map Density?’ and then proceeds to answer the question and the methods available for mapping in a GIS. This chapter discusses density for defined areas, dot density mapping, and density surfaces, what the GIS does to create them and the results of the output.

The fifth chapter takes a look at mapping what’s inside an area, discusses why you would want to map inside an area?, and some analysis and results that can be derived from such. Do you need to map a single area to find what’s happening inside or multiple areas to analyse what’s happening inside each for comparison purposes? Methods are explained along with how the GIS performs these for analysis. You might want to find out if a certain feature is within an area, a list of all features inside an area and a count of each, or the sum of a designated land type area within a boundary for examples. Summaries and statistics can also be generated from what is found inside an area boundary.

Having assessed some simple techniques for mapping what’s inside an area, the next chapter casts it’s attention towards finding what’s nearby. People often think of nearness in straight lines or along transport networks, but GIS is also useful for travel cost analysis giving weight to different land use or soil types for example when considering the path for a pipeline. Nearness by straight-line distance, distance/cost over a network, and cost over a geographic surface are discussed in detail. At this point we are venturing into understanding some of the concepts behind Network Analysis.

The last chapter looks at mapping change with regards to change over time for time pattern analysis. Three ways of mapping change are presented; creating a time series, creating a tracking map, and measuring change, along with the considerations required when creating each type for change in discrete features, events, summarized areas, and continuous categories and values.

Following the last chapter there are some recommendations for some further reading.

Overall Verdict: The perfect companion for a GIS student embarking on their geospatial educational quest. The theory behind GIS is essential for accurate analysis and troubleshooting. This book is an easy read with a plethora of figures and maps utilised in real-life situations found in each chapter to aid in the experience. Although getting closer to being two decades old this text stands the test of time and acts as a solid base for a foundation in simple analysis using a GIS to find patterns and relationships.

The only shortcoming of a text of this nature is that you cannot see how methods and techniques discussed are performed in a GIS. This is where the companion text GIS Tutorial 2: Spatial Analysis Workbook (for ArcGIS 10.3.x) comes in and aids in providing walkthroughs to further enhance your understanding of the underlying theory.

Next: see The ESRI Guide to GIS Analysis Volume 2: Spatial Measurements & Statistics

Creating a Density Heat Map with Leaflet

A Heat Map is a way of representing the density or intensity value of point data by assigning a colour gradient to a raster where the cell colour is based on clustering of points or an intensity value. The colour gradient usually ranges from cool/cold colours such as hues of blue, to warmer/hot colours such as yellow, orange and hot red. When mapping the density of points that represent the occurrence of an event, a raster cell is coloured red when many points are in close proximity and blue when points are dispersed (if using the colour range above). Therefore, the higher the concentration of points in an area the greater the ‘heat’ applied to the raster. When mapping intensity, points are assigned an intensity value, the higher the value the hotter the colour assigned to the raster, with lower intensity values assigned cooler colours. Here, we will look at mapping density.

I have created a Python script that creates 250 randomized points in a localised area and 250 in a larger region covering Co. Kildare in Ireland. The script is at the bottom of the post. The output from the script is a GeoJSON file with the point data and also a JavaScript file with the point data. The reason for two files is that GeoJSON coordinates are long, lat while Leaflet points are lat,long. The JavaScript file will be used to create the heat map raster while the GeoJSON file will be used to add the points to the map as a vector.

Unrealistic Scenario: A mad scientist has cloned the Great Irish Elk from the DNA of a carcass and bred 500 of them, the animals grew smarter and escaped. They are beginning to get out of control as they wreak havoc across the land destroying local flora and fauna. Luckily the scientist had the foresight to tag each elk with a GPS tracker. Each point represents the current location of an individual elk. The heat map will aid hunters in focusing their attentions on saving the flowers and critters that dwell amongst them.

Let’s get started and begin to build the HTML file for creating the WebMap. The template below creates a WebMap using Leaflet panned and zoomed to Kildare with an OSM basemap.

Note: Wordpress doesn’t seem to like script or div tags in the code below so if using the code fix the space added to the tags.

Save the code below as a HTML file.

<!DOCTYPE html>
<html>
<head>
 <title>
  Heatmap with Leaflet
 </title>
 <!-- link to leaflet css --> 
 <link rel="stylesheet" href="http://cdn.leafletjs.com/leaflet-0.7.3/leaflet.css" />
 <!-- map is full width and height of device -->
 <meta name="viewport" content="width=device-width, maximum-scale=1.0, user-scalable=no" />
 <style>
  body{
   padding: 0;
   margin: 0;
  }
  html, body, #map {
   height: 100%;
  }
 </style>
</head>
<body>
 <!-- leaflet js -->
 < script src="http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.js"></ script>
 < div id="map">
 </ div>
 < script>
  var map = L.map('map',{center: [53.15, -6.7], zoom: 10});
 
  // OSM Baselayer
  L.tileLayer('http://{s}.tile.osm.org/{z}/{x}/{y}.png').addTo(map);
 </ script>
 
</body>
</html>
Leaflet OSM

Click map to open in browser

We need to access two plugins, Leaflet.heat and Leaflet.ajax. Download leaflet-heat.js from here and leaflet.ajax.js from here and place them in the same directory as the HTML file. Leaflet.heat allows for the heat map to be generated while Leaflet.ajax enables adding GeoJSON data to the map from an external source. Add the following below the reference to leaflet.js in the <body>. Remember to correct the spaces in the script tags.

< script src="leaflet-heat.js"></ script>
< script src="leaflet.ajax.min.js"></ script>
< script src="heat_points.js"></ script>

The third script tag above accesses the heat_points.js file created by running the Python script. Make sure that this file is also in the same directory as your HTML file.

Next we add the boundary of Co. Kildare to the map. The data is stored in an external GeoJSON file. You can access this file here, if following this tutorial save the file to the same directory as your HTML. Before adding the polygon to the map we will define a style for it. Under the line that adds the OSM base layer add the following.

var kildareStyle = {
 "fillColor": "#CC9933", 
 "color": "#000000",
 "weight": 2, 
 "fillOpacity": 0.2
 };

This will style the Kildare polygon with a translucent brown fill and a black outline. Let’s add the polygon to the map using the Leaflet.ajax plugin…

var kildare = new L.GeoJSON.AJAX('kildare.geojson', {style:kildareStyle}).addTo(map);

Save your HTML and open in a browser. You will notice that the polygon has not been added. This is because the files need to be accessed via a server. There are two things you can do here. Add the files to a hosted server like this map that shows where the tutorial is currently at, or use Python to act as a server for you. Open up the command prompt and set the current working directory to where your files are located. Type in the following to the command prompt and press enter…

python -m SimpleHTTPServer

Open up a browser window and type 127.0.0.1:8000/ followed by the name of your html. For example 127.0.0.1:8000/leaflet_heatmap.htmlThe map should open with the Kildare GeoJSON polygon styled and added to the map like below.

Kildare GeoJSON

The map was set up for a desktop so you might need to zoom and pan if using a mobile device. Let’s add the heat layer to the map with one line of code. Place the following under the code that added the Kildare polygon.

var heat = L.heatLayer(heat_points, {radius:12,blur:25,maxZoom:11}).addTo(map);

heat_points in the code above refers to the heat_points variable stored in the heat_points.js file created from the Python script. You can get these points here. This allows us to easily access lat, long as required by Leaflet. Save your file and open in a browser via the Python server. radius is the size of the points, the blur value makes points merge more fluidly, a lower value creates individual points and extreme high values will lose all heat. maxZoom is the zoom level where the map looks best. Other options are minOpacity which is the opacity that the heat will start at, and gradient which allows you to define the colour gradient of the heat map e.g  {gradient: {.4:”blue”,.6:”cyan”,.7:”lime”,.8:”yellow”,1:”red”}}. For more information see Chapter 3 from Leaflet.js Essentials from Packt Publishing and the Leaflet.heat page.

Leaflet Heatmap

There is a high density of elk focused on a location in the west of Kildare indicating that the culling should begin there. Populated areas such as Leixlip, Prosperous, and  Kildare town should also be a priority.

To complete the map let’s add the GeoJSON points to the map. The GeoJSON point file can be accessed here. First we define a style, the code below will be a simple black dot for each elk. Place this code under the kildareStyle used to style the Kildare polygon.

var pointStyle = {
 radius: 2,
 fillColor: "#000000",
 color: "#000000",
 weight: 1,
 fillOpacity: 1
 };

We now use the Leafet.ajax plugin to add the data to the map and the pointToLayer option for a GeoJSON layer. Place the code below under the code that adds the Kildare polygon to the map.

var points = new L.GeoJSON.AJAX('points.geojson', {pointToLayer: function (feature, latlng) {
   return L.circleMarker(latlng, pointStyle);
 }}).addTo(map);

Save the HTML and open in your browser (via a server). Click on the map below to open a hosted version of this example in your browser.

Leaflet Elk Points

Those elks don’t stand a chance unless they learn to shoot lasers out of their antlers.

This was an introduction to creating a density Heat Map with Leaflet.js. Play around with the options; radius, blur, maxZoom etc. to understand how they effect the heatmap layer and use a search engine and the sources below for additional information.

Sources

Leaflet.js Essentials (Packt Publishing)
Calvin Metcalf
Vladimir Agafonkin

Python Script for Coordinate Generation

import random

# create a list of 500 points
coordinates = []

for coord in range(250):
 x1 = round(random.uniform(-7.1650,-6.4700),4)
 y1 = round(random.uniform(52.8320,53.4670),4)
 x2 = round(random.uniform(-7.0510,-7.0680),4) 
 y2 = round(random.uniform(53.1800,53.1960),4)

 point1 = [x1,y1]
 point2 = [x2,y2]
 
 coordinates.append(point1)
 coordinates.append(point2)

# open a geojson file to add the points to
# change the path to suit your setup
f = open("C:/blog/leaflet/points.geojson", "w")

# opening bracket
f.write('[')

count = 1

# add data to the geojson file (formatted)
for coord in coordinates:
 if count == 1:
  f.write('{\n \
  \t"type": "Feature",\n \
  \t"geometry": {\n \
  \t\t"type": "Point",\n \
  \t\t"coordinates": [' + str(coord[0]) + ',' + str(coord[1]) + ']\n \
  \t}\n \
  }')
 
  count = count + 1
 
 else:
  f.write(',{\n \
  \t"type": "Feature",\n \
  \t"geometry": {\n \
  \t\t"type": "Point",\n \
  \t\t"coordinates": [' + str(coord[0]) + ',' + str(coord[1]) + ']\n \
  \t}\n \
  }')
 
  count = count + 1

# closing bracket
f.write(']')
# close the file
f.close()

# open a JavaScript file to store coordinates in lat,long
# change the path to suit your setup
f2 = open("C:/blog/leaflet/heat_points.js", "w")
# create a variable
f2.write('var heat_points = [')

count = 1

# write data to js file
for coord in coordinates:
 if count == 1:
  f2.write('[' + str(coord[1]) + ',' + str(coord[0]) + ']')

  count = count + 1
 
 else:
  f2.write(',[' + str(coord[1]) + ',' + str(coord[0]) + ']')

  count = count + 1

# closing bracket
f2.write(']')
# close file
f2.close()

Haversine Distances with Python, EURO 2016 & Historic Results

Irish JerseyIreland kick off their European Championship 2016 Group E on the 13th of June at the Stade de France, Saint-Denis, in the northern suburbs of Paris at 6pm local time (5pm on the Emerald Isle). Our opponents that evening will be Sweden, followed by a trek down south to Bordeaux to face Belgium on the 18th, and then a northern journey to Lille to take on Italy in the final group game on the 22nd.

Euro 2016 Overview

Click on map to enlarge

I have a flight from Dublin to Paris, a train from Paris to Bordeaux, Bordeaux to Lille, Lille to Paris, and a flight back to Dublin. Let’s use the simple Haversine formula to calculate approximately how far I will be travelling to follow the boys in green this summer. Just to note, I am not being pessimistic by not sticking around for the knockout stages, I have every faith that we will get out of the group, a two week holiday in France is going to take its toll on the bank account! But who knows…

The Haversine formula calculates the distance between two points on a sphere, also known as the Greater Circle distance, here’s the Python code similar to rosettacode.org

from math import radians, sin, cos, sqrt, asin

def haversine(coords_1, coords_2):

 lon1 = coords_1[0]
 lat1 = coords_1[1]
 lon2 = coords_2[0]
 lat2 = coords_2[1]
 
 # Earth radius in kilometers
 earth_radius = 6372.8

 # convert from degrees to radians
 dLon = radians(lon2 - lon1)
 dLat = radians(lat2 - lat1) 
 lat1 = radians(lat1)
 lat2 = radians(lat2)

 # mathsy stuff for you to research...if you're so inclined to
 a = sin(dLat/2)**2 + cos(lat1)*cos(lat2)*sin(dLon/2)**2
 c = 2*asin(sqrt(a))

 # return the distance rounded to 2 decimal places
 return round((earth_radius * c),2)

Now to use the function to calculate my approximate travelling distance. First define the coordinates as lists.

# lon - lat
dub_airport = [-6.2700, 53.4214]
cdg_airport = [2.5478, 49.0097]
paris = [2.3831, 48.8390]
bordeaux = [-0.5800, 44.8400]
lille = [3.0583, 50.6278]

And then use the coordinates as parameters for the haversine function. Print the distance for each leg and the total distance.

leg_1 = haversine(dub_airport, cdg_airport)
print "Leg 1: " + str(leg_1) + " Km"

leg_2 = haversine(paris, bordeaux)
print "Leg 2: " + str(leg_2) + " Km"

leg_3 = haversine(bordeaux, lille)
print "Leg 3: " + str(leg_3) + " Km"

leg_4 = haversine(lille, paris)
print "Leg 4: " + str(leg_4) + " Km"

leg_5 = haversine(cdg_airport, dub_airport)
print "Leg 5: " + str(leg_5) + " Km"

total_dist = leg_1 + leg_2 + leg_3 + leg_4 + leg_5
print "Total Distance " + str(total_dist) + " Km"

The output from running the complete script is…

Leg 1: 785.3 Km
Leg 2: 498.57 Km
Leg 3: 698.71 Km
Leg 4: 204.79 Km
Leg 5: 785.3 Km
Total Distance 2972.67 Km

Just over 2,970 Km! Ok so I could have been more accurate with getting the road length from my house to the airport, using the Haversine to find the distance from Dublin Airport to Charles De Gaulle, and then using road and rail networks to calculate my internal travel in France but the idea here was to introduce you to the Haversine formula.

And here’s a little map to show my up coming travels. These maps were made by exporting data from ArcGIS as an emf and imported in CorelDraw, a graphic design package for styling.

Euro 2016 Travel

Click on the map to enlarge

That’s it for the Haversine, Python and GIS in general. If you have no interest in football you can stop reading here. From here down it’s just historic results against our opponents and opinions.

SWEDEN

IRE v SWE

We have faced Sweden in two world cup qualifying campaigns, in qualification for one European Championship, and four times in a friendly match. It doesn’t bode well that we have never beaten Sweden in a competitive game, losing four times and drawing twice.

vrs Sweden Competitive

Our dominance over Sweden in friendlies see us with a 75% win percentage, only losing once in four meetings.

v Sweden Friendlies

Overall: P10 W3 D2 L5 F13 A16 – WIN% 30

BELGIUM

BEL v IRE

Similar to our record against Sweden, Ireland have never beaten Belgium in a competitive fixture, although in the seven competitive games we have only been beaten twice, drawing the other five.

v Belgium Competitive

We have to look to the friendly results for Ireland to get the better of Belgium, with Ireland edging the score at four wins to three.

v Belgium Friendlies

Overall: P14 W4 D5 L5 F24 A25 – WIN% 28.57

ITALY

ITA v IRE

Well would you look at that!! We have beaten at least one of our upcoming opponents in a competitive match. Ok so one win out of eight but I am every the optimist and from 1994 onwards our record against Italy is pretty decent both competitively and in friendlies.

v Italy Competitive

Our overall record against Italy looks dismal but recent results give Ireland hope. Since and including our meeting with Italy at World Cup 94 the head-to-head is two wins each and three draws. If you offered my a draw now against Italy I’d take it. Remember there are now twenty-four teams in the competition meaning the best placed third team in each group qualifies for the second round.

v Italy Friendlies

Overall: P13 W2 D3 L8 F9 A20 – WIN% 16.67

Do we stand a chance of getting out of the group?

Let me know your thoughts. With three teams qualifying from four out of the six groups Ireland have every chance of progressing. A win and a draw will more than likely be the minimum requirement to achieve passage to the next phase, but let’s not forget that Ireland reached the Quarter Finals of Italia 90 having not won a single match! Beat Sweden in the opener and get a draw against Italy. If only it was that simple 🙂

#COYBIG

Data used to create maps was downloaded from Natural Earth.
Historic results collate from the FAI and Wikipedia.