(Open Source Geospatial Python)
The ‘What is it?’
See Mean Center.
The unweighted center is mainly used for events that occur at a place and time such as burglaries. The weighted center, however, is predominantly used for stationary features such as retail outlets or delineated areas for example (such as Census tracts). The Weighted Mean Center does not take into account distance between features in the dataset.
The weight needs to be a numerical attribute. The greater the value, the higher the weight for that feature.
The Formula!
The Weighted Mean Center is calculated by multiplying the x and y coordinate by the weight for that feature and summing all for both x and y individually, and then dividing this by the sum of all the weights.
For Point features the X and Y coordinates of each feature is used, for Polygons the centroid of each feature represents the X and Y coordinate to use, and for Linear features the mid-point of each line is used for the X and Y coordinate.
The Code…
from osgeo import ogr from shapely.geometry import MultiLineString from shapely import wkt import numpy as np import sys ## set the driver for the data driver = ogr.GetDriverByName("ESRI Shapefile") ## folder where the shapefile resides folder = r"C:\Users\glen.bambrick\Documents\GDAL\shp\\" ## name of the shapefile concatenated with folder shp = "{0}Census2011_Small_Areas_generalised20m.shp".format(folder) ## open the shapefile ds = driver.Open(shp, 0) ## reference the only layer in the shapefile lyr = ds.GetLayer(0) ## create an output data source out_ds = driver.CreateDataSource("{0}{1}_wgt_mean_center.shp".format(folder,lyr.GetName())) ## output mean center weighted filename output_fc = "{0}{1}_wgt_mean_center".format(folder,lyr.GetName()) ## field that has numerical weight weight_fld = "TOTAL2011" try: first_feat = lyr.GetFeature(1) xy_arr = np.ndarray((len(lyr), 2), dtype=np.float) wgt_arr = np.ndarray((len(lyr), 1), dtype=np.float) ## use the centroid for points and polygons if first_feat.geometry().GetGeometryName() in ["POINT", "MULTIPOINT", "POLYGON", "MULTIPOLYGON"]: for i, pt in enumerate(lyr): ft_geom = pt.geometry() weight = pt.GetField(weight_fld) xy_arr[i] = (ft_geom.Centroid().GetX() * weight, ft_geom.Centroid().GetY() * weight) wgt_arr[i] = weight ## midpoint of lines elif first_feat.geometry().GetGeometryName() in ["LINESTRING", "MULTILINESTRING"]: for i, ln in enumerate(lyr): line_geom = ln.geometry().ExportToWkt() weight = ln.GetField(weight_fld) shapely_line = MultiLineString(wkt.loads(line_geom)) midpoint = shapely_line.interpolate(shapely_line.length/2) xy_arr[i] = (midpoint.x * weight, midpoint.y * weight) wgt_arr[i] = weight except Exception: print "Unknown geometry or Incorrect field name for {}".format(input_lyr_name) sys.exit() ## do the maths sum_x, sum_y = np.sum(xy_arr, axis=0) sum_wgt = np.sum(wgt_arr) weighted_x, weighted_y = sum_x/sum_wgt, sum_y/sum_wgt print "Weighted Mean Center: {0}, {1}".format(weighted_x, weighted_y) ## create a new point layer with the same spatial ref as lyr out_lyr = out_ds.CreateLayer(output_fc, lyr.GetSpatialRef(), ogr.wkbPoint) ## define and create new fields x_fld = ogr.FieldDefn("X", ogr.OFTReal) y_fld = ogr.FieldDefn("Y", ogr.OFTReal) out_lyr.CreateField(x_fld) out_lyr.CreateField(y_fld) ## create a new point for the mean center weighted pnt = ogr.Geometry(ogr.wkbPoint) pnt.AddPoint(weighted_x, weighted_y) ## add the mean center weighted to the new layer feat_dfn = out_lyr.GetLayerDefn() feat = ogr.Feature(feat_dfn) feat.SetGeometry(pnt) feat.SetField("X", weighted_x) feat.SetField("Y", weighted_y) out_lyr.CreateFeature(feat) print "Created: {0}.shp".format(output_fc) ## free up resources del ds, out_ds, lyr, first_feat, feat, out_lyr
I’d like to give credit to Logan Byers from GIS StackExchange who aided in speeding up the computational time using NumPy and for forcing me to begin learning the wonders of NumPy (which is still a work in progress)
The Example:
I downloaded the Small Areas of Ireland from the CSO. You will have to acknowledge a disclaimer. The data contains population information for the 2011 Census. Once downloaded unzip Census2011_Small_Areas_generalised20m.zip to working folder.
Running the script from The Code section above calculates the Weighted Mean Center of all Small Areas based on the population count for each for 2011 and creates a point Shapefile as the output.
OSGP Weighted Mean Center: 238557.427484, 208347.116116
ArcGIS Weighted Mean Center: 238557.427484, 208347.116116
Also See…
Mean Center
Central Feature
Median Center
Initial Data Assessment
The Resources:
ESRI Guide to GIS Volume 2: Chapter 2 (I highly recommend this book)
see book review here.
The Usual 🙂
As always please feel free to comment to help make the code more efficient, highlight errors, or let me know if this was of any use to you.