(Open Source Geospatial Python)

**The ‘What is it?’**

See *Mean Center*.

The unweighted center is mainly used for events that occur at a place and time such as burglaries. The weighted center, however, is predominantly used for stationary features such as retail outlets or delineated areas for example (such as Census tracts). The *Weighted Mean Center* does not take into account distance between features in the dataset.

The weight needs to be a numerical attribute. The greater the value, the higher the weight for that feature.

**The Formula!**

The *Weighted Mean Center* is calculated by multiplying the x and y coordinate by the weight for that feature and summing all for both x and y individually, and then dividing this by the sum of all the weights.

For *Point* features the X and Y coordinates of each feature is used, for *Polygons* the centroid of each feature represents the X and Y coordinate to use, and for *Linear* features the mid-point of each line is used for the X and Y coordinate.

**The Code…**

from osgeo import ogr from shapely.geometry import MultiLineString from shapely import wkt import numpy as np import sys ## set the driver for the data driver = ogr.GetDriverByName("ESRI Shapefile") ## folder where the shapefile resides folder = r"C:\Users\glen.bambrick\Documents\GDAL\shp\\" ## name of the shapefile concatenated with folder shp = "{0}Census2011_Small_Areas_generalised20m.shp".format(folder) ## open the shapefile ds = driver.Open(shp, 0) ## reference the only layer in the shapefile lyr = ds.GetLayer(0) ## create an output data source out_ds = driver.CreateDataSource("{0}{1}_wgt_mean_center.shp".format(folder,lyr.GetName())) ## output mean center weighted filename output_fc = "{0}{1}_wgt_mean_center".format(folder,lyr.GetName()) ## field that has numerical weight weight_fld = "TOTAL2011" try: first_feat = lyr.GetFeature(1) xy_arr = np.ndarray((len(lyr), 2), dtype=np.float) wgt_arr = np.ndarray((len(lyr), 1), dtype=np.float) ## use the centroid for points and polygons if first_feat.geometry().GetGeometryName() in ["POINT", "MULTIPOINT", "POLYGON", "MULTIPOLYGON"]: for i, pt in enumerate(lyr): ft_geom = pt.geometry() weight = pt.GetField(weight_fld) xy_arr[i] = (ft_geom.Centroid().GetX() * weight, ft_geom.Centroid().GetY() * weight) wgt_arr[i] = weight ## midpoint of lines elif first_feat.geometry().GetGeometryName() in ["LINESTRING", "MULTILINESTRING"]: for i, ln in enumerate(lyr): line_geom = ln.geometry().ExportToWkt() weight = ln.GetField(weight_fld) shapely_line = MultiLineString(wkt.loads(line_geom)) midpoint = shapely_line.interpolate(shapely_line.length/2) xy_arr[i] = (midpoint.x * weight, midpoint.y * weight) wgt_arr[i] = weight except Exception: print "Unknown geometry or Incorrect field name for {}".format(input_lyr_name) sys.exit() ## do the maths sum_x, sum_y = np.sum(xy_arr, axis=0) sum_wgt = np.sum(wgt_arr) weighted_x, weighted_y = sum_x/sum_wgt, sum_y/sum_wgt print "Weighted Mean Center: {0}, {1}".format(weighted_x, weighted_y) ## create a new point layer with the same spatial ref as lyr out_lyr = out_ds.CreateLayer(output_fc, lyr.GetSpatialRef(), ogr.wkbPoint) ## define and create new fields x_fld = ogr.FieldDefn("X", ogr.OFTReal) y_fld = ogr.FieldDefn("Y", ogr.OFTReal) out_lyr.CreateField(x_fld) out_lyr.CreateField(y_fld) ## create a new point for the mean center weighted pnt = ogr.Geometry(ogr.wkbPoint) pnt.AddPoint(weighted_x, weighted_y) ## add the mean center weighted to the new layer feat_dfn = out_lyr.GetLayerDefn() feat = ogr.Feature(feat_dfn) feat.SetGeometry(pnt) feat.SetField("X", weighted_x) feat.SetField("Y", weighted_y) out_lyr.CreateFeature(feat) print "Created: {0}.shp".format(output_fc) ## free up resources del ds, out_ds, lyr, first_feat, feat, out_lyr

I’d like to give credit to Logan Byers from GIS StackExchange who aided in speeding up the computational time using *NumPy* and for forcing me to begin learning the wonders of *NumPy (*which is still a work in progress)

**The Example:**

I downloaded the Small Areas of Ireland from the CSO. You will have to acknowledge a disclaimer. The data contains population information for the 2011 Census. Once downloaded unzip *Census2011_Small_Areas_generalised20m.zip* to working folder.

Running the script from * The Code* section above calculates the

*Weighted*

*Mean Center*of all Small Areas based on the population count for each for 2011 and creates a point

*Shapefile*as the output.

**OSGP Weighted Mean Center:** 238557.427484, 208347.116116

**ArcGIS Weighted Mean Center**: 238557.427484, 208347.116116

**Also See…**

Mean Center

Central Feature

Median Center

Initial Data Assessment

**The Resources:**

ESRI Guide to GIS Volume 2: Chapter 2 (I highly recommend this book)

see book review here.

**The Usual 🙂**

As always please feel free to comment to help make the code more efficient, highlight errors, or let me know if this was of any use to you.