CSV to Shapefile with pyshp

Interested in learning ArcPy? check out this course.

In this post I will look at extracting point data from a CSV file and creating a Shapefile with the pyshp library. The data consists of the location of trees with various attributes generated by the Fingal County Council in Ireland. The data can be downloaded as a CSV file from dublinked.ie.

pyshp is a pure Python library designed to provide read and write support for the ESRI Shapefile (.shp) format and only utilizes Python’s standard library to achieve this. The library can be downloaded from https://code.google.com/p/pyshp/ and placed in the site-packages folder of your Python installation. Alternatively you can use easy-install…

easy_install pyshp

…or pip.

pip install pyshp

NOTE: You should make yourself familiar with the pyshp library by visiting Joel Lawhead’s examples and documents here.

The full code is at the bottom of the post, the following is a walkthrough. When ready to go open your favourite editor and import the modules required for the task at hand.

import shapefile, csv

We will use the getWKT_PRJ function discussed in a previous post.

def getWKT_PRJ (epsg_code):
 import urllib
 wkt = urllib.urlopen("http://spatialreference.org/ref/epsg/{0}/prettywkt/".format(epsg_code))
 remove_spaces = wkt.read().replace(" ","")
 output = remove_spaces.replace("\n", "")
 return output

Create an instance of the Shapefile Writer( ) class and declare the POINT geometry type.

trees_shp = shapefile.Writer(shapefile.POINT)

Set the autoBalance to 1. This enforces that for every record there must be a corresponding geometry.

trees_shp.autoBalance = 1

Create the field names and data types for each.

trees_shp.field("TREE_ID", "C")
trees_shp.field("ADDRESS", "C")
trees_shp.field("TOWN", "C")
trees_shp.field("TREE_SPEC", "C")
trees_shp.field("SPEC_DESC", "C")
trees_shp.field("COMMONNAME", "C")
trees_shp.field("AGE_DESC", "C")
trees_shp.field("HEIGHT", "C")
trees_shp.field("SPREAD", "C")
trees_shp.field("TRUNK", "C")
trees_shp.field("TRUNK_ACTL", "C")
trees_shp.field("CONDITION", "C")

Create a counter variable to keep track of the number of feature written to the Shapefile.

counter = 1

Open the CSV file in read mode.

with open('C:/csv_to_shp/Trees.csv', 'rb') as csvfile:
 reader = csv.reader(csvfile, delimiter=',')

Skip the header.

next(reader, None)

Loop through each row and assign each attribute in the row to a variable.

for row in reader:
 tree_id = row[0]
 address = row[1]
 town = row[2]
 tree_species = row[3]
 species_desc = row[4]
 common_name = row[5]
 age_desc = row[6]
 height = row[7]
 spread = row[8]
 trunk = row[9]
 trunk_actual = row[10]
 condition = row[11]
 latitude = row[12]
 longitude = row[13]

Set the geometry for each record based on the longitude and latitude vales.

trees_shp.point(float(longitude),float(latitude))

Create a matching record for the geometry using the attributes.

trees_shp.record(tree_id, address, town, tree_species, species_desc, common_name, age_desc,height, spread, trunk, trunk_actual, condition)

Print to screen the current feature number and increase the counter.

print "Feature " + str(counter) + " added to Shapefile."
 counter = counter + 1

Save the Shapefile to a location and name the file.

trees_shp.save("C:/csv_to_shp/Fingal_Trees")

Create a projection file (.prj)

prj = open("C:/csv_to_shp/Fingal_Trees.prj", "w")
epsg = getWKT_PRJ("4326")
prj.write(epsg)
prj.close()

Save and run the script. The number of features should be printed to the console.

Number of features

If you open the original CSV file you can see that there are also 33670 records. Navigate to the file location where you saved the Shapefile output. You should see four files shown below.

Shapfile

And just to make sure that the data is correct, here I have opened it up in QGIS.

QGIS - Fingal Trees

And the attribute table…

QGIS Attribute Table

And here’s the full code…

# import libraries
import shapefile, csv

# funtion to generate a .prj file
def getWKT_PRJ (epsg_code):
 import urllib
 wkt = urllib.urlopen("http://spatialreference.org/ref/epsg/{0}/prettywkt/".format(epsg_code))
 remove_spaces = wkt.read().replace(" ","")
 output = remove_spaces.replace("\n", "")
 return output

# create a point shapefile
trees_shp = shapefile.Writer(shapefile.POINT)

# for every record there must be a corresponding geometry.
trees_shp.autoBalance = 1

# create the field names and data type for each.
trees_shp.field("TREE_ID", "C")
trees_shp.field("ADDRESS", "C")
trees_shp.field("TOWN", "C")
trees_shp.field("TREE_SPEC", "C")
trees_shp.field("SPEC_DESC", "C")
trees_shp.field("COMMONNAME", "C")
trees_shp.field("AGE_DESC", "C")
trees_shp.field("HEIGHT", "C")
trees_shp.field("SPREAD", "C")
trees_shp.field("TRUNK", "C")
trees_shp.field("TRUNK_ACTL", "C")
trees_shp.field("CONDITION", "C")

# count the features
counter = 1

# access the CSV file
with open('C:/csv_to_shp/Trees.csv', 'rb') as csvfile:
 reader = csv.reader(csvfile, delimiter=',')
 # skip the header
 next(reader, None)

#loop through each of the rows and assign the attributes to variables
 for row in reader:
  tree_id = row[0]
  address = row[1]
  town = row[2]
  tree_species = row[3]
  species_desc = row[4]
  common_name = row[5]
  age_desc = row[6]
  height = row[7]
  spread = row[8]
  trunk = row[9]
  trunk_actual = row[10]
  condition = row[11]
  latitude = row[12]
  longitude = row[13]

  # create the point geometry
  trees_shp.point(float(longitude),float(latitude))
  # add attribute data
  trees_shp.record(tree_id, address, town, tree_species, species_desc, common_name, age_desc,height, spread, trunk, trunk_actual, condition)

  print "Feature " + str(counter) + " added to Shapefile."
  counter = counter + 1

# save the Shapefile
trees_shp.save("C:/csv_to_shp/Fingal_Trees")

# create a projection file
prj = open("C:/csv_to_shp/Fingal_Trees.prj", "w")
epsg = getWKT_PRJ("4326")
prj.write(epsg)
prj.close()

Any problems let me know.

29 thoughts on “CSV to Shapefile with pyshp

  1. Pingback: Generate a Projection File (.prj) using Python | Geospatiality

  2. Pingback: Reproject a Polygon Shapefile using PyShp and PyProj | Geospatiality

  3. Hi, thank you for your great tutorial. I’ve tried to follow your script, but why it only added 1 feature into the shapefile? look forward to hear your explanations. Thanks

    Like

  4. I’m converting into opposite direction from shp to csv. My problem is that when I read .shp file with shapefile Reader I get coordinates in meters. Is there way to get points in degrees?
    With pprint I get e.g:
    {‘m’: (-1.7976931348623157e+308,),
    ‘points’: [[339532.813818718, 6797730.64206192]],
    ‘shapeType’: 11,
    ‘z’: (0.0,)}

    Like

  5. For the remove_spaces = wkt.read().replace(” “,””) line I get the error “TypeError: a bytes-like object is required, not ‘str'”

    do you know why I am getting this error?

    Liked by 1 person

    • I got the same error,

      Got to working by modifying line 3 of function getWKT_PRJ to
      remove_spaces = wkt.read().decode(‘utf-8’).replace(” “,””)

      the wkt.read returns a byte array. Whereas need to read the response as string of utf-8 charecters
      I am new to python so cant say how it worked in past.

      also for python 3.5 I modified the function as below

      def getWKT_PRJ (epsg_code):
      from urllib.request import urlopen
      # access projection information
      with urlopen(“http://spatialreference.org/ref/epsg/{0}/prettywkt/”.format(epsg_code)) as wkt:
      # remove spaces between charachters
      remove_spaces = wkt.read().decode(‘utf-8’).replace(” “,””)
      # place all the text on one line
      output = remove_spaces.replace(“\n”, “”)
      return output

      Like

  6. Thanks for the helpful and insightful example. I’m trying to use this and when I successfully create a shapefile and try to add it into a map document in ArcMap, I get an error saying “The number of shapes does not match the number of table records”. Do you know how to rectify this issue? Many of the shapefile repair tools I’ve found are outdated….

    Thanks for any and all help

    Like

  7. Hi there,

    When I run a script written similar to yours and get to this section:

    trees_shp.point(float(easting),float(northing))

    I get the error: exceptions.ValueError: could not convert string to float: Line 1.00

    I’ve tried adding: easting = float(row[1]) and northing = float(row[2]) to get the string to be a float, but that does not seem to work.

    The CSV Easting value has 6 digits and 3 decimals, while the Northing value has 7 digits and 3 decimals. I’ve added those attributes in the respective .field(“Easting”, “F”, 6, 3) Thanks in advance!

    Like

  8. Hi, I have used your code before and it has worked perfectly. This time when I run the code I don’t get any errors pop up but when I open the attribute table of the shapefile it is blank. It will open in QGIS but not in ArcGIS. Do you know what may be causing this?

    Like

    • Unfortunately I have never encountered such so cannot really provide a solution I’m afraid. It’s been a long time since I used this but the main problems I used to have were making sure that every geometry (point) had a matching row in the attribute table. If you find a solution please post here. Cheers

      Like

  9. Hey, i also have the error with the conversion of the lat long values.

    output_shp.point(float(lon),float(lat)

    output_shp.point(float(lon),float(lat))
    TypeError: float() argument must be a string or a number, not ‘list’

    Like

  10. Traceback (most recent call last):
    File “D:\anzhuang\PyCharm Community Edition 2017.1.3\helpers\pydev\pydevd.py”, line 1585, in
    globals = debugger.run(setup[‘file’], None, None, is_module)
    File “D:\anzhuang\PyCharm Community Edition 2017.1.3\helpers\pydev\pydevd.py”, line 1015, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
    File “D:\anzhuang\PyCharm Community Edition 2017.1.3\helpers\pydev\_pydev_imps\_pydev_execfile.py”, line 18, in execfile
    exec(compile(contents+”\n”, file, ‘exec’), glob, loc)
    File “F:/论文数据处理/上海/点成线.py”, line 34, in
    output_shp.record(id,speed,status,time)
    File “D:\anzhuang\anconda\lib\site-packages\shapefile.py”, line 1071, in record
    record = [recordList[i] for i in range(fieldCount)]
    File “D:\anzhuang\anconda\lib\site-packages\shapefile.py”, line 1071, in
    record = [recordList[i] for i in range(fieldCount)]
    IndexError: tuple index out of range

    thank you for your script very much.

    when I follow your script I got this error,I don’t know what happen? look forward your answer. thank you so much!!

    Like

  11. Hi and thank you, :)))

    I have the error with your code and it is below,

    Traceback (most recent call last):
    File “C:\…\csv2shp.py”, line 13, in
    trees_shp = shapefile.Writer(shapefile.POINT)
    File “C:\…\Python\Python37\lib\site-packages\shapefile.py”, line 1057, in __init__
    self.shp = self.__getFileObj(os.path.splitext(target)[0] + ‘.shp’)
    File “C:\…\Python\Python37\lib\ntpath.py”, line 202, in splitext
    p = os.fspath(p)
    TypeError: expected str, bytes or os.PathLike object, not int

    I couldn’t handle it.

    Like

  12. Traceback (most recent call last):
    File “C:\…\csv2shp.py”, line 67, in
    trees_shp.save(“C:/…/Fingal_Trees1”)
    AttributeError: ‘Writer’ object has no attribute ‘save’

    Like

    • Above I sent an error but I solve it in below
      I change ‘rb’ with ‘rt’
      with open(‘C:/…/fingaltrees.csv’, ‘rt’) as csvfile:

      and also
      trees_shp = shapefile.Writer(str(shapefile.POINT))
      I turn to ‘str’ inside the Writer.

      Now, this is my error in my code. I couldn’t over it.

      Like

Leave a comment