CSV to Shapefile with pyshp

Posted on January 9, 2016 by Glen Bambrick

Interested in learning ArcPy? check out this course.

In this post I will look at extracting point data from a CSV file and creating a Shapefile with the pyshp library. The data consists of the location of trees with various attributes generated by the Fingal County Council in Ireland. The data can be downloaded as a CSV file from dublinked.ie.

pyshp is a pure Python library designed to provide read and write support for the ESRI Shapefile (.shp) format and only utilizes Python’s standard library to achieve this. The library can be downloaded from https://code.google.com/p/pyshp/ and placed in the site-packages folder of your Python installation. Alternatively you can use easy-install…

easy_install pyshp

…or pip.

pip install pyshp

NOTE: You should make yourself familiar with the pyshp library by visiting Joel Lawhead’s examples and documents here.

The full code is at the bottom of the post, the following is a walkthrough. When ready to go open your favourite editor and import the modules required for the task at hand.

import shapefile, csv

We will use the getWKT_PRJ function discussed in a previous post.

def getWKT_PRJ (epsg_code):
 import urllib
 wkt = urllib.urlopen("http://spatialreference.org/ref/epsg/{0}/prettywkt/".format(epsg_code))
 remove_spaces = wkt.read().replace(" ","")
 output = remove_spaces.replace("\n", "")
 return output

Create an instance of the Shapefile Writer( ) class and declare the POINT geometry type.

trees_shp = shapefile.Writer(shapefile.POINT)

Set the autoBalance to 1. This enforces that for every record there must be a corresponding geometry.

trees_shp.autoBalance = 1

Create the field names and data types for each.

trees_shp.field("TREE_ID", "C")
trees_shp.field("ADDRESS", "C")
trees_shp.field("TOWN", "C")
trees_shp.field("TREE_SPEC", "C")
trees_shp.field("SPEC_DESC", "C")
trees_shp.field("COMMONNAME", "C")
trees_shp.field("AGE_DESC", "C")
trees_shp.field("HEIGHT", "C")
trees_shp.field("SPREAD", "C")
trees_shp.field("TRUNK", "C")
trees_shp.field("TRUNK_ACTL", "C")
trees_shp.field("CONDITION", "C")

Create a counter variable to keep track of the number of feature written to the Shapefile.

counter = 1

Open the CSV file in read mode.

with open('C:/csv_to_shp/Trees.csv', 'rb') as csvfile:
 reader = csv.reader(csvfile, delimiter=',')

Skip the header.

next(reader, None)

Loop through each row and assign each attribute in the row to a variable.

for row in reader:
 tree_id = row[0]
 address = row[1]
 town = row[2]
 tree_species = row[3]
 species_desc = row[4]
 common_name = row[5]
 age_desc = row[6]
 height = row[7]
 spread = row[8]
 trunk = row[9]
 trunk_actual = row[10]
 condition = row[11]
 latitude = row[12]
 longitude = row[13]

Set the geometry for each record based on the longitude and latitude vales.

trees_shp.point(float(longitude),float(latitude))

Create a matching record for the geometry using the attributes.

trees_shp.record(tree_id, address, town, tree_species, species_desc, common_name, age_desc,height, spread, trunk, trunk_actual, condition)

Print to screen the current feature number and increase the counter.

print "Feature " + str(counter) + " added to Shapefile."
 counter = counter + 1

Save the Shapefile to a location and name the file.

trees_shp.save("C:/csv_to_shp/Fingal_Trees")

Create a projection file (.prj)

prj = open("C:/csv_to_shp/Fingal_Trees.prj", "w")
epsg = getWKT_PRJ("4326")
prj.write(epsg)
prj.close()

Save and run the script. The number of features should be printed to the console.

If you open the original CSV file you can see that there are also 33670 records. Navigate to the file location where you saved the Shapefile output. You should see four files shown below.

And just to make sure that the data is correct, here I have opened it up in QGIS.

And the attribute table…

And here’s the full code…

# import libraries
import shapefile, csv

# funtion to generate a .prj file
def getWKT_PRJ (epsg_code):
 import urllib
 wkt = urllib.urlopen("http://spatialreference.org/ref/epsg/{0}/prettywkt/".format(epsg_code))
 remove_spaces = wkt.read().replace(" ","")
 output = remove_spaces.replace("\n", "")
 return output

# create a point shapefile
trees_shp = shapefile.Writer(shapefile.POINT)

# for every record there must be a corresponding geometry.
trees_shp.autoBalance = 1

# create the field names and data type for each.
trees_shp.field("TREE_ID", "C")
trees_shp.field("ADDRESS", "C")
trees_shp.field("TOWN", "C")
trees_shp.field("TREE_SPEC", "C")
trees_shp.field("SPEC_DESC", "C")
trees_shp.field("COMMONNAME", "C")
trees_shp.field("AGE_DESC", "C")
trees_shp.field("HEIGHT", "C")
trees_shp.field("SPREAD", "C")
trees_shp.field("TRUNK", "C")
trees_shp.field("TRUNK_ACTL", "C")
trees_shp.field("CONDITION", "C")

# count the features
counter = 1

# access the CSV file
with open('C:/csv_to_shp/Trees.csv', 'rb') as csvfile:
 reader = csv.reader(csvfile, delimiter=',')
 # skip the header
 next(reader, None)

#loop through each of the rows and assign the attributes to variables
 for row in reader:
  tree_id = row[0]
  address = row[1]
  town = row[2]
  tree_species = row[3]
  species_desc = row[4]
  common_name = row[5]
  age_desc = row[6]
  height = row[7]
  spread = row[8]
  trunk = row[9]
  trunk_actual = row[10]
  condition = row[11]
  latitude = row[12]
  longitude = row[13]

  # create the point geometry
  trees_shp.point(float(longitude),float(latitude))
  # add attribute data
  trees_shp.record(tree_id, address, town, tree_species, species_desc, common_name, age_desc,height, spread, trunk, trunk_actual, condition)

  print "Feature " + str(counter) + " added to Shapefile."
  counter = counter + 1

# save the Shapefile
trees_shp.save("C:/csv_to_shp/Fingal_Trees")

# create a projection file
prj = open("C:/csv_to_shp/Fingal_Trees.prj", "w")
epsg = getWKT_PRJ("4326")
prj.write(epsg)
prj.close()

Any problems let me know.

29 thoughts on “CSV to Shapefile with pyshp”

Pingback: Generate a Projection File (.prj) using Python | Geospatiality
Pingback: Reproject a Polygon Shapefile using PyShp and PyProj | Geospatiality
Reymonza on May 9, 2016 at 11:55 pm said:

Hi, thank you for your great tutorial. I’ve tried to follow your script, but why it only added 1 feature into the shapefile? look forward to hear your explanations. Thanks

LikeLike

Reply ↓
- Reymonza on May 10, 2016 at 12:58 am said:
  
  Just solved it, my indentation caused it only read 1 feature..thank you
  
  LikeLike
  
  Reply ↓
  - clubdebambos on May 10, 2016 at 7:03 am said:
    
    Excellent, it’s usually something small. Thanks for visiting the page, I hope it was helpful.
    
    LikeLike
    
    Reply ↓
    - Herran on June 8, 2016 at 4:53 pm said:
      
      Where exactly was the indentation incorrect?
      
      LikeLike
      
      Reply ↓
- Herran on June 8, 2016 at 4:54 pm said:
  
  Where exactly was the indentation incorrect?
  
  LikeLike
  
  Reply ↓
  - clubdebambos on June 9, 2016 at 6:47 am said:
    
    Under for row in reader: down to increasing the counter should have been indented in the full listing of the script and is fixed in the post now.
    
    LikeLike
    
    Reply ↓
    - Herran on June 9, 2016 at 1:09 pm said:
      
      Thank you! And like Reymonza, I have found this tutorial and example very insightful. Thank you again!
      
      LikeLiked by 1 person
      
      Reply ↓
tahveli on July 20, 2016 at 12:01 pm said:

I’m converting into opposite direction from shp to csv. My problem is that when I read .shp file with shapefile Reader I get coordinates in meters. Is there way to get points in degrees?
With pprint I get e.g:
{‘m’: (-1.7976931348623157e+308,),
‘points’: [[339532.813818718, 6797730.64206192]],
‘shapeType’: 11,
‘z’: (0.0,)}

LikeLike

Reply ↓
- clubdebambos on July 20, 2016 at 12:26 pm said:
  
  Have you got the capability to reproject the Shapefile into WGS84? You can easily do this with GIS software or if you want to try through Python modules similar to https://glenbambrick.com/2016/01/24/reproject-shapefile/. Your X, Y points in the Shapefile should then be in Long/Lat as decimal degrees. Let me know how it goes.
  
  LikeLike
  
  Reply ↓
Jonathan Barge on February 8, 2017 at 10:01 pm said:

For the remove_spaces = wkt.read().replace(” “,””) line I get the error “TypeError: a bytes-like object is required, not ‘str'”

do you know why I am getting this error?

LikeLiked by 1 person

Reply ↓
- clubdebambos on February 11, 2017 at 10:25 am said:
  
  Can you please send me the epsg code you are using and I’ll recreate and see if I get a similar error. Cheers
  
  LikeLike
  
  Reply ↓
- Vartika Sinha on February 22, 2017 at 11:46 am said:
  
  I got the same error,
  
  Got to working by modifying line 3 of function getWKT_PRJ to
  remove_spaces = wkt.read().decode(‘utf-8’).replace(” “,””)
  
  the wkt.read returns a byte array. Whereas need to read the response as string of utf-8 charecters
  I am new to python so cant say how it worked in past.
  
  also for python 3.5 I modified the function as below
  
  def getWKT_PRJ (epsg_code):
  from urllib.request import urlopen
  # access projection information
  with urlopen(“http://spatialreference.org/ref/epsg/{0}/prettywkt/”.format(epsg_code)) as wkt:
  # remove spaces between charachters
  remove_spaces = wkt.read().decode(‘utf-8’).replace(” “,””)
  # place all the text on one line
  output = remove_spaces.replace(“\n”, “”)
  return output
  
  LikeLike
  
  Reply ↓
  - clubdebambos on February 22, 2017 at 11:50 am said:
    
    Excellent. I haven’t used this function for quite some time. Thanks for taking the time to post this solution.
    
    LikeLike
    
    Reply ↓
chris vandenberg on February 25, 2017 at 5:35 am said:

Thanks for the helpful and insightful example. I’m trying to use this and when I successfully create a shapefile and try to add it into a map document in ArcMap, I get an error saying “The number of shapes does not match the number of table records”. Do you know how to rectify this issue? Many of the shapefile repair tools I’ve found are outdated….

Thanks for any and all help

LikeLike

Reply ↓
- clubdebambos on February 27, 2017 at 11:03 am said:
  
  Have you made sure that you have set autoBalance = 1 for the shapefile? See Geometry and Record Balancing in the documentation https://pypi.python.org/pypi/pyshp
  
  In the code example above it is trees_shp.autoBalance = 1
  
  LikeLike
  
  Reply ↓
BR on July 6, 2017 at 11:40 pm said:

Hi there,

When I run a script written similar to yours and get to this section:

trees_shp.point(float(easting),float(northing))

I get the error: exceptions.ValueError: could not convert string to float: Line 1.00

I’ve tried adding: easting = float(row[1]) and northing = float(row[2]) to get the string to be a float, but that does not seem to work.

The CSV Easting value has 6 digits and 3 decimals, while the Northing value has 7 digits and 3 decimals. I’ve added those attributes in the respective .field(“Easting”, “F”, 6, 3) Thanks in advance!

LikeLike

Reply ↓
- clubdebambos on July 7, 2017 at 7:01 am said:
  
  It could possibly be reading an empty line. float(“”) will return the error ValueError: could not convert string to float:
  Make sure all rows have an Easting and Northing
  
  Or you could have non-digit values in your string such as “123O0” or “123,456” but this usually returns ValueError: invalid literal for float():
  
  You will need to catch the error in any case.
  
  See Stack Exchange for various solutions
  https://stackoverflow.com/search?q=ValueError%3A+could+not+convert+string+to+float%3A
  
  LikeLike
  
  Reply ↓
kiwirunnerblog on July 12, 2017 at 3:54 am said:

Hi, I have used your code before and it has worked perfectly. This time when I run the code I don’t get any errors pop up but when I open the attribute table of the shapefile it is blank. It will open in QGIS but not in ArcGIS. Do you know what may be causing this?

LikeLike

Reply ↓
- clubdebambos on July 13, 2017 at 6:50 am said:
  
  Unfortunately I have never encountered such so cannot really provide a solution I’m afraid. It’s been a long time since I used this but the main problems I used to have were making sure that every geometry (point) had a matching row in the attribute table. If you find a solution please post here. Cheers
  
  LikeLike
  
  Reply ↓
Vincze Istvan on July 21, 2017 at 11:25 am said:

Hey, i also have the error with the conversion of the lat long values.

output_shp.point(float(lon),float(lat)

output_shp.point(float(lon),float(lat))
TypeError: float() argument must be a string or a number, not ‘list’

LikeLike

Reply ↓
- clubdebambos on July 23, 2017 at 1:09 pm said:
  
  You are trying to put a list for lat and/or lon, you need to iterate through a list and use each value.
  
  LikeLike
  
  Reply ↓
caoliang on December 28, 2017 at 7:11 am said:

Traceback (most recent call last):
File “D:\anzhuang\PyCharm Community Edition 2017.1.3\helpers\pydev\pydevd.py”, line 1585, in
globals = debugger.run(setup[‘file’], None, None, is_module)
File “D:\anzhuang\PyCharm Community Edition 2017.1.3\helpers\pydev\pydevd.py”, line 1015, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File “D:\anzhuang\PyCharm Community Edition 2017.1.3\helpers\pydev\_pydev_imps\_pydev_execfile.py”, line 18, in execfile
exec(compile(contents+”\n”, file, ‘exec’), glob, loc)
File “F:/论文数据处理/上海/点成线.py”, line 34, in
output_shp.record(id,speed,status,time)
File “D:\anzhuang\anconda\lib\site-packages\shapefile.py”, line 1071, in record
record = [recordList[i] for i in range(fieldCount)]
File “D:\anzhuang\anconda\lib\site-packages\shapefile.py”, line 1071, in
record = [recordList[i] for i in range(fieldCount)]
IndexError: tuple index out of range

thank you for your script very much.

when I follow your script I got this error，I don’t know what happen? look forward your answer. thank you so much!!

LikeLike

Reply ↓
kbratsy on July 1, 2019 at 9:00 am said:

Hi and thank you, :)))

I have the error with your code and it is below,

Traceback (most recent call last):
File “C:\…\csv2shp.py”, line 13, in
trees_shp = shapefile.Writer(shapefile.POINT)
File “C:\…\Python\Python37\lib\site-packages\shapefile.py”, line 1057, in __init__
self.shp = self.__getFileObj(os.path.splitext(target)[0] + ‘.shp’)
File “C:\…\Python\Python37\lib\ntpath.py”, line 202, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not int

I couldn’t handle it.

LikeLike

Reply ↓
kbratsy on July 1, 2019 at 4:07 pm said:

Traceback (most recent call last):
File “C:\…\csv2shp.py”, line 67, in
trees_shp.save(“C:/…/Fingal_Trees1”)
AttributeError: ‘Writer’ object has no attribute ‘save’

LikeLike

Reply ↓
- kbratsy on July 1, 2019 at 4:11 pm said:
  
  Above I sent an error but I solve it in below
  I change ‘rb’ with ‘rt’
  with open(‘C:/…/fingaltrees.csv’, ‘rt’) as csvfile:
  
  and also
  trees_shp = shapefile.Writer(str(shapefile.POINT))
  I turn to ‘str’ inside the Writer.
  
  Now, this is my error in my code. I couldn’t over it.
  
  LikeLike
  
  Reply ↓
  - sherbert on July 24, 2019 at 6:18 pm said:
    
    Pyshp has changed, and now shapefile.Writer() takes as input a string that specifies the file path to save to. There’s no .save attribute either, it saves automatically. See here: https://pypi.org/project/pyshp/#writing-shapefiles
    
    LikeLike
    
    Reply ↓
Bill Foucault on January 2, 2020 at 10:02 pm said:

I get the error, AttributeError: ‘int’ object has no attribute ‘rfind’

LikeLike

Reply ↓