I cannot discover the error: attempting to insert ... columns do not match - python

The code runs until inserting a new row, at which point I get:
'attempting to insert [3-item result] into these columns [5 items]'. I have tried to discover where my code is losing results, but cannot. Any suggestions would be great.
Additional information: the feature class I am inserting into has five fields, and they are the same as the source fields. Execution reaches my length check (len(newRow) != len(outputFields)) and my error message prints.
# coding: utf8
import arcpy
import os, sys
from arcpy import env

arcpy.env.workspace = r"E:\Roseville\ScriptDevel.gdb"
arcpy.env.overwriteOutput = True  # set as a Python bool, not the string "TRUE"

fc_buffers = "Parcels"  # my indv. parcel buffers
fc_Landuse = "Geology"  # my land use layer
outputLayer = "IntersectResult"  # output layer

outputFields = [f.name for f in arcpy.ListFields(outputLayer) if f.type not in ['OBJECTID', "Geometry"]] + ['SHAPE@']
landUseFields = [f.name for f in arcpy.ListFields(fc_Landuse) if f.type not in ['PTYPE']]
parcelBufferFields = [f.name for f in arcpy.ListFields(fc_buffers) if f.type not in ['APN']]

intersectionFeatureLayer = arcpy.MakeFeatureLayer_management(fc_Landuse, 'intersectionFeatureLayer').getOutput(0)
selectedBuffer = arcpy.MakeFeatureLayer_management(fc_buffers, 'selectedBuffer').getOutput(0)

def orderFields(luFields, pbFields):
    ordered = []
    for field in outputFields:
        # append the matching field
        if field in landUseFields:
            ordered.append(luFields[landUseFields.index(field)])
        if field in parcelBufferFields:
            ordered.append(pbFields[parcelBufferFields.index(field)])
    return ordered
with arcpy.da.SearchCursor(fc_buffers, ["OBJECTID", 'SHAPE@'] + parcelBufferFields) as sc, arcpy.da.InsertCursor(outputLayer, outputFields) as ic:
    for row in sc:
        oid = row[0]
        shape = row[1]
        print (oid)
        print "Got this far"
        selectedBuffer.setSelectionSet('NEW', [oid])
        arcpy.SelectLayerByLocation_management(intersectionFeatureLayer, "intersect", selectedBuffer)
        with arcpy.da.SearchCursor(intersectionFeatureLayer, ['SHAPE@'] + landUseFields) as intersectionCursor:
            for record in intersectionCursor:
                recordShape = record[0]
                print "list made"
                outputShape = shape.intersect(recordShape, 4)
                newRow = orderFields(row[2:], record[1:]) + [outputShape]
                if len(newRow) != len(outputFields):
                    print 'there is a problem. the number of columns in the record you are attempting to insert into', outputLayer, 'does not match the number of destination columns'
                    print '\tattempting to insert:', newRow
                    print '\tinto these columns:', outputFields
                    continue
                # insert into the output feature class
                ic.insertRow(newRow)

Your with statement where you define the cursors creates an insert cursor with 5 fields, but the row you are trying to feed it has only 3 fields. You need to make sure the row you insert matches the length of the insert cursor's field list. I suspect the problem is actually in the orderFields function, or in what you pass to it.
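To make that concrete (plain Python, no arcpy; the field names below are invented for illustration): building the row as a name-to-value mapping and then emitting values in the destination field order makes a short row impossible to produce silently, because a missing field raises a KeyError instead.

```python
# Hypothetical field lists standing in for landUseFields / parcelBufferFields.
land_use_fields = ['PTYPE', 'GEO_CODE']
parcel_fields = ['APN', 'ACRES', 'ZONE']
output_fields = ['PTYPE', 'GEO_CODE', 'APN', 'ACRES', 'ZONE']

def build_row(lu_values, pb_values):
    # Map every source field name to its value, then pull the values out
    # in the destination order; a KeyError flags a destination field the
    # sources never supplied, instead of silently yielding a short row.
    values = dict(zip(land_use_fields, lu_values))
    values.update(zip(parcel_fields, pb_values))
    return [values[name] for name in output_fields]

row = build_row(['res', 'G1'], ['123-45', 2.5, 'R1'])
assert len(row) == len(output_fields)
```

The same idea carries over to the cursor code: derive newRow from the destination field list itself, rather than hoping two independently built lists line up.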

Related

keep calling an API until it is updated with latest item (Python)

I'm looking to call an API and compare the data to my saved data in a CSV. If there is a new data point, I want to update my CSV and return the DataFrame. The mystery is why these two variables appear to be the same, yet the if statement falls through to the else instead of recognizing they are equal; if they are the same, it should keep looping until an updated data point appears (see second_cell == lastItem1).
import pandas_datareader as pdr  # https://medium.com/swlh/pandas-datareader-federal-reserve-economic-data-fred-a360c5795013
import datetime

def datagetter():
    i = 1
    while i < 120:
        # Step 1: get data, and print last item
        start = datetime.datetime(2005, 1, 1)
        end = datetime.datetime(2040, 1, 1)
        df = pdr.DataReader('PAYEMS', 'fred', start, end)  # This is the API
        lastItem1 = df["PAYEMS"].iloc[-1]  # find the last item in the data we have just downloaded
        print("Latest item from Fred API:", lastItem1)
        with open('PAYEMS.csv', 'r') as logs:  # So first we open the most recent CSV file
            data = logs.readlines()
            last_row = data[-1].split(',')  # split on , as CSVs should be
            second_cell = last_row[1]  # "second_cell" is our variable name for the saved datapoint from last month/week/day
        print("Last Item, in thousands:", second_cell)
        if second_cell == lastItem1:
            print("CSV", second_cell, "API", lastItem1, "downloaded and stored items are the same, will re-loop until a new datapoint")
            print("attempt no.", i)
            i += 1
        else:
            df.to_csv("PAYEMS.csv")
            print("returning dataframe")
            # print(df.tail())
            return df

df = datagetter()
print(df.tail(3))
Solved my own problem: my CSV was returning a string and the API an int (everything read back from a CSV file is text).
So:

if second_cell == "":
    second_cell = 0
second_cell1 = int(float(second_cell))
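The mismatch is easy to reproduce without calling the API at all; a minimal sketch, with invented values standing in for the FRED data:

```python
last_item_api = 152000          # what the API side holds: a number
second_cell_csv = "152000"      # what readlines()/split() yields: a string

# Direct comparison fails because the types differ, even though the
# values print identically.
assert second_cell_csv != last_item_api

# Convert the CSV text first (via float, to survive values like "152000.0"),
# with the empty-string guard from the answer above.
second_cell_num = int(float(second_cell_csv)) if second_cell_csv else 0
assert second_cell_num == last_item_api
```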

Having trouble parsing a .CSV file into a dict

I've done some simple .csv parsing in Python but have a new file structure that's giving me trouble. The input file is from a spreadsheet converted into a .CSV file. Here is an example of the input:
Layout
Each set can have many layouts, and each layout can have many layers. Each layer has only one name and object.
Here is the code I am using to parse it. I suspect it's a logic/flow-control problem, because I've parsed things before, just not this deep. The first header row is skipped via code. Any help appreciated!
import csv
import pprint

def import_layouts_schema(layouts_schema_file_name='C:\\layouts\\LAYOUT1.csv'):
    class set_template:
        def __init__(self):
            self.set_name = ''
            self.layout_name = ''
            self.layer_name = ''
            self.obj_name = ''

    def check_layout(st, row, layouts_schema):
        c = 0
        if st.layout_name == '':
            st.layer_name = row[c+2]
            st.obj_name = row[c+3]
            layer = {st.layer_name: st.obj_name}
            layout = {st.layout_name: layer}
            layouts_schema.update({st.set_name: layout})
        else:
            st.layout_name = row[c+1]
            st.layer_name = row[c+2]
            st.obj_name = row[c+3]
            layer = {st.layer_name: st.obj_name}
            layout = {st.layout_name: layer}
            layouts_schema.update({st.set_name: layout})
        return layouts_schema

    def layouts_schema_parsing(obj_list_raw1):  #, location_categories, image_schema, set_location):
        # ------ init -----------------------------------
        skipfirst = True
        c = 0
        firstrow = True
        layouts_schema = {}
        end_flag = ''
        st = set_template()
        # ---------- start parsing here -----------------
        print('Now parsing layouts schema list')
        for row in obj_list_raw1:
            #print('This is the row: ', row)
            if skipfirst == True:
                skipfirst = False
                continue
            if row[c] != '':
                st.set_name = row[c]
                st.layout_name = row[c+1]
                st.layer_name = row[c+2]
                st.obj_name = row[c+3]
                print('FOUND A NEW SET. SET details below:')
                print('Set name:', st.set_name, 'Layout name:', st.layout_name, 'Layer name:', st.layer_name, 'Object name:', st.obj_name)
            if firstrow == True:
                print('First row of layouts import!')
                layer = {st.layer_name: st.obj_name}
                layout = {st.layout_name: layer}
                layouts_schema = {st.set_name: layout}
                firstrow = False
                check_layout(st, row, layouts_schema)
                continue
            elif firstrow == False:
                print('Not the first row of layout import')
                layer = {st.layer_name: st.obj_name}
                layout = {st.layout_name: layer}
                layouts_schema.update({st.set_name: layout})
                check_layout(st, row, layouts_schema)
        return layouts_schema

    # begin subroutine main
    layouts_schema_file_name = 'C:\\Users\\jason\\Documents\\RAY\\layout_schemas\\ANIBOT_LAYOUTS_SCHEMA.csv'
    full_path_to_file = layouts_schema_file_name
    print('============ Importing LAYOUTS schema from: ', full_path_to_file, ' ==============')
    openfile = open(full_path_to_file)
    reader_ob = csv.reader(openfile)
    layout_list_raw1 = list(reader_ob)
    layouts_schema = layouts_schema_parsing(layout_list_raw1)
    print('=========== End of layouts schema import =========')
    return layouts_schema

layouts_schema = import_layouts_schema()
Feel free to throw away any part that doesn't work. I suspect I've gotten inside my own head a little bit here. A for loop or another while loop may do the trick. Ultimately I just want to parse the file into a dict with the same key structure shown, i.e. the final dict's first entry would look like:
{'RESTAURANT': {'RR_FACING1': {'BACKDROP': 'restaurant1'}}}
And the rest on from there. Ultimately I am going to use this key structure and the dict for other purposes. Just can't get the parsing down!
Wow, that's a lot of code!
Maybe try something simpler:

with open('file.csv') as f:
    keys = f.readline().split(';')  # assuming ";" is your csv field separator
    for line in f:
        vals = line.split(';')
        d = dict(zip(keys, vals))
        print(d)

Then either make a better data file (without blanks), or have the parser remember the previous values.
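A minimal sketch of that second suggestion, remembering the previous values so blank cells inherit from the row above; the inlined sample and the set/layout/layer/object column order are assumptions based on the question:

```python
import csv
from io import StringIO

# Hypothetical input mimicking the question's spreadsheet export:
# blank set/layout cells mean "same as the row above".
raw = """set,layout,layer,object
RESTAURANT,RR_FACING1,BACKDROP,restaurant1
,,TABLE,table2
,RR_FACING2,BACKDROP,restaurant2
"""

schema = {}
prev_set = prev_layout = ''
reader = csv.reader(StringIO(raw))
next(reader)  # skip the header row
for set_name, layout, layer, obj in reader:
    prev_set = set_name or prev_set        # fill down blank cells
    prev_layout = layout or prev_layout
    schema.setdefault(prev_set, {}).setdefault(prev_layout, {})[layer] = obj

print(schema)
# {'RESTAURANT': {'RR_FACING1': {'BACKDROP': 'restaurant1', 'TABLE': 'table2'},
#                 'RR_FACING2': {'BACKDROP': 'restaurant2'}}}
```

The nested setdefault calls build exactly the {set: {layout: {layer: object}}} shape asked for, without any flag variables.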
While I agree with #AK47 that the code review site may be the better approach, I received so much help from SO that I'll try to give back a little: IMHO you are overthinking the problem. Please find below an approach that should point you in the right direction and doesn't even require converting from Excel to CSV (I like the xlrd module; it's very easy to use). If you already have a CSV, just exchange the loop in the process_sheet() function. Basically, I just store the last value seen for "SET" and "LAYOUT" and, if the current ones are different (and not empty), I set the new value. Hope that helps. And yes, you should think about a better data structure (redundancy is not always bad, if you can avoid empty cells :-) ).
import xlrd

def process_sheet(sheet: xlrd.sheet.Sheet):
    curr_set = ''
    curr_layout = ''
    for rownum in range(1, sheet.nrows):
        row = sheet.row(rownum)
        set_val = row[0].value.strip()
        layout_val = row[1].value.strip()
        if set_val != '' and set_val != curr_set:
            curr_set = set_val
        if layout_val != '' and layout_val != curr_layout:
            curr_layout = layout_val
        result = {curr_set: {curr_layout: {row[2].value: row[3].value}}}
        print(repr(result))

def main():
    # open a workbook (adapt your filename)
    # then get the first sheet (index 0)
    # and call the process function
    wbook = xlrd.open_workbook('/tmp/test.xlsx')
    sheet = wbook.sheet_by_index(0)
    process_sheet(sheet)

if __name__ == '__main__':
    main()
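For completeness, here is a sketch of the CSV variant the answer alludes to: the same last-value bookkeeping, with a plain csv.reader loop in place of the xlrd one (the inlined sample stands in for the real file):

```python
import csv
from io import StringIO

# Inlined sample standing in for the real CSV export.
raw = """SET,LAYOUT,LAYER,OBJECT
RESTAURANT,RR_FACING1,BACKDROP,restaurant1
,,TABLE,table2
"""

def process_csv(fileobj):
    curr_set = ''
    curr_layout = ''
    results = []
    reader = csv.reader(fileobj)
    next(reader)  # skip the header row, as process_sheet() does
    for row in reader:
        set_val = row[0].strip()
        layout_val = row[1].strip()
        if set_val != '' and set_val != curr_set:
            curr_set = set_val
        if layout_val != '' and layout_val != curr_layout:
            curr_layout = layout_val
        results.append({curr_set: {curr_layout: {row[2]: row[3]}}})
    return results

for result in process_csv(StringIO(raw)):
    print(repr(result))
# {'RESTAURANT': {'RR_FACING1': {'BACKDROP': 'restaurant1'}}}
# {'RESTAURANT': {'RR_FACING1': {'TABLE': 'table2'}}}
```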

Retrieving data from namedtuple record structure fails

I need to create a lookup table to store tabular data and retrieve the records based on multiple field values.
 
I found an example post (#15418386) which does almost what I need; however, it always returns the same record regardless of the argument being passed.
I listed the code at the bottom of this post, in case the link does not work.
I have verified that the file is read correctly and that the data table is populated properly by using the debugger in the IDE (I'm using PyCharm).
The test data included in the code is:
name,age,weight,height
Bob Barker,25,175,6ft 2in
Ted Kingston,28,163,5ft 10in
Mary Manson,27,140,5ft 6in
Sue Sommers,27,132,5ft 8in
Alice Toklas,24,124,5ft 6in
The function always returns the last record. I believe the problem is in these lines of code, but I don't understand how they work:

matches = [self.records[index]
           for index in self.lookup_tables[field].get(value, [])]
return matches if matches else None
I would like to understand how the code is supposed to work so I can edit it to be able to search on multiple parameters.
original code:
from collections import defaultdict, namedtuple
import csv

class DataBase(object):
    def __init__(self, csv_filename, recordname):
        # read data from csv format file into a list of named tuples
        with open(csv_filename, 'rb') as inputfile:
            csv_reader = csv.reader(inputfile, delimiter=',')
            self.fields = csv_reader.next()  # read header row
            self.Record = namedtuple(recordname, self.fields)
            self.records = [self.Record(*row) for row in csv_reader]
            self.valid_fieldnames = set(self.fields)
        # create an empty table of lookup tables for each field name that maps
        # each unique field value to a list of record-list indices of the ones
        # that contain it.
        self.lookup_tables = defaultdict(lambda: defaultdict(list))

    def retrieve(self, **kwargs):
        """Fetch a list of records with a field name with the value supplied
        as a keyword arg (or return None if there aren't any)."""
        if len(kwargs) != 1:
            raise ValueError(
                'Exactly one fieldname/keyword argument required for function '
                '(%s specified)' % ', '.join([repr(k) for k in kwargs.keys()])
            )
        field, value = kwargs.items()[0]  # get only keyword arg and value
        if field not in self.valid_fieldnames:
            raise ValueError('keyword arg "%s" isn\'t a valid field name' % field)
        if field not in self.lookup_tables:  # must create field lookup table
            for index, record in enumerate(self.records):
                value = getattr(record, field)
                self.lookup_tables[field][value].append(index)
        matches = [self.records[index]
                   for index in self.lookup_tables[field].get(value, [])]
        return matches if matches else None

if __name__ == '__main__':
    empdb = DataBase('employee.csv', 'Person')
    print "retrieve(name='Ted Kingston'):", empdb.retrieve(name='Ted Kingston')
    print "retrieve(age='27'):", empdb.retrieve(age='27')
    print "retrieve(weight='150'):", empdb.retrieve(weight='150')
The variable value is overwritten in the following if .. for .. block:

field, value = kwargs.items()[0]  # <--- `value` defined

...

if field not in self.lookup_tables:
    for index, record in enumerate(self.records):
        value = getattr(record, field)  # <--- `value` overwritten
        self.lookup_tables[field][value].append(index)

So after the table is built, value refers to the value of the last record rather than the argument you passed in, and the subsequent lookup uses that. Use another name to prevent the overwriting:

if field not in self.lookup_tables:
    for index, record in enumerate(self.records):
        v = getattr(record, field)
        self.lookup_tables[field][v].append(index)
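A condensed, self-contained rendition of the fix (sample data taken from the question), showing that with a separate loop variable the lookup returns the record actually asked for:

```python
from collections import defaultdict, namedtuple

Person = namedtuple('Person', 'name age weight height')
records = [
    Person('Bob Barker', '25', '175', '6ft 2in'),
    Person('Ted Kingston', '28', '163', '5ft 10in'),
    Person('Mary Manson', '27', '140', '5ft 6in'),
]

def retrieve(field, value):
    table = defaultdict(list)
    for index, record in enumerate(records):
        v = getattr(record, field)   # NOT `value`: that would clobber the query
        table[v].append(index)
    return [records[i] for i in table.get(value, [])]

print(retrieve('name', 'Ted Kingston'))
# [Person(name='Ted Kingston', age='28', weight='163', height='5ft 10in')]
```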

how to loop list to pass string to function name in python

I'm trying to find the most efficient way to create different function names (myfunction_a, ..., b, c) with slightly different code (the input file name, 'app/data/mydata_a.csv'). So here below is the function I've got:
def myfunction_a(request):
    os.getcwd()  # Should get this Django project root (where manage.py is)
    fn = os.path.abspath(os.path.join(os.getcwd(), 'app/data/mydata_a.csv'))
    # TODO: Move to helper module
    response_data = {}
    data_format = 'tsv'
    if data_format == 'json':
        with open(fn, 'rb') as tsvin:
            tsvin = csv.reader(tsvin, delimiter='\t')
            for row in tsvin:
                print 'col1 = %s col2 = %s' % (row[0], row[1])
                response_data[row[0]] = row[1]
        result = HttpResponse(json.dumps(response_data), content_type='application/json')
    else:
        with open(fn, 'rb') as tsvin:
            buff = tsvin.read()
            result = HttpResponse(buff, content_type='text/tsv')
    return result
I want to be able to loop through my list and create multiple function names:

mylist = ['a', 'b', 'c', ... 'z']

def myfunction_a(request):
    ...('app/data/mydata_a.csv')
    return request

to get a final result of:

def myfunction_a => taking 'app/data/mydata_a.csv'
def myfunction_b => taking 'app/data/mydata_b.csv'
def myfunction_c => taking 'app/data/mydata_c.csv'

Right now I just copy, paste, and change it. Is there a better way to do this? Any recommendation would be appreciated. Thanks.
You can insert a variable into a string with

"app/data/mydata_%s.csv" % (character)

so

for character in mylist:
    print "app/data/mydata_%s.csv" % (character)

should print the path each time with another character in place of %s.
Since you want every function to use another string to get another file, you can do something like this:

def myfunction(label, request):
    return "app/data/mydata_%s.csv" % (label)

so you get the function label at the end of your document path. Since you described that you only want the name to match the function label, you only need another parameter, not a new function name.
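For example, a quick sketch of calling that single parameterized function in a loop (the request argument here is a dummy stand-in, not a real Django request):

```python
def myfunction(label, request):
    # the label picks the data file; request would be the Django request object
    return "app/data/mydata_%s.csv" % label

for label in ['a', 'b', 'c']:
    print(myfunction(label, 'dummy-request'))
# app/data/mydata_a.csv
# app/data/mydata_b.csv
# app/data/mydata_c.csv
```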
If you must have a special function name, you could do this. Though why you'd need to I'm not sure.
import functools, sys

namespace = sys._getframe(0).f_globals

def myfunction(label, request):
    print request
    return "app/data/mydata_%s.csv" % (label)

my_labels = ['a', 'b', 'c']
for label in my_labels:
    namespace['myfunction_%s' % label] = functools.partial(myfunction, label)

print myfunction_a('request1')
print myfunction_b('request2')
Output is this:
request1
app/data/mydata_a.csv
request2
app/data/mydata_b.csv
Or possibly a better implementation would be:
class MyClass(object):
    def __init__(self, labels):
        for label in labels:
            setattr(self, label, functools.partial(self._myfunction, label))

    def _myfunction(self, label, request):
        print request
        return "app/data/mydata_%s.csv" % (label)

myfunction = MyClass(['a', 'b', 'c'])
print myfunction.c('request3')
Output is this:
request3
app/data/mydata_c.csv

Using Update Cursor to populate 2 fields for Feature Class Name and OID

I am currently trying to populate 2 fields. They are both already created within a table that I want to populate with data from existing feature classes. The idea is to copy all data from the desired feature classes that match a particular project #. The rows that match the project # are copied over to a blank template with the matching fields. So far all is good, except that I need to push the OBJECTID and the name of the feature class into 2 fields within the table.
def featureClassName(table_path):
    arcpy.AddMessage("Calculating Feature Class Name...")
    print "Calculating Feature Class Name..."
    featureClass = "FeatureClass"
    SDE_ID = "SDE_ID"
    fc_desc = arcpy.Describe(table_path)
    lists = arcpy.ListFields(table_path)
    print lists
    with arcpy.da.SearchCursor(table_path, featureClass = "\"NAME\"" + " Is NULL") as cursor:
        for row in cursor:
            print row
            if row.FEATURECLASS = str.replace(row.FEATURECLASS, "*", fc):
                cursor.updateRow(row)
                print row
                del cursor, row
            else:
                pass
The code above is my attempt, out of many, to populate the field with the name of the feature class.
I have attempted to do the same with the OID.
for fc in fcs:
    print fc
    if fc:
        print "Making Layer..."
        lyr = arcpy.MakeFeatureLayer_management(fc, r"in_memory\temp", whereClause)
        fcCount = int(arcpy.GetCount_management(lyr).getOutput(0))
        print fcCount
        if fcCount > 0:
            tbl = arcpy.CopyRows_management(lyr, r"in_memory\temp2")
            arcpy.AddMessage("Checking for Feature Class Name...")
            arcpy.AddMessage("Appending...")
            print "Appending..."
            arcpy.Append_management(tbl, table_path, "NO_TEST")
            print "Checking for Feature Class Name..."
            featureClassName(table_path)
            del fc, tbl, lyr, fcCount
            arcpy.Delete_management(r"in_memory\temp")
            arcpy.Delete_management(r"in_memory\temp2")
        else:
            arcpy.AddMessage("Pass... " + fc)
            print ("Pass... " + fc)
            del fc, lyr, fcCount
            arcpy.Delete_management(r"in_memory\temp")
            pass
This code is the main loop over the feature classes within the dataset, where I create a new layer/table to use for copying the data to the table. The Feature Class Name and OID fields don't have data to push, so that's where I am stuck.
Thanks everybody
You have a number of things wrong. First, you are not setting up the cursor correctly: it has to be an UpdateCursor if you are going to update, and you called a SearchCursor, which you also called incorrectly, by the way. Second, you used = (assignment) instead of == (equality comparison) in the line "if row.FEATURECLASS ...". Then, 2 lines below that, your indentation is messed up on several lines. And it's not clear at all that your function knows the value of fc; pass that as an arg to be sure. A bunch of other problems exist, but let's just give you an example that will work, and you can study it:
def featureClassName(table_path, fc):
    '''Update the FEATURECLASS field in table_path rows with the
    value of fc (string) where FEATURECLASS is currently null.'''
    arcpy.AddMessage("Calculating Feature Class Name...")
    print "Calculating Feature Class Name..."
    # delimit the field correctly for the query expression
    df = arcpy.AddFieldDelimiters(fc, 'FEATURECLASS')
    ex = df + " is NULL"
    flds = ['FEATURECLASS']
    # in case we don't get any rows, del will bomb below unless we
    # put in a ref to row
    row = None
    # do the work
    with arcpy.da.UpdateCursor(table_path, flds, ex) as cursor:
        for row in cursor:
            row[0] = fc  # or basename, don't know which you want
            cursor.updateRow(row)
    del cursor, row
Notice we are now passing the name of the fc as an arg, so you will have to deal with that in the rest of your code. Also it's best to use AddFieldDelimiter, since different fc's require different delimiters, and the docs are not clear at all on this (sometimes they are just wrong).
good luck, Mike
