Retrieve data from SQL database using Python

I'm a newbie to Python and SQL.
My database contains a field which stores paths to images.
I want to assign each path to a variable in Python, but I can only store the path of the first row.
This is my code:
import pymysql
from arcpy import env

env.workspace = "D:/year 4 semester 1/Python/Data/"
conn = pymysql.connect(host='localhost', user='root', password='', db='research')
cursor = conn.cursor()
cursor.execute("select Path from sourcedata")
data = cursor.fetchmany()
for row in data:
    a = row[0]
print a
But when I try the following way, including all the other relevant text:
for row in data:
    a = row[0]
    b = row[1]
    c = row[2]
    print a
the following error appears:
IndexError: tuple index out of range

You need to indent the print a statement to make it part of the loop. Python is very sensitive about indentation. In other languages you use braces or BEGIN/END blocks, but in Python you use indentation.
The row[0], row[1] stuff refers to the elements within the retrieved row, not to different rows.
Indent the print line and it will print the first returned field, Path, for each record.
The tuple index is out of range because only one field (element zero) is returned per row.
Your current code will iterate through every row returned and set a to each one in turn, but will only print the last value of a after it comes out of the loop.
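A runnable sketch of the corrected loop, with an in-memory list of one-element tuples standing in for the cursor results (the sample paths are made up; Python 3 print syntax):

```python
# Simulated result set: each row is a one-element tuple, as the cursor
# returns them for "select Path from sourcedata" (hypothetical sample paths).
data = [("D:/images/cat.png",), ("D:/images/dog.png",)]

paths = []
for row in data:
    a = row[0]        # element zero is the only field, Path
    paths.append(a)
    print(a)          # indented, so it runs once per row
```

With the print inside the loop, every path is printed rather than only the last one.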

Related

Removing duplicate records based on hierarchy using arcpy and dictionaries

I'm attempting to flag duplicate records and delete them using a data dictionary with an arcpy update cursor, and I'm running into dictionary issues.
Essentially, my code iterates through the attribute table and adds a dictionary entry of FACE_ID:CHNG_TYPE for each new FACE_ID. If it encounters a FACE_ID that's already in the dictionary, it compares the CHNG_TYPE of the duplicate FACE_IDs to see which should be deleted (I've left the weighted comparison out as it isn't the issue).
To compare them, the cursor pulls the first change (change_a) CHNG_TYPE directly from the cursor row it's in. It also pulls the FACE_ID so that it can query the dictionary to get the CHNG_TYPE for the other FACE_ID.
When I print the dictionary, it looks like what I would expect. However, change_b = dict[row[0]] is calculating to be the same value every time, and I'm not sure why.
When I create the dictionary using this code but leave out the elif statement, I can pull the change_b value accurately with dict[FACE_ID].
Code below, and any help is appreciated!
with arcpy.da.UpdateCursor(fc, ['FACE_ID', 'CHNG_TYPE', 'RELATE']) as cursor:
    dict = {}
    for row in cursor:
        if row[0] in dict:
            change_a = row[1]
            change_b = dict[row[0]]
            print(change_a + ' ' + change_b)
        elif row[0] not in dict:
            dict[row[0]] = row[1]
To give an example, this statement creates the dictionary and returns the expected value:
with arcpy.da.UpdateCursor(fc, ['FACE_ID', 'CHNG_TYPE', 'RELATE']) as cursor:
    dict = {}
    for row in cursor:
        if row[0] not in dict:
            dict[row[0]] = row[1]
dict[123456]
Have you considered using the Delete Identical or Find Identical tools available in ArcGIS Pro?
arcpy.management.DeleteIdentical(in_dataset, fields, {xy_tolerance}, {z_tolerance})
arcpy.management.FindIdentical(in_dataset, out_dataset, fields, {xy_tolerance}, {z_tolerance}, {output_record_option})
It could be a faster and more cost-effective solution than your current approach.
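For reference, the dictionary-based duplicate check itself is sound; here is a pure-Python sketch of the same logic with plain tuples standing in for cursor rows (the FACE_ID/CHNG_TYPE values are invented, and arcpy is not needed to run it):

```python
# Rows of (FACE_ID, CHNG_TYPE); the third row duplicates FACE_ID 101.
rows = [(101, 'A'), (102, 'B'), (101, 'C')]

seen = {}      # FACE_ID -> CHNG_TYPE of the first occurrence
pairs = []     # (change_a, change_b) collected for each duplicate
for face_id, chng_type in rows:
    if face_id in seen:
        change_a = chng_type       # change type from the current row
        change_b = seen[face_id]   # change type stored for the earlier row
        pairs.append((change_a, change_b))
    else:
        seen[face_id] = chng_type

print(pairs)  # [('C', 'A')]
```

If change_b always comes out the same against the real data, it is worth printing row[0] inside the if block to confirm the duplicate FACE_IDs are the ones you expect.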

Python MySQLdb TypeError("not all arguments converted during string formatting")

I know this is a popular topic but I searched the various answers and didn't see a clear answer to my issue. I have a function that I want to use to insert records into my NDBC database that is giving me the error I mentioned in the title. The function is below:
def insertStdMet(station, cursor, data):
    # This function takes in a station id, database cursor and an array of data. At present
    # it assumes the data is a pandas dataframe with the datetime value as the index.
    # It may eventually be modified to be more flexible. With the parameters
    # passed in, it goes row by row and builds an INSERT INTO SQL statement
    # that assumes each row in the data array represents a new record to be
    # added.
    fields = list(data.columns)  # if our table has been constructed properly, these column names should map to the fields in the data table
    # Building the SQL string
    strSQL1 = 'REPLACE INTO std_met (station_id,date_time,'
    strSQL2 = 'VALUES ('
    for f in fields:
        strSQL1 += f + ','
        strSQL2 += '%s,'
    # trimming the last comma
    strSQL1 = strSQL1[:-1]
    strSQL2 = strSQL2[:-1]
    strSQL1 += ") " + strSQL2 + ")"
    # Okay, now we have our SQL string. Now we need to build the list of tuples
    # that will be passed along with it to the .executemany() function.
    tuplist = []
    for i in range(len(data)):
        r = data.iloc[i][:]
        datatup = (station, r.name)
        for f in r:
            datatup += (f,)
        tuplist.append(datatup)
    cursor.executemany(strSQL1, tuplist)
When we get to the cursor.executemany() call, strSQL1 looks like this:
REPLACE INTO std_met (station_id,date_time,WDIR,WSPD,GST,WVHT,DPD,APD,MWD,PRES,ATMP,WTMP,DEWP,VIS) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
I'm using %s placeholders throughout and I am passing a list of tuples (~2315 tuples). Every value being passed is either a string, datetime, or number. I still have not found the issue. Any insights anyone cares to pass along would be sincerely appreciated.
Thanks!
You haven't given your SQL query a placeholder for either station_id or date_time, so when it goes to unpack your arguments, two placeholders are missing.
I suspect you want the final call to be something like:
REPLACE INTO std_met
(station_id,date_time,WDIR,WSPD,GST,WVHT,DPD,APD,MWD,
PRES,ATMP,WTMP,DEWP,VIS) VALUES (%s, %s, %s,%s,%s,%s,
%s,%s,%s,%s,%s,%s,%s,%s)
Note the extra two %s. It looks like your tuple already contains values for station_id and date_time, so you could try this change:
strSQL1 = 'REPLACE INTO std_met (station_id,date_time,'
strSQL2 = 'VALUES (%s, %s, '
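A quick way to sanity-check the fix is to rebuild the string with a small made-up column list and count the placeholders (a pandas-free sketch; the three field names are stand-ins for data.columns):

```python
fields = ['WDIR', 'WSPD', 'GST']  # hypothetical subset of the real columns

strSQL1 = 'REPLACE INTO std_met (station_id,date_time,'
strSQL2 = 'VALUES (%s, %s, '      # two extra %s for station_id and date_time
for f in fields:
    strSQL1 += f + ','
    strSQL2 += '%s,'
strSQL1 = strSQL1[:-1]            # trim the trailing commas
strSQL2 = strSQL2[:-1]
strSQL1 += ") " + strSQL2 + ")"

print(strSQL1)
# Each per-row tuple is (station, date_time, WDIR, WSPD, GST) -- five values --
# and the statement now contains five %s placeholders to match.
```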

Modifying blank entries to Null

I'm trying to loop over items in a CSV and, for anything in the dataset that is blank, set it to None (SQL NULL). I am using the csv and psycopg2 modules, both already imported.
The overall goal here is to read any items that are blank in the CSV and set them to NULL. I'm using
item = "None" just to check whether the blank items are found; from there I think I can set them to None.
Sample Data:
name, age, breed_name, species_name, shelter_name, adopted
Titchy, 12, mixed, cat, BCSPCA, 1
Ginger, 1, labradoodle, dog,,1
Sample Code:
import psycopg2
import csv

for new_pet in dictReader:
    for item in new_pet:
        item = item.capitalize()
        if item is '':
            print item  # Used to check/debugging
            item = "None"
I can't figure out where I am going wrong here. Any advice is greatly appreciated.
When you update item inside the for loop, it has no effect on the list it came from. You are not modifying the list but a local "copy" of one item.
You can replace the whole inner for loop with a list comprehension:
import csv

for new_pet in dictReader:
    new_pet = [value.capitalize() if value else None for value in new_pet]
    print new_pet
This will take all items in new_pet and run value.capitalize() if value else None on each of them.
That means: if value evaluates to True (non-empty strings do), return the value capitalized; if not, return None.
Remember to do your data processing per line inside the outer for loop.
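Here is the comprehension on its own, with an in-memory row instead of a csv reader (Python 3 print syntax; the row is taken from the sample data above):

```python
# One row from the sample CSV, with the blank shelter_name field.
new_pet = ['ginger', '1', 'labradoodle', 'dog', '', '1']

# Blank strings become None; everything else is capitalized.
cleaned = [value.capitalize() if value else None for value in new_pet]
print(cleaned)  # ['Ginger', '1', 'Labradoodle', 'Dog', None, '1']
```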

SQLite output from query into Python script

I have this Python script:
s = stdscr.getstr(0, 0, 20)  # input length last number
c = db.execute("""SELECT "debit" FROM "members" WHERE "barcode" = '%s' LIMIT 1""" % (s,))
for row in c:
    print row
    if row == '(0,)':
        # display cross
        print 'Tick'
    else:
        # display tick
        print 'Cross'
Where it is asking for a barcode input, and matching the debit field in the database.
The "print row" command returns "(0,)" but when I try to match it, I always get "Cross" as the output, which is not the intended result. Is there a semantic I'm obviously not observing?
Many thanks!
The variable row is a tuple, and '(0,)' is its string representation. You are comparing a tuple with its string representation, which cannot work.
You need to compare it to the tuple value:
if row == (0,):
Simply remove the quote marks.
Alternatively, you can write
if row[0] == 0:
which will avoid the creation of a tuple just for the comparison. As noted by @CL., row will never be an empty tuple, so extracting row[0] is safe.
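A standalone illustration of the three comparisons (nothing database-specific; row is just a tuple literal):

```python
row = (0,)

print(row == '(0,)')  # False: a tuple never equals its string representation
print(row == (0,))    # True: compare against the tuple value
print(row[0] == 0)    # True: or compare the single element directly
print(str(row))       # (0,) -- this is what printing row was showing
```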

populate SQL table by reading tweets from dictionary within a dictionary using Python

I am trying to read 1000 tweets from a file.
http://rasinsrv07.cstcis.cti.depaul.edu/CSC455/Twitter_2013_11_12.txt
The tweets are stored on line-by-line basis.
I have to create a SQL table for the 'geo' entry. 'Geo' is a dictionary inside the tweets dictionary. In some cases the 'geo' dictionary is completely blank and in some cases it has values. I need to keep track of how many geo dictionaries are blank and how many have values. I need to generate a unique ID for that table. In addition to the ID column, the geo table should have "type", "longitude" and "latitude" columns. If the geo dictionary has values, it looks like this:
{u'type': u'Point', u'coordinates': [44.49241705, 11.33374359]}
Since I am new to Python and SQLite, my code is basic (as I want to be able to understand my code), and it is not working as expected. I am trying to do the insert into the Geo table if the length of the geo dictionary is greater than 1, but it is not working. Any input will be greatly appreciated.
import urllib2, time, json, sqlite3

conn = sqlite3.connect('Tweets_Database_A6.db')
c = conn.cursor()

wFD = urllib2.urlopen('http://rasinsrv07.cstcis.cti.depaul.edu/CSC455/Twitter_2013_11_12.txt')
numLines = 1000
tweets = []
while numLines > 0:
    line = wFD.readline()
    numLines = numLines - 1
    try:
        tweets.append(json.loads(line))
    except:
        print line
wFD.close()

# create geo table using sqlite3
TblGeo = """create table Geo(Id number, Type text, Longitude number, latitude number);"""
c.execute(TblGeo)

HasGeo = 0
NoGeo = 0
for tweet in tweets:
    tweet_geo = tweet['geo']
    if len(tweet_geo) > 1:
        HasGeo = HasGeo + 1
        try:
            c.execute("insert into Geo(id, Type, Longitude, Latitude) values ('%s', '%s', '%s', '%s')" % (HasGeo, tweet_geo['type'], tweet_geo['coordinates'][0], tweet_geo['coordinates'][1]))
        except:
            print "no entry for ", i
    else:
        NoGeo = NoGeo + 1
print HasGeo, " ", NoGeo
Your code is failing for a few reasons. Since this appears to be an assignment, I will not post the working code here, but I will attempt to point you in the right direction. Here are some of the things I've noticed while testing your code:
You made the assumption that tweet['geo'] would be an empty string. It actually is not. Essentially, the data sets this value to a JSON "null" when no geo information is available; this gets translated to the NoneType in Python, not an empty string. Therefore, you should not be checking the length of that value, but rather whether the value is truthy (hint: Python considers '', [], {}, and None as False).
I don't think your indentation on lines 28-31 is correct. Shouldn't that logic execute inside the if block? Right now you are always executing that code, which I think is a logical error.
In your exception trapping at line 31, where do you define the variable "i"?
I hope this is helpful; feel free to ask for additional clarification if you are stumped.
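A sketch of the truthiness test described above: tweet['geo'] is None when no geo information is present, so check the value itself rather than its length (the two minimal tweets here are hypothetical; the real file contains full tweet objects):

```python
# Two hypothetical tweets: one with geo data, one where json "null"
# became Python's None.
tweets = [
    {'geo': {'type': 'Point', 'coordinates': [44.49241705, 11.33374359]}},
    {'geo': None},
]

HasGeo = 0
NoGeo = 0
for tweet in tweets:
    tweet_geo = tweet['geo']
    if tweet_geo:        # None and {} are both False; a populated dict is True
        HasGeo += 1
    else:
        NoGeo += 1       # calling len(tweet_geo) here would raise TypeError

print(HasGeo, NoGeo)  # 1 1
```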
