Iterating through MySQL table and updating - python

I have a MySQL table which stores a couple thousand addresses. I need to pass them to a geolocation API, get the latitude and longitude, and then write those back into the corresponding address row (I made special columns for that). The question is: what is the most efficient way to do it? Currently I am using Python with mysql.connector and geopy for geolocation. This is the simple code I use for geocoding:
cursor = conn.cursor()
cursor.execute("SELECT description FROM contacts WHERE kind = 'Home adress'")
row = cursor.fetchone()
while row is not None:
    geocoded = geolocator.geocode(row[0], exactly_one=True)
    if geocoded is not None:
        lat = geocoded.latitude
        lon = geocoded.longitude
    row = cursor.fetchone()

You can use cursor.executemany() to update the table in one go. This requires building a list of update parameters, which can then be passed to executemany(). That list can be created from the results of the initial SELECT query. In the example below I have assumed that the contacts table has a primary key named key_id:
cursor = conn.cursor()
cursor.execute("SELECT key_id, description FROM contacts WHERE kind = 'Home adress'")
update_params = []
for key_id, description in cursor:
    geocoded = geolocator.geocode(description, exactly_one=True)
    if geocoded is not None:
        lat = geocoded.latitude
        lon = geocoded.longitude
        update_params.append((lat, lon, key_id))
cursor.executemany("UPDATE contacts SET lat = %s, lon = %s WHERE key_id = %s", update_params)
As mentioned above, this assumes the existence of a primary key. If there is none and description is a unique field in the table, then you could use that instead: remove key_id from the SELECT query, and replace key_id with the description field in both the update_params list and the update query.
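A hypothetical sketch of that description-keyed variant, with the geocoding step factored into a pure helper so it can be tested in isolation (the `geocode` callable stands in for `geolocator.geocode` and should return a latitude/longitude pair or None):

```python
# Hypothetical sketch of the description-keyed variant described above.
def build_update_params(rows, geocode):
    """Collect (lat, lon, description) tuples for every row that geocodes."""
    params = []
    for (description,) in rows:
        located = geocode(description)
        if located is not None:
            lat, lon = located
            params.append((lat, lon, description))
    return params

# With a live connection, the surrounding code would look like:
# cursor.execute("SELECT description FROM contacts WHERE kind = 'Home adress'")
# update_params = build_update_params(cursor.fetchall(), my_geocode)
# cursor.executemany(
#     "UPDATE contacts SET lat = %s, lon = %s WHERE description = %s",
#     update_params)
```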

@mhavke, thanks a lot! Just what I needed. Here is the final working code (I made some adjustments). I am also aware that using '%s' is unsafe, but this is for internal use only, so I'm not really worried about it.
cursor = conn.cursor()
cursor.execute("SELECT key_id, description FROM contacts WHERE kind = 'Home address'")
update_params = []
for key_id, description in cursor:
    geocoded = geolocator.geocode(description, exactly_one=True)
    if geocoded is not None:
        lat = geocoded.latitude
        lon = geocoded.longitude
        update_params.append((lat, lon, key_id))
cursor.executemany("UPDATE contacts SET latitude = %s, longitude = %s WHERE key_id = %s", update_params)
conn.commit()

Related

Creating a Search Record Function Python SQLite3

I am currently working on a coursework project for school: a database system with a user interface built with Tkinter, Python and SQLite3. I have made a form to add, delete, update and search for customers. I am able to display the result from a single field; however, I am struggling to get the message box to display all the fields, which is what I would like it to do. I have attached photos of the form along with the code. Thank you in advance.
def SearchCustomer(self):
    customerid = self.CustomerEntry.get()
    with sqlite3.connect("LeeOpt.db") as db:
        cursor = db.cursor()
        search_customer = ('''SELECT * FROM Customer WHERE CustomerID = ?''')
        cursor.execute(search_customer, [(customerid)])
        results = cursor.fetchall()
        if results:
            for i in results:
                tkinter.messagebox.showinfo("Notification", i[0])
That is because you showed only the first column (i[0]) of the result.
Since there should be only one record for a specific customer ID, use fetchone() instead of fetchall(); then you can show the whole record as below:
def SearchCustomer(self):
    customerid = self.CustomerEntry.get()
    with sqlite3.connect("LeeOpt.db") as db:
        cursor = db.cursor()
        search_customer = "SELECT * FROM Customer WHERE CustomerID = ?"
        cursor.execute(search_customer, [customerid])
        result = cursor.fetchone()  # there should be only one record for a specific customer ID
        if result:
            tkinter.messagebox.showinfo("Notification", "\n".join(str(x) for x in result))
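If you also want the column names next to each value, sqlite3 exposes them through cursor.description after a SELECT. A minimal, self-contained sketch using an in-memory database (the table and column names here are invented for the demo; the real code would keep the messagebox call):

```python
import sqlite3

db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE Customer (CustomerID INTEGER, Forename TEXT, Surname TEXT)")
cursor.execute("INSERT INTO Customer VALUES (1, 'Jane', 'Smith')")

cursor.execute("SELECT * FROM Customer WHERE CustomerID = ?", (1,))
result = cursor.fetchone()
if result:
    # cursor.description holds one 7-tuple per column; item [0] is the column name
    columns = [col[0] for col in cursor.description]
    text = "\n".join(f"{name}: {value}" for name, value in zip(columns, result))
    # tkinter.messagebox.showinfo("Notification", text)
    print(text)
```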

Retrieve data with a SELECT and treat it separately

I need to do a SELECT to an SQL Server table and treat the information I get separately.
For example, let's say I have this table named Table1
And I do this SELECT in python:
SELECT name, phone, date FROM Table1
When printed, the result looks like:
[['Sara Miller', 611111111], ['Jane Smith', 622222222], ['Amanda Laurens', 633333333]]
I need to treat each row and each name and phone number separately to send SMS... So, how can I access each one using Python?
For example, to send an SMS to the number 611111111 saying:
"Dear Sara Miller, tomorrow (20/05/2020) you have an appointment in the Clinic"
The SMS part I have covered, using an API, the problem is I can't figure out how to treat received data from SQL Server.
The code I have at the moment is:
conn = pypyodbc.connect("Connection parameters, working OK")
cursor = conn.cursor()
cursor.execute('SELECT name, phone, date FROM Table1')
result = cursor.fetchall()
final_result = [list(i) for i in result]
print(final_result)
If I need to clarify something please let me know.
I haven't really worked with pypyodbc, so I'm not sure of the exact format of the data that cursor.fetchall() returns; I have listed a few approaches which should cover the likely scenarios.
conn = pypyodbc.connect("Connection parameters, working OK")
cursor = conn.cursor()
cursor.execute('SELECT name, phone, date FROM Table1')
for row in cursor.fetchall():
    name = row[0]
    phone = row[1]
    date = row[2]
    # do something with these variables
If the result returned is a dict instead of a list then it becomes:
for row in cursor.fetchall():
    name = row['name']
    phone = row['phone']
    date = row['date']
    # do something with these variables
Or, as @DanGuzman mentions, we can also do:
for row in cursor.fetchall():
    name = row.name
    phone = row.phone
    date = row.date
    # do something with these variables
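Putting it together with the sample data from the question, the message-building step might look like this (`send_sms` is a hypothetical stand-in for whatever the real SMS API exposes):

```python
# Hypothetical sketch: format one SMS per row returned by the SELECT.
def build_message(name, date):
    return f"Dear {name}, tomorrow ({date}) you have an appointment in the Clinic"

# Rows as they might come back from cursor.fetchall()
rows = [("Sara Miller", 611111111, "20/05/2020"),
        ("Jane Smith", 622222222, "20/05/2020")]

for name, phone, date in rows:
    message = build_message(name, date)
    # send_sms(phone, message)  # hypothetical SMS API call
```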

Create/Insert Json in Postgres with requests and psycopg2

I just started a project with PostgreSQL. I would like to make the leap from Excel to a database, and I am stuck on create and insert. Once I run this I will have to switch it to UPDATE, I believe, so I don't keep overwriting the current data. I know my connection is working, but I get the following error:
TypeError: not all arguments converted during string formatting
#!/usr/bin/env python
import requests
import psycopg2

conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')
req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018')
data = req.json()['data']
my_data = []
for item in data:
    season = item['seasonId']
    player = item['playerName']
    first_name = item['playerFirstName']
    last_Name = item['playerLastName']
    playerId = item['playerId']
    height = item['playerHeight']
    pos = item['playerPositionCode']
    handed = item['playerShootsCatches']
    city = item['playerBirthCity']
    country = item['playerBirthCountry']
    state = item['playerBirthStateProvince']
    dob = item['playerBirthDate']
    draft_year = item['playerDraftYear']
    draft_round = item['playerDraftRoundNo']
    draft_overall = item['playerDraftOverallPickNo']
    my_data.append([playerId, player, first_name, last_Name, height, pos, handed, city, country, state, dob, draft_year, draft_round, draft_overall, season])
cur = conn.cursor()
cur.execute("CREATE TABLE t_skaters (data json);")
cur.executemany("INSERT INTO t_skaters VALUES (%s)", (my_data,))
Sample of data:
[[8468493, 'Ron Hainsey', 'Ron', 'Hainsey', 75, 'D', 'L', 'Bolton', 'USA', 'CT', '1981-03-24', 2000, 1, 13, 20172018], [8471339, 'Ryan Callahan', 'Ryan', 'Callahan', 70, 'R', 'R', 'Rochester', 'USA', 'NY', '1985-03-21', 2004, 4, 127, 20172018]]
It seems like you want to create a table with one column named "data", whose type is JSON. (I would recommend creating one column per field, but it's up to you.)
In this case the variable data (read from the request) is a list of dicts. As I mentioned in my comment, you can loop over data and do the inserts one at a time, since executemany() is not faster than multiple calls to execute().
What I did was the following:
1. Create a list of the fields that you care about.
2. Loop over the elements of data.
3. For each item in data, extract the fields into my_data.
4. Call execute() and pass in json.dumps(my_data) (this converts my_data from a dict into a JSON string).
Try this:
#!/usr/bin/env python
import requests
import psycopg2
import json

conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')
req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018')
# data here is a list of dicts
data = req.json()['data']
cur = conn.cursor()
# create a table with one column of type JSON
cur.execute("CREATE TABLE t_skaters (data json);")
fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]
for item in data:
    my_data = {field: item[field] for field in fields}
    cur.execute("INSERT INTO t_skaters VALUES (%s)", (json.dumps(my_data),))
# commit changes
conn.commit()
# Close the connection
conn.close()
I am not 100% sure if all of the postgres syntax is correct here (I don't have access to a PG database to test), but I believe that this logic should work for what you are trying to do.
Update For Separate Columns
You can modify your CREATE statement to handle multiple columns, but it would require knowing the data type of each column. Here's some pseudocode you can follow:
# same boilerplate code from above
cur = conn.cursor()
# create a table with one column per field
cur.execute(
    """CREATE TABLE t_skaters (seasonId INTEGER, playerName VARCHAR, ...);"""
)
fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]
for item in data:
    my_data = [item[field] for field in fields]
    # need a placeholder (%s) for each variable
    # refer to postgres docs on INSERT statement on how to specify order
    cur.execute("INSERT INTO t_skaters VALUES (%s, %s, ...)", tuple(my_data))
# commit changes
conn.commit()
# Close the connection
conn.close()
Replace the ... with the appropriate values for your data.
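One way to avoid counting %s markers by hand is to derive both the column list and the placeholders from fields. A sketch (table and field names taken from the answer above, with the field list shortened for the demo):

```python
fields = ['seasonId', 'playerName', 'playerFirstName']  # shortened for the demo

columns = ", ".join(fields)
placeholders = ", ".join(["%s"] * len(fields))
sql = f"INSERT INTO t_skaters ({columns}) VALUES ({placeholders})"
# With a live connection:
# cur.execute(sql, tuple(item[field] for field in fields))
print(sql)
```

This keeps the statement in sync with the field list if columns are added or removed later.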

Insert Data Into A Table Using The Same Foreign Key Value

I'm using SQL Server, Python, pypyodbc.
The tables I have are:
tbl_User: id, owner
tbl_UserPhone: id, number, user_id
id is the primary key of tbl_User, and user_id is the foreign key in tbl_UserPhone that references it.
I'm trying to insert 2 different phones to the same user_id using pypyodbc.
This is one of the things I tried that did not work:
cursor = connection.cursor()
SQLCommand = ("INSERT INTO tbl_UserPhones"
              "(id,number,user_id)"
              " VALUES (?,?,?)")
values = [userphone_index, user_phone, "((SELECT id from tbl_User where id = %d))" % user_id_index]
cursor.execute(SQLCommand, values)
cursor.commit()
Based on your comments, you have an identity column in tbl_UserPhones; from the column names I'm guessing it's the id column.
The exception you get is very clear: you can't insert data into an identity column without specifically setting IDENTITY_INSERT to ON before your INSERT statement. Generally, messing around with identity columns is bad practice; it's better to let SQL Server use its built-in capabilities and handle the insert into the identity column automatically.
You need to change your INSERT statement to not include the id column:
Instead of
SQLCommand = ("INSERT INTO tbl_UserPhones"
              "(id,number,user_id)"
              " VALUES (?,?,?)")
values = [userphone_index, user_phone, "((SELECT id from tbl_User where id = %d))" % user_id_index]
try this:
SQLCommand = ("INSERT INTO tbl_UserPhones"
              "(number,user_id)"
              " VALUES (?,?)")
values = [user_phone, "((SELECT id from tbl_User where id = %d))" % user_id_index]
SQLCommand = ("INSERT INTO tbl_UserPhones"
              "(id,number,user_id)"
              " VALUES (?,?,?)")
user_sqlCommand = cursor.execute("(SELECT id FROM tbl_User WHERE id = %d)" % user_index).fetchone()[0]
values = [userphone_index, user_phone, user_sqlCommand]
This was the solution.
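The lookup and the insert can also be collapsed into a single INSERT ... SELECT statement, binding both values as real parameters instead of interpolating a subquery string. A self-contained demo of the pattern using in-memory SQLite (the real code would run the same statement shape through pypyodbc against SQL Server, where the identity column likewise assigns itself):

```python
import sqlite3

db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE tbl_User (id INTEGER PRIMARY KEY)")
cur.execute("CREATE TABLE tbl_UserPhones ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, number TEXT, user_id INTEGER)")
cur.execute("INSERT INTO tbl_User (id) VALUES (1)")

# Insert two phones for the same user; the id column fills itself in.
for phone in ("555-0001", "555-0002"):
    cur.execute("INSERT INTO tbl_UserPhones (number, user_id) "
                "SELECT ?, id FROM tbl_User WHERE id = ?", (phone, 1))

cur.execute("SELECT number, user_id FROM tbl_UserPhones ORDER BY id")
print(cur.fetchall())
```

Because the SELECT only matches when the user row exists, nothing is inserted for an unknown user_id.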

Replace L, in SQL results in python

I'm running pyodbc connected to my db, and when I run a simple query I get a load of results back such as (7L, ), (12L, ), etc.
How do I get rid of the 'L' so I can pass the ids into another query?
Thanks
Here's my code
import pyodbc

cnxn = pyodbc.connect('DSN=...;UID=...;PWD=...', ansi=True)
cursor = cnxn.cursor()
rows = cursor.execute("select id from orders")
for row in rows:
    test = cursor.execute("select name from customer where order_id = %(id)s" % {'id': row})
    print test
Use parameters:
...
test = cursor.execute("select name from customer where order_id = ?", row.id)
...
The L after the number indicates that the value is a Python 2 long. It only appears in the value's repr, not in the value itself, so there is nothing to strip once you pass the id as a parameter.
