Python Script to Execute SQL in Loop

I am trying to write a Python script that runs MySQL queries, where the second query relies on the result of the first. The problem is that the script throws an error on the second query:
Error: cursor2.execute(field_sql, id)
import boto3
import importlib
import psutil
import pymysql
import pymysql.cursors
import subprocess
import sys
rdsConn = pymysql.connect(host='XXXX',
                          db='XXXX',
                          user='XXXX',
                          password='XXXX',
                          charset='utf8mb4',
                          cursorclass=pymysql.cursors.DictCursor)
cursor1 = rdsConn.cursor()
cursor2 = rdsConn.cursor()
name = 'Test'
sql = "select id from Table1 where name = %s"
cursor1.execute(sql,name)
result = cursor1.fetchall()
for id in result:
    field_sql = "select columnname from Table2 where id = %s"
    cursor2.execute(field_sql, id)
    fieldresult = cursor2.fetchall()
    for fieldrow in fieldresult:
        print(fieldrow)
cursor1.close()
cursor2.close()
rdsConn.close()

Your query uses a dict cursor, so it will return a list of dicts, e.g.:
[{'id': 1}, {'id': 2}, ...]
That means each id* in your loop will be a dict, not a tuple as it would be otherwise, so you are passing your arguments to the second query as a dict. If you do so, you need to use named parameters in the pyformat style:
for rowdict in result:
    field_sql = "select columnname from Table2 where id = %(id)s"
    cursor2.execute(field_sql, rowdict)
    fieldresult = cursor2.fetchall()
    for fieldrow in fieldresult:
        print(fieldrow)
You'll see that the printed fieldrows are also dicts.
Also, query parameters should be passed either as a dict (named parameters) or as a tuple (positional parameters). pymysql happens to accept the form cursor.execute(sql, "name"), but other DB-API 2 connectors don't. The canonical form would be cursor.execute(sql, ("name",)).
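Applied to the first query in the question, the canonical positional form looks like this:
sql = "select id from Table1 where name = %s"
cursor1.execute(sql, (name,))  # a one-element tuple, even for a single parameter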
*By the way, you shouldn't use id as a variable name; it shadows the built-in id() function.

Related

Retrieve query results as dict in SQLAlchemy

I am using Flask-SQLAlchemy, and I have the following code to fetch users from a MySQL database with a raw SQL query:
connection = engine.raw_connection()
cursor = connection.cursor()
cursor.execute("SELECT * from User where id=0")
results = cursor.fetchall()
The results variable is a tuple and I want it to be of type dict. Is there a way to achieve this?
When I was using pymysql to build the DB connection, I was able to do:
cursor = connection.cursor(pymysql.cursors.DictCursor)
Is there something similar in SQLAlchemy?
Note: the reason I want to make this change is to get rid of pymysql in my code and use only SQLAlchemy features, i.e. I do not want import pymysql anywhere in my code.
results is a tuple and I want it to be of type dict()
Updated answer for SQLAlchemy 1.4:
Version 1.4 has deprecated the old engine.execute() pattern and changed the way .execute() operates internally. .execute() now returns a CursorResult object with a .mappings() method:
import sqlalchemy as sa
from pprint import pprint
# …
with engine.begin() as conn:
    qry = sa.text("SELECT FirstName, LastName FROM clients WHERE ID < 3")
    resultset = conn.execute(qry)
    results_as_dict = resultset.mappings().all()
    pprint(results_as_dict)
"""
[{'FirstName': 'Gord', 'LastName': 'Thompson'},
{'FirstName': 'Bob', 'LastName': 'Loblaw'}]
"""
(Previous answer for SQLAlchemy 1.3)
SQLAlchemy already does this for you if you use engine.execute instead of raw_connection(). With engine.execute, fetchone will return a SQLAlchemy Row object and fetchall will return a list of Row objects. Row objects can be accessed by key, just like a dict:
sql = "SELECT FirstName, LastName FROM clients WHERE ID = 1"
result = engine.execute(sql).fetchone()
print(type(result)) # <class 'sqlalchemy.engine.result.Row'>
print(result['FirstName']) # Gord
If you need a true dict object then you can just convert it:
my_dict = dict(result)
print(my_dict) # {'FirstName': 'Gord', 'LastName': 'Thompson'}
If raw_connection() is returning a PyMySQL Connection object then you can continue to use DictCursor like so:
engine = create_engine("mysql+pymysql://root:whatever@localhost:3307/mydb")
connection = engine.raw_connection()
cursor = connection.cursor(pymysql.cursors.DictCursor)
cursor.execute("SELECT 1 AS foo, 'two' AS bar")
result = cursor.fetchall()
print(result) # [{'foo': 1, 'bar': 'two'}]
You can use the SQLAlchemy cursor and the cursor's description attribute:
def rows_as_dicts(cursor):
    """convert tuple result to dict with cursor"""
    col_names = [i[0] for i in cursor.description]
    return [dict(zip(col_names, row)) for row in cursor]

db = SQLAlchemy(app)
# get cursor
cursor = db.session.execute(sql).cursor
# tuple result to dict
result = rows_as_dicts(cursor)
I would personally use pandas:
import pandas as pd
connection = engine.raw_connection()
df = pd.read_sql_query('SELECT * from User where id=0' , connection)
mydict = df.to_dict()
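One caveat: by default df.to_dict() returns a column-oriented mapping of the form {column: {index: value}}. If you want one dict per row, as a DictCursor would return, pass orient='records':
mydict = df.to_dict(orient='records')  # [{'col1': ..., 'col2': ...}, ...] -- one dict per row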

Create/Insert Json in Postgres with requests and psycopg2

I just started a project with PostgreSQL. I would like to make the leap from Excel to a database, and I am stuck on CREATE and INSERT. Once I run this I will have to switch it to UPDATE, I believe, so I don't keep overwriting the current data. I know my connection is working, but I get the following error.
My Error is: TypeError: not all arguments converted during string formatting
#!/usr/bin/env python
import requests
import psycopg2
conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')
req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018')
data = req.json()['data']
my_data = []
for item in data:
    season = item['seasonId']
    player = item['playerName']
    first_name = item['playerFirstName']
    last_Name = item['playerLastName']
    playerId = item['playerId']
    height = item['playerHeight']
    pos = item['playerPositionCode']
    handed = item['playerShootsCatches']
    city = item['playerBirthCity']
    country = item['playerBirthCountry']
    state = item['playerBirthStateProvince']
    dob = item['playerBirthDate']
    draft_year = item['playerDraftYear']
    draft_round = item['playerDraftRoundNo']
    draft_overall = item['playerDraftOverallPickNo']
    my_data.append([playerId, player, first_name, last_Name, height, pos, handed, city, country, state, dob, draft_year, draft_round, draft_overall, season])
cur = conn.cursor()
cur.execute("CREATE TABLE t_skaters (data json);")
cur.executemany("INSERT INTO t_skaters VALUES (%s)", (my_data,))
Sample of data:
[[8468493, 'Ron Hainsey', 'Ron', 'Hainsey', 75, 'D', 'L', 'Bolton', 'USA', 'CT', '1981-03-24', 2000, 1, 13, 20172018], [8471339, 'Ryan Callahan', 'Ryan', 'Callahan', 70, 'R', 'R', 'Rochester', 'USA', 'NY', '1985-03-21', 2004, 4, 127, 20172018]]
It seems like you want to create a table with one column named "data". The type of this column is JSON. (I would recommend creating one column per field, but it's up to you.)
In this case the variable data (that is read from the request) is a list of dicts. As I mentioned in my comment, you can loop over data and do the inserts one at a time as executemany() is not faster than multiple calls to execute().
What I did was the following:
1. Create a list of fields that you care about.
2. Loop over the elements of data.
3. For each item in data, extract the fields into my_data.
4. Call execute() and pass in json.dumps(my_data) (this converts my_data from a dict into a JSON string).
Try this:
#!/usr/bin/env python
import requests
import psycopg2
import json
conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')
req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018')
# data here is a list of dicts
data = req.json()['data']
cur = conn.cursor()
# create a table with one column of type JSON
cur.execute("CREATE TABLE t_skaters (data json);")
fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]
for item in data:
    my_data = {field: item[field] for field in fields}
    cur.execute("INSERT INTO t_skaters VALUES (%s)", (json.dumps(my_data),))
# commit changes
conn.commit()
# Close the connection
conn.close()
I am not 100% sure if all of the postgres syntax is correct here (I don't have access to a PG database to test), but I believe that this logic should work for what you are trying to do.
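If you want a quick sanity check on the inserts (before closing the connection), Postgres can extract individual keys from the stored JSON with the ->> operator; a small sketch against the t_skaters table above:
cur.execute("SELECT data->>'playerName' FROM t_skaters LIMIT 5")
print(cur.fetchall())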
Update For Separate Columns
You can modify your create statement to handle multiple columns, but it would require knowing the data type of each column. Here's some pseudocode you can follow:
# same boilerplate code from above
cur = conn.cursor()
# create a table with one column per field
cur.execute(
    """CREATE TABLE t_skaters (seasonId INTEGER, playerName VARCHAR, ...);"""
)
fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]
for item in data:
    my_data = [item[field] for field in fields]
    # need a placeholder (%s) for each variable
    # refer to postgres docs on INSERT statement on how to specify order
    cur.execute("INSERT INTO t_skaters VALUES (%s, %s, ...)", tuple(my_data))
# commit changes
conn.commit()
# Close the connection
conn.close()
Replace the ... with the appropriate values for your data.
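Rather than writing out fifteen %s placeholders by hand, you can also build the INSERT statement from the fields list itself (a sketch, assuming the table columns are declared in the same order as fields):
placeholders = ", ".join(["%s"] * len(fields))
insert_sql = "INSERT INTO t_skaters VALUES ({})".format(placeholders)
for item in data:
    cur.execute(insert_sql, tuple(item[field] for field in fields))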

MySQL python normalization of tuple fails in query

When I try to pass a tuple to the IN argument of a WHERE clause, it gets double-quoted and my query fails. For example, if I do this:
# Connect to DB
import MySQLdb
cnxn = MySQLdb.connect(connectString)
curs = cnxn.cursor()
# Setup query
accounts = ('Hyvaco','TLC')
truck_type = 'fullsize'
query_args = (truck_type, accounts)
sql ='SELECT * FROM archive.incoming WHERE LastCapacity=%s AND Account IN %s'
# Run query and print
curs.execute(sql, query_args)
print(curs._executed)
then I get zero rows back, and the query prints out as
SELECT * FROM archive.incoming WHERE LastCapacity='fullsize'
AND Account IN ("'Hyvaco'", "'TLC'")
Switching accounts from a tuple to a list does not affect the result. How should I be passing these arguments?
How about you create the accounts as a string and then do this:
accounts = "('Hyvaco','TLC')"
sql ='SELECT * FROM archive.incoming WHERE LastCapacity=%s AND Account IN '+ accounts
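Note that building the query by string concatenation is only safe here because the account names are hard-coded; if they ever come from user input, it invites SQL injection. A parameterized alternative is to generate one %s placeholder per element (a sketch against the same table):
placeholders = ", ".join(["%s"] * len(accounts))
sql = ("SELECT * FROM archive.incoming "
       "WHERE LastCapacity = %s AND Account IN ({})".format(placeholders))
curs.execute(sql, (truck_type,) + accounts)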

How to store python dictionary in to mysql DB through python

I am trying to store the following dictionary in a MySQL DB by converting the dictionary to a string and then inserting it, but I am getting the following error. How can this be solved, or is there another way to store a dictionary in a MySQL DB?
dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
d = str(dic)
# Sql query
sql = "INSERT INTO ep_soft(ip_address, soft_data) VALUES ('%s', '%s')" % ("192.xxx.xx.xx", d )
soft_data is a VARCHAR(500)
Error:
execution exception (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to
use near 'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0' at line 1")
Any suggestions or help please?
First of all, don't ever construct raw SQL queries like that. Never ever. This is what parametrized queries are for. You're asking for an SQL injection attack.
If you want to store arbitrary data, such as Python dictionaries, you should serialize that data. JSON would be a good choice for the format.
Overall your code should look like this:
import MySQLdb
import json

db = MySQLdb.connect(...)
cursor = db.cursor()

dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}

sql = "INSERT INTO ep_soft(ip_address, soft_data) VALUES (%s, %s)"
cursor.execute(sql, ("192.xxx.xx.xx", json.dumps(dic)))
db.commit()
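Reading the dictionary back out is the mirror image: fetch the stored string and deserialize it with json.loads() (assuming the same ep_soft table):
cursor.execute("SELECT soft_data FROM ep_soft WHERE ip_address = %s", ("192.xxx.xx.xx",))
row = cursor.fetchone()
restored = json.loads(row[0])  # back to the original Python dict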
Change your code as below:
dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
d = str(dic)
# Sql query
sql = """INSERT INTO ep_soft(ip_address, soft_data) VALUES (%r, %r)""" % ("192.xxx.xx.xx", d )
Try this:
dic = { 'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0'] } }
"INSERT INTO `db`.`table`(`ip_address`, `soft_data`) VALUES (`{}`, `{}`)".format("192.xxx.xx.xx", str(dic))
Change db and table to the values you need.
It is a good idea to sanitize your inputs, and .format is useful when you need to use the same variable multiple times within a query. (Not that you need to for this example.)
import json

dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
ip = '192.xxx.xx.xx'

with conn.cursor() as cur:
    cur.execute("INSERT INTO `ep_soft`(`ip_address`, `soft_data`) "
                "VALUES ({0}, '{1}')".format(cur.escape(ip), json.dumps(dic)))
conn.commit()
If you do not use cur.escape(variable), you will need to enclose the placeholder {} in quotes.
This answer has some pseudocode for the connection object, and the flavor of MySQL is MemSQL, but other than that it should be straightforward to follow.
import json
#... do something
a_big_dict = getAHugeDict() #build a huge python dict
conn = getMeAConnection(...)
serialized_dict = json.dumps(a_big_dict) #serialize dict to string
#Something like this to hold the serialization...
qry_create = """
CREATE TABLE TABLE_OF_BIG_DICTS (
ROWID BIGINT NOT NULL AUTO_INCREMENT,
SERIALIZED_DICT BLOB NOT NULL,
UPLOAD_DT TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
KEY (`ROWID`) USING CLUSTERED COLUMNSTORE
);
"""
conn.execute(qry_create)
#Something like this to hold em'
qry_insert = """
INSERT INTO TABLE_OF_BIG_DICTS (SERIALIZED_DICT)
SELECT '{SERIALIZED_DICT}' as SERIALIZED_DICT;
"""
#Send it to db
conn.execute(qry_insert.format(SERIALIZED_DICT=serialized_dict))
#grab the latest
qry_read = """
SELECT a.SERIALIZED_DICT
from TABLE_OF_BIG_DICTS a
JOIN
(
SELECT MAX(UPLOAD_DT) AS MAX_UPLOAD_DT
FROM TABLE_OF_BIG_DICTS
) b
ON a.UPLOAD_DT = b.MAX_UPLOAD_DT
LIMIT 1
"""
#something like this to read the latest dict...
df_dict = conn.sql_to_dataframe(qry_read)
dict_str = df_dict.iloc[df_dict.index.min()][0]
#dicts never die they just get rebuilt
dict_better = json.loads(dict_str)

Using dict_cursor in django

To get a cursor in Django I do:
from django.db import connection
cursor = connection.cursor()
How would I get a dict cursor in Django, the equivalent of:
import MySQLdb
connection = (establish connection)
dict_cursor = connection.cursor(MySQLdb.cursors.DictCursor)
Is there a way to do this in Django? When I tried cursor = connection.cursor(MySQLdb.cursors.DictCursor) I got an Exception Value: cursor() takes exactly 1 argument (2 given). Or do I need to connect directly with the python-mysql driver?
The Django docs suggest using dictfetchall:
def dictfetchall(cursor):
    "Returns all rows from a cursor as a dict"
    desc = cursor.description
    return [
        dict(zip([col[0] for col in desc], row))
        for row in cursor.fetchall()
    ]
Is there a performance difference between using this and creating a dict_cursor?
No, there is no such support for DictCursor in Django, but you can write a small function to do that for you. See the docs: Executing custom SQL directly:
def dictfetchall(cursor):
    "Returns all rows from a cursor as a dict"
    desc = cursor.description
    return [
        dict(zip([col[0] for col in desc], row))
        for row in cursor.fetchall()
    ]
>>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
>>> dictfetchall(cursor)
[{'parent_id': None, 'id': 54360982L}, {'parent_id': None, 'id': 54360880L}]
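If you only need a single row, the same zip trick works with fetchone() (a small variant; this helper is not in the Django docs themselves):
def dictfetchone(cursor):
    "Returns one row from a cursor as a dict, or None"
    row = cursor.fetchone()
    if row is None:
        return None
    return dict(zip([col[0] for col in cursor.description], row))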
This is easily done with Postgres at least; I'm sure MySQL has something similar (Django 1.11):
from django.db import connections
from psycopg2.extras import NamedTupleCursor

def scan_tables(app):
    conn = connections['default']
    conn.ensure_connection()
    with conn.connection.cursor(cursor_factory=NamedTupleCursor) as cursor:
        cursor.execute("SELECT table_name, column_name "
                       "FROM information_schema.columns AS c "
                       "WHERE table_name LIKE '{}_%'".format(app))
        columns = cursor.fetchall()
        for column in columns:
            print(column.table_name, column.column_name)

scan_tables('django')
Obviously, feel free to use DictCursor, RealDictCursor, LoggingCursor, etc.
The following code converts the result set into a dictionary.
from django.db import connections

cursor = connections['default'].cursor()
cursor.execute(sql)  # run the query first so cursor.description is populated
columns = (x.name for x in cursor.description)
result = cursor.fetchone()
result = dict(zip(columns, result))
If the result set has multiple rows, iterate over the cursor instead.
columns = [x.name for x in cursor.description]
for row in cursor:
    row = dict(zip(columns, row))
The main purpose of using RealDictCursor is to get the data as a list of dictionaries. The following is an apt solution, and it works without using the Django ORM:
def fun(request):
    from django.db import connections
    import json
    from psycopg2.extras import RealDictCursor

    con = connections['default']
    con.ensure_connection()
    cursor = con.connection.cursor(cursor_factory=RealDictCursor)
    cursor.execute("select * from Customer")
    columns = cursor.fetchall()
    columns = json.dumps(columns)
Output:
[{...},{...},{......}]
