Python Sqlite3 - how to work very very long WHERE IN() clause - python

[Using Python3.x]
The basic idea is that I have to run a first query to pull a long list of IDs (text, about a million of them) and then use those IDs in an IN() clause in the WHERE clause of another query. I'm using Python string formatting to build that query. It works well if the number of IDs is small (say 100k), but with about a million IDs it fails with this error: pyodbc.Error: ('08S01', '[08S01] [MySQL][ODBC 5.2(a) Driver][mysqld-5.5.31-MariaDB-log]MySQL server has gone away (2006) (SQLExecDirectW)').
I tried to read into it a bit and think it might have something to do with the default(?) limits set by SQLite. I am also wondering whether I'm approaching this the right way at all.
Here's my code:
Step 1: Getting the IDs
def get_device_ids(con_str, query, tb_name):
    local_con = lite.connect('temp.db')
    local_cur = local_con.cursor()
    local_cur.execute("DROP TABLE IF EXISTS {};".format(tb_name))
    local_cur.execute("CREATE TABLE {} (id TEXT PRIMARY KEY, \
        lang TEXT, first_date DATETIME);".format(tb_name))
    data = create_external_con(con_str, query)
    device_id_set = set()
    with local_con:
        for row in data:
            device_id_set.update([row[0]])
            local_cur.execute("INSERT INTO srv(id, lang, \
                first_date) VALUES (?,?,?);", (row))
        lid = local_cur.lastrowid
        print("Number of rows inserted into SRV: {}".format(lid))
    return device_id_set
Step 2: Generating the query with 'dynamic' IN() clause
def gen_queries(ids):
    ids_list = str(', '.join("'" + id_ +"'" for id_ in ids))
    query = """
    SELECT e.id,
           e.field2,
           e.field3
    FROM table e
    WHERE e.id IN ({})
    """.format(ids_list)
    return query
Step 3: Using that query in another INSERT query
This is where things go wrong
def get_data(con_str, query, tb_name):
    local_con = lite.connect('temp.db')
    local_cur = local_con.cursor()
    local_cur.execute("DROP TABLE IF EXISTS {};".format(tb_name))
    local_cur.execute("CREATE TABLE {} (id TEXT, field1 INTEGER, \
        field2 TEXT, field3 TEXT, field4 INTEGER, \
        PRIMARY KEY(id, field1));".format(tb_name))
    data = create_external_con(con_str, query) # <== THIS IS WHERE THAT QUERY IS INSERTED
    device_id_set = set()
    with local_con:
        for row in data:
            device_id_set.update(row[1])
            local_cur.execute("INSERT INTO table2(id, field1, field2, field3, \
                field4) VALUES (?,?,?,?,?);", (row))
        lid = local_cur.lastrowid
        print("Number of rows inserted into table2: {}".format(lid))
Any help is very much appreciated!
Edit
This is probably the right solution to my problem. However, when I try to use "SET SESSION max_allowed_packet=104857600" I get the error: SESSION variable 'max_allowed_packet' is read-only. Use SET GLOBAL to assign the value (1621). When I then change SESSION to GLOBAL, I get an access denied message.

Insert the IDs into a (temporary) table in the same database, and then use:
... WHERE e.ID IN (SELECT ID FROM TempTable)
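A minimal sketch of this idea, using an in-memory SQLite database as a stand-in for the real server. The table names `events` and `temp_ids`, the helper `select_by_ids`, and the sample rows are assumptions made up for illustration. The IDs are bound as parameters in batches, so no giant SQL string is ever sent:

```python
import sqlite3

def select_by_ids(con, ids):
    """Return rows of `events` whose id is in `ids`, going through a temp table."""
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS temp_ids")
    cur.execute("CREATE TEMP TABLE temp_ids (id TEXT PRIMARY KEY)")
    # Each ID is bound as a parameter; the SQL text stays short and constant.
    cur.executemany("INSERT OR IGNORE INTO temp_ids (id) VALUES (?)",
                    ((id_,) for id_ in ids))
    cur.execute("""
        SELECT e.id, e.field2, e.field3
        FROM events e
        WHERE e.id IN (SELECT id FROM temp_ids)
        ORDER BY e.id
    """)
    return cur.fetchall()

# Hypothetical sample data standing in for the real table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id TEXT, field2 TEXT, field3 TEXT)")
con.executemany("INSERT INTO events VALUES (?, ?, ?)",
                [("a", "x1", "y1"), ("b", "x2", "y2"), ("c", "x3", "y3")])

rows = select_by_ids(con, {"a", "c", "zzz"})
print(rows)
```

On a MySQL/MariaDB server the same pattern should apply with that driver's placeholder style, provided the account is allowed to create temporary tables.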

Related

Safely Inserting Strings Into a SQLite3 UNION Query Using Python

I'm aware that the best way to prevent sql injection is to write Python queries of this form (or similar):
query = 'SELECT %s %s from TABLE'
fields = ['ID', 'NAME']
cur.execute(query, fields)
The above will work for a single query, but what if we want to do a UNION of 2 SQL commands? I've set this up via sqlite3 for sake of repeatability, though technically I'm using pymysql. Looks as follows:
import sqlite3
conn = sqlite3.connect('dummy.db')
cur = conn.cursor()
query = 'CREATE TABLE DUMMY(ID int AUTO INCREMENT, VALUE varchar(255))'
query2 = 'CREATE TABLE DUMMY2(ID int AUTO INCREMENT, VALUE varchar(255))'
try:
    cur.execute(query)
    cur.execute(query2)
except:
    print('Already made table!')
tnames = ['DUMMY', 'DUMMY2']
sqlcmds = []
for i in range(0,2):
    query = 'SELECT %s FROM {}'.format(tnames[i])
    sqlcmds.append(query)
fields = ['VALUE', 'VALUE']
sqlcmd = ' UNION '.join(sqlcmds)
cur.execute(sqlcmd, fields)
When I run this, I get a sqlite Operational Error:
sqlite3.OperationalError: near "%": syntax error
I've validated the query prints as expected with this output:
INSERT INTO DUMMY VALUES(%s) UNION INSERT INTO DUMMY VALUES(%s)
All looks good there. What is the issue with the string substitutions here? I can confirm that running a query with direct string substitution works fine. I've tried it with both selects and inserts.
EDIT: I'm aware there are multiple ways to do this with executemany and a few others. I need to do this with UNION because this is a very, very simplified example of the operational code I'm using.
The code below executes several INSERTs at once:
import sqlite3
conn = sqlite3.connect('dummy.db')
cur = conn.cursor()
# INTEGER PRIMARY KEY AUTOINCREMENT is the SQLite spelling of an auto-increment key
query = 'CREATE TABLE DUMMY(ID INTEGER PRIMARY KEY AUTOINCREMENT, VALUE varchar(255))'
try:
    cur.execute(query)
except sqlite3.OperationalError:
    print('Already made table!')
valid_fields = [('ya dummy',), ('stupid test example',)]
# sqlite3 uses ? as its placeholder, not %s
cur.executemany('INSERT INTO DUMMY (VALUE) VALUES (?)', valid_fields)
conn.commit()
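For the UNION case itself, two things seem to be going on: the sqlite3 module expects `?` placeholders rather than `%s` (pymysql does use `%s`), and in either driver a placeholder can only stand for a value, never for a column or table name. A rough sketch of one way around that, assuming the identifiers can be checked against a fixed whitelist (`ALLOWED_TABLES`, `ALLOWED_COLUMNS`, `build_union` and the `LIKE` filter are all made up for this example):

```python
import sqlite3

# Hypothetical whitelists: only these identifiers may be spliced into the SQL.
ALLOWED_TABLES = {"DUMMY", "DUMMY2"}
ALLOWED_COLUMNS = {"ID", "VALUE"}

def build_union(selections):
    """Build 'SELECT col FROM tbl WHERE col LIKE ? UNION ...' from (table, column) pairs."""
    parts = []
    for table, column in selections:
        if table not in ALLOWED_TABLES or column not in ALLOWED_COLUMNS:
            raise ValueError("unexpected identifier: {}.{}".format(table, column))
        # Identifiers are formatted in; the value stays a bound ? parameter.
        parts.append("SELECT {0} FROM {1} WHERE {0} LIKE ?".format(column, table))
    return " UNION ".join(parts)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DUMMY (ID INTEGER PRIMARY KEY, VALUE TEXT)")
conn.execute("CREATE TABLE DUMMY2 (ID INTEGER PRIMARY KEY, VALUE TEXT)")
conn.execute("INSERT INTO DUMMY (VALUE) VALUES (?)", ("alpha",))
conn.execute("INSERT INTO DUMMY2 (VALUE) VALUES (?)", ("alpine",))
conn.execute("INSERT INTO DUMMY2 (VALUE) VALUES (?)", ("beta",))

sql = build_union([("DUMMY", "VALUE"), ("DUMMY2", "VALUE")])
# One bound value per ? in the combined statement.
rows = conn.execute(sql + " ORDER BY 1", ("al%", "al%")).fetchall()
print(rows)
```

With pymysql the generated SQL would use `%s` in place of `?`, but the identifier check would stay the same.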

use row as variable with python and sql

I am trying to update some values into a database. The user can give the row that should be changed. The input from the user, however is a string. When I try to parse this into the MySQL connector with python it gives an error because of the apostrophes. The code I have so far is:
import mysql.connector
conn = mysql.connector.connect(user=dbUser, password=dbPasswd, host=dbHost, database=dbName)
cursor = conn.cursor()
cursor.execute("""UPDATE Search SET %s = %s WHERE searchID = %s""", ('maxPrice', 300, 10,))
I get this error
mysql.connector.errors.ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''maxPrice' = 300 WHERE searchID = 10' at line 1
How do I get rid of the apostrophes? Because I think they are causing problems.
As noted, you can't prepare it using a field.
Perhaps the safest way is to allow only those fields that are expected, e.g.
#!/usr/bin/python
import os
import mysql.connector
conn = mysql.connector.connect(user=os.environ.get('USER'),
                               host='localhost',
                               database='sandbox',
                               unix_socket='/var/run/mysqld/mysqld.sock')
cur = conn.cursor(dictionary=True)
# Fetch the real column names of the table to use as a whitelist
query = """SELECT column_name
           FROM information_schema.columns
           WHERE table_schema = DATABASE()
           AND table_name = 'Search'
        """
cur.execute(query)
fields = [x['column_name'] for x in cur.fetchall()]
user_input = ['maxPrice', 300, 10]
# Only format the column name into the SQL if it is a known column
if user_input[0] in fields:
    cur.execute("""UPDATE Search SET {0} = {1} WHERE id = {1}""".format(user_input[0], '%s'),
                tuple(user_input[1:]))
    print(cur.statement)
Prints:
UPDATE Search SET maxPrice = 300 WHERE id = 10
Where:
mysql> show create table Search\G
*************************** 1. row ***************************
Search
CREATE TABLE `Search` (
`id` int(11) DEFAULT NULL,
`maxPrice` float DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
A column name is not a parameter. Put the column name maxPrice directly into your SQL.
cursor.execute("""UPDATE Search SET maxPrice = %s WHERE searchID = %s""", (300, 10))
If you want to use the same code with different column names, you would have to modify the string itself.
sql = "UPDATE Search SET {} = %s WHERE searchID = %s".format('maxPrice')
cursor.execute(sql, (300,10))
But bear in mind that this is not safe from injection the way parameters are, so make sure your column name is not a user-input string or anything like that.
You cannot do it like that. You need to place the column name in the string before you call cursor.execute. Column names cannot be used when transforming variables in cursor.execute.
Something like this would work:
sql = "UPDATE Search SET {} = %s WHERE searchID = %s".format('maxPrice')
cursor.execute(sql, (300, 10,))
You cannot dynamically bind object (e.g., column) names, only values. If that's the logic you're trying to achieve, you'd have to resort to string manipulation/formatting (with all the risks of SQL-injection attacks that come with it). E.g.:
sql = """UPDATE Search SET {} = %s WHERE searchID = %s""".format('maxPrice')
cursor.execute(sql, (300, 10,))
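If the column name does have to come from user input, one way to limit the injection risk mentioned above is to accept it only when it matches a fixed list. A small sketch, where `ALLOWED_COLUMNS` and `build_update_sql` are names invented for this example and the list contents are assumed:

```python
# Hypothetical whitelist of columns the user may update.
ALLOWED_COLUMNS = {"maxPrice", "minPrice"}

def build_update_sql(column):
    """Return the UPDATE statement for `column`, or raise if it is not whitelisted."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("column not allowed: {!r}".format(column))
    # The column name is formatted in; the values remain %s parameters.
    return "UPDATE Search SET `{}` = %s WHERE searchID = %s".format(column)

sql = build_update_sql("maxPrice")
print(sql)
# The statement would then be run as: cursor.execute(sql, (300, 10))
```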

Insert Data Into A Table Using The Same Foreign Key Value

I'm using SQL Server, Python, pypyodbc.
The tables I have are:
tbl_User: id, owner
tbl_UserPhone: id, number, user_id
user_id is the primary key of User and the foreign key of UserPhone.
I'm trying to insert 2 different phones to the same user_id using pypyodbc.
This is one of the things I tried that did not work:
cursor = connection.cursor()
SQLCommand = ("INSERT INTO tbl_UserPhones"
"(id,number,user_id)"
" VALUES (?,?,?)")
values = [userphone_index, user_phone,"((SELECT id from tbl_User where id = %d))" % user_id_index]
cursor.execute(SQLCommand, values)
cursor.commit()
Based on your comments, you have an identity column in tbl_UserPhones. Based on the column names I'm guessing it's the ID column.
The exception you get is very clear: you can't insert data into an identity column without explicitly setting IDENTITY_INSERT to ON before your insert statement. In general, messing around with identity columns is bad practice. It's better to let SQL Server use its built-in capabilities and handle the insert to the identity column automatically.
You need to change your insert statement to not include the id column:
Instead of
SQLCommand = ("INSERT INTO tbl_UserPhones"
"(id,number,user_id)"
" VALUES (?,?,?)")
values = [userphone_index, user_phone,"((SELECT id from tbl_User where id = %d))" % user_id_index]
try this:
SQLCommand = ("INSERT INTO tbl_UserPhones"
"(number,user_id)"
" VALUES (?,?)")
values = [user_phone,"((SELECT id from tbl_User where id = %d))" % user_id_index]
SQLCommand = ("INSERT INTO tbl_UserPhones"
"(id,number,user_id)"
" VALUES (?,?,?)")
user_sqlCommand = cursor.execute("(SELECT id FROM tbl_User WHERE id = %d)" % user_index).fetchone()[0]
values = [userphone_index, user_phone, user_sqlCommand]
This was the solution.
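The same lookup-then-insert idea can also be written with bound parameters instead of `%d` formatting. A minimal sketch using an in-memory SQLite database as a stand-in for SQL Server (pypyodbc also takes `?` placeholders); the helper `add_phones` and the sample data are assumptions for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl_User (id INTEGER PRIMARY KEY, owner TEXT)")
con.execute("""CREATE TABLE tbl_UserPhone (
    id INTEGER PRIMARY KEY,
    number TEXT,
    user_id INTEGER REFERENCES tbl_User(id))""")
con.execute("INSERT INTO tbl_User (id, owner) VALUES (?, ?)", (1, "alice"))

def add_phones(con, user_id, numbers):
    """Insert several phone numbers that all point at the same existing user."""
    row = con.execute("SELECT id FROM tbl_User WHERE id = ?", (user_id,)).fetchone()
    if row is None:
        raise LookupError("no such user: {}".format(user_id))
    # The id column is left out so the database assigns it automatically.
    con.executemany("INSERT INTO tbl_UserPhone (number, user_id) VALUES (?, ?)",
                    [(number, row[0]) for number in numbers])
    con.commit()

add_phones(con, 1, ["555-0100", "555-0101"])
phones = con.execute(
    "SELECT number, user_id FROM tbl_UserPhone ORDER BY id").fetchall()
print(phones)
```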

How to store python dictionary in to mysql DB through python

I am trying to store the following dictionary in a MySQL DB by converting the dictionary into a string and then inserting it, but I am getting the following error. How can this be solved, or is there any other way to store a dictionary in a MySQL DB?
dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
d = str(dic)
# Sql query
sql = "INSERT INTO ep_soft(ip_address, soft_data) VALUES ('%s', '%s')" % ("192.xxx.xx.xx", d )
soft_data is a VARCHAR(500)
Error:
execution exception (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to
use near 'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0' at line 1")
Any suggestions or help please?
First of all, don't ever construct raw SQL queries like that. Never ever. This is what parametrized queries are for. You're asking for an SQL injection attack.
If you want to store arbitrary data, as for example Python dictionaries, you should serialize that data. JSON would be good choice for the format.
Overall your code should look like this:
import MySQLdb
import json
db = MySQLdb.connect(...)
cursor = db.cursor()
dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
# Placeholders are unquoted; the driver quotes and escapes the values
sql = "INSERT INTO ep_soft(ip_address, soft_data) VALUES (%s, %s)"
cursor.execute(sql, ("192.xxx.xx.xx", json.dumps(dic)))
# commit() is a method of the connection, not the cursor
db.commit()
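To show the full round trip (serialize, store with bound parameters, read back, deserialize), here is a small sketch that uses an in-memory SQLite database as a stand-in for MySQL. The `?` placeholders and the example IP address are specific to this sketch; with MySQLdb the placeholders would be `%s`:

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ep_soft (ip_address TEXT, soft_data TEXT)")

dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
ip = "192.0.2.1"  # placeholder address, assumed for the example

# json.dumps produces a plain string; the driver handles quoting of the value.
con.execute("INSERT INTO ep_soft (ip_address, soft_data) VALUES (?, ?)",
            (ip, json.dumps(dic)))
con.commit()

stored = con.execute("SELECT soft_data FROM ep_soft WHERE ip_address = ?",
                     (ip,)).fetchone()[0]
restored = json.loads(stored)
print(restored == dic)
```

Since soft_data is a VARCHAR(500) in the question, it is probably worth checking `len(json.dumps(dic))` against that limit, or using a TEXT column for larger dictionaries.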
Change your code as below:
dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
d = str(dic)
# Sql query
sql = """INSERT INTO ep_soft(ip_address, soft_data) VALUES (%r, %r)""" % ("192.xxx.xx.xx", d )
Try this:
dic = { 'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0'] } }
"INSERT INTO `db`.`table`(`ip_address`, `soft_data`) VALUES (`{}`, `{}`)".format("192.xxx.xx.xx", str(dic))
Change db and table to the values you need.
It is a good idea to sanitize your inputs, and '.format' is useful when you need to use the same variable multiple times within a query. (Not that you need to for this example.)
dic = {'office': {'component_office': ['Word2010SP0', 'PowerPoint2010SP0']}}
ip = '192.xxx.xx.xx'
with conn.cursor() as cur:
    cur.execute("INSERT INTO `ep_soft`(`ip_address`, `soft_data`) VALUES ({0}, '{1}')".format(cur.escape(ip), json.dumps(dic)))
    conn.commit()
If you do not use cur.escape(variable), you will need to enclose the placeholder {} in quotes.
This answer has some pseudo code regarding the connection object and the flavor of mysql is memsql, but other than that it should be straightforward to follow.
import json
#... do something
a_big_dict = getAHugeDict() #build a huge python dict
conn = getMeAConnection(...)
serialized_dict = json.dumps(a_big_dict) #serialize dict to string
#Something like this to hold the serialization...
qry_create = """
CREATE TABLE TABLE_OF_BIG_DICTS (
ROWID BIGINT NOT NULL AUTO_INCREMENT,
SERIALIZED_DICT BLOB NOT NULL,
UPLOAD_DT TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
KEY (`ROWID`) USING CLUSTERED COLUMNSTORE
);
"""
conn.execute(qry_create)
#Something like this to hold em'
qry_insert = """
INSERT INTO TABLE_OF_BIG_DICTS (SERIALIZED_DICT)
SELECT '{SERIALIZED_DICT}' as SERIALIZED_DICT;
"""
#Send it to db
conn.execute(qry_insert.format(SERIALIZED_DICT=serialized_dict))
#grab the latest
qry_read = """
SELECT a.SERIALIZED_DICT
from TABLE_OF_BIG_DICTS a
JOIN
(
SELECT MAX(UPLOAD_DT) AS MAX_UPLOAD_DT
FROM TABLE_OF_BIG_DICTS
) b
ON a.UPLOAD_DT = b.MAX_UPLOAD_DT
LIMIT 1
"""
#something like this to read the latest dict...
df_dict = conn.sql_to_dataframe(qry_read)
dict_str = df_dict.iloc[df_dict.index.min()][0]
#dicts never die they just get rebuilt
dict_better = json.loads(dict_str)

Python - automating MySQL query: passing parameter

The code below works fine, but I'm looking to make the MySQL code more efficient.
The first case is a function that receives a parameter and returns the customerID from the MySQL db:
def clean_table(self,customerName):
    getCustomerIDMySQL="""SELECT customerID
    FROM customer
    WHERE customerName = %s;"""
    self.cursorMySQL.execute(getCustomerIDMySQL,(customerName))
    for getID_row in self.cursorMySQL:
        customerID=getID_row[0]
    return customerID
When we know beforehand that the result will be a single row, how can I get the same value without using a "for" statement?
For the second case, the function runs with the table name ('customer') hard-coded in it:
def clean_tableCustomer(self):
    cleanTableQuery = """TRUNCATE TABLE customer;"""
    self.cursorMySQL.execute(cleanTableQuery)
    setIndexQuery = """ALTER TABLE customer AUTO_INCREMENT = 1;"""
    self.cursorMySQL.execute(setIndexQuery)
How can I pass the table name as a parameter to the function instead? Here is how I tried to get this done:
def clean_table(self,tableName):
    cleanTableQuery = """TRUNCATE TABLE %s;"""
    self.cursorMySQL.execute(cleanTableQuery,(tableName))
    setIndexQuery = """ALTER TABLE %s AUTO_INCREMENT = 1;"""
    self.cursorMySQL.execute(setIndexQuery,(tableName))
But this time MySQL raised an error.
All comments and suggestions are highly appreciated.
For the first case (simple, but it raises a TypeError when there is no row, because fetchone() then returns None):
customerID = self.cursorMySQL.fetchone()[0]
More correct is to implement a new method for the cursor class:
def autofetch_value(self, sql, args=None):
    """ return a single value from a single row or None if there is no row
    """
    self.execute(sql, args)
    returned_val = None
    row = self.fetchone()
    if row is not None:
        returned_val = row[0]
    return returned_val
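The same logic also works as a plain function that takes any DB-API cursor, which avoids subclassing the cursor. A sketch, with an in-memory SQLite table standing in for the MySQL one (the function name `fetch_value` and the sample row are assumptions; SQLite uses `?` where MySQL drivers use `%s`):

```python
import sqlite3

def fetch_value(cursor, sql, args=()):
    """Run the query and return the first column of the first row, or None."""
    cursor.execute(sql, args)
    row = cursor.fetchone()
    return None if row is None else row[0]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (customerID INTEGER PRIMARY KEY, customerName TEXT)")
con.execute("INSERT INTO customer (customerName) VALUES (?)", ("Acme",))
cur = con.cursor()

query = "SELECT customerID FROM customer WHERE customerName = ?"
found = fetch_value(cur, query, ("Acme",))      # parameters go in a tuple
missing = fetch_value(cur, query, ("Nobody",))  # no matching row
print(found, missing)
```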
For the second case:
def clean_table(self,tableName):
    cleanTableQuery = """TRUNCATE TABLE %s;""" % (tableName,)
    self.cursorMySQL.execute(cleanTableQuery)
    setIndexQuery = """ALTER TABLE %s AUTO_INCREMENT = 1;""" % (tableName,)
    self.cursorMySQL.execute(setIndexQuery)
Make sure you sanitize the data, since the cursor won't.
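One possible way to do that sanitizing, sketched under the assumption that table names only ever consist of letters, digits and underscores (the pattern and the helper name `clean_table_statements` are made up for this example):

```python
import re

# Assumed naming rule: a letter or underscore, then letters, digits, underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def clean_table_statements(table_name):
    """Return the two statements for `table_name`, rejecting anything unusual."""
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError("invalid table name: {!r}".format(table_name))
    return [
        "TRUNCATE TABLE `{}`;".format(table_name),
        "ALTER TABLE `{}` AUTO_INCREMENT = 1;".format(table_name),
    ]

statements = clean_table_statements("customer")
print(statements)
# Each statement would then be passed to self.cursorMySQL.execute(...)
```

Checking the name against the list of tables that actually exist would be stricter still.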
Unfortunately, you cannot parametrize the name of a table (see this post). You will have to use Python string operations to do what you are attempting here.
Hope this helps, it took me a while to find out when I ran into this issue.
