Suggested way to run multiple SQL statements in Python?

What would be the suggested way to run something like the following in python:
self.cursor.execute('SET FOREIGN_KEY_CHECKS=0; DROP TABLE IF EXISTS %s; SET FOREIGN_KEY_CHECKS=1' % (table_name,))
For example, should this be three separate self.cursor.execute(...) statements? Is there a specific method that should be used other than cursor.execute(...) to do something like this, or what is the suggested practice for doing this? Currently the code I have is as follows:
self.cursor.execute('SET FOREIGN_KEY_CHECKS=0;')
self.cursor.execute('DROP TABLE IF EXISTS %s;' % (table_name,))
self.cursor.execute('SET FOREIGN_KEY_CHECKS=1;')
self.cursor.execute('CREATE TABLE %s select * from mytable;' % (table_name,))
As you can see, everything is run separately...so I'm not sure if this is a good idea or not (or rather -- what the best way to do the above is). Perhaps BEGIN...END ?

I would create a stored procedure:
DROP PROCEDURE IF EXISTS CopyTable;
DELIMITER $$
CREATE PROCEDURE CopyTable(IN _mytable VARCHAR(64), IN _table_name VARCHAR(64))
BEGIN
    SET FOREIGN_KEY_CHECKS=0;
    SET @stmt = CONCAT('DROP TABLE IF EXISTS ', _table_name);
    PREPARE stmt1 FROM @stmt;
    EXECUTE stmt1;
    SET FOREIGN_KEY_CHECKS=1;
    SET @stmt = CONCAT('CREATE TABLE ', _table_name, ' AS SELECT * FROM ', _mytable);
    PREPARE stmt1 FROM @stmt;
    EXECUTE stmt1;
    DEALLOCATE PREPARE stmt1;
END$$
DELIMITER ;
and then just run:
args = ['mytable', 'table_name']
cursor.callproc('CopyTable', args)
keeping it simple and modular. Of course you should do some kind of error checking and you could even have the stored procedure return a code to indicate success or failure.
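For reference, a minimal sketch of the calling side with basic error checking (assuming mysql.connector; the connection arguments are placeholders):
import mysql.connector

conn = mysql.connector.connect(user='user', password='secret', database='mydb')
cursor = conn.cursor()
try:
    # pass the source and destination table names to the procedure
    cursor.callproc('CopyTable', ('mytable', 'table_name'))
    conn.commit()
except mysql.connector.Error as err:
    conn.rollback()
    print('CopyTable failed:', err)
finally:
    cursor.close()
    conn.close()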

The documentation for MySQLCursor.execute() suggests using the multi=True parameter:
operation = 'SELECT 1; INSERT INTO t1 VALUES (); SELECT 2'
for result in cursor.execute(operation, multi=True):
    ...
You can find another example in the module's source code.
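For instance, a slightly fuller sketch of the iteration pattern (still mysql.connector; each iteration exposes the result state of one statement):
operation = 'SELECT 1; INSERT INTO t1 VALUES (); SELECT 2'
for result in cursor.execute(operation, multi=True):
    if result.with_rows:
        # this statement produced a result set (the SELECTs)
        print(result.statement, result.fetchall())
    else:
        # this statement only reports affected rows (the INSERT)
        print(result.statement, result.rowcount)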

I would not rely on the multi=True parameter of the execute function, which is very driver dependent, nor would I attempt to split a string on the ; character, which might be embedded in a string literal. The most straightforward approach would be to create a function, execute_multiple, that takes a list of statements to be executed and a rollback_on_error parameter to determine what action to take if any of the statements raises an exception.
My experience with MySQLdb and PyMySQL has been that by default they start off with autocommit=0, in other words as if you are already in a transaction, so an explicit commit is required. In any case, that assumption holds for the code below. If it does not hold for your driver, you should either (1) explicitly set autocommit=0 after connecting, or (2) modify this code to start a transaction following the try statement.
def execute_multiple(conn, statements, rollback_on_error=True):
    """
    Execute multiple SQL statements and return the cursor from the last executed statement.

    :param conn: The connection to the database
    :type conn: Database connection
    :param statements: The statements to be executed
    :type statements: A list of strings
    :param rollback_on_error: Flag to indicate action to be taken on an exception
    :type rollback_on_error: bool
    :returns: cursor from the last statement executed
    :rtype: cursor
    """
    try:
        cursor = conn.cursor()
        for statement in statements:
            cursor.execute(statement)
            if not rollback_on_error:
                conn.commit()  # commit after each statement
    except Exception:
        if rollback_on_error:
            conn.rollback()
        raise
    else:
        if rollback_on_error:
            conn.commit()  # commit only after all statements have completed successfully
        return cursor  # return the cursor in case there are results to be processed
You can also have a version that handles prepared statements with their parameter lists:
def execute_multiple_prepared(conn, statements_and_values, rollback_on_error=True):
    """
    Execute multiple SQL statements and return the cursor from the last executed statement.

    :param conn: The connection to the database
    :type conn: Database connection
    :param statements_and_values: The statements and values to be executed
    :type statements_and_values: A list of lists. Each sublist consists of a string, the SQL prepared statement with %s placeholders, and a list or tuple of its parameters
    :param rollback_on_error: Flag to indicate action to be taken on an exception
    :type rollback_on_error: bool
    :returns: cursor from the last statement executed
    :rtype: cursor
    """
    try:
        cursor = conn.cursor()
        for statement, values in statements_and_values:
            cursor.execute(statement, values)
            if not rollback_on_error:
                conn.commit()  # commit after each statement
    except Exception:
        if rollback_on_error:
            conn.rollback()
        raise
    else:
        if rollback_on_error:
            conn.commit()  # commit only after all statements have completed successfully
        return cursor  # return the cursor in case there are results to be processed
For example:
cursor = execute_multiple_prepared(conn, [('select * from test_table where count = %s', (2000,))], False)
Although, admittedly, the above call only had one SQL prepared statement with parameters.
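For the statements from the question, the first helper could be called like this (table_name must come from a trusted source, since identifiers cannot be passed as parameters):
table_name = 'test'
statements = [
    'SET FOREIGN_KEY_CHECKS=0',
    'DROP TABLE IF EXISTS {}'.format(table_name),
    'SET FOREIGN_KEY_CHECKS=1',
    'CREATE TABLE {} SELECT * FROM mytable'.format(table_name),
]
cursor = execute_multiple(conn, statements, rollback_on_error=True)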

I have gotten stuck on this type of problem several times in projects. After a lot of research I found some points and suggestions:
execute() works well with one query at a time, because the method takes care of state during execution.
I know cursor.execute(operation, params=None, multi=True) takes multiple queries, but parameters do not work well in this case, and sometimes an internal error exception spoils all the results too. The code also becomes massive and ambiguous. Even the docs mention this.
executemany(operation, seq_of_params) is not good practice to use every time, because an operation that produces one or more result sets constitutes undefined behavior, and the implementation is permitted (but not required) to raise an exception when it detects that a result set has been created by an invocation of the operation. [source: docs]
Suggestion 1:
Make a list of queries, like this:
table_name = 'test'
queries = [
    'SET FOREIGN_KEY_CHECKS=0;',
    'DROP TABLE IF EXISTS {};'.format(table_name),
    'SET FOREIGN_KEY_CHECKS=1;',
    'CREATE TABLE {} select * from mytable;'.format(table_name),
]
for query in queries:
    result = self.cursor.execute(query)
    # Do operation with result
Suggestion 2:
Use a dict mapping each query to its parameters. (You can also use executemany for repeated parameter sets in some special cases.)
queries = [
    # Note: placeholders like %(table_name)s work only for values; drivers
    # generally do not accept them for identifiers such as table names.
    {'DROP TABLE IF EXISTS %(table_name)s;': {'table_name': 'student'}},
    {'CREATE TABLE %(table_name)s select * from mytable;': {'table_name': 'teacher'}},
    {'SET FOREIGN_KEY_CHECKS=0;': ''},
]
for data in queries:
    for query, parameter in data.items():
        if parameter == '':
            result = self.cursor.execute(query)
            # Do something with result
        else:
            result = self.cursor.execute(query, parameter)
            # Do something with result
You can also split a script on the ';' character. Not recommended:
with connection.cursor() as cursor:
    for statement in script.split(';'):
        if len(statement.strip()) > 0:
            cursor.execute(statement + ';')
Note: I mostly use the list-of-queries approach, but in some complex places I use the dictionary approach.

Beauty is in the eye of the beholder, so the best way to do something is subjective unless you explicitly tell us how to measure it. There are three hypothetical options I can see:
Use the multi option of MySQLCursor (not ideal)
Keep the query in multiple rows
Keep the query in a single row
Optionally, you can also change the query around to avoid some unnecessary work.
Regarding the multi option, the MySQL documentation is quite clear on this:
If multi is set to True, execute() is able to execute multiple statements specified in the operation string. It returns an iterator that enables processing the result of each statement. However, using parameters does not work well in this case, and it is usually a good idea to execute each statement on its own.
Regarding options 2 and 3, it is purely a preference for how you would like to view your code. Recall that a connection object has autocommit=FALSE by default, so the cursor actually batches cursor.execute(...) calls into a single transaction. In other words, both versions below are equivalent.
self.cursor.execute('SET FOREIGN_KEY_CHECKS=0;')
self.cursor.execute('DROP TABLE IF EXISTS %s;' % (table_name,))
self.cursor.execute('SET FOREIGN_KEY_CHECKS=1;')
self.cursor.execute('CREATE TABLE %s select * from mytable;' % (table_name,))
vs
self.cursor.execute(
    'SET FOREIGN_KEY_CHECKS=0;'
    'DROP TABLE IF EXISTS %s;'
    'SET FOREIGN_KEY_CHECKS=1;'
    'CREATE TABLE %s select * from mytable;' % (table_name, table_name)
)
Python 3.6 introduced f-strings, which are super elegant, and you should use them if you can. :)
self.cursor.execute(
    'SET FOREIGN_KEY_CHECKS=0;'
    f'DROP TABLE IF EXISTS {table_name};'
    'SET FOREIGN_KEY_CHECKS=1;'
    f'CREATE TABLE {table_name} select * from mytable;'
)
Note that this no longer holds when you start to manipulate rows; in this case, it becomes query specific and you should profile if relevant. A related SO question is What is faster, one big query or many small queries?
Finally, it may be more elegant to use TRUNCATE instead of DROP TABLE unless you have specific reasons not to.
self.cursor.execute(
    f'CREATE TABLE IF NOT EXISTS {table_name} LIKE mytable;'
    'SET FOREIGN_KEY_CHECKS=0;'
    f'TRUNCATE TABLE {table_name};'
    'SET FOREIGN_KEY_CHECKS=1;'
    f'INSERT INTO {table_name} SELECT * FROM mytable;'
)

Look at the documentation for MySQLCursor.execute().
It claims that you can pass in a multi parameter that allows you to run multiple queries in one string.
If multi is set to True, execute() is able to execute multiple statements specified in the operation string.
multi is an optional second parameter to the execute() call:
operation = 'SELECT 1; INSERT INTO t1 VALUES (); SELECT 2'
for result in cursor.execute(operation, multi=True):
    ...

With import mysql.connector,
you can run the following command; just replace t1 and episodes with your own tables:
tablename = "t1"
mycursor.execute(
    "SET FOREIGN_KEY_CHECKS=0;"
    " DROP TABLE IF EXISTS {};"
    " SET FOREIGN_KEY_CHECKS=1;"
    " CREATE TABLE {} select * from episodes;".format(tablename, tablename),
    multi=True,
)
While this will run, you must be sure that the foreign key constraints that are back in effect after re-enabling the checks will not cause problems.
If tablename is something that a user can enter, you should think about a whitelist of table names.
Prepared statements don't work with table and column names, so we have to use string replacement to get the correct table names into the right position, but this makes your code vulnerable to SQL injection.
The multi=True is necessary to run four commands in one connector call; when I tested it, the debugger demanded it.
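As a sketch of the whitelist idea mentioned above (the allowed names here are made up for illustration):
ALLOWED_TABLES = {'t1', 'episodes_backup'}  # hypothetical whitelist

def safe_table_name(name):
    # Identifiers cannot be sent as prepared-statement parameters,
    # so an explicit whitelist check is the guard against injection.
    if name not in ALLOWED_TABLES:
        raise ValueError('table name not allowed: {!r}'.format(name))
    return name

# e.g. tablename = safe_table_name(user_supplied_name)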

executescript()
This is a convenience method (from Python's built-in sqlite3 module) for executing multiple SQL statements at once. It executes the SQL script it gets as a parameter.
Syntax:
cursor.executescript(script)
Example code:
import sqlite3
# Connection with the DataBase
# 'library.db'
connection = sqlite3.connect("library.db")
cursor = connection.cursor()
# SQL piece of code executed
cursor.executescript("""
CREATE TABLE people(
firstname,
lastname,
age
);
CREATE TABLE book(
title,
author,
published
);
INSERT INTO
book(title, author, published)
VALUES (
'Dan Clarke''s GFG Detective Agency',
'Sean Simpsons',
1987
);
""")
sql = """
SELECT COUNT(*) FROM book;"""
cursor.execute(sql)
# The output in fetched and returned
# as a List by fetchall()
result = cursor.fetchall()
print(result)
sql = """
SELECT * FROM book;"""
cursor.execute(sql)
result = cursor.fetchall()
print(result)
# Changes saved into database
connection.commit()
# Connection closed(broken)
# with DataBase
connection.close()
Output:
[(1,)]
[("Dan Clarke's GFG Detective Agency", 'Sean Simpsons', 1987)]
executemany()
It is often the case that a large amount of data has to be inserted into the database from data files (or, in a simpler case, from lists or arrays). Writing a separate execute() call for each row would be tedious, and a plain loop is not ideal either. executemany() can instead be used like a loop: the statement is written once and executed for every parameter tuple in a sequence, as sketched below.
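A minimal sketch of such a loop-free insert, reusing the people table created above:
# One INSERT statement, executed once per tuple in the sequence
people = [
    ('Alice', 'Smith', 30),
    ('Bob', 'Jones', 25),
    ('Cara', 'Lee', 41),
]
cursor.executemany(
    "INSERT INTO people (firstname, lastname, age) VALUES (?, ?, ?)",
    people,
)
connection.commit()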
Source: GeeksForGeeks: SQL Using Python
Check out this source; it has lots of great material on the topic.

All the answers are completely valid, so I'll add my solution with static typing and a closing context manager.
from contextlib import closing
from typing import List
import logging

import mysql.connector

logger = logging.getLogger(__name__)

def execute(stmts: List[str]) -> None:
    logger.info("Starting daily execution")
    with closing(mysql.connector.connect()) as connection:
        try:
            with closing(connection.cursor()) as cursor:
                # multi=True returns an iterator of results; consume it,
                # otherwise later statements may never be executed.
                for _ in cursor.execute(' ; '.join(stmts), multi=True):
                    pass
        except Exception:
            logger.exception("Rolling back changes")
            connection.rollback()
            raise
        else:
            logger.info("Finished successfully")
If I'm not mistaken, connection or cursor might not be a context manager depending on the version of the MySQL driver you're using, so this is a pythonic, safe solution.

Related

Is there a valid way to use string formatting and still sanitize data in Python code for sqlite3?

import sqlite3

def delete_data(db_name, table, col, search_condition):
    with sqlite3.connect(db_name) as conn:
        with conn.cursor() as cur:
            code_piece = (f"FROM {table} WHERE {col}={search_condition}",)
            self.cur.execute("DELETE ?", code_piece)
Taking the above code, is the data from the function arguments sanitized, or is there still a possibility of an SQL injection attack?
Understanding QStyle Parameters
Here's a fix for a bunch of syntactical errors in your code example that prevent it from running:
def delete_data(db_name, table, col, search_condition):
    with sqlite3.connect(db_name) as conn:
        cur = conn.cursor()
        code_piece = (f"FROM {table} WHERE {col}={search_condition}",)
        cur.execute("DELETE ?", code_piece)
If you actually ran this function, it would throw an exception on the last line that reads something like the following:
sqlite3.OperationalError: near "?": syntax error
Why? As far as I know, you cannot use qstyle parameters for anything but what could slot in as a value in a valid SQL statement; you cannot use them to replace larger parts of a statement, and you can't replace table names either. The code closest to your intent that could run without raising an exception is the following:
def delete_data(db_name, col, search_condition):
    with sqlite3.connect(db_name) as conn:
        cur = conn.cursor()
        cur.execute("DELETE FROM TABLE_NAME WHERE ?=?;", (col, search_condition,))
However, imagine if your table had an actual column called PRICE, with integer values, and several entries had the value 5 for that column. The following statement would not delete any of them, because the value of col is not interpreted as the name of a column but slotted in as a string, so you end up comparing the string 'PRICE' with the integer 5 in the WHERE clause, which is never true:
delete_data("sqlite3.db", 'PRICE', 5) # DELETE FROM TABLE_NAME WHERE 'PRICE'=5;
So really, the only thing your function can end up being is the following, which is far from the generic code you were trying to write; however, it uses the qstyle parameters properly and should be secure from SQL injection:
def delete_data(db_name, search_condition):
    with sqlite3.connect(db_name) as conn:
        cur = conn.cursor()
        cur.execute("DELETE FROM TABLE_NAME WHERE PRICE=?;", (search_condition,))

delete_data("sqlite3.db", 5)  # DELETE FROM TABLE_NAME WHERE PRICE=5;
But honestly, this is great, because you really don't want functions that can end up issuing a bunch of unpredictable queries to your database. My general advice is to just wrap each query in a simple function and keep it all as simple as possible.
Your Original Question and SQL Injection
But let's imagine that your original code would actually run as you intended it to. There is nothing that prevents an attacker from abusing any of the parameters to alter the intended purpose of the statement: if user input affects the table parameter, it can be used to delete the content of any table; and the col and search_condition parameters could be altered to delete all entries of a table.
However, it all depends on whether or not an attacker has the ability to alter the values of the parameters through user input. It is unlikely that user input is used directly to select the table or the column to be compared against, but it is quite likely that user input is used as the value of the search_condition parameter. If so, the following function call would be possible:
delete_data(db_name, "USERS", "NAME", "Marc OR 1=1")
This would result in the following query to the database, resulting in the deletion of all entries of the USERS table.
DELETE FROM USERS WHERE NAME=Marc or 1=1;
So yeah, your code was still susceptible to SQL injection.

python pyodbc SQLite sql injections

I use pyodbc in my Python Flask project for the SQLite DB connection.
I know and understand SQL injection, but this is my first time dealing with it.
I tried to execute some SQL injections against my own code.
I have a function which concatenates the SQL String in my database.py file:
def open_issue(self, data_object):
    cursor = self.conn.cursor()
    # data_object is the issue i get from the user
    name = data_object["name"]
    text = data_object["text"]
    rating_sum = 0
    # if the user provides an issue
    if name:
        # check if issue is already in db
        test = cursor.execute(f'''SELECT name FROM issue WHERE name = "{name}"''')
        data = test.fetchall()
        # if not in db insert
        if len(data) == 0:
            # insert the issue
            cursor.executescript(f'''INSERT INTO issue (name, text, rating_sum)
                                     VALUES ("{name}", "{text}", {rating_sum})''')
        else:
            print("nothing inserted!")
In the api.py file the open_issue() function gets called:
@self.app.route('/open_issue')
def insertdata():
    # data sent from client
    # data_object = flask.request.json
    # unit test dictionary
    data_object = {"name": "injection-test-table",
                   "text": "'; CREATE TABLE 'injected_table-1337';--"}
    DB().open_issue(data_object)
The "'; CREATE TABLE 'injected_table-1337';--" sql injection has not created the injected_table-1337, instead it got inserted normally like a string into the text column of the injection-test-table.
So i don't really know if i am safe for the standard ways of SQL injection (this project will only be hosted locally but good security is always welcome)
And secondary: are there ways with pyodbc to check if a string contains sql syntax or symbols, so that nothing will get inserted in my example or do i need to check the strings manually?
Thanks a lot
As it turns out, with SQLite you are at much less risk of SQL injection issues because, by default, neither Python's built-in sqlite3 module nor the SQLite ODBC driver allows multiple statements to be executed in a single .execute call (commonly known as an "anonymous code block"). This code:
thing = "'; CREATE TABLE bobby (id int primary key); --"
sql = f"SELECT * FROM table1 WHERE txt='{thing}'"
crsr.execute(sql)
throws this for sqlite3
sqlite3.Warning: You can only execute one statement at a time.
and this for SQLite ODBC
pyodbc.Error: ('HY000', '[HY000] only one SQL statement allowed (-1) (SQLExecDirectW)')
Still, you should follow best practice and use a proper parameterized query:
thing = "'; CREATE TABLE bobby (id int primary key); --"
sql = "SELECT * FROM table1 WHERE txt=?"
crsr.execute(sql, (thing, ))
because this will also correctly handle parameter values that would cause errors if injected directly, e.g.,
thing = "it's good to avoid SQL injection"

Use of '.format()' vs. '%s' in cursor.execute() for a MySQL JSON field, with Python mysql.connector

My objective is to store a JSON object into a MySQL database field of type json, using the mysql.connector library.
import mysql.connector
import json
jsonData = json.dumps(origin_of_jsonData)
cnx = mysql.connector.connect(**config_defined_elsewhere)
cursor = cnx.cursor()
cursor.execute('CREATE DATABASE dataBase')
cnx.database = 'dataBase'
cursor = cnx.cursor()
cursor.execute('CREATE TABLE table (id_field INT NOT NULL, json_data_field JSON NOT NULL, PRIMARY KEY (id_field))')
Now, the code below WORKS just fine, the focus of my question is the use of '%s':
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES (%s, %s)"
values_to_insert = (1, jsonData)
cursor.execute(insert_statement, values_to_insert)
My problem with that: I am very strictly adhering to the use of '...{}'.format(aValue) (or f'...{aValue}') when combining variable aValue(s) into a string, thus avoiding the use of %s (whatever my reasons for that, let's not debate them here - but it is how I would like to keep it wherever possible, hence my question).
In any case, I am simply unable, whichever way I try, to create something that stores the jsonData into the mySql dataBase using something that resembles the above structure and uses '...{}'.format() (in whatever shape or form) instead of %s. For example, I have (among many iterations) tried
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES ({}, {})".format(1, jsonData)
cursor.execute(insert_statement)
but no matter how I turn and twist it, I keep getting the following error:
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[some_content_from_jsonData})]' at line 1
Now my question(s):
1) Is there a way to avoid the use of %s here that I am missing?
2) If not, why? What is it that makes this impossible? Is it the cursor.execute() function, or is it the fact that it is a JSON object, or is it something completely different? Shouldn't {}.format() be able to do everything that %s could do, and more?
First of all: NEVER DIRECTLY INSERT YOUR DATA INTO YOUR QUERY STRING!
Using %s in a MySQL query string is not the same as using it in a Python string.
In Python, you just format the string and 'hello %s!' % 'world' becomes 'hello world!'. In SQL, the %s signals parameter insertion. This sends your query and data to the server separately. You are also not bound to this syntax; the Python DB-API specification specifies more styles for this: DB-API parameter styles (PEP 249). This has several advantages over inserting your data directly into the query string:
Prevents SQL injection
Say you have a query to authenticate users by password. You would do that with the following query (of course you would normally salt and hash the password, but that is not the topic of this question):
SELECT 1 FROM users WHERE username='foo' AND password='bar'
The naive way to construct this query would be:
"SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(username, password)
However, what would happen if someone enters ' OR 1=1 as the password? The formatted query would then become
SELECT 1 FROM users WHERE username='foo' AND password='' OR 1=1
which will always return 1. When using parameter insertion:
execute('SELECT 1 FROM users WHERE username=%s AND password=%s', (username, password))
this will never happen, as the query will be interpreted by the server separately.
Performance
If you run the same query many times with different data, the performance difference between using a formatted query and parameter insertion can be significant. With parameter insertion, the server only has to compile the query once (as it is the same every time) and execute it with different data, but with string formatting, it will have to compile it over and over again.
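As an illustrative sketch (the table and data here are hypothetical), executemany() makes this explicit by sending one statement with a sequence of parameter tuples:
users = [('alice', 'hash1'), ('bob', 'hash2'), ('carol', 'hash3')]
# The statement text never changes; only the bound data does,
# so the server can compile the query once and reuse it.
cursor.executemany(
    'INSERT INTO users (username, password) VALUES (%s, %s)',
    users,
)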
In addition to what was said above, I would like to add some details that I did not immediately understand, and that other newbies like me may also find helpful:
1) "Parameter insertion" is meant only for values; it will not work for table names, column names, etc. For those, Python string substitution in the SQL syntax definition works fine.
2) The cursor.execute function requires a tuple of parameters to work (as specified here, albeit not immediately clearly, at least to me: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html)
EXAMPLE for both in one function:
def checkIfRecordExists(column, table, condition_name, condition_value):
    ...
    sqlSyntax = 'SELECT {} FROM {} WHERE {} = %s'.format(column, table, condition_name)
    cursor.execute(sqlSyntax, (condition_value,))
Note both the use of .format in the initial sql syntax definition and the use of (condition_value,) in the execute function.

Using the Python MySQLdb SSCursor with nested queries

The typical MySQLdb library query can use a lot of memory and perform poorly in Python, when a large result set is generated. For example:
cursor.execute("SELECT id, name FROM `table`")
for i in xrange(cursor.rowcount):
id, name = cursor.fetchone()
print id, name
There is an optional cursor class that will fetch just one row at a time, really speeding up the script and cutting its memory footprint a lot.
import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(user="user", passwd="password", db="dbname",
                       cursorclass=MySQLdb.cursors.SSCursor)
cur = conn.cursor()
cur.execute("SELECT id, name FROM users")
row = cur.fetchone()
while row is not None:
    doSomething()
    row = cur.fetchone()
cur.close()
conn.close()
But I can't find anything about using SSCursor with nested queries. If this is the definition of doSomething():
def doSomething():
    cur2 = conn.cursor()
    cur2.execute('select id, x, y from table2')
    rows = cur2.fetchall()
    for row in rows:
        doSomethingElse(row)
    cur2.close()
then the script throws the following error:
_mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now")
It sounds as if SSCursor is not compatible with nested queries. Is that true? If so, that's too bad, because the main loop seems to run too slowly with the standard cursor.
This problem is discussed a bit in the MySQLdb User's Guide, under the heading of the threadsafety attribute (emphasis mine):
The MySQL protocol can not handle multiple threads using the same
connection at once. Some earlier versions of MySQLdb utilized locking
to achieve a threadsafety of 2. While this is not terribly hard to
accomplish using the standard Cursor class (which uses
mysql_store_result()), it is complicated by SSCursor (which uses
mysql_use_result(); with the latter you must ensure all the rows have
been read before another query can be executed.
The documentation for the MySQL C API function mysql_use_result() gives more information about your error message:
When using mysql_use_result(), you must execute mysql_fetch_row()
until a NULL value is returned, otherwise, the unfetched rows are
returned as part of the result set for your next query. The C API
gives the error Commands out of sync; you can't run this command now
if you forget to do this!
In other words, you must completely fetch the result set from any unbuffered cursor (i.e., one that uses mysql_use_result() instead of mysql_store_result() - with MySQLdb, that means SSCursor and SSDictCursor) before you can execute another statement over the same connection.
In your situation, the most direct solution would be to open a second connection to use while iterating over the result set of the unbuffered query. (It wouldn't work to simply get a buffered cursor from the same connection; you'd still have to advance past the unbuffered result set before using the buffered cursor.)
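A minimal sketch of that two-connection approach (connection arguments elided; the table2 join column is made up for illustration):
import MySQLdb
import MySQLdb.cursors

conn_outer = MySQLdb.connect(cursorclass=MySQLdb.cursors.SSCursor)  # unbuffered
conn_inner = MySQLdb.connect()  # separate, buffered connection for the nested queries

outer_cur = conn_outer.cursor()
outer_cur.execute("SELECT id, name FROM users")
for user_id, name in outer_cur:
    inner_cur = conn_inner.cursor()
    # hypothetical nested query keyed on the outer row
    inner_cur.execute("SELECT id, x, y FROM table2 WHERE user_id = %s", (user_id,))
    for row in inner_cur.fetchall():
        doSomethingElse(row)
    inner_cur.close()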
If your workflow is something like "loop through a big result set, executing N little queries for each row," consider looking into MySQL's stored procedures as an alternative to nesting cursors from different connections. You can still use MySQLdb to call the procedure and get the results, though you'll definitely want to read the documentation of MySQLdb's callproc() method since it doesn't conform to Python's database API specs when retrieving procedure outputs.
A second alternative is to stick to buffered cursors, but split up your query into batches. That's what I ended up doing for a project last year where I needed to loop through a set of millions of rows, parse some of the data with an in-house module, and perform some INSERT and UPDATE queries after processing each row. The general idea looks something like this:
QUERY = r"SELECT id, name FROM `table` WHERE id BETWEEN %s and %s;"
BATCH_SIZE = 5000
i = 0
while True:
cursor.execute(QUERY, (i + 1, i + BATCH_SIZE))
result = cursor.fetchall()
# If there's no possibility of a gap as large as BATCH_SIZE in your table ids,
# you can test to break out of the loop like this (otherwise, adjust accordingly):
if not result:
break
for row in result:
doSomething()
i += BATCH_SIZE
One other thing I would note about your example code is that you can iterate directly over a cursor in MySQLdb instead of calling fetchone() explicitly over xrange(cursor.rowcount); see the sketch below. This is especially important when using an unbuffered cursor, because the rowcount attribute is undefined and will give a very unexpected result (see: Python MysqlDB using cursor.rowcount with SSDictCursor returning wrong count).
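The loop from the question then becomes simply (a sketch):
cur.execute("SELECT id, name FROM users")
for id, name in cur:
    # The cursor itself is iterable, so no rowcount bookkeeping is needed;
    # with SSCursor, rowcount is unreliable anyway.
    print(id, name)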

pyodbc, call stored procedure with table variable

I have to call an MS SQL Server stored procedure with a table variable parameter.
/* Declare a variable that references the type. */
DECLARE @TableVariable AS [AList];

/* Add data to the table variable. */
INSERT INTO @TableVariable (val) VALUES ('value-1');
INSERT INTO @TableVariable (val) VALUES ('value-2');

EXEC [dbo].[sp_MyProc]
    @param = @TableVariable
This works well in SQL Server Management Studio. I tried the following in Python using pyodbc:
cursor.execute("declare #TableVariable AS [AList]")
for a in mylist:
cursor.execute("INSERT INTO #TableVariable (val) VALUES (?)", a)
cursor.execute("{call dbo.sp_MyProc(#TableVariable)}")
This fails with the following error: error 42000: the table variable must be declared. The variable does not survive across the separate execute steps.
I also tried:
sql = "DECLARE #TableVariable AS [AList]; "
for a in mylist:
sql = sql + "INSERT INTO #TableVariable (val) VALUES ('{}'); ".format(a)
sql = sql + "EXEC [dbo].[sp_MyProc] #param = #TableVariable"
cursor.execute(sql)
This fails with the following error: No results. Previous SQL was not a query.
I had no more luck with
sql = sql + "{call dbo.sp_MyProc(@TableVariable)}"
Does somebody know how to handle this using pyodbc?
Now the root of your problem is that a SQL Server variable has the scope of the batch it was defined in. Each call to cursor.execute is a separate batch, even if they are in the same transaction.
There are a couple of ways you can work around this. The most direct is to rewrite your Python code so that it sends everything as a single batch. (I tested this on my test server and it should work as long as you either add set nocount on or else step over the intermediate results with nextset.)
A more indirect way is to rewrite the procedure to look for a temp table instead of a table variable and then just create and populate the temp table instead of a table variable. A temp table that is not created inside a stored procedure has a scope of the session it was created in.
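A rough sketch of that temp-table variant (the rewritten procedure name is hypothetical; it would select from #TempList itself instead of taking a parameter):
# Temp tables live for the whole session, so they survive across
# separate cursor.execute() batches, unlike table variables.
cursor.execute("CREATE TABLE #TempList (val VARCHAR(50))")
cursor.executemany("INSERT INTO #TempList (val) VALUES (?)",
                   [('value-1',), ('value-2',)])
cursor.execute("{call dbo.sp_MyProc_FromTemp}")  # hypothetical proc reading #TempList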
I believe this error has nothing to do with SQL forgetting the table variable. I've experienced this recently, and the problem was that pyodbc doesn't know how to get a result set back from the stored procedure if the SP also returns counts for the things affected.
In my case the fix was to simply put SET NOCOUNT ON at the start of the SP.
I hope this helps.
I am not sure if this works and I can't test it because I don't have MS SQL Server, but have you tried executing everything in a single statement:
cursor.execute("""
DECLARE #TableVariable AS [AList];
INSERT INTO #TableVariable (val) VALUES ('value-1');
INSERT INTO #TableVariable (val) VALUES ('value-2');
EXEC [dbo].[sp_MyProc] #param = #TableVariable;
""");
I had this same problem, but none of the answers here fixed it. I was unable to get "SET NOCOUNT ON" to work, and I was also unable to get a single batch operation working with a table variable. What did work was to use a temporary table in two batches, but it took all day to find the right syntax. The code which follows creates and populates a temporary table in the first batch; then, in the second, it executes a stored proc using the database name followed by two dots before the stored proc name. This syntax is important for avoiding the error "Could not find stored procedure 'x'. (2812) (SQLExecDirectW)".
import pyodbc

# msconn is assumed to be an existing pyodbc connection to the MS SQL server.

def create_incidents(db_config, create_table, columns, tuples_list, upg_date):
    """Executes trackerdb-dev mssql stored proc.

    Args:
        db_config (dict): config .ini file with mssqldb conn.
        create_table (string): temporary table definition to be inserted into 'CREATE TABLE #TempTable ()'
        columns (tuple): columns of the table into which values will be inserted.
        tuples_list (list): list of tuples where each describes a row of data to insert into the table.
        upg_date (string): date on which the items in the list will be upgraded.

    Returns:
        None
    """
    sql_create = """IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
        DROP TABLE #TempTable;
    CREATE TABLE #TempTable ({});
    INSERT INTO #TempTable ({}) VALUES {};
    """
    columns = '"{}"'.format('", "'.join(item for item in columns))
    # this "params" variable is an egregious offense against security
    # professionals everywhere. Replace it with parameterized queries asap.
    params = ', '.join([str(tupl) for tupl in tuples_list])
    sql_create = sql_create.format(create_table, columns, params)
    msconn.autocommit = True
    cur = msconn.cursor()
    try:
        cur.execute(sql_create)
        cur.execute("DatabaseName..TempTable_StoredProcedure ?", upg_date)
    except pyodbc.DatabaseError as err:
        print(err)
    else:
        cur.close()
    return
create_table = """
int_column int
, name varchar(255)
, datacenter varchar(25)
"""
create_incidents(
db_config = db_config
, create_table = create_table
, columns = ('int_column', 'name', 'datacenter')
, cloud_list = tuples_list
, upg_date = '2017-09-08')
The stored proc uses IF OBJECT_ID('tempdb..#TempTable') IS NULL syntax to validate that the temporary table has been created. If it has, the procedure selects data from it and continues. If the temporary table has not been created, the proc aborts. This forces the stored proc to use a copy of #TempTable created outside the stored procedure itself but in the same session. The pyodbc session lasts until the cursor or connection is closed, and a temporary table created by pyodbc has the scope of the entire session.
IF OBJECT_ID('tempdb..#TempTable') IS NULL
BEGIN
    -- #TempTable gets created here only because SQL Server Management Studio throws errors if it isn't.
    CREATE TABLE #TempTable (
        int_column int
        , name varchar(255)
        , datacenter varchar(25)
    );
    -- This error is thrown so that the stored procedure requires a temporary table created *outside* the stored proc
    THROW 50000, '#TempTable table not found in tempdb', 1;
END
ELSE
BEGIN
    -- the stored procedure has now validated that the temporary table being used is coming from outside the stored procedure
    SELECT * FROM #TempTable;
END;
Finally, note that "tempdb" is not a placeholder, like I thought when I first saw it. "tempdb" is an actual MS SQL Server database system object.
Set connection.autocommit = True and use cursor.execute() only once instead of multiple times. The SQL string that you pass to cursor.execute() must contain all 3 steps:
Declaring the table variable
Filling the table variable with data
Executing the stored procedure that uses that table variable as an input
You don't need semicolons between the 3 steps.
Here's a fully functional demo. I didn't bother with parameter passing since it's irrelevant, but it also works fine with this, for the record.
SQL Setup (execute ahead of time)
CREATE TYPE dbo.type_MyTableType AS TABLE(
    a INT,
    b INT,
    c INT
)
GO

CREATE PROCEDURE dbo.CopyTable
    @MyTable type_MyTableType READONLY
AS
BEGIN
    SET NOCOUNT ON;
    SELECT * INTO MyResultTable FROM @MyTable
END
Python
import pyodbc

CONN_STRING = (
    'Driver={SQL Server Native Client 11.0};'
    'Server=...;Database=...;UID=...;PWD=...'
)

class DatabaseConnection(object):
    def __init__(self, connection_string):
        self.conn = pyodbc.connect(connection_string)
        self.conn.autocommit = True
        self.cursor = self.conn.cursor()

    def __enter__(self):
        return self.cursor

    def __exit__(self, *args):
        self.cursor.close()
        self.conn.close()

sql = (
    'DECLARE @MyTable type_MyTableType'
    '\nINSERT INTO @MyTable VALUES'
    '\n(11, 12, 13),'
    '\n(21, 22, 23)'
    '\nEXEC CopyTable @MyTable'
)

with DatabaseConnection(CONN_STRING) as cursor:
    cursor.execute(sql)
If you want to spread the SQL across multiple calls to cursor.execute(), then you need to use a temporary table instead. Note that in that case, you still need connection.autocommit = True.
As Timothy pointed out, the catch is to use nextset().
What I have found is that when you execute() a multi-statement batch, pyodbc checks the first statement for errors and executes only that statement, not the entire batch, unless you explicitly advance with nextset().
Say your query is:
cursor.execute('select 1 '
               'select 1/0')
print(cursor.fetchall())
your result is:
[(1, )]
but as soon as you instruct it to move further through the batch, to the erroneous part, via the command:
cursor.nextset()
there you have it:
pyodbc.DataError: ('22012', '[22012] [Microsoft][ODBC SQL Server Driver][SQL Server]Divide by zero error encountered. (8134) (SQLMoreResults)')
This solves the issue I encountered when working with table variables in a multi-statement batch.
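A defensive pattern, then, is to step through every result set in the batch so that no statement is silently skipped (a sketch; the batch is the same one as above):
import pyodbc

cursor.execute('select 1 '
               'select 1/0')
while True:
    try:
        print(cursor.fetchall())   # consume rows if this statement produced any
    except pyodbc.ProgrammingError:
        pass                       # this statement produced no result set
    if not cursor.nextset():       # advance; errors from later statements surface here
        break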
