What would be the suggested way to run something like the following in Python:
self.cursor.execute('SET FOREIGN_KEY_CHECKS=0; DROP TABLE IF EXISTS %s; SET FOREIGN_KEY_CHECKS=1' % (table_name,))
For example, should this be three separate self.cursor.execute(...) statements? Is there a specific method that should be used other than cursor.execute(...) to do something like this, or what is the suggested practice for doing this? Currently the code I have is as follows:
self.cursor.execute('SET FOREIGN_KEY_CHECKS=0;')
self.cursor.execute('DROP TABLE IF EXISTS %s;' % (table_name,))
self.cursor.execute('SET FOREIGN_KEY_CHECKS=1;')
self.cursor.execute('CREATE TABLE %s select * from mytable;' % (table_name,))
As you can see, everything is run separately, so I'm not sure if this is a good idea or not (or rather, what the best way to do the above is). Perhaps BEGIN...END?
I would create a stored procedure:
DROP PROCEDURE IF EXISTS CopyTable;
DELIMITER $$
CREATE PROCEDURE CopyTable(IN _mytable VARCHAR(64), IN _table_name VARCHAR(64))
BEGIN
    SET FOREIGN_KEY_CHECKS=0;
    SET @stmt = CONCAT('DROP TABLE IF EXISTS ', _table_name);
    PREPARE stmt1 FROM @stmt;
    EXECUTE stmt1;
    SET FOREIGN_KEY_CHECKS=1;
    SET @stmt = CONCAT('CREATE TABLE ', _table_name, ' AS SELECT * FROM ', _mytable);
    PREPARE stmt1 FROM @stmt;
    EXECUTE stmt1;
    DEALLOCATE PREPARE stmt1;
END$$
DELIMITER ;
and then just run:
args = ['mytable', 'table_name']
cursor.callproc('CopyTable', args)
keeping it simple and modular. Of course you should do some kind of error checking and you could even have the stored procedure return a code to indicate success or failure.
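For instance, a minimal sketch of that error checking on the calling side (the exception class assumes the mysql.connector driver; substitute your driver's error class):

import mysql.connector  # assumed driver for the error class below

try:
    cursor.callproc('CopyTable', ['mytable', 'table_name'])
    conn.commit()
except mysql.connector.Error as e:
    conn.rollback()  # undo any partial work from the procedure
    print('CopyTable failed:', e)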
In the documentation of MySQLCursor.execute(), they suggest using the multi=True parameter:
operation = 'SELECT 1; INSERT INTO t1 VALUES (); SELECT 2'
for result in cursor.execute(operation, multi=True):
    ...
You can find another example in the module's source code.
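For instance, each item yielded by the iterator describes one statement's result; a sketch along the lines of the documented pattern (with_rows tells you whether that statement returned rows):

operation = 'SELECT 1; INSERT INTO t1 VALUES (); SELECT 2'
for result in cursor.execute(operation, multi=True):
    if result.with_rows:
        print(result.statement, result.fetchall())   # the SELECTs
    else:
        print(result.statement, result.rowcount)     # the INSERT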
I would not rely on any multi=True parameter of the execute function, which is very driver dependent, nor would I attempt to split a string on the ; character, which might be embedded in a string literal. The most straightforward approach would be to create a function, execute_multiple, that takes a list of statements to be executed and a rollback_on_error parameter to determine what action should be performed if any of the statements results in an exception.
My experience with MySQLdb and PyMySQL has been that by default they start off with autocommit=0, in other words as if you are already in a transaction and an explicit commit is required. That assumption holds for the code below. If that is not the case, then you should either (1) explicitly set autocommit=0 after connecting, or (2) modify this code to start a transaction following the try statement.
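For example, to make that assumption explicit at connect time (a sketch assuming PyMySQL; the connection details are placeholders):

import pymysql

conn = pymysql.connect(host='localhost', user='me', password='secret',
                       database='mydb',
                       autocommit=False)  # changes are held until conn.commit()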
def execute_multiple(conn, statements, rollback_on_error=True):
    """
    Execute multiple SQL statements and return the cursor from the last executed statement.

    :param conn: The connection to the database
    :type conn: Database connection
    :param statements: The statements to be executed
    :type statements: A list of strings
    :param rollback_on_error: Flag to indicate action to be taken on an exception
    :type rollback_on_error: bool
    :returns: cursor from the last statement executed
    :rtype: cursor
    """
    try:
        cursor = conn.cursor()
        for statement in statements:
            cursor.execute(statement)
            if not rollback_on_error:
                conn.commit()  # commit on each statement
    except Exception:
        if rollback_on_error:
            conn.rollback()
        raise
    else:
        if rollback_on_error:
            conn.commit()  # commit only after all statements have completed successfully
        return cursor  # return the cursor in case there are results to be processed
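Applied to the statements from the question, for example (note this still interpolates table_name into the SQL, so it must come from a trusted source, not user input):

statements = [
    'SET FOREIGN_KEY_CHECKS=0',
    'DROP TABLE IF EXISTS {}'.format(table_name),
    'SET FOREIGN_KEY_CHECKS=1',
    'CREATE TABLE {} select * from mytable'.format(table_name),
]
cursor = execute_multiple(conn, statements)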
You can also have a version that handles prepared statements with their parameter lists:
def execute_multiple_prepared(conn, statements_and_values, rollback_on_error=True):
    """
    Execute multiple SQL statements and return the cursor from the last executed statement.

    :param conn: The connection to the database
    :type conn: Database connection
    :param statements_and_values: The statements and values to be executed
    :type statements_and_values: A list of lists. Each sublist consists of a string, the SQL prepared statement with %s placeholders, and a list or tuple of its parameters
    :param rollback_on_error: Flag to indicate action to be taken on an exception
    :type rollback_on_error: bool
    :returns: cursor from the last statement executed
    :rtype: cursor
    """
    try:
        cursor = conn.cursor()
        for statement, values in statements_and_values:
            cursor.execute(statement, values)
            if not rollback_on_error:
                conn.commit()  # commit on each statement
    except Exception:
        if rollback_on_error:
            conn.rollback()
        raise
    else:
        if rollback_on_error:
            conn.commit()  # commit only after all statements have completed successfully
        return cursor  # return the cursor in case there are results to be processed
For example:
cursor = execute_multiple_prepared(conn, [('select * from test_table where count = %s', (2000,))], False)
Although, admittedly, the above call only had one SQL prepared statement with parameters.
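A sketch with more than one prepared statement would look like this (the column name count is just carried over from the example above; adjust to your schema):

cursor = execute_multiple_prepared(conn, [
    ('INSERT INTO test_table (count) VALUES (%s)', (1000,)),
    ('UPDATE test_table SET count = %s WHERE count = %s', (3000, 2000)),
], rollback_on_error=True)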
I have gotten stuck on this type of problem multiple times in projects. After a lot of research, I found some points and suggestions.
The execute() method works well with one query at a time, because the method takes care of state during execution.
I know cursor.execute(operation, params=None, multi=True) takes multiple queries, but parameters do not work well in this case, and sometimes an internal error exception spoils all the results too. The code becomes massive and ambiguous, and even the docs mention this.
executemany(operation, seq_of_params) is not good practice to use every time, because an operation that produces one or more result sets constitutes undefined behavior, and the implementation is permitted (but not required) to raise an exception when it detects that a result set has been created by an invocation of the operation. [source: docs]
Suggestion 1:
Make a list of queries, like:
table_name = 'test'
queries = [
    'SET FOREIGN_KEY_CHECKS=0;',
    'DROP TABLE IF EXISTS {};'.format(table_name),
    'SET FOREIGN_KEY_CHECKS=1;',
    'CREATE TABLE {} select * from mytable;'.format(table_name),
]

for query in queries:
    result = self.cursor.execute(query)
    # Do something with result
Suggestion 2:
Use a dict mapping each query to its parameters. [You can also do this with executemany for repeated parameters in some special cases.]
queries = [
    {'DROP TABLE IF EXISTS %(table_name)s;': {'table_name': 'student'}},
    {'CREATE TABLE %(table_name)s select * from mytable;':
        {'table_name': 'teacher'}},
    {'SET FOREIGN_KEY_CHECKS=0;': ''}
]

for data in queries:
    for query, parameter in data.items():
        if parameter == '':
            result = self.cursor.execute(query)
            # Do something with result
        else:
            result = self.cursor.execute(query, parameter)
            # Do something with result
You can also split a script on the ; character (not recommended):
with connection.cursor() as cursor:
    for statement in script.split(';'):
        if len(statement) > 0:
            cursor.execute(statement + ';')
Note: I mostly use the list-of-queries approach, but in some complex places I use the dictionary approach.
Beauty is in the eye of the beholder, so the best way to do something is subjective unless you explicitly tell us how to measure it. There are three hypothetical options I can see:
Use the multi option of MySQLCursor (not ideal)
Keep the query in multiple rows
Keep the query in a single row
Optionally, you can also change the query around to avoid some unnecessary work.
Regarding the multi option, the MySQL documentation is quite clear on this:
If multi is set to True, execute() is able to execute multiple statements specified in the operation string. It returns an iterator that enables processing the result of each statement. However, using parameters does not work well in this case, and it is usually a good idea to execute each statement on its own.
Regarding options 2 and 3, it is purely a preference for how you would like to view your code. Recall that a connection object has autocommit=FALSE by default, so the cursor actually batches cursor.execute(...) calls into a single transaction. In other words, both versions below are equivalent.
self.cursor.execute('SET FOREIGN_KEY_CHECKS=0;')
self.cursor.execute('DROP TABLE IF EXISTS %s;' % (table_name,))
self.cursor.execute('SET FOREIGN_KEY_CHECKS=1;')
self.cursor.execute('CREATE TABLE %s select * from mytable;' % (table_name,))
vs
self.cursor.execute(
    'SET FOREIGN_KEY_CHECKS=0;'
    'DROP TABLE IF EXISTS %s;'
    'SET FOREIGN_KEY_CHECKS=1;'
    'CREATE TABLE %s select * from mytable;' % (table_name, table_name)
)
Python 3.6 introduced f-strings, which are super elegant, and you should use them if you can. :)
self.cursor.execute(
    'SET FOREIGN_KEY_CHECKS=0;'
    f'DROP TABLE IF EXISTS {table_name};'
    'SET FOREIGN_KEY_CHECKS=1;'
    f'CREATE TABLE {table_name} select * from mytable;'
)
Note that this no longer holds when you start to manipulate rows; in this case, it becomes query specific and you should profile if relevant. A related SO question is What is faster, one big query or many small queries?
Finally, it may be more elegant to use TRUNCATE instead of DROP TABLE unless you have specific reasons not to.
self.cursor.execute(
    f'CREATE TABLE IF NOT EXISTS {table_name} LIKE mytable;'
    'SET FOREIGN_KEY_CHECKS=0;'
    f'TRUNCATE TABLE {table_name};'
    'SET FOREIGN_KEY_CHECKS=1;'
    f'INSERT INTO {table_name} SELECT * FROM mytable;'
)
Look at the documentation for MySQLCursor.execute().
It claims that you can pass in a multi parameter that allows you to run multiple queries in one string.
If multi is set to True, execute() is able to execute multiple statements specified in the operation string.
multi is an optional second parameter to the execute() call:
operation = 'SELECT 1; INSERT INTO t1 VALUES (); SELECT 2'
for result in cursor.execute(operation, multi=True):
    ...
With import mysql.connector you can run the following command; just replace t1 and episodes with your own tables:
tablename = "t1"
for result in mycursor.execute("SET FOREIGN_KEY_CHECKS=0; DROP TABLE IF EXISTS {}; SET FOREIGN_KEY_CHECKS=1; CREATE TABLE {} select * from episodes;".format(tablename, tablename), multi=True):
    pass  # iterate so that every statement is actually executed
While this will run, you must be sure that the foreign key constraints that are in effect after re-enabling the checks will not cause problems.
If tablename is something that a user can enter, you should think about a whitelist of table names.
Prepared statements don't work with table and column names, so we have to use string replacement to get the correct table names into the right position, but this makes your code vulnerable to SQL injection.
The multi=True is necessary to run four commands in the connector; when I tested it, the debugger demanded it.
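A sketch of that whitelist idea (the allowed names and user_supplied_name are made up for illustration):

ALLOWED_TABLES = {'t1', 'episodes_copy'}

def checked_table_name(name):
    # Table names cannot be sent as prepared-statement parameters,
    # so refuse anything that is not explicitly whitelisted.
    if name not in ALLOWED_TABLES:
        raise ValueError('table name not allowed: %r' % name)
    return name

tablename = checked_table_name(user_supplied_name)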
executescript()
This is a convenience method for executing multiple SQL statements at once. It executes the SQL script it gets as a parameter.
Syntax:
sqlite3.Connection.executescript(script)
Example code:
import sqlite3

# Connection with the database 'library.db'
connection = sqlite3.connect("library.db")
cursor = connection.cursor()

# The whole SQL script is executed in one call
cursor.executescript("""
    CREATE TABLE people(
        firstname,
        lastname,
        age
    );

    CREATE TABLE book(
        title,
        author,
        published
    );

    INSERT INTO book(title, author, published)
    VALUES (
        'Dan Clarke''s GFG Detective Agency',
        'Sean Simpsons',
        1987
    );
""")
sql = """
SELECT COUNT(*) FROM book;"""
cursor.execute(sql)
# The output in fetched and returned
# as a List by fetchall()
result = cursor.fetchall()
print(result)
sql = """
SELECT * FROM book;"""
cursor.execute(sql)
result = cursor.fetchall()
print(result)
# Changes saved into database
connection.commit()
# Connection closed(broken)
# with DataBase
connection.close()
Output:
[(1,)]
[("Dan Clarke's GFG Detective Agency", 'Sean Simpsons', 1987)]
executemany()
It is often the case that a large amount of data has to be inserted into the database from data files (or, for a simpler case, from lists or arrays). It would be tedious to write a separate statement for each row, and a plain loop is not the best fit either; the example below shows why. The syntax and use of executemany() is explained below, along with how it can be used like a loop:
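The original example is not reproduced here, but the pattern looks roughly like this (a minimal sqlite3 sketch; the table and rows are made up):

import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE book(title, author, published)")

# One parameterized statement, many parameter tuples: the driver
# loops over the sequence for you instead of you calling execute()
# once per row.
books = [
    ('Book One', 'Author One', 1990),
    ('Book Two', 'Author Two', 1995),
    ('Book Three', 'Author Three', 2000),
]
cursor.executemany("INSERT INTO book VALUES (?, ?, ?)", books)
connection.commit()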
Source: GeeksForGeeks: SQL Using Python
Check out this source; it has lots of great stuff for you.
All the answers are completely valid, so I'll add my solution with static typing and a closing context manager.
from contextlib import closing
from typing import List
import mysql.connector
import logging
logger = logging.getLogger(__name__)
def execute(stmts: List[str]) -> None:
    logger.info("Starting daily execution")
    with closing(mysql.connector.connect()) as connection:
        try:
            with closing(connection.cursor()) as cursor:
                # multi=True returns an iterator of results; consume it so
                # that every statement is actually sent to the server
                for _ in cursor.execute(' ; '.join(stmts), multi=True):
                    pass
        except Exception:
            logger.exception("Rolling back changes")
            connection.rollback()
            raise
        else:
            connection.commit()
            logger.info("Finished successfully")
If I'm not mistaken, the connection or cursor might not be a context manager, depending on the version of the MySQL driver you have, so this is a Pythonic, safe solution.
I am using mysqldb to try to update a lot of records in a database.
cur.executemany("""UPDATE {} set {} =%s Where id = %s """.format(table, ' = %s, '.join(col)),updates.values.tolist())
I get the error message...
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near...
So I tried outputting the actual SQL update statement, since that error message wasn't helpful, using the following code:
cur.execute('set profiling = 1')
try:
    cur.executemany("""UPDATE {} set {} =%s Where id = %s """.format(table, ' = %s, '.join(col)), updates.values.tolist())
except Exception:
    cur.execute('show profiles')
    for row in cur:
        print(row)
That print statement seems to cut off the update statement at 300 characters. I can't find anything in the documentation about limits, so I am wondering: is it the print statement truncating the output, or is it mysqldb?
Is there a way I can generate the update statement with just python rather than mysqldb to see the full statement?
To see exactly what the cursor was executing, you can use the cursor.statement attribute, as shown here in the API. That may help with the debugging.
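A sketch of how that might look in the failing code from the question (cursor.statement is the mysql.connector spelling; if you are on MySQLdb, the last rendered query is kept in the private cursor._last_executed attribute instead):

try:
    cur.executemany(update_sql, update_params)  # as built in the question
except Exception:
    print(cur.statement)  # the last statement the cursor sent to the server
    raise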
I don't have experience with the MySQL adapter, but I work with the PostgreSQL adapter on a daily basis. At least in that context, it is recommended not to format your query string directly, but to let the second argument of the cursor.execute call do the substitution. This avoids problems with quoted strings and such. Here is an example; the second one is correct (at least for Postgres):
cur.execute("""UPDATE mytbl SET mycol = %s WHERE mycol2 = %s""".format(val, cond))
cur.execute("""UPDATE mytbl SET mycol = %(myval)s WHERE mycol2 = %(mycond)s""", {'myval': val, 'mycond': cond})
The first version can result in the query
UPDATE mytbl SET mycol = abc WHERE mycol2 = xyz
instead of
UPDATE mytbl SET mycol = 'abc' WHERE mycol2 = 'xyz'.
You would have needed to explicitly add those quotes if you did the value substitution in the query yourself, which becomes annoying and circumvents the type handling of the database adapter (keep in mind this was only a text example). See the API for a bit more information on this notation and on the cursor.executemany command.
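For completeness, the same named-placeholder style carries over to cursor.executemany, which takes one mapping per row (the rows here are made up):

rows = [
    {'myval': 'abc', 'mycond': 'xyz'},
    {'myval': 'def', 'mycond': 'uvw'},
]
cur.executemany(
    "UPDATE mytbl SET mycol = %(myval)s WHERE mycol2 = %(mycond)s",
    rows,
)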
This is my code so far. I'm attempting to print "No results found" if no results are returned by MySQL; however, I can't figure it out. Perhaps I'm using incorrect arguments. Could anyone provide me with an example? Much appreciated!
def movie_function(film):
    connection = mysql connection info
    cursor = connection.cursor()
    sql = "SELECT * FROM film_database WHERE film_name = '"+film+"' ORDER BY actor"
    cursor.execute(sql)
    rows = cursor.fetchall()
    for row in rows:
        print row[1]
When you execute a select statement, cursor.rowcount is set to the number of results retrieved. Also, there is no real need to call cursor.fetchall(); looping over the cursor directly is easier:
def movie_function(film):
    connection = mysql connection info
    cursor = connection.cursor()
    sql = "SELECT * FROM film_database WHERE film_name = %s ORDER BY actor"
    cursor.execute(sql, (film,))
    if not cursor.rowcount:
        print "No results found"
    else:
        for row in cursor:
            print row[1]
Note that I also switched your code to use SQL parameters; there is no need to use string interpolation here, leave that to the database adapter. The %s placeholder is replaced for you by a correctly quoted value taken from the second argument to cursor.execute(), a sequence of values (here a tuple of one element).
Using SQL parameters also lets a good database reuse the query plan for the select statement, and leaving the quoting up to the database adapter prevents SQL injection attacks.
You could use cursor.rowcount after your code to see how many rows were actually returned. See here for more.
I guess this should work:
def movie_function(film):
    connection = mysql connection info
    cursor = connection.cursor()
    sql = "SELECT * FROM film_database WHERE film_name = %s ORDER BY actor"
    cursor.execute(sql, [film])
    rows = cursor.fetchall()
    if not rows:
        print 'No results found'
        return
    for row in rows:
        print row[1]
Note that I changed the way the film parameter is passed to the query. I don't know exactly how it should be (this depends on which MySQL driver for Python you use), but the important thing to know is that you should not pass your parameters directly into the query string, for security reasons.
You can also use:
rows_affected = cursor.execute("SELECT ... ")
which directly gives you the number of returned rows.
I have the following code:
def executeQuery(conn, query):
    cur = conn.cursor()
    cur.execute(query)
    return cur

def trackTagsGenerator(chunkSize, baseCondition):
    """ Returns a dict of trackId:tag limited to chunkSize. """
    sql = """
        SELECT track_id, tag
        FROM tags
        WHERE {baseCondition}
    """.format(baseCondition=baseCondition)
    limit = chunkSize
    offset = 0
    while True:
        trackTags = {}
        # fetch the track ids with the corresponding tag
        limitPhrase = " LIMIT %d OFFSET %d" % (limit, offset)
        query = sql + limitPhrase
        offset += limit
        cur = executeQuery(smacConn, query)
        rows = cur.fetchall()
        if not rows:
            break
        for row in rows:
            trackTags[row['track_id']] = row['tag']
        yield trackTags
I want to use it like this:
for trackTags in list(trackTagsGenerator(DATA_CHUNK_SIZE, baseCondition)):
    print trackTags
    break
This code produces the following error without even fetching one chunk of track tags:
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method SSDictCursor.__del__ of <MySQLdb.cursors.SSDictCursor object at 0x10b067b90>> ignored
I suspect it's because I have the query-execute logic in the body of the loop in the generator function.
Is someone able to tell me how to fetch chunks of data using mysqldb in such way?
I'm pretty sure this is because it can run into situations where you've got two queries running simultaneously because of the yield. Depending on how you call the function (threads, async, etc.), I'm pretty sure your cursor might get clobbered too.
As well, you're opening yourself up to (sorry, but I can't sugar-coat this part) horrific SQL injection holes by inserting baseCondition using essentially a printf. Take a look at the DB-API's parameter substitution docs for help.
Yield isn't going to save you time or energy here at all; the full SQL command will always need to run before you get a single result. (Hence you're using LIMIT and OFFSET to make it more friendly, kudos.)
E.g., someone updates the table while you're yielding out some data; in this particular case, that's not the end of the world. In many others, it gets ugly.
If you're just goofing around and you want this to work 'right-now-dammit', it would probably work to modify executeQuery as such:
def executeQuery(conn, query):
    cur = conn.cursor()
    cur.execute(query)
    rows = cur.fetchall()
    cur.close()
    return rows
One thing that also kind of jumps out at me: you define trackTags = {}, but then you update tagTrackIds and yield trackTags, which will always be an empty dict.
My suggestion would be to not bother yourself with the headache of hand-writing SQL if you're just trying to get a hobby project working. Take a look at Elixir, which is built on top of SQLAlchemy.
Using an ORM (object-relational mapper) can be a much more friendly introduction to databases. Defining what your objects look like in Python, having it automatically generate your schema for you, and being able to add/modify/delete things in a Pythonic manner is really nifty.
If you really need to be async, check out the ultramysql Python module.
You use an SSDictCursor, something that maps to mysql_use_result() on the MySQL API side. This requires that you read out the complete result before you can issue a new command.
As this happens before you receive the first chunk of data, after all: are you sure that this doesn't happen in the context of a query issued before this part of the code is executed? The results of that last query might still be in the pipeline, and executing the next one (i.e., the first one in this context) might break things...
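In other words, drain each result completely before issuing the next command; a sketch, assuming MySQLdb:

import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(db='mydb', cursorclass=MySQLdb.cursors.SSDictCursor)
cur = conn.cursor()
cur.execute("SELECT track_id, tag FROM tags")
for row in cur:  # stream the rows off the server
    pass         # ...process row...
cur.close()      # result fully read; now a new command is safe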
I get
sqlite3.OperationalError: SQL logic error or missing database
when I run an application I've been working on. What follows is a narrowed-down but complete sample that exhibits the problem for me. This sample uses two tables; one to store users and one to record whether user information is up-to-date in an external directory system. (As you can imagine, the tables are a fair bit longer in my real application). The sample creates a bunch of random users, and then goes through a list of (random) users and adds them to the second table.
#!/usr/bin/env python
import sqlite3
import random

def random_username():
    # Returns one of 10 000 four-letter placeholders for a username
    seq = 'abcdefghij'
    return random.choice(seq) + random.choice(seq) + \
           random.choice(seq) + random.choice(seq)

connection = sqlite3.connect("test.sqlite")
connection.execute('''CREATE TABLE IF NOT EXISTS "users" (
    "entry_id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL ,
    "user_id" INTEGER NOT NULL ,
    "obfuscated_name" TEXT NOT NULL)''')
connection.execute('''CREATE TABLE IF NOT EXISTS "dir_x_user" (
    "user_id" INTEGER PRIMARY KEY NOT NULL)''')

# Create a bunch of random users
random.seed(0)  # get the same results every time
for i in xrange(1500):
    connection.execute('''INSERT INTO users
        (user_id, obfuscated_name) VALUES (?, ?)''',
        (i, random_username()))
connection.commit()

#random.seed()
for i in xrange(4000):
    username = random_username()
    result = connection.execute(
        'SELECT user_id FROM users WHERE obfuscated_name = ?',
        (username, ))
    row = result.fetchone()
    if row is not None:
        user_id = row[0]
        print " %4d %s" % (user_id, username)
        connection.execute(
            'INSERT OR IGNORE INTO dir_x_user (user_id) VALUES(?)',
            (user_id, ))
    else:
        print " ? %s" % username
    if i % 10 == 0:
        print "i = %s; committing" % i
        connection.commit()
connection.commit()
Of particular note is the line near the end that says,
if i % 10 == 0:
In the real application, I'm querying the data from a network resource, and want to commit the users every now and then. Changing that line changes when the error occurs; it seems that when I commit, there is a non-zero chance of the OperationalError. It seems to be somewhat related to the data I'm putting in the database, but I can't determine what the problem is.
Most of the time if I read all the data and then commit only once, an error does not occur. [Yes, there is an obvious work-around there, but a latent problem remains.]
Here is the end of a sample run on my computer:
? cgha
i = 530; committing
? gegh
? aabd
? efhe
? jhji
? hejd
? biei
? eiaa
? eiib
? bgbf
759 bedd
i = 540; committing
Traceback (most recent call last):
File "sqlitetest.py", line 46, in <module>
connection.commit()
sqlite3.OperationalError: SQL logic error or missing database
I'm using Mac OS X 10.5.8 with the built-in Python 2.5.1 and Sqlite3 3.4.0.
As the "lite" part of the name implies, sqlite3 is meant for light-weight database use, not massive scalable concurrency like some of the Big Boys. Seems to me that what's happening here is that sqlite hasn't finished writing the last change you requested when you make another request
So, some options I see for you are:
You could spend a lot of time learning about file locking, concurrency, and transactions in sqlite3
You could add some more error-proofing simply by having your app retry the action after the first failure, as suggested by some on this Reddit post, which includes tips such as "If the code has an effective mechanism for simply trying again, most of sqlite's concurrency problems go away" and "Passing isolation_level=None to connect seems to fix it" (see the sketch after this list)
You could switch to using a more scalable database like PostgreSQL
(For my money, #2 or #3 are the way to go.)
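A minimal sketch of option 2, the retry approach (the attempt count and delay are arbitrary):

import time
import sqlite3

def commit_with_retry(connection, attempts=5, delay=0.1):
    # Retry the commit a few times before giving up.
    for attempt in range(attempts):
        try:
            connection.commit()
            return
        except sqlite3.OperationalError:
            if attempt == attempts - 1:
                raise  # still failing after the final attempt
            time.sleep(delay)  # give sqlite time to finish its last write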