Preferred method of adding data to MySQL database - python

My set-up:
1. MySQL server.
2. Host running a Python script.
(1) and (2) are different machines on the network.
The Python script generates data which must be stored in a MySQL database.
I use this (example) code to achieve that:
import MySQLdb as mdb

def sqldata(date, result):
    con = mdb.connect('sql.lan', 'demouser', 'demo', 'demo')
    with con:
        cur = con.cursor()
        cur.execute('INSERT INTO tabel(titel, nummer) VALUES (%s, %s)', (date, result))
The script generates one data point approximately every minute. So this means that a new connection is opened and closed every minute. I'm wondering if it would be a better idea to open the connection at the start of the script and only close it when the script terminates, effectively leaving the connection open indefinitely.
This then obviously begs the question how to handle/recover when the SQL-server "leaves" the network (e.g. due to a reboot) for a while.
While typing my question, this question appeared in the "Similar Questions" section. It is, however, from 2008 and possibly outdated, and the four answers it received seem to contradict each other.
What are the current insights in this matter?

Well, the referenced answer is right on point, but maybe it doesn't answer all of your questions. I can't provide a full running Python script for you here, but let me explain how I would go about it:
Rule 1: Most MySQL functions return values that you should always check, so that you can react to unwanted behavior.
Rule 2: Open a connection at the beginning of your script and use this one and only connection throughout your script.
Obviously you could check whether there is an existing connection in your sqldata function, and if not, open a new one and assign it to the global con object.
if not con:
    con = mdb.connect('sql.lan', 'demouser', 'demo', 'demo')
And if there is a connection already, you could check its "up status" by performing a simple query with a fixed, expected result to see whether the SQL server is running.
if con:
    cur = con.cursor()
    # MySQLdb's execute() returns the number of rows in the result set
    returned = cur.execute('SELECT COUNT(*) FROM tabel')
    if returned:
        ....
Basically you could avoid this, because if you don't get a cursor back, and you check that before using it, then you already know whether the server is alive or not.
So CHECK, CHECK and CHECK. You should check everything you get back from a function to have good error handling. Using a connection or a cursor without checking it first can leave you talking to a None object and crash your script.
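To tie these rules together, here is a minimal sketch (not a definitive implementation) of how the sqldata function from the question could reuse one global connection and recover when the server goes away; the reconnect-on-next-call behavior and the exception handled are assumptions:
import MySQLdb as mdb

con = None  # one connection for the whole script

def get_connection():
    # open (or reopen) the global connection only when needed
    global con
    if con is None:
        con = mdb.connect('sql.lan', 'demouser', 'demo', 'demo')
    return con

def sqldata(date, result):
    global con
    try:
        cur = get_connection().cursor()
        cur.execute('INSERT INTO tabel(titel, nummer) VALUES (%s, %s)', (date, result))
        con.commit()
    except mdb.OperationalError:
        # the server left the network (reboot, outage): discard the connection
        # so the next call reconnects, and let the caller decide what to do
        con = None
        raise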
And the last BIG HINT I can give you is to use multiple-row inserts. You can insert hundreds of rows with one call if you just add the values, comma-separated, to your insert string:
# consider result being filled like this (values already escaped/trusted)
result = '("First Song",1),("Second Song",2),("Third Song",3)'
# then this will insert 3 rows with one call
returned = cur.execute('INSERT INTO tabel (titel, nummer) VALUES ' + result)
# since literally it will execute
# INSERT INTO tabel (titel, nummer) VALUES ("First Song",1),("Second Song",2),("Third Song",3)
# and now you can check returned for any error
if returned:
    ....
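If the values come from anywhere untrusted, building the VALUES string by hand is risky. A safer sketch of the same multi-row insert uses executemany(), which lets the driver handle the escaping (table and column names taken from the example above):
rows = [("First Song", 1), ("Second Song", 2), ("Third Song", 3)]
# the driver batches these rows into a single multi-row INSERT where it can
cur.executemany('INSERT INTO tabel (titel, nummer) VALUES (%s, %s)', rows)
con.commit()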

Related

How to choose a column as user input and pass it into database statement? [duplicate]

I'm trying to link my code to a MySQL database using pymysql. In general everything has gone smoothly, but I'm having difficulty with the following function to find the minimum of a variable column.
def findmin(column):
    cur = db.cursor()
    sql = "SELECT MIN(%s) FROM table"
    cur.execute(sql, column)
    mintup = cur.fetchone()
If everything went smoothly, this would return a tuple with the minimum, e.g. (1,).
However, if I run the function:
findmin(column_name)
I have to put the column name in quotes (i.e. "column_name"), or else Python sees it as an unknown variable. But if I put quotation marks around column_name, then SQL sees
SELECT MIN("column_name") FROM table
which just returns the column header, not the value.
How can I get around this?
The issue is likely the use of %s for the column name. That means the SQL driver will try to escape that variable when interpolating it, including quoting, which is not what you want for things like column names, table names, etc.
When passing a value used in SELECT, WHERE, etc., you do want to use %s to prevent SQL injection and enable quoting, among other things.
Here, you just want to interpolate using pure Python (assuming a trusted value; please see below for more information). That also means no bindings tuple is passed to the execute method.
def findmin(column):
    cur = db.cursor()
    sql = "SELECT MIN({0}) FROM table".format(column)
    cur.execute(sql)
    mintup = cur.fetchone()
SQL fiddle showing the SQL working:
http://sqlfiddle.com/#!2/e70a41/1
In response to the Jul 15, 2014 comment from Colin Phipps (September 2022):
The relatively recent edit on this post by another community member brought it to my attention, and I wanted to respond to Colin's comment from many years ago.
I totally agree re: being careful about one's input if one interpolates like this. Certainly one needs to know exactly what is being interpolated. In this case, I would say a defined value within a trusted internal script or one supplied by a trusted internal source would be fine. But if, as Colin mentioned, there is any external input, then that is much different and additional precautions should be taken.
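As an illustration of such a precaution, a minimal sketch could validate the column name against a whitelist before interpolating it; the allowed column names below are hypothetical:
ALLOWED_COLUMNS = {"titel", "nummer"}  # hypothetical set of known-good column names

def findmin_safe(column):
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name: %r" % column)
    cur = db.cursor()
    cur.execute("SELECT MIN({0}) FROM table".format(column))
    return cur.fetchone()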

sqlite commit not saving changes with update using python?

Description
I have a database that I already built in Python 3 using sqlite3. Up till this point, I have not had any issues with commit saving changes (with insert and delete commands). However, I am now trying to use an update command, and I have not been able to save the changes (it only changes the DB in working memory, despite calling commit()).
The goal of this code snippet is to replace the null values in the database with an empty string as I have another function that cannot handle null data. I found a solution to do that here: Find null values from table and replace it with space - sqlite.
Details
Here is the current code that I am trying to execute:
self.cursor.execute(f'UPDATE {tbl_name} SET {col_name} = IFNULL({col_name}, "")')
self.conn.commit()
This code basically goes through the entire database one column at a time and replaces the null values.
Note that self is defined as follows:
Database.conn = sqlite3.connect(self.location + self.name)
Database.cursor = sqlite3.connect(self.location + self.name).cursor()
As stated earlier, this appears to operate correctly; however, it will not commit the changes to the actual database. This is verified both with DB Browser for SQLite and by pulling the data again after closing and re-executing.
I will also note that if I close this program and reinitialize it to run again, it errors out because the DB is still locked, despite the last lines of my code being:
Database.conn.commit() # Save (commit) the changes
Database.conn.close() # Close database
Conclusion
Thanks in advance as I have been beating my head against the wall with this one and have yet to find a problem like this elsewhere!
Your database connection has nothing to do with your cursor.
You do
Database.conn = sqlite3.connect(self.location + self.name)
Database.cursor = sqlite3.connect(self.location + self.name).cursor()
So, because the cursor was created from a separate connection, a subsequent Database.conn.commit() won't commit any changes you made with that cursor.
Create your cursor like this instead, so that the cursor actually belongs to the connection you commit on:
Database.conn = sqlite3.connect(self.location + self.name)
Database.cursor = Database.conn.cursor()
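A minimal end-to-end sketch of the corrected pattern, using an in-memory database and hypothetical table and column names, might look like this:
import sqlite3

conn = sqlite3.connect(":memory:")   # one connection...
cursor = conn.cursor()               # ...and a cursor derived from that same connection

cursor.execute("CREATE TABLE songs (titel TEXT, nummer INTEGER)")
cursor.execute("INSERT INTO songs VALUES (NULL, 1)")
cursor.execute("UPDATE songs SET titel = IFNULL(titel, '')")
conn.commit()                        # the commit now applies to the connection the cursor used

print(cursor.execute("SELECT * FROM songs").fetchall())  # [('', 1)]
conn.close()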

Can't create table lock in mssql from python

I'm using ceODBC to connect to SQL Server 2014 from a CentOS 6 box running Python 2.7.9.
In a critical part of our code, after inserting rows into a table, I want to double-check that all rows have arrived safely. I want to do this because sometimes an error happens, but ceODBC does not throw an error, and the table is empty.
To make sure that, between inserting the data and running a count statement, no other parts of the code do any inserts, I want to lock the table. This is where I have my problem. It seems that there is an sp_getapplock built into SQL Server, but when I do the following:
import ceodbc

conn = # Make connection here
cursor = conn.cursor()
cursor.execute("declare @result int; exec @result = sp_getapplock @Resource='Dim_Date', @LockMode='Exclusive'; select @result").fetchall()
The result is sometimes 0, sometimes -999, but the table is never locked for other connections.
Does anyone know what I'm doing wrong?
(I added the pyodbc tag because I think the two drivers are similar.)

Using the python MySQLDB SScursor with nested queries

The typical MySQLdb library query can use a lot of memory and perform poorly in Python, when a large result set is generated. For example:
cursor.execute("SELECT id, name FROM `table`")
for i in xrange(cursor.rowcount):
    id, name = cursor.fetchone()
    print id, name
There is an optional cursor that will fetch just one row at a time, really speeding up the script and cutting the memory footprint of the script a lot.
import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(user="user", passwd="password", db="dbname",
                       cursorclass=MySQLdb.cursors.SSCursor)
cur = conn.cursor()
cur.execute("SELECT id, name FROM users")
row = cur.fetchone()
while row is not None:
    doSomething()
    row = cur.fetchone()
cur.close()
conn.close()
But I can't find anything about using SSCursor with nested queries. If this is the definition of doSomething():
def doSomething():
    cur2 = conn.cursor()
    cur2.execute('select id, x, y from table2')
    rows = cur2.fetchall()
    for row in rows:
        doSomethingElse(row)
    cur2.close()
then the script throws the following error:
_mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now")
It sounds as if SSCursor is not compatible with nested queries. Is that true? If so, that's too bad, because the main loop seems to run too slowly with the standard cursor.
This problem is discussed a bit in the MySQLdb User's Guide, under the heading of the threadsafety attribute (emphasis mine):
The MySQL protocol can not handle multiple threads using the same connection at once. Some earlier versions of MySQLdb utilized locking to achieve a threadsafety of 2. While this is not terribly hard to accomplish using the standard Cursor class (which uses mysql_store_result()), it is complicated by SSCursor (which uses mysql_use_result()); with the latter you must ensure all the rows have been read before another query can be executed.
The documentation for the MySQL C API function mysql_use_result() gives more information about your error message:
When using mysql_use_result(), you must execute mysql_fetch_row() until a NULL value is returned, otherwise, the unfetched rows are returned as part of the result set for your next query. The C API gives the error Commands out of sync; you can't run this command now if you forget to do this!
In other words, you must completely fetch the result set from any unbuffered cursor (i.e., one that uses mysql_use_result() instead of mysql_store_result() - with MySQLdb, that means SSCursor and SSDictCursor) before you can execute another statement over the same connection.
In your situation, the most direct solution would be to open a second connection to use while iterating over the result set of the unbuffered query. (It wouldn't work to simply get a buffered cursor from the same connection; you'd still have to advance past the unbuffered result set before using the buffered cursor.)
If your workflow is something like "loop through a big result set, executing N little queries for each row," consider looking into MySQL's stored procedures as an alternative to nesting cursors from different connections. You can still use MySQLdb to call the procedure and get the results, though you'll definitely want to read the documentation of MySQLdb's callproc() method since it doesn't conform to Python's database API specs when retrieving procedure outputs.
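For illustration only, calling a stored procedure from MySQLdb might look like the sketch below; the procedure name and its argument are hypothetical, and a nextset() call is typically needed afterwards because calling a procedure leaves an extra (empty) result set behind:
cur = conn.cursor()
cur.callproc('get_related_rows', (row_id,))   # hypothetical procedure and argument
for related in cur.fetchall():                # first result set produced by the procedure
    doSomethingElse(related)
cur.nextset()                                 # clear the remaining result set before the next query
cur.close()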
A second alternative is to stick to buffered cursors, but split up your query into batches. That's what I ended up doing for a project last year where I needed to loop through a set of millions of rows, parse some of the data with an in-house module, and perform some INSERT and UPDATE queries after processing each row. The general idea looks something like this:
QUERY = r"SELECT id, name FROM `table` WHERE id BETWEEN %s and %s;"
BATCH_SIZE = 5000
i = 0
while True:
    cursor.execute(QUERY, (i + 1, i + BATCH_SIZE))
    result = cursor.fetchall()
    # If there's no possibility of a gap as large as BATCH_SIZE in your table ids,
    # you can test to break out of the loop like this (otherwise, adjust accordingly):
    if not result:
        break
    for row in result:
        doSomething()
    i += BATCH_SIZE
One other thing I would note about your example code is that you can iterate directly over a cursor in MySQLdb instead of calling fetchone() explicitly over xrange(cursor.rowcount). This is especially important when using an unbuffered cursor, because the rowcount attribute is undefined and will give a very unexpected result (see: Python MysqlDB using cursor.rowcount with SSDictCursor returning wrong count).
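As a small illustration of that last point, here is a sketch (reusing the names from the question) that iterates over the unbuffered cursor directly, so rowcount is never consulted:
cur = conn.cursor()
cur.execute("SELECT id, name FROM users")
for user_id, name in cur:   # the cursor itself is iterable and fetches rows lazily
    doSomething()
cur.close()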

MySQL rows delete when program exits

I'm using Python and MySQLdb to add rows to my database. It seems that when my script exits, the rows get deleted. My last lines before the script exits do a "select *" on the table, which shows my one row. When I re-run the script, the first lines (after opening the connection) do the same "select *" and return zero results. I'm really at a loss here. I've been working for about 2 hours on this, and can't understand what could be accessing my database.
Also, between running the scripts, I run the "select *" manually from a terminal with zero results.
If I manually add a row from the terminal, it seems to last.
The query to insert the row:
cursor.execute("INSERT INTO sessions(username, id, ip) VALUES (%s, %s, %s)", (username, SessionID, IP]))
The query I use to check the data:
cursor.execute("select * from sessions")
print cursor.fetchall()
This shows the row before the program exits, then shows nothing when the program is run again.
Thanks in advance for all the help.
Looks like you need to call connection.commit() after you execute the query (replace connection with your DB connection variable).
http://docs.python.org/library/sqlite3.html
Connection.commit():
This method commits the current transaction. If you don’t call this method, anything you did since the last call to commit() is not visible from other database connections. If you wonder why you don’t see the data you’ve written to the database, please check you didn’t forget to call this method.
Check this other question: Python MySQLdb update query fails
You can find some examples on how to commit, how to connect using autocommit, etc.
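Applied to the code from the question, a minimal sketch would be the following, assuming connection is the MySQLdb connection object the cursor came from:
cursor.execute("INSERT INTO sessions(username, id, ip) VALUES (%s, %s, %s)",
               (username, SessionID, IP))
connection.commit()   # without this, the INSERT is rolled back when the connection closes

cursor.execute("select * from sessions")
print cursor.fetchall()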
