Problems INSERTing record if similar doesn't already exist

I'm trying to check whether a record already exists in the database (by similar title), and insert it if not. I've tried it two ways and neither quite works.
More elegant way (?) using IF NOT EXISTS
if mode=="update":
    #check if book is already present in the system
    cursor.execute('IF NOT EXISTS (SELECT * FROM book WHERE TITLE LIKE "%s") INSERT INTO book (title,author,isbn) VALUES ("%s","%s","%s") END IF;' % (title,title,author,isbn))
    cursor.execute('SELECT bookID FROM book WHERE TITLE LIKE "%s";' % (title))
    bookID = cursor.fetchall()
    print('found the bookid %s' % (bookID))
    #cursor.execute('INSERT INTO choice (uid,catID,priority,bookID) VALUES ("%d","%s","%s","%s");' % ('1',cat,priority,bookID)) #commented out because above doesn't work
With this, I get an error on the IF NOT EXISTS query saying that "author" isn't defined (although it is).
Less elegant way using count of matching records
if mode=="update":
    #check if book is already present in the system
    cursor.execute('SELECT COUNT(*) FROM book WHERE title LIKE "%s";' % (title))
    anyresults = cursor.fetchall()
    print('anyresults looks like %s' % (anyresults))
    if anyresults[0] == 0: # if we didn't find a bookID
        print("I'm in the loop for adding a book")
        cursor.execute('INSERT INTO book (title,author,isbn) VALUES ("%s","%s","%s");' % (title,author,isbn))
    cursor.execute('SELECT bookID FROM book WHERE TITLE LIKE "%s";' % (title))
    bookID = cursor.fetchall()
    print('found the bookid %s' % (bookID))
    #cursor.execute('INSERT INTO choice (uid,catID,priority,bookID) VALUES ("%d","%s","%s","%s");' % ('1',cat,priority,bookID)) #commented out because above doesn't work
In this version, anyresults is a tuple that looks like (0L,) but I can't find a way of matching it that gets me into that "loop for adding a book." if anyresults[0] == 0, 0L, '0', '0L' -- none of these seem to get me into the loop.
I think I may not be using IF NOT EXISTS correctly--examples I've found are for separate procedures, which aren't really in the scope of this small project.
ADDITION:
I think unutbu's code will work great, but I'm still getting this dumb NameError saying author is undefined, which prevents the INSERT from being tried, even when I am definitely passing it in.
if form.has_key("title"):
    title = form['title'].value
    mode = "update"
if form.has_key("author"):
    author = form['author'].value
    mode = "update"
    print("I'm in here")
if form.has_key("isbn"):
    isbn = form['isbn'].value
    mode = "update"
It never prints that "I'm in here" test statement. What would stop it getting in there? It seems so obvious--I keep checking my indentation, and I'm testing it on the command line and definitely specifying all three parameters.

If you set up a UNIQUE index on book, then inserting unique rows is easy.
For example,
mysql> ALTER IGNORE TABLE book ADD UNIQUE INDEX book_index (title,author);
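WARNING: if there are rows with non-unique (title,author) pairs, all but one such row will be dropped.
If you want just the author field to be unique, then just change (title,author) to (author).
Depending on how big the table is, this may take a while...
Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4; on newer servers, remove the duplicate rows yourself and use a plain ALTER TABLE.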
Now, to insert a unique record,
sql='INSERT IGNORE INTO book (title,author,isbn) VALUES (%s, %s, %s)'
cursor.execute(sql,[title,author,isbn])
If the (title,author) pair is not already in the table, the triplet (title,author,isbn) is inserted into the book table.
If the (title,author) pair is already present, the INSERT command is silently ignored.
Note the second argument to cursor.execute. Passing the values this way, instead of formatting them into the string yourself, helps prevent SQL injection.
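As a rough, runnable sketch of the same idea, the snippet below uses Python's built-in sqlite3 module instead of MySQL, so that it needs no server. That swap is an assumption made only for illustration: SQLite spells the statement INSERT OR IGNORE and uses ? placeholders, while MySQLdb uses INSERT IGNORE and %s. The helper name add_book and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE book (bookID INTEGER PRIMARY KEY, title TEXT, author TEXT, isbn TEXT)"
)
# The unique index is what lets the database reject duplicates for us.
cur.execute("CREATE UNIQUE INDEX book_index ON book (title, author)")


def add_book(cur, title, author, isbn):
    """Insert the book unless (title, author) already exists; return its bookID."""
    # SQLite spelling; the MySQL equivalent is INSERT IGNORE with %s placeholders.
    cur.execute(
        "INSERT OR IGNORE INTO book (title, author, isbn) VALUES (?, ?, ?)",
        (title, author, isbn),
    )
    cur.execute(
        "SELECT bookID FROM book WHERE title = ? AND author = ?",
        (title, author),
    )
    return cur.fetchone()[0]


first = add_book(cur, "Example Title", "Example Author", "isbn-1")
second = add_book(cur, "Example Title", "Example Author", "isbn-1")
conn.commit()
```

Both calls return the same bookID, and the table ends up with a single row, because the second insert is ignored.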

This doesn't answer your question since it's for PostgreSQL rather than MySQL, but I figured I'd drop it in for people searching their way here.
In PostgreSQL, you can batch insert items if they don't exist:
CREATE TABLE book (title TEXT, author TEXT, isbn TEXT);
-- Create a row of test data:
INSERT INTO book (title,author,isbn) VALUES ('a', 'b', 'c');
-- Do the real batch insert:
INSERT INTO book
SELECT add.* FROM (VALUES
    ('a', 'b', 'c'),
    ('d', 'e', 'f'),
    ('g', 'h', 'i')
) AS add (title, author, isbn)
LEFT JOIN book ON (book.title = add.title)
WHERE book.title IS NULL;
This is pretty simple. It selects the new rows as if they're a table, then left joins them against the existing data. The rows that don't already exist will join against a NULL row; we then filter out the ones that already exist (where book.title isn't NULL). This is extremely fast: it takes only a single database transaction to do a large batch of inserts, and lets the database backend do a bulk join, which it's very good at.
By the way, you really need to stop formatting your SQL queries directly (unless you really have to and really know what you're doing, which you don't here). Use query substitution, e.g. cur.execute("SELECT * FROM table WHERE title=? AND isbn=?", (title, isbn)). The placeholder character depends on the driver: some use ?, while MySQLdb uses %s.
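A small sketch of query substitution applied to the count-based attempt from the question, using the standard-library sqlite3 module (which uses ? placeholders) as a stand-in for the real database; the helper name book_count and the sample values are hypothetical. It also shows why comparing anyresults[0] with 0 never matched: fetchall() returns a sequence of rows, so its first element is the row (0L,), not the number inside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE book (title TEXT, author TEXT, isbn TEXT)")


def book_count(cur, title):
    """Return how many rows have a title matching the given pattern."""
    cur.execute("SELECT COUNT(*) FROM book WHERE title LIKE ?", (title,))
    # fetchone() returns a single row as a tuple such as (0,),
    # so index into it to get the number itself.
    return cur.fetchone()[0]


title = "O'Reilly's Guide"  # the quote is handled by the driver, not by us
before = book_count(cur, title)
if before == 0:
    cur.execute(
        "INSERT INTO book (title, author, isbn) VALUES (?, ?, ?)",
        (title, "A. Author", "isbn-1"),
    )
after = book_count(cur, title)
```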

Related

MySQL & Python select last corresponding row instead of first

I have a quick question. I want to count how many times a user has logged into the system. To achieve this I add 1 to the third field of the result. The only problem is that every time the user logs in, the code fetches the first matching row, so login_num will always be 2, since the first matching row always contains a 1.
I searched Stack Overflow for solutions and came up with adding DESC at the end of the query. However, every time I try this I get an error in return. Does anyone have an idea why this is the case?
Python code:
cursor.execute("Select rfid_uid, name, login_num FROM users rfid_uid="+str(id) + "ORDER BY id DESC")
result = cursor.fetchone()
if cursor.rowcount >= 1:
    print("Welkom " + result[1])
    print(result)
    result = (result[0], result[1], result[2] + 1)
    sql_insert = "INSERT INTO users (rfid_uid, name, login_num) VALUES (%s, %s, %s)"
    cursor.execute(sql_insert, (result))
    db.commit()

Seems your SQL statement refers to table 'users'. I suppose it does contain info about users in general (a row per user), not user logins.
If you have each individual user login event registered in some table, I would let the database do the counting. Something like this:
SELECT COUNT(*) FROM user_logins WHERE rfid_uid='user_id';
You should get one row, which has your answer as an integer.
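A minimal sketch of letting the database do the counting, assuming a user_logins table with one row per login event. The table layout, the helper names and the card IDs are made up for illustration, and sqlite3 stands in for MySQL so the snippet runs without a server (with MySQLdb the placeholders would be %s rather than ?).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical layout: one row per login event.
cur.execute(
    "CREATE TABLE user_logins ("
    "rfid_uid TEXT, logged_in_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)


def record_login(cur, rfid_uid):
    """Store a single login event for the given card."""
    cur.execute("INSERT INTO user_logins (rfid_uid) VALUES (?)", (rfid_uid,))


def login_count(cur, rfid_uid):
    """Ask the database how many login events exist for the given card."""
    cur.execute("SELECT COUNT(*) FROM user_logins WHERE rfid_uid = ?", (rfid_uid,))
    return cur.fetchone()[0]


for _ in range(3):
    record_login(cur, "card-42")
record_login(cur, "card-7")
conn.commit()

count_42 = login_count(cur, "card-42")
```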

Can I use parameter insertion to specify column names for MySQL queries?

I want to know if it is possible to use parameter insertion for column names into MySQL queries using Python.
Consider the following two queries, both of which are passed to MySQLCursor.execute(). The first:
query = (
    'SELECT username, COUNT(*)'
    'FROM `entry`'
    'GROUP BY username;'
)
cursor.execute(query)
And the second:
query = (
    'SELECT %s, COUNT(*)'
    'FROM `entry`'
    'GROUP BY %s;'
)
data = ('username', 'username')
cursor.execute(query, data)
The first of these returns the results I expect (a count of how many times each distinct value appears in the username column) and the second returns unexpected results, specifically [(u'username', n)] where n is the total number of rows in the table.
The problem in the second query is that the parameters are interpreted as strings by the query. Is there a way to insert them such that they are interpreted as non-strings? I want to do this in a way that is safe from injection attacks.

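The second form cannot work: query parameters are always bound as values, never as identifiers, so %s becomes the string literal 'username' rather than the column username, and every row falls into a single group. Column and table names have to be written into the query text itself. The first form is fine and safe as long as the names come from your own code rather than from user input.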
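If the column name does have to vary at run time, one common approach is to check it against a fixed list of known columns before building the query text. The sketch below is only an illustration: ALLOWED_COLUMNS, its contents and the helper name count_by_column are assumptions, not part of any MySQL library.

```python
# Hypothetical whitelist of columns that callers may group by.
ALLOWED_COLUMNS = {"username", "country"}


def count_by_column(column):
    """Build the GROUP BY query for a column, refusing unknown names."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name: %r" % (column,))
    # Safe to interpolate: the name is one of our own hard-coded strings.
    return "SELECT `{col}`, COUNT(*) FROM `entry` GROUP BY `{col}`;".format(col=column)


query = count_by_column("username")
# cursor.execute(query)  # cursor is assumed to be an open MySQL cursor
```

Any values in the query (as opposed to column names) should still be passed as parameters to cursor.execute.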

Insert list of dictionaries and variable into table

lst = [{'Fruit':'Apple','HadToday':2},{'Fruit':'Banana','HadToday':8}]
I have a long list of dictionaries of the form above.
I have two fixed variables.
person = 'Sam'
date = datetime.datetime.now()
I wish to insert this information into a mysql table.
How I do it currently
for item in lst:
    item['Person'] = person
    item['Date'] = date
cursor.executemany("""
INSERT INTO myTable (Person,Date,Fruit,HadToday)
VALUES (%(Person)s, %(Date)s, %(Fruit)s, %(HadToday)s)""", lst)
conn.commit()
Is there a way to do it that bypasses the loop, since the person and date variables are constant? I have tried
lst = [{'Fruit':'Apple','HadToday':2},{'Fruit':'Banana','HadToday':8}]
cursor.executemany("""
INSERT INTO myTable (Person,Date,Fruit,HadToday)
VALUES (%s, %s, %(Fruit)s, %(HadToday)s)""", (person,date,lst))
conn.commit()
This fails with TypeError: not enough arguments for format string

The problem is that executemany expects a sequence containing one complete parameter set per row. The tuple (person, date, lst) is not that, so the placeholders cannot all be filled, hence the TypeError.
You should not fix it by hardcoding the fixed values into the statement, as you will run into trouble with a name like "Tim O'Molligan"; it is better to let the database driver handle the quoting.
Not MySQL, but you get the gist: http://initd.org/psycopg/docs/usage.html#the-problem-with-the-query-parameters - learned this myself just a week ago ;o)
Probably the cleanest way would be to store the constants in MySQL user variables (which are written with @)
cursor.execute("SET @myname = %s", (person,))
cursor.execute("SET @mydate = %s", (datetime.datetime.now(),))
and use
cursor.executemany("""
INSERT INTO myTable (Person,Date,Fruit,HadToday)
VALUES (@myname, @mydate, %(Fruit)s, %(HadToday)s)""", lst)
I am not 100% sure about the syntax, but I hope you get the idea. Comment on or edit the answer if I have made a mistake in it.
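Another option, sketched here under the assumption that building a second list is acceptable, is to merge the constants into copies of the dictionaries with a list comprehension. It is still a loop under the hood, but it is a one-liner and leaves lst untouched. The name rows is hypothetical, and the executemany call is commented out because it needs an open MySQL cursor.

```python
import datetime

lst = [{'Fruit': 'Apple', 'HadToday': 2}, {'Fruit': 'Banana', 'HadToday': 8}]
person = 'Sam'
date = datetime.datetime.now()

# dict(item, Person=..., Date=...) copies each dictionary and adds the constants.
rows = [dict(item, Person=person, Date=date) for item in lst]

sql = """
    INSERT INTO myTable (Person, Date, Fruit, HadToday)
    VALUES (%(Person)s, %(Date)s, %(Fruit)s, %(HadToday)s)"""
# cursor.executemany(sql, rows)  # cursor is assumed to be an open MySQL cursor
# conn.commit()
```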

Is it possible to create a query command that takes in a list of variables in python-mysql

I am trying to do a multi-row query using executemany in the MySQLdb library. After searching around, I found that I have to build a command that uses INSERT INTO along with ON DUPLICATE KEY, instead of UPDATE, in order to use executemany.
All is good so far, but then I ran into a problem: I can't build the SET part efficiently. My table has about 20 columns (whether you want to criticize the fatness of the table is up to you. It works for me so far) and I want to form the command string efficiently if possible.
Right now I have
update_query = """
INSERT INTO `my_table`
({all_columns}) VALUES({vals})
ON DUPLICATE KEY SET <should-have-each-individual-column-set-to-value-here>
""".format(all_columns=all_columns, vals=vals)
Where all_columns covers all the columns, and vals cover bunch of %s as I'm going to use executemany later.
However I have no idea how to form the SET part of string. I thought about using comma-split to separate them into elements in a list, but I'm not sure if I can iterate them.
Overall, the goal of this is to only call the db once for update, and that's the only way I can think of right now. If you happen to have a better idea, please let me know as well.
EDIT: adding more info
all_columns is something like 'id, id2, num1, num2'
vals right now is set to be '%s, %s, %s, %s'
and of course there are more columns than just 4

Assuming that you have a list of (column, value expression) tuples for the update part of your command:
listUpdate = [('f1', 'i'), ('f2', '2')]
# Join the pairs into "f1 = i, f2 = 2"
setCommand = ', '.join(['%s = %s' % x for x in listUpdate])
all_columns = 'id, id2, num1, num2'
vals = '%s, %s, %s, %s'
# Note: the MySQL clause is ON DUPLICATE KEY UPDATE, not ON DUPLICATE KEY SET
update_query = """
INSERT INTO `my_table`
({all_columns}) VALUES({vals})
ON DUPLICATE KEY UPDATE {set}
""".format(all_columns=all_columns, vals=vals, set=setCommand)
print(update_query)
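For use with executemany, the update part usually needs to refer to the values being inserted rather than to fixed expressions. A sketch of deriving it from all_columns, using MySQL's col = VALUES(col) form in ON DUPLICATE KEY UPDATE (deprecated in recent MySQL 8.0 releases in favor of row aliases, but still widely used). The names key_columns and set_clause are hypothetical, and which columns form the key is an assumption.

```python
all_columns = 'id, id2, num1, num2'
key_columns = {'id', 'id2'}  # assumption: these make up the unique key

columns = [c.strip() for c in all_columns.split(',')]
vals = ', '.join(['%s'] * len(columns))

# One "col = VALUES(col)" entry per non-key column.
set_clause = ', '.join(
    '{0} = VALUES({0})'.format(c) for c in columns if c not in key_columns
)

update_query = (
    "INSERT INTO `my_table` ({cols}) VALUES ({vals}) "
    "ON DUPLICATE KEY UPDATE {set_clause}"
).format(cols=', '.join(columns), vals=vals, set_clause=set_clause)

# cursor.executemany(update_query, list_of_row_tuples)  # one tuple per row
```

The placeholders in vals are left as literal %s so that the driver fills them in for each row.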

SQLAlchemy: Operating on results

I'm trying to do something relatively simple, spit out the column names and respective column values, and possibly filter out some columns so they aren't shown.
This is what I attempted ( after the initial connection of course ):
metadata = MetaData(engine)
users_table = Table('fusion_users', metadata, autoload=True)
s = users_table.select(users_table.c.user_name == username)
results = s.execute()
if results.rowcount != 1:
    return 'Sorry, user not found.'
else:
    for result in results:
        for x, y in result.items():
            print x, y
I looked at the API docs for SQLAlchemy (v.5) but was rather confused. My 'result' in 'results' is a RowProxy, yet I don't think it's returning the right object for the .items() invocation.
Let's say my table structure is as follows:
user_id  user_name  user_password  user_country
0        john       a9fu93f39uf    usa
I want to filter and specify the column names to show (I don't want to show the user_password, obviously). How can I accomplish this?

A SQLAlchemy RowProxy object has dict-like methods: .items() to get all name/value pairs, .keys() to get just the names (e.g. to display them as a header line, then use .values() for the corresponding values, or use each key to index into the RowProxy object), and so on. Its being a "smart object" rather than a plain dict shouldn't inconvenience you unduly.

You can use results instantly as an iterator.
results = s.execute()
for row in results:
    print row
Selecting specific columns is done the following way:
from sqlalchemy.sql import select
s = select([users_table.c.user_name, users_table.c.user_country], users_table.c.user_name == username)
for user_name, user_country in s.execute():
    print user_name, user_country
To print the column names in addition to the values, the approach in your question should be the best, because RowProxy is really nothing more than an ordered dictionary.
IMO the API documentation for SQLAlchemy is not really helpful for learning how to use it. I would suggest reading the SQL Expression Language Tutorial. It contains the most vital information about basic querying with SQLAlchemy.
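If you would rather keep selecting whole rows and hide columns at display time, a small filter over the name/value pairs is enough. In this sketch a plain list of pairs stands in for what result.items() returns, so it runs without a database; the helper name visible_items and the hidden tuple are hypothetical.

```python
def visible_items(row_items, hidden=('user_password',)):
    """Return (column, value) pairs, skipping any column named in hidden."""
    return [(name, value) for name, value in row_items if name not in hidden]


# Stand-in for result.items() on a row of the fusion_users table.
row_items = [
    ('user_id', 0),
    ('user_name', 'john'),
    ('user_password', 'a9fu93f39uf'),
    ('user_country', 'usa'),
]

for name, value in visible_items(row_items):
    print(name, value)
```

With a real result you would pass result.items() in place of row_items.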
