Access database UPSERT with pyodbc - python

I am using pyodbc to update an Access database.
I need the functionality of an UPSERT.
ON DUPLICATE KEY UPDATE doesn't exist in Access SQL, and REPLACE is not an option since I want to keep other fields.
There are a lot of suggestions out there on how to solve that, so this is the solution I put together:
for table_name in data_source:
    table = data_source[table_name]
    for y in table:
        if table_name == "whatever":
            SQL_UPDATE = "UPDATE {} set [Part Name] = '{}', [value] = {}, [code] = {}, [tolerance] = {} WHERE [Unique Part Number]='{}'".\
                format(table_name,y['PartName'],y['Value'],y['keycode'],y['Tolerance'], y['PartNum'])
            SQL_INSERT = "INSERT INTO {} ([Part Name],[Unique Part Number], [value], [code], [tolerance]) VALUES ('{}','{}',{},{},{});".\
                format(table_name,y['PartName'],y['PartNum'],y['Value'],y['keycode'],y['Tolerance'])
        elif ....
            9 more tables....
        res = cursor.execute(SQL_UPDATE)
        if res.rowcount == 0:
            cursor.execute(SQL_INSERT)
Well I have to say, I am not a Python expert, and I didn't manage to understand the fundamental concept nor the Magic of SQL,
so I can just Google things together here.
I don't like my above solution because it is very hard to read and difficult to maintain (I have to do this for ~10 different tables). The other point is that I have to use 2 queries because I didn't manage to understand and run any other UPSERT approach I found.
Does anyone have a recommendation for me how to do this in a smarter, better maintainable way?

As noted in this question and others, Access SQL does not have an "upsert" statement, so you will need to use a combination of UPDATE and INSERT. However, you can improve your current implementation by
using proper parameters for your query, and
using Python string manipulation to build the SQL command text.
For example, to upsert into a table named [Donor]
Donor ID  Last Name  First Name
--------  ---------  ----------
       1  Thompson   Gord
You can start with a list of the field names. The trick here is to put the key field(s) at the end, so the INSERT and UPDATE statements will refer to the fields in the same order (i.e., the UPDATE will refer to the ID field last because it will be in the WHERE clause).
data_fields = ['Last Name', 'First Name']
key_fields = ['Donor ID']
The parameter values will be the same for both the UPDATE and INSERT cases, e.g.
params = ('Elk', 'Anne', 2)
The UPDATE statement can be constructed like this
update_set = ','.join(['[' + x + ']=?' for x in data_fields])
update_where = ' AND '.join(['[' + x + ']=?' for x in key_fields])
sql_update = "UPDATE [Donor] SET " + update_set + " WHERE " + update_where
print(sql_update)
which shows us
UPDATE [Donor] SET [Last Name]=?,[First Name]=? WHERE [Donor ID]=?
Similarly, the INSERT statement can be constructed like this
insert_fields = ','.join(['[' + x + ']' for x in (data_fields + key_fields)])
insert_placeholders = ','.join(['?' for x in (data_fields + key_fields)])
sql_insert = "INSERT INTO [Donor] (" + insert_fields + ") VALUES (" + insert_placeholders + ")"
print(sql_insert)
which prints
INSERT INTO [Donor] ([Last Name],[First Name],[Donor ID]) VALUES (?,?,?)
So, to perform our upsert, all we need to do is
crsr.execute(sql_update, params)
if crsr.rowcount > 0:
    print('Existing row updated.')
else:
    crsr.execute(sql_insert, params)
    print('New row inserted.')
crsr.commit()
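The steps above can be wrapped in a small helper so that each of the ~10 tables only needs its field lists. This is a minimal sketch, not a definitive implementation: the function names build_upsert_sql and upsert are made up for illustration, and an in-memory sqlite3 database stands in for the pyodbc/Access connection so that the example runs anywhere (SQLite also accepts the [bracketed] identifiers used here).

```python
import sqlite3


def build_upsert_sql(table, data_fields, key_fields):
    """Build matching UPDATE and INSERT statements (hypothetical helper)."""
    all_fields = data_fields + key_fields  # key field(s) last, as in the answer
    update_set = ','.join('[' + f + ']=?' for f in data_fields)
    update_where = ' AND '.join('[' + f + ']=?' for f in key_fields)
    sql_update = ('UPDATE [' + table + '] SET ' + update_set +
                  ' WHERE ' + update_where)
    insert_fields = ','.join('[' + f + ']' for f in all_fields)
    insert_placeholders = ','.join('?' for _ in all_fields)
    sql_insert = ('INSERT INTO [' + table + '] (' + insert_fields +
                  ') VALUES (' + insert_placeholders + ')')
    return sql_update, sql_insert


def upsert(crsr, table, data_fields, key_fields, params):
    """Try the UPDATE first; fall back to INSERT when no row matched."""
    sql_update, sql_insert = build_upsert_sql(table, data_fields, key_fields)
    crsr.execute(sql_update, params)
    if crsr.rowcount > 0:
        return 'updated'
    crsr.execute(sql_insert, params)
    return 'inserted'


# Stand-in database (assumption: sqlite3 instead of pyodbc/Access).
conn = sqlite3.connect(':memory:')
crsr = conn.cursor()
crsr.execute('CREATE TABLE [Donor] ([Donor ID] INTEGER PRIMARY KEY, '
             '[Last Name] TEXT, [First Name] TEXT)')
crsr.execute("INSERT INTO [Donor] VALUES (1, 'Thompson', 'Gord')")

data_fields = ['Last Name', 'First Name']
key_fields = ['Donor ID']
first = upsert(crsr, 'Donor', data_fields, key_fields, ('Elk', 'Anne', 2))
second = upsert(crsr, 'Donor', data_fields, key_fields, ('Elk', 'Annie', 2))
conn.commit()

rows = crsr.execute('SELECT [Donor ID], [Last Name], [First Name] '
                    'FROM [Donor] ORDER BY [Donor ID]').fetchall()
```

The table and field names are still concatenated into the SQL text, so they should come from your own code and never from user input; only the values travel as parameters.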

Consider using parameterized queries with ? placeholders. str.format is still needed for identifiers such as table and field names, because those cannot be passed as parameters. Then unpack the dictionary items with zip(*dict.items()) and pass the values as parameters in the cursor's execute call: cursor.execute(query, params). The values must arrive in the same order as the placeholders, so both statements below list the key field last.
for table_name in data_source:
    table = data_source[table_name]
    for y in table:
        keys, values = zip(*y.items()) # UNPACK DICTIONARY INTO TWO TUPLES
        if table_name == "whatever":
            # Parentheses are required: .format binds tighter than +
            SQL_UPDATE = ("UPDATE {} set [Part Name] = ?, [value] = ?, [code] = ?,"
                          " [tolerance] = ? WHERE [Unique Part Number]= ?").format(table_name)
            # Same column order as the UPDATE so one tuple serves both statements
            SQL_INSERT = ("INSERT INTO {} ([Part Name], [value], [code],"
                          " [tolerance], [Unique Part Number]) VALUES (?, ?, ?, ?, ?);").format(table_name)
        res = cursor.execute(SQL_UPDATE, values)
        if res.rowcount == 0:
            cursor.execute(SQL_INSERT, values)
        ...
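Because zip(*y.items()) returns the values in whatever order the dictionary happens to hold them, it can be safer to pull the values out in an explicit order that matches the placeholders. A minimal sketch, assuming a hypothetical sample record: the key names follow the question, while the values and the param_order list are made up for illustration.

```python
# Hypothetical record; key names follow the question, values are invented.
y = {'PartNum': 'R-001', 'PartName': 'Resistor', 'Value': 100,
     'keycode': 7, 'Tolerance': 5}

# zip(*y.items()) splits the dictionary into a tuple of keys and a tuple of
# values, both in the dictionary's own insertion order.
keys, values = zip(*y.items())

# Listing the keys explicitly (data fields first, key field last) makes the
# parameter order independent of how the dictionary was built.
param_order = ['PartName', 'Value', 'keycode', 'Tolerance', 'PartNum']
params = tuple(y[name] for name in param_order)
```

Here values starts with the part number, which would not line up with a statement whose last placeholder is the key, whereas params does.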

Related

PYTHON - Dynamically update multiple columns using a custom MariaDB connector

While reading this question: SQL Multiple Updates vs single Update performance
I was wondering how could I dynamically implement an update for several variables at the same time using a connector like MariaDB's. Reading the official documentation I did not find anything similar.
This question is similar, and it has helped me to understand how to use parametrized queries with custom connectors but it does not answer my question.
Let's suppose that, from one of the views of the project, we receive a dictionary.
This dictionary has the following structure (simplified example):
{'form-0-input_file_name': 'nofilename', 'form-0-id': 'K0944', 'form-0-gene': 'GJXX', 'form-0-mutation': 'NM_0040(p.Y136*)', 'form-0-trix': 'ZSSS4'}
Assuming that each key in the dictionary corresponds to a column in a table of the database, if I'm not mistaken we would have to iterate over the dictionary and build the query in each iteration.
Something like this (semi pseudo-code, probably it's not correct):
query = "UPDATE `db-dummy`.info "
for key in a_dict:
    query += "SET key = a_dict[key]"
It is not clear to me how to construct said query within a loop.
What is the most pythonic way to achieve this?
Although this could work.
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    query = query + ("," if index != 0 else "") +" SET {0} = '{1}'".format(key,a_dict[key])
You should use parameterized queries for safety and security. Moreover, a dynamic dictionary raises other concerns: column names cannot be passed as parameters, so it is best to verify or filter the keys against a set of agreed column names before attempting such an operation. Note that SET appears only once, followed by comma-separated assignments.
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    query = query + (", " if index != 0 else "") + "{0} = ?".format(key)
# Then execute with your connection/cursor
cursor.execute(query, tuple(a_dict.values()))
This is what I did (inspired by @ggordon's answer)
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    if index == 0:
        query = query + "SET {0} = ?".format(key)
    else:
        query = query + ", {0} = ?".format(key)
query += " WHERE record_id = " + record_id
And it works!
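The loop can also be written with str.join, with the keys checked against a whitelist and the WHERE value passed as a parameter instead of being concatenated. This is only a sketch under stated assumptions: ALLOWED_COLUMNS and build_update are hypothetical names, the column names are simplified versions of the dictionary keys in the question, and sqlite3 stands in for the MariaDB connector because both accept ? placeholders.

```python
import sqlite3

# Hypothetical whitelist of columns that may be updated.
ALLOWED_COLUMNS = {'gene', 'mutation', 'trix'}


def build_update(table, changes, key_column):
    """Return (query, values) for a parameterized UPDATE (hypothetical helper)."""
    unknown = set(changes) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError('unexpected column(s): ' + ', '.join(sorted(unknown)))
    columns = list(changes)
    # One SET keyword, then comma-separated "column = ?" assignments.
    set_clause = ', '.join('{0} = ?'.format(col) for col in columns)
    query = 'UPDATE {0} SET {1} WHERE {2} = ?'.format(
        table, set_clause, key_column)
    return query, [changes[col] for col in columns]


# Stand-in database (assumption: sqlite3 instead of MariaDB).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE info (record_id TEXT PRIMARY KEY, '
             'gene TEXT, mutation TEXT, trix TEXT)')
conn.execute("INSERT INTO info VALUES ('K0944', 'old', 'old', 'old')")

query, values = build_update('info', {'gene': 'GJXX', 'trix': 'ZSSS4'},
                             'record_id')
# The key value is appended last because its placeholder comes last.
conn.execute(query, values + ['K0944'])
row = conn.execute('SELECT gene, mutation, trix FROM info').fetchone()
```

Only the listed columns change; a key outside the whitelist raises ValueError before any SQL is built.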

Using WHERE in query PyMySQL

I am trying to create a program where a user can enter an operator, i.e. <> or =, and then a number, to query a database with PyMySQL. I have tried a number of different ways of doing this, but without success. I have two files: display, and a second file that imports display.
Document 1
def get_pop(op, pop):
    if (not conn):
        connect()
    query = "SELECT * FROM city WHERE Population %s %s"
    with conn:
        cursor = conn.cursor()
        cursor.execute(query, (op, pop))
        x = cursor.fetchall()
        return x
Document 2
def city():
    op = input("Enter < > or =: ")
    population = input("Enter population: ")
    pop = display.get_pop(op, population)
    for p in pop:
        print(p)
I am getting the following error.
pymysql.err.ProgrammingError: (1064,......
Please help thanks
You can't do this. Parameterization works for values only, not for operators, table names, or column names. You'll need to format the operator into the string. Do not confuse the %s placeholder here with Python string formatting; the Python MySQL drivers are awkward in that they use %s for binding parameters, which clashes with regular Python string formatting.
The %s in the query string lets the driver escape the user input to protect against SQL injection. In this case, I set up a basic test to check that the operator submitted by the user is in a list of accepted operators.
def get_pop(op, pop):
    query = "SELECT * FROM city WHERE Population {} %s" # Add a placeholder for format
    with conn: # Where does this come from?
        cursor = conn.cursor()
        if op in ['=', '!=']:
            cursor.execute(query.format(op), (pop,))
            x = cursor.fetchall()
    return x
You'll want to come up with some reasonable return value in the case that if op in ['=', '!='] is not True but that depends entirely on how you want this to behave.
After checking that op indeed contains either "<>" or "=" and that pop indeed contains a number, you could try:
query = "SELECT * FROM city WHERE Population " + op + " %s"
Beware of SQL injection.
Then
cursor.execute(query, (pop,))
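Both answers come down to the same pattern: check the operator against a fixed list, format it into the SQL text, and leave the value as a %s parameter. A minimal sketch of that pattern as a pure function, so it can be tried without a database; ALLOWED_OPERATORS and build_population_query are hypothetical names introduced for illustration.

```python
# Hypothetical whitelist; extend it with whatever operators you want to allow.
ALLOWED_OPERATORS = {'<', '>', '=', '<>'}


def build_population_query(op, pop):
    """Return (query, params) ready for cursor.execute (hypothetical helper)."""
    op = op.strip()
    if op not in ALLOWED_OPERATORS:
        raise ValueError('unsupported operator: {!r}'.format(op))
    # The operator is formatted into the SQL text; str.format leaves %s
    # untouched, so the value is still bound by the driver.
    query = 'SELECT * FROM city WHERE Population {} %s'.format(op)
    # int() raises ValueError if the user did not enter a whole number.
    return query, (int(pop),)


query, params = build_population_query('>', '1000000')
# With PyMySQL this would then be run as: cursor.execute(query, params)
```

Raising an exception for an unknown operator is one possible answer to the question of what to return in that case; the caller can catch it and ask the user again.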

Python, SQlite3 - Querying condition equals and not in (Multiple)

How can I write a query in Python that filters both on a condition that equals a value, i.e. r.user = (given user id), and on a value that is NOT IN (given list of movie ids)?
This is what I currently have
placeholder = '?' # For SQLite. See DBAPI paramstyle.
placeholders = ', '.join(placeholder * len(l))
query = 'SELECT r.user, r.movie, r.rating, m.title FROM ratings r JOIN movies m ON (r.movie = m.id) ' \
'WHERE r.user = 405 AND r.rating >= 3 AND r.movie NOT IN (%s)' % placeholders
cursor.execute(query, ('405', l))
movies_table = cursor.fetchall()
l refers to a list of values, so that I can get the result set where the movie id is not in that list.
I'm currently able to get one condition or the other working, but not both, seemingly because of the number of parameters supplied.
Thanks very much,
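You need to call cursor.execute() with exactly one item per placeholder. Your query hardcodes 405 and only has placeholders for the movie ids, so try something like this:
cursor.execute(query, tuple(l))
If you want to pass the 405 as a parameter as well, replace it with a ? in the query (r.user = ?) and put it in front of the list of values:
cursor.execute(query, (405, *l))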
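Put together, a version with both the user id and the excluded movie ids as parameters might look like the sketch below. The table layout follows the query in the question, but the sample rows, the user_id and excluded variables, and the in-memory database are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)')
cur.execute('CREATE TABLE ratings (user INTEGER, movie INTEGER, rating INTEGER)')
# Invented sample data.
cur.executemany('INSERT INTO movies VALUES (?, ?)',
                [(1, 'Alpha'), (2, 'Beta'), (3, 'Gamma')])
cur.executemany('INSERT INTO ratings VALUES (?, ?, ?)',
                [(405, 1, 5), (405, 2, 4), (405, 3, 2), (7, 1, 5)])

user_id = 405
excluded = [2]  # movie ids that must not appear in the result

# One ? per excluded id, plus one ? for the user id.
placeholders = ', '.join('?' for _ in excluded)
query = ('SELECT r.user, r.movie, r.rating, m.title '
         'FROM ratings r JOIN movies m ON (r.movie = m.id) '
         'WHERE r.user = ? AND r.rating >= 3 '
         'AND r.movie NOT IN (%s)' % placeholders)

# The parameter tuple follows the order of the placeholders in the query.
cur.execute(query, (user_id, *excluded))
movies_table = cur.fetchall()
```

The % formatting only inserts the ? marks; the actual values never become part of the SQL text.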

How to select value after update at one atomic operation in MySQL

Here I want to increment the count field and then select the resulting count in one atomic operation.
The code is:
sql = ("update user_count set count = count+1 "
       "where username=%s")
cursor = connection.cursor()
cursor.execute(sql, (username, ))
cursor.execute("select count from user_count "
               "where username=%s", (username, ))
Is there a recommended way to do this?
You need to perform the UPDATE and the SELECT within a transaction (i.e. statements that are not automatically committed):
connection.autocommit(0)
One can then execute the existing statements and explicitly commit them to the database:
connection.commit()
Note that to achieve atomicity, the user_count table must use a transactional storage engine (such as InnoDB).
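The shape of that transaction can be sketched as follows. To keep the example runnable without a MySQL server, it uses sqlite3 as a stand-in, so the placeholders are ? rather than %s and the transaction is opened with SQLite's BEGIN IMMEDIATE; the function name increment_and_read and the sample row are made up for illustration. With MySQL and InnoDB the same update-then-select sequence would run between turning autocommit off and calling connection.commit().

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so the transaction
# boundaries below are entirely under our control.
conn = sqlite3.connect(':memory:', isolation_level=None)
conn.execute('CREATE TABLE user_count '
             '(username TEXT PRIMARY KEY, count INTEGER NOT NULL)')
conn.execute("INSERT INTO user_count VALUES ('alice', 41)")  # invented row


def increment_and_read(conn, username):
    """Increment the counter and read it back inside one transaction."""
    cur = conn.cursor()
    cur.execute('BEGIN IMMEDIATE')  # take the write lock up front
    try:
        cur.execute('UPDATE user_count SET count = count + 1 '
                    'WHERE username = ?', (username,))
        cur.execute('SELECT count FROM user_count WHERE username = ?',
                    (username,))
        row = cur.fetchone()
        cur.execute('COMMIT')
    except Exception:
        cur.execute('ROLLBACK')
        raise
    # None signals that no such user exists.
    return row[0] if row else None


new_count = increment_and_read(conn, 'alice')
```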
@linbo, you could create a function/procedure in your database (check how to do so here) and then run just one call in your code:
sql = ("SELECT MY_CUSTOM_FUNC(%s)")
cursor = connection.cursor()
cursor.execute(sql, (username, ))
There are many advantages of using this approach:
You centralize your logic, it can be called from many places in your application;
Less code;
MySQL uses atomicity by default in functions once it's a simple operation for the db;
I hope it helps!
You can do it like this, using a MySQL user variable:
cursor.execute("UPDATE tbl_name SET pos = @next_pos := pos + 1 WHERE some_id = %s; SELECT @next_pos;", (12345, ))
cursor.nextset()
result = cursor.fetchone()
Then result[0] will contain the incremented value.

How to write multi column in clause with sqlalchemy

Is there a way to write a query with a multi-column IN clause using SQLAlchemy?
Here is example of the actual query:
SELECT url FROM pages WHERE (url_crc, url) IN ((2752937066, 'http://members.aye.net/~gharris/blog/'), (3799762538, 'http://www.coxandforkum.com/'));
I have a table with a two-column primary key, and I'm hoping to avoid adding one more key just to be used as an index.
PS I'm using a MySQL DB.
Update: This query will be used for batch processing, so I would need to put a few hundred pairs into the IN clause. With the IN clause approach I hope to know a fixed limit on how many pairs I can put into one query, the way Oracle has a limit of 1000 entries by default.
Using an AND/OR combination might be limited by the length of the query in characters, which would be variable and less predictable.
Assuming that you have your model defined in Page, here's an example using tuple_:
from sqlalchemy import select, tuple_

keys = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/')
]
select([
    Page.url
]).select_from(
    Page
).where(
    tuple_(Page.url_crc, Page.url).in_(keys)
)
Or, using the query API:
session.query(Page.url).filter(tuple_(Page.url_crc, Page.url).in_(keys))
I do not think this is currently possible in SQLAlchemy, and not all RDBMS support this.
You can always transform this to a OR(AND...) condition though:
from sqlalchemy import and_, or_

filter_rows = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter(or_(*(and_(Page.url_crc == crc, Page.url == url) for crc, url in filter_rows)))
print(qry)
should produce something like (for SQLite):
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE pages.url_crc = ? AND pages.url = ? OR pages.url_crc = ? AND pages.url = ?
-- (2752937066L, 'http://members.aye.net/~gharris/blog/', 3799762538L, 'http://www.coxandforkum.com/')
Alternatively, you can combine two columns into just one:
from sqlalchemy import String, func

filter_rows = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter((func.cast(Page.url_crc, String) + '|' + Page.url).in_(["{}|{}".format(*_frow) for _frow in filter_rows]))
print(qry)
which produces the below (for SQLite), so you can use IN:
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE (CAST(pages.url_crc AS VARCHAR) || ? || pages.url) IN (?, ?)
-- ('|', '2752937066|http://members.aye.net/~gharris/blog/', '3799762538|http://www.coxandforkum.com/')
I ended up using the text()-based solution: I generated "(a,b) in ((:a1, :b1), (:a2,:b2), ...)" with named bind variables, together with a dictionary holding the bind variables' values.
from sqlalchemy import text

params = {}
enum_pairs = []  # one "(:aN,:bN)" entry per record
for counter, r in enumerate(records):
    a_param = "a%s" % counter
    params[a_param] = r['a']
    b_param = "b%s" % counter
    params[b_param] = r['b']
    pair_text = "(:%s,:%s)" % (a_param, b_param)
    enum_pairs.append(pair_text)
multicol_in_enumeration = ','.join(enum_pairs)
multicol_in_clause = text(
    " (a,b) in (" + multicol_in_enumeration + ")")
q = session.query(Table.id, Table.a,
                  Table.b).filter(multicol_in_clause).params(params)
Another option I thought about was using MySQL upserts, but that would make the whole thing even less portable to other DB engines than the multi-column IN clause.
Update: SQLAlchemy has the sqlalchemy.sql.expression.tuple_(*clauses, **kw) construct, which can be used for the same purpose. (I haven't tried it yet.)
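The string-building part of the named-bind-variable approach can be tried without a database. The sketch below is a hypothetical helper (build_multicolumn_in is not part of SQLAlchemy) that only produces the clause text and the parameter dictionary; the sample records are made up.

```python
def build_multicolumn_in(records, columns=('a', 'b')):
    """Return (clause_text, params) for a multi-column IN with named binds."""
    params = {}
    pairs = []
    for counter, record in enumerate(records):
        names = []
        for col in columns:
            # Bind names are the column name plus the record index: a0, b0, ...
            name = '%s%d' % (col, counter)
            params[name] = record[col]
            names.append(':' + name)
        pairs.append('(' + ','.join(names) + ')')
    clause = '(%s) in (%s)' % (','.join(columns), ','.join(pairs))
    return clause, params


# Invented sample records.
records = [{'a': 1, 'b': 'x'}, {'a': 2, 'b': 'y'}]
clause, params = build_multicolumn_in(records)
```

The resulting clause could then be wrapped in text(clause) and combined with .params(params), as in the answer above. The column names are placed directly into the SQL text, so they should be fixed in your code rather than taken from user input.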
