While reading this question: SQL Multiple Updates vs single Update performance
I was wondering how I could dynamically build an UPDATE for several columns at the same time using a connector like MariaDB's. Reading the official documentation, I did not find anything similar.
This question is similar, and it has helped me to understand how to use parametrized queries with custom connectors, but it does not answer my question.
Let's suppose that, from one of the views of the project, we receive a dictionary.
This dictionary has the following structure (simplified example):
{'form-0-input_file_name': 'nofilename', 'form-0-id': 'K0944', 'form-0-gene': 'GJXX', 'form-0-mutation': 'NM_0040(p.Y136*)', 'form-0-trix': 'ZSSS4'}
Assuming that each key in the dictionary corresponds to a column in a table of the database, if I'm not mistaken, we would have to iterate over the dictionary and build the query in each iteration.
Something like this (semi-pseudo-code; it's probably not correct):
query = "UPDATE `db-dummy`.info "
for key in a_dict:
query += "SET key = a_dict[key]"
It is not clear to me how to construct said query within a loop.
What is the most pythonic way to achieve this?
Although this could work:
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
query = query + ("," if index != 0 else "") +" SET {0} = '{1}'".format(key,a_dict[key])
You should consider parameterized queries for safety and security. Moreover, a dynamic dictionary may also raise other concerns; it may be best to verify or filter the keys against an agreed set of column names before attempting such an operation.
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    query = query + ("SET " if index == 0 else ", ") + "{0} = ?".format(key)

# Then execute with your connection/cursor
cursor.execute(query, tuple(a_dict.values()))
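For the key-filtering point, a minimal sketch might look like this (the allowed_columns set is an assumption based on the keys in your example dictionary):
allowed_columns = {'form-0-input_file_name', 'form-0-id', 'form-0-gene',
                   'form-0-mutation', 'form-0-trix'}

# Keep only keys we recognise before building any SQL from them.
clean = {key: value for key, value in a_dict.items() if key in allowed_columns}

assignments = ', '.join('`{}` = ?'.format(col) for col in clean)
query = "UPDATE `db-dummy`.info SET " + assignments
cursor.execute(query, tuple(clean.values()))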
This is what I did (inspired by #ggordon's answer)
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    if index == 0:
        query = query + "SET {0} = ?".format(key)
    else:
        query = query + ", {0} = ?".format(key)
query += " WHERE record_id = " + record_id
And it works!
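A small variation on the same approach also binds record_id as a parameter instead of concatenating it (a sketch using the same a_dict and cursor as above):
assignments = ', '.join('{0} = ?'.format(key) for key in a_dict)
query = "UPDATE `db-dummy`.info SET " + assignments + " WHERE record_id = ?"

# Column values first, then the record_id for the WHERE clause.
cursor.execute(query, tuple(a_dict.values()) + (record_id,))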
Related
I have inherited some code similar to the snippet below. I understand the concept of passing values to make a query dynamic (in this case field_id), but I don't understand the benefit of taking the passed-in field_id and putting it into a dictionary, parameters = {"logical_field_id": field_id}, before accessing the newly created dictionary to build the SQL statement. Along the same lines, why return parameters=parameters rather than just listing parameters in the return? I assume this is all to make the request more secure, but I would like to better understand why/how, as I need to take on a similar task with the slightly more complex query further below.
def get_related_art(self, field_id):
    parameters = {"logical_field_id": field_id}
    sql = (
        "SELECT a.id AS id,"
        " a.name AS name,"
        " a.description AS description,"
        " a.type AS type,"
        " a.subtype AS subtype"
        " FROM ArtclTbl AS a INNER JOIN ("
        " SELECT article_id AS id FROM LogFldArtclTbl"
        " WHERE logical_field_id = %(logical_field_id)s"
        " ORDER BY a.name"
    )
    return self.query(sql, parameters=parameters)
My reason for asking this question is that I was asked to parameterize this:
def get_group_fields(self, exbytes=None):
    parameters = {}
    where_clause = (
        f"WHERE eig_eb.ebyte in ({', '.join(str(e) for e in ebytes)})" if ebytes else ""
    )
    sql = (
        "SELECT l.id AS id, "
        " eig_eb.ebyte AS ebyte, "
        " eig.id AS instrument_group_id, "
        " eig_lf.relationship_type AS relationship "
        ....
        f" {where_clause}"
    )
I started to modify the code to iterate when setting the parameters and then access that value in the original location. This 'works', except that now the query string contains [ebyte1, ebyte2] instead of (ebyte1, ebyte2). I could modify the string to work around this, but I really wanted to understand the why of this first.
parameters = {"exbytes": ', '.join(str(e) for e in exbytes)}
...
where_clause = (
    f"WHERE eig_eb.exbyte in " + str(exbytes) if exbytes else ""
)
The benefit of using named parameter placeholders is so you can pass the parameter values as a dict, and you can add values to that dict in any order. There's no benefit in the first example you show, because you only have one entry in the dict.
There's no benefit in the second example either, because the parameters are part of an IN() list, and there are no other parameterized parts of the query. The order of values in an IN() list is irrelevant. So you could just use positional parameters instead of named parameters.
where_clause = (
f"WHERE eig_eb.ebyte in ({', '.join('%s' for e in ebytes)})" if ebytes else ""
)
Then you don't need a dict at all; you can just pass the ebytes list as the parameters.
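A minimal sketch of that idea (assuming self.query() simply forwards the parameter sequence to the underlying cursor; the column list is abbreviated as in your snippet):
def get_group_fields(self, ebytes=None):
    placeholders = ', '.join('%s' for _ in ebytes) if ebytes else ''
    where_clause = f"WHERE eig_eb.ebyte in ({placeholders})" if ebytes else ""
    sql = (
        "SELECT l.id AS id, "
        " eig_eb.ebyte AS ebyte, "
        " eig.id AS instrument_group_id, "
        " eig_lf.relationship_type AS relationship "
        # ... rest of the query as in the original ...
        f" {where_clause}"
    )
    # Positional parameters: just pass the list of ebyte values.
    return self.query(sql, parameters=list(ebytes or []))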
Using the syntax parameters=parameters looks like a usage of keyword arguments to a Python function. I don't know the function self.query() in your example, but I suppose it accepts keyword arguments to implement optional arguments. The fact that your local variable has the same name as the keyword argument is a coincidence.
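For illustration only, a self.query() with that kind of signature might look roughly like this (a sketch assuming a DB-API connection attribute; it is not the real implementation from your codebase):
class Dao:
    def query(self, sql, parameters=None):
        # 'parameters' is an optional keyword argument with a default value.
        cursor = self.connection.cursor()
        cursor.execute(sql, parameters or {})
        return cursor.fetchall()

    def get_related_art(self, field_id):
        parameters = {"logical_field_id": field_id}
        sql = "SELECT ... WHERE logical_field_id = %(logical_field_id)s"
        # The caller's local variable name and the keyword name just happen to match.
        return self.query(sql, parameters=parameters)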
I want to display data in a QTableWidget according to QComboBoxes. In the case of 'select all' for gender or 'select all' for ages, I want to apply 'select all' to that column in the sqlite3 query.
I want gender to be 'all':
gender = "select all both male and female"
connection.execute("SELECT * FROM child where region=? and hospital=? and ageInMonths=? and gender=?", (region,hospital,ageInMonths,gender))
Welcome to Stack Overflow.
While it's a little tedious, the most sensible way to attack this problem is to build a list of the conditions you want to apply, and another of the data values that need to be inserted. Something like the following (untested) code, in which I assume that the variables are set to None if they aren't required in the search.
conditions = []
values = []
if region is not None:
    conditions.append('region=?')
    values.append(region)
# And similar logic for each other value ...
if gender is not None:
    conditions.append('gender=?')
    values.append(gender)
query = 'SELECT * FROM child'
if conditions:
    query = query + ' WHERE ' + ' AND '.join(conditions)
connection.execute(query, values)
This way, if you want to include all values of a column you simply exclude it from the conditions by setting it to None.
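For example, the same pattern can be wrapped in a small helper (the function name and the sample values below are just for illustration), so that passing None means "no filter on this column":
def fetch_children(connection, region=None, hospital=None, ageInMonths=None, gender=None):
    conditions, values = [], []
    for column, value in (('region', region), ('hospital', hospital),
                          ('ageInMonths', ageInMonths), ('gender', gender)):
        if value is not None:
            conditions.append(column + '=?')
            values.append(value)
    query = 'SELECT * FROM child'
    if conditions:
        query += ' WHERE ' + ' AND '.join(conditions)
    return connection.execute(query, values).fetchall()

# All genders, any age, for one region and hospital:
rows = fetch_children(connection, region='North', hospital='General')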
You can build your where clause and your parameter list conditionally.
Below I am assuming that the ageInMonths and gender variables actually contain the value 'all' when this is selected on your form. You can change this to whichever value is actually passed to your code, if it is something different.
When it comes to your actual query, the best way to get all values for a field is to simply exclude it from the where clause of your query entirely.
So something like:
query_parameters = []
query_string = "SELECT * FROM child where region=? and hospital=?"
query_parameters.append(region)
query_parameters.append(hospital)
if ageInMonths != 'all':
    query_string += " and ageInMonths=?"
    query_parameters.append(ageInMonths)
if gender != 'all':
    query_string += " and gender=?"
    query_parameters.append(gender)
connection.execute(query_string, query_parameters)
Basically, at the same time we are testing and building the dynamic parts of the SQL statement (in query_string), we are also dynamically defining the list of variables to pass to the query in query_parameters, which is a list object.
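To make that concrete, here is roughly what the two variables end up holding for one hypothetical selection:
# Hypothetical inputs: region='North', hospital='General', ageInMonths=12, gender='all'
# After the code above runs, gender contributes nothing to the query:
# query_string     == "SELECT * FROM child where region=? and hospital=? and ageInMonths=?"
# query_parameters == ['North', 'General', 12]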
I am using pyodbc to update an Access database.
I need the functionality of an UPSERT.
ON DUPLICATE KEY UPDATE doesn't exist in Access SQL, and REPLACE is not an option since I want to keep other fields.
There are a lot of suggestions out there for how to solve this, so this is the solution which I put together:
for table_name in data_source:
    table = data_source[table_name]
    for y in table:
        if table_name == "whatever":
            SQL_UPDATE = "UPDATE {} set [Part Name] = '{}', [value] = {}, [code] = {}, [tolerance] = {} WHERE [Unique Part Number]='{}'".\
                format(table_name, y['PartName'], y['Value'], y['keycode'], y['Tolerance'], y['PartNum'])
            SQL_INSERT = "INSERT INTO {} ([Part Name],[Unique Part Number], [value], [code], [tolerance]) VALUES ('{}','{}','{}',{},{});".\
                format(table_name, y['PartName'], y['PartNum'], y['Value'], y['keycode'], y['Tolerance'])
        elif ...:
            ...  # 9 more tables ...
        res = cursor.execute(SQL_UPDATE)
        if res.rowcount == 0:
            cursor.execute(SQL_INSERT)
Well, I have to say, I am not a Python expert, and I didn't manage to understand the fundamental concepts nor the magic of SQL, so I can only Google things together here.
I don't like my above solution because it is very hard to read and difficult to maintain (I have to do this for ~10 different tables). The other point is that I have to use two queries, because I didn't manage to understand and run any other UPSERT approach I found.
Does anyone have a recommendation for me how to do this in a smarter, better maintainable way?
As noted in this question and others, Access SQL does not have an "upsert" statement, so you will need to use a combination of UPDATE and INSERT. However, you can improve your current implementation by
using proper parameters for your query, and
using Python string manipulation to build the SQL command text.
For example, to upsert into a table named [Donor]
Donor ID  Last Name  First Name
--------  ---------  ----------
       1  Thompson   Gord
You can start with a list of the field names. The trick here is to put the key field(s) at the end, so the INSERT and UPDATE statements will refer to the fields in the same order (i.e., the UPDATE will refer to the ID field last because it will be in the WHERE clause).
data_fields = ['Last Name', 'First Name']
key_fields = ['Donor ID']
The parameter values will be the same for both the UPDATE and INSERT cases, e.g.
params = ('Elk', 'Anne', 2)
The UPDATE statement can be constructed like this
update_set = ','.join(['[' + x + ']=?' for x in data_fields])
update_where = ' AND '.join(['[' + x + ']=?' for x in key_fields])
sql_update = "UPDATE [Donor] SET " + update_set + " WHERE " + update_where
print(sql_update)
which shows us
UPDATE [Donor] SET [Last Name]=?,[First Name]=? WHERE [Donor ID]=?
Similarly, the INSERT statement can be constructed like this
insert_fields = ','.join(['[' + x + ']' for x in (data_fields + key_fields)])
insert_placeholders = ','.join(['?' for x in (data_fields + key_fields)])
sql_insert = "INSERT INTO [Donor] (" + insert_fields + ") VALUES (" + insert_placeholders + ")"
print(sql_insert)
which prints
INSERT INTO [Donor] ([Last Name],[First Name],[Donor ID]) VALUES (?,?,?)
So, to perform our upsert, all we need to do is
crsr.execute(sql_update, params)
if crsr.rowcount > 0:
    print('Existing row updated.')
else:
    crsr.execute(sql_insert, params)
    print('New row inserted.')
crsr.commit()
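If you need this for ~10 tables, the same pattern can be wrapped in a small helper so each table only needs its field lists (a sketch built from the statements above, not tested against your schema):
def upsert(crsr, table, data_fields, key_fields, params):
    """Try an UPDATE first; if no row matched, INSERT instead."""
    update_set = ','.join('[' + f + ']=?' for f in data_fields)
    update_where = ' AND '.join('[' + f + ']=?' for f in key_fields)
    sql_update = 'UPDATE [{}] SET {} WHERE {}'.format(table, update_set, update_where)

    all_fields = data_fields + key_fields  # key fields last, so one params order fits both
    sql_insert = 'INSERT INTO [{}] ({}) VALUES ({})'.format(
        table,
        ','.join('[' + f + ']' for f in all_fields),
        ','.join('?' for _ in all_fields))

    crsr.execute(sql_update, params)
    if crsr.rowcount == 0:
        crsr.execute(sql_insert, params)

# Per-table configuration, using the example table from above:
upsert(crsr, 'Donor', ['Last Name', 'First Name'], ['Donor ID'], ('Elk', 'Anne', 2))
crsr.commit()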
Consider using parameterized queries with prepared statements that use ? placeholders. str.format is still needed for identifiers such as table and field names. Then unpack the dictionary items with zip(*dict.items()) to pass them as parameters in the cursor's execute call: cursor.execute(query, params).
for table_name in data_source:
    table = data_source[table_name]
    for y in table:
        keys, values = zip(*y.items())    # UNPACK DICTIONARY INTO TWO TUPLES
        if table_name == "whatever":
            # Parentheses ensure .format() applies to the whole concatenated string;
            # the key column comes last so the UPDATE and INSERT share one parameter
            # order (this assumes the dict's key order matches the columns below).
            SQL_UPDATE = ("UPDATE {} SET [Part Name] = ?, [value] = ?, [code] = ?,"
                          " [tolerance] = ? WHERE [Unique Part Number] = ?").format(table_name)
            SQL_INSERT = ("INSERT INTO {} ([Part Name], [value], [code], [tolerance],"
                          " [Unique Part Number]) VALUES (?, ?, ?, ?, ?);").format(table_name)
        res = cursor.execute(SQL_UPDATE, values)
        if res.rowcount == 0:
            cursor.execute(SQL_INSERT, values)
...
I have a list of lists in Python which I'm trying to either insert into or update in PostgreSQL. I've tried both for-loops and while-loops, but it gets messy when I'm trying to check whether the user already exists in Postgres. I want to loop through the list of lists and insert/update every user in Postgres.
I have the variables below; which way would you recommend I choose? My last shot was the IF EXISTS, which does not work with Postgres.
userList = [user1, user2, user3]
i = 0
while i < len(userList):
    query = "IF EXISTS (SELECT * FROM login WHERE email = '" + userList[i][2] + "') INSERT INTO login (created, username, password, email) VALUES (NOW(), '" + userList[i][0] + "','" + userList[i][1] + "','" + userList[i][2] + "')"
    i += 1
    settings.database_connection.execute_sql(query)
Thank you in advance!
There is a new feature (so-called UPSERT) in PostgreSQL 9.5, so you can either insert or update a row:
INSERT INTO login VALUES (<your values>)
ON CONFLICT(email) DO UPDATE SET password=..., email=...;
More details in the official documentation: PostgreSQL 9.5.1 Documentation: INSERT
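A minimal Python sketch of using that statement over your userList (assuming a psycopg2-style connection, that each user is [username, password, email] as in your query, and that login.email has a unique constraint):
upsert_sql = """
    INSERT INTO login (created, username, password, email)
    VALUES (NOW(), %s, %s, %s)
    ON CONFLICT (email) DO UPDATE
    SET username = EXCLUDED.username,
        password = EXCLUDED.password;
"""

with connection.cursor() as cursor:
    for username, password, email in userList:
        cursor.execute(upsert_sql, (username, password, email))
connection.commit()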
Can you suggest whether there is a way to write a multi-column IN clause query using SQLAlchemy?
Here is example of the actual query:
SELECT url FROM pages WHERE (url_crc, url) IN ((2752937066, 'http://members.aye.net/~gharris/blog/'), (3799762538, 'http://www.coxandforkum.com/'));
I have a table that has two columns primary key and I'm hoping to avoid adding one more key just to be used as an index.
P.S. I'm using a MySQL DB.
Update: This query will be used for batch processing, so I would need to put a few hundred pairs into the IN clause. With the IN clause approach I hope to know a fixed limit of how many pairs I can put into one query; for example, Oracle has a 1000-element limit by default.
Using an AND/OR combination might instead be limited by the length of the query in characters, which would be variable and less predictable.
Assuming that you have your model defined in Page, here's an example using tuple_:
from sqlalchemy import select, tuple_

keys = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/')
]

select([
    Page.url
]).select_from(
    Page
).where(
    tuple_(Page.url_crc, Page.url).in_(keys)
)
Or, using the query API:
session.query(Page.url).filter(tuple_(Page.url_crc, Page.url).in_(keys))
I do not think this is currently possible in sqlalchemy, and not all RDBMS support this.
You can always transform this to an OR(AND...) condition though:
filter_rows = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter(or_(*(and_(Page.url_crc == crc, Page.url == url) for crc, url in filter_rows)))
print qry
should produce something like (for SQLite):
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE pages.url_crc = ? AND pages.url = ? OR pages.url_crc = ? AND pages.url = ?
-- (2752937066L, 'http://members.aye.net/~gharris/blog/', 3799762538L, 'http://www.coxandforkum.com/')
Alternatively, you can combine two columns into just one:
filter_rows = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter((func.cast(Page.url_crc, String) + '|' + Page.url).in_(["{}|{}".format(*_frow) for _frow in filter_rows]))
print qry
which produces the below (for SQLite), so you can use IN:
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE (CAST(pages.url_crc AS VARCHAR) || ? || pages.url) IN (?, ?)
-- ('|', '2752937066|http://members.aye.net/~gharris/blog/', '3799762538|http://www.coxandforkum.com/')
I ended up using the text()-based solution: I generated "(a,b) in ((:a1,:b1), (:a2,:b2), ...)" with named bind variables and built a dictionary with the bind variables' values.
params = {}
enum_pairs = []
for counter, r in enumerate(records):
    a_param = "a%s" % counter
    params[a_param] = r['a']
    b_param = "b%s" % counter
    params[b_param] = r['b']
    pair_text = "(:%s,:%s)" % (a_param, b_param)
    enum_pairs.append(pair_text)
multicol_in_enumeration = ','.join(enum_pairs)
multicol_in_clause = text(" (a,b) in (" + multicol_in_enumeration + ")")
q = session.query(Table.id, Table.a, Table.b).filter(multicol_in_clause).params(params)
Another option I thought about was using MySQL upserts, but this would make the whole thing even less portable to other DB engines than using a multi-column IN clause.
Update: SQLAlchemy has the sqlalchemy.sql.expression.tuple_(*clauses, **kw) construct, which can be used for the same purpose. (I haven't tried it yet.)