So I'm reading a list of part numbers from excel using Pandas, which can be just about anything, like:
287380274-87
or
ME982394-01
or
HOU8929
The list changes depending on what the user is looking for and can contain some bad values as well, such as blanks, invalid characters (<, >, or !), and phrases like '12390-01 to 04'. I'd rather not filter the part numbers for every random condition that throws a syntax error in SQL. I am attempting to query an SAP database WHERE part number IN (list):
import pandas as pd
from hdbcli import dbapi
userFile = r'T:\H01 Cell\Projects\Part Breakdown Update Spreadsheet Improvements\2022.03.21 Part Breakdown update - VK.xlsm'
# read input from excel for part numbers to WHERE in queries
partNums = pd.read_excel\
(io=userFile, sheet_name='Inputs', usecols=lambda x: 'Unnamed' not in x,\
skiprows=1, dtype={'Part List' : str})
# Open SAP database connection
conn = dbapi.connect(address="server", port=####, user="XXXXX", password="XXXXXXX")
# Function to convert
def listToString(s):
    # use list comprehension
    listToStr = ', '.join([str(elem) for elem in s])
    # return string
    return listToStr
partNumStr = listToString(partNums['Part List'].drop_duplicates().tolist())
# GetInvOnHand()
# DISTINCT List
queryIOHlist = [
"PART_NO",
"PLANT",
"LOCATION_DESCRIPTION",
"VALUATION_TYPE"
]
queryIOHstr = listToString(queryIOHlist)
# our SQL query, select all from ' '
queryInvOnHand = (
"SELECT DISTINCT " +
queryIOHstr +
" FROM ZWILLIAMS.ZV_WI_GU_INVENTORY_ON_HAND_IM_THIN INV" +
" WHERE PART_NO IN " +
"(" +
partNumStr +
")"
)
# pandas read SQL to store SQL table in dataframe
inventoryOnHand = pd.read_sql(queryInvOnHand, conn)
conn.close()
I'm running into syntax errors for my SQL query because of these bad part numbers, such as:
(257, 'sql syntax error: incorrect syntax near "to": line 1 col 8120 (at pos 8120))
where the part number it doesn't like is: 62219-01 to -04
In SQL, is there a way to just skip that number if not found in the Part Numbers column in the table? Ideally, it would just be something like:
if syntaxError:
    continue
and then just not record anything in my dataframe for that part number.
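One way to sidestep the syntax errors, sketched here under assumptions: drop blanks and duplicates in Python, then pass the part numbers as bound parameters, so an odd value such as '62219-01 to -04' is treated as a plain string that simply matches no rows. The example uses the standard-library sqlite3 module and a made-up `inventory` table as stand-ins for the SAP connection. As far as I know hdbcli also accepts `?` placeholders, but treat that as an assumption to verify.

```python
import sqlite3

# Stand-in for the SAP connection; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (part_no TEXT, plant TEXT)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("ME982394-01", "H01"), ("HOU8929", "H02")],
)

# Raw values as they might come out of the spreadsheet, including junk.
raw_parts = ["ME982394-01", None, "62219-01 to -04", "  ", "HOU8929",
             "ME982394-01", "<bad>!"]

# Drop blanks and duplicates while keeping order; everything else stays as text.
parts = list(dict.fromkeys(p.strip() for p in raw_parts if p and p.strip()))

# One "?" per value; the driver handles quoting, so junk cannot break the SQL.
placeholders = ", ".join("?" * len(parts))
query = (
    "SELECT DISTINCT part_no, plant FROM inventory "
    f"WHERE part_no IN ({placeholders})"
)
rows = conn.execute(query, parts).fetchall()
conn.close()

print(sorted(rows))
```

With pandas, the same list could probably be handed over as `pd.read_sql(query, conn, params=parts)`, leaving the rest of the script unchanged.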
I am using Excel as a form for users to create, delete and update entries in a SQL Server Table. I am then taking this input into Python via a Data Frame, and creating a SQL string. I then execute it via pyodbc cursor. For instance, below is how I can get a valid and functional Update query.
ParamstoPass=len(ClassCheckMark.columns)
L_Cols=list()
L_Vals=list()
tableName=ClassCheckMark[ClassCheckMark.columns[1]][0]
SQL_Query='update ' + tableName + ' set '
for i in range(2, ParamstoPass):
    L_Cols.append(ClassCheckMark[ClassCheckMark.columns[i]].name)
    L_Vals.append(ClassCheckMark[ClassCheckMark.columns[i]][0])
for i in range(1, len(L_Cols)):
    SQL_Query=SQL_Query+'[' + L_Cols[i] +']=' +"'" + str(L_Vals[i]) +"', "
SQL_Query=SQL_Query[:-2]+' where ID=' + "'" + str(L_Vals[0]) +"'"
cursor.execute(SQL_Query)
cnn.commit()
cnn.close()
But I know there are some undesirable characters that a user may enter in Excel that will then make it into the query.
So what is the best way to validate the SQL string in Python? Should I look for specific characters like '\', "\0", "\n", "\r", "'", '"', "\x1a"? Or what is the standard industry method for this?
And I realize that in general this is not the best way to let users interact with a DB, but due to various constraints I am going with this approach.
Thank you.
After building your L_Cols and L_Vals lists I would suggest validating the column names against the table metadata, constructing a parameterized SQL command, and then executing it. For example:
# test data
L_Cols = ['ID', 'FirstName', 'Photo']
L_Vals = [123, 'bob', None]
tablename = "People"
# validate list of column names
valid_column_names = [x.column_name for x in cursor.columns(tablename).fetchall()]
for col_name in L_Cols:
    if col_name not in valid_column_names:
        raise ValueError("[{0}] is not a valid column name for table [{1}]".format(col_name, tablename))
# build SQL command text
SQL_Query = "UPDATE [" + tablename + "] SET "
SQL_Query += ", ".join("[" + x + "]=?" for x in L_Cols[1:])
SQL_Query += " WHERE [" + L_Cols[0] + "]=?"
print(SQL_Query) # UPDATE [People] SET [FirstName]=?, [Photo]=? WHERE [ID]=?
# move ID value to the end of the list of parameters
params = L_Vals[1:] + L_Vals[0:1]
print(params) # ['bob', None, 123]
# (edit by OP)
# as in my case, some elements were of unicode markup, which threw
# ProgrammingError: ('Invalid parameter type. param-index=0
# param-type=numpy.int64', 'HY105').
# May need to add params=[str(x) for x in params]
cursor.execute(SQL_Query, params)
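To try the same idea without a SQL Server instance, here is a minimal sketch that uses sqlite3 as a stand-in. `PRAGMA table_info` takes the place of pyodbc's `cursor.columns()`, the helper name `build_update` is hypothetical, and the `People` row is invented test data. The table name is checked too, since it cannot be bound as a parameter either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, FirstName TEXT, Photo BLOB)")
conn.execute("INSERT INTO People VALUES (123, 'alice', NULL)")


def build_update(conn, table, cols):
    """Return a parameterized UPDATE statement; cols[0] is the key column."""
    # Check the table name against the catalog before using it in SQL text.
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    if table not in tables:
        raise ValueError(f"[{table}] is not a known table")
    # Column 1 of each PRAGMA table_info row is the column name.
    valid = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
    for col in cols:
        if col not in valid:
            raise ValueError(f"[{col}] is not a valid column name for table [{table}]")
    assignments = ", ".join(f'"{c}"=?' for c in cols[1:])
    return f'UPDATE "{table}" SET {assignments} WHERE "{cols[0]}"=?'


L_Cols = ["ID", "FirstName", "Photo"]
L_Vals = [123, "bob'; DROP TABLE People; --", None]

sql = build_update(conn, "People", L_Cols)
# Move the ID value to the end so it lines up with the WHERE placeholder.
conn.execute(sql, L_Vals[1:] + L_Vals[0:1])
stored = conn.execute("SELECT FirstName FROM People WHERE ID=123").fetchone()[0]

print(sql)
print(stored)
```

The hostile-looking first name is stored verbatim because it travels as a parameter and never becomes part of the SQL text.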
I'm looking to take an array list and attach it to a string.
Python 2.7.10, Windows 10
The list is loaded from a mySQL table and the output is this:
skuArray = [('000381001238',) ('000381001238',) ('000381001238',) ('FA200513652',) ('000614400967',)]
I'm wanting to take this list and attach it to a separate query
the problem:
query = "SELECT ItemLookupCode,Description, Quantity, Price, LastReceived "
query = query+"FROM Item "
query = query+"WHERE ItemLookupCode IN ("+skuArray+") "
query = query+"ORDER BY LastReceived ASC;"
I get the error:
TypeError: cannot concatenate 'str' and 'tuple' objects
My guess here is that I need to format the string as:
'000381001238', '000381001238', '000381001238', 'FA200513652','000614400967'
Ultimately the string needs to read:
query = query+"WHERE ItemLookupCode IN ('000381001238', '000381001238', '000381001238', 'FA200513652','000614400967') "
I have tried the following:
skuArray = ''.join(skuArray.split('(', 1))
skuArray = ''.join(skuArray.split(')', 1))
Second Try:
skus = [sku[0] for sku in skuArray]
stubs = ','.join(["'?'"]*len(skuArray))
msconn = pymssql.connect(host=r'*', user=r'*', password=r'*', database=r'*')
cur = msconn.cursor()
query ='''
SELECT ItemLookupCode,Description, Quantity, Price, LastReceived
FROM Item
WHERE ItemLookupCode IN { sku_params }
ORDER BY LastReceived ASC;'''.format(sku_params = stubs)
cur.execute(query, params=skus)
row = cur.fetchone()
print row[3]
cur.close()
msconn.close()
Thanks in advance for your help!
If you want to do the straight inline SQL you could use a list comprehension:
', '.join(["'{}'".format(sku[0]) for sku in skuArray])
Note: You need to add commas between tuples (based on example)
That said, if you want to do some sql, I would encourage you to parameterize your request with ?
Here is an example of how you would do something like that:
skuArray = [('000381001238',), ('000381001238',), ('000381001238',), ('FA200513652',), ('000614400967',)]
skus = [sku[0] for sku in skuArray]
# one bare ? per value; quoting it as '?' would turn it into a literal string
stubs = ','.join(['?'] * len(skus))
qry = '''
SELECT ItemLookupCode,Description, Quantity, Price, LastReceived
FROM Item
WHERE ItemLookupCode IN ({sku_params})
ORDER BY LastReceived ASC;'''.format(sku_params=stubs)
# assuming a pyodbc connection; the exact call may differ for other drivers
conn.execute(qry, skus)
Why?
Non-parameterized queries are a bad idea because they leave you vulnerable to SQL injection, a risk that is easy to avoid.
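To make the risk concrete, here is a small sketch using sqlite3 as a stand-in database (the table contents and the hostile input are invented for illustration). The same input is run once through string formatting and once as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Item (ItemLookupCode TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Item VALUES (?, ?)",
    [("000381001238", 1.5), ("FA200513652", 2.0), ("SECRET", 99.0)],
)

# A "SKU" supplied by an untrusted user.
evil = "x') OR ('1'='1"

# Inline formatting: the quote in the input rewrites the WHERE clause.
inline_sql = (
    "SELECT ItemLookupCode FROM Item WHERE ItemLookupCode IN ('{}')".format(evil)
)
inline_rows = conn.execute(inline_sql).fetchall()

# Bound parameter: the same input is compared as an ordinary string.
bound_rows = conn.execute(
    "SELECT ItemLookupCode FROM Item WHERE ItemLookupCode IN (?)", (evil,)
).fetchall()

print(len(inline_rows), len(bound_rows))
```

The formatted query returns every row in the table, while the parameterized one returns none, because no item has that literal lookup code.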
Assuming that skuArray is a list, like this:
>>> skuArray = [('000381001238',), ('000381001238',), ('000381001238',), ('FA200513652',), ('000614400967',)]
You can format your string like this:
>>> ', '.join(["'{}'".format(x[0]) for x in skuArray])
"'000381001238', '000381001238', '000381001238', 'FA200513652', '000614400967'"
I am trying to use a dict to do a SQL INSERT. The logic would basically be:
INSERT INTO table (dict.keys()) VALUES dict.values()
However, I am having a tough time figuring out the correct syntax / flow to do this. This is what I currently have:
# data = {...}
sorted_column_headers_list = []
sorted_column_values_list = []
for k, v in data.items():
    sorted_column_headers_list.append(k)
    sorted_column_values_list.append(v)
sorted_column_headers_string = ', '.join(sorted_column_headers_list)
sorted_column_values_string = ', '.join(sorted_column_values_list)
cursor.execute("""INSERT INTO title (%s)
VALUES (%s)""",
(sorted_column_headers_string, sorted_column_values_string))
From this I get a SQL exception (I think related to the fact that commas are also included in some of the values that I have). What would be the correct way to do the above?
I think the comment on using this with MySQL is not quite complete. MySQLdb doesn't do parameter substitution in the columns, just the values (IIUC) - so maybe more like
placeholders = ', '.join(['%s'] * len(myDict))
columns = ', '.join(myDict.keys())
sql = "INSERT INTO %s ( %s ) VALUES ( %s )" % (table, columns, placeholders)
# valid in Python 2
cursor.execute(sql, myDict.values())
# valid in Python 3
cursor.execute(sql, list(myDict.values()))
You're not getting escaping on the columns though, so you might want to check them first....
See http://mail.python.org/pipermail/tutor/2010-December/080701.html for a more complete solution
You want to add parameter placeholders to the query. This might get you what you need:
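qmarks = ', '.join('?' * len(myDict))
# column names cannot be bound as parameters, so they go into the SQL text
columns = ', '.join(myDict.keys())
qry = "Insert Into Table (%s) Values (%s)" % (columns, qmarks)
cursor.execute(qry, list(myDict.values()))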
Always good answers here, but in Python 3, you should write the following:
placeholder = ", ".join(["%s"] * len(dict))
stmt = "insert into `{table}` ({columns}) values ({values});".format(table=table_name, columns=",".join(dict.keys()), values=placeholder)
cur.execute(stmt, list(dict.values()))
Don't forget to convert dict.values() to a list because in Python 3, dict.values() returns a view, not a list.
Also, do NOT join dict.values() directly into stmt: joining strips the quotes from around string values, which caused a MySQL error when I tried to insert. Always pass the values to cur.execute() as parameters instead.
I'm a little late to the party but there is another way that I tend to prefer since my data is usually in the form of a dict already. If you list the bind variables in the form of %(columnName)s you can use a dictionary to bind them at execute. This partially solves the problem of column ordering since the variables are bound in by name. I say partially because you still have to make sure that the columns & values portion of the insert are mapped correctly; but the dictionary itself can be in any order (since dicts are sort of unordered anyway)
There is probably a more pythonic way to achieve all this, but pulling the column names into a list and working off it ensures we have a static ordering to build the columns & values clauses.
data_dict = {'col1': 'value 1', 'col2': 'value 2', 'col3': 'value 3'}
columns = data_dict.keys()
cols_comma_separated = ', '.join(columns)
binds_comma_separated = ', '.join(['%(' + item + ')s' for item in columns])
sql = f'INSERT INTO yourtable ({cols_comma_separated}) VALUES ({binds_comma_separated})'
cur.execute(sql, data_dict)
Now whether or not it is a good idea to dynamically build your columns & values clause like this is a topic for a SQL injection thread.
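For readers who want to run the named-binding idea without a MySQL server, here is a rough equivalent using sqlite3. This is a sketch under assumptions: sqlite3 spells named placeholders `:name` rather than `%(name)s`, and `yourtable` is created only for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (col1 TEXT, col2 TEXT, col3 TEXT)")

# Deliberately out of column order: values are bound by name, not position.
data_dict = {"col3": "value 3", "col1": "value 1", "col2": "value 2"}

columns = list(data_dict)
cols_comma_separated = ", ".join(columns)
# sqlite3 uses ":name" where MySQL drivers use "%(name)s".
binds_comma_separated = ", ".join(":" + name for name in columns)

sql = f"INSERT INTO yourtable ({cols_comma_separated}) VALUES ({binds_comma_separated})"
conn.execute(sql, data_dict)

row = conn.execute("SELECT col1, col2, col3 FROM yourtable").fetchone()
print(row)
```

Each value lands in its matching column even though the dict lists `col3` first, which is the ordering benefit described above.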
table='mytable'
columns_string= '('+','.join(myDict.keys())+')'
values_string = '('+','.join(map(str,myDict.values()))+')'
sql = """INSERT INTO %s %s
VALUES %s"""%(table, columns_string,values_string)
I tried @furicle's solution but it still inputs everything as a string, so if your dict has mixed value types it may not work as you want. I had a similar issue and this is what I came up with. It is only a query builder, and you could use it (with changes) with any database of your choice. Have a look!
def ins_query_maker(tablename, rowdict):
    keys = tuple(rowdict)
    dictsize = len(rowdict)
    sql = ''
    for i in range(dictsize) :
        if(type(rowdict[keys[i]]).__name__ == 'str'):
            sql += '\'' + str(rowdict[keys[i]]) + '\''
        else:
            sql += str(rowdict[keys[i]])
        if(i< dictsize-1):
            sql += ', '
    query = "insert into " + str(tablename) + " " + str(keys) + " values (" + sql + ")"
    print(query) # for demo purposes we do this
    return(query) #in real code we do this
This is crude and still needs sanity checks, etc, but it works as intended.
for a dict:
tab = {'idnumber': 1, 'fname': 'some', 'lname': 'dude', 'dob': '15/08/1947', 'mobile': 5550000914, 'age' : 70.4}
running the query I get the following output:
[image: results of query generated by the suite]
This code worked for me (Python 3):
fields = (str(list(dictionary.keys()))[1:-1])
values = (str(list(dictionary.values()))[1:-1])
sql = 'INSERT INTO Table (' + fields + ') VALUES (' + values + ')'
cursor.execute(sql)
It does rely on the dictionary outputting its keys and values in the same order. I'm unclear if this is always true :)
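On the ordering question: the Python documentation states that keys(), values() and items() iterate in matching order as long as the dictionary is not modified in between, and since Python 3.7 that order is insertion order. A quick check, with a made-up dictionary:

```python
dictionary = {"name": "Bob", "id": 1, "type": "Alligator"}

# keys() and values() line up pairwise as long as the dict is not modified
# between the two calls.
pairs = list(zip(dictionary.keys(), dictionary.values()))
same_order = pairs == list(dictionary.items())

print(same_order)
```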
When constructing queries dynamically it's important to ensure that both identifiers and values are correctly quoted. Otherwise you risk:
SQL injection if untrusted data is processed
Errors if the column names require quoting (for example embedded spaces)
Data corruption or errors if values are incorrectly quoted (for example 2021-07-11 unquoted may be evaluated as 2003)
Quoting values is best delegated to the DB-API connector. However connector packages don't always provide a way to quote identifiers, so you may need to do this manually. MySQL uses backticks (`) to quote identifiers.
This code quotes identifiers and values. It works for MySQLdb, mysql.connector and pymysql and works for Python 3.5+.
data = {'col1': val1, 'col2': val2, ...}
# Compose a string of quoted column names
cols = ','.join([f'`{k}`' for k in data.keys()])
# Compose a string of placeholders for values
vals = ','.join(['%s'] * len(data))
# Create the SQL statement
stmt = f'INSERT INTO `tbl` ({cols}) VALUES ({vals})'
# Execute the statement, delegating the quoting of values to the connector
cur.execute(stmt, tuple(data.values()))
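Two of the risks listed above (identifiers that need quoting, and unquoted dates) can be reproduced with sqlite3. This is a sketch under assumptions: SQLite quotes identifiers with double quotes and uses `?` placeholders where the MySQL code uses backticks and `%s`, and the table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A column name with an embedded space only works when quoted.
conn.execute('CREATE TABLE tbl ("order date" TEXT, qty INTEGER)')

data = {"order date": "2021-07-11", "qty": 3}

# Quote each identifier, doubling any embedded double quote.
cols = ",".join('"{}"'.format(k.replace('"', '""')) for k in data)
# One placeholder per value; the connector quotes the values.
vals = ",".join(["?"] * len(data))
stmt = f"INSERT INTO tbl ({cols}) VALUES ({vals})"
conn.execute(stmt, tuple(data.values()))

stored = conn.execute('SELECT "order date" FROM tbl').fetchone()[0]
# The same date written unquoted is parsed as integer subtraction.
unquoted = conn.execute("SELECT 2021-07-11").fetchone()[0]

print(stored, unquoted)
```

The bound value comes back as the original date string, while the unquoted literal evaluates to 2003.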
This is based on other answers here, but it uses backticks around column names for cases in which you are using reserved words as column names, and it ensures that column names contain only letters, numbers, and underscores to thwart SQL injection attacks.
I've also written a similar upsert that works the same way as the insert but which overwrites data that duplicates the primary key.
import mysql.connector
import re
cnx = mysql.connector.connect(...)
def checkColumnNames(data):
    for name in data.keys():
        assert re.match(r'^[a-zA-Z0-9_]+$',name), "Bad column name: " + name
def insert(table, data):
    checkColumnNames(data)
    assert table, "No table specified"
    placeholders = ', '.join(['%s'] * len(data))
    columns = '`,`'.join(data.keys())
    sql = "INSERT INTO `%s` (`%s`) VALUES (%s);" % (table, columns, placeholders)
    cnx.cursor().execute(sql, list(data.values()))
def upsert(table, data):
    checkColumnNames(data)
    assert table, "No table specified"
    placeholders = ', '.join(['%s'] * len(data))
    columns = '`,`'.join(data.keys())
    updates = '`' + '`=%s,`'.join(data.keys()) + '`=%s'
    sql = "INSERT INTO `%s` (`%s`) VALUES (%s) ON DUPLICATE KEY UPDATE %s" % (table, columns, placeholders, updates)
    cnx.cursor().execute(sql, list(data.values()) + list(data.values()))
Example usage
insert("animals", {
"id": 1,
"name": "Bob",
"type": "Alligator"
})
cnx.commit()
I used this thread for my usage and tried to keep it much simpler
ins_qry = "INSERT INTO {tablename} ({columns}) VALUES {values};" .format(
tablename=my_tablename,
columns=', '.join(myDict.keys()),
values=tuple(myDict.values())
)
cursor.execute(ins_qry)
Make sure to commit the inserted data using db_connection.commit(), and use cursor.lastrowid if you need the primary key of the inserted row.
This works for me
cursor.execute("INSERT INTO table (col) VALUES ( %(col_value)s )",
    {'col_value': 123})
If you have a list containing a number of dictionaries,
for example: lst=[d1,d2,d3,d4]
then the code below worked for me:
for i in lst:
    placeholders = ', '.join(['%s'] * len(i))
    columns = ', '.join(i.keys())
    sql = "INSERT INTO %s ( %s ) VALUES ( %s )" % (table, columns, placeholders)
    cursor.execute(sql,list(i.values()))
conn.commit()
Note: Don't forget to commit, otherwise you won't be able to see the inserted rows in the table.
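When every dictionary in the list has the same keys, the loop can likely be collapsed into one `executemany` call. This is a sketch using sqlite3 as a stand-in (so `?` replaces `%s`), with an invented `animals` table. The column order is taken from the first dict and reused for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (id INTEGER, name TEXT, type TEXT)")

lst = [
    {"id": 1, "name": "Bob", "type": "Alligator"},
    {"type": "Heron", "id": 2, "name": "Eve"},  # same keys, different order
]

# Take the column order from the first dict and reuse it for every row.
columns = list(lst[0])
placeholders = ", ".join(["?"] * len(columns))
sql = "INSERT INTO animals ({}) VALUES ({})".format(", ".join(columns), placeholders)

# Look values up by key so each row follows the shared column order.
rows = [tuple(d[c] for c in columns) for d in lst]
conn.executemany(sql, rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM animals").fetchone()[0]
second_name = conn.execute("SELECT name FROM animals WHERE id = 2").fetchone()[0]
print(count, second_name)
```

The column names still go into the SQL text unescaped, so they should come from a trusted source or be checked first.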
columns = ', '.join(str(x).replace('/', '_') for x in row_dict.keys())
values = ', '.join("'" + str(x).replace('/', '_') + "'" for x in row_dict.values())
sql = "INSERT INTO %s ( %s ) VALUES ( %s );" % ("tablename", columns, values)
applicable for python3
Let's say our data is:
data = {
"name" : "fani",
"surname": "dogru",
"number" : 271990
}
This is my shorter version:
tablo = "table_name"
cols = ','.join([f" {k}" for k in data.keys()])
vals = ','.join([f"'{k}'" for k in data.values()])
stmt = f'INSERT INTO {tablo} ({cols}) VALUES ({vals})'
What about:
keys = str(dict.keys())
# str.replace returns a new string, so the result has to be assigned
keys = keys.replace('[', '(').replace(']', ')').replace("'", '')
vals = str(dict.values())
vals = vals.replace('[', '(').replace(']', ')')
cur.execute('INSERT INTO table %s VALUES %s' % (keys, vals))
For python 3:
keys = str(dict.keys())[9:].replace('[', '').replace(']', '')
vals = str(dict.values())[11:].replace('[', '').replace(']', '')
...
Hello everyone, I currently have this:
import feedparser
d = feedparser.parse('http://store.steampowered.com/feeds/news.xml')
for i in range(10):
    print d.entries[i].title
    print d.entries[i].date
How would I go about making it so that the title and date are on the same line? Also, it doesn't need to print; I just have that in there for testing. I would like to dump this output into a MySQL db with the title and date. Any help is greatly appreciated!
If you want to print on the same line, just add a comma:
print d.entries[i].title, # <- comma here
print d.entries[i].date
To insert to MySQL, you'd do something like this:
to_db = []
for i in range(10):
    to_db.append((d.entries[i].title, d.entries[i].date))
import MySQLdb
conn = MySQLdb.connect(host="localhost",user="me",passwd="pw",db="mydb")
c = conn.cursor()
c.executemany("INSERT INTO mytable (title, date) VALUES (%s, %s)", to_db)
Regarding your actual question: if you want to join two strings with a comma you can use something like this:
print d.entries[i].title + ', ' + str(d.entries[i].date)
Note that I have converted the date to a string using str.
You can also use string formatting instead:
print '%s, %s' % (d.entries[i].title, str(d.entries[i].date))
Or in Python 2.6 or newer use str.format.
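A minimal sketch of the str.format variant, with placeholder strings standing in for the feed entry's title and date (they are not real feed data):

```python
title = "Example news item"          # placeholder, not real feed data
date = "Mon, 01 Jan 2018 00:00:00"   # placeholder, not real feed data

# Explicit indexes ({0}, {1}) also work on Python 2.6, which lacks bare {}.
line = "{0}, {1}".format(title, date)
print(line)
```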
But if you want to store this in a database it might be better to use two separate columns instead of combining both values into a single string. You might want to consider adjusting your schema to allow this.