Issue substituting a string value in a MySQL INSERT statement (Python 3.7)

I am new to Python. I have written the following piece of code to construct an INSERT SQL query:
sql = "INSERT INTO my_table (col1, col2) VALUES (%s, %s)"
params = []
for k, v in my_dict.items():
    params.append((str(k), v))  # col1 is of type varchar, so I typecast the first param k
for param in params:
    curr_sql_query = sql % param
    log.info(curr_sql_query)
    db_obj.execute(curr_sql_query)
But even after converting the first param to a string as above, the SQL string I get is:
INSERT INTO my_table (col1, col2) VALUES (1.2.3.4, 1), where the values should have been ('1.2.3.4', 1).
The error I get when trying to run the query string with db_obj.execute(curr_sql_query) is:
error=(1064, 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.3.4 at line 1')

str(k) does not add quotes ('). You have to add them yourself:
params.append((f"'{k}'", v))
If k is already of type str, you can use repr, which gives a more detailed representation of the object: repr('test') returns "'test'", whereas str('test') returns 'test'.
params.append((repr(k), v))
But be careful with it: if k already contains a ', repr wraps your string in " instead. For example:
repr('\'Hello\' is string')
output:
"'Hello' is string"

You should use placeholders instead, to avoid issues with quotes, escaped characters and possible code injection:
sql = "INSERT INTO my_table (col1, col2) VALUES (%s, %s)"
for k, v in my_dict.items():
    db_obj.execute(sql, (str(k), v))
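The same parameterized pattern can be sketched with the standard-library sqlite3 module, which uses ? where MySQL drivers use %s; the principle, passing values as a separate argument rather than formatting them into the SQL string, is identical (table and data here are illustrative):

```python
import sqlite3

# In-memory database as a stand-in for the MySQL connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE my_table (col1 TEXT, col2 INTEGER)")

my_dict = {"1.2.3.4": 1, "5.6.7.8": 2}

# sqlite3 uses ? placeholders; MySQLdb/pymysql use %s, but the call shape
# is the same: the driver quotes and escapes the values itself.
sql = "INSERT INTO my_table (col1, col2) VALUES (?, ?)"
for k, v in my_dict.items():
    cur.execute(sql, (str(k), v))

cur.execute("SELECT col1, col2 FROM my_table ORDER BY col2")
print(cur.fetchall())  # [('1.2.3.4', 1), ('5.6.7.8', 2)]
```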

Related

Properly format SQL query when insert into variable number of columns

I'm using psycopg2 to interact with a PostgreSQL database. I have a function whereby any number of columns (from a single column to all columns) in a table could be inserted into. My question is: how would one properly, dynamically, construct this query?
At the moment I am using string formatting and concatenation, and I know this is the absolute worst way to do it. Consider the code below where, in this case, my unknown number of columns (i.e., keys from a dict) is in fact 2:
dictOfUnknownLength = {'key1': 3, 'key2': 'myString'}

def createMyQuery(user_ids, dictOfUnknownLength):
    fields, values = list(), list()
    for key, val in dictOfUnknownLength.items():
        fields.append(key)
        values.append(val)
    fields = str(fields).replace('[', '(').replace(']', ')').replace("'", "")
    values = str(values).replace('[', '(').replace(']', ')')
    query = f"INSERT INTO myTable {fields} VALUES {values} RETURNING someValue;"
    # query == "INSERT INTO myTable (key1, key2) VALUES (3, 'myString') RETURNING someValue;"
This provides a correctly formatted query but is of course prone to SQL injections and the like and, as such, is not an acceptable method of achieving my goal.
In other queries I am using the recommended methods of query construction when handling a known number of variables (%s and separate argument to .execute() containing variables) but I'm unsure how to adapt this to accommodate an unknown number of variables without using string formatting.
How can I elegantly and safely construct a query with an unknown number of specified insert columns?
To add to your worries, the current methodology using .replace() is prone to edge cases where fields or values contain [, ], or '. They will get replaced no matter what and may mess up your query.
You could always use .join() to join a variable number of values in your list. To top it up, format the query appropriately with %s after VALUES and pass your arguments into .execute().
Note: You may also want to consider the case where the number of fields is not equal to the number values.
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()

dictOfUnknownLength = {'key1': 3, 'key2': 'myString'}

def createMyQuery(user_ids, dictOfUnknownLength):
    # Directly assign keys/values.
    fields, values = list(dictOfUnknownLength.keys()), list(dictOfUnknownLength.values())
    if len(fields) != len(values):
        # Raise an error? SQL won't work in this case anyway...
        pass
    # Stringify the fields and build matching placeholders.
    fieldsParam = ','.join(fields)                # "key1,key2"
    valuesParam = ','.join(['%s'] * len(values))  # "%s,%s"
    # "INSERT ... (key1,key2) VALUES (%s,%s) ..."
    query = 'INSERT INTO myTable ({}) VALUES ({}) RETURNING someValue;'.format(fieldsParam, valuesParam)
    # .execute('INSERT ... (key1,key2) VALUES (%s,%s) ...', [3, 'myString'])
    cur.execute(query, values)  # Anti-SQL-injection: pass placeholder
                                # values as the second argument.
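The query-building half of this can be exercised on its own. A minimal sketch (the build_insert helper and the table/column names are illustrative, not part of the original code; note the column names are still interpolated into the SQL text, so they must come from a trusted source or a whitelist):

```python
def build_insert(table, row):
    """Build a parameterized INSERT for a dict of column -> value.

    Caution: table and column names are interpolated into the SQL text,
    so they must be trusted or validated against a whitelist.
    """
    fields = ','.join(row.keys())
    placeholders = ','.join(['%s'] * len(row))
    query = 'INSERT INTO {} ({}) VALUES ({}) RETURNING someValue;'.format(
        table, fields, placeholders)
    return query, list(row.values())

query, args = build_insert('myTable', {'key1': 3, 'key2': 'myString'})
print(query)  # INSERT INTO myTable (key1,key2) VALUES (%s,%s) RETURNING someValue;
print(args)   # [3, 'myString']
```

The values never touch the SQL string; they would be handed to cur.execute(query, args) for the driver to escape.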

In python, How to do the postgresql 'LIKE %name%' query which mitigate SQL injection?

I want to execute the query below:
select * from table where name LIKE '%sachin%';
I created the SQL query this way:
sql = "select * from table where %s like '\%%s\%'"
It gives me following error,
ValueError: unsupported format character ''' (0x27) at index 42
I want a '%' symbol before and after the string. How can I achieve this? It should also mitigate SQL injection.
Your best option is to use a placeholder and generate the right-hand value of the LIKE expression in SQL, as follows. The big difficulty is that you are also expecting to pass in an identifier (the column name), which has to be handled differently:
sqltemplate = "select * from table where {} like '%%' || %s || '%%'"
Into this we fill in our identifier. Note that it is important to whitelist the value:
allowed_columns = ['foo', 'bar', 'baz']
if colname in allowed_columns:
    sql = sqltemplate.format(colname)
else:
    raise ValueError('Bad column name!')
Then you can use a placeholder for %s and it will just work:
conn.execute(sql, (searchval,))
Note: in psycopg2 you use %s as the placeholder for values of any type, and %% to represent a literal percent sign.
You can use %% to escape the percent character:
sql = "select * from table where name like '%%sachin%%'"
For your case the above query should work, but only with a hardcoded search term; note that the %% sequences collapse to % only when execute() is called with a parameters argument.
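An alternative that avoids escaping % in the SQL text altogether is to build the pattern in Python and pass it as the parameter value. A minimal sketch with the standard-library sqlite3 module (which uses ? where psycopg2 uses %s; table and rows are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE players (name TEXT)")
cur.executemany("INSERT INTO players VALUES (?)",
                [("sachin tendulkar",), ("rahul dravid",)])

searchval = "sachin"
# The % wildcards live in the Python value, not in the SQL text,
# so the driver escapes the user-supplied part safely.
cur.execute("SELECT name FROM players WHERE name LIKE ?",
            ('%' + searchval + '%',))
print(cur.fetchall())  # [('sachin tendulkar',)]
```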

psql TypeError: not all arguments converted during string formatting

I have trouble when I try to automatically generate pID:
CREATE TABLE players(
    pID SERIAL PRIMARY KEY,
    pName VARCHAR(90) NOT NULL
);
and here is my function
def addPlayer(name):
    conn = connect()
    cur = conn.cursor()
    cur.execute("INSERT INTO players(pName) VALUES(%s)", name)
    conn.commit()
and I call the function with
addPlayer('Vera')
I keep getting the error that
cur.execute("INSERT INTO players(pName) VALUES(%s)",name)
TypeError: not all arguments converted during string formatting
I have searched for hours but am still confused. Can anyone help me with this? Thanks a lot!
You need to pass a tuple or list as the second parameter to execute.
With more than one placeholder in the query, the tuple looks "normal", like (name, age). In your case you need a tuple with just one element; the short but slightly unusual way to write that is (name,) as the second parameter.
Thus:
cur.execute("INSERT INTO players(pName) VALUES(%s)", (name,))
Note that some DB-API drivers (sqlite3, for example) use ? as the placeholder character instead, which would look like this:
cur.execute("INSERT INTO players(pName) VALUES(?)", (name,))
but psycopg2 only supports the %s style, so stick with %s here.
I think you're confusing string interpolation with query variables.
String interpolation:
name = "Ben"
print("Hello, my name is %s" % name)
names = ("Adam", "Ben", "Charlie")
print("Hello, our names are %s, %s and %s" % names)  # note: % needs a tuple, not a list
Query variables:
values = [name]
cur.execute("INSERT INTO players(pName) VALUES(%s)", values)
So changing your comma to a % would interpolate the variable into the string and appear to work, but query variables sanitise the input, so you should use the second form as is.
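The one-element tuple is the easy part to get wrong: (name) is just name in redundant parentheses, while (name,) is a tuple. A quick sketch with the standard-library sqlite3 module (standing in for psycopg2, with ? instead of %s):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE players (pName TEXT NOT NULL)")

name = "Vera"
assert (name) == name        # parentheses alone do not make a tuple
assert (name,) == ("Vera",)  # the trailing comma does

# Passing the bare string is the classic mistake: the driver treats each
# character of "Vera" as a separate parameter (4 bindings for 1 placeholder).
try:
    cur.execute("INSERT INTO players(pName) VALUES(?)", name)
except sqlite3.ProgrammingError as e:
    print("bare string failed:", e)

cur.execute("INSERT INTO players(pName) VALUES(?)", (name,))
cur.execute("SELECT pName FROM players")
print(cur.fetchall())  # [('Vera',)]
```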

TypeError when inserting JSON data into MySQL using MySQL-Python

I'm trying to insert data from a JSON string to MySQL using MySQLdb. The total number of columns is fixed. Each row of data from the JSON string does not always have values for each column.
Here is my sample code:
import json
import urllib2  # Python 2; on Python 3 use urllib.request
import MySQLdb

vacant_building = 'http://data.cityofchicago.org/resource/7nii-7srd.json?%24where=date_service_request_was_received=%272014-06-02T00:00:00%27'
obj = urllib2.urlopen(vacant_building)
data = json.load(obj)

def insert_mysql(columns, placeholders, data):
    sql = "INSERT INTO vacant_buildings (%s) VALUES (%s)" % (columns, placeholders)
    db = MySQLdb.connect(host="localhost", user="xxxx", passwd="xxxx", db="chicago_data")
    cur = db.cursor()
    cur.execute(sql, data)

for row in data:
    placeholders = ', '.join(['%s'] * len(row))
    columns = ', '.join(c[:64] for c in row.keys())
    row_data = ', '.join(str(value) for value in row.values())
    insert_mysql(columns, placeholders, row_data)
I get the following error:
query = query % tuple([db.literal(item) for item in args])
TypeError: not all arguments converted during string formatting
I'm pretty sure the error has to do with the way I'm inserting the values. I've tried changing it to:
sql = "INSERT INTO vacant_buildings (%s) VALUES (%s) (%s)" % (columns, placeholders, data)
but I get a 1064 error, because the values are not enclosed in quotes (').
Thoughts to fix?
In order to parameterize your query using MySQLdb's cursor.execute method, the second argument to execute has to be a sequence of values; in your for loop, you're joining the values together into one string with the following line:
row_data = ', '.join(str(value) for value in row.values())
Since you generated one placeholder per value (len(row) of them), you need to supply that many values to cursor.execute. If you give it only a single string, it puts that entire string into the first placeholder and leaves the others without arguments. That throws a TypeError; the message in that case would read "not enough arguments for format string", but I'm going to assume you simply mixed things up when copy/pasting, because the opposite case (too many arguments / too few placeholders) reads as you indicate: "not all arguments converted during string formatting".
In order to run an INSERT statement through MySQLdb with a variable set of columns, you could do just as you've done for the columns and placeholders, but I prefer to use mapping types with the extended formatting syntax supported by MySQLdb (e.g., %(name)s instead of %s) to make sure I've constructed my query correctly and not put the values in any wrong order. I also like using advanced string formatting where possible in my own code.
You could prepare your inputs like this:
max_key_length = 64
columns = ','.join(k[:max_key_length] for k in row.keys())
placeholders = ','.join('%({})s'.format(k[:max_key_length]) for k in row.keys())
row_data = {k[:max_key_length]: str(v) for k, v in row.items()}
Note that with named placeholders, row_data must be a mapping rather than a list; execute then looks each value up by key, so the iteration order of the dict no longer matters.
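A self-contained sketch of that named-placeholder construction (the sample row is a made-up subset of the JSON data below; the 64-character truncation mirrors the snippet above):

```python
max_key_length = 64
row = {'ward': '20', 'zip_code': '60621', 'police_district': '7'}

columns = ','.join(k[:max_key_length] for k in row)
placeholders = ','.join('%({})s'.format(k[:max_key_length]) for k in row)
params = {k[:max_key_length]: str(v) for k, v in row.items()}

sql = "INSERT INTO vacant_buildings ({}) VALUES ({})".format(columns, placeholders)
print(sql)
# INSERT INTO vacant_buildings (ward,zip_code,police_district) VALUES (%(ward)s,%(zip_code)s,%(police_district)s)
print(params)  # {'ward': '20', 'zip_code': '60621', 'police_district': '7'}
```

The driver would then be called as cur.execute(sql, params), filling each %(name)s from the mapping by key.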
Generally speaking, this should work okay with the sort of code in your insert_mysql function. However, looking at the JSON data you're actually pulling from that URL, you should be aware that you may run into nesting issues; for example:
>>> pprint.pprint(data[0])
{u'address_street_direction': u'W',
u'address_street_name': u'61ST',
u'address_street_number': u'424',
u'address_street_suffix': u'ST',
u'any_people_using_property_homeless_childen_gangs_': True,
u'community_area': u'68',
u'date_service_request_was_received': u'2014-06-02T00:00:00',
u'if_the_building_is_open_where_is_the_entry_point_': u'FRONT',
u'is_building_open_or_boarded_': u'Open',
u'is_the_building_currently_vacant_or_occupied_': u'Vacant',
u'is_the_building_vacant_due_to_fire_': False,
u'latitude': u'41.78353874626324',
u'location': {u'latitude': u'41.78353874626324',
u'longitude': u'-87.63573355602661',
u'needs_recoding': False},
u'location_of_building_on_the_lot_if_garage_change_type_code_to_bgd_': u'Front',
u'longitude': u'-87.63573355602661',
u'police_district': u'7',
u'service_request_number': u'14-00827306',
u'service_request_type': u'Vacant/Abandoned Building',
u'ward': u'20',
u'x_coordinate': u'1174508.30988836',
u'y_coordinate': u'1864483.93566661',
u'zip_code': u'60621'}
The string representation of the u'location' column is:
"{u'latitude': u'41.78353874626324', u'needs_recoding': False, u'longitude': u'-87.63573355602661'}"
You may not want to put that into a database field, especially considering that there are atomic lat/lon fields already in the JSON object.
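If you do want to store a nested value such as location in a single text column, one option is to serialize it with json.dumps rather than relying on str(). A sketch (whether this fits depends on your schema):

```python
import json

row = {
    'ward': '20',
    'location': {'latitude': '41.78353874626324',
                 'longitude': '-87.63573355602661',
                 'needs_recoding': False},
}

# Serialize nested containers to valid JSON text; leave scalars as strings.
# str() would produce a Python repr, which cannot be parsed back reliably.
params = {k: json.dumps(v) if isinstance(v, (dict, list)) else str(v)
          for k, v in row.items()}
print(params['location'])
```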

How can I insert NULL data into MySQL database with Python?

I'm getting a weird error when inserting some data from a Python script into MySQL. It's basically related to a variable being blank that I am inserting. I take it that MySQL does not like blank variables, but is there something else I can change it to so that it works with my insert statement?
I can successfully use an IF statement to turn it into 0 if it's blank, but this may mess up some of the data analytics I plan to do in MySQL later. Is there a way to convert it to NULL or something so that MySQL accepts it but doesn't add anything?
When using MySQLdb and cursor.execute(), pass the value None, not the string "NULL":
value = None
cursor.execute("INSERT INTO table (`column1`) VALUES (%s)", (value,))
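The same None-to-NULL translation can be demonstrated with the standard-library sqlite3 module (? placeholders instead of MySQL's %s; table name is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (column1 TEXT)")

# None on the Python side becomes NULL on the SQL side.
cur.execute("INSERT INTO t (column1) VALUES (?)", (None,))
cur.execute("SELECT column1, column1 IS NULL FROM t")
print(cur.fetchall())  # [(None, 1)]
```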
If col1 is char and col2 is int, a trick could be to build the literals in Python:
sql = "insert into table (col1, col2) values (%s, %s)" % ("'{}'".format(val1) if val1 else "NULL", val2 if val2 else "NULL")
You do not need to add quotes around %s; the value can be processed before it is passed into the SQL.
This method works when executing raw SQL through a SQLAlchemy session, for example session.execute(text(sql)).
P.S.: the SQL above is untested.
Quick note about using parameters in SQL statements with Python. See the RealPython article on this topic - Preventing SQL Injection Attacks With Python. Here's another good article from TowardsDataScience.com - A Simple Approach To Templated SQL Queries In Python. These helped me with the same None/NULL issue.
Also, I found that if I put NULL (without quotes) directly into the VALUES of the INSERT query, it was interpreted appropriately by the SQL Server DB. The translation problem only exists if you need to conditionally add NULL or a value via string interpolation.
Examples:
cursor.execute("SELECT admin FROM users WHERE username = %s", (username,))
cursor.execute("SELECT admin FROM users WHERE username = %(username)s", {'username': username})
UPDATE: This StackOverflow discussion is more in line with what I'm trying to do and may help someone else.
Example:
import pypyodbc
myData = [
    (1, 'foo'),
    (2, None),
    (3, 'bar'),
]
connStr = """
DSN=myDb_SQLEXPRESS;
"""
cnxn = pypyodbc.connect(connStr)
crsr = cnxn.cursor()
sql = """
INSERT INTO myTable VALUES (?, ?)
"""
for dataRow in myData:
    print(dataRow)
    crsr.execute(sql, dataRow)
cnxn.commit()
crsr.close()
cnxn.close()
Based on the above answers I wrote a wrapper function for my use case; you can adapt it to your needs:
def sanitizeData(value):
    if value in ('', None):
        return "NULL"
    # Handle values that already contain ' (e.g. O'Brien); SQL escapes a
    # single quote by doubling it.
    if type(value) is str:
        return "'{}'".format(value.replace("'", "''"))
    return value
Now build the SQL query like so:
sql = "INSERT INTO %s (Name, Email) VALUES (%s, %s)" % (table_name, sanitizeData(actual_name), sanitizeData(actual_email))
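For illustration, the wrapper behaves like this (the function is repeated so the snippet is self-contained; driver-side placeholders remain the safer option than hand-built literals):

```python
def sanitizeData(value):
    if value in ('', None):
        return "NULL"
    # SQL escapes a single quote by doubling it.
    if type(value) is str:
        return "'{}'".format(value.replace("'", "''"))
    return value

print(sanitizeData(None))       # NULL
print(sanitizeData("O'Brien"))  # 'O''Brien'
print(sanitizeData(42))         # 42

sql = "INSERT INTO %s (Name, Email) VALUES (%s, %s)" % (
    "users", sanitizeData("O'Brien"), sanitizeData(None))
print(sql)  # INSERT INTO users (Name, Email) VALUES ('O''Brien', NULL)
```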
Why not set the variable equal to some string like 'no price' and then filter this out later when you want to do math on the numbers?
filter(lambda x: x != 'no price',list_of_data_from_database)
Do a quick check for blank, and if it is, set it equal to "NULL":
if not variable_to_insert:
    variable_to_insert = "NULL"
...then make sure the inserted variable is not wrapped in quotes in the insert statement, like:
insert = "INSERT INTO table (var) VALUES (%s)" % (variable_to_insert)
...
not like:
insert = "INSERT INTO table (var) VALUES ('%s')" % (variable_to_insert)
...
