I want to insert a list in my database but I can't.
Here is an example of what I need:
variable_1 = "HELLO"
variable_2 = "ADIOS"
list = [variable_1,variable_2]
INSERT INTO table VALUES ('%s') % list
Can something like this be done? Can I insert a list as a value?
When I try it, I get an error saying there is a problem with the MySQL syntax.
The answer to your original question is: No, you can't insert a list like that.
However, with some tweaking, you could make that code work by using %r and passing in a tuple:
variable_1 = "HELLO"
variable_2 = "ADIOS"
varlist = [variable_1, variable_2]
print "INSERT INTO table VALUES %r;" % (tuple(varlist),)
Unfortunately, that style of variable insertion leaves your code vulnerable to SQL injection attacks.
Instead, we recommend using Python's DB API and building a customized query string with multiple question marks for the data to be inserted:
variable_1 = "HELLO"
variable_2 = "ADIOS"
varlist = [variable_1, variable_2]
var_string = ', '.join('?' * len(varlist))
query_string = 'INSERT INTO table VALUES (%s);' % var_string
cursor.execute(query_string, varlist)
The example at the beginning of the SQLite3 docs shows how to pass arguments using the question marks and it explains why they are necessary (essentially, it assures correct quoting of your variables).
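For reference, here is a runnable end-to-end version of the same idea, using an in-memory SQLite database; the table and column names are illustrative, not from the original question:
import sqlite3

conn = sqlite3.connect(':memory:')  # throwaway database for the example
cursor = conn.cursor()
cursor.execute('CREATE TABLE greetings (col1 TEXT, col2 TEXT)')

varlist = ['HELLO', 'ADIOS']
var_string = ', '.join('?' * len(varlist))               # '?, ?'
query_string = 'INSERT INTO greetings VALUES (%s);' % var_string
cursor.execute(query_string, varlist)                    # list elements become columns of one row

# If you instead want one row per list element, executemany does that:
cursor.execute('CREATE TABLE words (word TEXT)')
cursor.executemany('INSERT INTO words (word) VALUES (?)', [(v,) for v in varlist])
conn.commit()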
Your question is not clear.
Do you want to insert the list as a comma-delimited text string into a single column in the database? Or do you want to insert each element into a separate column? Either is possible, but the technique is different.
Insert comma-delimited list into one column:
conn.execute('INSERT INTO table (ColName) VALUES (?);', [','.join(varlist)])
Insert into separate columns:
params = ['?' for item in varlist]
sql = 'INSERT INTO table (Col1, Col2, ...) VALUES (%s);' % ','.join(params)
conn.execute(sql, varlist)
Both assume you have already established a connection named conn.
A few other suggestions:
Try to avoid INSERT statements that do not list the names and order of the columns you're inserting into. That kind of statement leads to very fragile code; it breaks if you add, delete, or move columns around in your table.
If you're inserting a comma-separated list into a single field, that generally violates principles of database design; you should use a separate table with one value per record.
If you're inserting into separate fields and they have names like Word1 and Word2, that is likewise an indication that you should be using a separate table instead.
Never use direct string substitution to create SQL statements. It will break if one of the values is, for example, o'clock. It also opens you up to attacks via SQL injection techniques.
You can use json.dumps to convert a list to JSON text and store it in a single column.
For example:
import json

cursor.execute("INSERT INTO example_table (column_name) VALUES (%s)",
               (json.dumps(your_list),))
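When you read the value back, json.loads restores the original list; a minimal sketch, assuming the same cursor and table as above:
cursor.execute("SELECT column_name FROM example_table")
restored_list = json.loads(cursor.fetchone()[0])  # the stored JSON text becomes a Python list again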
I am automating a task through Python that will run an SQL statement to insert into an existing table in a DB.
My CSV headers look as such:
ID,ACCOUNTID,CATEGORY,SUBCATEGORY,CREATION_DATE,CREATED_BY,REMARK,ISIMPORTANT,TYPE,ENTITY_TYPE
My values:
seq_addnoteid.nextval,123456,TEST,ADMN_TEST,sysdate,ME,This is a test,Y,1,A
NOTE: Currently, seq_addnoteid works from the DB, but in my code I added a small snippet to get the max ID, and the rows will increase this by one for each iteration.
Sysdate could also be passed in the format '19-MAY-22'.
If i was to run from DB, this would work:
insert into notes values(seq_addnoteid.nextval,'123456','TEST','ADMN_TEST',sysdate,'ME','This is a test','Y',1,'A');
import csv

# Snippet to get the current max ID
cursor.execute("SELECT MAX(ID) from NOTES")
max = cursor.fetchone()
max = int(max[0])

with open('sample.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)
    query = 'INSERT INTO NOTES({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    cursor = conn.cursor()
    for data in reader:
        cursor.execute(query, data)
    conn.commit()

print("Records inserted successfully")
cursor.close()
conn.close()
Currently, I'm getting Oracle-Error-Message: ORA-01036: illegal variable name/number, and I think it's because of my query.format line. However, I'm looking for help to get this code to handle the data types properly.
Thanks!
Try printing your query before you execute it. I think you'll find that it's printing this:
INSERT INTO NOTES(ID,ACCOUNTID,CATEGORY,SUBCATEGORY,CREATION_DATE,CREATED_BY,REMARK,ISIMPORTANT,TYPE,ENTITY_TYPE)
values(seq_addnoteid.nextval,123456,TEST,ADMN_TEST,sysdate,ME,This is a test,Y,1,A);
Which will also give you a ORA-01036 if you try to run it manually.
The problem is that you want some of your column values to be literal values and some of them to be strings escaped in single quotes, and your code doesn't do that. I don't think there's an easy way to do it with ','.join(), so you'll either need to modify your CSVs to quote the strings, like:
seq_addnoteid.nextval,"'123456'","'TEST'","'ADMN_TEST'",sysdate,"'ME'","'This is a test'","'Y'",1,"'A'"
Or modify your query.format to add the quotes around the parameters that you want to treat as strings:
query.format(','.join(columns), "?,'?','?','?',?,'?','?','?',?,'?'")
As the commenters mentioned, pandas does handle this all very nicely.
EDIT: I see what you're saying. I'm not sure pandas will help with the literal functions you want to pass to the insert. But yes, you should be able to change your CSV and then do:
query.format(','.join(columns) + ',ID,CREATION_DATE', "'?','?','?','?','?','?',?,'?',seq_addnoteid.nextval,sysdate")
As a side note, a lot of people do this sort of thing on the database side in a BEFORE INSERT trigger, e.g.:
create or replace trigger NOTES_INS_TRG
before insert on NOTES
for each row
begin
  :NEW.ID := seq_addnoteid.nextval;
  :NEW.CREATION_DATE := sysdate;
end;
/
Then you could leave those columns out of your insert entirely.
Edit again:
I'm not sure you can use ? for bind/substitution variables in cx_Oracle (see the documentation). So where your raw query is currently:
INSERT INTO NOTES(ACCOUNTID,CATEGORY,SUBCATEGORY,CREATED_BY,REMARK,ISIMPORTANT,TYPE,ENTITY_TYPE,ID,CREATION_DATE)
values (seq_addnoteid.nextval,sysdate,'?','?','?','?','?','?',?,'?')
You'd need something like:
INSERT INTO NOTES(ACCOUNTID,CATEGORY,SUBCATEGORY,CREATED_BY,REMARK,ISIMPORTANT,TYPE,ENTITY_TYPE,ID,CREATION_DATE)
values (seq_addnoteid.nextval,sysdate,:1,:2,:3,:4,:5,:6,:7,:8)
We can probably do that by modifying the format string again to generate some bind variables:
query.format('ID,CREATION_DATE,' + ','.join(columns),
             "seq_addnoteid.nextval,sysdate," + ','.join([':' + c for c in columns]))
Again, try printing the query before executing it to make sure the column names and values are lining up correctly.
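Putting the pieces together, here is an untested sketch of how the whole loop might look with numbered bind variables, assuming a cx_Oracle connection conn and that (per the edit above) ID and CREATION_DATE have been dropped from the CSV:
import csv

with open('sample.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)  # header row, now without ID and CREATION_DATE

    # One numbered bind variable per remaining CSV column.
    binds = ','.join(':%d' % (i + 1) for i in range(len(columns)))
    query = ('INSERT INTO NOTES (ID, CREATION_DATE, %s) '
             'VALUES (seq_addnoteid.nextval, sysdate, %s)' % (','.join(columns), binds))

    cursor = conn.cursor()
    for data in reader:
        cursor.execute(query, data)  # data is a list; it binds positionally to :1..:n
    conn.commit()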
I would like to be able to use my SQL table in Python.
I managed to connect to and access my table using the code below.
I would like to know if it is possible to retrieve the column names in my code.
I already have my columns:
ACTION | MANAGER
123    | XYZ
456    | ABC
PersonnesQuery = "SELECT........"
cursor = mydb.cursor()
cursor.execute(PersonnesQuery)
records_Personnes = cursor.fetchall()
for row in records_Personnes:
    if (row["MANAGER"] == "XYZ"):
        ..................
    elif (row["MANAGER"] == "ABC"):
        ................
The approach quoted above does not seem to work. It does work with a numeric index, for example row[0], but I would like to access fields by column name.
To access a field by key, in this case "MANAGER", row would need to be a dict. With the default cursor, .fetchall() returns each row as a plain sequence, so you can only access fields by index.
I'm assuming you're using the pymysql library, in which case there is a specific kind of cursor called a DictCursor. This will return the results as a dictionary rather than a list.
cursor = mydb.cursor(pymysql.cursors.DictCursor)
cursor.execute(PersonnesQuery)
records_Personnes = cursor.fetchall()
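A minimal sketch of how that fits together (the connection parameters are placeholders, not from the question):
import pymysql

mydb = pymysql.connect(host='localhost', user='user',
                       password='password', database='mydb')
cursor = mydb.cursor(pymysql.cursors.DictCursor)
cursor.execute(PersonnesQuery)
for row in cursor.fetchall():
    # each row is now a dict keyed by column name
    if row["MANAGER"] == "XYZ":
        ...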
I have a wide table in a sqlite3 database, and I wish to dynamically query certain columns in a Python script. I know that it's bad to inject parameters by string concatenation, so I tried to use parameter substitution instead.
I find that, when I use parameter substitution to supply a column name, I get unexpected results. A minimal example:
import sqlite3 as lite
db = lite.connect("mre.sqlite")
c = db.cursor()
# Insert some dummy rows
c.execute("CREATE TABLE trouble (value real)")
c.execute("INSERT INTO trouble (value) VALUES (2)")
c.execute("INSERT INTO trouble (value) VALUES (4)")
db.commit()
for row in c.execute("SELECT AVG(value) FROM trouble"):
print row # Returns 3
for row in c.execute("SELECT AVG(:name) FROM trouble", {"name" : "value"}):
print row # Returns 0
db.close()
Is there a better way to accomplish this than simply injecting a column name into a string and running it?
As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:
It comes down to what "safe" means. The conventional wisdom is that using normal Python string manipulation to put values into your queries is not "safe". This is because there are all sorts of things that can go wrong if you do that, and such data very often comes from the user and is not in your control. You need a 100% reliable way of escaping these values properly so that a user cannot inject SQL in a data value and have the database execute it. So the library writers do this job; you never should.

If, however, you're writing generic helper code to operate on things in databases, then these considerations don't apply as much. You are implicitly giving anyone who can call such code access to everything in the database; that's the point of the helper code. So now the safety concern is making sure that user-generated data can never be used in such code. This is a general security issue in coding, and is just the same problem as blindly exec'ing a user-input string. It's a distinct issue from inserting values into your queries, because there you want to be able to safely handle user-input data.
So, the solution is that there is no problem in the first place: inject the values using string formatting, be happy, and move on with your life.
Why not use string formatting?
for row in c.execute("SELECT AVG({name}) FROM trouble".format(**{"name" : "value"})):
print row # => (3.0,)
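If the column name can come from outside your own code, one reasonable precaution is to check it against the table's actual columns before formatting it in; a sketch using pragma table_info, with the trouble table from the question:
def safe_avg(cursor, column):
    # pragma table_info yields one row per column; index 1 is the column name
    valid = {row[1] for row in cursor.execute("pragma table_info(trouble)")}
    if column not in valid:
        raise ValueError("unknown column: %s" % column)
    return cursor.execute("SELECT AVG(%s) FROM trouble" % column).fetchone()[0]

print(safe_avg(c, "value"))  # => 3.0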
I am having troubles finding out if I can even do this. Basically, I have a csv file that looks like the following:
1111,804442232,1
1112,312908721,1
1113,A*2434,1
1114,A*512343128760987,1
1115,3512748,1
1116,1111,1
1117,1234,1
This is imported into an SQLite database in memory for manipulation. I will be importing multiple files into this database after some manipulation. SQLite allows me to keep constraints on the tables and receive errors where needed, without writing extra functions just to check each constraint while using arrays in Python. I want to do a few things, but the first is to prepend field2 wherever its value matches an entry in field1.
For example, in the above data, field2 of entry 6 matches field1 of entry 1. In this case I would like to prepend field2 in entry 6 with '555'.
If this is not possible, I do believe I could make do with a regex and just do this on every row with 4 digits in field2... though I have yet to successfully get a regex working with Python/SQLite, as it always throws an error.
I am working in Python, using the sqlite3 module to connect to and manipulate my SQLite database.
EDIT: I am looking for a method to manipulate the resultant tables which reside in a sqlite database rather than manipulating just the csv data. The data above is just a simple representation of what is contained in the files I am working with. Would it be better to work with arrays containing the data from the csv files? These files have 10,000+ entries and about 20-30 columns.
If you must do it in SQLite, how about this:
First, get the column names of the table by running the following and parsing the result
import sqlite3

def get_columns(table_name, cursor):
    cursor.execute('pragma table_info(%s)' % table_name)
    return [row[1] for row in cursor]

conn = sqlite3.connect('test.db')
columns = get_columns('test_table', conn.cursor())
For each of those columns, run the following update, which does the prepending:
def prepend(table, column, reference, prefix, cursor):
    query = '''
        UPDATE %s
        SET %s = '%s' || %s
        WHERE %s IN (SELECT %s FROM %s)
    ''' % (table, column, prefix, column, column, reference, table)
    cursor.execute(query)

reference = 'field1'
[prepend('test_table', column, reference, '555', conn.cursor())
 for column in columns
 if column != reference]
Note that this is expensive: O(n^2) for each column you want to do it for.
As per your edit and Nathan's answer, it might be better to simply work with python's builtin datastructures. You can always insert it into SQLite after.
10,000 entries is not really much so it might not matter in the end. It all depends on your reason for requiring it to be done in SQLite (which we don't have much visibility of).
There is no need to use regular expressions to do this; just throw the contents of the first column into a set and then iterate through the rows, updating the second field.
first_col_values = set(row[0] for row in rows)
for row in rows:
    if row[1] in first_col_values:
        row[1] = '555' + row[1]
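For context, a sketch of how that might look end to end with the csv module; the file names are illustrative:
import csv

with open('data.csv', 'r') as f:        # input file name is a stand-in
    rows = [row for row in csv.reader(f)]

first_col_values = set(row[0] for row in rows)
for row in rows:
    if row[1] in first_col_values:
        row[1] = '555' + row[1]

with open('data_out.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)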
So... I found the answer to my own question after a ridiculous amount of my own searching and trial and error. My unfamiliarity with SQL had me stumped as I was trying all kinds of crazy things. In the end... this was the simple type of solution I was looking for:
prefix = "555"
cur.execute("UPDATE table SET field2 = ? || field2 "
            "WHERE field2 IN (SELECT field1 FROM table)", (prefix,))
I kept the small amount of python in there but what I was looking for was the SQL statement. Not sure why nobody else came up with something that simple =/. Unsatisfied with the answers so far, I had been searching far and wide for this simple line >_<.
I am trying to select multiple columns, but not all of the columns, from the database. All of the columns I want to select are going to start with "word".
So in pseudocode I'd like to do this:
SELECT "word%" from searchterms where onstate = 1;
More or less. I am not finding any documentation on how to do this - is it possible in MySQL? Basically, I am trying to store a list of words in a single row, with an identifier, and I want to associate all of the words with that identifier when I pull the records. All of the words are going to be joined as a string and passed to another function in an array/dictionary with their identifier.
I am trying to make as FEW database calls as possible to keep the code fast.
Ok, here's another question for you guys:
There are going to be a variable number of columns with the name "word" in them. Would it be faster to do a separate database call for each row, with a generated Python query per row, or would it be faster to simply SELECT * and only use the columns I need? Is it possible to say SELECT * NOT XYZ?
No, SQL doesn't provide you with any syntax to do such a select.
What you can do is ask MySQL for a list of column names first, then generate the SQL query from that information.
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'your_table'
AND column_name LIKE 'word%'
lets you select the column names. Then you can do, in Python:
"SELECT {0} FROM your_table WHERE onstate = 1".format(', '.join(columns))
Instead of using string concatenation, I would recommend using SQLAlchemy instead to do the SQL generating for you.
However, if all you are doing is limiting the number of columns, there is no need for a dynamic query like this at all. The hard work for the database is selecting the rows; it makes little difference whether it sends you 5 columns out of 10, or all 10.
In that case just use a "SELECT * FROM ..." and use Python to pick out the columns from the result set.
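A sketch of that last approach, using cursor.description to pick out the word columns after a SELECT * (this attribute is part of the Python DB API, so it works with MySQLdb and pymysql alike):
cursor.execute("SELECT * FROM searchterms WHERE onstate = 1")
names = [d[0] for d in cursor.description]  # one 7-tuple per column; [0] is the name
wanted = [i for i, name in enumerate(names) if name.startswith('word')]
for row in cursor.fetchall():
    words = [row[i] for i in wanted]  # just the word columns, in table order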
No, you cannot dynamically produce the list of columns to be selected. It will have to be hardcoded in your final query.
Your current query would produce a result set with one column and the value of that column would be the string "word%" in all rows that satisfy the condition.
You can generate the list of column names first by using
SHOW COLUMNS IN tblname LIKE "word%"
Then loop through the cursor and generate a SQL statement that uses all the columns from the query above:
"SELECT {0} FROM searchterms WHERE onstate = 1".format(', '.join(columns))
This could be helpful: MySQL wildcard in select
In conclusion, it is not possible in MySQL directly.
What you could do as a workaround is get all the column names from the table with an initial query (http://dev.mysql.com/doc/refman/5.0/en/show-columns.html), compare the names against your pattern in Python, and then run a SELECT with the matching column names, like this:
SELECT word1, word2, word3 from searchterms where onstate = 1;