Elegant way to update multiple values for Sqlite3 Python - python

I have an Sqlite3 database called MYTABLE like this:
My objective is to update the values of COUNT column by simply adding the existing value with the new value.
I will receive two inputs:
First, a list of IDs whose COUNT values need updating.
For example: ['1','3','2','5']
Second, the amount to add to the COUNT of every ID in the list above.
So far, the best I can come up with is:
# my input1, the list of IDs that need updating
id_list = ['1','2','5','3']
# my input2, the value to be added to the existing count value
new_count = 3
# empty list to store the original count values before updating
original_counts = []
# iterate through input1 and retrieve the original count values
for item in id_list:
    cursor = conn.execute("SELECT COUNT from MYTABLE where ID=?", [item])
    for row in cursor:
        original_counts.append(row[0])
# iterate through input1 and update the count values
for i in range(len(id_list)):
    conn.execute("UPDATE MYTABLE set COUNT = ? where ID=?", [original_counts[i] + new_count, id_list[i]])
Is there a better, more elegant and more efficient way to achieve what I want?
UPDATE 1:
I tried this, based on N Reed's answer (not exactly the same), and it worked!
for item in mylist:
    conn.execute("UPDATE MYTABLE set VAL=VAL+? where ID=?", [new_count, item])
The takeaway for me is that we can update a value in sqlite3 based on its current value (which I didn't know).

You want to create a query that looks like this:
UPDATE MYTABLE set COUNT = COUNT + 5 where ID in (1, 2, 3, 4)
I don't know Python that well, but each ID needs its own ? placeholder, so you probably want something like:
placeholders = ",".join("?" for _ in mylist)
conn.execute("UPDATE MYTABLE set COUNT = COUNT + ? where ID in (%s)" % placeholders, [new_count] + list(mylist))
Keep in mind that SQLite limits how many bound parameters one statement can have (the default was 999 before version 3.32.0 and is 32766 since), so split very long ID lists into batches.
Also be very careful about SQL injection when you build queries by string formatting. Binding each ID to a ? placeholder avoids the problem; if you paste the IDs into the SQL text instead, make sure every item in mylist has been validated first.
I also recommend against naming a column 'count', since COUNT is also the name of an SQL aggregate function.
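To make the single-statement approach concrete, here is a minimal sketch. It assumes an in-memory database and made-up starting counts (neither comes from the question); only the table and column names follow the post.

```python
import sqlite3

# Hypothetical in-memory stand-in for MYTABLE with invented starting counts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (ID TEXT PRIMARY KEY, COUNT INTEGER)")
conn.executemany(
    "INSERT INTO MYTABLE (ID, COUNT) VALUES (?, ?)",
    [("1", 10), ("2", 20), ("3", 30), ("4", 40), ("5", 50)],
)

id_list = ["1", "3", "2", "5"]
new_count = 3

# One "?" per ID, so every value is bound instead of pasted into the SQL text.
placeholders = ",".join("?" for _ in id_list)
sql = "UPDATE MYTABLE SET COUNT = COUNT + ? WHERE ID IN (%s)" % placeholders
conn.execute(sql, [new_count] + id_list)
conn.commit()

rows = conn.execute("SELECT ID, COUNT FROM MYTABLE ORDER BY ID").fetchall()
print(rows)
```

Only the placeholder string is formatted into the SQL; the IDs themselves travel as bound parameters, so no escaping is needed.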

Related

Using two columns in an existing SQLite database to create a third column using Python

I have created a database with multiple columns and am wanting to use the data stored in two of the columns (named 'cost' and 'Mwe') to create a new column 'Dollar_per_KWh'. I have created two lists, one contains the rowid and the other contains the new value that I want to populate the new Dollar_per_KWh column. As it iterates through all the rows, the two lists are zipped together into a dictionary containing tuples. I then try to populate the new sqlite column. The code runs and I do not receive any errors. When I print out the dictionary it looks correct.
Issue: the new column in my database is not being updated with the new data and I am not sure why. The values in the new column are showing 'NULL'
Thank you for your help. Here is my code:
import sqlite3

conn = sqlite3.connect('nuclear_builds.sqlite')
cur = conn.cursor()
cur.execute('''ALTER TABLE Construction
               ADD COLUMN Dollar_per_KWh INTEGER''')
cur.execute('SELECT _rowid_, cost, Mwe FROM Construction')
data = cur.fetchall()
dol_pr_kW = dict()
key = list()
value = list()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    value.append(int((cost*10**6)/(MWe*10**3)))
    key.append(id)
    dol_pr_kW = list(zip(key, value))
cur.executemany('''UPDATE Construction SET Dollar_per_KWh = ? WHERE _rowid_ = ?''', (dol_pr_kW[1], dol_pr_kW[0]))
conn.commit()
I'm not sure why it isn't working. Have you tried just doing it all in SQL?
conn = sqlite3.connect('nuclear_builds.sqlite')
cur = conn.cursor()
cur.execute('''ALTER TABLE Construction
               ADD COLUMN Dollar_per_KWh INTEGER;''')
# multiply by 1000.0 before dividing so integer columns are not truncated by integer division
cur.execute('''UPDATE Construction SET Dollar_per_KWh = cast(cost * 1000.0 / MWe as integer);''')
conn.commit()
It's a lot simpler just doing the calculation in SQL than pulling data to Python, manipulating it, and pushing it back to the database.
If you need to do this in Python for some reason, testing whether this works will at least give you some hints as to what is going wrong with your current code.
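As a rough, self-contained check of the SQL-only route, the sketch below uses an in-memory database and two invented rows (both are assumptions, not data from the question). It multiplies by 1000.0 before dividing: if both columns held integers, SQLite would otherwise perform integer division and truncate 7/2 to 3 before the multiplication.

```python
import sqlite3

# Hypothetical stand-in for nuclear_builds.sqlite with two invented rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Construction (cost INTEGER, Mwe INTEGER)")
cur.executemany("INSERT INTO Construction VALUES (?, ?)", [(5000, 1000), (7, 2)])

cur.execute("ALTER TABLE Construction ADD COLUMN Dollar_per_KWh INTEGER")
# 1000.0 forces floating-point arithmetic, so 7 * 1000.0 / 2 gives 3500.0
cur.execute(
    "UPDATE Construction "
    "SET Dollar_per_KWh = CAST(cost * 1000.0 / Mwe AS INTEGER)"
)
conn.commit()

result = cur.execute(
    "SELECT cost, Mwe, Dollar_per_KWh FROM Construction ORDER BY _rowid_"
).fetchall()
print(result)
```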
Update: I see a few more problems now.
First I see you are creating an empty dictionary dol_pr_kW before the for loop. This isn't necessary as you are re-defining it as a list later anyway.
Then you are trying to create the list dol_pr_kW inside the for loop. This has the effect of over-writing it for each row in data.
I'll give a few different ways to solve it. It looks like you were trying several things at once (using a dict and a list, building two lists and zipping them into a third, etc.), which adds to your trouble, so I am simplifying the code to make it easier to understand. In each solution I will create a list called data_to_insert. That is what you will pass to executemany at the end.
The first option is to create your list before the for loop, then append to it for each row.
dol_pr_kW = list()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    val = int((cost*10**6)/(MWe*10**3))
    dol_pr_kW.append((id, val))  # append takes a single argument, so pass one tuple
# you can do this, or instead change the step above to dol_pr_kW.append((val, id))
data_to_insert = [(r[1], r[0]) for r in dol_pr_kW]
The second way would be to zip the key and value lists AFTER the for loop.
key = list()
value = list()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    value.append(int((cost*10**6)/(MWe*10**3)))
    key.append(id)
dol_pr_kW = list(zip(key, value))
# you can do this, or instead change the step above to dol_pr_kW = list(zip(value, key))
data_to_insert = [(r[1], r[0]) for r in dol_pr_kW]
Third, if you would rather keep it as an actual dict you can do this.
dol_pr_kW = dict()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    val = int((cost*10**6)/(MWe*10**3))
    dol_pr_kW[id] = val
# convert to a list of (value, id) tuples
data_to_insert = [(dol_pr_kW[id], id) for id in dol_pr_kW]
Then, to execute, call
cur.executemany('''UPDATE Construction SET Dollar_per_KWh = ? WHERE _rowid_ = ?''', data_to_insert)
conn.commit()
I prefer the first option since it's easiest for me to understand what's happening at a glance. Each iteration of the for loop just adds an (id, val) tuple to the end of the list. It's a little more cumbersome to build two lists independently and zip them together to get a third list.
Also note that even if the dol_pr_kW list had been created correctly, passing (dol_pr_kW[1], dol_pr_kW[0]) to executemany would pass the first two rows of the list (in swapped order) instead of turning each (key, value) into (value, key). You need a list comprehension to accomplish that swap in one line of code. I did this on a separate line and assigned it to the variable data_to_insert for readability.
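Putting the pieces together, the Python route might look like the sketch below. The in-memory database and the two sample rows are assumptions made for illustration; the table, column names and formula follow the question.

```python
import sqlite3

# Hypothetical stand-in for nuclear_builds.sqlite with two invented rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE Construction (cost REAL, Mwe REAL, Dollar_per_KWh INTEGER)"
)
cur.executemany(
    "INSERT INTO Construction (cost, Mwe) VALUES (?, ?)",
    [(5000, 1000), (7, 2)],
)

cur.execute("SELECT _rowid_, cost, Mwe FROM Construction")
# Build (value, rowid) tuples directly, in the order the UPDATE expects.
data_to_insert = [
    (int((cost * 10**6) / (mwe * 10**3)), rowid)
    for rowid, cost, mwe in cur.fetchall()
]

cur.executemany(
    "UPDATE Construction SET Dollar_per_KWh = ? WHERE _rowid_ = ?",
    data_to_insert,
)
conn.commit()  # commit is a method of the connection, not the cursor

updated = cur.execute(
    "SELECT _rowid_, Dollar_per_KWh FROM Construction ORDER BY _rowid_"
).fetchall()
print(updated)
```

Each tuple in data_to_insert supplies one set of parameters, so executemany runs the UPDATE once per row.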

Why is `for...in` returning a tuple when trying to iterate through rows returned by query?

I select 1 column from a table in a database. I want to iterate through each of the results. Why is it when I do this it’s a tuple instead of a single value?
con = psycopg2.connect(…)
cur = con.cursor()
stmt = "SELECT DISTINCT inventory_pkg FROM {}.{} WHERE inventory_pkg IS NOT NULL;".format(schema, tableName)
cur.execute(stmt)
con.commit()
referenced = cur.fetchall()
for destTbl in referenced:  # why is destTbl a single element tuple?
    print('destTbl: '+str(referenced))
    stmt = "SELECT attr_name, attr_rule FROM {}.{} WHERE ppm_table_name = {};".format(schema, tableName, destTbl)  # this fails because the where clause gets messed up because 'destTbl' has a comma after it
    cur.execute(stmt)
Because that's what the db api does: always returns a tuple for each row in the result.
It's pretty simple to refer to destTbl[0] wherever you need to.
Because you are getting rows from your database, and the API is being consistent.
If your query asked for * columns, or a specific number of columns that is greater than 1, you'd also need a tuple or list to hold those columns for each row.
In other words, just because you only have one column in this query doesn't mean the API suddenly will change what kind of object it returns to model a row.
Simply always treat a row as a sequence and use indexing or tuple assignment to get a specific value out. Use:
inventory_pkg = destTbl[0]
or
inventory_pkg, = destTbl
for example.
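The same row-as-tuple behaviour can be seen with the standard-library sqlite3 module, which is used here only as a stand-in for psycopg2 so the sketch runs without a server. The table names and data are invented. The loop unpacks each one-element row and passes the value as a bound parameter rather than formatting it into the SQL (sqlite3 uses ? as the placeholder; psycopg2 uses %s).

```python
import sqlite3

# Hypothetical tables and rows, for illustration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE links (inventory_pkg TEXT)")
con.execute(
    "CREATE TABLE rules (ppm_table_name TEXT, attr_name TEXT, attr_rule TEXT)"
)
con.executemany(
    "INSERT INTO links VALUES (?)",
    [("pkg_a",), ("pkg_b",), ("pkg_a",), (None,)],
)
con.executemany(
    "INSERT INTO rules VALUES (?, ?, ?)",
    [("pkg_a", "size", "gt 0"), ("pkg_b", "color", "not null")],
)

cur = con.cursor()
cur.execute(
    "SELECT DISTINCT inventory_pkg FROM links "
    "WHERE inventory_pkg IS NOT NULL ORDER BY inventory_pkg"
)
referenced = cur.fetchall()  # every row is a tuple, even with one column

found = {}
for (dest_tbl,) in referenced:  # tuple unpacking pulls out the single value
    cur.execute(
        "SELECT attr_name, attr_rule FROM rules WHERE ppm_table_name = ?",
        (dest_tbl,),
    )
    found[dest_tbl] = cur.fetchall()

print(referenced)
print(found)
```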

Obtaining data from PostgreSQL as Dictionary

I have a database table with multiple fields which I am querying and pulling out all data which meets certain parameters. I am using psycopg2 for python with the following syntax:
cur.execute("SELECT * FROM failed_inserts where insertid='%s' AND site_failure=True"%import_id)
failed_sites= cur.fetchall()
This returns the correct values as a list with the data's integrity and order maintained. However I want to query the list returned somewhere else in my application and I only have this list of values, i.e. it is not a dictionary with the fields as the keys for these values. Rather than having to do
desiredValue = failed_sites[13]  # where 13 is an arbitrary index for desiredValue
I want to be able to query by the field name like:
desiredValue = failed_sites[fieldName]  # where fieldName is the name of the field I am looking for
Is there a simple and efficient way to do this?
Thank you!
cursor.description will give you the column information (http://www.python.org/dev/peps/pep-0249/#cursor-objects). You can get the column names from it and use them to create one dictionary per row.
cursor.execute('SELECT ...')
# the first item of each cursor.description entry is the column name
columns = [column[0].lower() for column in cursor.description]
failed_sites = []
for row in cursor:
    record = {}
    for name, value in zip(columns, row):
        # strip surrounding whitespace from string values
        record[name] = value.strip() if isinstance(value, str) else value
    failed_sites.append(record)
The "dictionary-like cursor" (DictCursor), part of psycopg2.extras, seems to be what you're looking for.
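The cursor.description technique is part of the DB-API, so it can be tried without a PostgreSQL server. The sketch below uses the standard-library sqlite3 module as a stand-in; the table layout and rows are invented for illustration.

```python
import sqlite3

# Hypothetical table and rows; sqlite3 stands in for psycopg2 here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE failed_inserts (insertid TEXT, site TEXT, site_failure INTEGER)"
)
conn.executemany(
    "INSERT INTO failed_inserts VALUES (?, ?, ?)",
    [("a1", "north", 1), ("a1", "south", 0), ("b2", "east", 1)],
)

cursor = conn.execute(
    "SELECT * FROM failed_inserts WHERE insertid = ? AND site_failure = 1",
    ("a1",),
)
# The first item of each description entry is the column name.
columns = [entry[0].lower() for entry in cursor.description]
# One dict per row, keyed by column name.
failed_sites = [dict(zip(columns, row)) for row in cursor]

print(failed_sites[0]["site"])
```

With psycopg2 itself, passing a cursor class from psycopg2.extras (such as RealDictCursor) as the cursor_factory argument should give dictionary rows without this manual step.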

Delete first X rows of MySQL table once the number of rows is greater than N

I have an insert-only table in MySQL named word. Once the number of rows exceeds 1000000, I would like to delete the first 100000 rows of the table.
I am using MySQLdb in Python, so I keep a global variable:
wordcount = cursor.execute("select * from word")
which returns the number of rows in the table. I then increment wordcount by 1 every time I insert a new row and check whether the number of rows is greater than 1000000; if it is, I want to delete the first 100000 rows:
if wordcount > 1000000:
    cursor.execute("delete from word limit 100000")
I got this idea from this thread:
Delete first X lines of a database
However, this SQL ends up deleting my ENTIRE table. What am I missing here?
Thank you.
I don't think that's the right way to get the number of rows. Change your statement to select count(*) and then use cursor.fetchone() to get a tuple of the results; its first element (something like wordcount = cursor.fetchone()[0]) will hold the correct row count.
Your delete statement looks right. Maybe you have explicit transactions? In that case you'd have to call commit() on your db object after the delete.
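The count-then-trim pattern can be sketched as follows. This uses the standard-library sqlite3 module as a stand-in for MySQLdb, with tiny thresholds and an invented id column so it runs anywhere. Because a default SQLite build does not accept DELETE ... LIMIT, the delete is written with a subquery; that form is an artefact of the stand-in, not a recommendation for MySQL.

```python
import sqlite3

MAX_ROWS = 10      # hypothetical threshold (1000000 in the question)
DELETE_BATCH = 4   # hypothetical batch size (100000 in the question)

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE word (id INTEGER PRIMARY KEY AUTOINCREMENT, term TEXT)"
)
cursor.executemany(
    "INSERT INTO word (term) VALUES (?)",
    [("w%d" % i,) for i in range(12)],
)

# count(*) returns a single row; its first element is the row count.
cursor.execute("SELECT COUNT(*) FROM word")
wordcount = cursor.fetchone()[0]

if wordcount > MAX_ROWS:
    # Delete the oldest rows, identified by the smallest ids.
    cursor.execute(
        "DELETE FROM word WHERE id IN "
        "(SELECT id FROM word ORDER BY id LIMIT ?)",
        (DELETE_BATCH,),
    )
    conn.commit()  # make the delete permanent

remaining = cursor.execute("SELECT COUNT(*), MIN(id) FROM word").fetchone()
print(wordcount, remaining)
```

Ordering by id makes "first rows" explicit; without an ORDER BY the database is free to pick any rows.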
If your table "word" has an ID field (an auto_increment key), you can write a stored procedure that deletes the first 100000 rows. The key part of the stored procedure is:
drop temporary table if exists tt_ids;
create temporary table tt_ids (id int not null);
insert into tt_ids -- taking first 100000 rows
select id from word
order by ID
limit 100000;
delete w
from word w
join tt_ids ids on w.ID = ids.ID;
drop temporary table if exists tt_ids;
You may also build an index on the ID field of tt_ids to speed up the query.

How to store two different values returned from query into list data types to be used later(plpy python)

I need to store two values, "id" and "name", returned from an SQL query in a variable that I can use later. Can I use a list for this purpose? I want to fetch the values from SQL once and afterwards refer only to the stored values. I was able to do this with a single value (id), but now I need to store id and name together. The purpose is to do a string comparison and, based on it, assign the corresponding id.
For example, first I tried to retrieve data from the database with
rv = plpy.execute("select id, name from aa")
Now I need to store these two values in two variables, say id in storevalueID and name in storevalueName, so that later I can do things like
if someXname = Replace(storeValueName("hello","")) then
assign its corresponding id to some variable, like xID = storevalueID
I am not sure if we can do this, but I need to do something like this.
Any help will be appreciated.
I'm not sure I understand your question completely. But if you were previously storing a list of "id"s:
mylist = []
mylist.append(id1) # or however you get your id values
mylist.append(id2)
# ..
so mylist is something like [1, 2, 3], then you can simply use tuples to store more than one element that are associated together:
mylist = []
mylist.append( (id1, name1) )
mylist.append( (id2, name2) )
# ..
Now mylist is something like [ (1, 'Bob'), (2, 'Alice'), (3, 'Carol')]. You can perform string comparisons on the second element of each tuple in your list:
mylist[0][1] == 'Bob' # True
mylist[1][1] == 'Alice' # True
Update: I just saw the updated question. In plpy, you should be able to access the values by column name, like this:
for row in rv:
    the_id = row['id']
    name = row['name']
See this page for more information.
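Since plpy is only available inside a PostgreSQL function, the sketch below simulates its result with a plain list of dicts (an assumption; the rows are invented). The helper find_id is hypothetical and shows one way to store (id, name) pairs once and look up an id by comparing cleaned-up names.

```python
# Simulated stand-in for rv = plpy.execute("select id, name from aa");
# rows are accessed by column name, as in the answer above.
rv = [
    {"id": 1, "name": "hello Bob"},
    {"id": 2, "name": "Alice"},
    {"id": 3, "name": "Carol"},
]

# Store both values together, once, as a list of (id, name) tuples.
pairs = [(row["id"], row["name"]) for row in rv]


def find_id(some_name, pairs):
    """Hypothetical helper: return the id whose name matches some_name
    after removing the text "hello", or None when nothing matches."""
    for the_id, name in pairs:
        if name.replace("hello", "").strip() == some_name:
            return the_id
    return None


print(find_id("Bob", pairs))
print(find_id("Dave", pairs))
```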
