Obtaining data from PostgreSQL as Dictionary - python

I have a database table with multiple fields which I am querying and pulling out all data which meets certain parameters. I am using psycopg2 for python with the following syntax:
cur.execute("SELECT * FROM failed_inserts WHERE insertid = %s AND site_failure = True", (import_id,))
failed_sites= cur.fetchall()
This returns the correct values as a list with the data's integrity and order maintained. However I want to query the list returned somewhere else in my application and I only have this list of values, i.e. it is not a dictionary with the fields as the keys for these values. Rather than having to do
desiredValue = failed_sites[13]  # where 13 is an arbitrary index for desiredValue
I want to be able to query by the field name like:
desiredValue = failed_sites[fieldName]  # where fieldName is the name of the field I am looking for
Is there a simple and efficient way to do this?
Thank you!

cursor.description will give you the column information (http://www.python.org/dev/peps/pep-0249/#cursor-objects). You can get the column names from it and use them to build one dictionary per row.
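cursor.execute('SELECT ...')
# The first item of each description entry is the column name.
columns = [column[0].lower() for column in cursor.description]
failed_sites = []
for row in cursor:
    record = {}
    for name, value in zip(columns, row):
        # Strip padding from text values (str replaces Python 2's basestring).
        record[name] = value.strip() if isinstance(value, str) else value
    # Collect one dict per row instead of overwriting a single dict.
    failed_sites.append(record)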
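To try the cursor.description approach without a PostgreSQL server, here is a minimal sketch that uses the standard-library sqlite3 module, which follows the same DB-API. The helper name rows_as_dicts, the table layout and the sample rows are made up for illustration; with psycopg2 the placeholder style would be %s instead of ?.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Turn the remaining rows of an executed DB-API cursor into dicts."""
    # cursor.description holds one entry per column; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


# Hypothetical stand-in for the failed_inserts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE failed_inserts (insertid TEXT, site TEXT, site_failure INTEGER)"
)
conn.executemany(
    "INSERT INTO failed_inserts VALUES (?, ?, ?)",
    [("a1", "north", 1), ("a1", "south", 0), ("b2", "east", 1)],
)

cur = conn.cursor()
cur.execute(
    "SELECT * FROM failed_inserts WHERE insertid = ? AND site_failure = 1",
    ("a1",),
)
failed_sites = rows_as_dicts(cur)

# Each row can now be queried by field name instead of by position.
print(failed_sites[0]["site"])
```

Note that the result is a list of dictionaries, one per row, so a field is reached as failed_sites[row_index][fieldName].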

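The "Dictionary-like cursor", part of psycopg2.extras, seems to be what you're looking for: pass cursor_factory=psycopg2.extras.DictCursor (or RealDictCursor) when you call conn.cursor().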

Related

Using two columns in an existing SQLite database to create a third column using Python

I have created a database with multiple columns and want to use the data stored in two of the columns (named 'cost' and 'Mwe') to create a new column 'Dollar_per_KWh'. I have created two lists: one contains the rowid and the other contains the new value that I want to put in the new Dollar_per_KWh column. As the code iterates through all the rows, the two lists are zipped together into a dictionary containing tuples. I then try to populate the new SQLite column. The code runs and I do not receive any errors. When I print out the dictionary it looks correct.
Issue: the new column in my database is not being updated with the new data and I am not sure why. The values in the new column are showing 'NULL'
Thank you for your help. Here is my code:
conn = sqlite3.connect('nuclear_builds.sqlite')
cur = conn.cursor()
cur.execute('''ALTER TABLE Construction
               ADD COLUMN Dollar_per_KWh INTEGER''')
cur.execute('SELECT _rowid_, cost, Mwe FROM Construction')
data = cur.fetchall()
dol_pr_kW = dict()
key = list()
value = list()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    value.append(int((cost*10**6)/(MWe*10**3)))
    key.append(id)
    dol_pr_kW = list(zip(key, value))
cur.executemany('''UPDATE Construction SET Dollar_per_KWh = ? WHERE _rowid_ = ?''', (dol_pr_kW[1], dol_pr_kW[0]))
conn.commit()
Not sure why it isn't working. Have you tried just doing it all in SQL?
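conn = sqlite3.connect('nuclear_builds.sqlite')
cur = conn.cursor()
cur.execute('''ALTER TABLE Construction
               ADD COLUMN Dollar_per_KWh INTEGER;''')
# (cost*10**6)/(MWe*10**3) reduces to (cost/MWe)*1000
cur.execute('''UPDATE Construction SET Dollar_per_KWh = cast((cost/MWe)*1000 as integer);''')
conn.commit()  # without a commit the UPDATE is lost when the connection closes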
It's a lot simpler just doing the calculation in SQL than pulling data to Python, manipulating it, and pushing it back to the database.
If you need to do this in Python for some reason, testing whether this works will at least give you some hints as to what is going wrong with your current code.
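As a rough check of the SQL-only route, the sketch below runs it against an in-memory database. The table contents and the INTEGER column types are assumptions made for the example. One caveat it illustrates: if both columns are stored as integers, SQLite's cost/MWe is an integer division, so multiplying by 1000.0 before dividing keeps the result in line with the Python calculation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical sample data; the real table lives in nuclear_builds.sqlite.
cur.execute("CREATE TABLE Construction (name TEXT, cost INTEGER, MWe INTEGER)")
cur.executemany(
    "INSERT INTO Construction VALUES (?, ?, ?)",
    [("Plant A", 5000, 1000), ("Plant B", 9000, 1200)],
)

cur.execute("ALTER TABLE Construction ADD COLUMN Dollar_per_KWh INTEGER")

# Multiply by 1000.0 first so the division is done in floating point.
cur.execute(
    "UPDATE Construction SET Dollar_per_KWh = CAST(cost * 1000.0 / MWe AS INTEGER)"
)
conn.commit()

result = cur.execute(
    "SELECT name, Dollar_per_KWh FROM Construction ORDER BY name"
).fetchall()

# For comparison: integer division truncates 9000/1200 to 7 before scaling.
truncated = cur.execute(
    "SELECT CAST((cost / MWe) * 1000 AS INTEGER) FROM Construction ORDER BY name"
).fetchall()

print(result)
print(truncated)
```

If cost and MWe are stored as REAL, both expressions give the same values and the original form is fine.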
Update: I see a few more problems now.
First I see you are creating an empty dictionary dol_pr_kW before the for loop. This isn't necessary as you are re-defining it as a list later anyway.
Then you are trying to create the list dol_pr_kW inside the for loop. This has the effect of over-writing it for each row in data.
I'll give a few different ways to solve it. It looks like you were trying several things at once (using a dict and a list, building two lists and zipping them into a third, etc.), which is adding to your trouble, so I am simplifying the code to make it easier to understand. In each solution I will create a list called data_to_insert. That is what you will pass to the executemany function at the end.
The first option is to create your list before the for loop, then append to it for each row.
dol_pr_kW = list()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    val = int((cost*10**6)/(MWe*10**3))
    dol_pr_kW.append((id, val))  # append takes one argument, so pass a single tuple
# you can do this or instead change the step above to dol_pr_kW.append((val, id))
data_to_insert = [(r[1], r[0]) for r in dol_pr_kW]
The second way would be to zip the key and value lists AFTER the for loop.
key = list()
value = list()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    value.append(int((cost*10**6)/(MWe*10**3)))
    key.append(id)
dol_pr_kW = list(zip(key, value))
# you can do this or instead change the step above to dol_pr_kW = list(zip(value, key))
data_to_insert = [(r[1], r[0]) for r in dol_pr_kW]
Third, if you would rather keep it as an actual dict you can do this.
dol_pr_kW = dict()
for row in data:
    id = row[0]
    cost = row[1]
    MWe = row[2]
    val = int((cost*10**6)/(MWe*10**3))
    dol_pr_kW[id] = val
# convert to a list of (value, rowid) tuples
data_to_insert = [(dol_pr_kW[id], id) for id in dol_pr_kW]
Then, to execute, call:
cur.executemany('''UPDATE Construction SET Dollar_per_KWh = ? WHERE _rowid_ = ?''', data_to_insert)
conn.commit()  # commit is a method of the connection, not the cursor
I prefer the first option since it's easiest for me to understand what's happening at a glance. Each iteration of the for loop just adds an (id, val) tuple to the end of the list. It's a little more cumbersome to build two lists independently and zip them together to get a third list.
Also note that if the dol_pr_kW list had been created correctly, passing (dol_pr_kW[1],dol_pr_kW[0]) to executemany would pass the first two rows in the list instead of reversing (key,value) to (value,key). You need to do a list comprehension to accomplish the swap in one line of code. I just did this as a separate line and assigned it to variable data_to_insert for readability.
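Putting the pieces together, here is a small end-to-end sketch of the executemany route on an in-memory database. The sample rows and the REAL column types are assumptions; the point is only that executemany expects a sequence of (value, rowid) tuples whose order matches the two ? placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-in for the Construction table.
cur.execute(
    "CREATE TABLE Construction (cost REAL, MWe REAL, Dollar_per_KWh INTEGER)"
)
cur.executemany(
    "INSERT INTO Construction (cost, MWe) VALUES (?, ?)",
    [(5000, 1000), (9000, 1200)],
)

data = cur.execute("SELECT _rowid_, cost, MWe FROM Construction").fetchall()

# One (value, rowid) tuple per row, in the same order as the placeholders below.
data_to_insert = [
    (int((cost * 10**6) / (MWe * 10**3)), rowid) for rowid, cost, MWe in data
]

cur.executemany(
    "UPDATE Construction SET Dollar_per_KWh = ? WHERE _rowid_ = ?",
    data_to_insert,
)
conn.commit()

updated = cur.execute(
    "SELECT _rowid_, Dollar_per_KWh FROM Construction ORDER BY _rowid_"
).fetchall()
print(updated)
```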

Insert dictionary into SQLite3

I want to insert data from a dictionary into a SQLite table, and I am using SQLAlchemy to do that. The keys in the dictionary are the same as the column names, and I want each value to go into the column with the matching name. This is my code:
# This is the class from which I create the table with SQLAlchemy, and the table
# I want to insert my data into.
# I didn't write the __init__ for simplicity
class Sizecurve(Base):
    __tablename__ = 'sizecurve'
    XS = Column(String(5))
    S = Column(String(5))
    M = Column(String(5))
    L = Column(String(5))
    XL = Column(String(5))
    XXL = Column(String(5))
o = Mapping()  # This creates an object which is actually a dictionary
for eachitem in myitems:
    # Here I populate the dictionary with keys from another list
    # This gives me a dictionary looking like this: o={'S':None, 'M':None, 'L':None}
    o[eachitem] = None
for eachsize in mysizes:
    # Here I assign values to each key of the dictionary, if a value exists; if not, just None
    # product_row is a class and size and stock are its attributes
    if(product_row.size in o):
        o[product_row.size] = product_row.stock
# I put the final object into a list
simplelist.append(o)
Now I want to put the values from each dictionary saved in simplelist into the right columns of the sizecurve table, but I am stuck and don't know how to do that. For example, I have an object like this:
o= {'S':4, 'M':2, 'L':1}
And I want the resulting row to have the value 4 in column S, 2 in column M, and so on.
Yes, it's possible (though aren't you missing primary keys/foreign keys on this table?).
session.add(Sizecurve(**o))
session.commit()
That should insert the row.
http://docs.sqlalchemy.org/en/latest/core/tutorial.html#executing-multiple-statements
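For comparison, the same key-to-column mapping can be sketched without SQLAlchemy, using the standard-library sqlite3 module and named placeholders. This swaps in a different technique from the session.add(Sizecurve(**o)) call above. The helper insert_sizes, the INTEGER column types and the sample dictionaries are assumptions made for the example.

```python
import sqlite3

# Columns of the (hypothetical) sizecurve table; INTEGER types are an assumption.
SIZE_COLUMNS = ("XS", "S", "M", "L", "XL", "XXL")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sizecurve ({})".format(
        ", ".join(name + " INTEGER" for name in SIZE_COLUMNS)
    )
)


def insert_sizes(conn, o):
    """Insert one dict whose keys are column names of sizecurve."""
    unknown = set(o) - set(SIZE_COLUMNS)
    if unknown or not o:
        # Column names end up in the SQL text, so only known names are allowed.
        raise ValueError("unexpected or missing keys: {!r}".format(sorted(unknown)))
    columns = [name for name in SIZE_COLUMNS if name in o]
    sql = "INSERT INTO sizecurve ({}) VALUES ({})".format(
        ", ".join(columns),
        ", ".join(":" + name for name in columns),
    )
    # The dict supplies the value for each :name placeholder.
    conn.execute(sql, o)


simplelist = [{"S": 4, "M": 2, "L": 1}, {"XS": 7, "XL": 3}]
for o in simplelist:
    insert_sizes(conn, o)
conn.commit()

rows = conn.execute(
    "SELECT XS, S, M, L, XL, XXL FROM sizecurve ORDER BY rowid"
).fetchall()
print(rows)
```

Columns that are missing from a dictionary are simply left as NULL.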
EDIT: On second read it seems like you are trying to insert all those values into one column? If so, I would make use of pickle.
https://docs.python.org/3.5/library/pickle.html
If performance is an issue (pickle is pretty fast, but if you're doing 10000 reads per second it'll be the bottleneck), you should either redesign the table or use a database like PostgreSQL that supports JSON objects.
I have found this answer to a similar question, although it is about reading the data from a JSON file. I am now working on understanding the code and on changing my data type to JSON so that I can insert the values in the right place.
Convert JSON to SQLite in Python - How to map json keys to database columns properly?

Why is `for...in` returning a tuple when trying to iterate through rows returned by query?

I select 1 column from a table in a database. I want to iterate through each of the results. Why is it when I do this it’s a tuple instead of a single value?
con = psycopg2.connect(…)
cur = con.cursor()
stmt = "SELECT DISTINCT inventory_pkg FROM {}.{} WHERE inventory_pkg IS NOT NULL;".format(schema, tableName)
cur.execute(stmt)
con.commit()
referenced = cur.fetchall()
for destTbl in referenced:  # why is destTbl a single element tuple?
    print('destTbl: '+str(referenced))
    stmt = "SELECT attr_name, attr_rule FROM {}.{} WHERE ppm_table_name = {};".format(schema, tableName, destTbl)  # this fails because the where clause gets messed up because 'destTbl' has a comma after it
    cur.execute(stmt)
Because that's what the db api does: always returns a tuple for each row in the result.
It's pretty simple to refer to destTbl[0] wherever you need to.
Because you are getting rows from your database, and the API is being consistent.
If your query asked for * columns, or a specific number of columns that is greater than 1, you'd also need a tuple or list to hold those columns for each row.
In other words, just because you only have one column in this query doesn't mean the API suddenly will change what kind of object it returns to model a row.
Simply always treat a row as a sequence and use indexing or tuple assignment to get a specific value out. Use:
inventory_pkg = destTbl[0]
or
inventory_pkg, = destTbl
for example.
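The behaviour is easy to reproduce with the standard-library sqlite3 module, which follows the same DB-API conventions as psycopg2. The table and its contents below are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (inventory_pkg TEXT)")
conn.executemany(
    "INSERT INTO packages VALUES (?)",
    [("alpha",), ("beta",), ("alpha",)],
)

referenced = conn.execute(
    "SELECT DISTINCT inventory_pkg FROM packages ORDER BY inventory_pkg"
).fetchall()

# Every row is a tuple, even though only one column was selected.
print(referenced)

# Either index into the row ...
names_by_index = [row[0] for row in referenced]

# ... or unpack the one-element tuple directly in the loop target.
names_by_unpacking = [inventory_pkg for (inventory_pkg,) in referenced]
```

Once the bare value is extracted, it can be passed to the next query as a bound parameter rather than being formatted into the SQL string.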

Elegant way to update multiple values for Sqlite3 Python

I have an Sqlite3 database called MYTABLE like this:
My objective is to update the values of the COUNT column by adding a new value to the existing one.
I will receive two inputs:
First, a list of IDs whose COUNT column I need to update, for example: ['1','3','2','5']
Second, the count to be added to every ID in the list above.
So far, the best I can come up with is:
# my input1, the list of IDs that need updating
id_list = ['1','2','5','3']
# my input2, the value to be added to the existing count value
new_count = 3
# empty list to store the original count values before updating
original_counts = []
# iterate through the input1 and retrieve the original count values
for item in id_list:
    cursor = conn.execute("SELECT COUNT from MYTABLE where ID=?", [item])
    for row in cursor:
        original_counts.append(row[0])
# iterate through the input1 and update the count values
for i in range(len(id_list)):
    conn.execute("UPDATE MYTABLE set COUNT = ? where ID=?", [original_counts[i] + new_count, id_list[i]])
Is there a better, more elegant and more efficient way to achieve what I want?
UPDATE 1:
Based on N Reed's answer (though not exactly the same), I tried this and it worked!
for item in mylist:
    conn.execute("UPDATE MYTABLE set VAL=VAL+? where ID=?",[new_count,item])
The takeaway for me is that we can update a value in sqlite3 based on its current value (which I didn't know).
You want to create a query that looks like this:
UPDATE MYTABLE set COUNT = COUNT + 5 where ID in (1, 2, 3, 4)
I don't know Python that well, but since a single ? binds only one value, you probably want one placeholder per ID, something like:
placeholders = ",".join("?" for _ in mylist)
conn.execute("UPDATE MYTABLE set COUNT = COUNT + ? where ID in (%s)" % placeholders, [new_count] + mylist)
Keep in mind that SQLite limits the number of bound parameters in one statement (I think the default was 999 in older versions).
Also be very careful about SQL injection when you build queries from strings. In the code above only the ? placeholders are interpolated into the SQL text; the IDs themselves are passed as parameters, so they do not need to be escaped.
I also recommend against having a column called 'count' as it is a keyword in sql.
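A self-contained sketch of the single-statement update, run against an in-memory database, might look like this. The table definition and starting values are made up, and the column name is double-quoted here only as a precaution because it matches the name of an SQL function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical MYTABLE with IDs '1'..'5' and counts 10..50.
conn.execute('CREATE TABLE MYTABLE (ID TEXT PRIMARY KEY, "COUNT" INTEGER)')
conn.executemany(
    "INSERT INTO MYTABLE VALUES (?, ?)",
    [(str(i), 10 * i) for i in range(1, 6)],
)

id_list = ["1", "3", "2", "5"]
new_count = 3

# One "?" per ID; only placeholders are formatted into the SQL text.
placeholders = ", ".join("?" for _ in id_list)
sql = 'UPDATE MYTABLE SET "COUNT" = "COUNT" + ? WHERE ID IN ({})'.format(placeholders)

# The increment binds to the first "?", the IDs to the rest.
conn.execute(sql, [new_count] + id_list)
conn.commit()

counts = dict(conn.execute('SELECT ID, "COUNT" FROM MYTABLE'))
print(counts)
```

For very long ID lists, the list could be split into chunks so that each statement stays under the bound-parameter limit.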

SELECT * in SQLAlchemy?

Is it possible to do SELECT * in SQLAlchemy?
Specifically, SELECT * WHERE foo=1?
Is no one feeling the ORM love of SQLAlchemy today? The presented answers correctly describe the lower-level interface that SQLAlchemy provides. Just for completeness, this is the more-likely (for me) real-world situation where you have a session instance and a User class that is ORM mapped to the users table.
for user in session.query(User).filter_by(name='jack'):
    print(user)
    # ...
And this does an explicit select on all columns.
The following selection works for me in the core expression language (returning a RowProxy object):
foo_col = sqlalchemy.sql.column('foo')
s = sqlalchemy.sql.select(['*']).where(foo_col == 1)
If you don't list any columns, you get all of them.
query = users.select()
query = query.where(users.c.name=='jack')
result = conn.execute(query)
for row in result:
    print(row)
Should work.
You can always use raw SQL too:
str_sql = sql.text("YOUR STRING SQL")
# if you have some args:
args = {
    'myarg1': yourarg1,
    'myarg2': yourarg2}
# then call the execute method from your connection
results = conn.execute(str_sql, args).fetchall()
Where Bar is the class mapped to your table and session is your sa session:
bars = session.query(Bar).filter(Bar.foo == 1)
Turns out you can do:
sa.select('*', ...)
I had the same issue, I was trying to get all columns from a table as a list instead of getting ORM objects back. So that I can convert that list to pandas dataframe and display.
What works is to use .c on a subquery or cte as follows:
U = select(User).cte('U')
stmt = select(*U.c)
rows = session.execute(stmt)
Then you get a list of tuples with each column.
Another option is to use __table__.columns in the same way:
stmt = select(*User.__table__.columns)
rows = session.execute(stmt)
In case you want to convert the results to dataframe here is the one liner:
pd.DataFrame.from_records(rows, columns=rows.keys())
For joins, if the columns are not listed manually, only the columns of the target table are returned. To get all columns for a join (here the User table joined with the Group table):
sql = User.select(from_obj(Group, User.c.group_id == Group.c.id))
# Add all columns of the Group table to the select
sql = sql.column(Group)
session.connection().execute(sql)
A variant of the select(*U.c) answer above converts the result to a DataFrame by building one dict per row:
pd.DataFrame.from_records(dict(zip(r.keys(), r)) for r in rows)
If you're using the ORM, you can build a query using the normal ORM constructs and then execute it directly to get raw column values:
query = session.query(User).filter_by(name='jack')
for cols in session.connection().execute(query):
    print(cols)
every_column = User.__table__.columns
records = session.query(*every_column).filter(User.foo==1).all()
When an ORM class is passed to the query function, e.g. query(User), the result will be composed of ORM instances. In the majority of cases, this is what the dev wants and will be easiest to deal with, as demonstrated by the popularity of the answer above that corresponds to this approach.
In some cases, devs may instead want an iterable sequence of values. In these cases, one can pass the list of desired column objects to query(). This answer shows how to pass the entire list of columns without hardcoding them, while still working with SQLAlchemy at the ORM layer.
