This tutorial (http://www.rmunn.com/sqlalchemy-tutorial/tutorial.html) says to select all rows of an entity like this:
s = products.select()
rs = s.execute()
I get an error saying:
This select object is not bound and does not support direct execution ...
Do I need to reference the session object?
I just want to get all rows in my products table (I've already mapped everything and inserted thousands of rows, so that part works).
That tutorial was written for SQLAlchemy 0.2, and you are most likely using a much newer version. In the latest documentation, the preferred method is to pass the select statement to a connection. Try this instead:
query = users.select()
result = conn.execute(query)
Ref: http://www.sqlalchemy.org/docs/05/sqlexpression.html#selecting
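The connection-based approach can be sketched end to end as follows. This is a minimal sketch, assuming SQLAlchemy 1.4 or later; the in-memory SQLite engine and the columns of products are stand-ins that are not part of the original question.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Assumption: an in-memory SQLite database stands in for the real one.
engine = create_engine("sqlite://")
metadata = MetaData()

# Hypothetical columns; the real products table will differ.
products = Table(
    "products",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)
metadata.create_all(engine)

# engine.begin() opens a connection and commits when the block exits.
with engine.begin() as conn:
    conn.execute(products.insert(), [{"name": "apple"}, {"name": "pear"}])

# The select is handed to the connection instead of being executed directly.
with engine.connect() as conn:
    rows = conn.execute(products.select()).fetchall()

names = [row.name for row in rows]
```

Because the statement is never executed on its own, it does not matter whether the table is bound to an engine.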
I want to call a function that I created in my PostgreSQL database. I've looked at the official SQLAlchemy documentation as well as several questions here on SO, but nobody seems to explain how to set up the function in SQLAlchemy.
I did find this question, but am unsure how to compile the function as the answer suggests. Where does that code go? I get errors when I try to put this in both my view and model scripts.
Edit 1 (8/11/2016)
As per the community's requests and requirements, here are all the details I left out:
I have a table called books whose columns are arranged with information regarding the general book (title, author(s), publication date...).
I then have many tables all of the same kind whose columns contain information regarding all the chapters in each book (chapter name, length, short summary...). It is absolutely necessary for each book to have its own table. I have played around with one large table of all the chapters, and found it ill suited to my needs, not to mention extremely unwieldy.
My function that I'm asking about queries the table of books for an individual book's name, and casts the book's name to a regclass. It then queries the regclass object for all its data, returns all the rows as a table like the individual book tables, and exits. Here's the raw code:
CREATE OR REPLACE FUNCTION public.get_book(bookName character varying)
    RETURNS TABLE(/*columns of individual book table go here*/)
    LANGUAGE plpgsql
AS $function$
declare
    _tbl regclass;
begin
    for _tbl in
        select name::regclass
        from books
        where name=bookName
    loop
        return query execute '
            select * from ' ||_tbl;
    end loop;
end;
$function$
This function has been tested several times in both the command line and pgAdmin. It works as expected.
My intention is to have a view in my Flask app whose route is @app.route('/book/<string:bookName>') and which calls the above function before rendering the template. The exact view is as follows:
@app.route('/book/<string:bookName>')
def book(bookName):
    chapterList = ...  # call the function here
    return render_template('book.html', book=bookName, list=chapterList)
This is my question: how do I set up my app in such a way that SQLAlchemy knows about and can call the function I have in my database? I am open to other suggestions of achieving the same result as well.
P.S. I only omitted this information with the intention of keeping my question as abstract as possible, not knowing that the rules of the forum dictate a requirement for a very specific question. Please forgive me my lack of knowledge.
If you want to do it without raw sql, you can use func from sqlalchemy:
from sqlalchemy import func
data = db.session.query(func.your_schema.your_function_name()).all()
You can use func
Syntax:
from sqlalchemy import func
func.function_name(column)
Example:
from sqlalchemy import func
result = db.session.query(func.lower(Student.name)).all()
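To make the func approach concrete, here is a minimal sketch assuming SQLAlchemy 1.4 or later. It uses an in-memory SQLite engine and SQLite's built-in upper() as a stand-in, because the question's get_book function exists only in the asker's PostgreSQL database.

```python
from sqlalchemy import create_engine, func, select

# Assumption: in-memory SQLite stands in for the real database.
engine = create_engine("sqlite://")

# func.<name>(...) renders as a SQL call to <name>; upper() is built into SQLite.
stmt = select(func.upper("chapter one"))

with engine.connect() as conn:
    value = conn.execute(stmt).scalar()
```

func does not check that the function exists, so func.get_book(bookName) would render the same way; that variant has not been run against PostgreSQL here.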
I found a solution to execute the function with raw SQL:
Create a connection
Call the function as you normally would in the database GUI. E.g. for the function add_apples():
select add_apples();
Execute this statement, which should be a string.
Example code:
from sqlalchemy import text

transaction = connection.begin()
sql = list()  # Allows multiple queries
sql.append('select add_apples();')
print('Printing the queries.')
for i in sql:
    print(i)
# Now we iterate through the SQL statements, executing them one after another.
# If one of them raises an exception, we roll back and stop the program.
for i in sql:
    try:
        # text() wraps the raw string; SQLAlchemy 2.0 no longer accepts plain strings here
        r = connection.execute(text(i))
        print('Executed ----- %r' % i)
    except Exception as e:
        print('EXCEPTION!: {}'.format(e))
        transaction.rollback()
        exit(-1)
transaction.commit()
from sqlalchemy.sql import text

with engine.connect() as con:
    statement = text("""your function""")
    con.execute(statement)
You must execute raw SQL through SQLAlchemy.
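When the argument comes from user input, such as the bookName in the route above, a bound parameter is safer than string formatting. A minimal sketch, assuming SQLAlchemy 1.4 or later; SQLite's upper() again stands in for the database function, and the parameter name title is made up for illustration.

```python
from sqlalchemy import create_engine, text

# Assumption: in-memory SQLite stands in for the real database.
engine = create_engine("sqlite://")

# :title is a bound parameter; its value is sent separately from the SQL string.
statement = text("SELECT upper(:title)")

with engine.connect() as con:
    result = con.execute(statement, {"title": "moby dick"}).scalar()
```

For a set-returning function like the one in the question, the statement would presumably read SELECT * FROM get_book(:title), with fetchall() in place of scalar().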
I'm trying to implement the following MySQL query using SQLAlchemy. The table in question is a nested set hierarchy.
UPDATE category
JOIN
(
    SELECT
        node.cat_id,
        (COUNT(parent.cat_id) - 1) AS depth
    FROM category AS node, category AS parent
    WHERE node.lft BETWEEN parent.lft AND parent.rgt
    GROUP BY node.cat_id
) AS depths
ON category.cat_id = depths.cat_id
SET category.depth = depths.depth
This works just fine.
This is where I start pulling my hair out:
from sqlalchemy.orm import aliased
from sqlalchemy import func
from myapp.db import db

node = aliased(Category)
parent = aliased(Category)
stmt = db.session.query(node.cat_id,
                        func.count(parent.cat_id).label('depth_'))\
    .filter(node.lft.between(parent.lft, parent.rgt))\
    .group_by(node.cat_id).subquery()
db.session.query(Category,
                 stmt.c.cat_id,
                 stmt.c.depth_)\
    .outerjoin(stmt,
               Category.cat_id == stmt.c.cat_id)\
    .update({Category.depth: stmt.c.depth_},
            synchronize_session='fetch')
...and I get InvalidRequestError: This operation requires only one Table or entity be specified as the target. It seems to me that Category.depth adequately specifies the target, but of course SQLAlchemy trumps whatever I may think.
Stumped. Any suggestions? Thanks.
I know this question is five years old, but I stumbled upon it today. My answer might be useful to someone else. I understand that my solution is not the perfect one, but I don't have a better way of doing this.
I had to change only the last line to:
db.session.query(Category)\
    .outerjoin(stmt,
               Category.cat_id == stmt.c.cat_id)\
    .update({Category.depth: stmt.c.depth_},
            synchronize_session='fetch')
Then, you have to commit the changes:
db.session.commit()
This gives the following warning:
SAWarning: Evaluating non-mapped column expression '...' onto ORM
instances; this is a deprecated use case. Please make use of the
actual mapped columns in ORM-evaluated UPDATE / DELETE expressions.
"UPDATE / DELETE expressions." % clause
To get rid of it, I used the solution in this post: Turn off a warning in sqlalchemy
Note: For some reason, aliases don't work in SQLAlchemy update statements.
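If the joined UPDATE stays awkward to express, one fallback is to split the work in two: read the depths with the self-join from the question, then write them back with a plain parameterised UPDATE. The sketch below is not the original poster's method. It uses the standard-library sqlite3 module and a made-up three-node tree to show the idea; on a large table the single-statement form will likely be faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    "cat_id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER, depth INTEGER)"
)
# Made-up tree: 1 is the root, 2 is its child, 3 is a child of 2.
conn.executemany(
    "INSERT INTO category (cat_id, lft, rgt) VALUES (?, ?, ?)",
    [(1, 1, 6), (2, 2, 5), (3, 3, 4)],
)

# Step 1: every node counts the nodes whose interval contains its lft
# (itself included) and subtracts one to get its depth.
rows = conn.execute(
    """
    SELECT node.cat_id, COUNT(parent.cat_id) - 1 AS depth
    FROM category AS node, category AS parent
    WHERE node.lft BETWEEN parent.lft AND parent.rgt
    GROUP BY node.cat_id
    """
).fetchall()

# Step 2: write the computed depths back, one parameterised UPDATE per row.
conn.executemany(
    "UPDATE category SET depth = ? WHERE cat_id = ?",
    [(depth, cat_id) for cat_id, depth in rows],
)
conn.commit()

depths = dict(conn.execute("SELECT cat_id, depth FROM category"))
```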
The code is quite simple, as follows:
from pony.orm import Required, Set, Optional, PrimaryKey
from pony.orm import Database, db_session
import time

db = Database('mysql', host="localhost", port=3306, user="root",
              passwd="123456", db="learn_pony")

class TryUpdate(db.Entity):
    _table_ = "try_update_record"
    t = Required(int, default=0)

db.generate_mapping(create_tables=True)

@db_session
def insert_record():
    new_t = TryUpdate()

@db_session
def update():
    t = TryUpdate.get(id=1)
    print(t.t)
    t.t = 0
    print(t.t)

if __name__ == "__main__":
    insert_record()
    update()
pony.orm reports an exception: pony.orm.core.CommitException: Object TryUpdate[1] was updated outside of current transaction. But there is no other transaction running at all.
As my experiments show, pony works OK as long as t.t is changed to a value different from the original, but it always reports the exception when t.t is set to a value equal to the original.
I'm not sure if this is a design decision. Do I have to check whether my input value changes every time before the assignment? Or is there anything I can do to avoid this annoying exception?
My pony version: 0.4.8
Thanks a lot!
Pony ORM author is here.
This behavior is a MySQL-specific bug which was fixed in release Pony ORM 0.4.9, so please upgrade. The rest of my answer is the explanation of what caused the bug.
The reason for this bug is the following. In order to prevent lost updates, Pony ORM uses optimistic checks. Pony tracks which attributes were read or changed during the program execution and then adds extra conditions to the WHERE section of the corresponding UPDATE query. This way Pony guarantees that no data will be lost because of a concurrent update. Let's consider the following example:
@db_session
def some_function():
    obj = MyObject[123]
    print(obj.x)
    obj.x = 100
Upon exit from some_function, the @db_session decorator will commit the ongoing transaction. Right before the commit, the object's data will be saved by the following UPDATE command:
UPDATE MyTable
SET x = <new_value>
WHERE id = 123 and x = <old_value>
You may wonder why the additional condition and x = <old_value> was added. This is because Pony knows that the program saw the previous value of the attribute x and may have used this value to calculate the new value of the same attribute. So Pony takes steps to guarantee that this attribute is still unchanged at the moment of the UPDATE. This approach is called an "optimistic concurrency check" (see also the Wikipedia article "optimistic concurrency control"). Since the isolation level used by default in most databases is not SERIALIZABLE, without this additional check it is possible that some other transaction has managed to update the value of the x attribute before our transaction commits, and then the value written by the concurrent transaction will be lost.
When the Python database driver executes the UPDATE query, it returns the number of rows which satisfy the UPDATE criteria. This way Pony knows whether the update was successful. If the result is 1, one row was successfully found and updated; if the result is 0, the row was already modified by another transaction and no longer satisfies the criteria in the WHERE section. When this happens, Pony terminates the current transaction in order to prevent a lost update.
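The check described above can be reproduced by hand. The following is a minimal sketch using the standard-library sqlite3 module. It is not Pony's implementation, and the table name my_table and the helper optimistic_update are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, x INTEGER)")
conn.execute("INSERT INTO my_table (id, x) VALUES (123, 1)")


def optimistic_update(conn, row_id, old_value, new_value):
    """Update x only if it still holds the value this writer read earlier."""
    cursor = conn.execute(
        "UPDATE my_table SET x = ? WHERE id = ? AND x = ?",
        (new_value, row_id, old_value),
    )
    # Zero affected rows means the row no longer matches what was read.
    return cursor.rowcount == 1


first = optimistic_update(conn, 123, old_value=1, new_value=100)
# A second writer that still believes x == 1 is rejected.
second = optimistic_update(conn, 123, old_value=1, new_value=200)
final_x = conn.execute("SELECT x FROM my_table WHERE id = 123").fetchone()[0]
```

Here the second call reports failure, so the caller can abort instead of silently overwriting the value 100.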
The reason for the bug is that while all other database drivers return the number of rows which were found by the WHERE section criteria, the MySQLdb driver by default returns the number of rows which were actually modified! Because of this, if the new value of the attribute turns out to be the same as the original value, MySQLdb reports that 0 rows were modified, and Pony (prior to release 0.4.9) mistakenly believes that the row was modified by a concurrent transaction. Starting with release 0.4.9, Pony ORM tells the MySQLdb driver to behave in the standard way and return the number of rows which were found, not the number of rows which were actually updated.
Hope this helps :)
P.S. I found your question just by chance. To reliably get answers about Pony ORM, I recommend sending questions to our mailing list http://ponyorm-list.ponyorm.com. If you think that you have found a bug, you can open an issue here: https://github.com/ponyorm/pony/issues.
Thank you for your question!
I'm writing a quick and dirty maintenance script to delete some rows and would like to avoid having to bring my ORM classes/mappings over from the main project. I have a query that looks similar to:
addresses_table = Table('address',metadata,autoload=True)
addresses = session.query(addresses_table).filter(addresses_table.c.retired == 1)
According to everything I've read, if I was using the ORM (not 'just' tables) and passed in something like:
addresses = session.query(Addresses).filter(addresses_table.c.retired == 1)
I could add a .delete() to the query, but when I try to do this using only tables I get a complaint:
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/orm/query.py", line 2146, in delete
target_cls = self._mapper_zero().class_
AttributeError: 'NoneType' object has no attribute 'class_'
Which makes sense, as it's a table, not a class. I'm quite green when it comes to SQLAlchemy. How should I be going about this?
Looking through some code where I did something similar, I believe this will do what you want.
d = addresses_table.delete().where(addresses_table.c.retired == 1)
d.execute()
Calling delete() on a table object gives you a sql.expression (if memory serves), that you then execute. I've assumed above that the table is bound to a connection, which means you can just call execute() on it. If not, you can pass the d to execute(d) on a connection.
See docs here.
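For the unbound case, passing the delete to a connection might look like the sketch below. It assumes SQLAlchemy 1.4 or later. The table is declared explicitly with hypothetical columns instead of being reflected, and an in-memory SQLite engine stands in for the real database.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, func, select

# Assumption: in-memory SQLite stands in for the real database.
engine = create_engine("sqlite://")
metadata = MetaData()

# Hypothetical columns; the question reflects the real table instead.
addresses_table = Table(
    "address",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("retired", Integer),
)
metadata.create_all(engine)

# engine.begin() commits on success and rolls back if an exception escapes.
with engine.begin() as conn:
    conn.execute(
        addresses_table.insert(),
        [{"retired": 1}, {"retired": 0}, {"retired": 1}],
    )
    d = addresses_table.delete().where(addresses_table.c.retired == 1)
    deleted = conn.execute(d).rowcount
    remaining = conn.execute(
        select(func.count()).select_from(addresses_table)
    ).scalar()
```

No ORM classes or session are involved, which matches the goal of keeping the maintenance script independent of the main project's mappings.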
When you call delete() from a query object, SQLAlchemy performs a bulk deletion. And you need to choose a strategy for the removal of matched objects from the session. See the documentation here.
If you do not choose a strategy for the removal of matched objects from the session, then SQLAlchemy will try to evaluate the query’s criteria in Python straight on the objects in the session. If evaluation of the criteria isn’t implemented, an error is raised.
This is what is happening with your deletion.
If you only want to delete the records and do not care about the records in the session after the deletion, you can choose the strategy that ignores the session synchronization:
address_table = Table('address', metadata, autoload=True)
addresses = session.query(address_table).filter(address_table.c.retired == 1)
addresses.delete(synchronize_session=False)
I have some database structure; as most of it is irrelevant here, I'll describe just the relevant pieces. Let's take the Item object as an example:
items_table = Table("invtypes", gdata_meta,
                    Column("typeID", Integer, primary_key = True),
                    Column("typeName", String, index=True),
                    Column("marketGroupID", Integer, ForeignKey("invmarketgroups.marketGroupID")),
                    Column("groupID", Integer, ForeignKey("invgroups.groupID"), index=True))

mapper(Item, items_table,
       properties = {"group" : relation(Group, backref = "items"),
                     "_Item__attributes" : relation(Attribute, collection_class = attribute_mapped_collection('name')),
                     "effects" : relation(Effect, collection_class = attribute_mapped_collection('name')),
                     "metaGroup" : relation(MetaType,
                                            primaryjoin = metatypes_table.c.typeID == items_table.c.typeID,
                                            uselist = False),
                     "ID" : synonym("typeID"),
                     "name" : synonym("typeName")})
I want to achieve some performance improvements in the sqlalchemy/database layer, and have a couple of ideas:
1) Requesting the same item twice:
item = session.query(Item).get(11184)
item = None  # reference to item is lost, object is garbage collected
item = session.query(Item).get(11184)
Each request generates and issues an SQL query. To avoid it, I use 2 custom maps for an item object:
itemMapId = {}
itemMapName = {}

@cachedQuery(1, "lookfor")
def getItem(lookfor, eager=None):
    if isinstance(lookfor, (int, float)):
        id = int(lookfor)
        if eager is None and id in itemMapId:
            item = itemMapId[id]
        else:
            item = session.query(Item).options(*processEager(eager)).get(id)
            itemMapId[item.ID] = item
            itemMapName[item.name] = item
    elif isinstance(lookfor, basestring):
        if eager is None and lookfor in itemMapName:
            item = itemMapName[lookfor]
        else:
            # Items have unique names, so we can fetch just first result w/o ensuring its uniqueness
            item = session.query(Item).options(*processEager(eager)).filter(Item.name == lookfor).first()
            itemMapId[item.ID] = item
            itemMapName[item.name] = item
    return item
I believe sqlalchemy does similar object tracking, at least by primary key (item.ID). If it does, I can wipe both maps (although wiping the name map will require minor modifications to the application which uses these queries) to avoid duplicating functionality and use stock methods. The actual question is: if there's such functionality in sqlalchemy, how do I access it?
2) Eager loading of relationships often helps to save a lot of requests to the database. Say I'll definitely need the following set of item=Item() properties:
item.group (Group object, according to groupID of our item)
item.group.items (fetch all items from items list of our group)
item.group.items.metaGroup (metaGroup object/relation for every item in the list)
If I have some item ID and no item is loaded yet, I can request it from the database, eagerly loading everything I need: sqlalchemy will join group, its items and the corresponding metaGroups within a single query. If I accessed them with default lazy loading, sqlalchemy would need to issue 1 query to grab an item + 1 to get the group + 1*#items for all items in the list + 1*#items to get the metaGroup of each item, which is wasteful.
2.1) But what if I already have an Item object fetched, and some of the properties which I want to load are already loaded? As far as I understand, when I re-fetch an object from the database, its already loaded relations do not become unloaded. Am I correct?
2.2) If I have an Item object fetched and want to access its group, I can just call getGroup using item.groupID, applying any eager statements I need ("items" and "items.metaGroup"). It should properly load the group and its requested relations without touching item stuff. Will sqlalchemy properly map this fetched group to item.group, so that when I access item.group it won't fetch anything from the underlying database?
2.3) If I have the following things fetched from the database: the original item, item.group and some portion of the items from the item.group.items list, some of which may have metaGroup loaded, what would be the best strategy for completing the data structure to match the eager list above: re-fetch the group with an ("items", "items.metaGroup") eager load, or check each item from the items list individually, and load the item or its metaGroup if it is not loaded? It seems to depend on the situation, because if everything has already been loaded some time ago, issuing such a heavy query is pointless. Does sqlalchemy provide a way to track whether some object relation is loaded, with the ability to look deeper than just one level?
As an illustration of 2.3: I can fetch the group with ID 83, eagerly fetching "items" and "items.metaGroup". Is there a way to determine from an item (which has a groupID of 83) whether it has "group", "group.items" and "group.items.metaGroup" loaded or not, using sqlalchemy tools (in this case all of them should be loaded)?
To force loading of lazy attributes, just access them. This is the simplest way and it works fine for relations, but it is not as efficient for Columns (you will get a separate SQL query for each column in the same table). You can get a list of all unloaded properties (both relations and columns) from sqlalchemy.orm.attributes.instance_state(obj).unloaded.
You don't use deferred columns in your example, but I'll describe them here for completeness. The typical scenario for handling deferred columns is the following:
Decorate selected columns with deferred(). Combine them into one or several groups by using the group parameter of deferred().
Use the undefer() and undefer_group() options in the query when desired.
Accessing a deferred column that belongs to a group will load all columns in that group.
Unfortunately this doesn't work in reverse: you can combine columns into groups without deferring their loading by default with column_property(Column(…), group=…), but the defer() option won't affect them (it works for Columns only, not column properties, at least in 0.6.7).
To force loading of deferred column properties, session.refresh(obj, attribute_names=…), as suggested by Nathan Villaescusa, is probably the best solution. The only disadvantage I see is that it expires attributes first, so you have to ensure that no already loaded attributes are among those passed as the attribute_names argument (e.g. by using an intersection with state.unloaded).
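The unloaded set mentioned above can be inspected as in the sketch below. It assumes SQLAlchemy 1.4 or later, where the public inspect() function returns the same InstanceState object. The Group and Item classes here are simplified stand-ins for the question's mappings, backed by an in-memory SQLite database.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


# Simplified stand-ins for the question's Group and Item mappings.
class Group(Base):
    __tablename__ = "groups"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    items = relationship("Item", back_populates="group")


class Item(Base):
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    group_id = Column(Integer, ForeignKey("groups.id"))
    group = relationship("Group", back_populates="items")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Group(id=83, name="g", items=[Item(id=1, name="a")]))
    session.commit()

with Session(engine) as session:
    item = session.get(Item, 1)
    # Column values arrive with the row; the lazy relationship does not.
    unloaded_before = set(inspect(item).unloaded)
    group_id = item.group.id  # accessing the attribute triggers the lazy load
    unloaded_after = set(inspect(item).unloaded)
```

Walking the same check over item.group and its items gives the recursive test the question asks about.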
Update
1) SQLAlchemy does track loaded objects. That's how the ORM works: there must be only one object in the session for each identity. Its internal cache is weak by default (use weak_identity_map=False to change this), so the object is expunged from the cache as soon as there is no reference to it in your code. SQLAlchemy won't issue an SQL request for query.get(pk) when the object is already in the session. But this works for the get() method only, so query.filter_by(id=pk).first() will issue an SQL request and refresh the object in the session with the loaded data.
2) Eager loading of relations will lead to fewer requests, but it's not always faster. You have to check this for your database and data.
2.1) Refetching data from database won't unload objects bound via relations.
2.2) item.group is loaded using the query.get() method, so it won't lead to an SQL request if the object is already in the session.
2.3) Yes, it depends on the situation. For most cases the best approach is to hope SQLAlchemy will use the right strategy :). For an already loaded relation you can check whether the related objects' relations are loaded via state.unloaded, and so on recursively to any depth. But when a relation is not loaded yet, you can't know whether the related objects and their relations are already loaded: even when the relation is not yet loaded, the related object[s] might already be in the session (just imagine you request the first item, load its group and then request another item that has the same group). For your particular example I see no problem with just checking state.unloaded recursively.
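The weak-cache behaviour from point 1) can be illustrated with a toy identity map built on the standard library. This is a conceptual sketch only, not SQLAlchemy's implementation; WeakIdentityMap, its loader argument and the Item class are made up for the example.

```python
import gc
import weakref


class Item:
    def __init__(self, item_id):
        self.item_id = item_id


class WeakIdentityMap:
    """Toy identity map: one live object per primary key, held weakly."""

    def __init__(self, loader):
        self._objects = weakref.WeakValueDictionary()
        self._loader = loader  # stands in for the SQL query
        self.loads = 0

    def get(self, key):
        obj = self._objects.get(key)
        if obj is None:
            self.loads += 1
            obj = self._loader(key)
            self._objects[key] = obj
        return obj


identity_map = WeakIdentityMap(Item)
first = identity_map.get(11184)
second = identity_map.get(11184)  # served from the map, no load
same_object = first is second
loads_while_referenced = identity_map.loads

del first, second
gc.collect()  # make collection explicit for interpreters without refcounting
third = identity_map.get(11184)  # the entry is gone, so it loads again
loads_after_release = identity_map.loads
```

This mirrors the question's scenario: once item is set to None, nothing keeps the entry alive, so the next get() has to go back to the database.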
1) From the Session documentation:
[The Session] is somewhat used as a cache, in that it implements the identity map pattern, and stores objects keyed to their primary key. However, it doesn’t do any kind of query caching. ... It’s only when you say query.get({some primary key}) that the Session doesn’t have to issue a query.
2.1) You are correct, relationships are not modified when you refresh an object.
2.2) Yes, the group will be in the identity map.
2.3) I believe your best bet will be to attempt to reload the entire group.items in a single query. In my experience it is usually much quicker to issue one large request than several smaller ones. The only time it would make sense to reload only a specific group.item is if exactly one of them needed to be loaded. Though in that case you are doing one large query instead of one small one, so you don't actually reduce the number of queries.
I have not tried it, but I believe you should be able to use the sqlalchemy.orm.util.identity_key method to determine whether an object is in sqlalchemy's identity map. I would be interested to find out what calling identity_key(Group, 83) returns.
Initial Question)
If I understand correctly, you have an object that you fetched from the database where some of its relationships were eager-loaded, and you would like to fetch the rest of the relationships with a single query? I believe you may be able to use the Session.refresh() method, passing in the names of the relationships that you want to load.