I'm familiar with Python generators; however, I've just come across the term "generative method", which I'm not familiar with and for which I can't find a satisfactory definition.
To put it in context, I found the term in SQLAlchemy's narrative documentation:
Full control of the “autocommit” behavior is available using the generative Connection.execution_options() method provided on Connection, Engine, Executable, using the “autocommit” flag which will turn on or off the autocommit for the selected scope.
What is a generative method? Trying to iterate the object returned by Connection.execution_options() doesn't work so I'm figuring it's something other than a standard generator.
It doesn't appear to be a common database concept; SQLAlchemy uses the term "generative" in the sense of "generated by your program iteratively at runtime" (so, no connection to Python generators). An example from the tutorial:
The Query object is fully generative, meaning that most method calls
return a new Query object upon which further criteria may be added.
For example, to query for users named “ed” with a full name of “Ed
Jones”, you can call filter() twice, which joins criteria using AND:
>>> for user in session.query(User).\
... filter(User.name=='ed').\
... filter(User.fullname=='Ed Jones'):
... print(user)
This call syntax is more commonly known as "method chaining", and the design that allows it as a "fluent interface".
So, in the case of Connection.execution_options(), "generative" means that it returns a modified copy of the connection object, so that you can chain calls as above.
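The pattern is easy to sketch in plain Python (a toy illustration, not SQLAlchemy's actual implementation; all names here are invented):

```python
class Query:
    """A toy 'generative' query object: each method call returns a new instance."""

    def __init__(self, criteria=()):
        self.criteria = tuple(criteria)

    def filter(self, condition):
        # Return a *new* Query with the extra criterion added,
        # leaving the original object untouched.
        return Query(self.criteria + (condition,))

q1 = Query()
q2 = q1.filter("name = 'ed'").filter("fullname = 'Ed Jones'")
print(q2.criteria)   # both criteria accumulated
print(q1.criteria)   # original query unchanged
```

Because each call returns a (modified) object of the same kind, calls can be chained fluently without mutating the object you started from.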
Looking at the source code of Connection.execution_options (lib/sqlalchemy/engine/base.py), all that method does is add options to the connection.
The idea is that those options influence the future behaviour of e.g. queries.
As an example:
result = connection.execution_options(stream_results=True).\
execute(stmt)
Here, the behaviour of the connection was changed in the middle of its life for just this one query. In a way, the method "generates" a clone of the connection with slightly different behaviour.
Here you can also set autocommit to True. Example:
# obtain a connection
connection = ...
# do some stuff
# for the next section we want autocommit on
autocommitting_connection = connection.execution_options(autocommit=True)
autocommitting_connection.execute(some_insert)
result = autocommitting_connection.execute(some_query)
# done with this section. Continue using connection (no autocommit)
This is what is meant by that section of the documentation: "generative method" refers to a method that returns a modified copy of the same instance, which you can continue working with. This applies to the classes Connection, Engine, and Executable.
You would have to consult the documentation or source code of the specific project to be sure, but I would guess it returns a modified version of some object, adapted to the requirements/behaviour defined by the arguments.
The documentation states:
The method returns a copy of this Connection which references the same
underlying DBAPI connection, but also defines the given execution
options which will take effect for a call to execute().
As @zzzeek comments above, this is now documented in the SQLAlchemy glossary.
generative means:
A term that SQLAlchemy uses to refer what’s normally known as method chaining; see that term for details.
And method chaining is:
An object-oriented technique whereby the state of an object is constructed by calling methods on the object. The object features any number of methods, each of which return a new object (or in some cases the same object) with additional state added to the object.
This question is quite generic, but I don't think it is opinion-based. It is about software design, and the example prototype is in Python:
I am writing a program whose goal is to simulate some behaviour (the details don't matter). The data the simulation works on is fixed, but I want to choose the simulated behaviour at every startup; the behaviour can't be changed at runtime.
Example:
Simulation behaviour is defined like:
usedMethod = static
The program then looks something like this:
while True:
    result = static(object)  # static is the method specified in the behaviour
    # do something with result
The question is: what is the best way to deal with such exchangeable functions? Another run of the simulation could then look like this
while True:
    result = dynamic(object)
if dynamic is specified as usedMethod. The first thing that came to my mind was an if-else block in which I check which method is in use and then execute that one. This solution would not be very good, because every time I add new behaviour I would have to change the if-else block, and the block itself might also cost performance, which matters too; the simulations should be fast.
So a solution I could think of was using a function pointer (the inputs and outputs of all usedMethods are well defined, so that should not be a problem). I would then initialize the function pointer at startup, where the used method is defined.
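That idea works naturally in Python, since functions are first-class objects. A sketch (all names invented) using a dispatch dictionary that is resolved once at startup:

```python
def static_balance(obj):
    return obj * 2      # stand-in for the real computation

def dynamic_balance(obj):
    return obj + 1

# Map configuration names to callables; resolved once at startup.
BEHAVIOURS = {"static": static_balance, "dynamic": dynamic_balance}

used_method = BEHAVIOURS["static"]   # e.g. the name read from a config file

# The hot loop pays no if/else cost; it just calls the bound function.
results = [used_method(x) for x in range(3)]
print(results)
```

Adding a new behaviour then means adding one function and one dictionary entry; the loop itself never changes.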
The problem I currently have is that the used method is not a function per se, but a method of a class that depends heavily on the internal members of that class, so the code looks more like this:
balance = BalancerClass()
while True:
    result = balance.static(object)
    ...
balance.doSomething(input)
So my question is, what is a good solution to deal with this problem?
I thought about inheriting from BalancerClass (which would then be an abstract class; I don't know if this concept exists in Python) and adding a derived class for every used method. Then, at run-time, I would create the derived object specified in the simulation behaviour.
In my eyes this is a good solution, because it encapsulates the methods away from the base class itself, and every used method is managed by its own class, which can add new internal behaviour if needed.
Furthermore, the doSomething method shouldn't change, so it is implemented in the base class, but it depends on the internal members changed by the derived class.
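Python does support abstract classes via the standard library's abc module. A minimal sketch of the design described above (class and method names are invented for illustration):

```python
from abc import ABC, abstractmethod

class BalancerBase(ABC):
    @abstractmethod
    def balance(self, obj):
        """The exchangeable simulation behaviour; must be overridden."""

    def do_something(self, value):
        # Shared behaviour lives in the base class and can rely on
        # state maintained by the derived class.
        self.last = value
        return value

class StaticBalancer(BalancerBase):
    def balance(self, obj):
        return obj * 2

class DynamicBalancer(BalancerBase):
    def balance(self, obj):
        return obj + 1

# Chosen once at startup, e.g. from a configuration value.
balancer = StaticBalancer()
print(balancer.balance(21))
```

Instantiating BalancerBase directly raises a TypeError, which enforces that every concrete balancer provides its own balance() implementation.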
I don't know, in general, whether this design is a good way to solve my problem, or whether I am missing a very basic and easy concept.
If you have another/better solution, please tell me, ideally with its advantages and disadvantages. Could you also point out advantages/disadvantages of my solution that I didn't think of?
I may be wrong, but what you are looking for boils down to either dependency injection or the strategy design pattern, both of which solve the problem of executing dynamic code at runtime via a common interface, without worrying about the actual implementations. There is also the simpler way you described yourself: creating an abstract class (interface) and having all the concrete classes implement it.
I am giving brief examples of each here for your reference.
Dependency Injection (from Wikipedia):
In software engineering, dependency injection is a technique whereby one object supplies the dependencies of another object. A "dependency" is an object that can be used, for example as a service. Instead of a client specifying which service it will use, something tells the client what service to use. The "injection" refers to the passing of a dependency (a service) into the object (a client) that would use it. The service is made part of the client's state.
Passing the service to the client, rather than allowing a client to build or find the service, is the fundamental requirement of the pattern.
Python does not have such a concept built into the language itself, but there are packages out there that implement this pattern.
Here is a nice article about this in python(All credits to the original author):
Dependency Injection in Python
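As a rough idea of what constructor injection looks like in plain Python (all names here are hypothetical):

```python
class EmailService:
    """One possible implementation of the dependency."""
    def send(self, to, message):
        return f"email to {to}: {message}"

class Notifier:
    # The dependency is passed in (injected); Notifier never
    # constructs or looks up the service itself.
    def __init__(self, service):
        self.service = service

    def notify(self, user, message):
        return self.service.send(user, message)

notifier = Notifier(EmailService())   # inject the dependency
print(notifier.notify("alice", "hi"))
```

Swapping the behaviour at startup is then just a matter of injecting a different service object with the same interface.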
Strategy Pattern: This is an alternative to inheritance and an example of composition. Instead of inheriting from a base class, we pass an object of the required class to the constructor of the class we want the functionality in. For example:
Suppose you want to have a common add() operation but it can be implemented in different ways(add two numbers or add two strings)
class XYZ:
    def __init__(self, adder):
        self.adder = adder
The only condition being all adders passed to the XYZ class should have a common Interface.
Here is a more detailed example:
Strategy Pattern in Python
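The adder example can be fleshed out like this (NumberAdder, StringAdder, and combine are illustrative names, not a standard API):

```python
class NumberAdder:
    def add(self, a, b):
        return a + b

class StringAdder:
    def add(self, a, b):
        return f"{a}{b}"

class XYZ:
    # Composition: the concrete adder strategy is passed in,
    # not inherited.
    def __init__(self, adder):
        self.adder = adder

    def combine(self, a, b):
        return self.adder.add(a, b)

print(XYZ(NumberAdder()).combine(1, 2))
print(XYZ(StringAdder()).combine("foo", "bar"))
```

The only requirement is that every strategy object exposes the same add() method, i.e. the common interface mentioned above.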
Interfaces:
Interfaces are the simplest: they define a set of common attributes and methods (with or without a default implementation), and any class can then implement the interface with its own or some shared functionality. In Python, interfaces are implemented via the abc module.
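A minimal sketch of an interface built with abc (the names are invented for illustration):

```python
import json
from abc import ABC, abstractmethod

class Serializer(ABC):
    """A hypothetical interface: implementers must provide serialize()."""

    @abstractmethod
    def serialize(self, data):
        ...

class JsonSerializer(Serializer):
    def serialize(self, data):
        return json.dumps(data)

print(JsonSerializer().serialize({"a": 1}))
```

Any class that fails to implement serialize() cannot be instantiated, so the interface contract is enforced at object-creation time.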
So I get that when I call list() on a Query object, that executes the query and returns the results as a list.
What I'm wondering is what exactly is happening in the source code to make the Query fire SQL and grab the results.
When you call list(query), Python invokes the __iter__() method of the Query class, as it does for any container.
This method sets up the query context and eventually calls _execute_and_instances(), an internal method of the Query class, which gets the connection from the session and executes the query statement:
def _execute_and_instances(self, querycontext):
conn = self._connection_from_session(
mapper=self._mapper_zero_or_none(),
clause=querycontext.statement,
close_with_result=True)
result = conn.execute(querycontext.statement, self._params)
return loading.instances(self, result, querycontext)
So the query is executed just when the list object is created.
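You can see the same mechanism with any plain Python container: list() simply consumes the object's __iter__(). A toy illustration (not SQLAlchemy code):

```python
class LazyNumbers:
    def __iter__(self):
        print("executing now")   # analogous to Query firing its SQL here
        yield from (1, 2, 3)

nums = LazyNumbers()          # nothing happens yet
result = list(nums)           # __iter__() runs at this point
print(result)
```

Constructing the object is cheap; the work happens only when something iterates it, which is exactly when Query sends its statement to the database.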
It's not much detail, but I hope it is enough to answer your question. Some context on why you are asking would make it possible to go into more relevant detail. Hope it helps anyway.
EDIT: After clarifications via comments, as for your second question (i.e. how to add a traceback of your call as a comment in the query, to be logged in the server): I don't think it is feasible by modifying a single point.
Most queries go through the object Query, and its method _compile_context is a common point where the query is composed. If you launch either a usual query (a select with filters and so on) or a delete, it will go through this method.
However, session.add is completely different. At the moment of adding the object not much is really done (regarding the query I mean; of course the new object is registered). The objects are only inserted when you perform commit (as expected btw). session.flush starts this process. At that point, you have available SQLAlchemy's mapper and you could get the instance of the object you are actually adding (e.g. an instance of your class User). If you have added the traceback at the moment of creating your instance, you could recover it here and add it to the query as a comment.
In any case, this would not be easy at all, and difficult to maintain as well, while it is only for debugging. I understand what you are trying to do, but it is too much effort for a result which is not that great (in my opinion).
Inside of the declarative base I define a function like this:
def update_me(self):
if self.raw_info==1: self.changed_info=10
else: self.changed_info=20
I know this could be done with hybrid_property, but I actually do more complicated manipulations; the above is just for illustration, and it has to be done through a method. How can I commit these changes from inside the declarative base, without passing in the session object? It seems logical that there would be a way: if I can access the object and change its values without a session object, I should be able to save it here somehow. Of course, adding this code to the end of the function above fails:
self.commit()
It seems to me that you might want to reconsider your design. However, if you are sure you want to go this way, you can use the object_session method:
from sqlalchemy.orm import object_session

object_session(self).commit()
Warning: This will commit your whole session (not just one object) and it will only work if self is already attached to a session (by Session.add or having queried for it, for example). Thus, it will be a fragile operation and I would not recommend it unless absolutely necessary.
But you are right, when you say
It seems logical that there would be a way, if I can access the object and change its values without a session object, then it seems like I should be able to save it here somehow.
The object and the session are connected and thus changes will be delivered to the database. We get this connected session with the method above.
It seems to me that going through the whole process of creating the expression tree and then creating a query from it again is wasted time when using SQLAlchemy. Apart from the occasional dynamic query, almost everything will be exactly the same during the whole life of an application (apart from the parameters, of course).
Is there any way to just save a query once it's created and reuse it later on with different parameters? Or maybe there's some internal mechanism which already does something similar?
It seems to me that going through the whole process of creating the expression tree and then creating a query from it again is wasted time when using SQLAlchemy.
Do you have any estimates of how much time is wasted, compared to the rest of the application? Profiling here is extremely important before making your program more complex. As I will often note, Reddit serves well over one billion page views a day; they use the SQLAlchemy Core to query their database, and the last time I looked at their code they made no attempt to optimize this process - they build expression trees on the fly and compile each time. We have had users who determined that their specific system actually benefits from optimizations in these areas, however.
I've written up some background on profiling here: How can I profile a SQLAlchemy powered application?
Is there any way to just save a query once it's created and reuse it later on with different parameters? Or maybe there's some internal mechanism which already does something similar?
There are several methods, depending on what APIs you're using and what areas you'd like to optimize.
There's two main portions to rendering SQL - there's the construction of the expression tree, so to speak, and then the compilation of the string from the expression tree.
The tree itself, which can either be a select() construct if using Core or a Query() if using ORM, can be reused. A select() especially has nothing associated with it that prevents it from being reused as often as you like (same for insert(), delete(), update(), etc.).
In the ORM, a Query can also be used with different sessions using the with_session() method. The win here is not as big, as Query() still produces an entire select() each time it is invoked. However, as we'll see below, there is a recipe that allows this to be cached.
The next level of optimization involves the caching of the actual SQL text rendered. This is an area where a little more care is needed, as the SQL we generate is specific to the target backend; there are also edge cases where various parameterizations change the SQL itself (such as using "TOP N ROWS" with SQL Server, we can't use a bound parameter there). Caching of the SQL strings is provided using the execution_options() method of Connection, which is also available in a few other places, setting the compiled_cache feature by sending it a dictionary or other dict-like object which will cache the string format of statements, keyed to the dialect, the identity of the construct, and the parameters sent. This feature is normally used by the ORM for insert/update/delete statements.
There's a recipe I've posted which integrates the compiled_cache feature with the Query, at BakedQuery. By taking any Query and saying query.bake(), you can now run that query with any Session and as long as you don't call any more chained methods on it, it should use a cached form of the SQL string each time:
q = s.query(Foo).filter(Foo.data == bindparam('foo')).bake()

for i in range(10):
    result = q.from_session(s).params(foo='data 12').all()
It's experimental, and not used very often, but it's exactly what you're asking for here. I'd suggest you tailor it to your needs, keep an eye on memory usage as you use it and make sure you follow how it works.
SQLAlchemy 1.0 introduced the baked extension which is a cache designed specifically for caching Query objects. It caches the object's construction and string-compilation steps to minimize the overhead of the Python interpreter on repetitive queries.
Official docs here: http://docs.sqlalchemy.org/en/latest/orm/extensions/baked.html
Note that it does NOT cache the result set returned by the database. For that, you'll want to check out dogpile.cache:
http://docs.sqlalchemy.org/en/latest/orm/examples.html#module-examples.dogpile_caching
I'm creating a basic database utility class in Python. I'm refactoring an old module into a class. I'm now working on an executeQuery() function, and I'm unsure of whether to keep the old design or change it. Here are the 2 options:
(The old design:) Have one generic executeQuery method that takes the query to execute and a boolean commit parameter that indicates whether to commit (insert, update, delete) or not (select), and determines with an if statement whether to commit or to select and return.
(This is the way I'm used to, but that might be because you can't have a function that sometimes returns something and sometimes doesn't in the languages I've worked with:) Have 2 functions, executeQuery and executeUpdateQuery (or something equivalent). executeQuery will execute a simple query and return a result set, while executeUpdateQuery will make changes to the DB (insert, update, delete) and return nothing.
Is it acceptable to use the first way? It seems unclear to me, but maybe it's more Pythonic...? Python is very flexible; maybe I should take advantage of a feature that can't really be used this way in stricter languages...
And a second part of this question, unrelated to the main idea - what is the best way to return query results in Python? Using which function to query the database, in what format...?
It's probably just me and my FP fetish, but I think a function executed solely for its side effects is very different from a non-destructive function that fetches some data, and the two should therefore have different names, especially if the generic function would do something different depending on exactly that (which the commit parameter seems to imply).
As for how to return results: I'm a huge fan of generators, but if the library you use for database connections returns a list anyway, you might as well pass that list on; a generator wouldn't buy you anything in this case. But if it lets you iterate over the results one at a time, seize the opportunity to save a lot of memory on larger queries.
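For instance, with the standard library's sqlite3 module, a generator can wrap the cursor so rows are yielded one at a time instead of being materialized as a list (a sketch; the function name is made up):

```python
import sqlite3

def iter_rows(conn, query, batch_size=100):
    """Yield rows one at a time, fetching them from the cursor in batches."""
    cur = conn.execute(query)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            return
        yield from batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

for (x,) in iter_rows(conn, "SELECT x FROM t ORDER BY x"):
    print(x)
```

Only batch_size rows are ever held in memory at once, which is where the savings on large result sets come from.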
I don't know how to answer the first part of your question, it seems like a matter of style more than anything else. Maybe you could invoke the Single Responsibility Principle to argue that it should be two separate functions.
When you're going to return a sequence of indeterminate length, it's best to use a generator.
I'd have two methods, one which updates the database and one which doesn't. Both could delegate to a common private method, if they share a lot of code.
By separating the two methods, it becomes clear to callers what the different semantics are between the two, makes documenting the different methods easier, and clarifies what return types to expect. Since you can pull out shared code into private methods on the object, there's no worry about duplicating code.
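A sketch of that two-method design with the standard library's sqlite3 (the class and method names are just one possible choice):

```python
import sqlite3

class Database:
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def _execute(self, query, params=()):
        # Shared plumbing used by both public methods.
        return self.conn.execute(query, params)

    def execute_query(self, query, params=()):
        """Read-only: returns the fetched rows."""
        return self._execute(query, params).fetchall()

    def execute_update(self, query, params=()):
        """Destructive (insert/update/delete): commits, returns nothing."""
        self._execute(query, params)
        self.conn.commit()

db = Database()
db.execute_update("CREATE TABLE users (name TEXT)")
db.execute_update("INSERT INTO users VALUES (?)", ("ed",))
print(db.execute_query("SELECT name FROM users"))
```

The caller can now tell at a glance which calls return data and which ones mutate the database, while the duplicated plumbing lives in one private method.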
As for returning query results, it'll depend on whether you're loading all the results from the database before returning, or returning a cursor object. I'd be tempted to do something like the following:
with db.executeQuery('SELECT * FROM my_table') as results:
    for row in results:
        print(row['col1'], row['col2'])
... so the executeQuery method returns a context-manager object (which cleans up any open connections, if needed) that also acts as a generator, and the rows it yields act as read-only dicts.
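One possible implementation of that wrapper, using sqlite3 and contextlib from the standard library (execute_query and all other details here are an illustrative sketch, not the only way to do it):

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def execute_query(conn, query, params=()):
    """Yield a cursor whose rows support dict-style access; close it afterwards."""
    conn.row_factory = sqlite3.Row   # rows support row['col'] access
    cur = conn.execute(query, params)
    try:
        yield cur                    # iterating the cursor yields rows lazily
    finally:
        cur.close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 TEXT)")
conn.execute("INSERT INTO my_table VALUES ('a', 'b')")

with execute_query(conn, "SELECT * FROM my_table") as results:
    for row in results:
        print(row["col1"], row["col2"])
```

The with block guarantees the cursor is closed even if the loop raises, and sqlite3.Row gives the read-only, dict-like row access described above.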