How does Django pass around it's database connection - python

If you use Django you can simply create an instance of one of your models, fill it with data and call save() on it and it will be saved to the database. You don't have to pass in a "connection" parameter or do anything special. Also, your view are just simple callables so there seems to be no magic hidden. I.e. this works:
from django.http import HttpResponse
from models import MyModel
def a_simple_view(request):
instance = MyModel(some_field="Foobar")
instance.save()
return HttpResponse(<html><body>Jep, just saved</body></html>)
So the question is: How does my freshly created model instance get a database connection so save itself? And as a followup: Is this is a sensible way to do it?

How does my freshly created model instance get a database connection
so save itself?
Essentially, each model has a Manager that knows the database connection. In reality it is a bit more complicated, because the manager delegates the connection creation and management (to database routers and connection managers).
Is this is a sensible way to do it?
Well, that's a question that cannot be answered without context, really. In the context of what a Django model is, this is the sensible approach because as a developer you do not have to concern yourself with connection management.
If you're asking whether Django takes a sensible approach to connection management, and you are worried it may not, here's what the Django documentation has to say about it:
Django opens a connection to the database when it first makes a
database query. It keeps this connection open and reuses it in
subsequent requests. Django closes the connection once it exceeds the
maximum age defined by CONN_MAX_AGE or when it isn’t usable any
longer.
and:
Since each thread maintains its own connection, your database must
support at least as many simultaneous connections as you have worker
threads.
So now the question is: when and how many threads are created? This depends on the server used. E.g. the development server starts a new thread for every request, whereas gunicorn reuses threads across requests.

It is hard to describe how django does this but you can probably learn more about this by looking at the source code of django and specifically on django.db module.
There's a DB abstraction layer so that Django works with many databases such as sqlite, mysql and postgresql. There's connection pooling so that django reuses the connection on subsequent queries. All these things are used by a django model when a db query is to be run. In the end it is not simple and you should check the source code to find detailed answers.

Related

Flask tutorial: Why do we use the app context for the DB connection?

I'm studying the Flask tutorial and am confused about the usage of the app context for connecting to the database. On the page, the author says:
Creating and closing database connections all the time is very
inefficient, so you will need to keep it around for longer. Because
database connections encapsulate a transaction, you will need to make
sure that only one request at a time uses the connection.
However, it seems that creating and closing the connection is exactly what is accomplished by the code. We have a view get_db that connects to the database, returns a connection object, and adds it to the app context if not present. We also have a close_db view that closes the connection object when the app context is torn down.
My understanding of what's happening here is as follows: A user connects to the app's home page, establishing a connection with the database and then closing it. Each time the user makes a request to the page (e.g., posting a new blog entry), a new connection is established with the database and then closed when the request is completed. So what's the value of using the app context at all? Why not just directly create and close a connection for each request?
Maybe I am misunderstanding something here -- any insight appreciated!
It's badly documented. But basically the whole point of the app context is to abstract app-specific-but-persistent-across-request data objects (ostensibly to allow multi-tenant applications that are all separate instances of Flask without tying everything to request context). In any case, flask.g is the app context-specific dict for the current app context. This way you can call get_db() on the very first request, shove the resulting connection object into g and then use g.dbconn later on in the app, which doesn't get killed on each request (presumably you want to use the same connection on each page access). I like to think of flask.g is like a class (well the app, in this case) instance of globals().
See also: http://flask.pocoo.org/docs/0.12/appcontext/#app-context

Handle connections to user defined DB in Django

I have pretty simple model. User defines url and database name for his own Postgres server. My django backend fetches some info from client DB to make some calculations, analytics and draw some graphs.
How to handle connections? Create new one when client opens a page, or keep connections alive all the time?(about 250-300 possible clients)
Can I use Django ORM or smth like SQLAlchemy? Or even psycopg library?
Does anyone tackle such a problem before?
Thanks
In your case, I would rather go with Django internal implementation and follow Django ORM as you will not need to worry about handling connection and different exceptions that may arise during your own implementation of DAO model in your code.
As per your requirement, you need to access user database, there still exists overhead for individual users to create db and setup something to connect with your codebase. So, I thinking sticking with Django will be more profound.

Do I need to use SQLAlchemy sessions?

The official tutorial for SQLAlchemy provides examples that make use of the session system, such as the following:
>>> from sqlalchemy.orm import sessionmaker
>>> Session = sessionmaker(bind=engine)
Many unofficial tutorials also make use of sessions, however some don't make use of them at all, instead opting for whatever one would call this approach:
e = create_engine('sqlite:///company.db')
conn = e.connect()
query = conn.execute("SELECT first_name FROM employee")
Why are sessions needed at all when this much simpler system seems to do the same thing? The official documentation doesn't make it clear why this would be necessary, as far as I've seen.
There is one particularily relevant section in the official SQLAlchemy documentation:
A web application is the easiest case because such an application is already constructed around a single, consistent scope - this is the request, which represents an incoming request from a browser, the processing of that request to formulate a response, and finally the delivery of that response back to the client. Integrating web applications with the Session is then the straightforward task of linking the scope of the Session to that of the request. The Session can be established as the request begins, or using a lazy initialization pattern which establishes one as soon as it is needed. The request then proceeds, with some system in place where application logic can access the current Session in a manner associated with how the actual request object is accessed. As the request ends, the Session is torn down as well, usually through the usage of event hooks provided by the web framework. The transaction used by the Session may also be committed at this point, or alternatively the application may opt for an explicit commit pattern, only committing for those requests where one is warranted, but still always tearing down the Session unconditionally at the end.
...and...
Some web frameworks include infrastructure to assist in the task of
aligning the lifespan of a Session with that of a web request. This
includes products such as Flask-SQLAlchemy, for usage in conjunction
with the Flask web framework, and Zope-SQLAlchemy, typically used with
the Pyramid framework. SQLAlchemy recommends that these products be
used as available.
Unfortunately I still can't tell whether I need to use sessions, or if the final paragraph is implying that certain implementation such as Flask-SQLAlchemy are already managing sessions automatically.
Do I need to use sessions? Is there a significant risk to not using sessions? Am I already using sessions because I'm using Flask-SQLAlchemy?
Like you pointed out, sessions are not strictly necessary if you only construct and execute queries using plain SQLAlchemy Core. However, they provide higher layer of abstraction required to take advantage of SQLAlchemy ORM. Session maintains a graph of modified models and makes sure the changes are efficiently and consistently flushed to the database when necessary.
Since you already use Flask-SQLAlchemy, I don't see a reason to avoid sessions even if you don't need the ORM features. The extension handles all the plumbing necessary for isolating requests, so that you don't have to reinvent the wheel and can focus on your application code.
You need sessions if you want to use ORM features including:
Changing some attribute on an object returned by a query and having it easily written back to the database
create an object using normal python constructors and have it easily sent back to the database
Have relationships on objects. So, for example if you had a blog, and you had a post object, be able to write post.author to access the User object responsible for that post.
I'll note also that even if you use Session, you don't actually need sessionmaker. In a web application, you probably want it, but you can use sessions like this if you're looking for simplicity:
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
engine = create_engine(...)
session = Session(bind = engine)
Again, I think you probably want sessionmaker in your app, but you don't need it for sessions to work

Where to instantiate shared thread objects in Django, equiv of pyramid registry?

I'm plugging some framework agnostic code of mine into Django instead of Pyramid. It uses SQLAlchemy and has a customized session factory object for getting db sessions. In pyramid, I instantiate this at server start up in the main app method and attach it to the registry so that all other parts of my app can get at it. I'd like to know what the "correct" way of instantiating and making available a shared factory is in Django. Is there somewhere canonical for putting something like that so that Django users will find it easily and the code will be readable to people used to Django patterns?
thanks
I place my SQLAlchemy/SQLSoup connections at models.py, because it is related to persistence (and to the "model" layer of Model-View-Whatever).
You can even replace the Django ORM with SQLAlchemy if you are not using applications relying on the first like django.contrib.admin or django.contrib.auth.

sqlalchemy identity map question

The identity map and unit of work patterns are part of the reasons sqlalchemy is much more attractive than django.db. However, I am not sure how the identity map would work, or if it works when an application is configured as wsgi and the orm is accessed directly through api calls, instead of a shared service. I would imagine that apache would create a new thread with its own python instance for each request. Each instance would therefore have their own instance of the sqlalchemy classes and not be able to make use of the identity map. Is this correct?
I think you misunderstood the identity map pattern.
From : http://martinfowler.com/eaaCatalog/identityMap.html
An Identity Map keeps a record of all
objects that have been read from the
database in a single business
transaction.
Records are kept in the identity map for a single business transaction. This means that no matter how your web server is configured, you probably will not hold them for longer than a request (or store them in the session).
Normally, you will not have many users taking part in a single business transation. Anyway, you probably don't want your users to share objects, as they might end up doing things that are contradictory.
So this all depends on how you setup your sqlalchemy connection. Normally what you do is to manage each wsgi request to have it's own threadlocal session. This session will know about all of the goings-on of it, items added/changed/etc. However, each thread is not aware of the others. In this way the loading/preconfiguring of the models and mappings is shared during startup time, however each request can operate independent of the others.

Categories

Resources