Sqlalchemy's documentation says that one can create a session in two ways:
from sqlalchemy.orm import Session
session = Session(engine)
or with a sessionmaker
from sqlalchemy.orm import session_maker
Session = session_maker(engine)
session = Session()
Now in either case one needs a global object (either the engine, or the session_maker object). So I do not really see what the point of the session_maker is. Maybe I am misunderstanding something.
I could not find any advice when one should use one or the other. So the question is: In which situation would you want to use Session(engine) and in which situation would you prefer session_maker?
The docs describe the difference very well:
Session is a regular Python class which can be directly instantiated. However, to standardize how sessions are configured and acquired, the sessionmaker class is normally used to create a top level Session configuration which can then be used throughout an application without the need to repeat the configurational arguments.
Related
I'm learning sqlalchemy's ORM and I'm finding it very confusing / unintuitive. Say I want to create a User class and corresponding table.. I think I should do something like
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine("sqlite:///todooo.db")
Base = declarative_base()
class User(Base):
__tablename__ = 'some_table'
id = Column(Integer, primary_key=True)
name = Column(String(50))
Base.metadata.create_all(engine)
Session = sessionmaker(bind = engine)
session1 = Session()
user1 = User(name='fred')
session1.add(user1)
session1.commit()
Here I
create an Engine. My understanding is that the engine is like the front-line communicator between my code and the database.
create a DeclarativeMeta, a metaclass whose job I think is to keep track of a mapping between my Python classes and my SQL tables
create a User class
initialize my database and tables with Base.metadata.create_all(engine)
create a Session metaclass
create a Session instance, session1
create a User instance, user1
The thing I find quite confusing here is the Base superclass. What is the benefit to using it as opposed to doing something like engine.create_table(User)?
Additionally, if we don't need a Session to create the database and insert tables, why do we need a Session to insert records?
SQLAlchemy needs a mechanism to hook the classes being mapped to the database rows. Basically, you have to tell it:
Use the class User as a mapped class for the table some_table. One way to do it is to use a common base class - declarative base. Another way would be calling a function to register you class into the mapper. This declarative base used to be an extension of sqlalchemy IIRC but later it became a standard.
Now, having a common base makes perfect sense for me, because I do not have to make an extra step to call a function to register the mapping. Instead, whatever I inherit from the declarative base is automatically mapped. Both attitudes can work in general.
Engine is able to give you connections and can take care of connection pooling. Connection is able to run things on the database. No ORM as yet. With a connection, you can create and run queries using QL (query language), but you have no mapping of database data to python objects.
Session uses a connection and takes care of ORM. Better read the docs here, but the simplest case is:
user1.name = "Different Fred"
That's it. It will generate and execute the SQL at the right moment. Really, read the docs.
Now, you can create a table only with a connection, as it does not make much sense to include the session in the process, because session takes care of the current mapping session. There is nothing to map if you do not physically have the table yet. So you create the tables with the connection, then you can make a session and use mapping. Also, tables creation is usually a once-off action done separately from the normal program run (at least create_all).
I have an Python object which is a ProtoBuf message, that I want to insert into a database.
Ideally I'd like to be able to do something like
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import mapper, sessionmaker
from event_pb2 import Event
engine = create_engine(...)
metadata = MetaData(engine)
table = Table("events", metadata, autoload=True)
mapping = mapper(Event, table)
Session = sessionmaker(engine)
session = Session()
byte_string = b'.....'
event = Event()
event.ParseFromString(byte_string)
session.add(event)
When I try the above I get an error AttributeError: 'Event' object has no attribute '_sa_instance_state' when I try to create the Event object. Which isn't shocking because the Event class has been generated by ProtoBuf.
Is there a better i.e. safer or more succinct way to do that than manually generating the insert statement by looping over all the field names and values? I'm not married to using SqlAlchemy if there's a better way to solve the problem.
I think it's generally advised that you should limit protobuf generated classes to the client and server side gRPC methods and, for any uses beyond that, map Protobuf objects from|to application specific classes.
In this case, define a set of SQLAlchemy classes and transform the gRPC objects into SQLAlchemy specific classes for your app.
This avoids breakage if e.g. gRPC maintainers change the implementation in a way that would break SQLAlchemy, it provides you with a means to translate between e.g. proto Timestamps and your preferred database time format, and it provides a level of abstraction between gRPC and SQLAlchemy that affords you more flexibility in making changes to one or the other.
There do appear to be some tools to help with the translation but, these highlight issues with their approach e.g. Mercator.
I have a query that fetches a lot of data from my MySQL db, where loading all of the data into memory isn't an option. Luckily, SQLAlchemy lets me create an engine using MySQL's SSCursor, so the data is streamed and not fully loaded into memory. I can do this like so:
create_engine(connect_str, connect_args={'cursorclass': MySQLdb.cursors.SSCursor})
That's great, but I don't want to use SSCursor for all my queries including very small ones. I'd rather only use it where it's really necessary. I thought I'd be able to do this with the stream_results setting like so:
conn.execution_options(stream_results=True).execute(MyTable.__table__.select())
Unfortunately, when monitoring memory usage when using that, it seems to use the exact same amount of memory as if I don't do that, whereas using SSCursor, my memory usage goes down to nil as expected. What am I missing? Is there some other way to accomplish this?
From the docs:
stream_results – Available on: Connection, statement. Indicate to the
dialect that results should be “streamed” and not pre-buffered, if
possible. This is a limitation of many DBAPIs. The flag is currently
understood only by the psycopg2 dialect.
I think you just want to create multiple sessions one for streaming and one for normal queries, like:
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
def create_session(engine):
# configure Session class with desired options
Session = sessionmaker()
# associate it with our custom Session class
Session.configure(bind=engine)
# work with the session
session = Session()
return session
#streaming
stream_engine = create_engine(connect_str, connect_args={'cursorclass': MySQLdb.cursors.SSCursor})
stream_session = create_session(stream_engine)
stream_session.execute(MyTable.__table__.select())
#normal
normal_engine = create_engine(connect_str)
normal_session = create_session(normal_engine)
normal_session.execute(MyTable.__table__.select())
I have a Flask application with which I'd like to use SQLAlchemy Core (i.e. I explicitly do not want to use an ORM), similarly to this "fourth way" described in the Flask doc:
http://flask.pocoo.org/docs/patterns/sqlalchemy/#sql-abstraction-layer
I'd like to know what would be the recommended pattern in terms of:
How to connect to my database (can I simply store a connection instance in the g.db variable, in before_request?)
How to perform reflection to retrieve the structure of my existing database (if possible, I'd like to avoid having to explicitly create any "model/table classes")
Correct: You would create a connection once per thread and access it using a threadlocal variable. As usual, SQLAlchemy has thought of this use-case and provided you with a pattern: Using the Threadlocal Execution Strategy
db = create_engine('mysql://localhost/test', strategy='threadlocal')
db.execute('SELECT * FROM some_table')
Note: If I am not mistaken, the example seems to mix up the names db and engine (which should be db as well, I think).
I think you can safely disregard the Note posted in the documentation as this is explicitly what you want. As long as each transaction scope is linked to a thread (as is with the usual flask setup), you are safe to use this. Just don't start messing with threadless stuff (but flask chokes on that anyway).
Reflection is pretty easy as described in Reflecting Database Objects. Since you don't want to create all the tables manually, SQLAlchemy offers a nice way, too: Reflecting All Tables at Once
meta = MetaData()
meta.reflect(bind=someengine)
users_table = meta.tables['users']
addresses_table = meta.tables['addresses']
I suggest you check that complete chapter concerning reflection.
Which is the best driver in python to connect to postgresql?
There are a few possibilities, http://wiki.postgresql.org/wiki/Python but I don't know which is the best choice
Any idea?
psycopg2 is the one everyone uses with CPython. For PyPy though, you'd want to look at the pure Python ones.
I would recommend sqlalchemy - it offers great flexibility and has a sophisticated inteface.
Futhermore it's not bound to postgresql alone.
Shameless c&p from the tutorial:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
# an Engine, which the Session will use for connection
# resources
some_engine = create_engine('postgresql://scott:tiger#localhost/')
# create a configured "Session" class
Session = sessionmaker(bind=some_engine)
# create a Session
session = Session()
# work with sess
myobject = MyObject('foo', 'bar')
session.add(myobject)
session.commit()
Clarifications due to comments (update):
sqlalchemy itself is not a driver, but a so called Object Relational Mapper. It does provide and include it's own drivers, which in the postgresql-case is libpq, which itself is wrapped in psycopg2.
Because the OP emphasized he wanted the "best driver" to "connect to postgresql" i pointed sqlalchemy out, even if it might be a false answer terminology-wise, but intention-wise i felt it to be the more useful one.
And even if i do not like the "hair-splitting" dance, i still ended up doing it nonetheless, due to the pressure felt coming from the comments to my answer.
I apologize for any irritations caused by my slander.