Add database support at runtime - python

I have a python module that I've been using over the years to process a series of text files for work. I now have a need to store some of the info in a db (using SQLAlchemy), but I would still like the flexibility of using the module without db support, i.e. not have to actually have sqlalchemy import'ed (or installed). As of right now, I have the following... and I've been creating Product or DBProduct, etc depending on wether I intend to use a db or not.
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Product(object):
pass
class WebSession(Product):
pass
class Malware(WebSession):
pass
class DBProduct(Product, Base):
pass
class DBWebSession(WebSession, DBProduct):
pass
class DBMalware(Malware, DBWebSession):
pass
However, I feel that there has got to be an easier/cleaner way to do this. I feel that I'm creating an inheritance mess and potential problems down the road. Ideally, I'd like to create a single class of Product, WebSession, etc (maybe using decorators) that contains the information neccessary for using a db, but it's only enabled/functional after calling something like enable_db_support(). Once that function is called, then regardless of what object I create, itself (and all the objects it inherits) enable all the column
bindings, etc. I should also note that if I somehow figure out how to include Product and DBProduct in one class, I sometimes need 2 versions of the same function: 1 which is called if db support is enabled and 1 if it's not. I've also considered "recreating" the object hierarchy when enable_db_support() is called, however that turned out to be a nightmare as well.
Any help is appreciated.

Well, you can probably get away with creating a pure non-DB aware model by using Classical Mapping without using a declarative extension. In this case, however, you will not be able to use relationships as they are used in SA, but for simple data import/export types of models this should suffice:
# models.py
class User(object):
pass
----
# mappings.py
from sqlalchemy import Table, MetaData, Column, ForeignKey, Integer, String
from sqlalchemy.orm import mapper
from models import User
metadata = MetaData()
user = Table('user', metadata,
Column('id', Integer, primary_key=True),
Column('name', String(50)),
Column('fullname', String(50)),
Column('password', String(12))
)
mapper(User, user)
Another option would be to have a base class for your models defined in some other module and configure on start-up this base class to be either DB-aware or not, and in case of DB-aware version add additional features like relationships and engine configurations...

It seems to me that the DRYest thing to do would be to abstract away the details of your data storage format, be that a plain text file or a database.
That is, write some kind of abstraction layer that your other code uses to store the data, and make it so that the output of your abstraction layer is switchable between SQL or text.
Or put yet another way, don't write a Product and DB_Product class. Instead, write a store_data() function that can be told to use either format='text' or format='db'. Then use that everywhere.
This is actually the same thing SQLAlchemy does behind the scenes - you don't have to write separate code for SQLAlchemy depending on whether it's driving mySQL, PostgreSQL, etc. That is all handled in SQLAlchemy, and you use the abstracted (database-neutral) interface.
Alternately, if your objection to SQLAlchemy is that it's not a Python builtin, there's always sqlite3. This gives you all the goodness of an SQL relational database with none of the fat.
Alternately alternately, use sqlite3 as an intermediate format. So rewrite all your code to use sqlite3, and then translate from sqlite3 to plain text (or another database) as required. In the limit case, conversion to plain text is only a sqlite3 db .dump away.

Related

Why does sqlalchemy use DeclarativeMeta class inheritance to map objects to tables

I'm learning sqlalchemy's ORM and I'm finding it very confusing / unintuitive. Say I want to create a User class and corresponding table.. I think I should do something like
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine("sqlite:///todooo.db")
Base = declarative_base()
class User(Base):
__tablename__ = 'some_table'
id = Column(Integer, primary_key=True)
name = Column(String(50))
Base.metadata.create_all(engine)
Session = sessionmaker(bind = engine)
session1 = Session()
user1 = User(name='fred')
session1.add(user1)
session1.commit()
Here I
create an Engine. My understanding is that the engine is like the front-line communicator between my code and the database.
create a DeclarativeMeta, a metaclass whose job I think is to keep track of a mapping between my Python classes and my SQL tables
create a User class
initialize my database and tables with Base.metadata.create_all(engine)
create a Session metaclass
create a Session instance, session1
create a User instance, user1
The thing I find quite confusing here is the Base superclass. What is the benefit to using it as opposed to doing something like engine.create_table(User)?
Additionally, if we don't need a Session to create the database and insert tables, why do we need a Session to insert records?
SQLAlchemy needs a mechanism to hook the classes being mapped to the database rows. Basically, you have to tell it:
Use the class User as a mapped class for the table some_table. One way to do it is to use a common base class - declarative base. Another way would be calling a function to register you class into the mapper. This declarative base used to be an extension of sqlalchemy IIRC but later it became a standard.
Now, having a common base makes perfect sense for me, because I do not have to make an extra step to call a function to register the mapping. Instead, whatever I inherit from the declarative base is automatically mapped. Both attitudes can work in general.
Engine is able to give you connections and can take care of connection pooling. Connection is able to run things on the database. No ORM as yet. With a connection, you can create and run queries using QL (query language), but you have no mapping of database data to python objects.
Session uses a connection and takes care of ORM. Better read the docs here, but the simplest case is:
user1.name = "Different Fred"
That's it. It will generate and execute the SQL at the right moment. Really, read the docs.
Now, you can create a table only with a connection, as it does not make much sense to include the session in the process, because session takes care of the current mapping session. There is nothing to map if you do not physically have the table yet. So you create the tables with the connection, then you can make a session and use mapping. Also, tables creation is usually a once-off action done separately from the normal program run (at least create_all).

Idiomatic Way to Insert/Upsert Protobuf Into A Relational Database

I have an Python object which is a ProtoBuf message, that I want to insert into a database.
Ideally I'd like to be able to do something like
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import mapper, sessionmaker
from event_pb2 import Event
engine = create_engine(...)
metadata = MetaData(engine)
table = Table("events", metadata, autoload=True)
mapping = mapper(Event, table)
Session = sessionmaker(engine)
session = Session()
byte_string = b'.....'
event = Event()
event.ParseFromString(byte_string)
session.add(event)
When I try the above I get an error AttributeError: 'Event' object has no attribute '_sa_instance_state' when I try to create the Event object. Which isn't shocking because the Event class has been generated by ProtoBuf.
Is there a better i.e. safer or more succinct way to do that than manually generating the insert statement by looping over all the field names and values? I'm not married to using SqlAlchemy if there's a better way to solve the problem.
I think it's generally advised that you should limit protobuf generated classes to the client and server side gRPC methods and, for any uses beyond that, map Protobuf objects from|to application specific classes.
In this case, define a set of SQLAlchemy classes and transform the gRPC objects into SQLAlchemy specific classes for your app.
This avoids breakage if e.g. gRPC maintainers change the implementation in a way that would break SQLAlchemy, it provides you with a means to translate between e.g. proto Timestamps and your preferred database time format, and it provides a level of abstraction between gRPC and SQLAlchemy that affords you more flexibility in making changes to one or the other.
There do appear to be some tools to help with the translation but, these highlight issues with their approach e.g. Mercator.

SQLAlchemy doesn't map reflected class

I have this code:
def advertiser_table(engine):
return Table('advertiser', metadata, autoload=True, autoload_with=engine)
And later I try this:
advertisers = advertiser_table(engine)
...
session.bulk_insert_mappings(
advertisers.name,
missing_advetisers.to_dict('records'),
)
where missing_adverisers is a Pandas DataFrame (but it's not important for this question).
The error this gives me is:
sqlalchemy.orm.exc.UnmappedClassError: Class ''advertiser'' is not mapped
From reading the documentation I could scramble enough to ask the question, but not much more than that... What is Mapper and why is it so detrimental to the functioning of this library?.. Why isn't "the class" mapped? Obviously, what am I to do to "map" it to whatever this library wants it to map?
A Mapper is the M in ORM. It is the thing that maps your table (advertisers in this case) to instances of a class (which you are missing in this case) in order for you to operate on it.
The reason it's confusing for you is because SQLAlchemy is actually two libraries in one -- one is called SQLAlchemy Core, and the other is the SQLAlchemy ORM. Core provides the ability to work with tables and to construct queries that return rows, while the ORM builds on top of Core to provide the ability to work with instances of classes and their relationships as an abstraction. Core roughly corresponds to things you can do on Connection and Engine, while ORM roughly corresponds to things you can do on Session.
So, all of that is to say, session.bulk_insert_mappings is an ORM functionality, and you cannot use it without having a mapped class.
What can you do instead? Use the equivalent Core functionality:
query = advertisers.insert().values(missing_advetisers.to_dict('records'))
engine.execute(query) # or session.execute(query)
Or even use the pandas-provided to_sql function:
missing_advetisers.to_sql("advertiser", engine, if_exists="append")
If you insist on using the ORM, you need to declare a mapped class for your table. The easiest way when using reflection is to use automap. The linked documentation has many examples, so I won't go into detail here.

flask-sqlalchemy or sqlalchemy

I am new in both flask and sqlalchemy, I just start working on a flask app, and I am using sqlalchemy for now. I was wondering if there is any significant benefit I can get from using flask-sqlalchemy vs sqlalchemy. I could not find enough motivations in http://packages.python.org/Flask-SQLAlchemy/index.html or maybe I did not understand the value!! I would appreciate your clarifications.
The main feature of the Flask-SQLAlchemy is proper integration with Flask application - it creates and configures engine, connection and session and configures it to work with the Flask app.
This setup is quite complex as we need to create the scoped session and properly handle it according to the Flask application request/response life-cycle.
In the ideal world that would be the only feature of Flask-SQLAlchemy, but actually, it adds few more things. Check out the docs for more info. Or see this blog post with the overview of them: Demystifying Flask-SQLAlchemy (update: the original article is not available at the moment, there is a snapshot on webarchive).
When I first worked with Flask and SQLAlchemy, I didn't like this overhead . I went over and extracted the session management code from the extension. This approach works, although I discovered that it is quite difficult to do this integration properly.
So the easier approach (which is used in another project I am working on) is to just drop the Flask-SQLAlchemy in and don't use any of additional features it provides. You will have the db.session and you can use it as if it was pure SQLAlchemy setup.
Flask-SQLAlchemy gives you a number of nice extra's you would else end up implementing yourself using SQLAlchemy.
Positive sides on using Flask-SQLAlchemy
Flask_SQLAlchemy handles session configuration, setup and teardown for you.
Gives you declarative base model that makes querying and pagination easier
Backend specific settings.Flask-SQLAlchemy scans installed libs for Unicode support and if fails automatically uses SQLAlchemy Unicode.
Has a method called apply_driver_hacks that automatically sets sane defaults to thigs like MySQL pool-size
Has nice build in methods create_all() and drop_all() for creating and dropping all tables. Useful for testing and in python command line if you did something stupid
It gives you get_or_404()instead of get() and find_or_404() instead of find() Code example at > http://flask-sqlalchemy.pocoo.org/2.1/queries/
Automatically set table names. Flask-SQLAlchemy automatically sets your table names converting your ClassName > class_name this can be overridden by setting __tablename__ class
List item
Negative sides on using Flask-SQLAlchemy
Using Flask-SQLAlchemy will make add additional difficulties to for
migrating from Flask to let's say Pyramid if you ever need to. This is mainly due to the custom declarative base model on Flask_SQLAchemy.
Using Flask-SQLAlchemy you risk using a package with a much smaller community than SQLAlchemy itself, which I cannot easily drop from active development any time soon.
Some nice extras Flask-SQLAlchemy has can make you confused if you do not know they are there.
To be honest, I don't see any benefits. IMHO, Flask-SQLAlchemy creates an additional layer you don't really need. In our case we have a fairly complex Flask application with multiple databases/connections (master-slave) using both ORM and Core where, among other things, we need to control our sessions / DB transactions (e.g. dryrun vs commit modes). Flask-SQLAlchemy adds some additional functionality such as automatic destruction of the session assuming some things for you which is very often not what you need.
The SQLAlchemy documentation clearly states that you should use Flask-SQLAlchemy (especially if you don't understand its benefits!):
[...] products such as Flask-SQLAlchemy [...] SQLAlchemy strongly recommends that these products be used as available.
This quote and a detailed motivation you can find in the second question of the Session FAQ.
As #schlamar suggests, Flask-SqlAlchemy is definitely a good thing. I'd just like to add some extra context to the point made there.
Don't feel like you are choosing one over the other. For example, let's say we want to grab all records from a table using a model using Flask-Sqlalchemy. It is as simple as
Model.query.all()
For a lot of the simple cases, Flask-Sqlalchemy is going to be totally fine. The extra point that I would like to make is, if Flask-Sqlalchemy is not going to do what you want, then there's no reason you can't use SqlAlchemy directly.
from myapp.database import db
num_foo = db.session.query(func.count(OtherModel.id)).filter(is_deleted=False).as_scalar()
db.session.query(Model.id, num_foo.label('num_foo')).order_by('num_foo').all()
As you can see, we can easily jump from one to the other with no trouble and in the second example we are in fact using the Flask-Sqlalchemy defined models.
Here is an example of a benefit flask-sqlalchemy gives you over plain sqlalchemy.
Suppose you're using flask_user.
flask_user automates creation and authentication of user objects, so it needs to access your database. The class UserManager does this by calling through to something called an "adapter" which abstracts the database calls. You provide an adapter in the UserManager constructor, and the adapter must implement these functions:
class MyAdapter(DBAdapter):
def get_object(self, ObjectClass, id):
""" Retrieve one object specified by the primary key 'pk' """
pass
def find_all_objects(self, ObjectClass, **kwargs):
""" Retrieve all objects matching the case sensitive filters in 'kwargs'. """
pass
def find_first_object(self, ObjectClass, **kwargs):
""" Retrieve the first object matching the case sensitive filters in 'kwargs'. """
pass
def ifind_first_object(self, ObjectClass, **kwargs):
""" Retrieve the first object matching the case insensitive filters in 'kwargs'. """
pass
def add_object(self, ObjectClass, **kwargs):
""" Add an object of class 'ObjectClass' with fields and values specified in '**kwargs'. """
pass
def update_object(self, object, **kwargs):
""" Update object 'object' with the fields and values specified in '**kwargs'. """
pass
def delete_object(self, object):
""" Delete object 'object'. """
pass
def commit(self):
pass
If you're using flask-sqlalchemy, you can use the built-in SQLAlchemyAdapter. If you're using sqlalchemy (not-flask-sqlalchemy) you might make different assumptions about the way in which objects are saved to the database (like the names of the tables) so you'll have to write your own adapter class.

Select statement with SqlAlchemy

Yes, very basic question.
I've successfully created my db using declarative_base, and can perform inserts into the db too. I just have a few questions about SqlAlchemy sql statements.
I've create a table called Location.
A few issues/questions (see code below):
For statement, "print row", I have to specify each column name that I want to have output. i.e. "print row.name, row.lat, etc" Why? (Otherwise the print statement outputs "<classname.Location at <...>>"
Also, what is the preferred way to interact with a db and perform queries (select, insert, update, etc.)- there seem to be a bunch of options: using sqlalchemy.orm.select for example, or engine.text(<sql query>).execute().fetchall(), or even conn.execute(<select>). Options are great, but right now they're all just confusing me.
Thanks so much for the tips!
Here's my code:
from sqlalchemy import create_engine
from sqlalchemy.sql import select
from location_db_setup import *
db_path = "sqlite:////volumes/users/shared/programming/python/web/map.db"
engine = create_engine(db_path, echo= True)
Session = sessionmaker(bind= engine)
session = Session()
session.query(Location).fetchall()
for row in locations:
print row
You code in sample is incomplete and has errors. So it's impossible to say for sure what is Location here. I assume it's a mapped class, so you are requesting a list of all Location objects, not rows. When you print an object you get its string representation. String representation of objects can be changed by defining custom __str__ method.
Although ORM is the most important part of SQLAlchemy, it's not the only. It also expose a lot of functionality not related to ORM directly. When you work with objects the preferred way to create queries are corresponding session method. But sometimes you need selectable objects not bound to particular session (they are not executed directly, but are used in expressions passed to session methods). That's why there are functions in sqlalchemy.orm package.
The preferred way to interact with a db when using an ORM is not to use queries but to use objects that correspond to the tables you are manipulating, typically in conjunction with the session object. SELECT queries become get() or find() calls in some ORMs, query() calls in others. INSERT becomes creating a new object of the type you want (and maybe explicitly adding it, eg session.add() in sqlalchemy). UPDATE becomes editing such an object, and DELETE becomes deleting an object (eg. session.delete() ). The ORM is meant to handle the hard work of translating these operations into SQL for you.
Have you read the tutorial?
Denis and Kylotan gave you good answers. I'm just gonna focus on point 2.
Sometimes depends on your taste. There are times when you need database specific features that an ORM can't do, that's a case when you should use Session(<sql here>).execute() or conn.execute(<sql here>). Another case is when you have a very complex query which is beyond you and you don't find a suitable ORM expression.
Usually, using ORM features like select([...]).where(... or Session.query(<Model here>).filter(... (declarative base) are enough. Almost every sql query has an ORM equivalent.

Categories

Resources