Enums in SQLite - Python

My model uses a couple of Enums for various columns. Creating the tables with SQLAlchemy's create_all() method works fine in PostgreSQL, but with SQLite it just stalls.
The problem seems to be with creating the Enums. As far as I can tell SQLite doesn't support them natively, but according to SQLAlchemy's docs that shouldn't pose a problem. When I call create_all() against an SQLite in-memory database it just stalls; even with echo=True no output appears.
I tried the following code to demonstrate the problem:
from sqlalchemy import create_engine, Enum
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
e = Enum('foo', 'bar', metadata=Base.metadata)
engine = create_engine('sqlite:///:memory:', echo=True)
Base.metadata.create_all(bind=engine)
When I run this script it shows no output whatsoever; it just stalls, Python uses 100% CPU, and the script never ends until I Ctrl-C it.
When I call create_all() on my actual schema it does echo some PRAGMA commands while checking whether tables exist, but then it stalls on creating the Enums. I removed code from the model definition until it worked fine again, which is how I figured out the Enums are the culprit.
I'm running this on Python 3.4 with SQLAlchemy 0.9.6 and SQLite 3.7.13.

A friend ran into exactly this same problem recently, and it looks to me like an infinite-loop bug in SQLA (which I should really report, so thanks for this minimal testcase :)).
Just remove the metadata= kwarg from your real code; as long as the enum is used as a type inside a declarative class, it'll inherit the right metadata anyway.
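For example, a minimal declarative model along these lines creates fine on SQLite, where the enum is emitted as a VARCHAR with a CHECK constraint (the Widget class and column names here are just placeholders, not from the question):
from sqlalchemy import Column, Integer, Enum, create_engine
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

class Widget(Base):
    __tablename__ = 'widget'
    id = Column(Integer, primary_key=True)
    kind = Column(Enum('foo', 'bar'))  # no metadata= kwarg needed

engine = create_engine('sqlite:///:memory:', echo=True)
Base.metadata.create_all(bind=engine)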

Related

Idiomatic Way to Insert/Upsert Protobuf Into A Relational Database

I have a Python object, which is a ProtoBuf message, that I want to insert into a database.
Ideally I'd like to be able to do something like
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import mapper, sessionmaker
from event_pb2 import Event
engine = create_engine(...)
metadata = MetaData(engine)
table = Table("events", metadata, autoload=True)
mapping = mapper(Event, table)
Session = sessionmaker(engine)
session = Session()
byte_string = b'.....'
event = Event()
event.ParseFromString(byte_string)
session.add(event)
When I try the above I get an error AttributeError: 'Event' object has no attribute '_sa_instance_state' when I try to create the Event object, which isn't shocking given that the Event class has been generated by ProtoBuf.
Is there a better i.e. safer or more succinct way to do that than manually generating the insert statement by looping over all the field names and values? I'm not married to using SqlAlchemy if there's a better way to solve the problem.
I think it's generally advised that you limit protobuf-generated classes to the client- and server-side gRPC methods and, for any uses beyond that, map the Protobuf objects to/from application-specific classes.
In this case, define a set of SQLAlchemy classes and transform the gRPC objects into the SQLAlchemy-specific classes for your app.
This avoids breakage if e.g. the gRPC maintainers change the implementation in a way that would break SQLAlchemy, it gives you a place to translate between e.g. proto Timestamps and your preferred database time format, and it provides a level of abstraction between gRPC and SQLAlchemy that affords you more flexibility in changing one or the other.
There do appear to be some tools that help with the translation (e.g. Mercator), but they also highlight the issues with this approach.
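A minimal sketch of that approach might look like the following; the EventRecord model and its fields are assumptions for illustration, not taken from the actual Event message:
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

class EventRecord(Base):
    # application-specific model that mirrors the protobuf Event
    __tablename__ = 'events'
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    payload = Column(String)

def event_to_record(event):
    # copy the protobuf fields into the SQLAlchemy model field by field
    return EventRecord(name=event.name, payload=event.payload)
You would then session.add() the EventRecord rather than the protobuf message itself.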

SQLAlchemy metadata conflict between model imports and alembic schema upgrade in test

I have a Flask application with a PostgreSQL database, Alembic migrations and SQLAlchemy.
Recently I started writing an integration test against the database. I import a model, say Item, that is mapped to the table "item"; doing "from models import Item" triggers construction of the SQLAlchemy metadata for my tables.
In my test setup, I have
@classmethod
def setUpClass(cls):
    try:
        os.remove('testdb.db:')
    except OSError:
        pass
    # Run db migrations
    global db_manager
    db_manager = DatabaseManager()
    alembic_cfg = Config("./alembic.ini")
    alembic_cfg.attributes['db_manager'] = db_manager
    command.upgrade(alembic_cfg, "head")
This results in
sqlalchemy.exc.InvalidRequestError: Table 'item' is already defined for this MetaData instance. Specify 'extend_existing=True' to redefine options and columns on an existing Table object.
I have debugged this far enough to see that the metadata object is the same one between the two calls, so the "item" table ends up being registered with it a second time.
I have another, pretty much identical application where this setup works, so I know it should work in theory. In that other application the metadata objects in the import phase and the upgrade phase differ, so the tables collection is empty when alembic runs the upgrade, and hence there is no error.
Sorry I can't provide the actual code; it's a work project. I might be able to construct a minimal toy example if I find the time.
If I understood where the metadata actually gets created inside SQLAlchemy, I might be able to track down why alembic gets a clean metadata instance in the working app, and not in the problem app.
In the working application, "extend_existing" is not set and I'd rather not invoke some hack to mask an underlying issue.
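For reference, the error itself is easy to reproduce in isolation; the snippet below is only an illustration of the mechanism, not the project code. Defining the same table twice against one MetaData instance raises exactly this InvalidRequestError:
from sqlalchemy import Table, Column, Integer, MetaData
metadata = MetaData()
Table('item', metadata, Column('id', Integer, primary_key=True))
# registering 'item' a second time on the same MetaData raises the error
Table('item', metadata, Column('id', Integer, primary_key=True))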

PyCharm SQLAlchemy autocomplete

I started evaluating PyCharm 3 professional edition because I will be working on several Pyramid + SQLAlchemy projects.
One of the things I would really love to have is SQLAlchemy autocomplete.
I created a new starter project with the alchemy scaffold following these instructions. I also installed the SQLAlchemy package for the interpreter and virtual environment I am using for this project.
Also, when I created a new PyCharm project for this code, the IDE suggested that I install the pyramid, sqlalchemy and other packages. Of course I accepted the suggestion and let the IDE install all of those packages.
In the models.py file, the DBSession is declared as follows:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
In the views.py file, the DBSession is used this way:
one = DBSession.query(MyModel).filter(MyModel.name == 'one').first()
So I started playing with the IDE and did something like this: I typed DBSession. and the IDE gave me only a few suggestions, and the query function was not among them. Then I typed DBSession.query(MyModel). and pressed Ctrl+Space to get suggestions, and a 'No suggestions' message showed up.
I would really like to have SQLAlchemy suggestions for the functions I can call on my DBSession variable (like filter, filter_by, first, etc.). I would say this is mandatory for me :)
Is there something I am missing? Or, PyCharm doesn't support this?
The solution I've found to this (picked up from somewhere on the web) was to type hint the DBSession instance like this:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
""":type: sqlalchemy.orm.Session"""
After this, code completion seems to work fine everywhere in the project.
Note that the tutorial states:
This guide was written for PyCharm 2.7.3, although many of the topics apply for PyCharm 3.
In PyCharm 3 Professional, it is much easier to install Pyramid and start using a scaffold. See one of my video tutorials Pyramid in PyCharm in 5 minutes at 1:17 specifically.
Also you might want to blow away your project and start fresh if stuff doesn't work as expected.
PyCharm 3 Professional supports SQLAlchemy as follows.
Code insight (2.6+)
Possibility to view database structure in a diagram. Refer to the section Working with Diagrams.
Code completion and resolve. (3.0+)
See more information on how to use code completion.
I use a type declaration after a variable assignment:
from sqlalchemy import create_engine
from sqlalchemy.engine import Engine
...
engine = create_engine(connect_str, max_overflow=10)
engine: Engine
For variables in a for loop, I used:
for table, meta in tables.items():
    meta: Table
    pass
where tables is a sqlalchemy.orm.mapper.Mapper, and Table is an imported type:
from sqlalchemy import create_engine, Table
If anyone gets here now, the best solution I've seen for this issue can be found here. To save you the click:
from contextlib import contextmanager
from typing import ContextManager
from sqlalchemy.orm import Session

@contextmanager
def session() -> ContextManager[Session]:
    yield Session(...)

Python driver for PostgreSQL

Which is the best driver in Python to connect to PostgreSQL?
There are a few possibilities listed at http://wiki.postgresql.org/wiki/Python, but I don't know which is the best choice.
Any ideas?
psycopg2 is the one everyone uses with CPython. For PyPy though, you'd want to look at the pure Python ones.
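For reference, a minimal psycopg2 session looks roughly like this (the connection parameters are placeholders):
import psycopg2
# connection parameters are placeholders; adjust to your own database
conn = psycopg2.connect(dbname='mydb', user='scott', password='tiger', host='localhost')
cur = conn.cursor()
cur.execute('SELECT version()')
print(cur.fetchone())
cur.close()
conn.close()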
I would recommend sqlalchemy - it offers great flexibility and has a sophisticated interface.
Furthermore it's not bound to PostgreSQL alone.
Shameless c&p from the tutorial:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
# an Engine, which the Session will use for connection
# resources
some_engine = create_engine('postgresql://scott:tiger@localhost/')
# create a configured "Session" class
Session = sessionmaker(bind=some_engine)
# create a Session
session = Session()
# work with the session
myobject = MyObject('foo', 'bar')
session.add(myobject)
session.commit()
Clarifications due to comments (update):
sqlalchemy itself is not a driver but a so-called Object Relational Mapper. Under the hood it talks to the database through a DBAPI driver; in the PostgreSQL case that is psycopg2, which in turn wraps libpq.
Because the OP emphasized he wanted the "best driver" to "connect to postgresql", I pointed sqlalchemy out; it may be a false answer terminology-wise, but intention-wise I felt it to be the more useful one.
And even though I don't like the hair-splitting dance, I still ended up doing it, due to the pressure from the comments on my answer.
I apologize for any irritation caused.

Add database support at runtime

I have a Python module that I've been using over the years to process a series of text files for work. I now need to store some of the info in a db (using SQLAlchemy), but I would still like the flexibility of using the module without db support, i.e. not have to actually have sqlalchemy imported (or installed). As of right now, I have the following, and I've been creating Product or DBProduct, etc. depending on whether I intend to use a db or not.
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

class Product(object):
    pass

class WebSession(Product):
    pass

class Malware(WebSession):
    pass

class DBProduct(Product, Base):
    pass

class DBWebSession(WebSession, DBProduct):
    pass

class DBMalware(Malware, DBWebSession):
    pass
However, I feel that there has got to be an easier/cleaner way to do this; I feel I'm creating an inheritance mess and potential problems down the road. Ideally, I'd like to create a single class each for Product, WebSession, etc. (maybe using decorators) that contains the information necessary for using a db, but is only enabled/functional after calling something like enable_db_support(). Once that function is called, then regardless of which object I create, it (and all the classes it inherits from) enables all the column bindings, etc. I should also note that if I somehow figure out how to combine Product and DBProduct into one class, I sometimes need two versions of the same function: one that is called if db support is enabled and one if it's not. I've also considered "recreating" the object hierarchy when enable_db_support() is called, but that turned out to be a nightmare as well.
Any help is appreciated.
Well, you can probably get away with creating a pure, non-DB-aware model by using classical mapping, without the declarative extension. In this case, however, you will not be able to use relationships the way they are used in SA, but for simple data import/export types of models this should suffice:
# models.py
class User(object):
    pass

----

# mappings.py
from sqlalchemy import Table, MetaData, Column, ForeignKey, Integer, String
from sqlalchemy.orm import mapper
from models import User

metadata = MetaData()

user = Table('user', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('fullname', String(50)),
    Column('password', String(12))
)

mapper(User, user)
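To get the opt-in behaviour you describe, the classical mappings can also be applied lazily from a function such as your enable_db_support(); this is only a sketch, and the table and column names are placeholders:
# db_support.py -- imported only when db support is wanted
from sqlalchemy import Table, MetaData, Column, Integer, String
from sqlalchemy.orm import mapper
from models import Product

metadata = MetaData()

def enable_db_support():
    # Map the plain Product class to a table; until this is called,
    # Product stays a plain Python class with no SQLAlchemy dependency.
    product = Table('product', metadata,
        Column('id', Integer, primary_key=True),
        Column('name', String(100))
    )
    mapper(Product, product)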
Another option would be to have a base class for your models defined in some other module, and to configure at start-up whether this base class is DB-aware or not; in the DB-aware case you would add the extra features like relationships and engine configuration...
It seems to me that the DRYest thing to do would be to abstract away the details of your data storage format, be that a plain text file or a database.
That is, write some kind of abstraction layer that your other code uses to store the data, and make it so that the output of your abstraction layer is switchable between SQL or text.
Or put yet another way, don't write a Product and DB_Product class. Instead, write a store_data() function that can be told to use either format='text' or format='db'. Then use that everywhere.
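A rough sketch of that idea (the store_data() signature and the formats here are illustrative, not taken from your module):
def store_data(product, format='text', session=None, path='products.txt'):
    # Persist a product either to a plain text file or to the database.
    if format == 'db':
        # assumes a SQLAlchemy session and a mapped Product class were set up
        session.add(product)
        session.commit()
    else:
        with open(path, 'a') as f:
            f.write(repr(product) + '\n')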
This is actually the same thing SQLAlchemy does behind the scenes - you don't have to write separate code for SQLAlchemy depending on whether it's driving mySQL, PostgreSQL, etc. That is all handled in SQLAlchemy, and you use the abstracted (database-neutral) interface.
Alternately, if your objection to SQLAlchemy is that it's not a Python builtin, there's always sqlite3. This gives you all the goodness of an SQL relational database with none of the fat.
Alternately alternately, use sqlite3 as an intermediate format. So rewrite all your code to use sqlite3, and then translate from sqlite3 to plain text (or another database) as required. In the limit case, conversion to plain text is only a sqlite3 db .dump away.
