I want to encrypt data in Postgres and then decrypt and read it. I would prefer to use SQLAlchemy and the ORM, but if that is difficult I am curious about other approaches as well.
I tried the code below. It encrypts the data into the database, but it never asks me for any key for the decryption. May I know why?
import sqlalchemy as sa
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy_utils import EncryptedType
from sqlalchemy_utils.types.encrypted.encrypted_type import AesEngine

secret_key = 'secretkey1234'

connection_string = '***********'
engine = create_engine(connection_string)
connection = engine.connect()
sa.orm.configure_mappers()
Session = sessionmaker(bind=connection)
session = Session()
Base = declarative_base()

class User(Base):
    __tablename__ = "user"
    id = sa.Column(sa.Integer, primary_key=True)
    username = sa.Column(EncryptedType(sa.Unicode, secret_key, AesEngine, 'pkcs5'))
    number_of_accounts = sa.Column(EncryptedType(sa.Integer, secret_key, AesEngine, 'oneandzeroes'))

Base.metadata.create_all(connection)
I run the below code for the decryption:
user_id = user.id
session.expunge_all()
user_instance = session.query(User).get(user_id)
print('username: {}'.format(user_instance.username))
You have likely figured this out by now, as this question is a few years old, but for anyone else looking:
You are interacting with your Postgres tables through the model classes you define (in your example User).
When you execute a query, data is returned and passed through the class to determine how to process the response. From your example a query will return results for id, username and number_of_accounts. If you were to log each element returned, id would be processed as an int because that is how it is defined in your model.
Similarly, username and number_of_accounts are processed based on their definition in the User class - as EncryptedType() values. This is a more complex datatype, though: the model itself defines the key to use for encryption/decryption. Before the results are handed back to you, the values are decrypted based on the context provided in your model - in this case using the AesEngine and the key 'secretkey1234'. That is why you don't need to specify a key on read: it is already defined in your model.
If you were to run a select * from user limit 1; query directly on your Postgres db, the values displayed for your two encrypted columns would remain encrypted, as you would not be passing the results through your defined model.
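To make that concrete, here is a minimal sketch (assuming the User model above and an existing row with id 1) showing the same value read both ways:

from sqlalchemy import text

# Read through the ORM: EncryptedType decrypts transparently with the key from the model
user = session.query(User).get(1)
print(user.username)  # plaintext

# Read with raw SQL: no model is involved, so the stored ciphertext comes back as-is
with engine.connect() as conn:
    row = conn.execute(text('SELECT username FROM "user" LIMIT 1')).fetchone()
    print(row[0])  # ciphertext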
I have a Postgres database that I'm trying to reflect, and it uses the now-standard "Identity" columns for primary keys.
Here's my table definition:
create table class_label (
    class_label_id integer PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    class_name varchar not null,
    default_color varchar,
    created_dttm timestamp default current_timestamp NOT NULL,
    created_by varchar DEFAULT USER NOT NULL,
    updated_dttm timestamp default current_timestamp NOT NULL,
    updated_by varchar DEFAULT USER NOT NULL
);
And here's my code:
from sqlalchemy import create_engine, MetaData, insert, Table, or_, and_, func
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import scoped_session, sessionmaker
import os
from sqlalchemy.schema import CreateColumn
from sqlalchemy.ext.compiler import compiles

@compiles(CreateColumn, 'postgresql')
def use_identity(element, compiler, **kw):
    text = compiler.visit_create_column(element, **kw)
    text = text.replace("SERIAL", "INT GENERATED BY DEFAULT AS IDENTITY")
    return text

usr = os.environ.get("POSTGRES_USR")
pwd = os.environ.get("POSTGRES_PWD")
host = os.environ.get("POSTGRES_HOST")

engine = create_engine('postgresql://' + usr + ':' + pwd + '@' + host, convert_unicode=True)
session = scoped_session(sessionmaker(bind=engine))

metadata = MetaData(bind=engine)
metadata.reflect(engine, only=['class_label'])
Base = automap_base(metadata=metadata)
Base.prepare()

Class_Label = Base.classes.class_label

session.add(Class_Label(class_name="Testing", default_color="red"))
session.commit()
When I run my code, I get this error:
sqlalchemy.orm.exc.FlushError: Instance <class_label at 0x1091c4ba8> has a NULL identity key.
If this is an auto-generated value, check that the database table allows generation of new
primary key values, and that the mapped Column object is configured to expect these generated
values. Ensure also that this flush() is not occurring at an inappropriate time, such as within
a load() event.
Per https://docs.sqlalchemy.org/en/13/dialects/postgresql.html#postgresql-10-identity-columns I understand that this is somewhat of a shortcoming of SQLAlchemy, but I'm wondering whether the work-around they suggest can work for auto-mapped/reflected databases, and how I would implement it.
I'm using SQLAlchemy 1.3.16 and Postgres 11.
A fix was added in SQLAlchemy 1.4, which introduced first-class support for IDENTITY columns (including reflection) via the Identity construct.
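For reference, this is roughly what the 1.4+ declarative form looks like (a sketch, assuming SQLAlchemy >= 1.4; the class is illustrative, not from the question):

from sqlalchemy import Column, Integer, Identity

class ClassLabel(Base):
    __tablename__ = 'class_label'
    # Identity() renders GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY on Postgres 10+
    class_label_id = Column(Integer, Identity(always=True), primary_key=True)

On 1.4+, reflection also picks up identity columns, so the automap code above should work without the @compiles workaround.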
I'm trying to reproduce a bug locally which I think is caused by a race condition where an update is relying on stale data (due to synchronize_session=False), essentially something like the following:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, Boolean, CheckConstraint
from sqlalchemy.orm.session import sessionmaker

Base = declarative_base()

# change this to your actual postgres url
db_string = "postgres://max:steve@localhost/test"
db = create_engine(db_string)

class User(Base):
    __tablename__ = 'users4'
    id = Column(Integer, primary_key=True)
    deleted = Column(Boolean)
    super_user = Column(Boolean, CheckConstraint('NOT (super_user AND deleted)', name='check1'))

Base.metadata.create_all(db)

Session = sessionmaker(bind=db)
session = Session()
session.autoflush = False

# Create a user
session.add(User(id=1, deleted=False, super_user=False))

# Delete that user
session.query(User).filter(User.id == 1).update(
    {'deleted': True}, synchronize_session=False)

# Make all non-deleted users into super users
# Will violate the CHECK constraint if the previous query hasn't
# been flushed
session.query(User).filter(User.deleted == False).update({'super_user': True})
Is there a way I can force SQLAlchemy to use the cached session state (maybe through mocking or some such) so that this code will violate the constraint and raise an IntegrityError?
The docs for synchronize_session say that
... updated objects may still remain in the session with stale values on their attributes, which can lead to confusing results.
This is the situation that I want to reproduce.
The last update query does not actually use the stale session data. But a case like this, where logic acts on the stale attributes, will trigger the check constraint when a flush finally does occur:
# Create a user
user1 = User(id=1, deleted=False, super_user=False)
session.add(user1)

# Delete that user
session.query(User).filter(User.id == 1).update(
    {'deleted': True}, synchronize_session=False)

# Make all non-deleted users into super users
# Will violate the CHECK constraint if the previous query hasn't
# been flushed
if not user1.deleted:
    user1.super_user = True

session.flush()
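In a test, that final session.flush() can be wrapped to assert the reproduced failure explicitly (a sketch, assuming the schema above):

from sqlalchemy.exc import IntegrityError

try:
    session.flush()
except IntegrityError:
    # check1 fired: user1 was promoted to super_user in memory while the
    # database row already had deleted = TRUE
    session.rollback()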
I'm trying to get the number of rows in each table of a SQL Server database, looping through the tables to see how much data they contain. However, I'm not sure what should go into the select_from() function. I currently supply a Unicode table name, which raised
NoInspectionAvailable: No inspection system is available for object of type <type 'unicode'>
The code that I used was
from sqlalchemy import create_engine
import urllib
from sqlalchemy import inspect
import sqlalchemy
from sqlalchemy import select, func, Integer, Table, Column, MetaData
from sqlalchemy.orm import sessionmaker

connection_string = "DRIVER={SQL Server}; *hidden*"
connection_string = urllib.quote_plus(connection_string)
connection_string = "mssql+pyodbc:///?odbc_connect=%s" % connection_string

engine = sqlalchemy.create_engine(connection_string)
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
connection = engine.connect()

inspector = inspect(engine)
for table_name in inspector.get_table_names():
    print session.query(func.count('*')).select_from(table_name).scalar()
Typically, it's a class name that refers to a class that describes the database table.
In the sqlalchemy docs, http://docs.sqlalchemy.org/en/latest/orm/tutorial.html, they have you create a base class using declarative base and then create child classes for each table you want to query. You would then pass that class name into the select_from function unquoted.
The Flask-SQLAlchemy extension provides a ready-to-use base class called db.Model, and Django has one called models.Model.
Alternatively, you can also pass queries. I typically use Flask for Python, so I usually initiate queries like this:
my_qry = db.session.query(Cust).filter(Cust.cust == 'lolz')
results = my_qry.all()
On a side note, if you decide to look at .NET they also have nice ORMs. Personally, I favor Entity Framework, but Linq to SQL is out there, too.
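Back to the original loop: one way to make it work without declaring a class per table (a sketch, assuming the engine, session, and inspector from the question) is to pass reflected Table objects to select_from() instead of plain strings:

from sqlalchemy import MetaData, Table, func

metadata = MetaData()
for table_name in inspector.get_table_names():
    # Reflect the table so select_from() receives a selectable, not a unicode string
    table = Table(table_name, metadata, autoload=True, autoload_with=engine)
    count = session.query(func.count()).select_from(table).scalar()
    print("{}: {}".format(table_name, count))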
I wrote a module to create an empty database file:

from sqlalchemy import create_engine, MetaData

def create_database():
    engine = create_engine("sqlite:///myexample.db", echo=True)
    metadata = MetaData(engine)
    metadata.create_all()
But in another function, I want to open the myexample.db database and create tables in it if it doesn't already have them.
E.g. the first table I would subsequently create would be:

Table(Variable_TableName, metadata,
      Column('Id', Integer, primary_key=True, nullable=False),
      Column('Date', Date),
      Column('Volume', Float))

(Since it is initially an empty database it will have no tables in it, but subsequently I can add more tables to it. That's what I'm trying to say.)
Any suggestions?
I've managed to figure out what I intended to do. I used engine.dialect.has_table(engine, Variable_tableName) to check whether the database already contains the table; if it doesn't, the code proceeds to create it.
Sample code:
engine = create_engine("sqlite:///myexample.db") # Access the DB Engine
if not engine.dialect.has_table(engine, Variable_tableName): # If table don't exist, Create.
metadata = MetaData(engine)
# Create a table with the appropriate Columns
Table(Variable_tableName, metadata,
Column('Id', Integer, primary_key=True, nullable=False),
Column('Date', Date), Column('Country', String),
Column('Brand', String), Column('Price', Float),
# Implement the creation
metadata.create_all()
This seems to be giving me what i'm looking for.
Note that the 'Base.metadata' documentation states about create_all:
Conditional by default, will not attempt to recreate tables already
present in the target database.
You can also see that create_all takes these arguments: create_all(self, bind=None, tables=None, checkfirst=True), and according to the documentation checkfirst:
Defaults to True, don't issue CREATEs for tables already present in
the target database.
So if I understand your question correctly, you can just skip the condition.
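In other words, the check-then-create dance collapses to a single call (a sketch, assuming the metadata that holds your Table definitions):

engine = create_engine("sqlite:///myexample.db")
metadata = MetaData(engine)
# define your Table objects against this metadata, then:
metadata.create_all(checkfirst=True)  # checkfirst=True is already the default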
The accepted answer prints a warning that engine.dialect.has_table() is only for internal use and not part of the public API. The message suggests this as an alternative, which works for me:
import os
import sqlalchemy

# Set up a connection to a SQLite3 DB
test_db = os.getcwd() + "/test.sqlite"
db_connection_string = "sqlite:///" + test_db
engine = sqlalchemy.create_engine(db_connection_string)

# The recommended way to check for existence
sqlalchemy.inspect(engine).has_table("BOOKS")
See also the SQL Alchemy docs.
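On older releases where Inspector.has_table() is not yet public, the table list works as a fallback (a sketch, same engine as above):

# get_table_names() has long been part of the Inspector API
"BOOKS" in sqlalchemy.inspect(engine).get_table_names()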
For those who define their tables in a models.tables file, among other tables: this is a snippet for finding the class that represents the table we want to create (so that later we can use the same code to just query it). Even together with the if check above, I still run the create with checkfirst=True:
ORMTable.__table__.create(bind=engine, checkfirst=True)
models.tables:

class TableA(Base):
    ...

class TableB(Base):
    ...

class NewTableC(Base):
    __tablename__ = 'new_table_c'
    id = Column('id', Text, primary_key=True)  # a mapped class needs a primary key
    name = Column('name', Text)
Then in the form action file:
engine = create_engine("sqlite:///myexample.db")
if not engine.dialect.has_table(engine, table_name):
# Added to models.tables the new table I needed ( format Table as written above )
table_models = importlib.import_module('models.tables')
# Grab the class that represents the new table
# table_name = 'NewTableC'
ORMTable = getattr(table_models, table_name)
# checkfirst=True to make sure it doesn't exists
ORMTable.__table__.create(bind=engine, checkfirst=True)
engine.dialect.has_table does not work for me on cx_oracle.
I am getting AttributeError: 'OracleDialect_cx_oracle' object has no attribute 'default_schema_name'
I wrote a workaround function:
from sqlalchemy import text
from sqlalchemy.engine.base import Engine

def orcl_tab_or_view_exists(in_engine: Engine, in_object: str, in_object_name: str) -> bool:
    """Checks whether an Oracle table or view exists on the current in_engine connection

    in_object: 'table' | 'view'
    in_object_name: table_name | view_name
    """
    obj_query = """SELECT {o}_name FROM all_{o}s WHERE owner = SYS_CONTEXT ('userenv', 'current_schema') AND {o}_name = '{on}'""".format(o=in_object, on=in_object_name.upper())
    with in_engine.connect() as connection:
        result = connection.execute(text(obj_query))
        return len(list(result)) > 0
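Example usage (assuming an Engine already connected to Oracle):

if orcl_tab_or_view_exists(engine, 'table', 'class_label'):
    print('table exists')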
This is the code that works for me to create all tables for all model classes defined against the Base class:

from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class YourTable(Base):
    __tablename__ = 'your_table'
    id = Column(Integer, primary_key=True)

DB_URL = "mysql+mysqldb://<user>:<password>@<host>:<port>/<db_name>"
scoped_engine = create_engine(DB_URL)
Base.metadata.create_all(scoped_engine)
I have the following mapping (straight from SA examples):
class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    password = Column(String)
I'm working with a MySQL DB, and the table uses the InnoDB engine.
I have a single record in my table:
1|'user1'|'user1 test'|'password'
I've opened a session with the following code:
from sqlalchemy.orm.session import sessionmaker
from sqlalchemy.engine import create_engine
from sqlalchemy.orm.scoping import scoped_session
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()
db_engine = create_engine('mysql://...@localhost/test_db?charset=utf8', echo=False, pool_recycle=1800)
session_factory = sessionmaker(bind=db_engine, autocommit=False, autoflush=False)
session_maker = scoped_session(session_factory)
session = session_maker()

user_1 = session.query(User).filter(User.id == 1).one()
user_1.name  # This prints: u'user1'
Now, when I change the record's name in the DB to 'user1_change', commit it, and then refresh the object like this:

session.refresh(user_1)
user_1.name

it still prints u'user1' and not u'user1_change'.
What am I missing (or setting up wrong) here?
Thanks!
From the docs:
Note that a highly isolated transaction will return the same values as were previously read in that same transaction, regardless of changes in database state outside of that transaction
SQLAlchemy uses a transactional unit of work model, wherein each transaction is assumed to be internally consistent. A session is an interface on top of a transaction. Since a transaction is assumed to be internally consistent, SQLAlchemy will only (well, not quite, but for ease of explanation...) retrieve a given piece of data from the database and update the state of the associated objects once per transaction. Since you already queried for the object in the same session transaction, SQLAlchemy will not update the data in that object from the database again within that transaction scope. If you want to poll the database, you'll need to do it with a fresh transaction each time.
session.refresh() didn't work for me either. Even though I saw a low-level SELECT, the object was not updated after the refresh.
This answer https://stackoverflow.com/a/11121788/562267 hints at doing an actual commit/rollback to reset the session, and that worked for me:
user_1 = session.query(User).filter(User.id==1).one()
user_1.name # This prints: u'user1'
# update the database from another client here
session.commit()
user_1 = session.query(User).filter(User.id==1).one()
user_1.name # Should be updated now.
Did you try "expire", as described in the official doc:
http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#refreshing-expiring
# expire objects obj1, obj2, attributes will be reloaded
# on the next access:
session.expire(user_1)
session.refresh(user_1)
Using expire on an object results in a reload that will occur upon the next access.
Merge the session.
u = session.query(User).get(id)
u.name = 'user1_changed'
u = session.merge(u)
This merges your changes into the session and returns the reconciled object; the database itself is updated when the session is flushed or committed.