reflecting every schema from postgres DB using SQLAlchemy - python

I have an existing database with two schemas, named schools and students, mapped with a single declarative_base instance through two classes that inherit from it:
class DirectorioEstablecimiento(Base):
    __table_args__ = {'schema': 'schools'}
    __tablename__ = 'addresses'
    # some Columns are defined here
and
class Matricula(Base):
    __table_args__ = {'schema': 'students'}
    __tablename__ = 'enrollments'
    # some Columns are defined here
I can use the Base instance, as in Base.metadata.create_all(bind=engine), to recreate it in a test DB I have in Postgres. I can confirm this was done without problems by querying pg_namespace:
In [111]: engine.execute("SELECT * FROM pg_namespace").fetchall()
2017-12-13 18:04:01,006 INFO sqlalchemy.engine.base.Engine SELECT * FROM pg_namespace
2017-12-13 18:04:01,006 INFO sqlalchemy.engine.base.Engine {}
Out[111]:
[('pg_toast', 10, None),
('pg_temp_1', 10, None),
('pg_toast_temp_1', 10, None),
('pg_catalog', 10, '{postgres=UC/postgres,=U/postgres}'),
('public', 10, '{postgres=UC/postgres,=UC/postgres}'),
('information_schema', 10, '{postgres=UC/postgres,=U/postgres}'),
('schools', 16386, None),
('students', 16386, None)]
and from the psql CLI
user# select * from pg_tables;
schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers | rowsecurity
--------------------+------------------------------+------------+------------+------------+----------+-------------+-------------
schools | addresses | diego | | t | f | f | f
students | enrollments | diego | | t | f | f | f
pg_catalog | pg_statistic | postgres | | t | f | f | f
pg_catalog | pg_type | postgres | | t | f | f | f
pg_catalog | pg_authid | postgres | pg_global | t | f | f | f
pg_catalog | pg_user_mapping | postgres | | t | f | f | f
-- other tables were omitted
However, if I try to reflect that database into some other instance of declarative_base, nothing is reflected.
Something like:
In [87]: Base.metadata.tables.keys()
Out[87]: dict_keys(['schools.addresses', 'students.enrollments'])
In [88]: new_base = declarative_base()
In [89]: new_base.metadata.reflect(bind=engine)
In [90]: new_base.metadata.tables.keys()
Out[90]: dict_keys([])
I understand that reflect() accepts a schema as a parameter, but I would like to obtain all schemas at once during reflection. It seems I can only achieve this one schema at a time.
Is there a way to do this?

When you call metadata.reflect() it will only reflect the default schema (the first schema in your search_path for which you have permissions). So if your search_path is public,students,schools it will only reflect the tables in schema public. If you do not have permissions on schema public, it will be skipped and only students will be reflected.
The default schema is retrieved with SELECT current_schema();
In order to reflect other schemas you need to call metadata.reflect() for each schema:
metadata.reflect(schema='public') # will reflect even if you do not have permissions on the tables in schema `public`, as long as you have access to pg_* system tables
metadata.reflect(schema='students')
metadata.reflect(schema='schools')
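If you'd rather not list the schemas by hand, you can loop over whatever the database reports. A minimal sketch, assuming a PostgreSQL engine and SQLAlchemy's runtime inspection API (the connection URL is a placeholder):

from sqlalchemy import create_engine, MetaData, inspect

engine = create_engine('postgresql://user:password@localhost/testdb')  # placeholder URL
metadata = MetaData()
inspector = inspect(engine)

# Reflect every schema the database exposes, skipping PostgreSQL's internal ones.
for schema in inspector.get_schema_names():
    if schema.startswith('pg_') or schema == 'information_schema':
        continue
    metadata.reflect(bind=engine, schema=schema)

print(metadata.tables.keys())
# keys are schema-qualified, e.g. 'schools.addresses', 'students.enrollments'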
Note: when you reflect with an explicit schema:
Reflected tables in metadata.tables are keyed by their fully qualified name, as in schema1.mytable, schema2.mytable.
Any conflicting class names will be overwritten by the one reflected later. If you have tables with the same name in different schemas, you should implement the classname_for_table function to prefix the class names with the schema name.
An example of prefixing the class names with the schema name:
def classname_for_table(base, tablename, table):
    schema_name = table.schema
    fqname = '{}.{}'.format(schema_name, tablename)
    return fqname

Base.prepare(classname_for_table=classname_for_table)
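Putting this together with automap, a minimal sketch might look like the following (assuming SQLAlchemy 1.x automap and a placeholder connection URL; classname_for_table is the function defined above):

from sqlalchemy import create_engine, MetaData
from sqlalchemy.ext.automap import automap_base

engine = create_engine('postgresql://user:password@localhost/testdb')  # placeholder URL

# Reflect each schema into one MetaData, then build the automap base on top of it.
metadata = MetaData()
metadata.reflect(bind=engine, schema='schools')
metadata.reflect(bind=engine, schema='students')

Base = automap_base(metadata=metadata)
Base.prepare(classname_for_table=classname_for_table)

print(list(Base.classes.keys()))
# e.g. ['schools.addresses', 'students.enrollments']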
As a bonus, here is a small snippet which will expose all mapped classes within a dynamic submodule per schema, so you can access them conveniently.
Create a file, e.g. db.py, and place the following in it:
from types import ModuleType

def register_classes(base, module_dict):
    for name, table in base.classes.items():
        schema_name, table_name = name.split('.')
        class_name = table_name.title().replace('_', '')
        if schema_name not in module_dict:
            module = module_dict[schema_name] = ModuleType(schema_name)
        else:
            module = module_dict[schema_name]
        setattr(module, class_name, table)
Call this function with the automap base and the __dict__ of the module which you would like to register the schemas with.
register_classes(base, globals())
or
import db
db.register_classes(base, db.__dict__)
and then you can access the classes as:
import db
db.students.MyTable
db.schools.MyTable

Related

PostgreSQL JOIN on JSON Object column

I'm supposed to join 3 different tables on postgres:
lote_item (on which I have some books id's)
lote_item_log (on which I have a column "attributes", with a JSON object such as {"aluno_id": "2823", "aluno_email": "someemail#outlook.com", "aluno_unidade": 174, "livro_codigo": "XOZK-0NOYP0Z1EMJ"}) - note: some values of aluno_unidade are null
and finally
company (on which I have every school name for every aluno_unidade.
Ex: aluno_unidade = 174 ==> nome_fantasia = mySchoolName).
Joining the first two tables was easy, since lote_item_log has a foreign key which I could match like this:
SELECT * FROM lote_item JOIN lote_item_log ON lote_item.id = lote_item_log.lote_item_id
Now, I need to get the School Name, contained on table company, with the aluno_unidade ID from table lote_item_log.
My current query is:
SELECT
*
FROM
lote_item
JOIN
lote_item_log
ON
lote_item.id = lote_item_log.lote_item_id
JOIN
company
ON
(
SELECT
JSON_EXTRACT_PATH_TEXT(attributes, 'aluno_unidade')::int
FROM
lote_item_log
WHERE
operation_id = 6
) = company.senior_id
WHERE
item_id = {book_id};
operation_id determines which school is active.
ERROR I'M GETTING:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.CardinalityViolation) more than one row returned by a subquery used as an expression
I tried LIMIT 1, but then I got just an empty array.
What I need is:
lote_item.created_at | lote_item.updated_at | lote_item.item_id | uuid | aluno_email | c014_id | nome_fantasia | cnpj | is_franchise | is_active
somedate | somedate | some_item_id | XJW4 | someemail#a | some_id | SCHOOL NAME | cnpj | t | t
I got it.
Not sure it's the best way, but it worked:
SELECT
*
FROM
lote_item
JOIN
lote_item_log
ON
lote_item.id = lote_item_log.lote_item_id
JOIN
company
ON
JSON_EXTRACT_PATH_TEXT(attributes, 'aluno_unidade')::int = company.senior_id
WHERE
item_id = {book_id};
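As an aside, since the error comes from SQLAlchemy/psycopg2, the query was presumably run through SQLAlchemy; if so, a bound parameter is safer than formatting {book_id} into the string. A minimal sketch, assuming an existing engine and the table/column names above:

from sqlalchemy import create_engine, text

engine = create_engine('postgresql://user:password@localhost/mydb')  # placeholder URL

query = text("""
    SELECT *
    FROM lote_item
    JOIN lote_item_log ON lote_item.id = lote_item_log.lote_item_id
    JOIN company
      ON JSON_EXTRACT_PATH_TEXT(attributes, 'aluno_unidade')::int = company.senior_id
    WHERE item_id = :book_id
""")

with engine.connect() as conn:
    rows = conn.execute(query, {"book_id": 42}).fetchall()  # 42 is an example id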

Specify FLOAT column precision in Peewee with MariaDB/MySQL

I am trying to specify the float precision for a column definition in Peewee and cannot find how to do this in the official docs or in the github issues.
My example model is below:
import peewee

DB = peewee.MySQLDatabase(
    "example",
    host="localhost",
    port=3306,
    user="root",
    password="whatever"
)

class TestModel(peewee.Model):
    class Meta:
        database = DB

    value = peewee.FloatField()
The above creates the following table spec in the database:
SHOW COLUMNS FROM testmodel;
/*
+-------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+----------------+
| value | float | NO | | NULL | |
+-------+---------+------+-----+---------+----------------+
*/
What I would like is to specify the M and D parameters that the FLOAT field accepts so that the column is created with the precision parameters I need. I can accomplish this in SQL after the table is created using the below:
ALTER TABLE testmodel MODIFY COLUMN value FLOAT(20, 6); -- 20 and 6 are example parameters
Which gives this table spec:
SHOW COLUMNS FROM testmodel;
/*
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| value | float(20,6) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
*/
But I'd like it to be done at table creation time within the Peewee structure itself, rather than needing to run a separate ALTER TABLE query after peewee.Database.create_tables() is run. If there is no way to do this with peewee.FloatField itself, then I'd also accept any other solution, as long as it ensures the create_tables() call creates the columns with the specified precision.
As @booshong already mentions, the simplest solution is to subclass the default FloatField like this:
class CustomFloatField(peewee.FloatField):
    def __init__(self, *args, **kwargs):
        self.max_digits = kwargs.pop("max_digits", 7)
        self.decimal_places = kwargs.pop("decimal_places", 4)
        super().__init__(*args, **kwargs)

    def get_modifiers(self):
        return [self.max_digits, self.decimal_places]
and then use it like this:
my_float_field = CustomFloatField(max_digits=2, decimal_places=2)
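For completeness, a minimal sketch of how this might look applied to the model from the question (20 and 6 mirror the ALTER TABLE example above):

class TestModel(peewee.Model):
    class Meta:
        database = DB

    value = CustomFloatField(max_digits=20, decimal_places=6)

DB.create_tables([TestModel])
# SHOW COLUMNS FROM testmodel should now report the column type as float(20,6),
# because get_modifiers() feeds the (20, 6) pair into the generated DDL.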

How to create an index on an SQLAlchemy column_property?

Using SQLAlchemy with an SQLite engine, I've got a self-referential hierarchical table that describes a directory structure.
from sqlalchemy import Column, Integer, String, ForeignKey, Index, create_engine, select
from sqlalchemy.orm import column_property, aliased, join, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Dr(Base):
    __tablename__ = 'directories'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey('directories.id'))
Each Dr row only knows its own "name" and its "parent_id". I've added a recursive column_property called "path" that returns a string containing all of a Dr's ancestors from the root Dr.
root_anchor = (
    select([Dr.id, Dr.name, Dr.parent_id, Dr.name.label('path')])
    .where(Dr.parent_id == None).cte(recursive=True)
)

dir_alias = aliased(Dr)
cte_alias = aliased(root_anchor)

path_table = root_anchor.union_all(
    select([
        dir_alias.id, dir_alias.name,
        dir_alias.parent_id, cte_alias.c.path + "/" + dir_alias.name
    ]).select_from(join(
        dir_alias, cte_alias, onclause=cte_alias.c.id == dir_alias.parent_id
    ))
)

Dr.path = column_property(
    select([path_table.c.path]).where(path_table.c.id == Dr.id)
)
Here's an example of the output:
"""
-----------------------------
| id | name | parent_id |
-----------------------------
| 1 | root | NULL |
-----------------------------
| 2 | kid | 1 |
-----------------------------
| 3 | grandkid | 2 |
-----------------------------
"""
sqlite_engine = create_engine('sqlite:///:memory:')
Session = sessionmaker(bind=sqlite_engine)
session = Session()
instance = session.query(Dr).filter(Dr.name=='grandkid').one()
print(instance.path)
# Outputs: "root/kid/grandkid"
I'd like to be able to add an index, or at least a unique constraint, on the "path" property so that the same path cannot exist more than once in the table. I've tried:
Index('pathindex', Dr.path, unique=True)
...with no luck. No error is raised, but SQLAlchemy doesn't seem to register the index; it just silently ignores it. It still allows adding a duplicate path, e.g.:
session.add(Dr(name='grandkid', parent_id=2))
session.commit()
As further evidence that the Index() was ignored, inspecting the "indexes" property of the table results in an empty set:
print(Dr.__table__.indexes)
#Outputs: set([])
It's essential to me that duplicate paths cannot exist in the database. I'm not sure whether what I'm trying to do with column_property is possible in SQLAlchemy, and if not I'd love to hear some suggestions on how else I can go about this.
I think a unique constraint should suffice. In class Dr:
__table_args__ = (UniqueConstraint('parent_id', 'name'),)
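A minimal sketch of how that might look on the model from the question (UniqueConstraint is imported from sqlalchemy; since each row's path is determined by its name plus its parent's path, unique (parent_id, name) pairs imply unique paths):

from sqlalchemy import Column, Integer, String, ForeignKey, UniqueConstraint
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Dr(Base):
    __tablename__ = 'directories'
    __table_args__ = (UniqueConstraint('parent_id', 'name'),)

    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey('directories.id'))

# Adding Dr(name='grandkid', parent_id=2) a second time now raises IntegrityError
# on commit instead of silently inserting a duplicate path.

One caveat: most databases, SQLite included, treat NULLs as distinct in unique constraints, so two root rows (parent_id IS NULL) with the same name would still both be accepted.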

Inserting Unicode values on alembic migration

I'm working on a small pet project that involves some accounting in multiple currencies. During its development I decided to move from a straightforward DB setup to DB migrations using alembic. In some of those migrations I need to populate the DB with initial currencies, whose names are displayed in Ukrainian.
My problem is that the data populated from the alembic migration scripts is saved in some unknown encoding, so I cannot use it within the application (which expects it to be human-readable). My settings and the script are as follows:
alembic.ini
...
sqlalchemy.url = mysql+pymysql://defaultuser:defaultpwd@localhost/petdb
...
alembic/versions/f433ab2a814_adding_currency.py
# -*- coding: utf-8 -*-
"""Adding currency

Revision ID: f433ab2a814
Revises: 49538bba2220
Create Date: 2016-03-08 13:50:35.369021

"""
from alembic import op
from sqlalchemy import Column, Integer, String, Unicode

# revision identifiers, used by Alembic.
revision = 'f433ab2a814'
down_revision = '1c0b47263c82'
branch_labels = None
depends_on = None


def upgrade():
    op.create_table(
        'currency',
        Column('id', Integer, primary_key=True),
        Column('name', Unicode(120), nullable=False),
        Column('abbr', String(3), nullable=False)
    )
    op.execute(u'INSERT INTO currency SET name="{}", abbr="{}";'.format(u"Гривня", "UAH"))
Checking the currency table from the mysql client or MySQL Workbench, the data is displayed as:
mysql> SELECT * FROM currency;
+----+----------------------------+------+
| id | name | abbr |
+----+----------------------------+------+
| 1 | Ð“Ñ€Ð¸Ð²Ð½Ñ | UAH |
+----+----------------------------+------+
Expected result is:
mysql> SELECT * FROM currency;
+----+----------------------------+------+
| id | name | abbr |
+----+----------------------------+------+
| 1 | Гривня | UAH |
+----+----------------------------+------+
From my application I've been setting this value as follows:
from petproject import app
app.config.from_object(config.DevelopmentConfig)
engine = create_engine(app.config["DATABASE"]+"?charset=utf8",
convert_unicode=True, encoding="utf8", echo=False)
db_session = scoped_session(sessionmaker(autocommit=False,
autoflush=False,
bind=engine))
if len(db_session.query(Currency).all()) == 0:
default_currency = Currency()
default_currency.name = u"Гривня"
default_currency.abbr = u"UAH"
db_session.add(default_currency)
db_session.commit()
So I'm wondering how to insert initial Unicode values in a migration so that they are stored in the correct encoding. Did I miss anything?
After a more extended analysis, I discovered that MySQL was keeping the data in the 'windows-1252' encoding. The MySQL manual (section "West European Character Sets") states:
latin1 is the default character set. MySQL's latin1 is the same as the Windows cp1252 character set.
It looked like either MySQL ignored character_set_client, which I assumed to be 'utf-8', or SQLAlchemy/alembic didn't inform the server that the incoming data was UTF-8 encoded. Unfortunately, the recommended '?charset=utf8' option is not possible to set in alembic.ini.
In order to accept and save the data in the correct encoding, I set the character set manually by calling op.execute('SET NAMES utf8'). The complete code looks like:
def upgrade():
    op.create_table(
        'currency',
        Column('id', Integer, primary_key=True),
        Column('name', Unicode(120), nullable=False),
        Column('abbr', String(3), nullable=False)
    )
    op.execute('SET NAMES utf8')
    op.execute(u'INSERT INTO currency SET name="{}", abbr="{}";'.format(u"Гривня", "UAH"))
And result became as expected:
mysql> SELECT * FROM currency;
+----+----------------------------+------+
| id | name | abbr |
+----+----------------------------+------+
| 1 | Гривня | UAH |
+----+----------------------------+------+
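As an alternative to hand-building the INSERT string, Alembic's op.bulk_insert accepts the table returned by op.create_table together with plain Python dicts, leaving quoting to the driver. A sketch under the same assumptions (the SET NAMES call is still needed for the connection encoding):

from alembic import op
from sqlalchemy import Column, Integer, String, Unicode


def upgrade():
    currency = op.create_table(
        'currency',
        Column('id', Integer, primary_key=True),
        Column('name', Unicode(120), nullable=False),
        Column('abbr', String(3), nullable=False)
    )
    op.execute('SET NAMES utf8')
    op.bulk_insert(currency, [
        {'name': u'Гривня', 'abbr': 'UAH'},
    ])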

SQLAlchemy Column to Row Transformation and vice versa -- is it possible?

I'm looking for a SQLAlchemy-only solution for converting a dict received from a form submission into a series of rows in the database, one for each field submitted. This is to handle preferences and settings that vary widely across applications, and it's very likely applicable to creating pivot-table-like functionality. I've seen this type of thing in ETL tools, but I was looking for a way to do it directly in the ORM. I couldn't find any documentation on it, but maybe I missed something.
Example:
Submitted from form: {"UniqueId":1, "a":23, "b":"Hello", "c":"World"}
I would like it to be transformed (in the ORM) so that it is recorded in the database like this:
_______________________________________
|UniqueId| ItemName | ItemValue |
---------------------------------------
| 1 | a | 23 |
---------------------------------------
| 1 | b | Hello |
---------------------------------------
| 1 | c | World |
---------------------------------------
Upon a select the result would be transformed (in the ORM) back into a row of data from each of the individual values.
---------------------------------------------------
| UniqueId | a | b | c |
---------------------------------------------------
| 1 | 23 | Hello | World |
---------------------------------------------------
I would assume on an update that the best course of action would be to wrap a delete/create in a transaction so the current records would be removed and the new ones inserted.
The definitive list of ItemNames will be maintained in a separate table.
Totally open to more elegant solutions but would like to keep out of the database side if at all possible.
I'm using the declarative_base approach with SQLAlchemy.
Thanks in advance...
Cheers,
Paul
Here is a slightly modified example from the documentation, adapted to map such a table structure to a dictionary on the model:
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.collections import attribute_mapped_collection
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import relation, sessionmaker

metadata = MetaData()
Base = declarative_base(metadata=metadata, name='Base')


class Item(Base):
    __tablename__ = 'Item'
    UniqueId = Column(Integer, ForeignKey('ItemSet.UniqueId'),
                      primary_key=True)
    ItemSet = relation('ItemSet')
    ItemName = Column(String(10), primary_key=True)
    ItemValue = Column(Text)  # Use PickleType?


def _create_item(ItemName, ItemValue):
    return Item(ItemName=ItemName, ItemValue=ItemValue)


class ItemSet(Base):
    __tablename__ = 'ItemSet'
    UniqueId = Column(Integer, primary_key=True)

    _items = relation(Item,
                      collection_class=attribute_mapped_collection('ItemName'))
    items = association_proxy('_items', 'ItemValue', creator=_create_item)


engine = create_engine('sqlite://', echo=True)
metadata.create_all(engine)
session = sessionmaker(bind=engine)()

data = {"UniqueId": 1, "a": 23, "b": "Hello", "c": "World"}
s = ItemSet(UniqueId=data.pop("UniqueId"))
s.items = data
session.add(s)
session.commit()
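To get the row-per-field data back out as a single dict (the "vice versa" part of the question), the same association proxy can be read after a query. A minimal sketch continuing the example above:

# Fetch the ItemSet and read the proxied dictionary back out.
result = session.query(ItemSet).filter_by(UniqueId=1).one()
print(dict(result.items))
# e.g. {'a': '23', 'b': 'Hello', 'c': 'World'} -- the integer comes back as text
# because ItemValue is a Text column; PickleType would preserve the original types.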
