How to interrogate/alter a database sequence in Python?

I'm using Alembic and SQLAlchemy to work with different database types. I use my own create_sequence method together with the drop_sequence method of Alembic's Operations.
Now I'm writing unit tests for this functionality. I want to alter/interrogate the sequence that I created before. But how?
def create_sequence(self, sequence):
    kwargs_list = sequence
    self.oprt.execute(CreateSequence(sequence))

def drop_sequence(self, sequence_name):
    self.oprt.drop_sequence(sequence_name)
self.oprt is initialized like this:
engine = create_engine(connection_string, echo=True) # echo=True to get executed sql
conn = engine.connect()
ctx = MigrationContext.configure(connection=conn)
self.oprt = Operations(ctx)
I have already tried to get a Sequence object via the engine object or a MetaData object, but it doesn't work yet.

Here are some ideas I tested with PostgreSQL, so I'm not sure how many other databases support them.
Get next_value() of sequence.
from sqlalchemy import Sequence, create_engine, select
from sqlalchemy.orm import Session, declarative_base

engine = create_engine(f"postgresql+psycopg2://{username}:{password}@/{db}", echo=True)
Base = declarative_base()
metadata = Base.metadata
seq_name = 'counter'
counter_seq = Sequence(seq_name, metadata=metadata)
metadata.create_all(engine)

with Session(engine) as session, session.begin():
    res = session.execute(select(counter_seq.next_value())).scalar()
    assert res > 0
Use inspect and check if sequence name is listed
from sqlalchemy import inspect
ins = inspect(engine)
assert seq_name in ins.get_sequence_names()
PostgreSQL only -- check currval manually
I know there is a way to check the current sequence value in PostgreSQL, but SQLAlchemy doesn't seem to support it directly. You could do it manually like this (note that currval raises an error unless nextval has already been called on that sequence in the current session):
from sqlalchemy.sql import func, select

with Session(engine) as session:
    res = session.execute(select(func.currval(seq_name))).scalar()
    assert res > 0
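As for altering a sequence, I didn't find a portable SQLAlchemy construct for it, so falling back to raw DDL seems simplest; a minimal sketch, assuming PostgreSQL syntax (the restart value 100 is just an example):
from sqlalchemy import text

with Session(engine) as session, session.begin():
    # PostgreSQL-specific DDL; other databases use different ALTER syntax.
    session.execute(text(f"ALTER SEQUENCE {seq_name} RESTART WITH 100"))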

Related

"Maximum number of parameters" error with filter .in_(list) using pyodbc

One of our queries that was working in Python 2 + mxODBC is not working in Python 3 + pyodbc; while connecting to SQL Server it raises an error like this: "Maximum number of parameters in the sql query is 2100." Since both of the printed queries have 3000 params, I thought it should fail in both environments, but clearly that doesn't seem to be the case here. In the Python 2 environment, both MSODBC 11 and MSODBC 17 work, so I immediately ruled out a driver-related issue.
So my questions are:
1. Is it correct for SQLAlchemy to send a list as multiple parameters, so that the parameter count is proportional to the length of the list? It looks a bit strange to me; I would have preferred concatenating the list into a single string, because the DB doesn't understand the list datatype.
2. Are there any hints on why it would work in mxODBC but not pyodbc? Does mxODBC optimize something that pyodbc does not? Please let me know if there are any pointers - I can try and paste more info here. (I am still new to debugging SQLAlchemy.)
Footnote: I have seen a lot of answers that suggest chunking the data, but because of 1 and 2, I wonder if I am doing the correct thing in the first place.
(Since it seems to be related to pyodbc, I have raised an internal issue in the official repository.)
import sqlalchemy
import sqlalchemy.orm
from sqlalchemy import MetaData, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.session import Session

Base = declarative_base()

create_tables = """
CREATE TABLE products(
    idn NUMERIC(8) PRIMARY KEY
);
"""

check_tables = """
SELECT * FROM products;
"""

insert_values = """
INSERT INTO products
(idn)
VALUES
(1),
(2);
"""

delete_tables = """
DROP TABLE products;
"""

engine = sqlalchemy.create_engine('mssql+pyodbc://user:password@dsn')
connection = engine.connect()
cursor = engine.raw_connection().cursor()
Session = sqlalchemy.orm.sessionmaker(bind=connection)
session = Session()
session.execute(create_tables)
metadata = MetaData(connection)

class Products(Base):
    __table__ = Table('products', metadata, autoload=True)

try:
    session.execute(check_tables)
    session.execute(insert_values)
    session.commit()
    query = session.query(Products).filter(
        Products.idn.in_(list(range(0, 3000)))
    )
    query.all()
    with open("query.sql", "w") as f:
        f.write(str(query))
finally:
    session.execute(delete_tables)
    session.commit()
When you do a straightforward .in_(list_of_values) SQLAlchemy renders the following SQL ...
SELECT team.prov AS team_prov, team.city AS team_city
FROM team
WHERE team.prov IN (?, ?)
... where each value in the IN clause is specified as a separate parameter value. pyodbc sends this to SQL Server as ...
exec sp_prepexec @p1 output,N'@P1 nvarchar(4),@P2 nvarchar(4)',N'SELECT team.prov AS team_prov, team.city AS team_city, team.team_name AS team_team_name
FROM team
WHERE team.prov IN (@P1, @P2)',N'AB',N'ON'
... so you hit the limit of 2100 parameters if your list is very long. Presumably, mxODBC inserted the parameter values inline before sending it to SQL Server, e.g.,
SELECT team.prov AS team_prov, team.city AS team_city
FROM team
WHERE team.prov IN ('AB', 'ON')
You can get SQLAlchemy to do that for you with
import sqlalchemy as sa

provinces = ["AB", "ON"]
stmt = (
    session.query(Team)
    .filter(
        Team.prov.in_(sa.bindparam("p1", expanding=True, literal_execute=True))
    )
    .statement
)
result = list(session.query(Team).params(p1=provinces).from_statement(stmt))
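If you would rather keep ordinary bound parameters, the chunking approach mentioned in the footnote also works; a minimal sketch, assuming the session and Products model from the question (query_in_chunks is a hypothetical helper, and 2000 is just a value safely under SQL Server's 2100-parameter cap):

def query_in_chunks(session, values, chunk_size=2000):
    # Issue one query per chunk so no single statement exceeds
    # SQL Server's 2100-parameter limit, then combine the results.
    results = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        results.extend(
            session.query(Products).filter(Products.idn.in_(chunk)).all()
        )
    return results

rows = query_in_chunks(session, list(range(0, 3000)))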

create objects on MSSQL and Oracle from SQLAlchemy Metadata with different casing for names

I need to take a SQLAlchemy metadata object and create its objects on both MSSQL and Oracle. This is easy enough in SQLAlchemy Core; see the example below. But the object names should be mixed-case on MSSQL and default case on Oracle (all-caps). They end up mixed-case on Oracle as well, which is expected when the names in the metadata are mixed. Doing it manually is trivial (just have separate metadata definitions), but that defeats the purpose of having one set of metadata. SQLAlchemy is 1.3.8, Python is 3.7.4.
"""
Create from metadata on MSSQL and Oracle
"""
import urllib
from sqlalchemy import * # pylint: disable=wildcard-import, unused-wildcard-import
params = urllib.parse.quote_plus(
"Driver={ODBC Driver 17 for SQL Server};Server=xxx\\xxx;Database=xxx;Trusted_Connection=yes"
)
print("mssql+pyodbc:///?odbc_connect=%s" % params)
mssqle = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params, echo=True)
oraclee = create_engine("oracle+cx_oracle://#xxx", echo=True)
metadata = MetaData()
t1 = Table(
"TableNumberOne",
metadata,
Column("id", Integer(), primary_key=True),
Column("ColumnNumberOne", String(50), index=True)
)
metadata.create_all(mssqle)
metadata.create_all(oraclee)
I can get away with the following, though I am not sure it covers every possible situation.
for _tab in metadata.sorted_tables:
    _tab.name = _tab.name.lower()
    for _col in _tab.columns:
        _col.name = _col.name.lower()
    for _con in _tab.constraints:
        if _con.name is not None:
            _con.name = _con.name.lower()
    for _idx in _tab.indexes:
        if _idx.name is not None:
            _idx.name = _idx.name.lower()
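To keep one set of metadata and still get mixed-case names on MSSQL, another option is to lowercase the names only around the Oracle create_all() and restore the originals afterwards; a minimal sketch under that assumption (create_all_lowercased is a hypothetical helper; constraints and indexes are omitted for brevity but can be handled exactly like the loop above):

def create_all_lowercased(metadata, engine):
    # Remember the original mixed-case names, lowercase them for this
    # one create_all() call, and restore them afterwards.
    saved_tables = {t: t.name for t in metadata.sorted_tables}
    saved_columns = {c: c.name for t in metadata.sorted_tables for c in t.columns}
    for t in metadata.sorted_tables:
        t.name = t.name.lower()
        for c in t.columns:
            c.name = c.name.lower()
    try:
        metadata.create_all(engine)
    finally:
        for t, name in saved_tables.items():
            t.name = name
        for c, name in saved_columns.items():
            c.name = name

create_all_lowercased(metadata, oraclee)  # lowercase names, which Oracle folds to upper-case
metadata.create_all(mssqle)               # original mixed-case names on MSSQL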

Using session.query to read uncommitted data in SQLAlchemy

Summary
I'm trying to write integration tests against a series of database operations, and I want to be able to use a SQLAlchemy session as a staging environment in which to validate and roll back a transaction.
Is it possible to retrieve uncommitted data using session.query(Foo) instead of session.execute(text('select * from foo'))?
Background and Research
These results were observed using SQLAlchemy 1.2.10, Python 2.7.13, and Postgres 9.6.11.
I've looked at related StackOverflow posts but haven't found an explanation as to why the two operations below should behave differently.
SQLalchemy: changes not committing to db
Tried with and without session.flush() before every session.query. No success.
sqlalchemy update not commiting changes to database. Using single connection in an app
Checked to make sure I am using the same session object throughout
Sqlalchemy returns different results of the SELECT command (query.all)
N/A: My target workflow is to assess a series of CRUD operations within the staging tables of a single session.
Querying objects added to a non committed session in SQLAlchemy
Seems to be the most related issue, but my motivation for avoiding session.commit() is different, and I didn't quite find the explanation I'm looking for.
Reproducible Example
1) I establish a connection to the database and define a model object; no issues so far:
from sqlalchemy import text
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, ForeignKey

#####
# Prior DB setup:
# CREATE TABLE foo (id int PRIMARY KEY, label text);
#####

# from https://docs.sqlalchemy.org/en/13/orm/mapping_styles.html#declarative-mapping
Base = declarative_base()

class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    label = Column(String)

# from https://docs.sqlalchemy.org/en/13/orm/session_basics.html#getting-a-session
some_engine = create_engine('postgresql://username:password@endpoint/database')
Session = sessionmaker(bind=some_engine)
2) I perform some updates without committing the result, and I can see the staged data by executing a select statement within the session:
session = Session()

sql_insert = text("INSERT INTO foo (id, label) VALUES (1, 'original')")
session.execute(sql_insert)

sql_read = text("SELECT * FROM foo WHERE id = 1")
res = session.execute(sql_read).first()
print res.label

sql_update = text("UPDATE foo SET label = 'updated' WHERE id = 1")
session.execute(sql_update)
res2 = session.execute(sql_read).first()
print res2.label

sql_update2 = text("""
    INSERT INTO foo (id, label) VALUES (1, 'second_update')
    ON CONFLICT (id) DO UPDATE
    SET (label) = (EXCLUDED.label)
""")
session.execute(sql_update2)
res3 = session.execute(sql_read).first()
print res3.label

session.rollback()
# prints expected values: 'original', 'updated', 'second_update'
3) I attempt to replace select statements with session.query, but I can't see the new data:
session = Session()

sql_insert = text("INSERT INTO foo (id, label) VALUES (1, 'original')")
session.execute(sql_insert)

res = session.query(Foo).filter_by(id=1).first()
print res.label

sql_update = text("UPDATE foo SET label = 'updated' WHERE id = 1")
session.execute(sql_update)
res2 = session.query(Foo).filter_by(id=1).first()
print res2.label

sql_update2 = text("""
    INSERT INTO foo (id, label) VALUES (1, 'second_update')
    ON CONFLICT (id) DO UPDATE
    SET (label) = (EXCLUDED.label)
""")
session.execute(sql_update2)
res3 = session.query(Foo).filter_by(id=1).first()
print res3.label

session.rollback()
# prints: 'original', 'original', 'original'
I expect the printed output of Step 3 to be 'original', 'updated', 'second_update'.
The root cause is that raw SQL queries and the ORM do not mix automatically in this case. While the Session is not a cache, meaning it does not cache query results, it does store objects based on their primary key in the identity map. When a Query returns a row for an already-loaded mapped object, the existing object is returned, which is why you do not observe the changes you made in the 3rd step. This might seem like a rather poor way to handle the situation, but SQLAlchemy operates on some assumptions about transaction isolation, as described in "When to Expire or Refresh":
Transaction Isolation
...[So] as a best guess, it assumes that within the scope of a transaction, unless it is known that a SQL expression has been emitted to modify a particular row, there’s no need to refresh a row unless explicitly told to do so.
The whole note about transaction isolation is a worthwhile read. The way to make such changes known to SQLAlchemy is to perform updates using the Query API, if possible, and to manually expire changed objects, if all else fails. With this in mind, your 3rd step could look like:
session = Session()

sql_insert = text("INSERT INTO foo (id, label) VALUES (1, 'original')")
session.execute(sql_insert)

res = session.query(Foo).filter_by(id=1).first()
print(res.label)

session.query(Foo).filter_by(id=1).update({Foo.label: 'updated'},
                                          synchronize_session='fetch')

# This query is actually redundant, `res` and `res2` are the same object
res2 = session.query(Foo).filter_by(id=1).first()
print(res2.label)

sql_update2 = text("""
    INSERT INTO foo (id, label) VALUES (1, 'second_update')
    ON CONFLICT (id) DO UPDATE
    SET label = EXCLUDED.label
""")
session.execute(sql_update2)
session.expire(res)

# Again, this query is redundant and fetches the same object that needs
# refreshing anyway
res3 = session.query(Foo).filter_by(id=1).first()
print(res3.label)

session.rollback()
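If tracking exactly which objects a raw statement touched is impractical in your tests, a blunter alternative is to expire everything in the session, so the next attribute access or query reloads current row state from the (still uncommitted) transaction; a short sketch reusing the step 3 flow:

session.execute(sql_update2)
# Expire every object in the identity map; `res`, `res2`, etc. will
# re-fetch their row state on next access, within this transaction.
session.expire_all()
res3 = session.query(Foo).filter_by(id=1).first()
print(res3.label)  # 'second_update'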

How to get Explain Plan in Oracle using SQLAlchemy ORM

Using SQLAlchemy ORM, I would like to get the explain plan for an Oracle query.
This would be done in sqlplus using 2 steps that look something like this:
-- step 1
explain plan for select * from table_xyz;

-- step 2
select * from table(dbms_xplan.display(null, null, 'SERIAL'));
I have tried the following so far with SQLAlchemy ORM without luck:
import os

from sqlalchemy import MetaData, Table, create_engine
from sqlalchemy.orm import sessionmaker

# create engine
user, pswd = (os.environ['USER'], os.environ['PW'])
conn_str = f'oracle://{user}:{pswd}@myDbService'
engine = create_engine(conn_str, echo=True)

# create session
Session = sessionmaker(bind=engine)

# reflect existing table
meta = MetaData()
MyTable = Table('my_table', meta, autoload=True, autoload_with=engine, schema='myschema')

# generate query
bind_param = dict(state_cd=['CA', 'WA'])
query = str(Session().query(MyTable).filter(MyTable.c.my_state_cd.in_(bind_param['state_cd'])))
# print(query)  # <-- this returns a properly formulated select query with bound parameters
result = Session().execute('EXPLAIN PLAN FOR ' + query, bind_param)
Executing the last line above keeps failing with the following error and I'm not sure what I'm doing wrong:
StatementError: (sqlalchemy.exc.InvalidRequestError) A value is required for bind parameter 'my_state_cd_2'
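One way around the missing bind values is to render the query with its literals inlined before prefixing it with EXPLAIN PLAN FOR; a minimal sketch, assuming the engine and MyTable from above (literal_binds rendering only works for simple literal types):

from sqlalchemy import text

session = Session()
orm_query = session.query(MyTable).filter(MyTable.c.my_state_cd.in_(['CA', 'WA']))

# Compile with the values inlined so the EXPLAIN PLAN statement
# needs no bind parameters.
compiled = orm_query.statement.compile(
    engine, compile_kwargs={"literal_binds": True}
)

# step 1: populate the plan table; step 2: read it back -- both on
# the same session/connection.
session.execute(text('EXPLAIN PLAN FOR ' + str(compiled)))
plan = session.execute(
    text("SELECT * FROM table(dbms_xplan.display(null, null, 'SERIAL'))")
)
for row in plan:
    print(row[0])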

Close SQLAlchemy connection

I have the following function in python:
def add_odm_object(obj, table_name, primary_key, unique_column):
    db = create_engine('mysql+pymysql://root:@127.0.0.1/mydb')
    metadata = MetaData(db)
    t = Table(table_name, metadata, autoload=True)
    s = t.select(t.c[unique_column] == obj[unique_column])
    rs = s.execute()
    r = rs.fetchone()
    if not r:
        i = t.insert()
        i_res = i.execute(obj)
        v_id = i_res.inserted_primary_key[0]
        return v_id
    else:
        return r[primary_key]
This function checks whether the object obj is in the database and, if it is not found, saves it to the DB. Now, I have a problem: I call the above function in a loop many times, and after a few hundred iterations I get an error: user root has exceeded the max_user_connections resource (current value: 30). I tried to search for answers; for example, the question How to close sqlalchemy connection in MySQL recommends creating a conn = db.connect() object, where db is the engine, and calling conn.close() after my query is completed.
But where should I open and close the connection in my code? I am not working with the connection directly; I'm using the Table() and MetaData functions.
The engine is an expensive-to-create factory for database connections. Your application should call create_engine() exactly once per database server.
Similarly, the MetaData and Table objects describe a fixed schema object within a known database. These are also configurational constructs that in most cases are created once, just like classes, in a module.
In this case, your function seems to want to load up tables dynamically, which is fine; the MetaData object acts as a registry, which has the convenience feature that it will give you back an existing table if it already exists.
Within a Python function and especially within a loop, for best performance you typically want to refer to a single database connection only.
Taking these things into account, your module might look like:
# module level variable. can be initialized later,
# but generally you just want to create this once.
db = create_engine('mysql+pymysql://root:@127.0.0.1/mydb')

# module level MetaData collection.
metadata = MetaData()

def add_odm_object(obj, table_name, primary_key, unique_column):
    with db.begin() as connection:
        # will load table_name exactly once, then store it persistently
        # within the above MetaData
        t = Table(table_name, metadata, autoload=True, autoload_with=connection)
        s = t.select(t.c[unique_column] == obj[unique_column])
        rs = connection.execute(s)
        r = rs.fetchone()
        if not r:
            # pass the row values as a dict, as in the original code
            i_res = connection.execute(t.insert(), obj)
            v_id = i_res.inserted_primary_key[0]
            return v_id
        else:
            return r[primary_key]
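For example, assuming a products table with an auto-increment id primary key and a unique name column (hypothetical names, just to show the call shape):

new_id = add_odm_object(
    {'name': 'widget'},  # row values keyed by column name
    'products',          # table to look in / insert into
    'id',                # primary key column
    'name',              # unique column used for the existence check
)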
