Using SQLAlchemy ORM, I would like to get the explain plan for an Oracle query.
In SQL*Plus this would be done in two steps that look something like this:
-- step 1
explain plan for select * from table_xyz;
-- step 2
select * from table(dbms_xplan.display(null, null, 'SERIAL'));
I have tried the following so far with SQLAlchemy ORM without luck:
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import sessionmaker
import os
# create engine
user, pswd = (os.environ['USER'], os.environ['PW'])
conn_str = f'oracle://{user}:{pswd}@myDbService'
engine = create_engine(conn_str, echo=True)
# create session
Session = sessionmaker(bind=engine)
# reflect existing table
meta = MetaData()
MyTable = Table('my_table', meta, autoload=True, autoload_with=engine, schema='myschema')
# generate query
bind_param = {'state_cd': ['CA', 'WA']}
query = str(Session().query(MyTable).filter(MyTable.c.my_state_cd.in_(bind_param['state_cd'])))
# print(query) # <-- this returns a properly formulated select query with bound parameters
result = Session().execute('EXPLAIN PLAN FOR ' + query, bind_param)
Executing the last line above keeps failing with the following error and I'm not sure what I'm doing wrong:
StatementError: (sqlalchemy.exc.InvalidRequestError) A value is required for bind parameter 'my_state_cd_2'
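A minimal sketch of one possible workaround, assuming the objects defined above: render the ORM query with its bound values inlined (the literal_binds recipe from the SQLAlchemy FAQ), so the EXPLAIN PLAN FOR text carries no unbound parameters.
# Sketch: inline the bound values so the raw EXPLAIN PLAN text is self-contained.
from sqlalchemy import text
session = Session()
orm_query = session.query(MyTable).filter(MyTable.c.my_state_cd.in_(['CA', 'WA']))
compiled = orm_query.statement.compile(engine, compile_kwargs={"literal_binds": True})
session.execute(text('EXPLAIN PLAN FOR ' + str(compiled)))
plan = session.execute(
    text("select * from table(dbms_xplan.display(null, null, 'SERIAL'))")
).fetchall()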
I'm using alembic and SQLAlchemy to work with different database types. I use my own create_sequence method and alembic's Operations.drop_sequence method.
Now I'm writing unit tests for this functionality. I want to alter/interrogate the sequence that I created earlier. But how?
def create_sequence(self, sequence):
    kwargs_list = sequence
    self.oprt.execute(CreateSequence(sequence))

def drop_sequence(self, sequence_name):
    self.oprt.drop_sequence(sequence_name)
self.oprt is initialized like this:
engine = create_engine(connection_string, echo=True) # echo=True to get executed sql
conn = engine.connect()
ctx = MigrationContext.configure(connection=conn)
self.oprt = Operations(ctx)
I have already tried to get a Sequence object with the help of the engine object or a MetaData object, but it doesn't work yet.
Here are some ideas I tested with PostgreSQL, so I'm not sure how many other databases support them.
Get next_value() of sequence.
from sqlalchemy import Sequence, create_engine, select
from sqlalchemy.orm import Session, declarative_base

engine = create_engine(f"postgresql+psycopg2://{username}:{password}@/{db}", echo=True)
Base = declarative_base()
metadata = Base.metadata
seq_name = 'counter'
counter_seq = Sequence(seq_name, metadata=metadata)
metadata.create_all(engine)

with Session(engine) as session, session.begin():
    res = session.execute(select(counter_seq.next_value())).scalar()
    assert res > 0
Use inspect and check if sequence name is listed
from sqlalchemy import inspect
ins = inspect(engine)
assert seq_name in ins.get_sequence_names()
PostgreSQL only -- check currval manually
I know there is a way to check the current sequence value in PostgreSQL, but SQLAlchemy doesn't seem to support it directly. You could do it manually like this (note that currval is only defined after nextval has been called in the current session):
from sqlalchemy.sql import func, select

with Session(engine) as session:
    res = session.execute(select(func.currval(seq_name))).scalar()
    assert res > 0
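The question also asks about altering a sequence. SQLAlchemy ships CreateSequence/DropSequence DDL constructs but no ALTER construct, so that part has to go through raw DDL. A minimal sketch for PostgreSQL, reusing seq_name, counter_seq, and engine from above:
from sqlalchemy import text

with Session(engine) as session, session.begin():
    # PostgreSQL DDL: restart the sequence at a chosen value.
    session.execute(text(f"ALTER SEQUENCE {seq_name} RESTART WITH 100"))
    res = session.execute(select(counter_seq.next_value())).scalar()
    assert res == 100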
Here's the goal raw SQL:
INSERT INTO "db2"."schema2"."table" ("col1", "col2")
SELECT "db1"."schema1"."tablea"."column1",
"db1"."schema1"."tableb"."column1"
FROM "db1"."schema1"."tablea" JOIN "db1"."schema1"."tableb"
ON
"db1"."schema1"."tablea"."join_col" = "db1"."schema1"."tableb"."join_col";
I'm using SQLAlchemy Core to reflect the tables. This is simple enough but the issue is I'm unable to reflect them in a fully qualified way.
from sqlalchemy import create_engine, MetaData, Table, text
from snowflake.sqlalchemy import URL
from sqlalchemy.sql import select, insert
engine = create_engine(URL(
    account="my_account",
    user="my_username",
    password="my_password",
    warehouse="my_warehouse",
    role="my_role",
))
metadata = MetaData()
with engine.begin() as cnxn:
    cnxn.execute(text("USE DATABASE db1;"))
    tablea = Table("tablea", metadata, autoload_with=engine, schema="schema1")
    tableb = Table("tableb", metadata, autoload_with=engine, schema="schema1")

with engine.begin() as cnxn:
    cnxn.execute(text("USE DATABASE db2;"))
    table = Table("table", metadata, autoload_with=engine, schema="schema2")
sql = select(tablea.c.column1, tableb.c.column1)
sql = sql.select_from(tablea.join(tableb, tablea.c.join_col == tableb.c.join_col))
insert_stmnt = insert(table).from_select([c.name for c in table.c], sql)
with engine.begin() as cnxn:
    cnxn.execute(insert_stmnt)
The problem is that the insert statement fails because I have three tables across two different databases. I think the easy answer is that the tables need to be fully qualified when they are reflected. Is that possible? If so, how? I've tried all sorts of combinations of table_name and schema in the reflection, trying to include the database, but have completely struck out.
The only thing I've been able to do is print the insert statement, manually add the databases to each table in the resulting string, and then run that string. That completely defeats the purpose of SQLAlchemy, IMO.
python_full_version = "3.9.6"
sqlalchemy = "1.4.45"
snowflake-sqlalchemy = "1.4.4"
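One avenue worth trying, sketched here as an untested assumption rather than confirmed snowflake-sqlalchemy behavior: some dialects accept a dotted, database-qualified schema string on Table, which would make every reflected object render fully qualified.
# Untested sketch: pass "database.schema" so the compiled SQL is fully qualified.
tablea = Table("tablea", metadata, autoload_with=engine, schema="db1.schema1")
tableb = Table("tableb", metadata, autoload_with=engine, schema="db1.schema1")
table = Table("table", metadata, autoload_with=engine, schema="db2.schema2")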
I have a SQLAlchemy engine where I try to insert parameters via sqlalchemy.sql.text to protect against SQL injection.
The following code works, where I pass variables for the condition and the condition's value.
from sqlalchemy import create_engine
from sqlalchemy.sql import text
db_engine = create_engine(...)
db_engine.execute(
    text('SELECT * FROM table_name WHERE :condition_1 = :condition_1_value'),
    condition_1="name",
    condition_1_value="John",
).fetchall()
However, when I try to pass the table name as a variable, it returns an error.
from sqlalchemy import create_engine
from sqlalchemy.sql import text
db_engine = create_engine(...)
db_engine.execute(
    text('SELECT * FROM :table_name WHERE :condition_1 = :condition_1_value'),
    table_name="table_1",
    condition_1="name",
    condition_1_value="John",
).fetchall()
Any ideas why this does not work?
EDIT:
I know it has something to do with the table_name not being treated as a plain string, but I am not sure how else to do it.
Any ideas why this does not work?
Query parameters are used to supply the values of things (usually column values), not the names of things (tables, columns, etc.). Every database I've seen works that way.
So, despite the ubiquitous advice that dynamic SQL is a "Bad Thing", there are certain cases where it is simply necessary. This is one of them.
table_name = "table_1" # NOTE: Do not use untrusted input here!
stmt = text(f'SELECT * FROM "{table_name}" …')
Also, check the results you get from trying to parameterize a column name. You may not be getting what you expect.
stmt = text("SELECT * FROM table_name WHERE :condition_1 = :condition_1_value")
db_engine.execute(stmt, dict(condition_1="name", condition_1_value="John"))
will not produce the equivalent of
SELECT * FROM table_name WHERE name = 'John'
It will render the equivalent of
SELECT * FROM table_name WHERE 'name' = 'John'
and will not throw an error, but it will also return no rows because 'name' = 'John' will never be true.
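If the table name ultimately comes from user input, one common mitigation, shown here as a sketch rather than part of the original answer, is to validate it against the table names the database itself reports before interpolating:
from sqlalchemy import inspect

allowed_tables = set(inspect(db_engine).get_table_names())
table_name = "table_1"
if table_name not in allowed_tables:
    raise ValueError(f"unknown table: {table_name}")
# Only the value stays a bind parameter; the validated name is interpolated.
stmt = text(f'SELECT * FROM "{table_name}" WHERE name = :value')
rows = db_engine.execute(stmt, dict(value="John")).fetchall()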
One of our queries that was working in Python 2 + mxODBC is not working in Python 3 + pyodbc; while connecting to SQL Server it raises an error like this: "Maximum number of parameters in the sql query is 2100." Since both printed queries have 3000 parameters, I thought it should fail in both environments, but that clearly isn't the case here. In the Python 2 environment both MSODBC 11 and MSODBC 17 work, so I immediately ruled out a driver-related issue.
So my question is:
1. Is it correct for SQLAlchemy to send a list as multiple parameters, so that the parameter count grows with the length of the list? It looks a bit strange to me; I would have preferred concatenating the list into a single string, since the database doesn't understand the list datatype.
2. Are there any hints on why it works in mxODBC but not in pyodbc? Does mxODBC optimize something that pyodbc does not? Please let me know if there are any pointers; I can paste more info here. (I am still new to debugging SQLAlchemy.)
Footnote: I have seen a lot of answers that suggest chunking the data, but because of questions 1 and 2 above, I wonder if I am doing the right thing in the first place.
(Since it seems to be related to pyodbc, I have raised an internal issue in the official repository.)
import sqlalchemy
import sqlalchemy.orm
from sqlalchemy import MetaData, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.session import Session
Base = declarative_base()
create_tables = """
CREATE TABLE products (
    idn NUMERIC(8) PRIMARY KEY
);
"""

check_tables = """
SELECT * FROM products;
"""

insert_values = """
INSERT INTO products (idn)
VALUES
    (1),
    (2);
"""

delete_tables = """
DROP TABLE products;
"""
engine = sqlalchemy.create_engine('mssql+pyodbc://user:password@dsn')
connection = engine.connect()
cursor = engine.raw_connection().cursor()
Session = sqlalchemy.orm.sessionmaker(bind=connection)
session = Session()
session.execute(create_tables)
metadata = MetaData(connection)
class Products(Base):
    __table__ = Table('products', metadata, autoload=True)

try:
    session.execute(check_tables)
    session.execute(insert_values)
    session.commit()
    query = session.query(Products).filter(
        Products.idn.in_(list(range(0, 3000)))
    )
    query.all()
    with open("query.sql", "w") as f:
        f.write(str(query))
finally:
    session.execute(delete_tables)
    session.commit()
When you do a straightforward .in_(list_of_values) SQLAlchemy renders the following SQL ...
SELECT team.prov AS team_prov, team.city AS team_city
FROM team
WHERE team.prov IN (?, ?)
... where each value in the IN clause is specified as a separate parameter value. pyodbc sends this to SQL Server as ...
exec sp_prepexec @p1 output,N'@P1 nvarchar(4),@P2 nvarchar(4)',N'SELECT team.prov AS team_prov, team.city AS team_city, team.team_name AS team_team_name
FROM team
WHERE team.prov IN (@P1, @P2)',N'AB',N'ON'
... so you hit the limit of 2100 parameters if your list is very long. Presumably, mxODBC inserted the parameter values inline before sending it to SQL Server, e.g.,
SELECT team.prov AS team_prov, team.city AS team_city
FROM team
WHERE team.prov IN ('AB', 'ON')
You can get SQLAlchemy to do that for you with
import sqlalchemy as sa

provinces = ["AB", "ON"]
stmt = (
    session.query(Team)
    .filter(
        Team.prov.in_(sa.bindparam("p1", expanding=True, literal_execute=True))
    )
    .statement
)
result = list(session.query(Team).params(p1=provinces).from_statement(stmt))
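Alternatively, if you would rather keep true server-side parameters (per the chunking advice the question mentions), splitting the list keeps each statement under the 2100-parameter limit. A minimal sketch, reusing Team and session from above:
def chunked(values, size=2000):
    # Yield slices small enough to stay under SQL Server's 2100-parameter cap.
    for i in range(0, len(values), size):
        yield values[i:i + size]

results = []
for chunk in chunked(provinces):
    results.extend(session.query(Team).filter(Team.prov.in_(chunk)).all())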
I want to copy a table from an Oracle database to a PostgreSQL database using SQLAlchemy.
After setting up connections and engines for Oracle and PostgreSQL and reflecting the tables into the sourceMeta metadata, I try to create them on destEngine, but it gives me an error saying it can't render an element of type...
for t in sourceMeta.sorted_tables:
    newtable = Table(t.name, sourceMeta, autoload=True)
    newtable.metadata.create_all(destEngine)
It seems what you are looking for is SQLAlchemy's @compiles decorator.
Here is an example of how it worked for me when trying to copy tables from a MS SQL Server db to a PostgreSQL db.
from sqlalchemy import create_engine, Table, MetaData
from sqlalchemy.schema import CreateTable
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.dialects.mssql import TINYINT, DATETIME, VARCHAR
@compiles(TINYINT, 'postgresql')
def compile_TINYINT_mssql_int(element, compiler, **kw):
    """ Handles mssql TINYINT datatype as INT in postgresql """
    return 'INTEGER'
# add a function for each datatype that causes an error
table_name = '<table_name>'
# create engine, reflect existing columns, and create table object for oldTable
srcEngine = create_engine('mssql+pymssql://<user>:<password>@<host>/<db>')
srcEngine._metadata = MetaData(bind=srcEngine)
srcEngine._metadata.reflect(srcEngine) # get columns from existing table
srcTable = Table(table_name, srcEngine._metadata)
# create engine and table object for newTable
destEngine = create_engine('postgresql+psycopg2://<user>:<password>@<host>/<db>')
destEngine._metadata = MetaData(bind=destEngine)
destTable = Table(table_name.lower(), destEngine._metadata)
# copy schema and create newTable from oldTable
for column in srcTable.columns:
    dstCol = column.copy()
    destTable.append_column(dstCol)
    # maybe change column name etc.

print(CreateTable(destTable).compile(destEngine))  # <- check the query that will be used to create the table
destTable.create()
Check the docs:
https://docs.sqlalchemy.org/en/13/core/compiler.html
and maybe also this example:
https://gist.github.com/methane/2972461
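The same pattern extends to the other imported types; for example, a sketch (an assumed mapping, adjust to your schema) that renders the mssql DATETIME type as TIMESTAMP on PostgreSQL, reusing the imports from the snippet above:
@compiles(DATETIME, 'postgresql')
def compile_DATETIME_mssql_timestamp(element, compiler, **kw):
    """ Handles mssql DATETIME datatype as TIMESTAMP in postgresql """
    return 'TIMESTAMP'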