How to get an existing table using snowflake-sqlalchemy - python

In my Snowflake db, I have already created tables. I used the SQLAlchemy inspector to verify that they were created:
from sqlalchemy import MetaData, Table, create_engine, inspect

engine = create_engine("snowflake://my_connection_string")
db = {}
metadata = MetaData(bind=engine)
inspector = inspect(engine)
schemas = inspector.get_schema_names()
for schema in schemas:
    print(f"schema: {schema}")
    for table_name in inspector.get_table_names(schema=schema):
        print(f"table name: {table_name}")
        db[f"{schema}.{table_name}"] = Table(f"{schema}.{table_name}", metadata)
        print("Columns", end=': ')
        for column in inspector.get_columns(table_name, schema=schema):
            print(f"{column['name']}", end=',')
        print()
However, I'm having trouble fetching those tables using SQLAlchemy MetaData:
engine = create_engine("snowflake://my_connection_string")
meta_data = MetaData(bind=engine)
MetaData.reflect(meta_data)
when I run the above code I get the following error:
ProgrammingError: (snowflake.connector.errors.ProgrammingError) 001059 (22023): SQL compilation error:
Must specify the full search path starting from database for TEST_DB
[SQL: SHOW /* sqlalchemy:_get_schema_primary_keys */PRIMARY KEYS IN SCHEMA test_db]
(Background on this error at: https://sqlalche.me/e/14/f405)
Any suggestions would be much appreciated!
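The error suggests that reflection needs a schema-qualified path (for Snowflake, a database.schema pair such as TEST_DB.PUBLIC). Below is a minimal sketch of passing schema= to MetaData.reflect(); since no Snowflake instance is available here, it uses an in-memory SQLite engine with an ATTACHed database standing in for the schema, and all names are placeholders:

```python
from sqlalchemy import MetaData, create_engine, text
from sqlalchemy.pool import StaticPool

# SQLite stand-in for the Snowflake engine; StaticPool keeps a single
# connection so the ATTACHed database (acting as a schema) persists.
engine = create_engine("sqlite://", poolclass=StaticPool)
with engine.begin() as conn:
    conn.execute(text("ATTACH DATABASE ':memory:' AS test_db"))
    conn.execute(text("CREATE TABLE test_db.users (id INTEGER PRIMARY KEY, name TEXT)"))

# Reflect only the named schema; on Snowflake the equivalent would be
# something like metadata.reflect(bind=engine, schema="TEST_DB.PUBLIC").
metadata = MetaData()
metadata.reflect(bind=engine, schema="test_db")
print(sorted(metadata.tables))  # ['test_db.users']
```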

Related

How to execute a SQLAlchemy TextClause statement with a sqlite3 connection cursor?

I have a Python Flask app which primarily uses SQLAlchemy to execute all of its MySQL queries, and I need to write tests for it using a local database and behave.
After some research, the database I've chosen for this task is a local SQLite3 db, mainly because I've read that it's largely compatible with MySQL and SQLAlchemy, and also because it's easy to set up and tear down.
I've established a connection to it successfully and managed to create all the tables I need for the tests.
I've encountered a problem when trying to execute some queries, where the query statement is being built as a sqlalchemy TextClause object and my sqlite3 connection cursor raises the following exception when trying to execute the statement:
TypeError: argument 1 must be str, not TextClause
How can I convert this TextClause object dynamically to a string and execute it?
I don't want to make drastic changes to the code just for testing.
A code example:
employees table:

| id | name       |
|----|------------|
| 1  | Jeff Bezos |
| 2  | Bill Gates |
from sqlalchemy import text
import sqlite3

def select_employee_by_id(id: int):
    employees_table = 'employees'
    db = sqlite3.connect(":memory:")
    cursor = db.cursor()
    with db as session:
        statement = text("""
            SELECT *
            FROM {employees_table}
            WHERE
                id = :id
        """.format(employees_table=employees_table)
        ).bindparams(id=id)
        data = cursor.execute(statement)
        return data.fetchone()
Should return a row containing {'id': 1, 'name': 'Jeff Bezos'} for select_employee_by_id(1)
Thanks in advance!
If you want to test your TextClause query then you should execute it by using SQLAlchemy, not by using a DBAPI (SQLite) cursor:
from sqlalchemy import create_engine, text

def select_employee_by_id(id: int):
    employees_table = 'employees'
    engine = create_engine("sqlite://")
    with engine.begin() as conn:
        statement = text("""
            SELECT *
            FROM {employees_table}
            WHERE
                id = :id
        """.format(employees_table=employees_table)
        ).bindparams(id=id)
        data = conn.execute(statement)
        return data.one()
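To make that answer fully self-contained, here is a runnable sketch that also creates and populates the employees table first (the table setup is an addition for illustration, not part of the original question):

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import StaticPool

# StaticPool keeps one connection, so the in-memory data survives
# across the separate engine.begin() blocks below.
engine = create_engine("sqlite://", poolclass=StaticPool)

# Set up the example table so the query has data to return.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("INSERT INTO employees (name) VALUES ('Jeff Bezos'), ('Bill Gates')"))

def select_employee_by_id(id: int):
    # The TextClause is executed through SQLAlchemy, not a raw DBAPI cursor.
    statement = text("SELECT * FROM employees WHERE id = :id").bindparams(id=id)
    with engine.begin() as conn:
        return conn.execute(statement).one()

print(tuple(select_employee_by_id(1)))  # (1, 'Jeff Bezos')
```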

Getting sqlalchemy.exc.InvalidRequestError when loading data from Excel to Snowflake

I am trying to load data from Excel to Snowflake using Python.
Below is my code so far:
config = ConfigParser()
# parse ini file
config.read('config.ini')

# Read excel
file = '/path/INTRANSIT.xlsx'
df_excel = pd.read_excel(file, engine='openpyxl')

# sqlalchemy to create DB engine
engine = create_engine(URL(
    account=config.get('Production', 'accountname'),
    user=config.get('Production', 'username'),
    password=config.get('Production', 'password'),
    database=config.get('Production', 'dbname'),
    schema=config.get('Production', 'schemaname'),
    warehouse=config.get('Production', 'warehousename'),
    role=config.get('Production', 'rolename'),
))

con = engine.connect()
df_excel.to_sql('transit_table', con, if_exists='replace', index=False)
con.close()
But I am getting below error:
sqlalchemy.exc.InvalidRequestError: Could not reflect: requested table(s) not available in Engine(snowflake://username:***@account_identifier/): (db.schema.transit_table)
I have tried prefixing the database and schema to the table name, and also passing the table name alone. I have also tried both uppercase and lowercase table names.
Still not able to resolve this error. Would really appreciate any help to resolve this!
Thank you.
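No answer is recorded for this question, but the error means SQLAlchemy could not reflect an existing transit_table before replacing it, and a common culprit with Snowflake is identifier case (unquoted names are folded to uppercase). As a diagnostic sketch, you can list the table names SQLAlchemy actually sees and compare them against the name passed to to_sql(). A SQLite engine stands in here for the Snowflake URL, so the setup step is purely illustrative:

```python
from sqlalchemy import create_engine, inspect, text
from sqlalchemy.pool import StaticPool

# Stand-in engine; in the question this would be create_engine(URL(...)).
engine = create_engine("sqlite://", poolclass=StaticPool)
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE transit_table (id INTEGER)"))

# List the table names the dialect reports; a case mismatch between this
# listing and the name passed to to_sql() is a common cause of
# "requested table(s) not available".
inspector = inspect(engine)
print(inspector.get_table_names())  # ['transit_table']
```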

SQLAlchemy and the sql 'Use Database' command

I am using SQLAlchemy and create_engine to connect to MySQL, build a database, and start populating it with relevant data.
Edit: to preface, the database in question needs to be created first. To do this I run the following commands:
database_address = 'mysql+pymysql://{0}:{1}@{2}:{3}'
database_address = database_address.format('username',
                                           'password',
                                           'ip_address',
                                           'port')
engine = create_engine(database_address, echo=False)
Database_Name = 'DatabaseName'
engine.execute("CREATE DATABASE {0}".format(Database_Name))
Following creation of the database, I try to perform a 'use' command but end up receiving the following error
line 3516, in _escape_identifier
value = value.replace(self.escape_quote, self.escape_to_quote)
AttributeError: 'NoneType' object has no attribute 'replace'
I traced the error to a post which stated that this occurs when using the following command in Python 3:
engine.execute("USE dbname")
What else needs to be included in the execute command to access the MySQL database and not throw an error?
You shouldn't use the USE command - instead you should specify the database you wish to connect to in the create_engine connection url - i.e.:
database_address = 'mysql+pymysql://{0}:{1}@{2}:{3}/{4}'
database_address = database_address.format('username',
                                           'password',
                                           'ip_address',
                                           'port',
                                           'database')
engine = create_engine(database_address, echo=False)
The create_engine docs are here: https://docs.sqlalchemy.org/en/13/core/engines.html#mysql
Figured out what to do.
Following advice from Match, I looked into SQLAlchemy and its ability to create schemas.
Found the following code:
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database

database_address = 'mysql+pymysql://{0}:{1}@{2}:{3}/{4}?charset=utf8mb4'
database_address = database_address.format('username', 'password', 'address', 'port', 'DB')
engine = create_engine(database_address, echo=False)

if not database_exists(engine.url):
    create_database(engine.url)
So, creating an engine with the schema name identified, I can use the database_exists utility to see if the database exists and, if not, create it with the create_database function.

join on two different databases with sqlalchemy

I'm trying to make a join across 2 databases in MSSQL.
here is the SQL query:
SELECT od.Indice, cs.Argued
FROM HN_Ondata.dbo.ODCalls as od
JOIN HN_ADMIN.dbo.CallStatus as cs ON od.CallStatusGroup = cs.StatusGroup
I have tried:
- creating two engines, making the tables with autoload, and querying them
- creating two engines, opening two sessions, and making a subquery
- creating two engines and creating a CTE of table2
- creating a metadata bound to database1, reflecting table1, then calling reflect(bind=database2) for table2
I always end up with this error:
pymssql.ProgrammingError: (208, b"Invalid object name 'CallStatus'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")
The current solution I have is using session.execute with raw SQL. I could stick with it, but I'm curious: is there any way of doing this with the SQLAlchemy ORM?
EDIT 1:
Here is my code :
db1 = DatabaseManager(settings.DATABASE['hrm'], database='HN_Ondata')
db2 = DatabaseManager(settings.DATABASE['hrm'], database='HN_ADMIN')

metadata1 = MetaData(bind=db1.database)
metadata2 = MetaData(bind=db2.database)

table1 = Table('ODCalls', metadata1, autoload=True)
table2 = Table('CallStatus', metadata2, autoload=True)

with db1.session(raise_err=True) as session:
    result = (
        session
        .query(table1.c.Indice, table2.c.Argued)
        .join(table2, table1.c.CallStatusGroup == table2.c.StatusGroup)
        .all()
    )
which produces the following query:
SELECT [ODCalls].[Indice] AS [ODCalls_Indice], [CallStatus].[Argued] AS [CallStatus_Argued]
FROM [ODCalls]
JOIN [CallStatus] ON [ODCalls].[CallStatusGroup] = [CallStatus].[StatusGroup]
Found the solution thanks to Ryan Gadsdon and Ilja Everilä pointing me the way.
You need to specify database.schema in the Table's schema parameter, like this:
table1 = Table('ODCalls', metadata1, autoload=True, schema='HN_Ondata.dbo')
Specifying the database in the schema parameter is only needed when the table belongs to a database your engine is not connected to. If you put database.schema in the schema parameter, you can then use the table with any engine connected to any database on the same server.
http://docs.sqlalchemy.org/en/latest/dialects/mssql.html#multipart-schema-names
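The same two-part schema idea can be sketched end to end. Since MSSQL isn't available here, the example below uses SQLite with an ATTACHed database playing the role of the second database; on SQL Server the schema value would be e.g. 'HN_ADMIN.dbo', and all table and column names are placeholders:

```python
from sqlalchemy import MetaData, Table, create_engine, select, text
from sqlalchemy.pool import StaticPool

# SQLite stand-in: ATTACH plays the role of the second MSSQL database.
# On SQL Server the schema value would be a two-part name like 'HN_ADMIN.dbo'.
engine = create_engine("sqlite://", poolclass=StaticPool)
with engine.begin() as conn:
    conn.execute(text("ATTACH DATABASE ':memory:' AS admin_db"))
    conn.execute(text("CREATE TABLE od_calls (indice INTEGER, status_group TEXT)"))
    conn.execute(text("CREATE TABLE admin_db.call_status (status_group TEXT, argued INTEGER)"))
    conn.execute(text("INSERT INTO od_calls VALUES (1, 'A')"))
    conn.execute(text("INSERT INTO admin_db.call_status VALUES ('A', 42)"))

metadata = MetaData()
od_calls = Table("od_calls", metadata, autoload_with=engine)
# The second database goes in the schema parameter.
call_status = Table("call_status", metadata, schema="admin_db", autoload_with=engine)

stmt = select(od_calls.c.indice, call_status.c.argued).join(
    call_status, od_calls.c.status_group == call_status.c.status_group
)
with engine.connect() as conn:
    rows = conn.execute(stmt).all()
print(rows)  # [(1, 42)]
```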

SQLAlchemy not finding tables, possible connection issues

I'm trying to connect to one of our internal databases using the following code:
engine = create_engine('postgresql+psycopg2://{user}:{passw}@{host}:{port}/{db}'.format(
    user=config3.CANVAS_USERNAME,
    passw=config3.CANVAS_PWD,
    host=config3.CANVAS_BOUNCER,
    port=config3.CANVAS_PORT,
    db='cluster17dr'
))
metadata = MetaData()
metadata.reflect(bind=engine)
print(metadata.tables)
And my only result is a table called 'spatial_ref_sys', which I assume is some kind of metadata. I know that my login stuff is correct, because this works perfectly:
with ppg2.connect(
        database='cluster17dr',
        user=config3.CANVAS_USERNAME,
        password=config3.CANVAS_PWD,
        host=config3.CANVAS_BOUNCER,
        port=config3.CANVAS_PORT) as conn:
    cur = conn.cursor()
    sql = 'SELECT * FROM canvas.switchman_shards LIMIT 10'
    cur.execute(sql)
    res = cur.fetchall()
    print(res)
Any ideas as to what I'm missing in my connection using SQLAlchemy?
By default, if no schema name is specified, SQLAlchemy will only give you tables under the default schema. If you want to reflect tables in a schema other than the default schema (which defaults to public in PostgreSQL), you need to specify the schema keyword to .reflect():
metadata.reflect(..., schema="canvas")
