Overview
I want to include sqlite extensions in sqlalchemy.
Issues
When I try to load the extension I get a not authorized error.
MVE
Setup engine
import sqlalchemy
engine = sqlalchemy.create_engine('sqlite:///:memory:')
extension = '/path/to/extension.dll'
with engine.begin() as conn:
conn.execute(
'SELECT load_extension(:path)',
path=extension
).fetchall()
Error
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) not authorized
[SQL: SELECT load_extension(:path)]
[parameters: {'path': '/path/to/extension.dll'}]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Known Alternative
The sqlite3 library's connection has the enable_load_extension method. I cannot use sqlite3 because I am using sqlalchemy's ORM heavily. The sqlite3 method loads the extension without issue. Something similar to that method - but in sqlalchemy - would be ideal.
Based on Gord Thompson's hint and VPfB's answer for a similar Stackoverflow question, I was successfully able to enable the extension by attaching a handler for the "connect" event.
Ubuntu
from sqlalchemy import event, create_engine, text
engine = create_engine("sqlite:///:memory:")
#event.listens_for(engine, "connect")
def receive_connect(connection, _):
connection.enable_load_extension(True)
connection.execute("SELECT load_extension('mod_spatialite');")
connection.enable_load_extension(False)
with engine.connect() as connection:
result = connection.execute(text("SELECT spatialite_version() as version;"))
row = result.fetchone()
print(f"Spatialite Version: {row['version']}")
Windows
The path to the folder containing the DLL file (mod_spatialite.dll) must be added in front of the PATH variable. Windows binaries for SpatiaLite can be downloaded from gaia-gis.it.
import os
from sqlalchemy import event, create_engine, text
SPATIALITE_PATH = r"C:\apps\mod_spatialite-5.0.1-win-amd64"
os.environ["PATH"] = f"{SPATIALITE_PATH};{os.environ['PATH']}"
engine = create_engine("sqlite:///:memory:")
#event.listens_for(engine, "connect")
def receive_connect(connection, _):
connection.enable_load_extension(True)
connection.execute("SELECT load_extension('mod_spatialite')")
connection.enable_load_extension(False)
with engine.connect() as connection:
result = connection.execute(text("SELECT spatialite_version() as version;"))
row = result.fetchone()
print(f"Spatialite Version: {row['version']}")
Related
I'm trying to connect to a postgres db using SQL Alchemy and the pg8000 driver. I'd like to specify a search path for this connection. With the Psycopg driver, I could do this by doing something like
engine = create_engine(
'postgresql+psycopg2://dbuser#dbhost:5432/dbname',
connect_args={'options': '-csearch_path={}'.format(dbschema)})
However, this does not work for the pg8000 driver. Is there a good way to do this?
You can use pg8000 pretty much in the same way as psycopg2, just need to swap scheme from postgresql+psycopg2 to postgresql+pg8000.
The full connection string definition is in the SQLAlchemy pg8000 docs:
postgresql+pg8000://user:password#host:port/dbname[?key=value&key=value...]
But while psycopg2.connect will pass kwargs to the server (like options and its content), pg8000.connect will not, so there is no setting search_path with pg8000.
The SQLAlchemy docs describe how to do this. For example:
from sqlalchemy import create_engine, event, text
engine = create_engine("postgresql+pg8000://postgres:postgres#localhost/postgres")
#event.listens_for(engine, "connect", insert=True)
def set_search_path(dbapi_connection, connection_record):
existing_autocommit = dbapi_connection.autocommit
dbapi_connection.autocommit = True
cursor = dbapi_connection.cursor()
cursor.execute("SET SESSION search_path='myschema'")
cursor.close()
dbapi_connection.autocommit = existing_autocommit
with engine.connect() as connection:
result = connection.execute(text("SHOW search_path"))
for row in result:
print(row)
However, as it says in the docs:
SQLAlchemy is generally organized around the concept of keeping this
variable at its default value of public
I am trying to use python sqlalchemy to query our PostgreSQL database view using ODBC but I am getting the error
{ProgrammingError}(pyodbc.ProgrammingError) ('42883', '[42883] ERROR: function schema_name() does not exist;\nError while executing the query (1) (SQLExecDirectW)')
[SQL: SELECT schema_name()]
(Background on this error at: https://sqlalche.me/e/14/f405)
Using the code below, I successfully create the connection engine but executing the query seems to be the problem.
When using 'pyodbc' or 'psycopg2' establishing the connection and querying data does work perfectly, but with a warning
'UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
warnings.warn('
as to why we are looking into establishing the connection the sqlalchemy-way
import config
import sqlalchemy
if __name__ == '__main__':
connection_string = (config.odbc('database_odbc.txt'))
connection_url = sqlalchemy.engine.url.URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})
conn = sqlalchemy.create_engine(connection_url)
query_string = """SELECT [column name in view] FROM public.[name of view]"""
df1 = pd.read_sql(query_string, conn)
print(df1.to_string())
conn.close()
print('Database connection closed.')
As mentioned, the query runs perfectly using the other methods. I already tried different syntax of the database view including
SELECT [column name in view] FROM [database name].public.[name of view]
SELECT [column name in view] FROM [name of view]
and more without success.
Any help is appreciated, thank you!
Thank you #Gord Thompson,
I followed the default postgresql syntax at https://docs.sqlalchemy.org/en/14/core/engines.html
engine = create_engine('postgresql://scott:tiger#localhost/mydatabase')
now the code looks like
import sqlalchemy
if __name__ == '__main__':
engine = create_engine('postgresql://[user]:[password]#[host]/[db]')
conn = engine.connect()
query_string = """SELECT [column name in view] FROM public.[name of view]"""
df1 = pd.read_sql(query_string, conn)
print(df1.to_string())
conn.close()
print('Database connection closed.')
and now it works perfectly, thank you!
In SQLAlchemy I create an engine with :
engine = create_engine(url="oracle+cx_oracle://user:xxxx#tns")
In cx_Oracle, I would create a connection with :
conn = cx_Oracle.connect(user="user", password="xxxx", dsn="tns")
I can then set the module with Connection.module attribute which tags appropriately when looking at v$session table.
conn.module = "MyModule"
Is there a way to set the Oracle session module name to an sqlalchemy.engine.Engine once it is created with create_engine?
I ended up using DialectEvents.do_connect() hook which worked nicely for me.
import cx_Oracle
from sqlalchemy import create_engine, event
engine = create_engine(url="oracle+cx_oracle://user:xxxx#tns")
#event.listens_for(engine, "do_connect")
def receive_do_connect(dialect, conn_rec, cargs, cparams):
"""listen for the 'do_connect' event"""
connection = cx_Oracle.connect(*cargs, **cparams)
connection.module = "MyModule"
return connection
I'm new to hadoop and impala. I managed to connect to impala by installing impyla and executing the following code. This is connection by LDAP:
from impala.dbapi import connect
from impala.util import as_pandas
conn = connect(host="server.lrd.com",port=21050, database='tcad',auth_mechanism='PLAIN', user="alexcj", use_ssl=True,timeout=20, password="secret1pass")
I'm then able to grab a cursor and execute queries as:
cursor = conn.cursor()
cursor.execute('SELECT * FROM tab_2014_m LIMIT 10')
df = as_pandas(cursor)
I'd like to be able use sqlalchemy to connect to impala and be able to use some nice sqlalchemy functions. I found a test file in imyla source code that illustrates how to create an sqlalchemy engine with impala driver like:
engine = create_engine('impala://localhost')
I'd like to be able to do that but I'm not able to because my call to the connect function above has a lot more parameters; and I do not know how to pass those to sqlalchemy's create_engine to get a successful connection. Has anyone done this? Thanks.
As explained at https://github.com/cloudera/impyla/issues/214
import sqlalchemy
def conn():
return connect(host='some_host',
port=21050,
database='default',
timeout=20,
use_ssl=True,
ca_cert='some_pem',
user=user, password=pwd,
auth_mechanism='PLAIN')
engine = sqlalchemy.create_engine('impala://', creator=conn)
If your Impala is secured by Kerberos below script works (due to some reason I need to use hive:// instead of impala://)
import sqlalchemy
from sqlalchemy.engine import create_engine
connect_args={'auth': 'KERBEROS', 'kerberos_service_name': 'impala'}
engine = create_engine('hive://impalad-host:21050', connect_args=connect_args)
conn = engine.connect()
ResultProxy = conn.execute("SELECT * FROM db1.table1 LIMIT 5")
print(ResultProxy.fetchall())
import time
from sqlalchemy import create_engine, MetaData, Table, select, and_
ENGINE = create_engine(
'impala://{host}:{port}/{database}'.format(
host=host, # your host
port=port,
database=database,
)
)
METADATA = MetaData(ENGINE)
TABLES = {
'table': Table('table_name', METADATA, autoload=True),
}
Using SQLAlchemy, an Engine object is created like this:
from sqlalchemy import create_engine
engine = create_engine("postgresql://localhost/mydb")
Accessing engine fails if the database specified in the argument to create_engine (in this case, mydb) does not exist. Is it possible to tell SQLAlchemy to create a new database if the specified database doesn't exist?
SQLAlchemy-Utils provides custom data types and various utility functions for SQLAlchemy. You can install the most recent official version using pip:
pip install sqlalchemy-utils
The database helpers include a create_database function:
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database
engine = create_engine("postgres://localhost/mydb")
if not database_exists(engine.url):
create_database(engine.url)
print(database_exists(engine.url))
On postgres, three databases are normally present by default. If you are able to connect as a superuser (eg, the postgres role), then you can connect to the postgres or template1 databases. The default pg_hba.conf permits only the unix user named postgres to use the postgres role, so the simplest thing is to just become that user. At any rate, create an engine as usual with a user that has the permissions to create a database:
>>> engine = sqlalchemy.create_engine("postgres://postgres#/postgres")
You cannot use engine.execute() however, because postgres does not allow you to create databases inside transactions, and sqlalchemy always tries to run queries in a transaction. To get around this, get the underlying connection from the engine:
>>> conn = engine.connect()
But the connection will still be inside a transaction, so you have to end the open transaction with a commit:
>>> conn.execute("commit")
And you can then proceed to create the database using the proper PostgreSQL command for it.
>>> conn.execute("create database test")
>>> conn.close()
It's possible to avoid manual transaction management while creating database by providing isolation_level='AUTOCOMMIT' to create_engine function:
import sqlalchemy
with sqlalchemy.create_engine(
'postgresql:///postgres',
isolation_level='AUTOCOMMIT'
).connect() as connection:
connection.execute('CREATE DATABASE my_database')
Also if you are not sure that database doesn't exist there is a way to ignore database creation error due to existence by suppressing sqlalchemy.exc.ProgrammingError exception:
import contextlib
import sqlalchemy.exc
with contextlib.suppress(sqlalchemy.exc.ProgrammingError):
# creating database as above
Extending the accepted answer using with yields:
from sqlalchemy import create_engine
engine = create_engine("postgresql://localhost")
NEW_DB_NAME = 'database_name'
with engine.connect() as conn:
conn.execute("commit")
# Do not substitute user-supplied database names here.
conn.execute(f"CREATE DATABASE {NEW_DB_NAME}")
Please note that I couldn't get the above suggestions with database_exists because whenever I check if the database exists using if not database_exists(engine.url): I get this error:
InterfaceError('(pyodbc.InterfaceError) (\'28000\', u\'[28000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for
user \\'myUser\\'. (18456) (SQLDriverConnect); [28000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]Cannot open
database "MY_DATABASE" requested by the login. The login failed.
(4060); [28000] [Microsoft][SQL Server Native Client 11.0][SQL
Server]Login failed for user \\'myUser\\'. (18456); [28000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]Cannot open
database "MY_DATABASE" requested by the login. The login failed.
(4060)\')',)
Also contextlib/suppress was not working and I'm not using postgres so I ended up doing this to ignore the exception if the database happens to already exist with SQL Server:
import logging
import sqlalchemy
logging.basicConfig(filename='app.log', format='%(asctime)s-%(levelname)s-%(message)s', level=logging.DEBUG)
engine = create_engine('mssql+pyodbc://myUser:mypwd#localhost:1234/MY_DATABASE?driver=SQL+Server+Native+Client+11.0?trusted_connection=yes', isolation_level = "AUTOCOMMIT")
try:
engine.execute('CREATE DATABASE ' + a_database_name)
except Exception as db_exc:
logging.exception("Exception creating database: " + str(db_exc))
If someone like me don't want to take whole sqlalchemy_utils to your project just for database creation, you can use script like this. I've come with it, based on SingleNegationElimination's answer. I'm using pydantic here (it's FastAPI project) and my imported settings for reference, but you can easily change this:
from sqlalchemy import create_engine
from sqlalchemy.exc import OperationalError
from pydantic import PostgresDsn
from src.conf import settings
def build_db_connection_url(custom_db: Optional[str] = None):
db_name = f"/{settings.POSTGRES_DB or ''}" if custom_db is None else "/" + custom_db
return PostgresDsn.build(
scheme='postgresql+psycopg2',
user=settings.POSTGRES_USER,
password=settings.POSTGRES_PASSWORD,
host=settings.POSTGRES_HOST,
path=db_name,
)
def create_database(db_name: str):
try:
eng = create_engine(build_db_connection_url(custom_db=db_name))
conn = eng.connect()
conn.close()
except OperationalError as exc:
if "does not exist" in exc.__str__():
eng = create_engine(build_db_connection_url(custom_db="postgres"))
conn = eng.connect()
conn.execute("commit")
conn.execute(f"create database {db_name}")
conn.close()
print(f"Database {db_name} created")
else:
raise exc
eng.dispose()
create_database("test_database")