How to create a new database using SQLAlchemy?

Using SQLAlchemy, an Engine object is created like this:
from sqlalchemy import create_engine
engine = create_engine("postgresql://localhost/mydb")
Connecting fails if the database specified in the argument to create_engine (in this case, mydb) does not exist. Is it possible to tell SQLAlchemy to create the database if it doesn't exist?

SQLAlchemy-Utils provides custom data types and various utility functions for SQLAlchemy. You can install the most recent official version using pip:
pip install sqlalchemy-utils
The database helpers include a create_database function:
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database
engine = create_engine("postgres://localhost/mydb")
if not database_exists(engine.url):
    create_database(engine.url)
print(database_exists(engine.url))
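sqlalchemy-utils also ships a drop_database helper, which pairs with the functions above when you want a throwaway database (for tests, say). A minimal sketch, assuming a hypothetical mydb_test name:
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database, drop_database

# Recreate a scratch database from a clean slate.
engine = create_engine("postgresql://localhost/mydb_test")
if database_exists(engine.url):
    drop_database(engine.url)
create_database(engine.url)
# ... run whatever needs the database ...
drop_database(engine.url)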

On postgres, three databases are normally present by default. If you are able to connect as a superuser (e.g., the postgres role), then you can connect to the postgres or template1 databases. The default pg_hba.conf permits only the unix user named postgres to use the postgres role, so the simplest thing is to just become that user. At any rate, create an engine as usual with a user that has permission to create databases:
>>> engine = sqlalchemy.create_engine("postgres://postgres@/postgres")
You cannot use engine.execute() however, because postgres does not allow you to create databases inside transactions, and sqlalchemy always tries to run queries in a transaction. To get around this, get the underlying connection from the engine:
>>> conn = engine.connect()
But the connection will still be inside a transaction, so you have to end the open transaction with a commit:
>>> conn.execute("commit")
And you can then proceed to create the database using the proper PostgreSQL command for it.
>>> conn.execute("create database test")
>>> conn.close()
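For reference, the same steps collected into one non-interactive snippet (nothing new here, just the commands above in sequence):
import sqlalchemy

# Connect as a role that may create databases, end the implicit
# transaction, then issue CREATE DATABASE.
engine = sqlalchemy.create_engine("postgres://postgres@/postgres")
conn = engine.connect()
conn.execute("commit")
conn.execute("create database test")
conn.close()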

It's possible to avoid manual transaction management while creating a database by passing isolation_level='AUTOCOMMIT' to the create_engine function:
import sqlalchemy
with sqlalchemy.create_engine(
    'postgresql:///postgres',
    isolation_level='AUTOCOMMIT'
).connect() as connection:
    connection.execute('CREATE DATABASE my_database')
Also, if you are not sure whether the database already exists, you can ignore the creation error caused by its existence by suppressing the sqlalchemy.exc.ProgrammingError exception:
import contextlib
import sqlalchemy.exc
with contextlib.suppress(sqlalchemy.exc.ProgrammingError):
    # creating database as above
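For reference, the two snippets above combined into one runnable sketch (same database name as above; this keeps the pre-2.0 style of passing plain SQL strings to execute):
import contextlib
import sqlalchemy
import sqlalchemy.exc

# Create the database with AUTOCOMMIT, ignoring the ProgrammingError raised
# when it already exists.
with contextlib.suppress(sqlalchemy.exc.ProgrammingError):
    with sqlalchemy.create_engine(
        'postgresql:///postgres',
        isolation_level='AUTOCOMMIT'
    ).connect() as connection:
        connection.execute('CREATE DATABASE my_database')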

Extending the accepted answer to use a with block yields:
from sqlalchemy import create_engine
engine = create_engine("postgresql://localhost")
NEW_DB_NAME = 'database_name'
with engine.connect() as conn:
    conn.execute("commit")
    # Do not substitute user-supplied database names here.
    conn.execute(f"CREATE DATABASE {NEW_DB_NAME}")

Please note that I couldn't get the database_exists suggestions above to work, because whenever I check whether the database exists using if not database_exists(engine.url): I get this error:
InterfaceError('(pyodbc.InterfaceError) (\'28000\', u\'[28000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for
user \\'myUser\\'. (18456) (SQLDriverConnect); [28000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]Cannot open
database "MY_DATABASE" requested by the login. The login failed.
(4060); [28000] [Microsoft][SQL Server Native Client 11.0][SQL
Server]Login failed for user \\'myUser\\'. (18456); [28000]
[Microsoft][SQL Server Native Client 11.0][SQL Server]Cannot open
database "MY_DATABASE" requested by the login. The login failed.
(4060)\')',)
Also, contextlib.suppress was not working, and I'm not using postgres, so I ended up doing this to ignore the exception if the database happens to already exist on SQL Server:
import logging
from sqlalchemy import create_engine

logging.basicConfig(filename='app.log', format='%(asctime)s-%(levelname)s-%(message)s', level=logging.DEBUG)

engine = create_engine('mssql+pyodbc://myUser:mypwd@localhost:1234/MY_DATABASE?driver=SQL+Server+Native+Client+11.0&trusted_connection=yes', isolation_level="AUTOCOMMIT")

try:
    # a_database_name holds the name of the database to create
    engine.execute('CREATE DATABASE ' + a_database_name)
except Exception as db_exc:
    logging.exception("Exception creating database: " + str(db_exc))

If someone like me doesn't want to pull all of sqlalchemy_utils into your project just for database creation, you can use a script like this. I came up with it based on SingleNegationElimination's answer. I'm using pydantic here (it's a FastAPI project) and my imported settings for reference, but you can easily change this:
from typing import Optional

from sqlalchemy import create_engine
from sqlalchemy.exc import OperationalError
from pydantic import PostgresDsn

from src.conf import settings


def build_db_connection_url(custom_db: Optional[str] = None):
    db_name = f"/{settings.POSTGRES_DB or ''}" if custom_db is None else "/" + custom_db
    return PostgresDsn.build(
        scheme='postgresql+psycopg2',
        user=settings.POSTGRES_USER,
        password=settings.POSTGRES_PASSWORD,
        host=settings.POSTGRES_HOST,
        path=db_name,
    )


def create_database(db_name: str):
    try:
        eng = create_engine(build_db_connection_url(custom_db=db_name))
        conn = eng.connect()
        conn.close()
    except OperationalError as exc:
        if "does not exist" in exc.__str__():
            eng = create_engine(build_db_connection_url(custom_db="postgres"))
            conn = eng.connect()
            conn.execute("commit")
            conn.execute(f"create database {db_name}")
            conn.close()
            print(f"Database {db_name} created")
        else:
            raise exc
    eng.dispose()


create_database("test_database")

Related

How can I allow loading sqlite extensions with a sqlalchemy engine?

Overview
I want to include sqlite extensions in sqlalchemy.
Issues
When I try to load the extension I get a not authorized error.
MVE
Setup engine
import sqlalchemy
engine = sqlalchemy.create_engine('sqlite:///:memory:')
extension = '/path/to/extension.dll'
with engine.begin() as conn:
    conn.execute(
        'SELECT load_extension(:path)',
        path=extension
    ).fetchall()
Error
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) not authorized
[SQL: SELECT load_extension(:path)]
[parameters: {'path': '/path/to/extension.dll'}]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Known Alternative
The sqlite3 library's connection has the enable_load_extension method. I cannot use sqlite3 because I am using sqlalchemy's ORM heavily. The sqlite3 method loads the extension without issue. Something similar to that method - but in sqlalchemy - would be ideal.
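For comparison, the plain-sqlite3 version of that alternative looks roughly like this (the extension path is a placeholder, and your Python build must have loadable-extension support compiled in):
import sqlite3

con = sqlite3.connect(":memory:")
con.enable_load_extension(True)   # allow SELECT load_extension(...)
con.execute("SELECT load_extension('/path/to/extension')")
con.enable_load_extension(False)  # lock it down again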
Based on Gord Thompson's hint and VPfB's answer to a similar Stack Overflow question, I was able to enable the extension by attaching a handler for the "connect" event.
Ubuntu
from sqlalchemy import event, create_engine, text
engine = create_engine("sqlite:///:memory:")
@event.listens_for(engine, "connect")
def receive_connect(connection, _):
    connection.enable_load_extension(True)
    connection.execute("SELECT load_extension('mod_spatialite');")
    connection.enable_load_extension(False)

with engine.connect() as connection:
    result = connection.execute(text("SELECT spatialite_version() as version;"))
    row = result.fetchone()
    print(f"Spatialite Version: {row['version']}")
Windows
The path to the folder containing the DLL file (mod_spatialite.dll) must be added in front of the PATH variable. Windows binaries for SpatiaLite can be downloaded from gaia-gis.it.
import os
from sqlalchemy import event, create_engine, text
SPATIALITE_PATH = r"C:\apps\mod_spatialite-5.0.1-win-amd64"
os.environ["PATH"] = f"{SPATIALITE_PATH};{os.environ['PATH']}"
engine = create_engine("sqlite:///:memory:")
@event.listens_for(engine, "connect")
def receive_connect(connection, _):
    connection.enable_load_extension(True)
    connection.execute("SELECT load_extension('mod_spatialite')")
    connection.enable_load_extension(False)

with engine.connect() as connection:
    result = connection.execute(text("SELECT spatialite_version() as version;"))
    row = result.fetchone()
    print(f"Spatialite Version: {row['version']}")

Google Cloud Function with Google Cloud SQL: Python SQLalchemy engine works, but connection fails

I have a simple test script to ensure I can connect to, and read from and write to, a Google Cloud SQL instance from a Google Cloud Function (GCF).
def update_db(request):
    import sqlalchemy
    # connect to GCP SQL db
    user = 'root'
    pw = 'XXX'
    conn_name = 'hb-project-319121:us-east4:hb-data'
    db_name = 'Homebase'
    url = 'mysql+pymysql://{}:{}@/{}?unix_socket=/cloudsql/{}'.format(user, pw, db_name, conn_name)
    engine = sqlalchemy.create_engine(url, pool_size=1, max_overflow=0)
    # Returns
    return f'Success'
This is successful but when I try to add a connection:
conn = engine.connect()
I get the following error:
pymysql.err.OperationalError: (1045, "Access denied for user 'root'@'cloudsqlproxy~' (using password: YES)")
Seems odd that engine can be created but no connection can be made to it? Worth noting that any kind of execute using the engine, e.g. engine.execute('select * from table limit 3') will also lead to the error above.
Any thoughts are appreciated. Thanks for your time
Cheers,
Dave
When the create_engine method is called, it creates the necessary objects, but does not actually attempt to connect to the database.
Per the SQLAlchemy documentation:
[create_engine] creates a Dialect object tailored towards [the database in the connection string], as well as a Pool object which will establish a DBAPI connection at [host]:[port] when a connection request is first received.
The error message you receive from MySQL is likely due to the fact that the special user account for the Cloud SQL Auth proxy has not yet been created.
You need to create a user in Cloud SQL (MySQL): '[USER_NAME]'@'cloudsqlproxy~%' or '[USER_NAME]'@'cloudsqlproxy~[IP_ADDRESS]'.
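Because create_engine is lazy, a quick way to confirm the credentials from inside the function is to run a trivial query right away. The missing account itself can be created from any client that can already reach the instance, e.g. CREATE USER 'app_user'@'cloudsqlproxy~%' IDENTIFIED BY '...'. A minimal sketch reusing the question's (placeholder) connection details:
import sqlalchemy
from sqlalchemy.exc import OperationalError

def update_db(request):
    url = ('mysql+pymysql://root:XXX@/Homebase'
           '?unix_socket=/cloudsql/hb-project-319121:us-east4:hb-data')
    engine = sqlalchemy.create_engine(url, pool_size=1, max_overflow=0)
    # Force a real connection so credential problems surface here,
    # not on the first query later on.
    try:
        with engine.connect() as conn:
            conn.execute(sqlalchemy.text('SELECT 1'))
    except OperationalError as exc:
        return f'Connection failed: {exc}'
    return 'Success'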

pyodbc autocommit does not appear to work with sybase and sqlalchemy

I am connecting to a sybase ASE 15 database from Python 3.4 using pyodbc and executing a stored procedure.
All works as expected if I use native pyodbc:
import pandas as pd
import pyodbc
con = pyodbc.connect('DSN=dsn_name;UID=username;PWD=password', autocommit=True)
df = pd.read_sql("exec p_procedure @GroupName='GROUP'", con)
[Driver is Adaptive Server Enterprise].
I have to have autocommit=True, and if I do not I get the following error:
DatabaseError: Execution failed on sql 'exec ....': ('ZZZZZ', "[ZZZZZ]
[SAP][ASE ODBC Driver][Adaptive Server Enterprise]Stored procedure
'p_procedure' may be run only in unchained transaction mode. The 'SET
CHAINED OFF' command will cause the current session to use unchained
transaction mode.\n (7713) (SQLExecDirectW)")
I attempt to achieve the same using SQLAlchemy (1.0.9):
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import text

url = r'sybase+pyodbc://username:password@dsn'
engine = create_engine(url, echo=True)
sess = sessionmaker(bind=engine)()
conn = engine.connect()
df = pd.read_sql(text("exec p_procedure @GroupName='GROUP'"),
                 conn.execution_options(autocommit=True))
The error message is the same despite the fact that I have specified autocommit=True on the connection. (I have also tested this at the session level, but it should not be necessary and it made no difference.)
DBAPIError: (pyodbc.Error) ('ZZZZZ', "[ZZZZZ] [SAP][ASE ODBC
Driver][Adaptive Server Enterprise]....
Can you see anything wrong here?
As always, any help would be much appreciated.
Passing the autocommit=True argument as an item in the connect_args argument dictionary does work:
connect_args = {'autocommit': True}
create_engine(url, connect_args=connect_args)
connect_args – a dictionary of options which will be passed directly
to the DBAPI’s connect() method as additional keyword arguments.
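A short end-to-end sketch of that approach, assuming the same DSN and stored procedure as in the question:
import pandas as pd
from sqlalchemy import create_engine

url = r'sybase+pyodbc://username:password@dsn'
# autocommit is passed straight through to pyodbc.connect(), so the
# procedure runs in unchained transaction mode.
engine = create_engine(url, connect_args={'autocommit': True})
df = pd.read_sql("exec p_procedure @GroupName='GROUP'", engine)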
I had some problems with the autocommit option. The only thing that worked for me was to change this option to True after establishing the connection:
ConnString = 'Driver=%SQL_DRIVER%;Server=%SQL_SERVER%;Uid=%SQL_LOGIN%;Pwd=%SQL_PASSWORD%;'
SQL_CONNECTION = pyodbc.connect(ConnString)
SQL_CONNECTION.autocommit = True

How do I connect to SQL Server via sqlalchemy using Windows Authentication?

sqlalchemy, a db connection module for Python, uses SQL Authentication (database-defined user accounts) by default. If you want to use your Windows (domain or local) credentials to authenticate to the SQL Server, the connection string must be changed.
By default, as defined by sqlalchemy, the connection string to connect to the SQL Server is as follows:
sqlalchemy.create_engine('mssql://*username*:*password*@*server_name*/*database_name*')
This, if used using your Windows credentials, would throw an error similar to this:
sqlalchemy.exc.DBAPIError: (Error) ('28000', "[28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '***S\\username'. (18456) (SQLDriverConnect); [28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '***S\\username'. (18456)") None None
In this error message, the code 18456 identifies the error message thrown by the SQL Server itself. This error signifies that the credentials are incorrect.
In order to use Windows Authentication with sqlalchemy and mssql, the following connection string is required:
ODBC Driver:
engine = sqlalchemy.create_engine('mssql://*server_name*/*database_name*?trusted_connection=yes')
SQL Express Instance:
engine = sqlalchemy.create_engine('mssql://*server_name*\\SQLEXPRESS/*database_name*?trusted_connection=yes')
If you're using a trusted connection/AD and not using username/password, or otherwise see the following:
SAWarning: No driver name specified; this is expected by PyODBC when using >DSN-less connections
"No driver name specified; "
Then this method should work:
from sqlalchemy import create_engine
server = <your_server_name>
database = <your_database_name>
engine = create_engine('mssql+pyodbc://' + server + '/' + database + '?trusted_connection=yes&driver=ODBC+Driver+13+for+SQL+Server')
A more recent response if you want to connect to the MSSQL DB as a different user than the one you're logged in with on Windows. It also works if you are connecting from a Linux machine with FreeTDS installed.
The following worked for me from both Windows 10 and Ubuntu 18.04 using Python 3.6 & 3.7:
import getpass
from sqlalchemy import create_engine
password = getpass.getpass()
eng_str = fr'mssql+pymssql://{domain}\{username}:{password}@{hostip}/{db}'
engine = create_engine(eng_str)
What changed was to add the Windows domain before \username.
You'll need to install the pymssql package.
Create Your SqlAlchemy Connection URL From Your pyodbc Connection String OR Your Known Connection Parameters
I found all the other answers to be educational, and I found the SqlAlchemy Docs on connection strings helpful too, but I kept failing to connect to MS SQL Server Express 19 where I was using no username or password and trusted_connection='yes' (just doing development at this point).
Then I found THIS method in the SqlAlchemy Docs on Connection URLs built from a pyodbc connection string (or just a connection string), which is also built from known connection parameters (i.e. this can simply be thought of as a connection string that is not necessarily used in pyodbc). Since I knew my pyodbc connection string was working, this seemed like it would work for me, and it did!
This method takes the guesswork out of creating the correct format for what you feed to the SqlAlchemy create_engine method. If you know your connection parameters, you put those into a simple string per the documentation exemplified by the code below, and the create method in the URL class of the sqlalchemy.engine module does the correct formatting for you.
The example code below runs as is and assumes a database named master and an existing table named table_one with the schema shown below. Also, I am using pandas to import my table data. Otherwise, we'd want to use a context manager to manage connecting to the database and then closing the connection like HERE in the SqlAlchemy docs.
import pandas as pd
import sqlalchemy
from sqlalchemy.engine import URL
# table_one dictionary:
table_one = {'name': 'table_one',
             'columns': ['ident int IDENTITY(1,1) PRIMARY KEY',
                         'value_1 int NOT NULL',
                         'value_2 int NOT NULL']}
# pyodbc stuff for MS SQL Server Express
driver='{SQL Server}'
server=r'localhost\SQLEXPRESS'
database='master'
trusted_connection='yes'
# pyodbc connection string
connection_string = f'DRIVER={driver};SERVER={server};'
connection_string += f'DATABASE={database};'
connection_string += f'TRUSTED_CONNECTION={trusted_connection}'
# create sqlalchemy engine connection URL
connection_url = URL.create(
    "mssql+pyodbc", query={"odbc_connect": connection_string})
""" more code not shown that uses pyodbc without sqlalchemy """
engine = sqlalchemy.create_engine(connection_url)
d = {'value_1': [1, 2], 'value_2': [3, 4]}
df = pd.DataFrame(data=d)
df.to_sql('table_one', engine, if_exists="append", index=False)
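If you'd rather manage the connection explicitly (as mentioned above) instead of handing the engine to pandas, one variant is engine.begin(), reusing the engine and df from the example above:
# engine.begin() opens a connection, wraps the work in a transaction,
# and commits (or rolls back) when the block exits.
with engine.begin() as connection:
    df.to_sql('table_one', connection, if_exists="append", index=False)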
Update
Let's say you've installed SQL Server Express on your linux machine. You can use the following commands to make sure you're using the correct strings:
For the driver: odbcinst -q -d
For the server: sqlcmd -S localhost -U <username> -P <password> -Q 'select @@SERVERNAME'
pyodbc
I think that you need to put:
"+pyodbc" after mssql
try this:
from sqlalchemy import create_engine
engine = create_engine("mssql+pyodbc://user:password@host:port/databasename?driver=ODBC+Driver+17+for+SQL+Server")
cnxn = engine.connect()
It works for me
Luck!
If you are attempting to connect:
DSN-less
Windows Authentication for a server not locally hosted.
Without using ODBC connections.
Try the following:
import sqlalchemy
engine = sqlalchemy.create_engine('mssql+pyodbc://' + server + '/' + database + '?trusted_connection=yes&driver=SQL+Server')
This avoids using ODBC connections and thus avoids pyodbc interface errors from DBAPI2 vs DBAPI3 conflicts.
I would recommend using the URL creation tool instead of creating the url from scratch.
connection_url = sqlalchemy.engine.URL.create("mssql+pyodbc",database=databasename, host=servername, query = {'driver':'SQL Server'})
engine = sqlalchemy.create_engine(connection_url)
See this link for creating a connection string with SQL Server Authentication (non-domain, uses username and password)
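For that username/password (SQL Server Authentication) case, the same URL helper works; a sketch with placeholder credentials:
import sqlalchemy
from sqlalchemy.engine import URL

# Every value below is a placeholder for a SQL-authenticated login.
connection_url = URL.create(
    "mssql+pyodbc",
    username="my_sql_login",
    password="my_password",
    host="server_name",
    database="database_name",
    query={"driver": "ODBC Driver 17 for SQL Server"},
)
engine = sqlalchemy.create_engine(connection_url)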

Connect to an URI in postgres

I'm guessing this is a pretty basic question, but I can't figure out why:
import psycopg2
psycopg2.connect("postgresql://postgres:postgres@localhost/postgres")
Is giving the following error:
psycopg2.OperationalError: missing "=" after "postgresql://postgres:postgres@localhost/postgres" in connection info string
Any idea? According to the docs about connection strings I believe it should work, however it only does like this:
psycopg2.connect("host=localhost user=postgres password=postgres dbname=postgres")
I'm using the latest psycopg2 version on Python 2.7.3 on Ubuntu 12.04.
I would use the urlparse module to parse the url and then use the result in the connection method. This way it's possible to overcome the psycopg2 problem.
from urlparse import urlparse # for python 3+ use: from urllib.parse import urlparse
result = urlparse("postgresql://postgres:postgres@localhost/postgres")
username = result.username
password = result.password
database = result.path[1:]
hostname = result.hostname
port = result.port
connection = psycopg2.connect(
database = database,
user = username,
password = password,
host = hostname,
port = port
)
The connection string passed to psycopg2.connect is not parsed by psycopg2: it is passed verbatim to libpq. Support for connection URIs was added in PostgreSQL 9.2.
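If in doubt about which libpq your psycopg2 was built against, you can check at runtime (libpq_version is available in recent psycopg2 releases); with 9.2 or newer the URI form from the question works as-is:
import psycopg2
import psycopg2.extensions

# e.g. 90224 means libpq 9.2.24; anything >= 90200 understands URIs.
print(psycopg2.extensions.libpq_version())

conn = psycopg2.connect("postgresql://postgres:postgres@localhost/postgres")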
To update on this, Psycopg3 does actually include a way to parse a database connection URI.
Example:
import psycopg  # must be psycopg 3

pg_uri = "postgres://jeff:hunter2@example.com/db"
conn_dict = psycopg.conninfo.conninfo_to_dict(pg_uri)

with psycopg.connect(**conn_dict) as conn:
    ...
Another option is using SQLAlchemy for this. It's not just an ORM; it consists of two distinct components, Core and ORM, and it can be used completely without the ORM layer.
SQLAlchemy provides such functionality out of the box with the create_engine function. Moreover, via the URI you can specify the DBAPI driver and various postgresql settings.
Some examples:
# default
engine = create_engine("postgresql://user:pass@localhost/mydatabase")
# psycopg2
engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydatabase")
# pg8000
engine = create_engine("postgresql+pg8000://user:pass@localhost/mydatabase")
# psycopg3 (available only in SQLAlchemy 2.0, which is currently in beta)
engine = create_engine("postgresql+psycopg://user:pass@localhost/test")
And here is a fully working example:
import sqlalchemy as sa

# set connection URI here ↓
engine = sa.create_engine("postgresql://user:password@db_host/db_name")

ddl_script = sa.DDL("""
    CREATE TABLE IF NOT EXISTS demo_table (
        id serial PRIMARY KEY,
        data TEXT NOT NULL
    );
""")

with engine.begin() as conn:
    # do DDL and insert data in a transaction
    conn.execute(ddl_script)
    conn.exec_driver_sql("INSERT INTO demo_table (data) VALUES (%s)",
                         [("test1",), ("test2",)])
    conn.execute(sa.text("INSERT INTO demo_table (data) VALUES (:data)"),
                 [{"data": "test3"}, {"data": "test4"}])

with engine.connect() as conn:
    cur = conn.exec_driver_sql("SELECT * FROM demo_table LIMIT 2")
    for name in cur.fetchall():
        print(name)

# you also can obtain a raw DBAPI connection
rconn = engine.raw_connection()
SQLAlchemy provides many other benefits:
You can easily switch DBAPI implementations just by changing the URI (psycopg2, psycopg2cffi, etc.), or maybe even databases.
It implements connection pooling out of the box (both psycopg2 and psycopg3 have connection pooling, but the APIs are different).
asyncio support via create_async_engine (psycopg3 also supports asyncio); see the sketch below.
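A minimal async sketch for that last point, assuming the asyncpg driver is installed (connection details are placeholders):
import asyncio
import sqlalchemy as sa
from sqlalchemy.ext.asyncio import create_async_engine

async def main():
    # create_async_engine returns an AsyncEngine; connections are awaited.
    engine = create_async_engine("postgresql+asyncpg://user:password@db_host/db_name")
    async with engine.connect() as conn:
        result = await conn.execute(sa.text("SELECT 1"))
        print(result.scalar())
    await engine.dispose()

asyncio.run(main())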
