How to specify a search path with SQL Alchemy and pg8000? - python

I'm trying to connect to a postgres db using SQL Alchemy and the pg8000 driver. I'd like to specify a search path for this connection. With the Psycopg driver, I could do this by doing something like
engine = create_engine(
    'postgresql+psycopg2://dbuser@dbhost:5432/dbname',
    connect_args={'options': '-csearch_path={}'.format(dbschema)})
However, this does not work for the pg8000 driver. Is there a good way to do this?

You can use pg8000 in much the same way as psycopg2; you just need to swap the scheme from postgresql+psycopg2 to postgresql+pg8000.
The full connection string definition is in the SQLAlchemy pg8000 docs:
postgresql+pg8000://user:password@host:port/dbname[?key=value&key=value...]
However, while psycopg2.connect passes keyword arguments such as options (and its content) through to the server, pg8000.connect does not, so search_path cannot be set this way with pg8000.

The SQLAlchemy docs describe how to do this. For example:
from sqlalchemy import create_engine, event, text
engine = create_engine("postgresql+pg8000://postgres:postgres@localhost/postgres")
@event.listens_for(engine, "connect", insert=True)
def set_search_path(dbapi_connection, connection_record):
    existing_autocommit = dbapi_connection.autocommit
    dbapi_connection.autocommit = True
    cursor = dbapi_connection.cursor()
    cursor.execute("SET SESSION search_path='myschema'")
    cursor.close()
    dbapi_connection.autocommit = existing_autocommit
with engine.connect() as connection:
    result = connection.execute(text("SHOW search_path"))
    for row in result:
        print(row)
However, as it says in the docs:
SQLAlchemy is generally organized around the concept of keeping this
variable at its default value of public

Related

How to specify Schema in psycopg2 connection method?

Using the psycopg2 module to connect to the PostgreSQL database using python. Able to execute all queries using the below connection method. Now I want to specify a different schema than public to execute my SQL statements. Is there any way to specify the schema name in the connection method?
conn = psycopg2.connect(host="localhost",
                        port="5432",
                        user="postgres",
                        password="password",
                        database="database",
                        )
I tried to specify schema directly inside the method.
schema="schema2"
But I am getting the following programming error.
ProgrammingError: invalid dsn: invalid connection option "schema"
When we were creating a connection pool with psycopg2's ThreadedConnectionPool, this is how we did it.
from psycopg2.pool import ThreadedConnectionPool
db_conn = ThreadedConnectionPool(
    minconn=1, maxconn=5,
    user="postgres", password="password", database="dbname", host="localhost", port=5432,
    options="-c search_path=dbo,public"
)
You see that options key there in params. That's how we did it.
When you execute a query using a cursor from that connection, unqualified names are looked up in the schemas listed in options (here dbo, then public), in order from left to right.
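To make the left-to-right lookup concrete, here is a toy model in plain Python. It does not talk to PostgreSQL: the `resolve` function and the `catalog` dictionary are made up for illustration, and real name resolution has more rules (for example the implicit pg_catalog schema and temporary schemas).

```python
def resolve(table, search_path, catalog):
    """Return the first schema-qualified match for an unqualified table name."""
    for schema in search_path:
        if table in catalog.get(schema, set()):
            return f"{schema}.{table}"
    return None  # PostgreSQL would report that the relation does not exist


# Hypothetical contents of two schemas, for illustration only.
catalog = {"dbo": {"orders"}, "public": {"orders", "users"}}
search_path = ["dbo", "public"]

print(resolve("orders", search_path, catalog))  # dbo wins because it is listed first
print(resolve("users", search_path, catalog))   # falls through to public
```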
You may try something like this:
psycopg2.connect(host="localhost",
                 port="5432",
                 user="postgres",
                 password="password",
                 database="database",
                 options="-c search_path=dbo,public")
Hope this might help you.
If you are using the string form you need to URL escape the options argument:
postgresql://localhost/airflow?options=-csearch_path%3Ddbo,public
(%3D = URL encoding of =)
This helps if you are using SQLAlchemy for example.
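If you would rather not escape the value by hand, the standard library can do it. This is a minimal sketch: `search_path_url` is a hypothetical helper, and it assumes the base URL has no query string yet.

```python
from urllib.parse import quote


def search_path_url(base_url, schemas):
    """Append a URL-escaped libpq 'options' parameter that sets search_path."""
    options = "-csearch_path=" + ",".join(schemas)
    # Keep the comma readable; '=' must become %3D inside a query value.
    return f"{base_url}?options={quote(options, safe=',')}"


url = search_path_url("postgresql://localhost/airflow", ["dbo", "public"])
print(url)
```

The result is the same string as the hand-written URL above.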

How to connect to a cluster in Amazon Redshift using SQLAlchemy?

In Amazon Redshift's Getting Started Guide, it's mentioned that you can utilize SQL client tools that are compatible with PostgreSQL to connect to your Amazon Redshift Cluster.
In the tutorial, they utilize SQL Workbench/J client, but I'd like to utilize python (in particular SQLAlchemy). I've found a related question, but the issue is that it does not go into the detail or the python script that connects to the Redshift Cluster.
I've been able to connect to the cluster via SQL Workbench/J, since I have the JDBC URL, as well as my username and password, but I'm not sure how to connect with SQLAlchemy.
Based on this documentation, I've tried the following:
from sqlalchemy import create_engine
engine = create_engine('jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')
ERROR:
Could not parse rfc1738 URL from string 'jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy'
I don't think SQLAlchemy "natively" knows about Redshift. SQLAlchemy does not accept JDBC URLs, so drop the jdbc: prefix and use a PostgreSQL scheme instead:
postgresql://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy
Alternatively, you may want to try using sqlalchemy-redshift using the instructions they provide.
I was running into the exact same issue, and then I remembered to include my Redshift credentials:
eng = create_engine('postgresql://[LOGIN]:[PASSWORD]@shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')
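Turning a JDBC URL plus credentials into a URL of this shape can be sketched with the standard library. `jdbc_to_sqlalchemy_url`, the cluster host name, and the credentials below are all made up for illustration. Percent-encoding the user and password matters when they contain characters such as @ or /.

```python
from urllib.parse import quote_plus, urlsplit


def jdbc_to_sqlalchemy_url(jdbc_url, user, password, scheme="postgresql"):
    """Rewrite 'jdbc:<vendor>://host:port/db' as '<scheme>://user:pw@host:port/db'."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Without the 'jdbc:' prefix the rest parses as an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    netloc = f"{quote_plus(user)}:{quote_plus(password)}@{parts.hostname}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return f"{scheme}://{netloc}{parts.path}"


# Hypothetical cluster endpoint and credentials.
url = jdbc_to_sqlalchemy_url(
    "jdbc:redshift://examplecluster.abc123.us-west-2.redshift.amazonaws.com:5439/dev",
    user="admin",
    password="p@ss/word",
)
print(url)
```

The resulting string can be passed to create_engine. Pass scheme="redshift+psycopg2" instead if the sqlalchemy-redshift dialect is installed.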
sqlalchemy-redshift works for me, but only after a few days of research.
Packages (python3.4):
SQLAlchemy==1.0.14 sqlalchemy-redshift==0.5.0 psycopg2==2.6.2
First of all, I checked that my query works in SQL Workbench (http://www.sql-workbench.net), then I got it working in SQLAlchemy (https://stackoverflow.com/a/33438115/2837890 helped me understand that either autocommit or session.commit() is required):
db_credentials = (
    'redshift+psycopg2://{p[redshift_user]}:{p[redshift_password]}@{p[redshift_host]}:{p[redshift_port]}/{p[redshift_database]}'
    .format(p=config['Amazon_Redshift_parameters']))
engine = create_engine(db_credentials, connect_args={'sslmode': 'prefer'})
connection = engine.connect()
result = connection.execute(text(
    "COPY assets FROM 's3://xx/xx/hello.csv' WITH CREDENTIALS "
    "'aws_access_key_id=xxx_id;aws_secret_access_key=xxx'"
    " FORMAT csv DELIMITER ',' IGNOREHEADER 1 ENCODING UTF8;").execution_options(autocommit=True))
result = connection.execute("select * from assets;")
print(result, type(result))
print(result.rowcount)
connection.close()
After that, I got sqlalchemy_redshift's CopyCommand to work, perhaps in a bad way; it looks a little tricky:
import sqlalchemy as sa
tbl2 = sa.Table(TableAssets, sa.MetaData())
copy = dialect_rs.CopyCommand(
    assets,
    data_location='s3://xx/xx/hello.csv',
    access_key_id=access_key_id,
    secret_access_key=secret_access_key,
    truncate_columns=True,
    delimiter=',',
    format='CSV',
    ignore_header=1,
    # empty_as_null=True,
    # blanks_as_null=True,
)
print(str(copy.compile(dialect=RedshiftDialect(), compile_kwargs={'literal_binds': True})))
print(dir(copy))
connection = engine.connect()
connection.execute(copy.execution_options(autocommit=True))
connection.close()
This does the same thing I did with plain SQLAlchemy (execute the query), except that the query is built by CopyCommand. I have not seen any benefit from it :(.
The following works for me with Databricks for all kinds of SQL statements:
import sqlalchemy as SA
import psycopg2
host = 'your_host_url'
username = 'your_user'
password = 'your_passw'
port = 5439
db = 'your_db'
url = "{d}+{driver}://{u}:{p}@{h}:{port}/{db}".\
    format(d="redshift",
           driver='psycopg2',
           u=username,
           p=password,
           h=host,
           port=port,
           db=db)
engine = SA.create_engine(url)
cnn = engine.connect()
strSQL = "your_SQL ..."
try:
    cnn.execute(strSQL)
except:
    raise
import sqlalchemy as db
engine = db.create_engine('postgres://username:password@url:5439/db_name')
This worked for me

pyodbc autocommit does not appear to work with sybase and sqlalchemy

I am connecting to a sybase ASE 15 database from Python 3.4 using pyodbc and executing a stored procedure.
All works as expected if I use native pyodbc:
import pandas as pd
import pyodbc
con = pyodbc.connect('DSN=dsn_name;UID=username;PWD=password', autocommit=True)
df = pd.read_sql("exec p_procedure @GroupName='GROUP'", con)
[Driver is Adaptive Server Enterprise].
I have to set autocommit=True; if I do not, I get the following error:
DatabaseError: Execution failed on sql 'exec ....': ('ZZZZZ', "[ZZZZZ]
[SAP][ASE ODBC Driver][Adaptive Server Enterprise]Stored procedure
'p_procedure' may be run only in unchained transaction mode. The 'SET
CHAINED OFF' command will cause the current session to use unchained
transaction mode.\n (7713) (SQLExecDirectW)")
I attempt to achieve the same using SQLAlchemy (1.0.9):
from sqlalchemy import create_engine, engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import text
url = r'sybase+pyodbc://username:password@dsn'
engine = create_engine(url, echo=True)
sess = sessionmaker(bind=engine).Session()
df = pd.read_sql(text("exec p_procedure @GroupName='GROUP'"),conn.execution_options(autocommit=True))
The error message is the same despite the fact I have specified autocommit=True on the connection. (I have also tested this at the session level but should not be necessary and made no difference).
DBAPIError: (pyodbc.Error) ('ZZZZZ', "[ZZZZZ] [SAP][ASE ODBC
Driver][Adaptive Server Enterprise]....
Can you see anything wrong here?
As always, any help would be much appreciated.
Passing the autocommit=True argument as an item in the connect_args argument dictionary does work:
connect_args = {'autocommit': True}
create_engine(url, connect_args=connect_args)
connect_args – a dictionary of options which will be passed directly
to the DBAPI’s connect() method as additional keyword arguments.
I had some problems with autocommit option. The only thing that worked for me was to change this option to True after establishing connection.
ConnString = 'Driver=%SQL_DRIVER%;Server=%SQL_SERVER%;Uid=%SQL_LOGIN%;Pwd=%SQL_PASSWORD%;'
SQL_CONNECTION = pyodbc.connect(ConnString)
SQL_CONNECTION.autocommit = True

Connect to an URI in postgres

I'm guessing this is a pretty basic question, but I can't figure out why:
import psycopg2
psycopg2.connect("postgresql://postgres:postgres@localhost/postgres")
Is giving the following error:
psycopg2.OperationalError: missing "=" after
"postgresql://postgres:postgres#localhost/postgres" in connection info string
Any idea? According to the docs about connection strings I believe it should work, however it only does like this:
psycopg2.connect("host=localhost user=postgres password=postgres dbname=postgres")
I'm using the latest psycopg2 version on Python2.7.3 on Ubuntu12.04
I would use the urlparse module to parse the URL and then use the result in the connection method. This way it's possible to work around the psycopg2 problem.
from urlparse import urlparse # for python 3+ use: from urllib.parse import urlparse
result = urlparse("postgresql://postgres:postgres@localhost/postgres")
username = result.username
password = result.password
database = result.path[1:]
hostname = result.hostname
port = result.port
connection = psycopg2.connect(
    database = database,
    user = username,
    password = password,
    host = hostname,
    port = port
)
The connection string passed to psycopg2.connect is not parsed by psycopg2: it is passed verbatim to libpq. Support for connection URIs was added in PostgreSQL 9.2.
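Where the installed libpq predates URI support, one workaround is to convert the URI into the keyword/value form that older versions accept. This is a minimal sketch: `uri_to_dsn` is a hypothetical helper, it ignores query parameters, and it assumes no value contains spaces or quotes (libpq would require quoting for those).

```python
from urllib.parse import unquote, urlsplit


def uri_to_dsn(uri):
    """Convert 'postgresql://user:pw@host:port/db' to a 'key=value ...' string."""
    parts = urlsplit(uri)
    fields = {
        "host": parts.hostname,
        "port": parts.port,
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "dbname": parts.path.lstrip("/") or None,
    }
    # Leave out anything the URI did not specify so libpq applies its defaults.
    return " ".join(f"{key}={value}" for key, value in fields.items() if value is not None)


dsn = uri_to_dsn("postgresql://postgres:postgres@localhost/postgres")
print(dsn)
```

For the URI from the question this yields the same keyword/value string that the question reports as working.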
To update on this, Psycopg3 does actually include a way to parse a database connection URI.
Example:
import psycopg # must be psycopg 3
pg_uri = "postgres://jeff:hunter2@example.com/db"
conn_dict = psycopg.conninfo.conninfo_to_dict(pg_uri)
with psycopg.connect(**conn_dict) as conn:
    ...
Another option is to use SQLAlchemy for this. It is not just an ORM: it consists of two distinct components, Core and ORM, and it can be used entirely without the ORM layer.
SQLAlchemy provides this functionality out of the box through the create_engine function. Moreover, the URI lets you specify the DBAPI driver and various PostgreSQL settings.
Some examples:
# default
engine = create_engine("postgresql://user:pass@localhost/mydatabase")
# psycopg2
engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydatabase")
# pg8000
engine = create_engine("postgresql+pg8000://user:pass@localhost/mydatabase")
# psycopg3 (available only in SQLAlchemy 2.0, which is currently in beta)
engine = create_engine("postgresql+psycopg://user:pass@localhost/test")
And here is a fully working example:
import sqlalchemy as sa
# set connection URI here ↓
engine = sa.create_engine("postgresql://user:password@db_host/db_name")
ddl_script = sa.DDL("""
    CREATE TABLE IF NOT EXISTS demo_table (
        id serial PRIMARY KEY,
        data TEXT NOT NULL
    );
""")
with engine.begin() as conn:
    # do DDL and insert data in a transaction
    conn.execute(ddl_script)
    conn.exec_driver_sql("INSERT INTO demo_table (data) VALUES (%s)",
                         [("test1",), ("test2",)])
    conn.execute(sa.text("INSERT INTO demo_table (data) VALUES (:data)"),
                 [{"data": "test3"}, {"data": "test4"}])
with engine.connect() as conn:
    cur = conn.exec_driver_sql("SELECT * FROM demo_table LIMIT 2")
    for name in cur.fetchall():
        print(name)
# you also can obtain raw DBAPI connection
rconn = engine.raw_connection()
SQLAlchemy provides many other benefits:
You can easily switch DBAPI implementations (psycopg2, psycopg2cffi, etc.), or maybe even databases, just by changing the URI.
It implements connection pooling out of the box (both psycopg2 and psycopg3 have connection pooling, but their APIs differ).
It supports asyncio via create_async_engine (psycopg3 also supports asyncio).

How to create db in MySQL with SQLAlchemy?

I need to create a db in MySQL using SQLAlchemy, I am able to connect to a db if it already exists, but I want to be able to create it if it does not exist. These are my tables:
#def __init__(self):
Base = declarative_base()
class utente(Base):
    __tablename__="utente"
    utente_id=Column(Integer,primary_key=True)
    nome_utente=Column(Unicode(20))
    ruolo=Column(String(10))
    MetaData.create_all()
    def __repr(self):
        return "utente: {0}, {1}, id: {2}".format(self.ruolo,self.nome_utente,self.utente_id)
class dbmmas(Base):
    __tablename__="dbmmas"
    db_id=Column(Integer,primary_key=True,autoincrement=True)
    nome_db=Column(String(10))
    censimento=Column(Integer)
    versione=Column(Integer)
    ins_data=Column(DateTime)
    mod_data=Column(DateTime)
    ins_utente=Column(Integer)
    mod_utente=Column(Integer)
    MetaData.create_all()
    def __repr(self):
        return "dbmmas: {0}, censimento {1}, versione {2}".format(self.nome_db,self.censimento,self.versione)
class funzione(Base):
    __tablename__="funzione"
    funzione_id=Column(Integer,primary_key=True,autoincrement=True)
    categoria=Column(String(10))
    nome=Column(String(20))
    def __repr__(self):
        return "funzione:{0},categoria:{1},id:{2} ".format(self.nome,self.categoria,self.funzione_id)
class profilo(Base):
    __tablename__="rel_utente_funzione"
    utente_id=Column(Integer,primary_key=True)
    funzione_id=Column(Integer,primary_key=True)
    amministratore=Column(Integer)
    MetaData.create_all()
    def __repr(self):
        l=lambda x: "amministratore" if x==1 else "generico"
        return "profilo per utente_id:{0}, tipo: {1}, funzione_id: {2}".format(self.utente_id,l(self.amministratore),self.funzione_id)
class aree(Base):
    __tablename__="rel_utente_zona"
    UTB_id=Column(String(10), primary_key=True) # actually this is the seatureSignature of the feature in the shapefile
    utente_id=Column(Integer, primary_key=True)
    amministratore=Column(Integer)
    MetaData.create_all()
    def __repr(self):
        l=lambda x: "amministratore" if x==1 else "generico"
        return "zona: {0}, pe utente_id:{1}, {2}".format(self.UTB_id,self.utente_id,l(self.amministratore))
class rel_utente_dbmmas(Base):
    __tablename__="rel_utente_dbmmas"
    utente_id=Column(Integer,primary_key=True)
    db_id=Column(Integer,primary_key=True)
    amministratore=(Integer)
    MetaData.create_all()
    def __repr(self):
        l=lambda x: "amministratore" if x==1 else "generico"
        return "dbregistrato: {0} per l'utente{1} {2}".format(self.db_id,self.utente_id,l(self.amministratore))
To create a MySQL database, you just connect to the server and create the database:
import sqlalchemy
engine = sqlalchemy.create_engine('mysql://user:password@server') # connect to server
engine.execute("CREATE DATABASE dbname") #create db
engine.execute("USE dbname") # select new db
# use the new db
# continue with your work...
Of course, your user has to have permission to create databases.
You can use SQLAlchemy-Utils for that.
pip install sqlalchemy-utils
Then you can do things like
from sqlalchemy_utils import create_database, database_exists
# "pass" is a reserved word in Python, so the variable is named "password"
url = 'mysql://{0}:{1}@{2}:{3}'.format(user, password, host, port)
if not database_exists(url):
    create_database(url)
I found the answer here, it helped me a lot.
I don't know what the canonical way is, but here's a way to check to see if a database exists by checking against the list of databases, and to create it if it doesn't exist.
from sqlalchemy import create_engine
# This engine just used to query for list of databases
mysql_engine = create_engine('mysql://{0}:{1}@{2}:{3}'.format(user, password, host, port))
# Query for existing databases
existing_databases = mysql_engine.execute("SHOW DATABASES;")
# Results are a list of single item tuples, so unpack each tuple
existing_databases = [d[0] for d in existing_databases]
# Create database if not exists
if database not in existing_databases:
    mysql_engine.execute("CREATE DATABASE {0}".format(database))
    print("Created database {0}".format(database))
# Go ahead and use this engine
db_engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}'.format(user, password, host, port, database))
Here's an alternative method if you don't need to know if the database was created or not.
from sqlalchemy import create_engine
# This engine is only used to create the database
mysql_engine = create_engine('mysql://{0}:{1}@{2}:{3}'.format(user, password, host, port))
# Create the database if it does not exist yet
mysql_engine.execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
# Go ahead and use this engine
db_engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}'.format(user, password, host, port, database))
CREATE DATABASE IF NOT EXISTS dbName;
Would recommend using with:
from sqlalchemy import create_engine
username = ''
password = ''
host = 'localhost'
port = 3306
DB_NAME = 'db_name'
engine = create_engine(f"mysql://{username}:{password}@{host}:{port}")
with engine.connect() as conn:
    # Do not substitute user-supplied database names here.
    conn.execute(f"CREATE DATABASE IF NOT EXISTS {DB_NAME}")
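The comment in that snippet warns against substituting user-supplied names, because an identifier cannot be passed as a bound parameter. If the name does come from outside, one conservative option is to validate it against a whitelist before building the statement. This is a sketch only: `create_database_sql` is a hypothetical helper, and restricting names to letters, digits and underscores is an assumption that is stricter than what MySQL itself allows.

```python
import re

# Letters, digits and underscores only; MySQL limits identifiers to 64 characters.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]{1,64}")


def create_database_sql(name):
    """Build a CREATE DATABASE statement, rejecting suspicious names."""
    if not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"unsafe database name: {name!r}")
    # Backticks are MySQL's identifier quoting.
    return f"CREATE DATABASE IF NOT EXISTS `{name}`"


statement = create_database_sql("db_name")
print(statement)
```

The returned string can then be executed in place of the f-string above.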
The mysqlclient seems to be up to 10 times faster in benchmark tests than PyMySQL, see: What's the difference between MySQLdb, mysqlclient and MySQL connector/Python?.
Yet, why not use a Python-ready package for Python, at least, if it is not about every second of query time? PyMySQL is suggested by the following links, for example:
Using SQLAlchemy to access MySQL without frustrating library installation issues
How to connect MySQL database using Python+SQLAlchemy remotely?.
Python packages:
Install with pip, at best put in "requirements.txt":
PyMySQL
SQLAlchemy
Again, if it is about the best speed of the query, use mysqlclient package. Then you need to install an additional Linux package with sudo apt-get install libmysqlclient-dev.
import statements
Only one needed:
import sqlalchemy
Connection string (= db_url)
Connection string starting with {dialect/DBAPI}+{driver}:
db_url = mysql+pymysql://
where pymysql stands for the used Python package "PyMySQL" as the driver.
Again, if it is about the best speed of the query, use the mysqlclient package. Then you need mysql+mysqldb:// at this point.
For a remote connection, you need to add to the connection string:
host
user
password
database
port (the port only if it is not the standard 3306)
You can create your db_url with several methods. To avoid possible attacks, do not hard-code the user and password (or, ideally, any other variable value) directly in the string:
sqlalchemy.engine.URL.create(), or with .url.URL, see an example at Connecting from Cloud Functions to Cloud SQL or an example which automatically adds ? suffixes, for example ?driver=SQL+Server, at the end of the string at Building a connection URL for mssql+pyodbc with sqlalchemy.engine.url.URL
f"""...{my_var}..."""
"""...{my_var}...""".format(my_var=xyz_var)
...
Example without the url helper of SQLAlchemy:
db_url = "{dialect}+{driver}://{user}:{password}@{host}:{port}/{database}".format(
or:
db_url = "{dialect}+{driver}://{user}:{password}@{host}/{database}?host={host}&port={port}".format(
    dialect = 'mysql',
    driver = 'pymysql',
    user=db_user,
    password=db_pass,
    database=db_name,
    host=db_host,
    port=db_port
)
Other engine configurations
For other connection drivers, dialects and methods, see the SQLAlchemy 1.4 Documentation - Engine Configuration
Create the db if not exists
See How to create a new database using SQLAlchemy?.
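# Option 1: with SQLAlchemy-Utils (pip install sqlalchemy-utils)
from sqlalchemy_utils import create_database, database_exists
engine = sqlalchemy.create_engine(db_url)
if not database_exists(engine.url):
    create_database(engine.url)
# Option 2: issue the statements directly
with engine.connect() as conn:
    conn.execute("commit")
    conn.execute("create database test")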
