How to specify Schema in psycopg2 connection method? - python

Using the psycopg2 module to connect to the PostgreSQL database using python. Able to execute all queries using the below connection method. Now I want to specify a different schema than public to execute my SQL statements. Is there any way to specify the schema name in the connection method?
conn = psycopg2.connect(host="localhost",
port="5432",
user="postgres",
password="password",
database="database",
)
I tried to specify schema directly inside the method.
schema="schema2"
But I am getting the following programming error.
ProgrammingError: invalid dsn: invalid connection option "schema"

When we were working on ThreadConnectionPool which is in psycopg2 and creating connection pool, this is how we did it.
from psycopg2.pool import ThreadedConnectionPool
db_conn = ThreadedConnectionPool(
minconn=1, maxconn=5,
user="postgres", password="password", database="dbname", host="localhost", port=5432,
options="-c search_path=dbo,public"
)
You see that options key there in params. That's how we did it.
When you execute a query using the cursor from that connection, it will search across those schemas mentioned in options i.e., dbo,public in sequence from left to right.
You may try something like this:
psycopg2.connect(host="localhost",
port="5432",
user="postgres",
password="password",
database="database",
options="-c search_path=dbo,public")
Hope this might help you.

If you are using the string form you need to URL escape the options argument:
postgresql://localhost/airflow?options=-csearch_path%3Ddbo,public
(%3D = URL encoding of =)
This helps if you are using SQLAlchemy for example.

Related

How to specify a search path with SQL Alchemy and pg8000?

I'm trying to connect to a postgres db using SQL Alchemy and the pg8000 driver. I'd like to specify a search path for this connection. With the Psycopg driver, I could do this by doing something like
engine = create_engine(
'postgresql+psycopg2://dbuser#dbhost:5432/dbname',
connect_args={'options': '-csearch_path={}'.format(dbschema)})
However, this does not work for the pg8000 driver. Is there a good way to do this?
You can use pg8000 pretty much in the same way as psycopg2, just need to swap scheme from postgresql+psycopg2 to postgresql+pg8000.
The full connection string definition is in the SQLAlchemy pg8000 docs:
postgresql+pg8000://user:password#host:port/dbname[?key=value&key=value...]
But while psycopg2.connect will pass kwargs to the server (like options and its content), pg8000.connect will not, so there is no setting search_path with pg8000.
The SQLAlchemy docs describe how to do this. For example:
from sqlalchemy import create_engine, event, text
engine = create_engine("postgresql+pg8000://postgres:postgres#localhost/postgres")
#event.listens_for(engine, "connect", insert=True)
def set_search_path(dbapi_connection, connection_record):
existing_autocommit = dbapi_connection.autocommit
dbapi_connection.autocommit = True
cursor = dbapi_connection.cursor()
cursor.execute("SET SESSION search_path='myschema'")
cursor.close()
dbapi_connection.autocommit = existing_autocommit
with engine.connect() as connection:
result = connection.execute(text("SHOW search_path"))
for row in result:
print(row)
However, as it says in the docs:
SQLAlchemy is generally organized around the concept of keeping this
variable at its default value of public

SQL Server connection - Works in pyodbc, but not SQLAlchemy

This is a fairly common question but even using the answers on SO like here but I still can't connect.
When I setup my connection to pyodbc I can connect with the following:
cnxn = pyodbc.connect('DRIVER={SQL Server Native Client 11.0};SERVER=ip,port;DATABASE=db;UID=user;PWD=pass')
cursor = cnxn.cursor()
cursor.execute("some select query")
for row in cursor.fetchall():
print(row)
and it works.
However to do a .read_sql() in pandas I need to connect with sqlalchemy.
I have tried with both hosted connections and pass-through pyodbc connections like the below:
quoted = urllib.parse.quote_plus('DRIVER={SQL Server Native Client 11.0};Server=ip;Database=db;UID=user;PWD=pass;Port=port;')
engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))
engine.connect()
I have tried with both SERVER=ip,port format and the separate Port=port parameter like above but still no luck.
The error I'm getting is Login failed for user 'user'. (18456)
Any help is much appreciated.
I assume that you want to create a DataFrame so when you have a cnxn you can pass it to Pandas read_sql_query function.
Example:
cnxn = pyodbc.connect('your connection string')
query = 'some query'
df = pandas.read_sql_query(query, conn)

Load table to Oracle through pandas io SQL

Im executing the following code, the purposes of the exeuction is to create a lookup-table in the Oracle data base to speed up my load of data. The table I want to load in is simply a vector with ID values, so only one column is loaded.
The code is written per below:
lookup = df.id_variable.drop_duplicates()
conn = my_oracle_connection()
obj = lookup.to_sql(name = 'lookup', con = conn, if_exists = 'replace')
I get the following error when exeucting this:
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master
WHERE type='table' AND name=?;': ORA-01036: illegal variable
name/number
I can execute a psql.read_sql() query but above fails.
Now, I dont exactly know how to go about fixing it, im quite new to the technical aspects of getting this to work so any pointers in what direction to take it would be greately appriciated.
Thanks for any time and input!
I had the same issue when using cx_Oracle connection (I was able to use .read_sql function, but not the .to_sql one)
Use SQLalchemy connection instead:
import sqlalchemy as sa
oracle_db = sa.create_engine('oracle://username:password#database')
connection = oracle_db.connect()
dataframe.to_sql('table_name', connection, schema='schema_name', if_exists='append', index=False)
I think the problem happens writing to the Oracle DB using a connection object created by cx_Oracle. SqlAlchemy has a work around:
import cx_Oracle
from sqlalchemy import types, create_engine
conn = create_engine('oracle+cx_oracle://Jeremy:SuperSecret#databasehost:1521/?service_name=gdw')
df.to_sql('TEST', conn, if_exists='replace')

pyodbc autocommit does not appear to work with sybase and sqlalchemy

I am connecting to a sybase ASE 15 database from Python 3.4 using pyodbc and executing a stored procedure.
All works as expected if I use native pyodbc:
import pd
import pyodbc
con = pyodbc.connect('DSN=dsn_name;UID=username;PWD=password', autocommit=True)
df = pd.read_sql("exec p_procecure #GroupName='GROUP'", con)
[Driver is Adaptive Server Enterprise].
I have to have autocommit=True and if I do no I get the following error:
DatabaseError: Execution failed on sql 'exec ....': ('ZZZZZ', "[ZZZZZ]
[SAP][ASE ODBC Driver][Adaptive Server Enterprise]Stored procedure
'p_procedure' may be run only in unchained transaction mode. The 'SET
CHAINED OFF' command will cause the current session to use unchained
transaction mode.\n (7713) (SQLExecDirectW)")
I attempt to achieve the same using SQLAlchemy (1.0.9):
from sqlalchemy import create_engine, engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import text
url = r'sybase+pyodbc://username:password#dsn'
engine = create_engine(url, echo=True)
sess = sessionmaker(bind=engine).Session()
df = pd.read_sql(text("exec p_procedure #GroupName='GROUP'"),conn.execution_options(autocommit=True))
The error message is the same despite the fact I have specified autocommit=True on the connection. (I have also tested this at the session level but should not be necessary and made no difference).
DBAPIError: (pyodbc.Error) ('ZZZZZ', "[ZZZZZ] [SAP][ASE ODBC
Driver][Adaptive Server Enterprise]....
Can you see anything wrong here?
As always, any help would be much appreciated.
Passing the autocommit=True argument as an item in the connect_args argument dictionary does work:
connect_args = {'autocommit': True}
create_engine(url, connect_args=connect_args)
connect_args – a dictionary of options which will be passed directly
to the DBAPI’s connect() method as additional keyword arguments.
I had some problems with autocommit option. The only thing that worked for me was to change this option to True after establishing connection.
ConnString = 'Driver=%SQL_DRIVER%;Server=%SQL_SERVER%;Uid=%SQL_LOGIN%;Pwd=%SQL_PASSWORD%;'
SQL_CONNECTION = pyodbc.connect(ConnString)
SQL_CONNECTION.autocommit = True

Connect to an URI in postgres

I'm guessing this is a pretty basic question, but I can't figure out why:
import psycopg2
psycopg2.connect("postgresql://postgres:postgres#localhost/postgres")
Is giving the following error:
psycopg2.OperationalError: missing "=" after
"postgresql://postgres:postgres#localhost/postgres" in connection info string
Any idea? According to the docs about connection strings I believe it should work, however it only does like this:
psycopg2.connect("host=localhost user=postgres password=postgres dbname=postgres")
I'm using the latest psycopg2 version on Python2.7.3 on Ubuntu12.04
I would use the urlparse module to parse the url and then use the result in the connection method. This way it's possible to overcome the psycop2 problem.
from urlparse import urlparse # for python 3+ use: from urllib.parse import urlparse
result = urlparse("postgresql://postgres:postgres#localhost/postgres")
username = result.username
password = result.password
database = result.path[1:]
hostname = result.hostname
port = result.port
connection = psycopg2.connect(
database = database,
user = username,
password = password,
host = hostname,
port = port
)
The connection string passed to psycopg2.connect is not parsed by psycopg2: it is passed verbatim to libpq. Support for connection URIs was added in PostgreSQL 9.2.
To update on this, Psycopg3 does actually include a way to parse a database connection URI.
Example:
import psycopg # must be psycopg 3
pg_uri = "postgres://jeff:hunter2#example.com/db"
conn_dict = psycopg.conninfo.conninfo_to_dict(pg_uri)
with psycopg.connect(**conn_dict) as conn:
...
Another option is using SQLAlchemy for this. It's not just ORM, it consists of two distinct components Core and ORM, and it can be used completely without using ORM layer.
SQLAlchemy provides such functionality out of the box by create_engine function. Moreover, via URI you can specify DBAPI driver or many various postgresql settings.
Some examples:
# default
engine = create_engine("postgresql://user:pass#localhost/mydatabase")
# psycopg2
engine = create_engine("postgresql+psycopg2://user:pass#localhost/mydatabase")
# pg8000
engine = create_engine("postgresql+pg8000://user:pass#localhost/mydatabase")
# psycopg3 (available only in SQLAlchemy 2.0, which is currently in beta)
engine = create_engine("postgresql+psycopg://user:pass#localhost/test")
And here is a fully working example:
import sqlalchemy as sa
# set connection URI here ↓
engine = sa.create_engine("postgresql://user:password#db_host/db_name")
ddl_script = sa.DDL("""
CREATE TABLE IF NOT EXISTS demo_table (
id serial PRIMARY KEY,
data TEXT NOT NULL
);
""")
with engine.begin() as conn:
# do DDL and insert data in a transaction
conn.execute(ddl_script)
conn.exec_driver_sql("INSERT INTO demo_table (data) VALUES (%s)",
[("test1",), ("test2",)])
conn.execute(sa.text("INSERT INTO demo_table (data) VALUES (:data)"),
[{"data": "test3"}, {"data": "test4"}])
with engine.connect() as conn:
cur = conn.exec_driver_sql("SELECT * FROM demo_table LIMIT 2")
for name in cur.fetchall():
print(name)
# you also can obtain raw DBAPI connection
rconn = engine.raw_connection()
SQLAlchemy provides many other benefits:
You can easily switch DBAPI implementations just by changing URI (psycopg2, psycopg2cffi, etc), or maybe even databases.
It implements connection pooling out of the box (both psycopg2 and psycopg3 has connection pooling, but API is different)
asyncio support via create_async_engine (psycopg3 also supports asyncio).

Categories

Resources