I am writing code to create a GUI in Python in the Spyder environment of Anaconda. Within this code I work with a PostgreSQL database, so I use the psycopg2 database adapter to interact with it directly from the GUI.
The code is too long to post here, as it is over 3000 lines, but to summarize, I have no problem interacting with my database except when I try to drop a table.
When I do so, the GUI frames become unresponsive, the drop table query doesn't drop the intended table and no errors or anything else of that kind are thrown.
Within my code, all operations which result in a table being dropped are processed via a function (DeleteTable). Calling the function itself works: several print statements I inserted confirm that everything is in order up to that point. The problem occurs when the statement is executed by the cur.execute(sql) line.
Can anybody figure out why my tables won't drop?
def DeleteTable(table_name):
    conn=psycopg2.connect("host='localhost' dbname='trial2' user='postgres' password='postgres'")
    cur=conn.cursor()
    sql="""DROP TABLE """+table_name+""";"""
    cur.execute(sql)
    conn.commit()
That must be because a concurrent transaction is holding a lock that blocks the DROP TABLE statement.
Examine the pg_stat_activity view and watch out for sessions with state equal to idle in transaction or active that have an xact_start of more than a few seconds ago.
This is essentially an application bug: you must make sure that all transactions are closed immediately, otherwise Bad Things can happen.
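As a rough sketch of that check, the helper below lists sessions that are idle in transaction or active and whose transaction started more than a given number of seconds ago. The names find_long_transactions and min_age_seconds, and the five-second default, are assumptions made for illustration; the cursor is expected to be a psycopg2 cursor (or any DB-API cursor using the %s parameter style) connected to the same server.

```python
# Sessions whose transaction has been open longer than a threshold.
# pid, state, xact_start and query are columns of pg_stat_activity.
LONG_TRANSACTIONS_SQL = """
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active')
  AND xact_start < now() - %s * interval '1 second'
ORDER BY xact_start
"""


def find_long_transactions(cursor, min_age_seconds=5):
    """Return (pid, state, xact_start, query) rows for long-open transactions.

    `cursor` is assumed to be an open cursor on the PostgreSQL server
    where the DROP TABLE hangs (hypothetical helper, not part of psycopg2).
    """
    cursor.execute(LONG_TRANSACTIONS_SQL, (min_age_seconds,))
    return cursor.fetchall()
```

Running this from a separate session while the GUI is frozen should show which session is still holding its transaction open.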
I had the same issue when using psycopg2 through Airflow's PostgresHook, and I resolved it with a with statement. This probably works because psycopg2 commits the transaction when the with block exits without an exception, so no open transaction is left holding locks.
def drop_table():
    with PostgresHook(postgres_conn_id="your_connection").get_conn() as conn:
        cur = conn.cursor()
        cur.execute("DROP TABLE IF EXISTS your_table")

task_drop_table = PythonOperator(
    task_id="drop_table",
    python_callable=drop_table
)
A similar fix should be possible for the original code above (I didn't test this one):
def DeleteTable(table_name):
    with psycopg2.connect("host='localhost' dbname='trial2' user='postgres' password='postgres'") as conn:
        cur=conn.cursor()
        sql="""DROP TABLE """+table_name+""";"""
        cur.execute(sql)
        conn.commit()
Please comment if anyone tries this.
Related
I have a python script which creates a database and then enters an infinite loop which runs once per second querying the database with some selects.
At the same time I connect to the database with a sqlite cli and try to make an update but I get a database is locked error.
Here the (anonymized) code of the script:
import sqlite3
import time
con = sqlite3.connect(r'path\to\database.sqlite')
con.execute('DROP TABLE IF EXISTS blah;')
con.execute('CREATE TABLE blah;')
con.execute('INSERT INTO blah;')
con.commit()
while True:
    result = con.execute('SELECT blah')
    print(result.fetchone()[0])
    time.sleep(1)
Python's sqlite3 module tries to be clever and manages transactions for you.
To ensure that you can access the database from other threads/processes, disable that (set isolation_level to None), and use explicit transactions, when needed.
Alternatively, call con.commit() whenever you are finished.
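A minimal sketch of the first suggestion, assuming a throwaway database file in a temporary directory and a hypothetical one-column table: the polling connection is opened with isolation_level=None, a transaction is started explicitly only around the write, and a second connection (standing in for the sqlite CLI) can then update the table.

```python
import os
import sqlite3
import tempfile

# Throwaway database file (assumption for the demo; use your real path).
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

# isolation_level=None: the module no longer opens transactions implicitly.
reader = sqlite3.connect(path, isolation_level=None)
reader.execute("CREATE TABLE blah (value INTEGER)")

# Open an explicit transaction only where one is actually needed.
reader.execute("BEGIN")
reader.execute("INSERT INTO blah (value) VALUES (1)")
reader.execute("COMMIT")

# fetchall() consumes the whole result set, so no read stays pending.
rows = reader.execute("SELECT value FROM blah").fetchall()

# A second connection, playing the role of the sqlite CLI, can now write.
writer = sqlite3.connect(path, timeout=1)
writer.execute("UPDATE blah SET value = 2")
writer.commit()

updated = reader.execute("SELECT value FROM blah").fetchall()
print(rows, updated)
```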
I'm using PostgreSQL 9.3, and SQLAlchemy 1.0.11
I have code that looks like this:
import sqlalchemy as sa
engine = sa.create_engine('postgresql+psycopg2://me@myhost/mydb')
conn = engine.connect()
metadata = sa.MetaData()
# Real table has more columns
mytable = sa.Table(
    'my_temp_table', metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('something', sa.String(200)),
    prefixes=['TEMPORARY'],
)
metadata.create_all(engine)
pg_conn = engine.raw_connection()
with pg_conn.cursor() as cursor:
    cursor.copy_expert('''COPY my_temp_table (id, something)
                          FROM STDIN WITH CSV''',
                       open('somecsvfile', 'r'))
Now this works just fine: cursor.rowcount reports the expected number of rows inserted. I can even run cursor.execute('SELECT count(*) FROM my_temp_table'); print(cursor.fetchone()) and it will display the same number. The problem is when I try to run a query from SQLAlchemy's connection, e.g.:
result = conn.execute(sa.text('SELECT count(*) FROM my_temp_table'))
It doesn't matter where I put that. I've tried several places:
inside the with block
outside the with block
after a cursor.close()
after a pg_conn.close()
Nothing seems to work - no matter where I run the query from, it barfs with:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "my_temp_table" does not exist
The funny thing is that if I wrap that code in a try/except then I can do cursor.execute(...) in the except block successfully.
Actually, now that I'm writing this out, it appears that the SQLAlchemy connection fails to see that those tables exist no matter where I use it.
So what gives? Why doesn't my SQLAlchemy connection see these tables, but the postgres (engine.raw_connection()) does?
Edit:
To further the mystery: if I create the connection after the metadata.create_all(engine), it works! Well, sort of.
I can select from the tables, but then when I get the engine.raw_connection() it fails on .copy_expert because it can't find the table.
The first thing to note is that temporary tables are only visible to the connection which created them.
The second is that an Engine doesn't encapsulate a single connection; it manages a connection pool.
Finally, the documentation points out that operations performed directly on an Engine (engine.execute("select ...") in their example) will internally acquire and release their own connections.
With all of this in mind, it's clear what's going on in your example:
conn = engine.connect() acquires Connection #1 from the pool.
metadata.create_all(engine) implicitly acquires Connection #2 (as #1 is still "in use" from the engine's perspective), uses it to create the table, and releases it back to the pool.
pg_conn = engine.raw_connection() acquires #2 again, so the COPY executed via this object can still see the table.
conn is still using #1, and nothing you do via this object will be able to see your temp table.
In your second case:
metadata.create_all(engine) implicitly acquires/uses/releases Connection #1.
conn = engine.connect() acquires #1 and holds it.
pg_conn = engine.raw_connection() acquires #2, and the COPY fails to find the temp table.
The moral of the story: if you're doing something which relies on the connection state, you'd better be sure which connection you're using. Running commands directly on the engine is fine for standalone operations, but for anything involving temp tables, you should acquire one connection and stick with it through every step (including the table creation, which I suggest you change to metadata.create_all(conn)).
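The first point can be seen in isolation with a small sketch. It uses the standard-library sqlite3 module as a stand-in (an assumption made so the example runs without a PostgreSQL server); temporary tables are likewise private to the connection that created them there. The names conn_1 and conn_2 mirror Connection #1 and #2 above.

```python
import os
import sqlite3
import tempfile

# Two independent connections to the same database file, standing in
# for two connections handed out by a pool.
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
conn_1 = sqlite3.connect(path)
conn_2 = sqlite3.connect(path)

# The temp table is created (and filled) on connection #1 only.
conn_1.execute(
    "CREATE TEMPORARY TABLE my_temp_table (id INTEGER PRIMARY KEY, something TEXT)"
)
conn_1.execute("INSERT INTO my_temp_table VALUES (1, 'a')")
count_on_1 = conn_1.execute("SELECT count(*) FROM my_temp_table").fetchone()[0]

# Connection #2 cannot see it at all.
try:
    conn_2.execute("SELECT count(*) FROM my_temp_table")
    visible_on_2 = True
except sqlite3.OperationalError:
    visible_on_2 = False

print(count_on_1, visible_on_2)
```

With SQLAlchemy the same reasoning applies: every step that touches the temp table has to go through the one connection that created it.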
Well, this doesn't answer the why, but it is how to accomplish what I want.
Rather than:
pg_conn = engine.raw_connection()
with pg_conn.cursor() as cursor:
Just replace it with:
with conn.connection.cursor() as cursor:
The SQLAlchemy connection object exposes its underlying DBAPI connection via the .connection property. And whatever magic involved there does the right thing.
I am trying to use psycopg2 to add some new columns to a table. PostgreSQL lacks an ALTER TABLE table ADD COLUMN IF NOT EXISTS, so I am adding each column in its own transaction. If the column exists, there will be a Python and Postgres error. That's OK; I want my programme to just continue and try to add the next column. The goal is for this to be idempotent, so it can be run many times in a row.
It currently looks like this:
def main():
    # <snip>
    with psycopg2.connect("") as connection:
        create_columns(connection, args.table)

def create_columns(connection, table_name):
    def sql(sql):
        with connection.cursor() as cursor:
            cursor.execute(sql.format(table_name=table_name))

    sql("ALTER TABLE {table_name} ADD COLUMN my_new_col numeric(10,0);")
    sql("ALTER TABLE {table_name} ADD COLUMN another_new_col INTEGER NOT NULL;")
However, if my_new_col exists, there is an exception ProgrammingError('column "parent_osm_id" of relation "relations" already exists\n',), which is to be expected, but when it tries to add another_new_col, there is the exception InternalError('current transaction is aborted, commands ignored until end of transaction block\n',).
The psycopg2 documentation for the with statement implies that the with connection.cursor() as cursor: will wrap that code in a transaction. This is clearly not happening. Experimentation has shown me that I need two levels of with statements, including the psycopg2.connect call, and then I get a transaction.
How can I pass a connection object around and have queries run in their own transaction to allow this sort of "graceful error handling"? I would like to keep the postgres connection code separate, in a "clean architecture" style. Is this possible?
The psycopg2 documentation for the with statement implies that the with connection.cursor() as cursor: will wrap that code in a transaction.
This is actually not true. It says:
with psycopg2.connect(DSN) as conn:
    with conn.cursor() as curs:
        curs.execute(SQL)
When a connection exits the with block, if no exception has been raised by the block, the transaction is committed. In case of exception the transaction is rolled back. In no case the connection is closed: a connection can be used in more than a with statement and each with block is effectively wrapped in a transaction.
So it is the connection object, not the cursor object, whose with block controls the transaction.
It is also worth noting that all resources held by the cursor are released when we leave its with block:
When a cursor exits the with block it is closed, releasing any resource eventually associated with it. The state of the transaction is not affected.
So back to your code you could probably rewrite it to be more like:
def main():
    # <snip>
    with psycopg2.connect("") as connection:
        create_columns(connection, args.table)

def create_columns(con, table_name):
    def sql(connection, sql):
        with connection:
            with connection.cursor() as cursor:
                cursor.execute(sql.format(table_name=table_name))

    sql(con, "ALTER TABLE {table_name} ADD COLUMN my_new_col numeric(10,0);")
    sql(con, "ALTER TABLE {table_name} ADD COLUMN another_new_col INTEGER NOT NULL;")
This ensures the connection is wrapped in a with block for each query you execute, so if a query fails, the connection's context manager rolls back the transaction.
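The per-statement pattern can be tried without a PostgreSQL server. The sketch below uses the standard-library sqlite3 module as a stand-in (an assumption: its connection context manager also commits on success and rolls back on an exception). The helper name run_in_own_transaction is hypothetical, and with psycopg2 the exception to catch would be psycopg2.ProgrammingError rather than sqlite3.OperationalError.

```python
import sqlite3


def run_in_own_transaction(connection, statement):
    """Run one statement in its own transaction; return True on success."""
    try:
        with connection:  # commit on success, rollback on exception
            connection.execute(statement)
        return True
    except sqlite3.OperationalError:
        # For example "duplicate column name": skip it and carry on.
        return False


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE relations (id INTEGER)")

statements = [
    "ALTER TABLE relations ADD COLUMN my_new_col NUMERIC",
    "ALTER TABLE relations ADD COLUMN my_new_col NUMERIC",  # already exists
    "ALTER TABLE relations ADD COLUMN another_new_col INTEGER",
]
results = [run_in_own_transaction(connection, s) for s in statements]

# Column names are the second field of each PRAGMA table_info row.
columns = [row[1] for row in connection.execute("PRAGMA table_info(relations)")]
print(results, columns)
```

Because the failed statement is contained in its own with block, the later statement still runs, which is the idempotent behaviour the question asks for.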
I made a loop in Python that calls itself to repeatedly check for new entries in a database. On first execution, all affected rows are shown fine. Meanwhile, I add more rows into the database. On the next query in my loop, the new rows are not shown.
This is my query-loop:
def loop():
    global mysqlconfig # username, passwd...
    tbd=[] # this is where I save the result
    conn = MySQLdb.connect(**mysqlconfig)
    conn.autocommit(True)
    c = conn.cursor()
    c.execute("SELECT id, message FROM tasks WHERE date <= '%s' AND done = 0;" % now.isoformat(' '))
    conn.commit()
    tbd = c.fetchall()
    print tbd
    c.close()
    conn.close()
    time.sleep(5)
    loop()

loop()
This is the SQL part of my Python insertion-script:
conn = MySQLdb.connect(**mysqlconfig)
conn.autocommit(1)
c = conn.cursor()
c.execute("INSERT INTO tasks (date, message) VALUES ('{0}', '{1}');".format("2012-10-28 23:50", "test"))
conn.commit()
id = c.lastrowid
c.close()
conn.close()
I tried SQLite, I tried Oracle's MySQL connector, and I tried MySQLdb on both Windows and Linux; all had the same problem. I looked through many threads on Stack Overflow that recommended turning on autocommit or calling commit() after an SQL statement, which I tried without success.
When I added data with HeidiSQL to my database it showed up in the loop query, but I don't really know why this is. Rows inserted with mysql-client on Linux and my Python insertion script never show up until I restart my loop script.
I don't know if it's the fact that I open 2 connections, each in their own script, but I close every connection and every cursor when I'm done with them.
The problem could be with your variable now. I don't see anywhere in the loop that it is being reset.
I'd probably use the mysql NOW() function:
c.execute("SELECT id, message FROM tasks WHERE date <= NOW() AND done = 0;")
It looks like the time you are inserting into the database is a time in the future. I don't think your issue is with your database connection; I think it has to do with the queries you are running.
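If the cutoff has to come from Python rather than from NOW(), it needs to be recomputed on every poll. Here is a minimal sketch of that idea, using the standard-library sqlite3 module and an in-memory tasks table as stand-ins (both are assumptions for the demo; with MySQLdb the placeholder would be %s instead of ?). The function name fetch_due_tasks is hypothetical.

```python
import datetime
import sqlite3


def fetch_due_tasks(conn):
    # Recompute the cutoff on every call instead of reusing a stale `now`.
    now = datetime.datetime.now().isoformat(" ")
    cur = conn.execute(
        "SELECT id, message FROM tasks WHERE date <= ? AND done = 0", (now,)
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY, date TEXT, message TEXT, done INTEGER DEFAULT 0)"
)

first = fetch_due_tasks(conn)  # nothing due yet

# A task dated one minute in the past is added between the two polls.
past = (datetime.datetime.now() - datetime.timedelta(minutes=1)).isoformat(" ")
conn.execute("INSERT INTO tasks (date, message) VALUES (?, ?)", (past, "test"))
conn.commit()

second = fetch_due_tasks(conn)  # the new row is picked up
print(first, second)
```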
I'm using Psycopg2 in Python to access a PostgreSQL database. I'm curious if it's safe to use the with closing() pattern to create and use a cursor, or if I should use an explicit try/except wrapped around the query. My question is concerning inserting or updating, and transactions.
As I understand it, all Psycopg2 queries occur within a transaction, and it's up to calling code to commit or rollback the transaction. If within a with closing(... block an error occurs, is a rollback issued? In older versions of Psycopg2, a rollback was explicitly issued on close() but this is not the case anymore (see http://initd.org/psycopg/docs/connection.html#connection.close).
My question might make more sense with an example. Here's an example using with closing(...):
with closing(db.cursor()) as cursor:
    cursor.execute("""UPDATE users
                      SET password = %s, salt = %s
                      WHERE user_id = %s""",
                   (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    # commit() is a method of the connection, not of the cursor
    db.commit()
What happens when module.raise_unexpected_error() raises its error? Is the transaction rolled back? As I understand transactions, I either need to commit them or roll them back. So in this case, what happens?
Alternately I could write my query like this:
cursor = None
try:
    cursor = db.cursor()
    cursor.execute("""UPDATE users
                      SET password = %s, salt = %s
                      WHERE user_id = %s""",
                   (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    # commit() and rollback() are methods of the connection, not of the cursor
    db.commit()
except BaseException:
    if cursor is not None:
        db.rollback()
finally:
    if cursor is not None:
        cursor.close()
Also I should mention that I have no idea if Psycopg2's connection class cursor() method could raise an error or not (the documentation doesn't say) so better safe than sorry, no?
Which method of issuing a query and managing a transaction should I use?
Your link to the Psycopg2 docs kind of explains it itself, no?
... Note that closing a connection without committing the changes first will
cause any pending change to be discarded as if a ROLLBACK was
performed (unless a different isolation level has been selected: see
set_isolation_level()).
Changed in version 2.2: previously an explicit ROLLBACK was issued by
Psycopg on close(). The command could have been sent to the backend at
an inappropriate time, so Psycopg currently relies on the backend to
implicitly discard uncommitted changes. Some middleware are known to
behave incorrectly though when the connection is closed during a
transaction (when status is STATUS_IN_TRANSACTION), e.g. PgBouncer
reports an unclean server and discards the connection. To avoid this
problem you can ensure to terminate the transaction with a
commit()/rollback() before closing.
So, unless you're using a different isolation level, or using PgBouncer, your first example should work fine. However, if you desire some finer-grained control over exactly what happens during a transaction, then the try/except method might be best, since it parallels the database transaction state itself.
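The two approaches can also be combined: closing() takes care of the cursor, while an explicit try/except decides the fate of the transaction. The sketch below uses the standard-library sqlite3 module as a stand-in so it runs without a PostgreSQL server (an assumption; with psycopg2 the placeholders would be %s). The function update_password and its fail flag are hypothetical, the flag only simulating an unexpected error.

```python
import sqlite3
from contextlib import closing


def update_password(db, user_id, password, salt, fail=False):
    try:
        with closing(db.cursor()) as cursor:  # the cursor is always closed
            cursor.execute(
                "UPDATE users SET password = ?, salt = ? WHERE user_id = ?",
                (password, salt, user_id),
            )
            if fail:
                raise RuntimeError("unexpected error")
        db.commit()  # end the transaction explicitly on success
    except Exception:
        db.rollback()  # and explicitly on failure
        raise


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, password TEXT, salt TEXT)")
db.execute("INSERT INTO users VALUES (1, 'old', 'old-salt')")
db.commit()

try:
    update_password(db, 1, "new", "new-salt", fail=True)
except RuntimeError:
    pass
after_failure = db.execute("SELECT password FROM users WHERE user_id = 1").fetchone()[0]

update_password(db, 1, "new", "new-salt")
after_success = db.execute("SELECT password FROM users WHERE user_id = 1").fetchone()[0]
print(after_failure, after_success)
```

Ending the transaction explicitly also sidesteps the PgBouncer caveat quoted above, since the connection is never closed mid-transaction.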