Failing to pull data from a PostgreSQL database with psycopg2 - Python

I have a problem with PostgreSQL in Python, using psycopg2.
Here's what my Postgres DB looks like:
Database: qcdata
---information_schema
---pg_catalog
---prod
------activation
------*** many other tables ***
---public
I want to pull out the information in schema prod, table activation.
Here's my code:
import psycopg2

conn = psycopg2.connect(host='10.0.80.180', port='5432',
                        dbname='qcdata',
                        user='username', password='pwd')
cur = conn.cursor()
print cur.execute("SELECT * FROM prod.activation")
But it returns None... I am sure there's data in it. How could that be?
Thanks guys.

According to the docs (http://initd.org/psycopg/docs/cursor.html), cur.execute() always returns None. You have to follow up with one of the fetch methods, like:
print cur.fetchall()
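Put together, a minimal sketch of the fix (connection details are the question's placeholders):
import psycopg2

conn = psycopg2.connect(host='10.0.80.180', port='5432',
                        dbname='qcdata',
                        user='username', password='pwd')
cur = conn.cursor()
cur.execute("SELECT * FROM prod.activation")  # always returns None
rows = cur.fetchall()                         # this is where the data comes back
for row in rows:
    print(row)
cur.close()
conn.close()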

I will use cursor.fetchone() in cases where I'm getting a single value, e.g. when using COUNT:
SELECT count(*) FROM prod.activation
If I want to process a set of rows, I like to leverage the cursor's ability to iterate over the result set. That allows more control over the results, how they're processed, and how they're returned.
The cursor can be iterated over just like a Python list:
for row in cur:
    dostuff()
If you use a named cursor (simply set the name attribute when constructing the cursor), psycopg2 will chunk the SELECT for you, 2000 rows at a time by default. This happens automatically in the background while iterating.
You can also use the with syntax to automatically close the cursor when you're done with it. (Useful for smaller-scoped uses.)
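A minimal sketch of the named-cursor pattern described above (connection details and table follow the question; itersize is shown explicitly even though 2000 is already the default):
import psycopg2

conn = psycopg2.connect(host='10.0.80.180', port='5432',
                        dbname='qcdata',
                        user='username', password='pwd')

# Naming the cursor makes it a server-side cursor; rows stream in
# chunks of cursor.itersize while iterating.
with conn.cursor(name='activation_cursor') as cur:
    cur.itersize = 2000
    cur.execute("SELECT * FROM prod.activation")
    for row in cur:
        dostuff()  # dostuff() stands in for your per-row processing

conn.close()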

Related

How to get psycopg2's description from PostgreSQL server side cursor

I built a class that takes an arbitrary Postgres SQL query, fetches the data and creates a CSV file. I'm using cursor.description to get the column names, passing it as my CSV header. However, the data sets got too large, so I'm moving to server-side cursors.
Server-side cursors don't seem to have any data under description. When I run:
import psycopg2
conn = psycopg2.connect(**conn_info)
cursor = conn.cursor("server_side")
cursor.execute("select * from foo")
print(cursor.description)
It prints None, probably because the query didn't actually run. But is there a way to get the column names in this configuration?
The query in cursor.execute('select...') is executed on the server side but the application has no data yet, hence cursor.description is undefined. To get the description you need to get at least a row from the server-side cursor, e.g.:
cursor = conn.cursor("server_side")
# or
# cursor = conn.cursor("server_side", scrollable=True)
# see below
cursor.execute("select * from my_table")
first_row = cursor.fetchone()
print(cursor.description)
# you can place the cursor in the initial position if needed:
# cursor.scroll(-1)
Note that you won't get the description when the table is empty.
There is no better (faster or simpler) way to get a query result description of a named cursor. This is due to the way named cursors are implemented. The commands
cursor = conn.cursor("server_side")
cursor.execute("select * from my_table")
are implemented by declaring a cursor using the Postgres command:
DECLARE "server_side" CURSOR WITHOUT HOLD FOR select * from my_table
Per the documentation:
DECLARE allows a user to create cursors, which can be used to retrieve a small number of rows at a time out of a larger query. After the cursor is created, rows are fetched from it using FETCH.
The cursor declaration itself does not give any information on the structure of the results. We can obtain it only after fetching a row or rows by the FETCH command.
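You can reproduce what psycopg2 does under the hood with a plain client-side cursor; a sketch, with my_table as the placeholder table from above:
cur = conn.cursor()  # ordinary, unnamed cursor
cur.execute("DECLARE sketch_cur CURSOR WITHOUT HOLD FOR SELECT * FROM my_table")
print(cur.description)  # None: DECLARE itself returns no result structure
cur.execute("FETCH FORWARD 1 FROM sketch_cur")
print(cur.description)  # populated: the FETCH produced a result set
cur.execute("CLOSE sketch_cur")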
The other answers here are, unfortunately, the answer, and here's why. There is no way to get the description or even the rowcount back from a server-side cursor without first invoking a fetch. It returns None as per PEP 249:
This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the .execute*() method yet.
This is because even though you've called execute, the server may not have actually executed the query yet. We can confirm that by checking the Postgres logs (with log_statement set to all).
Using the following code, with a 30-second sleep for clarity:
import time

cursor = conn.cursor("server_side")
cursor.execute("select * from foo")
time.sleep(30)
cursor.fetchall()
print(cursor.description)
The logs will show
2020-06-19 12:11:37.687 BST [11916] LOG: statement: BEGIN
2020-06-19 12:11:37.687 BST [11916] LOG: statement: DECLARE "server_side" CURSOR WITHOUT HOLD FOR select * from foo
2020-06-19 12:12:07.693 BST [11916] LOG: statement: FETCH FORWARD ALL FROM "server_side"
Notice the roughly 30-second gap between the declaration and the FETCH, the latter being the invocation that allows us to get the description from the cursor.
Without server_side for comparison
2020-06-19 12:11:01.310 BST [3012] LOG: statement: BEGIN
2020-06-19 12:11:01.311 BST [3012] LOG: statement: select * from foo
Your only options are to use scroll or perform a LIMIT 1 select prior to your larger query.
A less attractive option is to query the INFORMATION_SCHEMA views, like so:
select column_name, data_type, character_maximum_length
from INFORMATION_SCHEMA.COLUMNS where table_name = 'foo';
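A sketch of that approach from Python (foo is the placeholder table name; in practice you may also want to filter on table_schema):
cur = conn.cursor()
cur.execute(
    "select column_name, data_type, character_maximum_length "
    "from information_schema.columns "
    "where table_name = %s "
    "order by ordinal_position",
    ('foo',))
header = [row[0] for row in cur.fetchall()]
print(header)  # column names, usable as a CSV header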
Could you not do:
import psycopg2
conn = psycopg2.connect(**conn_info)
cursor_desc = conn.cursor()
cursor_desc.execute("select * from foo limit 1")
print(cursor_desc.description)
cursor = conn.cursor("server_side")
cursor.execute("select * from foo")
Then you are not interfering with the server-side query that returns the data.
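Combining that probe with the server-side cursor covers the original CSV use case; a sketch (conn_info, foo and the output file name are placeholders):
import csv
import psycopg2

conn = psycopg2.connect(**conn_info)

# Cheap client-side probe: run a LIMIT 1 query just to populate description.
probe = conn.cursor()
probe.execute("select * from foo limit 1")
header = [col[0] for col in probe.description]
probe.close()

# The server-side cursor streams the full result without loading it all.
cursor = conn.cursor("server_side")
cursor.execute("select * from foo")
with open("foo.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    for row in cursor:
        writer.writerow(row)
cursor.close()
conn.close()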

How to disable query cache with mysql.connector

I'm connecting to MySQL from my Kivy application.
import mysql.connector

con = mysql.connector.Connect(host='XXX', port=XXX, user='XXX', password='XXX', database='XXX')
cur = con.cursor()
cur.execute("""SELECT SQL_NO_CACHE * FROM abc""")
data = cur.fetchall()
print(data)
After inserting into or deleting from table abc over another connection, I call the same query in Python, but the data is not updated.
I added "SET SESSION query_cache_type = OFF;" before the select query, but it didn't work. Someone said a "select NOW() ..." query is not cacheable, but that didn't work either. What should I do?
I solved this by adding the following after fetchall():
con.commit()
Calling the same select query without doing a commit won't update the results.
The solution is to use one of the following.
Once, right after connecting:
con.autocommit = True
Or, after each select query:
con.commit()
With the first option, there is a commit after each query.
Otherwise, subsequent selects will render the same result.
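A minimal sketch of the autocommit variant with mysql.connector (connection details are the question's placeholders):
import mysql.connector

# autocommit=True ends the implicit transaction after every statement,
# so each new SELECT sees other sessions' committed changes.
con = mysql.connector.connect(host='XXX', port=XXX, user='XXX',
                              password='XXX', database='XXX',
                              autocommit=True)
cur = con.cursor()
cur.execute("SELECT * FROM abc")
print(cur.fetchall())  # re-running this reflects fresh data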
This seems to be Bug #42197, related to the query cache and auto-commit in MySQL. The status is "won't fix"!
In a few months this should be irrelevant, as MySQL 8.0 is dropping the query cache.
I encountered the same problem and solved it with the method above:
conn.commit()
I also found that different DBMSs behave differently; not all of them cache at the connection level. Alternatively, try:
conn.autocommit = True
This will auto-commit after each of your select queries.
The MySQL query cache is flushed when tables are modified, so it wouldn't have that effect. It's impossible to say without seeing the rest of your code, but it's most likely that your INSERT / DELETE query is failing to run.
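If so, the thing to verify on the writing connection is that the change is actually committed; a sketch (the column and value are hypothetical):
writer = mysql.connector.connect(host='XXX', port=XXX, user='XXX',
                                 password='XXX', database='XXX')
wcur = writer.cursor()
wcur.execute("INSERT INTO abc (col) VALUES (%s)", (123,))  # hypothetical column
writer.commit()  # without this, other connections never see the new row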

MySQLdb.cursors.Cursor.execute does not work

I have done the following:
import MySQLdb as mdb
con = mdb.connect(hostname, username, password, dbname)
cur = con.cursor()
count = cur.execute(query)
cur.close()
con.close()
I have two queries. When I execute them in the mysql console I can view the results.
But when I run the same through Python, one query works and the other one does not.
I am sure it is not a problem with MySQL, the query, or the Python code; I suspect the cur.execute(query) function.
Has anyone come across a similar situation? Any solutions?
Use con.commit() after execution, to commit/finish insert- and delete-based changes.
I have two queries. When I execute them in the mysql console I can view the results.
But I only see one query:
import MySQLdb as mdb
con = mdb.connect(hostname, username, password, dbname)
cur = con.cursor()
count = cur.execute(query)
cur.close()
con.close()
My guess is that query contains both queries separated by a semicolon, and is an INSERT statement? You probably need to use executemany().
See Executing several SQL queries with MySQLdb
On the other hand, if both of your queries are SELECT statements (you say "I see the result"), I'm not sure you can fetch both result sets from a single call to execute(). I would consider that bad style anyway.
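The straightforward alternative is one execute() per statement, each with its own fetch; a sketch, with the two SELECTs as placeholders:
cur = con.cursor()

cur.execute("SELECT * FROM table_a")  # placeholder: first query
rows_a = cur.fetchall()

cur.execute("SELECT * FROM table_b")  # placeholder: second query
rows_b = cur.fetchall()

cur.close()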
This is a function and the query is passed to this function. When I
execute one query after the other, I don't get the result for a few
queries. There is no problem with the queries, because I have cross-checked
them with the mysql console.
As you clarified your question in a comment, I post another answer with a completely different approach.
Are you connected to your DB in autocommit mode? If not, for changes to be permanently applied, you have to COMMIT them. In normal circumstances, you shouldn't create a new connection for each request; that puts excessive load on the DB server for almost nothing:
# Open a connection once
con = mdb.connect(hostname, username, password, dbname)

# Do that *for each query*:
cur = con.cursor()
try:
    count = cur.execute(query)
    con.commit()  # don't forget to commit the transaction
except Exception:
    raise         # in a real app you might log and handle the error here
else:
    print "DONE:", query  # for "debug"
finally:
    cur.close()   # close anyway

# Do that *for each query*:
cur = con.cursor()
try:
    count = cur.execute(query)
    con.commit()  # don't forget to commit the transaction
except Exception:
    raise
else:
    print "DONE:", query  # for "debug"
finally:
    cur.close()   # close anyway

# Close *the* connection
con.close()
The above code was typed directly into SO; please forgive typos and other basic syntax errors, but that's the spirit of it.
A last word: while typing, I was wondering how you deal with exceptions. By any chance, could a MySQLdb error be silently ignored at some upper level of your program?
Use executemany(); this will update multiple rows of a column in one query:
count = cursor.executemany("UPDATE `table` SET `col1` = %s WHERE `col2` = %s",
                           [(col1_val1, col2_val1), (col1_val2, col2_val2)])
Also commit to the database to see the changes:
conn.commit()

MSSQL2008 - Pyodbc - Previous SQL was not a query

I can't figure out what's wrong with the following code.
The syntax IS OK (checked with SQL Server Management Studio), and I have the access I should, so that works too... but for some reason, as soon as I try to create a table via pyodbc it stops working.
import pyodbc

def SQL(QUERY, target = '...', DB = '...'):
    cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=' + target + DB + ';UID=user;PWD=pass')
    cursor = cnxn.cursor()
    cursor.execute(QUERY)
    cpn = []
    for row in cursor:
        cpn.append(row)
    return cpn

print SQL("CREATE TABLE dbo.Approvals (ID SMALLINT NOT NULL IDENTITY PRIMARY KEY, HostName char(120));")
It fails with:
Traceback (most recent call last):
  File "test_sql.py", line 25, in <module>
    print SQL("CREATE TABLE dbo.Approvals (ID SMALLINT NOT NULL IDENTITY PRIMARY KEY, HostName char(120));")
  File "test_sql.py", line 20, in SQL
    for row in cursor:
pyodbc.ProgrammingError: No results. Previous SQL was not a query.
Anyone have any idea why this is?
I have the "SQL Server" driver installed (it's the default), running Windows 7 against a Windows 2008 SQL Server environment (not an Express edition).
Just in case some lonely net nomad comes across this issue: the solution by Torxed didn't work for me, but the following did.
I was calling an SP which inserts some values into a table and then returns some data back. Just add the following to the SP:
SET NOCOUNT ON
It'll work just fine :)
The Python code:
query = "exec dbo.get_process_id " + str(provider_id) + ", 0"
cursor.execute(query)
row = cursor.fetchone()
process_id = row[0]
The SP:
USE [DBNAME]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER procedure [dbo].[GET_PROCESS_ID](
    @PROVIDER_ID INT,
    @PROCESS_ID INT OUTPUT
)
AS
BEGIN
    SET NOCOUNT ON
    INSERT INTO processes(provider_id) values(@PROVIDER_ID)
    SET @PROCESS_ID = SCOPE_IDENTITY()
    SELECT @PROCESS_ID AS PROCESS_ID
END
Using "SET NOCOUNT ON" at the top of the script will not always be sufficient to solve the problem.
In my case, it was also necessary to remove this line:
Use DatabaseName;
Database was SQL Server 2012,
Python 3.7,
SQL Alchemy 1.3.8
Hope this helps somebody.
I got this because I was reusing a cursor that I was looping over:
rows = cursor.execute(...)
for row in rows:
    # run query that returns nothing
    cursor.execute(...)
    # the next iteration of this loop will throw a 'Previous SQL' error when it
    # tries to fetch the next row, because we re-used the cursor with a query
    # that returned nothing
Use 2 different cursors instead:
rows = cursor1.execute(...)
for row in rows:
    cursor2.execute(...)
or get all results of the first cursor before using it again:
rows = cursor.execute(...)
for row in list(rows):
    cursor.execute(...)
As others covered, SET NOCOUNT ON will take care of extra result sets inside a stored procedure; however, other things can also cause extra output that NOCOUNT will not prevent (and that pyodbc will see as a result set), such as forgetting to remove a print statement after debugging your stored procedure.
As Travis and others have mentioned, other things can also cause extra output that SET NOCOUNT ON will not prevent.
I had SET NOCOUNT ON at the start of my procedure but was receiving warning messages in my result set.
I set ANSI warnings off at the beginning of my script in order to remove the warning messages:
SET ANSI_WARNINGS OFF
Hopefully this helps someone.
If your stored procedure calls RAISERROR, pyodbc may create a result set for that message.
CREATE PROCEDURE some_sp
AS
BEGIN
    RAISERROR ('Some error!', 1, 1) WITH NOWAIT
    RETURN 777
END
In Python, you need to skip the first sets until you find one containing some results (see https://github.com/mkleehammer/pyodbc/issues/673#issuecomment-631206107 for details).
sql = """
SET NOCOUNT ON;
SET ANSI_WARNINGS OFF;
DECLARE @ret int;
EXEC @ret = some_sp;
SELECT @ret as ret;
"""
cursor = con.cursor()
cursor.execute(sql)
rows = None
# this section will only return the last result from the query
while cursor.nextset():
    try:
        rows = cursor.fetchall()
    except Exception as e:
        print("Skipping non rs message: {}".format(e))
        continue
row = rows[0]
print(row[0])  # 777
I think the root cause of the issue described above might be that you receive the same error message when you execute, for example, a DELETE query, which does not return a result. So if you run
result = cursor.fetchall()
you get this error, because a DELETE operation by definition does not return anything. Try to catch the exception as recommended here: How to check if a result set is empty?
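A defensive sketch of that check: pyodbc leaves cursor.description as None when the last statement produced no result set, so you can test it before fetching (the DELETE and its parameter are illustrative):
cursor.execute("DELETE FROM dbo.Approvals WHERE HostName = ?", "somehost")
if cursor.description is not None:
    rows = cursor.fetchall()   # a result set exists
else:
    rows = []                  # DML statement: nothing to fetch
    print("%d row(s) affected" % cursor.rowcount)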
In case your SQL is not a stored proc:
Using 'xyz != NULL' in a query will give the same error, i.e. "pyodbc.ProgrammingError: No results. Previous SQL was not a query."
Use 'is not null' instead.
First off:
If you're running a Windows SQL Server 2008, use the "Native Client" that is included with the installation of the SQL software (it gets installed with the database and toolkits, so you need to install the SQL management application from Microsoft).
Secondly:
Use "Trusted_Connection=yes" in your SQL connection statement:
cnxn = pyodbc.connect('DRIVER={SQL Server Native Client 10.0};SERVER=ServerAddress;DATABASE=my_db;Trusted_Connection=yes')
This should do the trick!
I solved this problem by splitting the USE database statement and the SQL query into two execute statements.

How to update records in SQLAlchemy in a loop

I am trying to use SQLSoup, the SQLAlchemy extension, to update records in a SQL Server 2008 database. I am using pyodbc for the connections. There are a number of issues which make it hard to find a relevant example.
I am reprojecting a geometry field in a very large table (2 million+ records), so many of the standard ways of updating fields cannot be used. I need to extract coordinates from the geometry field to text, convert them and pass them back in. All this is fine, and all the individual pieces are working.
However, I want to execute a SQL UPDATE statement on each row while looping through the records one by one. I assume this places locks on the recordset, or that the connection is in use, because if I use the code below it hangs after successfully updating the first record.
Any advice on how to create a new connection, reuse the existing one, or accomplish this another way is appreciated.
s = select([text("%s as fid" % id_field),
            text("%s.STAsText() as wkt" % geom_field)],
           from_obj=[feature_table])
rs = s.execute()

for row in rs:
    new_wkt = ReprojectFeature(row.wkt)
    update_value = "geometry :: STGeomFromText('%s',%s)" % (new_wkt, "3785")
    update_sql = ("update %s set GEOM3785 = %s where %s = %i" %
                  (full_name, update_value, id_field, row.fid))
    conn = db.connection()
    conn.execute(update_sql)
    conn.close()  # or not - no effect..
Updated working code now looks like this. It works fine on a few records, but hangs on the whole table, so I guess it is reading in too much data.
db = SqlSoup(conn_string)

# create outer query
Session = sessionmaker(autoflush=False, bind=db.engine)
session = Session()
rs = session.execute(s)

for row in rs:
    # ...create update sql...
    session.execute(update_sql)

session.commit()
I now get connection busy errors.
DBAPIError: (Error) ('HY000', '[HY000] [Microsoft][ODBC SQL Server Driver]Connection is busy with results for another hstmt (0) (SQLExecDirectW)')
It looks like this could be a problem with the ODBC driver - http://sourceitsoftware.blogspot.com/2008/06/connection-is-busy-with-results-for.html
Further Update:
On the server, using Profiler, it shows the select statement and then the first update statement "starting", but neither completes.
If I set the Select statement to return the top 10 rows, then it does complete and the updates run.
SQL: Batch Starting Select...
SQL: Batch Starting Update...
I believe this is an issue with pyodbc and the SQL Server drivers. If I remove SQLAlchemy and execute the same SQL with pyodbc it also hangs, even if I create a new connection object for the updates.
I also tried the SQL Server Native Client 10.0 driver, which is meant to allow MARS (Multiple Active Result Sets), but it made no difference. In the end I resorted to "paging the results" and updating these batches using pyodbc and SQL (see below); however, I thought SQLAlchemy would have been able to do this for me automatically.
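For reference, the usual workaround for "connection is busy" errors is two entirely separate connections, one streaming the SELECT while the second issues the UPDATEs. A sketch reusing the question's placeholder names (conn_string, full_name, ReprojectFeature); as noted above, the legacy "SQL Server" driver may still hang, so the Native Client driver is assumed:
import pyodbc

read_cnxn = pyodbc.connect(conn_string)                    # streams the SELECT
write_cnxn = pyodbc.connect(conn_string, autocommit=True)  # runs the UPDATEs

read_cur = read_cnxn.cursor()
read_cur.execute("select FID, GEOM29902.STAsText() as wkt from %s" % full_name)

write_cur = write_cnxn.cursor()
for row in read_cur:
    new_wkt = ReprojectFeature(row.wkt)
    write_cur.execute(
        "update %s set GEOM3785 = geometry::STGeomFromText(?, 3785) "
        "where FID = ?" % full_name,
        new_wkt, row.fid)

read_cnxn.close()
write_cnxn.close()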
Try using a Session.
rs = s.execute() then becomes rs = session.execute(s), and you can replace the last three lines with session.execute(update_sql). I'd also suggest configuring your Session with autocommit off and calling session.commit() at the end.
Can I suggest that when your process hangs you run sp_who2 on the SQL box and see what is happening. Check for blocked SPIDs and see if you can find anything in the SQL code that suggests what is going on. If you do find a SPID that is blocking others, you can run dbcc inputbuffer(spid) and see what query it executed. Otherwise you can also attach the SQL Profiler and trace your calls.
In some cases it could also be parallelism on the SQL Server that causes blocking. Unless this is a data warehouse, I suggest turning your max DOP off (set it to 1). Let me know; when I check this again in the morning, I'll be glad to help if you still need it.
Until I find another solution, I am using a single connection and custom SQL to return sets of records, and updating these in batches. I don't think what I am doing is a particularly unique case, so I am not sure why I cannot handle multiple result sets simultaneously.
The below works, but is very, very slow:
cnxn = pyodbc.connect(conn_string, autocommit=True)
cursor = cnxn.cursor()

# get total recs in the database
s = "select count(fid) as count from table"
count = cursor.execute(s).fetchone().count

# choose number of records to update in each iteration
batch_size = 100
counter = 0

for i in range(1, count, batch_size):
    # sql to bring back relevant records in each batch
    s = """SELECT fid, wkt from (select ROW_NUMBER() OVER(ORDER BY FID ASC) AS 'RowNumber'
                                 ,FID
                                 ,GEOM29902.STAsText() as wkt
                                 FROM %s) features
           where RowNumber >= %i and RowNumber <= %i""" % (full_name, i, i + batch_size)
    rs = cursor.execute(s).fetchall()
    for row in rs:
        new_wkt = ReprojectFeature(row.wkt)
        # ...create update sql statement for the record
        cursor.execute(update_sql)
        counter += 1

cursor.close()
cnxn.close()
