Raise StopIteration - can't use a tailable cursor on my MongoDB - python

I would like to stream new entries from my DB but i keep getting this error.
I took the code from the official docs of MongoDB and i just adapted it for my database, but i keep getting this error:
File "C:\Users\Davide\lib\site-packages\pymongo\cursor.py", line 1197, in next
raise StopIteration
StopIteration
Where could the problem be?
import time
from pymongo import MongoClient
import pymongo
client = MongoClient(port=27017)
oplog = client.local.oplog.rs
first = oplog.find().sort('$natural', pymongo.ASCENDING).limit(-1).next()
print(first)
ts = first['ts']
while True:
# For a regular capped collection CursorType.TAILABLE_AWAIT is the
# only option required to create a tailable cursor. When querying the
# oplog the oplog_replay option enables an optimization to quickly
# find the 'ts' value we're looking for. The oplog_replay option
# can only be used when querying the oplog.
cursor = oplog.find({'ts': {'$gt': ts}},
cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
oplog_replay=True)
while cursor.alive:
for doc in cursor:
ts = doc['ts']
print(doc)
# We end up here if the find() returned no documents or if the
# tailable cursor timed out (no new documents were added to the
# collection for more than 1 second).
time.sleep(1)

Related

Python MySQL cursor fails to fetch rows

I am trying to fetch data from AWS MariaDB:
cursor = self._cnx.cursor()
stmt = ('SELECT * FROM flights')
cursor.execute(stmt)
print(cursor.rowcount)
# prints 2
for z in cursor:
print(z)
# Does not iterate
row = cursor.fetchone()
# row is None
rows = cursor.fetchall()
# throws 'No result set to fetch from.'
I can verify that table contains data using MySQL Workbench. Am I missing some step?
EDIT: re 2 answers:
res = cursor.execute(stmt)
# res is None
EDIT:
I created new Python project with a single file:
import mysql.connector
try:
cnx = mysql.connector.connect(
host='foobar.rds.amazonaws.com',
user='devuser',
password='devpasswd',
database='devdb'
)
cursor = cnx.cursor()
#cursor = cnx.cursor(buffered=True)
cursor.execute('SELECT * FROM flights')
print(cursor.rowcount)
rows = cursor.fetchall()
except Exception as exc:
print(exc)
If I run this code with simple cursor, fetchall raises "No result set to fetch from". If I run with buffered cursor, I can see that _rows property of cursor contains my data, but fetchall() returns empty array.
Your issue is that cursor.execute(stmt) returns an object with results and you're not storing that.
results = cursor.execute(stmt)
print(results.fetchone()) # Prints out and pops first row
For the future googlers with the same Problem I found a workaround which may help in some cases:
I didn't find the source of the problem but a solution which worked for me.
In my case .fetchone() also returned none whatever I did on my local(on my own Computer) Database. I tried the exact same code with the Database on our companies server and somehow it worked. So I copied the complete server Database onto my local Database (by using database dumps) just to get the server settings and afterwards I also could get data from my local SQL-Server with the code which didn't work before.
I am a SQL-newbie but maybe some crazy setting on my local SQL-Server prevented me from fetching data. Maybe some more experienced SQL-user knows this setting and can explain.

Python PYODBC - Previous SQL was not a query

I have the following python code, it reads through a text file line by line and takes characters x to y of each line as the variable "Contract".
import os
import pyodbc
cnxn = pyodbc.connect(r'DRIVER={SQL Server};CENSORED;Trusted_Connection=yes;')
cursor = cnxn.cursor()
claimsfile = open('claims.txt','r')
for line in claimsfile:
#ldata = claimsfile.readline()
contract = line[18:26]
print(contract)
cursor.execute("USE calms SELECT XREF_PLAN_CODE FROM calms_schema.APP_QUOTE WHERE APPLICATION_ID = "+str(contract))
print(cursor.fetchall())
When including the line cursor.fetchall(), the following error is returned:
Programming Error: Previous SQL was not a query.
The query runs in SSMS and replace str(contract) with the actual value of the variable results will be returned as expected.
Based on the data, the query will return one value as a result formatted as NVARCHAR(4).
Most other examples have variables declared prior to the loop and the proposed solution is to set NO COUNT on, this does not apply to my problem so I am slightly lost.
P.S. I have also put the query in its own standalone file without the loop to iterate through the file in case this was causing the problem without success.
In your SQL query, you are actually making two commands: USE and SELECT and the cursor is not set up with multiple statements. Plus, with database connections, you should be selecting the database schema in the connection string (i.e., DATABASE argument), so TSQL's USE is not needed.
Consider the following adjustment with parameterization where APPLICATION_ID is assumed to be integer type. Add credentials as needed:
constr = 'DRIVER={SQL Server};SERVER=CENSORED;Trusted_Connection=yes;' \
'DATABASE=calms;UID=username;PWD=password'
cnxn = pyodbc.connect(constr)
cur = cnxn.cursor()
with open('claims.txt','r') as f:
for line in f:
contract = line[18:26]
print(contract)
# EXECUTE QUERY
cur.execute("SELECT XREF_PLAN_CODE FROM APP_QUOTE WHERE APPLICATION_ID = ?",
[int(contract)])
# FETCH ROWS ITERATIVELY
for row in cur.fetchall():
print(row)
cur.close()
cnxn.close()

Get all documents of a collection using Pymongo

I want to write a function to return all the documents contained in mycollection in mongodb
from pymongo import MongoClient
if __name__ == '__main__':
client = MongoClient("localhost", 27017, maxPoolSize=50)
db=client.mydatabase
collection=db['mycollection']
cursor = collection.find({})
for document in cursor:
print(document)
However, the function returns: Process finished with exit code 0
Here is the sample code which works fine when you run from command prompt.
from pymongo import MongoClient
if __name__ == '__main__':
client = MongoClient("localhost", 27017, maxPoolSize=50)
db = client.localhost
collection = db['chain']
cursor = collection.find({})
for document in cursor:
print(document)
Please check the collection name.
pymongo creates a cursor. Hence you'll get the object 'under' the cursor. To get all objects in general try:
list(db.collection.find({}))
This will force the cursor to iterate over each object and put it in a list()
Have fun...
I think this will work fine in your program.
cursor = db.mycollection # choosing the collection you need
for document in cursor.find():
print (document)
it works fine for me,try checking the exact database name and collection name.
and try changing from db=client.mydatabase to db=client['mydatabase'] .
If your database name is such that using attribute style access won’t work (like test-database), you can use dictionary style access instead.
source !

How can Python Observe Changes to Mongodb's Oplog

I have multiple Python scripts writing to Mongodb using pyMongo. How can another Python script observe changes to a Mongo query and perform some function when the change occurs? mongodb is setup with oplog enabled.
I wrote a incremental backup tool for MongoDB some time ago, in Python. The tool monitors data changes by tailing the oplog. Here is the relevant part of the code.
Updated answer, MongDB 3.6+
As datdinhquoc cleverly points out in the comments below, for MongoDB 3.6 and up there are Change Streams.
Updated answer, pymongo 3
from time import sleep
from pymongo import MongoClient, ASCENDING
from pymongo.cursor import CursorType
from pymongo.errors import AutoReconnect
# Time to wait for data or connection.
_SLEEP = 1.0
if __name__ == '__main__':
oplog = MongoClient().local.oplog.rs
stamp = oplog.find().sort('$natural', ASCENDING).limit(-1).next()['ts']
while True:
kw = {}
kw['filter'] = {'ts': {'$gt': stamp}}
kw['cursor_type'] = CursorType.TAILABLE_AWAIT
kw['oplog_replay'] = True
cursor = oplog.find(**kw)
try:
while cursor.alive:
for doc in cursor:
stamp = doc['ts']
print(doc) # Do something with doc.
sleep(_SLEEP)
except AutoReconnect:
sleep(_SLEEP)
Also see http://api.mongodb.com/python/current/examples/tailable.html.
Original answer, pymongo 2
from time import sleep
from pymongo import MongoClient
from pymongo.cursor import _QUERY_OPTIONS
from pymongo.errors import AutoReconnect
from bson.timestamp import Timestamp
# Tailable cursor options.
_TAIL_OPTS = {'tailable': True, 'await_data': True}
# Time to wait for data or connection.
_SLEEP = 10
if __name__ == '__main__':
db = MongoClient().local
while True:
query = {'ts': {'$gt': Timestamp(some_timestamp, 0)}} # Replace with your query.
cursor = db.oplog.rs.find(query, **_TAIL_OPTS)
cursor.add_option(_QUERY_OPTIONS['oplog_replay'])
try:
while cursor.alive:
try:
doc = next(cursor)
# Do something with doc.
except (AutoReconnect, StopIteration):
sleep(_SLEEP)
finally:
cursor.close()
I ran into this issue today and haven't found an updated answer anywhere.
The Cursor class has changed as of v3.0 and no longer accepts the tailable and await_data arguments. This example will tail the oplog and print the oplog record when it finds a record newer than the last one it found.
# Adapted from the example here: https://jira.mongodb.org/browse/PYTHON-735
# to work with pymongo 3.0
import pymongo
from pymongo.cursor import CursorType
c = pymongo.MongoClient()
# Uncomment this for master/slave.
oplog = c.local.oplog['$main']
# Uncomment this for replica sets.
#oplog = c.local.oplog.rs
first = next(oplog.find().sort('$natural', pymongo.DESCENDING).limit(-1))
ts = first['ts']
while True:
cursor = oplog.find({'ts': {'$gt': ts}}, cursor_type=CursorType.TAILABLE_AWAIT, oplog_replay=True)
while cursor.alive:
for doc in cursor:
ts = doc['ts']
print doc
# Work with doc here
Query the oplog with a tailable cursor.
It is actually funny, because oplog-monitoring is exactly what the tailable-cursor feature was added for originally. I find it extremely useful for other things as well (e.g. implementing a mongodb-based pubsub, see this post for example), but that was the original purpose.
I had the same issue. I put together this rescommunes/oplog.py. Check comments and see __main__ for an example of how you could use it with your script.

Celery and SQLAlchemy - This result object does not return rows. It has been closed automatically

I have a celery project connected to a MySQL databases. One of the tables is defined like this:
class MyQueues(Base):
__tablename__ = 'accepted_queues'
id = sa.Column(sa.Integer, primary_key=True)
customer = sa.Column(sa.String(length=50), nullable=False)
accepted = sa.Column(sa.Boolean, default=True, nullable=False)
denied = sa.Column(sa.Boolean, default=True, nullable=False)
Also, in the settings I have
THREADS = 4
And I am stuck in a function in code.py:
def load_accepted_queues(session, mode=None):
#make query
pool = session.query(MyQueues.customer, MyQueues.accepted, MyQueues.denied)
#filter conditions
if (mode == 'XXX'):
pool = pool.filter_by(accepted=1)
elif (mode == 'YYY'):
pool = pool.filter_by(denied=1)
elif (mode is None):
pool = pool.filter(\
sa.or_(MyQueues.accepted == 1, MyQueues.denied == 1)
)
#generate a dictionary with data
for i in pool: #<---------- line 90 in the error
l.update({i.customer: {'customer': i.customer, 'accepted': i.accepted, 'denied': i.denied}})
When running this I get an error:
[20130626 115343] Traceback (most recent call last):
File "/home/me/code/processing/helpers.py", line 129, in wrapper
ret_value = func(session, *args, **kwargs)
File "/home/me/code/processing/test.py", line 90, in load_accepted_queues
for i in pool: #generate a dictionary with data
File "/home/me/envs/me/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2341, in instances
fetch = cursor.fetchall()
File "/home/me/envs/me/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 3205, in fetchall
l = self.process_rows(self._fetchall_impl())
File "/home/me/envs/me/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 3174, in _fetchall_impl
self._non_result()
File "/home/me/envs/me/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 3179, in _non_result
"This result object does not return rows. "
ResourceClosedError: This result object does not return rows. It has been closed automatically
So mainly it is the part
ResourceClosedError: This result object does not return rows. It has been closed automatically
and sometimes also this error:
DBAPIError: (Error) (, AssertionError('Result length not requested
length:\nExpected=1. Actual=0. Position: 21. Data Length: 21',))
'SELECT accepted_queues.customer AS accepted_queues_customer,
accepted_queues.accepted AS accepted_queues_accepted,
accepted_queues.denied AS accepted_queues_denied \nFROM
accepted_queues \nWHERE accepted_queues.accepted = %s OR
accepted_queues.denied = %s' (1, 1)
I cannot reproduce the errror properly as it normally happens when processing a lot of data. I tried to change THREADS = 4 to 1 and errors disappeared. Anyway, it is not a solution as I need the number of threads to be kept on 4.
Also, I am confused about the need to use
for i in pool: #<---------- line 90 in the error
or
for i in pool.all(): #<---------- line 90 in the error
and could not find a proper explanation of it.
All together: any advise to skip these difficulties?
All together: any advise to skip these difficulties?
yes. you absolutely cannot use a Session (or any objects which are associated with that Session), or a Connection, in more than one thread simultaneously, especially with MySQL-Python whose DBAPI connections are very thread-unsafe*. You must organize your application such that each thread deals with it's own, dedicated MySQL-Python connection (and therefore SQLAlchemy Connection/ Session / objects associated with that Session) with no leakage to any other thread.
Edit: alternatively, you can make use of mutexes to limit access to the Session/Connection/DBAPI connection to just one of those threads at a time, though this is less common because the high degree of locking needed tends to defeat the purpose of using multiple threads in the first place.
I got the same error while making a query to SQL-Server procedure using SQLAlchemy.
In my case, adding SET NOCOUNT ON to the stored procedure fixed the problem.
ALTER PROCEDURE your_procedure_name
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for your procedure here
SELECT *
FROM your_table_name;
END;
Check out this article for more details
I was using an INSERT statment. Adding
RETURNING id
at the end of the query worked for me. As per this issue
That being said it's a pretty weird solution, maybe something fixed in later versions of SQLAlchemy, I am using 1.4.39.
This error occurred for me when I used a variable in Python
and parsed it with an UPDATE
statement using pandas pd.read_sql()
Solution:
I simply used mycursor.execute() instead of pd.read_sql()
import mysql.connector and from sqlalchemy import create_engine
Before:
pd.read_sql("UPDATE table SET column = 1 WHERE column = '%s'" % variable, dbConnection)
After:
mycursor.execute("UPDATE table SET column = 1 WHERE column = '%s'" % variable)
Full code:
import mysql.connector
from sqlalchemy import create_engine
import pandas as pd
# Database Connection Setup >
sqlEngine = create_engine('mysql+pymysql://root:root#localhost/db name')
dbConnection = sqlEngine.connect()
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="root",
database="db name")
mycursor = db.cursor()
variable = "Alex"
mycursor.execute("UPDATE table SET column = 1 WHERE column = '%s'" % variable)
For me I got this error when I forgot to write the table calss name for the select function query = select().where(Assessment.created_by == assessment.created_by) so I had only to fix this by adding the class table name I want to get entries from like so:
query = select(Assessment).where(
Assessment.created_by == assessment.created_by)

Categories

Resources