Deleting documents from collection in Pymongo? - python

I have the following:
from pymongo import MongoClient
client = MongoClient()
db=client.localhost
collection=db['accounts']
db.collection.remove({})
cursor = collection.find({})
for document in cursor:
    print(document)
This second part is to just print all the documents in the collection. However, the collection isn't clearing every time I rerun the program. Does anyone know why?

Just do this
db.accounts.drop()

Instead of
db.collection.remove({})
you can try this
collection.delete_many({})
Hope that solves your problem.

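db.collection refers to a collection that is literally named "collection", not to your collection variable. Instead of doing this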
db.collection.remove({})
do this
db.accounts.remove({})
Also you won't need this line collection=db['accounts']
If you want dynamic collection name, you can do the following:
collection_name = 'accounts'
getattr(db, collection_name).remove({})
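A minimal sketch of the dynamic-name approach, written against delete_many() rather than remove(), since remove() was deprecated in PyMongo 3 and removed in PyMongo 4. The helper name clear_collection is hypothetical, and db is assumed to be a pymongo Database object.

```python
def clear_collection(db, collection_name):
    """Delete every document in the named collection; return how many were removed.

    `db` is assumed to be a pymongo Database (anything supporting db[name] works).
    """
    # Index the database by name. `db.collection` would instead address a
    # collection that is literally called "collection".
    result = db[collection_name].delete_many({})
    return result.deleted_count


# Usage (needs a running MongoDB server):
#   from pymongo import MongoClient
#   db = MongoClient().localhost
#   print(clear_collection(db, "accounts"))
```

Dictionary-style access (db[name]) does the same job as getattr(db, name) and also works for names that are not valid Python attributes.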

Related

MongoDB and Pymongo, query FULLTEXT in all collections

I have a local MongoDB database with multiple collections.
I use pymongo in a Jupyter notebook, and I would like to run a FULLTEXT query that searches across all the collections present.
Is this possible? If so, how should I proceed?
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/')
client.list_database_names()
out: ['admin', 'config', 'local']
In local I have several collections.
This is what I do with just one collection:
db = client["local"]
firstdb = db["firstdb"]
result = db.firstdb.find({"email": {"$regex":"test","$options": 'i'}})
for item in result:
    print(item['email'],item['log'])
In essence, I would like to run the same email query on secondb, thirdb, fourthdb, and so on.
Can anyone help? Basically, I have to do a FULLTEXT query on all collections.
The only solution I have found is this:
result = db.firstdb.find({"email": {"$regex":"test","$options": 'i'}})
result1 = db.secondb.find({"email": {"$regex":"test","$options": 'i'}})
result2 = db.thirdb.find({"email": {"$regex":"test","$options": 'i'}})
for item in result:
    print(item['email'],item['log'])
for item in result1:
    print(item['email'],item['date'])
for item in result2:
    print(item['email'],item['account'])
But I'm not sure I'm on the right track.
Thanks to anyone who can help!
PS: I would rather not change the structure of the collections, since the same problem could come up in other databases.
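One way to avoid writing a separate find() per collection is to loop over the names returned by list_collection_names() (available from PyMongo 3.6). The sketch below assumes db is a pymongo Database; the helper name search_all_collections is hypothetical. Note that this repeats the same case-insensitive regex on every collection, which is not the same thing as a MongoDB text index query.

```python
def search_all_collections(db, field, pattern):
    """Yield (collection_name, document) for every regex match in any collection."""
    # Same case-insensitive regex filter as in the question, built once.
    query = {field: {"$regex": pattern, "$options": "i"}}
    for name in db.list_collection_names():
        for doc in db[name].find(query):
            yield name, doc


# Usage (needs a running MongoDB server):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017/")["local"]
#   for name, doc in search_all_collections(db, "email", "test"):
#       print(name, doc["email"])
```

Because the other fields differ per collection (log, date, account), it is safer to read them with doc.get(...) than with doc[...] when printing.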

Python MongoDB Find One

I am trying to find a document in the database by id, but I get None. What am I doing wrong?
python:
card = mongo.db['grl'].find_one({'id': 448510476})
or:
card = mongo.db['grl'].find_one({'id': '448510476'})
document:
{"_id":{"$oid":"5f25b1d787fc4c34a7d9aabe"},
"id":{"$numberInt":"448510476"},"first_name":"Arc","last_name":"Fl"}
I'm not sure how you are initializing your database but try this:
from pymongo import MongoClient
client = MongoClient("mongodb://127.0.0.1:27017")
db = client.database  # select the database named "database"
# find one document in the collection named "collection";
# the document stores id as an integer, so query with an int, not a string
card = db.collection.find_one({"id": 448510476})
print(card)
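MongoDB equality matching distinguishes numbers from strings, so 448510476 and "448510476" are different values to the server. When it is unclear how a field was stored, one option is a filter that accepts both forms. This is only a sketch; the helper name id_filter is hypothetical.

```python
def id_filter(value, field="id"):
    """Build a filter matching `field` whether it is stored as an int or a string."""
    candidates = [value]
    if isinstance(value, str) and value.lstrip("-").isdigit():
        candidates.append(int(value))   # also try the numeric form
    elif isinstance(value, int):
        candidates.append(str(value))   # also try the string form
    return {field: {"$in": candidates}}


# Usage (needs a running MongoDB server):
#   card = db["grl"].find_one(id_filter(448510476))
```

If this filter also returns None, the value is not the problem, and the database or collection name is the next thing to check.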

Get all documents of a collection using Pymongo

I want to write a function that returns all the documents contained in mycollection in MongoDB:
from pymongo import MongoClient
if __name__ == '__main__':
    client = MongoClient("localhost", 27017, maxPoolSize=50)
    db=client.mydatabase
    collection=db['mycollection']
    cursor = collection.find({})
    for document in cursor:
        print(document)
However, the function returns: Process finished with exit code 0
Here is sample code that works fine when run from the command prompt.
from pymongo import MongoClient
if __name__ == '__main__':
    client = MongoClient("localhost", 27017, maxPoolSize=50)
    db = client.localhost
    collection = db['chain']
    cursor = collection.find({})
    for document in cursor:
        print(document)
Please check the collection name.
find() returns a cursor, not the documents themselves, so you only get documents as you iterate over it. To get all of them at once, try:
list(db.collection.find({}))
This forces the cursor to iterate over every document and puts the results in a list.
Have fun...
I think this will work fine in your program.
collection = db.mycollection # choose the collection you need
for document in collection.find():
    print(document)
It works fine for me. Try checking the exact database name and collection name,
and try changing db=client.mydatabase to db=client['mydatabase'].
If your database name is such that using attribute style access won’t work (like test-database), you can use dictionary style access instead.
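MongoDB creates databases and collections lazily, so a misspelled name gives an empty cursor rather than an error, and the script simply prints nothing. A quick way to check the names is to ask the server what actually exists. The sketch below assumes client is a pymongo MongoClient (PyMongo 3.6 or later); the helper name locate_collection is hypothetical.

```python
def locate_collection(client, db_name, collection_name):
    """Report whether the database and collection actually exist on the server."""
    db_exists = db_name in client.list_database_names()
    # Only look inside the database if it is really there.
    coll_exists = db_exists and (
        collection_name in client[db_name].list_collection_names()
    )
    return {"database_exists": db_exists, "collection_exists": coll_exists}


# Usage (needs a running MongoDB server):
#   from pymongo import MongoClient
#   client = MongoClient("localhost", 27017)
#   print(locate_collection(client, "mydatabase", "mycollection"))
```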

How can Python Observe Changes to Mongodb's Oplog

I have multiple Python scripts writing to MongoDB using PyMongo. How can another Python script observe changes to a Mongo query and perform some function when a change occurs? MongoDB is set up with the oplog enabled.
I wrote an incremental backup tool for MongoDB some time ago, in Python. The tool monitors data changes by tailing the oplog. Here is the relevant part of the code.
Updated answer, MongoDB 3.6+
As datdinhquoc cleverly points out in the comments below, for MongoDB 3.6 and up there are Change Streams.
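A minimal sketch of the Change Streams approach, assuming PyMongo 3.6 or later and a server running as a replica set or sharded cluster (change streams are not available on a standalone server). The helper name watch_changes and its max_events parameter are hypothetical; collection is assumed to be a pymongo Collection.

```python
def watch_changes(collection, handler, pipeline=None, max_events=None):
    """Call handler(change) for each change event; stop after max_events if given."""
    seen = 0
    # watch() returns a ChangeStream, usable as a context manager and an iterator.
    with collection.watch(pipeline or []) as stream:
        for change in stream:
            handler(change)
            seen += 1
            if max_events is not None and seen >= max_events:
                break
    return seen


# Usage (needs a replica set):
#   from pymongo import MongoClient
#   coll = MongoClient().mydatabase.mycollection
#   watch_changes(coll, print)   # blocks and prints every change event
```

The optional pipeline takes aggregation stages such as {"$match": {"operationType": "insert"}}, which lets the server filter events before they reach the script.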
Updated answer, pymongo 3
from time import sleep
from pymongo import MongoClient, ASCENDING
from pymongo.cursor import CursorType
from pymongo.errors import AutoReconnect
# Time to wait for data or connection.
_SLEEP = 1.0
if __name__ == '__main__':
    oplog = MongoClient().local.oplog.rs
    stamp = oplog.find().sort('$natural', ASCENDING).limit(-1).next()['ts']
    while True:
        kw = {}
        kw['filter'] = {'ts': {'$gt': stamp}}
        kw['cursor_type'] = CursorType.TAILABLE_AWAIT
        kw['oplog_replay'] = True
        cursor = oplog.find(**kw)
        try:
            while cursor.alive:
                for doc in cursor:
                    stamp = doc['ts']
                    print(doc) # Do something with doc.
                sleep(_SLEEP)
        except AutoReconnect:
            sleep(_SLEEP)
Also see http://api.mongodb.com/python/current/examples/tailable.html.
Original answer, pymongo 2
from time import sleep
from pymongo import MongoClient
from pymongo.cursor import _QUERY_OPTIONS
from pymongo.errors import AutoReconnect
from bson.timestamp import Timestamp
# Tailable cursor options.
_TAIL_OPTS = {'tailable': True, 'await_data': True}
# Time to wait for data or connection.
_SLEEP = 10
if __name__ == '__main__':
    db = MongoClient().local
    while True:
        query = {'ts': {'$gt': Timestamp(some_timestamp, 0)}} # Replace with your query.
        cursor = db.oplog.rs.find(query, **_TAIL_OPTS)
        cursor.add_option(_QUERY_OPTIONS['oplog_replay'])
        try:
            while cursor.alive:
                try:
                    doc = next(cursor)
                    # Do something with doc.
                except (AutoReconnect, StopIteration):
                    sleep(_SLEEP)
        finally:
            cursor.close()
I ran into this issue today and haven't found an updated answer anywhere.
The Cursor class has changed as of v3.0 and no longer accepts the tailable and await_data arguments. This example will tail the oplog and print the oplog record when it finds a record newer than the last one it found.
# Adapted from the example here: https://jira.mongodb.org/browse/PYTHON-735
# to work with pymongo 3.0
import pymongo
from pymongo.cursor import CursorType
c = pymongo.MongoClient()
# Uncomment this for master/slave.
oplog = c.local.oplog['$main']
# Uncomment this for replica sets.
#oplog = c.local.oplog.rs
first = next(oplog.find().sort('$natural', pymongo.DESCENDING).limit(-1))
ts = first['ts']
while True:
    cursor = oplog.find({'ts': {'$gt': ts}}, cursor_type=CursorType.TAILABLE_AWAIT, oplog_replay=True)
    while cursor.alive:
        for doc in cursor:
            ts = doc['ts']
            print(doc)
            # Work with doc here
Query the oplog with a tailable cursor.
It is actually funny, because oplog-monitoring is exactly what the tailable-cursor feature was added for originally. I find it extremely useful for other things as well (e.g. implementing a mongodb-based pubsub, see this post for example), but that was the original purpose.
I had the same issue. I put together this rescommunes/oplog.py. Check comments and see __main__ for an example of how you could use it with your script.

How to delete a MongoDB collection in PyMongo

How do I check in PyMongo whether a collection exists and, if it does, empty it (remove everything from the collection)?
I have tried
collection.remove()
or
collection.remove({})
but it doesn't delete the collection. How do I do that?
Sample code in PyMongo, with comments as explanation:
from pymongo import MongoClient
connection = MongoClient('localhost', 27017) #Connect to mongodb
print(connection.list_database_names()) #Return a list of db, equal to: > show dbs
db = connection['testdb1'] #equal to: > use testdb1
print(db.list_collection_names()) #Return a list of collections in 'testdb1'
print("posts" in db.list_collection_names()) #Check if collection "posts"
# exists in db (testdb1)
collection = db['posts']
print(collection.count_documents({}) == 0) #Check if collection named 'posts' is empty
collection.drop() #Delete(drop) collection named 'posts' from db and all documents contained.
You should use .drop() instead of .remove(), see documentation for detail: http://api.mongodb.org/python/current/api/pymongo/collection.html#pymongo.collection.Collection.drop
=====
Sorry for misunderstanding your question.
To check if a collection exists, use the list_collection_names method on the database:
>>> collection_name in database.list_collection_names()
To check if a collection is empty, use:
>>> collection.count_documents({}) == 0
Both return True or False.
Have you tried this:
db.collection.remove();
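Pulling the answers in this thread together, the sketch below checks that a collection exists and then either empties it or drops it. It assumes db is a pymongo Database (PyMongo 3.6 or later); the helper name empty_collection and its drop flag are hypothetical.

```python
def empty_collection(db, name, drop=False):
    """Empty the named collection if it exists; return True if it existed."""
    if name not in db.list_collection_names():
        return False
    if drop:
        db[name].drop()            # removes the collection and its indexes
    else:
        db[name].delete_many({})   # removes the documents, keeps the collection
    return True


# Usage (needs a running MongoDB server):
#   from pymongo import MongoClient
#   db = MongoClient("localhost", 27017)["testdb1"]
#   empty_collection(db, "posts")             # keep the collection, delete documents
#   empty_collection(db, "posts", drop=True)  # remove the collection itself
```

delete_many({}) is the better fit when the indexes should survive; drop() is usually faster for large collections because it discards everything at once.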
