MongoDB PyMongo Listing all keys in a document - python

I have a question about how to manipulate a document in PyMongo to have it list all of its current keys, and I'm not quite sure how to do it. For example, if I had a document that looked like this:
{
    "_id" : ObjectID("..."),
    "name": "ABCD",
    "info": {
        "description" : "XYZ",
        "type" : "QPR"
    }
}
and I had a variable "document" that had this current document as its value, how could I write code to print the three keys:
"_id"
"name"
"info"
I don't want it to list the values, simply the names. The motivation for this is that the user would type one of the names and my program would do additional things after that.

As mentioned in the documentation:
In PyMongo we use dictionaries to represent documents.
So you can get all keys using .keys():
print(document.keys())
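As a small sketch of the above: in Python 3, document.keys() returns a view object, so printing it directly shows dict_keys([...]); wrapping it in list() gives a plain list of names. The dictionary below is only a hypothetical stand-in for a document returned by find_one(), and the chosen variable simulates what the user typed.

```python
# Hypothetical stand-in for a document returned by PyMongo's find_one();
# a plain string replaces the real ObjectId so this runs without a database.
document = {
    "_id": "...",
    "name": "ABCD",
    "info": {"description": "XYZ", "type": "QPR"},
}

# Top-level key names only, in document order.
keys = list(document.keys())
for key in keys:
    print(key)

# Simulated user input (an assumption for illustration).
chosen = "info"
if chosen in document:
    print("selected:", document[chosen])
else:
    print("unknown key:", chosen)
```

Only the top-level keys are listed; the keys nested under "info" are not included.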

You can fetch all the documents in the collection into a cursor, mydoc, and print the keys of each document as a list (here collections is the PyMongo collection object):
mydoc = collections.find()
for x in mydoc:
    l = list(x.keys())
    print(l)
This gives the keys of each document as a list, which you can then use for whatever the user needs next.
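If documents in the collection do not all share the same fields, it may help to collect every key seen across all documents. The following is a minimal sketch under that assumption; the documents list is a hypothetical stand-in for the cursor returned by find().

```python
# Hypothetical stand-in for the cursor returned by collection.find().
documents = [
    {"_id": 1, "name": "ABCD"},
    {"_id": 2, "name": "EFGH", "info": {"type": "QPR"}},
]

# Collect each distinct top-level key once, in first-seen order.
all_keys = []
for doc in documents:
    for key in doc:
        if key not in all_keys:
            all_keys.append(key)

print(all_keys)
```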

The document is a Python dictionary, so you can just print its keys, e.g.
document = db.collection_name.find_one()
for k in document:
    print(k)

Related

How to add document for search in marqo

I recently started using the marqo library and I am trying to add a document so that marqo can search it and return the relevant part, but I keep getting an error when I run the code.
I used the
add_documents()
method and passed the document as a string, but it returns an error. Here is what my code looks like:
import marqo
DOCUMENT = 'the document'
mq = marqo.Client(url='http://localhost:8882')
mq.index("my-first-index").add_documents(DOCUMENT)
and when I run it I get a
MarqoWebError
You are getting the error because the add_documents() method takes a list of Python dictionaries as its argument, not a string, so you should pass the document as the value of a key of your choice. It is also advisable to add a title and an id for later reference. Here is what I mean:
mq.index("my-first-index").add_documents([
    {
        "Title": the_title_of_your_document,
        "Description": your_document,
        "_id": your_id,
    }
])
The id can be any string of your choice. You can add as many dictionaries as you want to the list; each dictionary represents a document.
I think the documents need to be a list of dicts. See here https://marqo.pages.dev/API-Reference/documents/
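To illustrate the shape the answers describe, here is a minimal sketch that wraps plain strings into a list of dictionaries. build_documents, the generated titles, and the "doc-N" ids are hypothetical names for illustration, not part of marqo; the actual call to the server is left as a comment because it needs a running Marqo instance, and some Marqo releases may expect additional arguments, so check the documentation for your version.

```python
def build_documents(texts, title_prefix="Document"):
    """Wrap plain strings in dictionaries (hypothetical helper, not a marqo API)."""
    return [
        {
            "Title": f"{title_prefix} {i}",
            "Description": text,
            "_id": f"doc-{i}",
        }
        for i, text in enumerate(texts, start=1)
    ]


docs = build_documents(["the document"])
print(docs)

# With a running Marqo server, the list could then be passed on, e.g.:
# mq.index("my-first-index").add_documents(docs)
```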

Convert search string to searchable keys from dictionary in python

Folks,
I am using the solution described in the following question:
How to use a dot "." to access members of dictionary?
I will minimize the solution here to explain the objective.
Basically, I am looking for a way to take a nested key structure in string format from the user and use it to query a dictionary. I used the solution above to make the dictionary searchable with dot notation. However, the following is where I am stuck:
data = {
    "Country" : {
        "US" : [ {"Connecticut" : "Hartford"} , {"California" : "Sacramento"} ]
    }
}
### Following works
f = Map(data)
print(f.Country.US[1].California)
Sacramento
### Following does not work
s = 'Country.US[1].California'
print(f[s])
What I need is a solution that takes an externally provided key structure and turns it into a searchable dot-notation lookup.
What can be done?
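One possible approach, sketched here under the assumption that paths only contain dot-separated names and numeric [n] indexes, is to split the string into tokens and walk the plain dictionary step by step. The resolve function and its regular expression are hypothetical and do not depend on the Map class from the linked question.

```python
import re

# A token is either a key name (no dots or brackets) or a numeric index like [1].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve(data, path):
    """Follow a path such as 'Country.US[1].California' through nested dicts and lists."""
    current = data
    for key, index in _TOKEN.findall(path):
        if key:
            current = current[key]         # dictionary lookup by name
        else:
            current = current[int(index)]  # list lookup by position
    return current


data = {
    "Country": {
        "US": [{"Connecticut": "Hartford"}, {"California": "Sacramento"}]
    }
}

print(resolve(data, "Country.US[1].California"))  # prints Sacramento
```

A missing key or out-of-range index raises KeyError or IndexError as usual, so user-supplied paths would need that handled by the caller.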

Troubleshoot JSON Parsing/Adding Property

I have a json whose first few lines are:
{
    "type": "Topology",
    "objects": {
        "counties": {
            "type": "GeometryCollection",
            "bbox": [-179.1473399999999, 17.67439566600018, 179.7784800000003, 71.38921046500008],
            "geometries": [{
                "type": "MultiPolygon",
                "id": 53073,
                "arcs": [
                    [
                        [0, 1, 2]
                    ]
                ]
            },
I built a python dictionary from that data as follows:
import json
with open('us.json') as f:
    data = json.load(f)
It's a very long JSON file (one entry for each county in the US). Yet when I run len(data) it returns 4. I was a bit confused by that, so I set out to probe further and explore the data:
data['id']
data['geometry']
Both of these raise a KeyError. Yet I know that the JSON file defines those properties. In fact, that's all the JSON is: the id for each county ('id') and a series of polygon coordinates for each county ('geometry'). Entering data does indeed return the whole JSON, and I can see the properties that way, but that doesn't help much.
My ultimate aim is to add a property to the json file, somewhat similar to this:
Add element to a json in python
The difference is I'm adding a property that is from a tsv. If you'd like all the details you may find my json and tsv here:
https://gist.github.com/diggetybo/ca9d3c2fed76ddc7185cf966a65b8718
For clarity, let me summarize what I'm asking:
My question is: why can't I access the properties in the above way? Can someone provide a way to access the properties I'm interested in ('id', 'geometries') or, better yet, demonstrate how to add a property?
Thank you
json.load
Deserialize fp (a .read()-supporting file-like object containing a
JSON document) to a Python object using this conversion table.
[] are for lists and {} are for dictionaries. So this is an example to get the id of each county:
with open("us.json") as f:
    c = json.load(f)
for i in c["objects"]["counties"]["geometries"]:
    print(i["id"])
And the structure of your data is like this:
{
    "type":"xx",
    "objects":"xx",
    "arcs":"xx",
    "transform":"xx"
}
So the length of data is 4. You can append data or add a new element just as you would with a list or dict. See the json module documentation for more details.
Hope this helps.
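As for adding a property, the following is a minimal sketch rather than a definitive solution. It uses a tiny in-memory topology instead of us.json, and the values_by_id mapping, the "properties" container, and the "rate" name are all assumptions standing in for whatever the TSV actually holds.

```python
import json

# Small in-memory stand-in for the parsed us.json.
topology = {
    "type": "Topology",
    "objects": {
        "counties": {
            "type": "GeometryCollection",
            "geometries": [
                {"type": "MultiPolygon", "id": 53073, "arcs": [[[0, 1, 2]]]},
            ],
        }
    },
}

# Hypothetical values keyed by county id, standing in for rows read from the TSV.
values_by_id = {53073: 4.2}

# Each geometry is an ordinary dict, so a new property is just a new key.
for geometry in topology["objects"]["counties"]["geometries"]:
    properties = geometry.setdefault("properties", {})
    properties["rate"] = values_by_id.get(geometry["id"])

print(json.dumps(topology["objects"]["counties"]["geometries"][0]))
```

Writing the modified dictionary back out with json.dump would then produce a JSON file that includes the new property.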

json template and python coding should not be tightly coupled to each other

I have a JSON file, and after I load it, Python's json.loads converts it into a dictionary. For example, if I have a JSON file like
{ "Family" :
    {
        "Father" : "Name of the person",
        "Mother" : "Name of the person",
        "Children" : [
            {
                "Name" : "Name of the kid",
                "Age" : "Age value of the kid"
            }
        ]
    }
}
I could access all the keys and values from the dictionary.
Question and requirement: I don't want to do a literal comparison like
if key == 'Family':
    ...  # do some operations
elif key == 'Mother':
    ...  # do other operations
else:
    ...  # do something else
If either a key or the nesting in the above JSON template is modified, I want the keys and values to be updated immediately in my Python code. I don't want tight coupling between the Python code and the JSON template. Is that possible?
I came up with two solutions:
1) Use constant values for the keys, so any update to a key in the JSON template also requires updating the constants in the Python code. But if the nesting changes, there is a problem.
2) Use an INI file to map JSON keys to aliases, so my INI file will look like
[Family]
father : Father
mother : Mother
[etc...]
so the Python code will always refer to father, mother instead of the JSON keys. That way, if the JSON template is updated, only the right-hand side of the INI file needs updating. Again, this breaks down if the template's structure changes rather than just a key name.
Please let me know if there is any solution to this.
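One way to loosen the coupling, offered only as a sketch, is to walk the loaded dictionary generically and let the code discover keys and nesting at run time instead of naming them. The walk function below is hypothetical; it yields a (path, value) pair for every leaf, whatever the template looks like.

```python
def walk(node, path=()):
    """Yield (path, value) for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, path + (index,))
    else:
        yield path, node


family = {
    "Family": {
        "Father": "Name of the person",
        "Mother": "Name of the person",
        "Children": [
            {"Name": "Name of the kid", "Age": "Age value of the kid"}
        ],
    }
}

for path, value in walk(family):
    print(path, "->", value)
```

This removes hard-coded key names from the traversal, but any logic that must treat a particular field specially still needs some mapping from that field to its behavior.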

Pluck in Python

I started reading about underscore.js today; it is a library for JavaScript that adds some functional programming goodies I'm used to using in Python. One pretty cool shorthand method is pluck.
Indeed in Python I often need to pluck out some specific attribute, and end up doing this:
users = [{
"name" : "Bemmu",
"uid" : "297200003"
},
{
"name" : "Zuck",
"uid" : "4"
}]
uids = map(lambda x:x["uid"], users)
If the underscore shorthand is somewhere in Python, this would be possible:
uids = pluck(users, "uid")
It's of course trivial to add, but is that in Python somewhere already?
Just use a list comprehension in whatever function is consuming uids. Instead of
uids = map(operator.itemgetter("uid"), users)
foo(uids)
do
foo([x["uid"] for x in users])
If you just want uids to iterate over, you don't need to make a list -- use a generator expression instead. (Replace [] with ().)
For example:
def print_all(it):
    """ Trivial function."""
    for i in it:
        print(i)
print_all(x["uid"] for x in users)
The funcy module (https://github.com/Suor/funcy) provides a pluck function.
Provided that funcy is installed, the following code should work as expected:
from funcy import pluck
users = [{
"name" : "Bemmu",
"uid" : "297200003"
},
{
"name" : "Zuck",
"uid" : "4"
}]
uids = pluck("uid", users)
Note that the order of arguments is different from that used in underscore.js.
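For comparison, a hand-rolled version with no third-party dependency might look like the sketch below. This pluck is a hypothetical helper that follows the underscore.js argument order (collection first, key second) and returns a list.

```python
def pluck(items, key):
    """Return the value stored under key for every mapping in items."""
    return [item[key] for item in items]


users = [
    {"name": "Bemmu", "uid": "297200003"},
    {"name": "Zuck", "uid": "4"},
]

uids = pluck(users, "uid")
print(uids)
```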
