I am a noob at Python and MongoDB and would really appreciate your help with my problem. My collection in MongoDB looks like this:
{
  "Segments" : [
    {
      "Devices" : [
        {
          "IP" : "",
          "Interfaces" : [
            {
              "Name" : ""
            }
          ]
        }
      ],
      "DeviceName" : "",
      "SegmentName" : ""
    }
  ]
}
I have an object like so:
Node Details: {'node:98a': ['Sweden', 'Stockholm', '98a-3470'], 'node:98b': ['Denmark', 'Copenhagen', '98b-3471', '98b-3472']}
I need to update the 'Name' field within the 'Interfaces' array in the document above with values from the Node Details dictionary. I have tried using $set, $addToSet, $push etc., but nothing is helping. I have already added the Segment and DeviceName information.
The output should be as follows:
{
  "Segments" : [
    {
      "Devices" : [
        {
          "Interfaces" : [
            { "Name" : "98a-3470" }
          ],
          "DeviceName" : "node:98a"
        },
        {
          "Interfaces" : [
            { "Name" : "98b-3471" },
            { "Name" : "98b-3472" }
          ],
          "DeviceName" : "node:98b"
        }
      ],
      "SegmentName" : "segmentA"
    }
  ]
}
Any help would be greatly appreciated. I have tried a lot in the MongoDB shell and also on Google, but to no avail. Thank you all.
Regards,
trupsster
[[ EDITED ]]
Okay, here is what I have got so far after continuing to poke around after posing the question: I used the following query in MongoDB shell:
db.test.mycoll.update({'Segments.SegmentName':'segmentA','Segments.Devices.Name':'node:98a'}, {$set: {"Segments.$.Devices.0.Interfaces.Name" : "98b-3470"}})
Now this inserted in the correct place as per my 'schema', but when I try to add the second interface, it simply replaces the earlier one. I tried using $push (complained about it not being an array), and $addToSet (showed another error), but none helped. Can you please help me from this point on?
Thanks,
trupsster
[[ Edited again ]]
I found the solution! Here is what I did:
To add an interface to an existing device:
db.test.mycoll.update({'Segments.SegmentName':'segmentA','Segments.Devices.Name':'node:98a'}, {$addToSet: {"Segments.$.Devices.0.Interfaces.Name" : "98a-3471"}})
Now, to append to the dict with a new 'Name' within the array 'Interfaces':
db.test.mycoll.update({'Segments.SegmentName':'segmentA','Segments.Devices.Name':'node:98a'}, {$addToSet: {"Segments.$.Devices.0.Interfaces" : {"Name" : "98a-3472"}}})
As you can see, I used $addToSet.
Now, next step was to add the same information (with different values) to 2nd device, which was done like so:
db.test.mycoll.update({'Segments.SegmentName':'segmentA','Segments.Devices.Name':'node:98b'}, {$addToSet: {"Segments.$.Devices.1.Interfaces" : {"Name" : "98b-3473"}}})
So that was it! I am so chuffed with myself! Thank you all who took time to read my problem. I hope my solution will help someone.
Regards,
trupsster
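For the Python side, here is a minimal sketch of turning the Node Details dictionary into one update per device. It only builds the filter and update documents and does not talk to a server. It assumes, as in the expected output above, that the first two entries of each list are country and city, that the remaining entries are interface names, and that DeviceName lives inside each element of Devices. The function name build_interface_updates is made up for illustration. Instead of the hard-coded array indices used in the shell commands above, it uses filtered positional operators ($[seg], $[dev]), which need MongoDB 3.6 or later.

```python
def build_interface_updates(node_details, segment_name):
    """Return one (filter, update, array_filters) tuple per device."""
    operations = []
    for device_name, values in node_details.items():
        # Assumption: values[0] is the country, values[1] the city,
        # and everything after that is an interface name.
        interfaces = [{"Name": name} for name in values[2:]]
        query = {"Segments.SegmentName": segment_name}
        update = {
            "$addToSet": {
                # $[seg] and $[dev] are resolved by the array filters below;
                # $each adds every interface in one operation.
                "Segments.$[seg].Devices.$[dev].Interfaces": {"$each": interfaces}
            }
        }
        array_filters = [
            {"seg.SegmentName": segment_name},
            {"dev.DeviceName": device_name},
        ]
        operations.append((query, update, array_filters))
    return operations


node_details = {
    "node:98a": ["Sweden", "Stockholm", "98a-3470"],
    "node:98b": ["Denmark", "Copenhagen", "98b-3471", "98b-3472"],
}

for query, update, array_filters in build_interface_updates(node_details, "segmentA"):
    # With pymongo this would be passed on as:
    # collection.update_one(query, update, array_filters=array_filters)
    print(query, update, array_filters)
```

Because $addToSet only adds values that are not already present, running the same updates twice should not duplicate interfaces.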
You did not say what you actually tried. To access a sub-document inside an array, you need to use dot notation with numeric indices. So to address the Name field in your example:
Segments.Devices.0.Interfaces.0.Name
Did you try that? Does it work?
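To make the dot notation concrete, here is a small Python sketch that walks a dotted path through a nested document. The helper resolve_path is hypothetical and stricter than MongoDB itself: it expects an explicit numeric index for every array on the path, whereas MongoDB queries can also search across array elements when an index is left out.

```python
def resolve_path(document, dotted_path):
    """Follow a dotted path, using each part as a list index or a dict key."""
    current = document
    for part in dotted_path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # numeric part addresses an array slot
        else:
            current = current[part]       # any other part is a field name
    return current


doc = {
    "Segments": [
        {
            "SegmentName": "segmentA",
            "Devices": [
                {"DeviceName": "node:98a", "Interfaces": [{"Name": "98a-3470"}]}
            ],
        }
    ]
}

print(resolve_path(doc, "Segments.0.Devices.0.Interfaces.0.Name"))  # 98a-3470
```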
I would like to get the top-level objects (I don't know if that's the right name) of my JSON file. The file is huge (more than 120k lines), so I can't go through it manually.
Format is like this :
"datanode": [
  {
    "isWhitelisted": true,
    "metricname": "write_time",
    "seriesStartTime": 1542037566944,
    "supportsAggregation": true
  },
  {
    "isWhitelisted": true,
    "metricname": "dfs.datanode.CacheReportsNumOps",
    "seriesStartTime": 1542037501137,
    "supportsAggregation": true,
    "type": "COUNTER"
  },
  {
    "isWhitelisted": true,
    "metricname": "FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.EstimatedCapacityLostTotal",
    "seriesStartTime": 1542037495521,
    "supportsAggregation": true,
    "type": "GAUGE"
  },
],
"toto": [
....
What I need is to extract only the names: datanode, toto, etc.
Can you help me, please?
I tried using jq without success.
You can use jq's keys function:
jq 'keys' file.json
In the future, try to be more precise about the words you use to describe the different parts of the JSON data. You asked about objects in the text, but you are actually referring to keys.
A more fitting title for the question would have been: "How to get all top level keys of json data using jq?" With this more accurate wording, you would find already answered questions like this one: How to get key names from JSON using jq
Also, provide a complete and valid example structure and the expected result, like this:
{
  "one_key": {
    "foo": "bar"
  },
  "another_one": {
    "bla": "bla"
  }
}
And desired result:
[
  "another_one",
  "one_key"
]
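If jq is not at hand, the same result can be had from Python's standard library. This is only a sketch: it loads the whole document into memory, which is usually fine for a file of this size, and it sorts the keys because jq's keys also returns them sorted. The inline text stands in for the real file.

```python
import json

# Stand-in for the real file; for a file on disk use:
#   with open("file.json") as handle:
#       data = json.load(handle)
text = '{"one_key": {"foo": "bar"}, "another_one": {"bla": "bla"}}'
data = json.loads(text)

# Iterating over a dict yields its keys, so sorted() returns the
# top-level key names in sorted order.
top_level_keys = sorted(data)
print(top_level_keys)  # ['another_one', 'one_key']
```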
Basically, I am designing and developing a Python application that runs each night, takes a website and a list of keywords, and queries the Google API to obtain the website's position for each keyword.
I want to use a NoSQL approach, and the document model that MongoDB offers seems like the best fit. However, I'm confused about how to structure the data inside the database.
Each night new data will be generated, containing 50 keywords and their positions. I presume this will be stored in its own object, identifiable by a specific URL.
So will it be possible to query the database for a given URL and a date range of, say, the past 30 or 60 days? I'm not sure whether I will be able to fetch all of the objects back.
The main requirement for that structure will be the ability to query on a daily basis.
So let's say we have a website, www.stackoverflow.com, and our X keywords.
The basic document shape could look like this:
{
  _id : objectId, // the ObjectId embeds a creation timestamp
  www : "www.stackoverflow.com",
  rankings : [{
      "key1" : "val1"
    }, {
      "key2" : "val2"
    }
  ]
}
Then, if we want to see the ranking history for key1, we can use the aggregation framework to query:
db.ranking.aggregate([
  { $unwind : "$rankings" },
  { $match : { "rankings.key1" : { $exists : true } } }
])
and the response will be similar to:
{
  "_id" : ObjectId("584dbe04f4ce077869fee3dc"),
  "www" : "www.stackoverflow.com",
  "rankings" : {
    "key1" : "val1"
  }
},
{
  "_id" : ObjectId("584dbe07f4ce077869fee3dd"),
  "www" : "www.stackoverflow.com",
  "rankings" : {
    "key1" : "val1"
  }
}
Read more about grouping in the aggregation framework to uncover the power of MongoDB!
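For the "past 30 or 60 days" part of the question, one possible approach is a range filter on a date. The sketch below only builds the filter document in Python. It assumes each nightly document also stores an explicit "date" field, which is not part of the shape shown above, and the function name recent_rankings_filter is made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def recent_rankings_filter(url, days, now=None):
    """Build a filter for one site's documents from the last `days` days."""
    now = now or datetime.now(timezone.utc)
    # Assumption: every nightly document carries a "date" field.
    return {"www": url, "date": {"$gte": now - timedelta(days=days)}}


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2016, 12, 31, tzinfo=timezone.utc)
query = recent_rankings_filter("www.stackoverflow.com", 30, now=fixed_now)
print(query)

# With pymongo the filter would be used as: db.ranking.find(query)
```

An index on the site and date fields would probably help once the collection grows, but that depends on the real query patterns.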
Suppose I have a mongo document that looks similar to the following:
{
  'foo': 1,
  'listsOfLists': [ [1,2], [3,4] ]
}
(Yes I am aware this isn't how it "really" looks but it should be simple enough for explanation purposes.)
If I wanted to write a query that would check to see if the listsOfLists list object contains the combination of [3,4], how could I go about doing that?
Could I do something like this?
collection.find({'listsOfLists' : {'$elemMatch' : [3,4] } })
collection.find({ 'listsOfLists': [3,4] })
This is just a direct match on the property. MongoDB looks at each array element automatically, so you don't need $elemMatch here.
If you were to use it, you would need an operator expression, such as $eq:
collection.find({ 'listsOfLists': { '$elemMatch': { '$eq': [3,4] } } })
But that is not required unless two or more conditions need to match on the same array element, which is what $elemMatch is actually for.
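A rough way to picture the direct match is the following pure-Python imitation. It is not MongoDB's implementation, only a sketch of the simple equality case: the filter matches when the field equals the target as a whole, or when the field is an array and one of its elements equals the target. The helper name equality_matches is hypothetical.

```python
def equality_matches(field_value, target):
    """Imitate {'field': target} for plain equality against one field."""
    if field_value == target:
        return True  # the whole field equals the target
    if isinstance(field_value, list):
        # An array field also matches if any single element equals the target.
        return any(element == target for element in field_value)
    return False


doc = {"foo": 1, "listsOfLists": [[1, 2], [3, 4]]}

print(equality_matches(doc["listsOfLists"], [3, 4]))  # True
print(equality_matches(doc["listsOfLists"], [4, 3]))  # False
```

The second call returns False because array equality is order-sensitive: [4, 3] is a different array from [3, 4].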
I'm using pymongo in Python.
I have a MongoDB document like this:
{u'_id': ObjectId('55110d55a5bd910f2513fc91'), u'ghi': u'jkl'}
I want to update the document by replacing it:
db['table_name'].update({'ghi':'jkl'},{'ghio':'jkl'}, True)
The problem is that I wanted to use $currentDate along with the update query as I'm required to add update time with the document. How do I do that?
This is what I've tried out so far
db['table_name'].update({'ghi':'jkl'},{'$set':{'ghik':'jkl'}, '$currentDate':{'date':True}}, True)
The issue with the above code is that I do not want to use $set as it will retain the other fields which I do not require.
db['table_name'].update({'ghi':'jkl'},{'$set':{'ghik':'jkl'}, '$unset':{'ghi':True}, '$currentDate':{'date':True}}, True)
The above code works, but I would like to know if there is a better way to do it.
$currentDate only works alongside update operators like $set, not with a full-document replacement. You can use the $unset approach you pointed out, although it only removes the fields you name explicitly. Alternatively, you can set the timestamp client-side:
db.test.update({ "ghi" : "jkl" }, { "ghio" : "jkl", "date" : datetime.today() })
or you can do two updates
db.test.update({ "ghi" : "jkl" }, { "ghio" : "jkl" })
db.test.update({ "ghio" : "jkl" }, { "$currentDate" : { "date" : true } })
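As a sketch of the client-side option in Python: build the full replacement document first, stamp it with the current time, and hand it to a single replace. The helper name replacement_with_timestamp is hypothetical, and the timestamp comes from the client's clock rather than the server's, which is the trade-off compared with $currentDate.

```python
from datetime import datetime, timezone


def replacement_with_timestamp(fields, now=None):
    """Return a copy of `fields` with a client-side 'date' timestamp added."""
    document = dict(fields)  # copy, so the caller's dict is not modified
    document["date"] = now or datetime.now(timezone.utc)
    return document


replacement = replacement_with_timestamp({"ghio": "jkl"})
print(sorted(replacement))  # ['date', 'ghio']

# With a recent pymongo the replacement would be applied as:
# db["table_name"].replace_one({"ghi": "jkl"}, replacement, upsert=True)
```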
I have a mongo document:
{ "_id" : 0, "name" : "Vasya", "fav" : [ { "type" : "t1", "weight" : 1.4163 }, { "type" : "t2", "weight" : 11.7772 }, { "type" : "t2", "weight" : 6.4615 }, { "type" : "homework", "score" : 35.8742 } ] }
To delete the lowest element in the array "fav", I use the following Python code:
db.people.update({"fav":{"type":"t2", "weight":lowest}}, {"$pull":{"fav":{"type":"t2", "weight":lowest}}})
where variable lowest is the lowest value between 6.4615 and 35.8742.
The problem is that this code does nothing. There are no errors, and the values are not deleted from the array. But if I write in the mongo shell the same code, the result is positive.
Unfortunately my experience in pymongo and in mongo is not so good. So if someone knows what the problem is, that would be great.
The syntax works fine for me in Mongo shell and with pymongo, so as suspected the issue is the precision of floating point numbers.
I don't know how you are deriving or computing lowest, but you may want to consider standardizing on a maximum number of digits after the decimal point, or even having a function that normalizes your floats to the same precision, both when you originally save documents and when you later query or update them.
Neither Mongo nor Python considers 6.676176060654615 to be equal to 6.67617606065, which explains why your update has no effect.
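A minimal Python sketch of both ideas follows. The normalize helper and its choice of four decimal places are assumptions for illustration. The second half takes lowest directly from the stored values, so that the number sent in the $pull is exactly the one in the document.

```python
stored = 6.676176060654615
computed = 6.67617606065
print(stored == computed)  # False


def normalize(value, digits=4):
    """Round to a fixed number of decimal places (4 is an arbitrary choice)."""
    return round(value, digits)


print(normalize(stored) == normalize(computed))  # True

# Taking the minimum from the stored values avoids recomputing the float.
fav = [
    {"type": "t1", "weight": 1.4163},
    {"type": "t2", "weight": 11.7772},
    {"type": "t2", "weight": 6.4615},
]
lowest = min(item["weight"] for item in fav if item["type"] == "t2")
pull_update = {"$pull": {"fav": {"type": "t2", "weight": lowest}}}
print(pull_update)
```

Normalizing only helps if the same function is applied when the documents are written as well as when they are queried.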