How do I call networkx.add_node(..) with optional properties? - python

I'm looping through a dictionary of objects constructed from JSON, and I'm creating vertices from them using networkx. The problem I'm experiencing is that some of the JSON objects have missing properties, and if I do this:
self.graph.add_node(valueToCheck,
                    id=self.vertexDict[valueToCheck],
                    namespace=component["namespace"],
                    tenant=component["tenant"],
                    type=component.get("type")+"Component",
                    artifactFileName=component.get("artifactFileName"),
                    className=component.get("className"),
                    userConfig=component.get("userConfig"),
                    sourceType=component.get("sourceType"),
                    sinkType=component.get("sinkType"))
then I can't export my graph using nx.write_graphml(..) because some of the vertex properties have the value None (which is the expected output of component.get(..) when the property is missing).
How do I use networkx to construct vertices when some of my properties might be missing in the JSON objects?
Here's what my JSON looks like:
[{'type': 'function',
'namespace': 'campaigns',
'name': 'campaign-record-transformer',
'tenant': 'osp',
'artifactFileName': 'osp-functions-1.1-SNAPSHOT-jar-with-dependencies.jar',
'className': 'com.overstock.dataeng.pulsar.functions.CampaignRecordTransformer',
'inputs': ['persistent://osp/campaigns/campaign-manager'],
'logTopic': 'persistent://osp/logging/pulsar-log-topic',
'output': 'persistent://osp/campaigns/campaign-records'},
{'type': 'function',
'namespace': 'campaignsTest',
'name': 'campaign-metadata-transformer',
'tenant': 'osp',
'artifactFileName': 'osp-functions-1.1-SNAPSHOT-jar-with-dependencies.jar',
'className': 'com.overstock.dataeng.pulsar.functions.CampaignMetadataTransformer',
'logTopic': 'persistent://osp/logging/pulsar-log-topic',
'output': 'persistent://osp/campaigns/campaign-metadata-output'}]
Notice that the inputs property is missing from the second object. In the actual data, there are at least 8 optional properties that can be missing in different combinations, and there are hundreds of objects like this.

I do not have the reputation for a comment, so despite this not being a full answer, I am posting it as such.
Have you tried simply excluding the properties that are missing from your add_node step?
That is, instead of providing a key/value pair where the value is None, don't provide a key/value pair at all if the key is missing.
You can probably achieve this quite easily by loading your JSON using Python and then just unpacking your component:
components = json.load(...)
for component in components:
    self.graph.add_node(value, **component)
See https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists
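If you build the attributes with component.get(..) as in the question, another option is to filter out the None values before unpacking. Below is a minimal sketch of that idea. The helper name clean_attrs is hypothetical, and joining list values into a string is an assumption on my part: as far as I know the GraphML writer only accepts scalar values, so a list such as inputs may need converting as well.

```python
def clean_attrs(component):
    """Return a copy of component without None values.

    List/tuple values are joined into one comma-separated string
    (an assumption, made so that every value is a plain scalar).
    """
    attrs = {}
    for key, value in component.items():
        if value is None:
            continue  # skip missing properties entirely
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        attrs[key] = value
    return attrs


# Sample component modelled on the JSON in the question; sinkType is missing.
component = {
    "type": "function",
    "namespace": "campaignsTest",
    "tenant": "osp",
    "sinkType": None,
    "inputs": ["persistent://osp/campaigns/campaign-manager"],
}

attrs = clean_attrs(component)

# With a networkx graph this would then be passed along as:
# self.graph.add_node(valueToCheck, **attrs)
```

Only the keys that survive the filter end up as node attributes, so no attribute is ever set to None.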

Related

Firebase Realtime Database - cannot read document as json/dictionary

In my realtime database I have a path /stats which contains a set of documents.
I want to get the /stats document as a dict using the Python SDK. My code looks like this:
path = "/stats"
ref = db.reference(path, firebase_app)
document = ref.get()
print(document)
And the output is
[None, {'name': 'Full Time Statistics', 'thumbnail': 'https://***', 'url': 'https://***'}]
which is a list, not a dictionary. How can I change this and read this document path as a dictionary, something like this:
{"1": {'name': 'Full Time Statistics', 'thumbnail': 'https://***', 'url': 'https://***'}}
On the other hand, I can get other documents with similar structures as a dictionary with no issue. Why does this happen and how can I solve it?
Two things are happening here:
Since you are retrieving /stats you are getting all nodes under it. Since this is a repeated list and Firebase Realtime Database keys are strings, you'd normally get a dictionary (with the keys in the dictionary being the keys in the JSON).
Since your keys are numeric values, Firebase "thinks" you are trying to store an array/list and it tries to coerce the data into an array for you. That's why you get a None entry in the list: that's Firebase filling in the zeroth element for you.
There's unfortunately no way to disable this array coercion. I typically get around it by prefixing the keys with a fixed string, so that Firebase bypasses its array logic. So:
stats: {
  stat1: { ... },
  stat2: { ... }
}
Also see:
Best Practices: Arrays in Firebase
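If you cannot change the keys in the database, you could also convert the coerced list back into a dictionary on the Python side. This is only a sketch of that workaround, not something the SDK provides: the helper name list_to_keyed_dict is hypothetical, and it assumes that None entries are gaps filled in by the array coercion rather than real data.

```python
def list_to_keyed_dict(document):
    """Turn a list returned by ref.get() into a dict keyed by string index."""
    if document is None:
        return {}
    if isinstance(document, dict):
        return dict(document)  # already a dictionary, just copy it
    # Skip the None placeholders inserted for missing indexes.
    return {
        str(index): value
        for index, value in enumerate(document)
        if value is not None
    }


# Stand-in for the value returned by ref.get() in the question.
document = [None, {"name": "Full Time Statistics"}]
stats = list_to_keyed_dict(document)
```

The result uses the original numeric positions as string keys, which matches the shape asked for in the question.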

Python: Parsing JSON data from API get - referring to the dictionary key?

I'm pretty new to Python so I'm only just starting to work with APIs. I have retrieved the data I need from an API and it returns in the following format:
{u'id': u'000123', u'payload': u"{'account_title': u’sam b’, 'phone_num': ‘1234567890’, 'security_pin': u'000000', 'remote_word': u’secret123’, 'email_address': ‘email#gmail.com’, 'password': u’password123’}”}
So when I print my variable, it returns the above...
If I just want part of what it returns, how do I go about writing this? I was able to return id without issue, and if I specified 'payload' it returned everything after that. It seems like account_title, phone_num, security_pin, remote_word, email_address and password are all nested inside of 'payload'
What would be the best way to have a variable that, when printed, returns just the email_address, for example?
Thanks!
Welcome to Python! Sounds like you're getting right into it. It would be best to begin reading fundamentals, specifically about the Dictionary Data Structure
The Dictionary, or dict is what you are referencing in your question. It's a key-value store that is generally[1] un-ordered. The dict is a great way to represent JSON data.
Now you are asking how to extract information from a dictionary. Well, you seem to have it working out thus far! Let's use your example:
d = {u'id': u'000123', u'payload': u"{'account_title': u’sam b’, 'phone_num': ‘1234567890’, 'security_pin': u'000000', 'remote_word': u’secret123’, 'email_address': ‘email#gmail.com’, 'password': u’password123’}"}
Now if we write d['id'], we'll get the id (which is 000123)
If we write d['payload'], we'll get the dictionary within this larger dictionary. Cool part about dicts, they can be nested like this! As many times as you need.
d['payload']
"{'account_title': u’sam b’, 'phone_num': ‘1234567890’, 'security_pin': u'000000', 'remote_word': u’secret123’, 'email_address': ‘email#gmail.com’, 'password': u’password123’}"
Then as per your question, if you wanted to get email, it's the same syntax and you're just nesting the accessor. Like so:
d['payload']['email_address']
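One caveat: in the data as printed, the value of payload is wrapped in quotes, which suggests it is a string containing a dict literal rather than a dict. Indexing a string with 'email_address' raises a TypeError, so the string would need to be parsed first. A minimal sketch, assuming the payload is a valid Python literal with plain quotes (the sample below is a cleaned-up, shortened version of the question's data):

```python
import ast

# Shortened stand-in for the API response; payload is a *string* here.
d = {
    "id": "000123",
    "payload": "{'account_title': 'sam b', 'email_address': 'email#gmail.com'}",
}

payload = d["payload"]
if isinstance(payload, str):
    # literal_eval only evaluates literals, so it is safer than eval().
    payload = ast.literal_eval(payload)

email = payload["email_address"]
```

If the API actually returns JSON with double quotes, json.loads would be the usual choice instead of ast.literal_eval.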
Hope that helps!
[1] For the longest time, dicts were un-ordered in Python. In versions 3.6 and up, things began changing. This answer provides great detail on that. Otherwise, prior to that, using collections.OrderedDict was the only way to get a dict ordered by insertion-order

Work with nested objects using couchdb-python

Disclaimer: Both Python and CouchDB are new for me. So far my "programming" has mostly consisted of Bash scripts.
I'm trying to create a small script that updates objects in a CouchDB database. The objects however aren't created by my script but by an App called Tap Forms that uses CouchDB for sync. Basically I'm trying to automatically update the content of the app. That also means I can't really influence the structure or names of the objects in CouchDB.
The Database is mostly filled with objects of this structure:
{
  "_id": "rec-3b17...",
  "_rev": "21-cdf6...",
  "values": {
    "fld-c3d4...": 4,
    "fld-1def...": 1000000000000,
    "fld-bb44...": 760000000000,
    "fld-a44f...": "admin,name",
    "fld-5fc0...": "SSD",
    "fld-642c...": true
  },
  "deviceName": "MacBook Air",
  "dateModified": "2019-02-08T14:47:06.051Z",
  "dateCreated": "2019-02-08T11:33:00.018Z",
  "type": "frm-7ff3...",
  "dbID": "db-1435...",
  "form": "frm-7ff3..."
}
I shortened the numbers a bit and removed some entries to increase readability.
Now the actual values I'm trying to update are within the "values" : {...} array (or object, or list, guess I don't have much experience with JSON either).
As I know some of these values, I managed to create a view that finds the _id of an object on the server. I then use the python-couchdb module as described in the documentation:
for item in db.view('CustomViews/test2', key="GENERIC"):
    doc = db[item.id]
This gives me the object. However, I want to update one of the values within the values array, let's say fld-c3d4.... But how? Using doc['values'] = 'new_value' replaces the whole array. I tried other (seemingly logical) approaches along the lines of doc['values['fld-c3d4']'] = 'new_value' but couldn't wrap my head around it. I couldn't find an example in any documentation.
So here's an example of how to update fld-c3d4.
Your document is a dictionary that contains a nested dictionary.
If you want to get the values, you will do something like this:
values = doc['values']
Now the variable values points to the values in your document.
From there, you can access a sub value:
values['fld-c3d4'] = 'new value'
If you want to directly update the value from the doc, you just have to chain those operations:
doc['values']['fld-c3d4'] = 'new value'
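Put together, the chained update looks like the sketch below. A plain dict stands in for the CouchDB document so the snippet runs without a server; the field names are copied from the question with their elisions left as-is, and the new value 5 is made up for illustration.

```python
# Stand-in for the document returned by db[item.id].
doc = {
    "_id": "rec-3b17...",
    "values": {
        "fld-c3d4...": 4,
        "fld-5fc0...": "SSD",
    },
}

# Update a single nested value; the rest of "values" stays untouched.
doc["values"]["fld-c3d4..."] = 5

# With couchdb-python, the modified document still has to be written
# back to the server, for example with:
# db.save(doc)
```

Changing the dict only modifies the local copy, so the save step is what actually updates the record in the database.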

Combining separate self.request.sessions into one request

I am trying to optimize and reduce some of my code, and just generally understand it better as this is my first development project.
The below works fine but is it possible to simplify it?
self.request.session['path_one_images'] = PATH_ONE_IMAGES
self.request.session['images'] = images
self.request.session['slider_DV_values'] = slider_DV_values
self.request.session['instruction_task_one_images'] = INSTRUCTION_TASK_ONE_IMAGES
self.request.session['instruction_task_two_images'] = INSTRUCTION_TASK_TWO_IMAGES
I tried to combine the separate requests in one using a dict but get the error:
Exception Value: unhashable type: 'list'
self.request.session({['path_one_images'] : PATH_ONE_IMAGES,
['images'] : images,
['slider_DV_values'] : slider_DV_values,
['instruction_task_one_images'] : INSTRUCTION_TASK_ONE_IMAGES,
['instruction_task_two_images'] : INSTRUCTION_TASK_TWO_IMAGES,})
request.session is basically a Python mapping, much like a dictionary, and it supports the usual dictionary methods, such as update() to set multiple key-value pairs at once:
self.request.session.update({
    'path_one_images': PATH_ONE_IMAGES,
    'images': images,
    'slider_DV_values': slider_DV_values,
    'instruction_task_one_images': INSTRUCTION_TASK_ONE_IMAGES,
    'instruction_task_two_images': INSTRUCTION_TASK_TWO_IMAGES
})
Note that the keys are not lists; you were getting confused by the object[...] subscription syntax there.
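To see that update() is equivalent to the separate assignments, here is a small sketch that uses a plain dict as a stand-in for self.request.session. The values are placeholders invented for the example, not the real image lists.

```python
# Placeholder values; in the real view these come from the application.
PATH_ONE_IMAGES = ["a.png", "b.png"]
images = ["c.png"]

# One assignment per key, as in the original code.
session_a = {}  # stand-in for self.request.session
session_a["path_one_images"] = PATH_ONE_IMAGES
session_a["images"] = images

# The same result with a single update() call.
session_b = {}  # stand-in for self.request.session
session_b.update({
    "path_one_images": PATH_ONE_IMAGES,
    "images": images,
})
```

Both stand-in sessions end up with identical contents, so the single call is purely a readability improvement.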
You know this is wrong syntax for a dict, yes?
{['path_one_images'] : PATH_ONE_IMAGES}
...should be
{'path_one_images': PATH_ONE_IMAGES, etc}
https://docs.python.org/2/library/stdtypes.html#dict
This explains the error you're getting ("unhashable type: 'list'"): Python thinks you're trying to use a list ['path_one_images'] as the dict key. Dict keys don't have to be strings but they have to be hashable. In this case you just want to use the string 'path_one_images'.
Additionally, as @Martijn Pieters pointed out, the session dict itself isn't callable; you should use the update method, e.g.:
self.request.session.update({
    'path_one_images': PATH_ONE_IMAGES,
    'images': images,
    'slider_DV_values': slider_DV_values,
    'instruction_task_one_images': INSTRUCTION_TASK_ONE_IMAGES,
    'instruction_task_two_images': INSTRUCTION_TASK_TWO_IMAGES
})
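The hashability point can be checked in isolation. The sketch below tries a list as a dict key, catches the resulting TypeError, and then builds a dict with hashable keys instead; the values are arbitrary.

```python
# A list is mutable and therefore unhashable, so it cannot be a dict key.
try:
    bad = {["path_one_images"]: 1}
    message = ""
except TypeError as exc:
    message = str(exc)

# Strings and tuples are hashable, so both are valid keys.
good = {
    "path_one_images": 1,
    ("path", "one"): 2,
}
```

The error is raised when the dict literal is evaluated, which is why it showed up at the line building the dict rather than at the session call.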

Mongokit How do I specify a key in a dict which is in a list as a required field?

I'm using mongokit and I have a structure similar to this in my document.
class MyDoc(Document):
    structure = {
        'sections': [{
            'title': unicode,
            'description': unicode
        }]
    }
    required_fields = []
I want to make description a required field in this document. I know nested keys can be accessed via the dot notation, but sections.description does not work. How do I achieve what I want?
You may want to try sections.$.description
http://docs.mongodb.org/manual/reference/operator/update/positional/
