I have a variable, jdata, that holds data read from a JSON data file. It consists of many levels of dictionaries and lists of dictionaries. I have a search routine that returns a tuple containing path-like information to the element I want to access. I'm struggling to turn the tuple into a variable index. For example, the search routine may return ('name', 5, 'pic', 3). So I want to access jdata['name'][5]['pic'][3]. The number of levels down into the data can change for each search, so the tuple length is variable.
Addendum:
for everyone asking for code and what I've done:
I don't have code to share because I don't know how to do it and that's why I'm asking here. My first thought was to try and create the text for accessing the variable, for the example above,
"x = jdata['name'][5]['pic'][3]"
and then looking for a python way of executing that line of code. I figured there has to be a more elegant solution.
I thought the description of tuple to variable access was pretty straightforward, but here is an expanded version of my problem.
jdata = { 'thing1': 1,
          'name': [
              {},
              {},
              {},
              {},
              {},
              { 'thing11': 1,
                'pic': [ 'LocationOfPicA',
                         'LocationOfPicB',
                         'LocationOfPicC',
                         'LocationOfPicD',
                         'LocationOfPicE'],
                'thing12': 2},
              {},
              {} ],
          'thing2': 2}
I searched for 'PicD' and my search code returns: ('name', 5, 'pic', 3)
Now I want to do some stuff, for example, accessing the value 'LocationOfPicD', copy the file located there to some other place, and update the value of 'LocationOfPicD' to the new value. All of this I can code. I just need to be able to turn the tuple into an accessible variable.
Edit: I was just reading about mutability in python. Instead of generating a path to an element in the dictionary, I think I can just assign that element value to a variable (x, for example) that gets passed back up the recursion chain of the initial search. From my understanding, I can change x and that also changes the element within the jdata variable. If that doesn't work, I can resort to using the eval() command on my generated text statement using the tuple as originally planned.
If I understand the problem correctly, you just need to avoid getting the lowest level item by value. So, you could do something like
indexes = ('name', 5, 'pic', 3)
x = jdata
for index in indexes[:-1]:
    x = x[index]
x[indexes[-1]] = <new_value_here>
Easy and quick recursive implementation.
def get_d(d, tup, ind=0):
    if ind == len(tup) - 1:  # last item: just return the value
        return d[tup[ind]]
    return get_d(d[tup[ind]], tup, ind + 1)  # else keep descending into the sub-item

# Pass the data structure (dict, list, or any indexable item) and the tuple of keys/indexes to follow
value = get_d(jdata, ('name', 5, 'pic', 3))
Note: this implementation is super basic and has no error handling. It's just here to give you an idea on how it could be done.
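The same traversal can also be written without recursion. Below is a minimal sketch, assuming every key and index in the path is valid; `get_by_path` and `set_by_path` are hypothetical helper names, and `sample` is a cut-down stand-in for jdata, not the real data.

```python
from functools import reduce
from operator import getitem


def get_by_path(data, path):
    """Follow each key/index in path and return the value found there."""
    return reduce(getitem, path, data)


def set_by_path(data, path, value):
    """Replace the value at path by assigning through its parent container."""
    parent = get_by_path(data, path[:-1])
    parent[path[-1]] = value


# Cut-down stand-in for the jdata structure in the question
sample = {'name': [{}, {'pic': ['LocationOfPicA', 'LocationOfPicB']}]}
path = ('name', 1, 'pic', 1)

old_location = get_by_path(sample, path)
set_by_path(sample, path, '/new/place/PicB')
```

The assignment has to go through the parent list or dict because strings are immutable: rebinding a variable that holds the string would not change what is stored inside the nested structure.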
Related
I'm reading a json and want to get the label field with a specific id. What I currently have is:
import json

with open("local_en.json") as json_file:
    parsed_dict = json.load(json_file)
print(parsed_dict) # works
print(parsed_dict["interface"]) # works
print(parsed_dict["interface"]["testkey"])
My json has data blocks (being "interface" or "settings") and those data blocks contain arrays.
{
"interface":[
{"id": "testkey", "label": "The interface block local worked!"},
{"id": "testkey2", "label": "The interface block local worked, AGAIN!"}
],
"settings":[
],
"popup_success":[
],
"popup_error":[
],
"popup_warning":[
],
"other_strings":[
]
}
You can "find" the element in the interface list via a list-comprehension, and fetch the label from that element. For instance:
label = [x['label'] for x in parsed_dict['interface'] if x['id'] == 'testkey'][0]
If you cannot assume that the relevant id exists, then you can wrap this in a try-except, or you can get a list of the labels and validate that it isn't of length 0, or whatever you think would work best for you.
key = 'testkey'
labels = [x['label'] for x in parsed_dict['interface'] if x['id'] == key]
assert len(labels) > 0, f"There's no matching element for key {key}"
label = labels[0] # Takes the first if there are multiple such elements in the interface array
And while you're at it, you might want to explicitly deal with there being multiple elements with the same id, etc.
Clarification regarding your error(s):
parsed_dict["interface"] is a list, so you can index it with ints (and slices and stuff, but that's beside the point), and not with strings.
Each of the list elements is a dict, with two keys: id and label, so even if you were to take a specific element, say -
el = parsed_dict["interface"][0]
you still couldn't do el['testkey'], because that's a value of the dict, not a key.
You could check if the id is the one you're looking for though, via -
if el['id'] == 'testkey':
    print('Yup, this is it')
    label = el['label']
In fact, the single line I gave above is really just shorthand for running over all the elements with a loop and doing just that...
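If a missing id should yield a fallback instead of an exception, `next()` with a default is one option. This is a small sketch; `find_label` is a hypothetical helper name, and the data is copied from the question's JSON.

```python
parsed_dict = {
    "interface": [
        {"id": "testkey", "label": "The interface block local worked!"},
        {"id": "testkey2", "label": "The interface block local worked, AGAIN!"},
    ]
}


def find_label(entries, wanted_id, default=None):
    """Return the label of the first entry whose id matches, else default."""
    return next((e["label"] for e in entries if e["id"] == wanted_id), default)


label = find_label(parsed_dict["interface"], "testkey")
missing = find_label(parsed_dict["interface"], "no_such_key")
```

The generator stops at the first match, so duplicate ids are silently ignored; whether that is acceptable depends on your data.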
You need to browse through all the values and check if it matches expected value. Because values are not guaranteed to be unique in a dictionary, you can't simply refer to them directly like you do with keys.
print([el for el in parsed_dict["interface"] if "testkey" in el.values()])
I have a problem and I want to determine whether my approach is sound. Here is the idea:
I would create a primary dict called zip_codes, in which each zip code (from a list) is the key of a nested dict. Each nested dict would have keys for "members", "offices", and "members per office".
It would look like this:
zips = {
    90219: {
        "members": 120,
        "offices": 18,
        "membersperoffice": 28
    },
    90220: {
        "members": 423,
        "offices": 37,
        "membersperoffice": 16
    }
}
and so on and so forth.
I think I need to build the nested dicts, and then process several lists against conditionals, passing resulting values into the corresponding dicts on the fly (i.e. based on how many times a zip code exists in the list).
Is using nested dictionaries the most pythonic way of doing this? Is it cumbersome? Is there a better way?
Can someone drop me a hint about how to push key values into nested dicts from a loop? I've not been able to find a good resource describing what I'm trying to do (if this is, indeed, the best path).
Thanks.
:edit: a more specific example:
determine how many instances of a zipcode are in list called membersperzip
find corresponding nested dict with same name as zipcode, inside dict called zips
pass value to corresponding key, called "members" (or whatever key)
:edit 2:
MadPhysicist requested I give code examples. I don't even know where to start with this one and I can't find examples. All I've been able to do thus far is:
area_dict = {}
area_dict = dict.fromkeys(all_areas, 0)  # make each zip code a key with an initial value of 0
dictkeys = list(area_dict.keys())
That gets me a dict with a bunch of zip codes as keys. I've discovered no way to iterate through a list and create nested dicts (yet). Hence the actual question.
Please don't dogpile me and do the usual stack overflow thing. This is not me asking anyone to do my homework. This is merely me asking someone to drop me a HINT.
:edit 3:
Ok. This is convoluted (my fault). Allow me to clarify further:
So, I have an example of what the nested dicts should look like. They'll start out empty, but I need to iterate through one of the zip code lists to create all the nested dicts... inside of zips.
This is a sample of the list that I want to use to create the nested dicts inside of the zips dict:
zips = [90272, 90049, 90401, 90402, 90403, 90404, 90291, 90292, 90290, 90094, 90066, 90025, 90064, 90073]
And this is what I want it to look like
zips {
90272: {
"members": ,
"offices": ,
"membersperoffice":
},
90049: {
"members": ,
"offices": ,
"membersperoffice":
}
}
....
etc, etc. ( creating a corresponding nested dict for each zipcode in the list)
After I achieve this, I have to iterate through several more zip code lists... and those would spit out the number of times a zip code appears in a given list, and then find the dict corresponding to the zip code in question, and append that value to the relevant key.
Once I figure out the first part, I can figure this second part out on my own.
Thanks again. Sorry for any confusion.
You can do something like this:
def code_members(zipcode):
    if zipcode == 90219:
        return dict(members=120, offices=18, membersperoffice=28)
    return dict(members=423, offices=37, membersperoffice=16)

# The function must be defined before the comprehension that calls it
all_areas = [90219, 90220]
zips = {zipcode: code_members(zipcode) for zipcode in all_areas}
I think I need to build the nested dicts, and then process several
lists against conditionals, passing resulting values into the
corresponding dicts on the fly (i.e. based on how many times a zip
code exists in the list).
Using the above approach, if a zipcode appears multiple times in the all_areas list, the resulting zips dictionary will only contain one entry for that zipcode.
Is using nested dictionaries the most pythonic way of doing this? Is
it cumbersome? Is there a better way?
May I suggest making a simple object that represents the value of each zipcode. Something simple like:
Using dataclass:
import dataclasses

@dataclasses.dataclass
class ZipProperties:
    members: int
    offices: int
    membersperoffice: int
Using named tuple:
import collections

ZipProperties = collections.namedtuple('ZipProperties', ['members', 'offices', 'membersperoffice'])
You can then change the code_members function to this:
def code_members(zipcode):
    if zipcode == 90219:
        return ZipProperties(120, 18, 28)
    return ZipProperties(423, 37, 16)
Addressing your concrete example:
determine how many instances of a zipcode are in list called membersperzip
find corresponding nested dict with same name as zipcode, inside dict called zips
pass value to corresponding key, called "members" (or whatever key)
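import typing

# Each pair is (zip code, member count)
membersperzip: typing.List[typing.Tuple[int, int]] = [(90219, 54)]
for zipcode, members in membersperzip:
    if zipcode in zips:
        # Works with the dataclass version; a namedtuple is immutable, so there use
        # zips[zipcode] = zips[zipcode]._replace(members=members)
        zips[zipcode].members = members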
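For the counting step itself, `collections.Counter` from the standard library may be enough. This is a rough sketch using the plain nested dicts from the question. It assumes membersperzip is a flat list with one zip code per member (the question leaves its exact shape open) and that the nested dicts start out with None placeholders.

```python
from collections import Counter

all_areas = [90272, 90049, 90401]
# Assumed shape: one entry per member, so repeats mean more members
membersperzip = [90272, 90049, 90272, 90401, 90272]

# One nested dict per zip code, with placeholder values
zips = {
    zipcode: {"members": None, "offices": None, "membersperoffice": None}
    for zipcode in all_areas
}

# Count how often each zip code occurs, then store the count
counts = Counter(membersperzip)
for zipcode, count in counts.items():
    if zipcode in zips:  # ignore zip codes we are not tracking
        zips[zipcode]["members"] = count
```

Looking the zip code up directly in the dict avoids scanning every entry for each count.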
I would suggest adding each entry when you have the actual value, instead of initializing the dictionary with empty values for each key. You have a list of keys, and I do not see why you want to put all of them into the dictionary without having values in the first place.
zips = [90272, 90049, 90401, 90402, 90403, 90404, 90291, 90292, 90290, 90094, 90066, 90025, 90064, 90073]
zips_dict = {}
for a_zip in zips:
    if a_zip not in zips_dict:
        # Initialize proper value here for members etc.
        zips_dict[a_zip] = proper_value
If you insist on initializing the dict with an empty value for each key, you could use this. It still iterates through the list, but as a dict comprehension.
zips = [90272, 90049, 90401, 90402, 90403, 90404, 90291, 90292, 90290, 90094, 90066, 90025, 90064, 90073]
zips_dict = {
    x: {
        "members": None,
        "offices": None,
        "membersperoffice": None,
    } for x in zips
}
Hope this helps
I have a list of dictionaries in union_dicts. To give you an idea it's structured as follows
union_dicts = [{'bla' : 6, 'blub': 9}, {'lub': 20, 'pul':12}]
(The actual lists of dicts is many times longer, but this is to give the idea)
For this particular list of dictionaries I want to make a wordcloud. The function that makes a wordcloud is as follows (nothing wrong with this one):
def make_words(words):
return ' '.join([('<font size="%d">%s</font>'%(min(1+words[x]*5/max(words.values()), 5), x)) for x in words])
Now I have written the following code that should give every dictionary back. Return gives only the first dictionary back in the following function below:
def bupol():
    for element in union_dicts:
        return HTML(make_words(element))
bupol()
I have already tried to simply print it out, but then I simply get ''Ipython display object'' and not the actual display. I do want the display. Yield also doesn't work on this function, and for some reason building a list with list = [] and list.append(), then returning the list instead of returning in the current way, also doesn't work. I'm quite clueless as to how to properly iterate over this so that I get a display of every single dictionary inside union_dicts, which is a list of dictionaries.
How about something like this?
def bupol():
    result = []
    for element in union_dicts:
        result.append(HTML(make_words(element)))
    return result
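Returning a list works as long as the caller displays each item. Another option, sketched here on the assumption that one combined display is acceptable, is to join the generated markup first and wrap it in a single HTML object at the end. `all_clouds` is a hypothetical name, and the `<br>` separator is just one possible choice.

```python
union_dicts = [{'bla': 6, 'blub': 9}, {'lub': 20, 'pul': 12}]


def make_words(words):
    # Same function as in the question
    return ' '.join([('<font size="%d">%s</font>' % (min(1 + words[x] * 5 / max(words.values()), 5), x)) for x in words])


def all_clouds(dicts):
    """Build one HTML string holding a word cloud per dictionary."""
    return '<br>'.join(make_words(d) for d in dicts)


markup = all_clouds(union_dicts)
# In a notebook: from IPython.display import HTML; HTML(markup)
```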
I have a PHP background and am fairly new to Python. I am creating a helper class which is used to return some LDAP results from an LDAP server.
The standard result from Python's ldap library, ldap.search_s(), is a list of tuples with dictionaries inside, which in turn have lists inside, i.e.:
[('uid', {'cn': ['cnvalue']}), ('uid2', {'cn': ['cnvalue2']})]
I would like to convert this into a simple list of dictionaries, or in PHP talk, an array of associative arrays.
For the life of me I cannot figure out how to do this.
This is how I have attempted it:
output = []
for i, result in enumerate(results):
    d = {
        'firstname': result[1]['givenName'][1],
        'phone': result[1]['telephoneNumber'][1]
    }
    output.append(d)
return output
If telephoneNumber does not exist for an entry in LDAP, the Python library does not populate that key, so sometimes I was running into invalid key exceptions. I therefore modified the code to the below.
output = []
for i, result in enumerate(results):
    d = {
        'firstname': result[1].get('givenName', '')[1],
        'phone': result[1].get('telephoneNumber', '')[1]
    }
    output.append(d)
return output
Even so if a telephoneNumber does not exist then neither does the list entry [1], and so now I am running into "out of range" errors.
Help.
Thanks all.
result[1].get('givenName', '')
This will return either a list (if 'givenName' is a key in the dictionary whose value is a list) or an empty string (if 'givenName' is not a key).
You will then apply [1] to the result. Presumably you mean [0], but even so an empty string has length 0. You can't look up either index 0 or index 1 in it.
You could instead do:
result[1].get('givenName', [''])[0]
Now in the case where the key is missing you have a list containing an empty string. Therefore applying [0] to it gives you an empty string.
One alternative (which doesn't deal with an empty list, but I don't know whether or not that ever occurs in the data you're handling):
def getfirst(attrs, key):
    list_or_none = attrs.get(key)
    return '' if list_or_none is None else list_or_none[0]
Then:
d = {
    'firstname': getfirst(result[1], 'givenName'),
}
Btw, you use enumerate but then never actually mention i. So you could just as well write for result in results:
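Putting those pieces together, the whole conversion might look like the sketch below. It is a variant of getfirst that also tolerates an empty list, which is an assumption about data that may never occur. The sample `results` are made-up values in the shape shown in the question.

```python
def getfirst(attrs, key, default=''):
    """Return the first value under key, or default if the key is missing or its list is empty."""
    values = attrs.get(key) or []
    return values[0] if values else default


# Made-up sample in the (dn, attributes) shape from the question
results = [
    ('uid', {'givenName': ['Ann'], 'telephoneNumber': ['555-0100']}),
    ('uid2', {'givenName': ['Bob']}),  # no telephoneNumber attribute
]

output = [
    {
        'firstname': getfirst(attrs, 'givenName'),
        'phone': getfirst(attrs, 'telephoneNumber'),
    }
    for _dn, attrs in results
]
```

Unpacking each tuple as `_dn, attrs` avoids the repeated `result[1]` lookups.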
I've tried my best to figure this out, but I can't for the life of me. I have a dictionary with many different values in it, including another dictionary. Before setting the values for the dictionary within the dictionary, I try to set the values equal to a "blank" dictionary such that in subsequent steps I can update it.
The short story is: I have two lines that somehow are changing a dictionary that I wouldn't expect. Given some dicts:
blankresiduedict = {stuff:[stuff], stuff:[stuff]}; blankidentifiers = {stuff:stuff, stuff:stuff}
the lines
self.pdb_chain_dict['1MH9_B'] = (blankresiduedict.copy(),blankidentifiers.copy())
self.pdb_chain_dict['1MH9_B'][0][(stuff)][0] = residuedict[('A','D','41')]
are somehow changing the values of blankresiduedict to be equal to residuedict.
Any idea how this is happening? There is literally no other reference to blankresiduedict in that section of code, and when I look at the output, blankresiduedict starts out accurate and then with each loop keeps changing value to equal whatever residuedict was for that loop.
(Below is a more detailed description)
This is a small part of a very large project, so some of this may really be hard to represent in a compact form. I'll do my best to eliminate the unnecessary stuff. This is a method within a class that I am trying to use to update the dictionary for the class instance.
blankresiduedict = {}
blankidentifiers = {}
self.allowmultiples = True
self.ancestorline = [
    '1MH9',
    'A', 'D', '41',
    'A', 'D', '43',
    'A', 'T', '130',
    #etc...
]
self.no_key_residues = 6
self.pdb_chain_dict = {
    '1MH9_B': (
        {
            ('A','D','41'): [('B','D','41')],
            ('A','D','43'): [('B','D','43')],
            ('A','T','130'): [('B','T','130')]
        },
        #{identifiers dictionary}
    ),
    '1MH9_C': (
        #{etc},{etc}
    ),
    # etc...
}
for i in range(1, (3*self.no_key_residues)+1, 3): # Using this loop structure allows a variable number of key residues to be given
    if not self.allowmultiples:
        raise Exception("Do some stuff here")
    else:
        blankresiduedict[(self.ancestorline[i],self.ancestorline[i+1],self.ancestorline[i+2])] = [('-','-','-')]
blankidentifiers = {'EC Num':'-','Sprot':'-','Class':'-','Keywords':'-','Title':'-','SeqRepr':'-'}
### Begin some loop structure, where for every loop, the following is basically happening
residuedict = {
    ('A','D','41'): ('B','D','10'),
    ('A','D','43'): ('B','D','12')
} #in actuality this value would change for every loop, but just showing what a typical loop would look like
self.pdb_chain_dict['1MH9_B'] = (blankresiduedict.copy(),blankidentifiers.copy())
self.pdb_chain_dict['1MH9_B'][0][('A','D','41')][0] = residuedict[('A','D','41')]
What should happen here is that the value in pdb_chain_dict is set to the tuple of two blank dictionaries ({residuedict},{identifiers}). I'm mostly leaving the identifier dictionary alone in this example because it has the exact same problem. However, what I'm finding is that blankresiduedict is actually changing. And, after doing a lot of testing, the line where it is changing is self.pdb_chain_dict['1MH9_B'][0][('A','D','41')][0] = residuedict[('A','D','41')].
This makes no sense to me... blankresiduedict is not even involved, yet somehow its value is being changed in that step.
That's because a copy of a dictionary is not a deep copy, and your dict values are lists, which are mutable. Here's a minimal example that reproduces your issue:
d1 = {"foo": [1, 2, 3]}
d2 = d1.copy()
# Add a new element to d2 to show that the copy worked
d2["bar"] = []
# The two dicts are different.
print(d1)
print(d2)
# However, the list wasn't copied
# it's the same object that shows up in 2 different dicts
print(d1["foo"] is d2["foo"])
# So that's what happens in your code: you're mutating the list.
d1["foo"].append(5)
print(d2["foo"])
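One way around this is `copy.deepcopy` from the standard library, which copies the nested lists as well. This is a minimal sketch with a one-entry template standing in for the real blankresiduedict.

```python
import copy

# One-entry stand-in for the template dict in the question
blankresiduedict = {('A', 'D', '41'): [('-', '-', '-')]}

shallow = blankresiduedict.copy()
deep = copy.deepcopy(blankresiduedict)

# Mutating the list inside the deep copy leaves the template alone
deep[('A', 'D', '41')][0] = ('B', 'D', '10')
template_after_deep = blankresiduedict[('A', 'D', '41')][0]

# The shallow copy still shares its list object with the template
shares_list = shallow[('A', 'D', '41')] is blankresiduedict[('A', 'D', '41')]
```

In the question's code that would mean building the tuple from deep copies of both blank dicts, at the cost of copying every nested list on each loop.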