Flatten nested dictionary - convert list element to string - python

I have a nested dictionary where the lowest level consists of a list with one element each. I want to change this level from list to string.
Assume I have a dictionary such as this:
dict = {id1:{'key11':['value11'],'key12':['value12']}, id2:{'key21':['value21'],'key22':['value22']}}
How can I get:
dict = {id1:{'key11': 'value11','key12':'value12'}, id2:{'key21':'value21','key22':'value22'}}
Additional question:
How does the solution change if the keys and values do not follow a certain logic but each element is unique and you have many of them; such as in the below example:
dictionary = {'ida':{'abc':['def'],'fgh':['ijk'] (...)}, 'idb':{'lmn':['opq'],'rst':['uvw']} (...)}
Thank you!!
Note:
I get this structure because I am using a list/map structure earlier in the code to extract text from an XML file, which yields list values.
get_text = lambda x: x.text
content = [list(map(get_text, i)) for i in content]

This works:
dictionary = {'id1':{'key11':['value11'],'key12':['value12']}, 'id2':{'key21':['value21'],'key22':['value22']}}
new_dict = {key: {key1:value1[0] for key1, value1 in value.items()} for key, value in dictionary.items()}
new_dict
#{'id1': {'key11': 'value11', 'key12': 'value12'},
# 'id2': {'key21': 'value21', 'key22': 'value22'}}
Also, I would avoid using built-in names like dict as variable names.
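If the nesting depth is not fixed, the same idea can be written recursively. This is only a sketch and not part of the answer above: the function name unwrap_singletons is made up for illustration, and it assumes that only one-element lists should be unwrapped while longer lists are left alone.

```python
def unwrap_singletons(obj):
    """Recursively replace one-element lists with their only element."""
    if isinstance(obj, dict):
        # Rebuild the dictionary, unwrapping each value in turn.
        return {key: unwrap_singletons(value) for key, value in obj.items()}
    if isinstance(obj, list) and len(obj) == 1:
        return unwrap_singletons(obj[0])
    # Anything else (strings, longer lists, numbers) is returned unchanged.
    return obj


dictionary = {
    'ida': {'abc': ['def'], 'fgh': ['ijk']},
    'idb': {'lmn': ['opq'], 'rst': ['uvw']},
}
new_dict = unwrap_singletons(dictionary)
print(new_dict)
```

Because the function never looks at the key names, it behaves the same whether or not the keys follow a pattern.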


Python: Select specific columns in json.dumps()

I need to select specific columns from a python dictionary using json.dumps().
Eg.
dict={"Greet":"Hello","Bike":"Yamaha","Car":"Jaguar"}
r=json.dumps(Only want "Bike":"Yamaha","Car":"Jaguar")
Note: I cannot store this in another dictionary and use that instead, as I also want to use the first key-value pair in my code.
Create a new dictionary and dump that.
d={"Greet":"Hello","Bike":"Yamaha","Car":"Jaguar"}
r = json.dumps({"Bike": d["Bike"], "Car": d["Car"]})
If you have a list of all the keys you want to keep, you can use a dictionary comprehension:
d={"Greet":"Hello","Bike":"Yamaha","Car":"Jaguar"}
keep = ['Bike', 'Car']
r = json.dumps({key: d[key] for key in keep})
If you have a list of the keys you want to omit, you can also use a dictionary comprehension:
d={"Greet":"Hello","Bike":"Yamaha","Car":"Jaguar"}
skip = ['Greet']
r = json.dumps({key: val for key, val in d.items() if key not in skip})
BTW, don't use dict as a variable name, it's already the name of a built-in function/class.
final = list(mydict.items())[1:]  # slice out the (key, value) tuples you need; relies on insertion order (Python 3.7+)
r = json.dumps(dict(final))  # cast back to a dictionary and dump
Output
{"Bike": "Yamaha", "Car": "Jaguar"}
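Putting the comprehension approach together as a runnable sketch (the `if key in d` guard is an addition for illustration, so that a key missing from d is skipped instead of raising KeyError):

```python
import json

d = {"Greet": "Hello", "Bike": "Yamaha", "Car": "Jaguar"}
keep = ["Bike", "Car"]

# Build a temporary dictionary with only the wanted keys; d itself is not modified.
r = json.dumps({key: d[key] for key in keep if key in d})
print(r)

# The first pair is still available in the original dictionary.
print(d["Greet"])
```

The temporary dictionary exists only for the json.dumps call, so the original dictionary can still be used afterwards.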

python3 list comprehension using a dictionary doesn't always return the same list

EDIT:
As requested, here is the problem I am trying to solve:
I have files in a directory that do not have extensions. Based on the output of the "file" command, I want to assign the corresponding extension. So my dictionary maps strings that may appear in this output to extensions (e.g. "ASCII": "txt"), but I can't know whether a key will match the exact output. For example:
$ file my_file
my_file: ASCII text
# I want to rename my_file to my_file.txt
That is what the code I wrote was designed to do, but maybe there are better solutions.
I have a dictionary of strings whose keys share common substrings:
MY_D = {
"first_k": "first_v",
"sec_k": "sec_v",
"some_key_name_that_includes_others": "value1",
"some_other_key": "value2",
...
"last_k": "last_v"
}
Sometimes, I have to fetch a value, if the search word is within the key. I use this code:
key_I_am_looking_for = "some"
value = list(v for k, v in MY_D.items() if key_I_am_looking_for in k)[-1]
But when the search word appears in several keys, the value I get is not always the same; in this example it can be either "value1" or "value2".
I noticed that the returned list is not always ordered the same way.
Is there a way I can make this return always the value that corresponds to the longest key matched (here would be "value1")?
Before Python 3.7, dicts are unordered by specification, meaning code should not rely on the internal order of a dictionary.
The workaround is simply to sort the output of your dict iteration. In your snippet, that means replacing the call to list with a call to sorted, which makes the result deterministic (note that this orders the matches by value, not by key length):
value = sorted(v for k, v in MY_D.items() if key_I_am_looking_for in k)[-1]
To get the value for the longest key among all matches, sorted with a lambda may help.
The code below collects each match as a [key, value] list, then uses sorted with key=lambda... on the length of the first item in each list (the saved key name), so the matches are ordered from shortest key name to longest. The last item of the last list is then the value for the longest matching key, here value1.
MY_D = {
"first_k": "first_v",
"sec_k": "sec_v",
"some_key_name_that_includes_others": "value1",
"some_other_key": "value2",
"last_k": "last_v"
}
key_I_am_looking_for = "some"
# List of lists of matches.
value = list([k, v] for k, v in MY_D.items() if key_I_am_looking_for in k)
# Sort by index 0 of each inner list.
key_len = sorted(value, key=lambda l: len(l[0]))
# Last list is longest index 0 so get last item of that list.
print('longest:', key_len[-1][-1])
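The same result can be obtained without sorting, by asking max for the longest matching key directly. This is a sketch of an alternative, not the code from the answers above; the helper name value_for_longest_key is made up for illustration, and it returns None when nothing matches.

```python
MY_D = {
    "first_k": "first_v",
    "sec_k": "sec_v",
    "some_key_name_that_includes_others": "value1",
    "some_other_key": "value2",
    "last_k": "last_v",
}


def value_for_longest_key(mapping, fragment):
    """Return the value whose key contains fragment and is longest, or None."""
    matches = [key for key in mapping if fragment in key]
    if not matches:
        return None
    # max with key=len picks the longest key; on a tie, the first one found wins.
    return mapping[max(matches, key=len)]


print(value_for_longest_key(MY_D, "some"))
```

Since the choice depends only on key length, the result does not depend on the order in which the dictionary is iterated, except when two matching keys have the same length.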

How to convert a list of tuples containing two lists into dictionary of key value pairs?

I have a list like this-
send_recv_pairs = [(['produce_send'], ['consume_recv']), (['Send'], ['Recv']), (['sender2'], ['receiver2'])]
I want something like
[ {['produce_send']:['consume_recv']},{['Send']:['Recv']},{['sender2']:['receiver2']} ]
How to do this?
You cannot use a list as a dictionary key.
This article explains the concept:
https://wiki.python.org/moin/DictionaryKeys
To be used as a dictionary key, an object must support the hash function (e.g. through __hash__), equality comparison (e.g. through __eq__ or __cmp__), and must satisfy the correctness condition above.
And
lists do not provide a valid __hash__ method.
>>> d = {['a']: 1}
TypeError: unhashable type: 'list'
If you specifically want to keep the keys and values wrapped, you can use tuples, as they are hashable:
{ (i[0][0], ): (i[1][0], ) for i in send_recv_pairs}
{('Send',): ('Recv',),
('produce_send',): ('consume_recv',),
('sender2',): ('receiver2',)}
You can't have lists as keys, only hashable types - strings, numbers, None and such.
If you still want to use a dictionary knowing that, then:
d = {}
for tup in send_recv_pairs:
    d[tup[0][0]] = tup[1]
If you want the value to be a string as well, use tup[1][0] instead of tup[1].
As a one-liner:
d = {tup[0][0]: tup[1] for tup in send_recv_pairs}  # tup[1][0] if you want values as strings
You can check it here, in the second way of creating a dictionary:
https://developmentality.wordpress.com/2012/03/30/three-ways-of-creating-dictionaries-in-python/
A simple way of doing it:
First of all, each of your tuples is a tuple of lists, so it would be better to change it to a tuple of strings (that makes more sense, I guess).
Anyway, a simple way of working with your current list of tuples can look like this:
mydict = {}
for i in send_recv_pairs:
    print(i)
    mydict[i[0][0]] = i[1][0]
As others pointed out, you cannot use a list as a dictionary key. So the expression i[0][0] first takes the first element of the tuple, which is a list, and then the first element of that list, which is its only element in your case.
Do you mean like this?
send_recv_pairs = [(['produce_send'], ['consume_recv']),
(['Send'], ['Recv']),
(['sender2'], ['receiver2'])]
send_recv_dict = {e[0][0]: e[1][0] for e in send_recv_pairs}
Resulting in...
>>> {'produce_send': 'consume_recv', 'Send': 'Recv', 'sender2': 'receiver2'}
As mentioned in other answers, you cannot use a list as a dictionary key as it is not hashable (see links in other answers).
You can therefore just use the values in your lists (assuming they stay as simple as in your example) to create the following two possibilities:
send_recv_pairs = [(['produce_send'], ['consume_recv']), (['Send'], ['Recv']), (['sender2'], ['receiver2'])]
result1 = {}
for t in send_recv_pairs:
    result1[t[0][0]] = t[1]
# without any lists
result2 = {}
for t in send_recv_pairs:
    result2[t[0][0]] = t[1][0]
Which respectively gives:
>>> result1
{'produce_send': ['consume_recv'], 'Send': ['Recv'], 'sender2': ['receiver2']}
>>> result2
{'produce_send': 'consume_recv', 'Send': 'Recv', 'sender2': 'receiver2'}
Try it like this:
res = {x[0][0]: x[1] for x in send_recv_pairs}  # or x[0][0]: x[1][0] if you want to store the inner values without the list wrapper
This is for Python 3 and works when the keys are unique. If you need to collect a list of values per key instead of a single value, then you may use something like itertools.groupby or map+reduce. I wrote about this in the comments and will provide an example.
And yes, a list cannot store key-value pairs, only dicts can, but maybe that is just a typo in the question.
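For the case where a key repeats, a sketch with itertools.groupby could look like the following. The sample data is hypothetical (the list in the question has no repeated keys), and the pairs are sorted first because groupby only groups adjacent items.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input in which 'Send' appears twice.
pairs = [(['Send'], ['Recv']), (['sender2'], ['receiver2']), (['Send'], ['Recv2'])]

# Unwrap the single-element lists and sort so that equal keys sit next to each other.
flat = sorted((k[0], v[0]) for k, v in pairs)

# Collect every value that shares a key into one list.
grouped = {key: [value for _, value in group]
           for key, group in groupby(flat, key=itemgetter(0))}
print(grouped)
```

If sorting is undesirable, collections.defaultdict(list) gives the same grouping while keeping the values in their original order.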
You cannot use a list as a dictionary key, but you can convert it to a tuple to create the dict object.
Below is an example using a dictionary comprehension:
>>> send_recv_pairs = [(['produce_send'], ['consume_recv']), (['Send'], ['Recv']), (['sender2'], ['receiver2'])]
>>> {tuple(k): v for k, v in send_recv_pairs}
{('sender2',): ['receiver2'], ('produce_send',): ['consume_recv'], ('Send',): ['Recv']}
For details, take a look at: Why can't I use a list as a dict key in python?
However, if the elements of your tuples were not lists but any other hashable objects, you could pass the list of pairs directly to dict to get the desired result. For example:
>>> my_list = [('key1', 'value1'), ('key2', 'value2')]
>>> dict(my_list)
{'key1': 'value1', 'key2': 'value2'}

Append value to list in dictionary with existing or non-existing key in Python

I have a for loop that goes through two lists and combines them into a dictionary. Keys are strings (web page headers) and values are lists (containing links).
Sometimes the loop produces a key that already exists in the dictionary, which is fine. But the value is different (new links), and I'd like to update the key's value by appending the new links instead of replacing the old ones.
The code looks something like the snippet below. Note: issue_links is a list of URLs.
for index, link in enumerate(issue_links):
    issue_soup = BeautifulSoup(urllib2.urlopen(link))
    image_list = []
    for image in issue_soup.findAll('div', 'mags_thumb_article'):
        issue_name = issue_soup.findAll('h1','top')[0].text
        image_list.append(the_url + image.a['href'])
    download_list[issue_name] = image_list
Currently the new links (image_list) that belong to the same issue_name key get overwritten. I'd like instead to append them. Someone told me to use collections.defaultdict module but I'm not familiar with it.
Note: I'm using enumerate because the index gets printed to the console (not included in the code).
Something like this:
from collections import defaultdict
d = defaultdict(list)
d["a"].append(1)
d["a"].append(2)
d["b"].append(3)
Then:
print(d)
defaultdict(<class 'list'>, {'b': [3], 'a': [1, 2]})
if issue_name in download_list:
    download_list[issue_name].extend(image_list)
else:
    download_list[issue_name] = image_list
Is this right? If the key already exists, extend its list with the new links; otherwise store the list under the new key.
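Applied to the scraping loop from the question, the defaultdict approach might look like this sketch. The scraped list is made-up sample data standing in for the pages fetched with BeautifulSoup, so that the example runs without network access.

```python
from collections import defaultdict

# Hypothetical (issue_name, image_list) results, one entry per fetched page.
scraped = [
    ("Issue 1", ["img1.jpg", "img2.jpg"]),
    ("Issue 2", ["img3.jpg"]),
    ("Issue 1", ["img4.jpg"]),
]

download_list = defaultdict(list)
for issue_name, image_list in scraped:
    # A missing key starts as an empty list, so extend works for new and existing keys.
    download_list[issue_name].extend(image_list)

print(dict(download_list))
```

Using extend keeps each value a flat list of links; append would nest one list inside another.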

Pandas Dataframe to Dictionary with Multiple Keys

I am currently working with a dataframe consisting of a column of 13 letter strings ('13mer') paired with ID codes ('Accession') as such:
However, I would like to create a dictionary in which the Accession codes are the keys with values being the 13mers associated with the accession so that it looks as follows:
{'JO2176': ['IGY....', 'QLG...', 'ESS...', ...],
'CYO21709': ['IGY...', 'TVL...',.............],
...}
Which I've accomplished using this code:
Accession_13mers = {}
for group in grouped:
    Accession_13mers[group[0]] = []
    for item in group[1].items():  # Series.iteritems() was removed in pandas 2.0
        Accession_13mers[group[0]].append(item[1])
However, now I would like to go back and iterate over the 13mers for each Accession code and run a function I've defined, find_match_position(reference_sequence, 13mer), which finds the 13mer in a reference sequence and returns its position. I would then like to store the position as the value, with the 13mer as the key.
If anyone has any ideas for how I can expedite this process that would be extremely helpful.
Thanks,
Justin
I would suggest creating a new dictionary whose values are themselves dictionaries. Essentially, a nested dictionary.
position_nmers = {}
for key in H1_Access_13mers:
    position_nmers[key] = {}  # replicate each key in the new dictionary, with a dictionary as its value
    for value in H1_Access_13mers[key]:
        position_nmers[key][value] = find_match_position(reference_sequence, value)  # the function from the question
To introspect the dictionary and make sure it's okay:
print(position_nmers)
You can iterate over the groupby more cleanly by unpacking:
d = {}
for key, s in df.groupby('Accession')['13mer']:
    d[key] = list(s)
This also makes it much clearer where you should put your function!
... However, I think that it might be better suited to an enumerate:
d2 = {}
for pos, val in enumerate(df['13mer']):
    d2[val] = pos
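To make the nested-dictionary idea concrete without pandas, here is a small self-contained sketch. Everything in it is hypothetical: the accession codes, the sequences, and in particular find_match_position, which is only a stand-in for the asker's function and uses str.find.

```python
def find_match_position(reference_sequence, mer):
    """Stand-in for the asker's function: index of mer in the reference, or -1."""
    return reference_sequence.find(mer)


# Made-up reference sequence and accession-to-13mers mapping.
reference_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
accession_13mers = {
    "ACC1": ["MKTAYIAKQRQIS", "KTAYIAKQRQISF"],
    "ACC2": ["AYIAKQRQISFVK"],
}

# For every accession, map each 13mer to its position in the reference.
position_nmers = {
    accession: {mer: find_match_position(reference_sequence, mer) for mer in mers}
    for accession, mers in accession_13mers.items()
}
print(position_nmers)
```

The outer comprehension keeps the accession codes as keys, and the inner one turns each list of 13mers into a dictionary of positions.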
