I'm trying to convert all values of some dictionaries that are nested in another dictionary.
I want to convert:
{0: {'n': 1}, 1: {'s': 0, 'n': 2}, 2: {'s': 1}}
To this:
{0: {'n': '?'}, 1: {'s': '?', 'n': '?'}, 2: {'s': '?'}}
I tried this:
for key, value in new_dictt:
    new_dictt[key][value] = '?'
But it did not work. I've been googling but have not found a way to convert all values of all dictionaries within another dictionary.
Here we go:
old_dict = {0: {'n': 1}, 1: {'s': 0, 'n': 2}, 2: {'s': 1}}
new_dict = {key: {k: '?' for k in dct} for key, dct in old_dict.items()}
print(new_dict)
Which yields
{0: {'n': '?'}, 1: {'s': '?', 'n': '?'}, 2: {'s': '?'}}
This uses two nested dict comprehensions.
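If you would rather fix the loop from the question and change the dictionary in place, a minimal sketch could look like the one below. It assumes, as in the question, that every outer value is itself a dictionary; the name inner is just illustrative. The original loop failed because iterating over a dict yields only its keys, so there was nothing to unpack into key, value.

```python
new_dictt = {0: {'n': 1}, 1: {'s': 0, 'n': 2}, 2: {'s': 1}}

# Walk over the inner dictionaries and overwrite each value in place.
# Reassigning existing keys does not change the dict's size, so this is
# safe to do while iterating over it.
for inner in new_dictt.values():
    for k in inner:
        inner[k] = '?'

print(new_dictt)
```

Unlike the comprehension, this mutates new_dictt itself instead of building a new dictionary.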
In a given list:
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
Find all pairs of dictionaries that share the same key and print each pair; if a dictionary has no partner, print that dictionary on its own.
What I have written so far sort of works, but it keeps testing items of the list that have already been matched. I'm not sure how to fix it.
for i in range(len(unmatched_items_array)):
    for j in range(i + 1, len(unmatched_items_array)):
        # when keys are the same print matching dictionary pairs
        if unmatched_items_array[i].keys() == unmatched_items_array[j].keys():
            print(unmatched_items_array[i], unmatched_items_array[j])
            break
    # when no matching pairs print currently processed dictionary
    print(unmatched_items_array[i])
Output:
{'c': 45} {'c': 35}
{'c': 45}
{'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
{'a': 3.2}
{'a': 3}
What the output should be:
{'c': 45} {'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
What am I doing wrong here?
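You can group the dictionaries by key using collections.defaultdict.
Example:
from collections import defaultdict
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
result = defaultdict(list)
for item in unmatched_items_array:
    # Each dictionary has a single key. (i.items()[0] only works in Python 2.)
    key = next(iter(item))
    result[key].append(item)  # Group by key.
for group in result.values():  # Print the result.
    print(group)
Output (Python 3.7+, where dictionaries keep insertion order):
[{'c': 45}, {'c': 35}]
[{'d': 5}]
[{'a': 3.2}, {'a': 3}]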
With itertools.groupby:
from itertools import groupby
unmatched_items_array = [{'d': 5}, {'c': 35}, {'a': 3}, {'a': 3.2}, {'c': 45}]
for v, g in groupby(sorted(unmatched_items_array, key=lambda k: tuple(k.keys())), lambda k: tuple(k.keys())):
    print([*g])
Prints:
[{'a': 3}, {'a': 3.2}]
[{'c': 35}, {'c': 45}]
[{'d': 5}]
EDIT: If the items in your list are already ordered so that dictionaries with the same keys sit next to each other (for example, sorted by key), you can skip the sorted() call:
for v, g in groupby(unmatched_items_array, lambda k: tuple(k.keys())):
    print([*g])
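As for what goes wrong in the original loop: the final print runs for every item whether or not a pair was found, and an item that was already printed as the second half of a pair is visited again later. A hedged sketch of a fix that keeps the nested-loop approach is below; pair_up and used are illustrative names, not anything from the question.

```python
def pair_up(items):
    """Pair each dict with the next unused dict that has the same keys."""
    used = set()  # indexes that already appeared in a pair
    rows = []
    for i, current in enumerate(items):
        if i in used:
            continue  # already reported as the second half of a pair
        for j in range(i + 1, len(items)):
            if j not in used and current.keys() == items[j].keys():
                used.update((i, j))
                rows.append((current, items[j]))
                break
        else:
            # The inner loop ended without break: no partner was found.
            rows.append((current,))
    return rows


unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
for row in pair_up(unmatched_items_array):
    print(*row)
```

The for/else clause only runs when the inner loop finishes without hitting break, which is what limits the single-dictionary output to items that really have no partner.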
I am pretty sure that I must be missing something really basic, but is there a way to map the keys of a dictionary using another one?
For instance given a dictionary like this:
d = {'a': {'b': 'r1', 'c': 'r2'}, 'v': {'x': 'r4', 'o': 'r2'}}
and a mapper like this:
mapper = {'a': 0, 'b': 1, 'c': 2, 'v': 3, 'x': 4, 'o': 5}
The expected output should be like this:
result = {0: {1: 'r1', 2: 'r2'}, 3: {4: 'r4', 5: 'r2'}}
You can use a function that recursively replaces keys with corresponding values in the mapper dict:
def map_keys(d, m):
    return {m[k]: map_keys(v, m) for k, v in d.items()} if isinstance(d, dict) else d
so that map_keys(d, mapper) returns:
{0: {1: 'r1', 2: 'r2'}, 3: {4: 'r4', 5: 'r2'}}
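Note that m[k] raises a KeyError when a key has no entry in the mapper. If that can happen in your data, one possible variant keeps such keys unchanged by using dict.get. This is only a sketch; map_keys_lenient is a hypothetical name and the sample data below is made up for illustration.

```python
def map_keys_lenient(d, m):
    """Recursively rename keys via m, leaving unmapped keys as they are."""
    if not isinstance(d, dict):
        return d  # leaf value: nothing to rename
    return {m.get(k, k): map_keys_lenient(v, m) for k, v in d.items()}


d = {'a': {'b': 'r1', 'zz': 'r2'}}
mapper = {'a': 0, 'b': 1}  # deliberately has no entry for 'zz'
result = map_keys_lenient(d, mapper)
print(result)  # {0: {1: 'r1', 'zz': 'r2'}}
```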
This version is not recursive, so it only works for input with exactly two levels of nesting like yours (if that is fine for you):
d = {'a': {'b': 'r1', 'c': 'r2'}, 'v': {'x': 'r4', 'o': 'r2'}}
mapper = {'a': 0, 'b': 1, 'c': 2, 'v': 3, 'x': 4, 'o': 5}
res = {}
for k, v in d.items():
    res.update({mapper[k]: {mapper[x]: y for x, y in v.items()}})
print(res)
# {0: {1: 'r1', 2: 'r2'}, 3: {4: 'r4', 5: 'r2'}}
Shorter:
res = {mapper[k]: {mapper[x]: y for x, y in v.items()} for k, v in d.items()}
I have a dictionary of dictionaries like this small example:
dict = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518}, 2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
I want to make another dictionary of dictionaries in which, instead of absolute numbers, I would have the frequency of every number within its sub-dictionary. For example, for the first inner dictionary I would have the following sub-dictionary:
1: {'A': 31.25, 'C': 21, 'T': 27.75, 'G': 20}
here is the expected output:
dict2 = {1: {'A': 31.25, 'C': 21, 'T': 27.75, 'G': 20}, 2: {'A': 32.5, 'C': 22.50, 'T': 25, 'G': 20}}
I am trying to do that in Python using the following code:
dict2 = {}
for item in dict.items():
    freq = item.items/sum(item.items())
    dict2[] = freq
But the result of this code is not what I want. Do you know how to fix it?
What you want is to process the inner dictionaries without modifying the keys of the outer one. Move the frequency calculation into a function:
def get_frequency(d):
    total = sum(d.values())
    return {key: value / total * 100 for key, value in d.items()}
Then use a dict comprehension to apply the function to all your sub-dictionaries (dict1 here stands for your outer dictionary; calling it dict would shadow the built-in type):
dict2 = {key: get_frequency(value) for key, value in dict1.items()}
Note that I added a * 100: judging from your expected output, you are looking for percentages from 0 to 100 rather than floats from 0 to 1.
Edit:
If you're using Python 2, / is integer division when both operands are integers, so convert to float like so:
    return {key: float(value) / total * 100 for key, value in d.items()}
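To get tidy numbers like those in the expected output, the percentages can be rounded. The sketch below repeats get_frequency with an extra ndigits parameter, which is an assumption added here for illustration and not part of the answer above.

```python
def get_frequency(d, ndigits=2):
    """Return each value as a percentage of the dict's total, rounded."""
    total = sum(d.values())
    return {key: round(value / total * 100, ndigits) for key, value in d.items()}


dict1 = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518},
         2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
dict2 = {key: get_frequency(value) for key, value in dict1.items()}
print(dict2[1])  # {'A': 31.07, 'C': 21.05, 'T': 27.75, 'G': 20.12}
```

Keep in mind that rounded percentages do not necessarily add up to exactly 100.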
You could do the following:
dct = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518}, 2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
result = {}
for key, d in dct.items():
    total = sum(d.values())
    result[key] = {k: a / total for k, a in d.items()}
print(result)
Output
{1: {'C': 0.21050328227571116, 'T': 0.2775346462436178, 'G': 0.2012399708242159, 'A': 0.31072210065645517}, 2: {'C': 0.22447118891320203, 'T': 0.25018234865062, 'G': 0.20076586433260393, 'A': 0.32458059810357404}}
I have output from Python NetworkX code:
flow_value, flow_dict = nx.maximum_flow(T, 'O', 'T')
print(flow_dict)
# Output as follows:
#{'O': {'A': 4, 'B': 6, 'C': 4}, 'A': {'B': 1, 'D': 3}, 'B': {'C': 0, 'E': 3,'D': 4}, 'C': {'E': 4}, 'E': {'D': 1, 'T': 6}, 'D': {'T': 8}, 'T': {}}
I want to extract all the data in a form that looks like:
#('O','A',4),('O','B',6),('O','C',4),('A','B',1),......,('D','T',8)
Is there a way to traverse the nested dict and get the data I need?
I tried this and it works. It walks the nested dictionary and uses some type checking to capture only string values:
def retrieve_all_strings_from_dict(nested_dict, keys_to_ignore=None):
    values = []
    if not keys_to_ignore:
        keys_to_ignore = []
    elif isinstance(keys_to_ignore, str):
        # The original called an undefined to_list helper here; this assumes
        # it wrapped a single key in a list.
        keys_to_ignore = [keys_to_ignore]
    else:
        keys_to_ignore = list(keys_to_ignore)
    if not isinstance(nested_dict, dict):
        return values
    dict_stack = [nested_dict]
    # The list grows while it is being iterated, so nested dicts get visited too.
    for dict_var in dict_stack:
        # Collect string values whose keys are not ignored.
        values.extend(v for k, v in dict_var.items()
                      if isinstance(v, str) and k not in keys_to_ignore)
        # Queue nested dictionaries for a later pass.
        dict_stack.extend(v for v in dict_var.values() if isinstance(v, dict))
    return values
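For the (source, target, flow) form asked for in the question, the dictionary is only two levels deep, so a nested comprehension may be enough. This is a minimal sketch that copies the printed flow_dict from the question as a literal, so it runs without NetworkX; edges is an illustrative name.

```python
flow_dict = {'O': {'A': 4, 'B': 6, 'C': 4}, 'A': {'B': 1, 'D': 3},
             'B': {'C': 0, 'E': 3, 'D': 4}, 'C': {'E': 4},
             'E': {'D': 1, 'T': 6}, 'D': {'T': 8}, 'T': {}}

# One tuple per edge: outer key, inner key, and the flow value.
edges = [(u, v, flow)
         for u, targets in flow_dict.items()
         for v, flow in targets.items()]

print(edges)
```

Nodes with no outgoing flow, such as 'T', simply contribute no tuples.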
I'm using the following code to unzip a dictionary and count the values at each site:
result = [Counter(site) for site in zip(*myDict.values())]
The output looks something like: Counter({'A': 74}), Counter({'G': 72, 'C': 2})
There are five possible values: A, T, G, C, and N
I only want the counter to produce output if the count of the most common value is less than 74. So for the above example, only the second would be printed. How do you use an if statement with the counter? Furthermore, how can I label each site, so that the output above could just say:
Site 2: 'G': 72, 'C': 2
myDict looks like this:
{'abc123': ATGGAGGACGACT, 'def332': ATGCATTGACGC}
Except there are 74 entries. Each value is the same length. Basically, I don't know how to use a counter that can give me an output for when each site of each value doesn't match up. So for the sequences above, the 4th site does not match. I want the counter to output the following:
site 4: 'G': 1, 'C': 1
You can use enumerate to index the sites and the most_common method on Counter can be used to check if the count is < 74. Here's an example with just two strings:
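from collections import Counter
myDict = {'a': 'ATGTTCN', 'b': 'ATTTCCG'}
# Pair each site index with the Counter for that column.
result = [(i, Counter(site)) for i, site in enumerate(zip(*myDict.values()))]
# Keep only sites where even the most common value falls short of the number of sequences (2 here).
result = [x for x in result if x[1].most_common()[0][1] < 2]
for site, count in result:
    # str(count)[9:-2] strips the leading "Counter({" and the trailing "})".
    print('Site {}: {}'.format(site, str(count)[9:-2]))
Output (Python 3.7+; site numbers are zero-based because enumerate starts at 0):
Site 2: 'G': 1, 'T': 1
Site 4: 'T': 1, 'C': 1
Site 6: 'N': 1, 'G': 1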
Use a dict comprehension that only stores a site if max(Counter(x).values()) < 74 (< 2 in the two-sequence example below), and use enumerate() to get the site number.
>>> mydict={'abc123': 'ATGGAGGACGACT', 'def332': 'ATGCATTGACGC'}
>>> result={'Site {}'.format(i+1):Counter(x) for i,x in enumerate(zip(*mydict.values())) if max(Counter(x).values())<2}
>>> result
{'Site 7': Counter({'T': 1, 'G': 1}), 'Site 6': Counter({'T': 1, 'G': 1}), 'Site 4': Counter({'C': 1, 'G': 1}), 'Site 9': Counter({'A': 1, 'C': 1}), 'Site 8': Counter({'A': 1, 'G': 1}), 'Site 11': Counter({'A': 1, 'G': 1}), 'Site 10': Counter({'C': 1, 'G': 1})}
or convert Counter to dict:
>>> {'Site {}'.format(i+1):dict(Counter(x)) for i,x in enumerate(zip(*mydict.values())) if max(Counter(x).values())<2}
{'Site 7': {'T': 1, 'G': 1}, 'Site 6': {'T': 1, 'G': 1}, 'Site 4': {'C': 1, 'G': 1}, 'Site 9': {'A': 1, 'C': 1}, 'Site 8': {'A': 1, 'G': 1}, 'Site 11': {'A': 1, 'G': 1}, 'Site 10': {'C': 1, 'G': 1}}
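Putting the pieces together, a hedged sketch that derives the threshold from the number of sequences (instead of hard-coding 74 or 2) and numbers sites from 1 might look like this. variable_sites is a hypothetical name, and the two equal-length sequences are shortened versions of the ones in the question.

```python
from collections import Counter


def variable_sites(sequences):
    """Map 1-based site numbers to Counters for sites where not all values agree."""
    n = len(sequences)
    sites = {}
    for site, column in enumerate(zip(*sequences.values()), start=1):
        counts = Counter(column)
        if max(counts.values()) < n:  # no single value covers every sequence
            sites[site] = counts
    return sites


my_dict = {'abc123': 'ATGGAGG', 'def332': 'ATGCAGG'}
for site, counts in variable_sites(my_dict).items():
    pairs = ', '.join('{!r}: {}'.format(base, c) for base, c in counts.items())
    print('Site {}: {}'.format(site, pairs))  # Site 4: 'G': 1, 'C': 1
```

Because zip stops at the shortest sequence, this assumes all values have the same length, as stated in the question.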