Python 3.6: create new dict using values from another as indices - python

In Python 3.6.3, I have the following dict D1:
D1 = {0: array([1, 2, 3], dtype=int64), 1: array([0,4], dtype=int64)}
Each value inside the array is the index of the key of another dict D2:
D2 = {'Jack': 1, 'Mike': 2, 'Tim': 3, 'Paul': 4, 'Tommy': 5}
I am trying to create a third dict, D3, with the same keys as D1, and as values the keys of D2 corresponding to the indices of D1.values().
The result I am aiming for is:
D3 = {0: ['Mike','Tim','Paul'], 1: ['Jack','Tommy']}
My attempt is incomplete: I cannot figure out how to tell D3 to take its keys from D1 and its values from D2, and I am not sure the "and" in the comprehension is right. Any ideas?
D3 = {key:list(D1.values())[v] for key in D1.keys() and v in D2[v]}

You could use a dict-comprehension like so:
from numpy import array
D1 = {0: array([1, 2, 3]), 1: array([0,4])}
D2 = {'Jack': 1, 'Mike': 2, 'Tim': 3, 'Paul': 4, 'Tommy': 5}
temp = dict(zip(D2.values(), D2.keys())) # inverting key-value pairs
D3 = {k: [temp.get(i+1, 'N/A') for i in v] for k, v in D1.items()}
which results in:
{0: ['Mike', 'Tim', 'Paul'], 1: ['Jack', 'Tommy']}

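If you're using Python 3.6+, where dicts preserve insertion order (a CPython implementation detail in 3.6, guaranteed by the language from 3.7), you can use enumerate to build a dict that looks up the names in D2 by index, and then map the indices in D1 onto it: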
r = dict(enumerate(D2))
D3 = {k: list(map(r.get, v)) for k, v in D1.items()}
D3 would become:
{0: ['Mike', 'Tim', 'Paul'], 1: ['Jack', 'Tommy']}
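To make the lookup table explicit, here is a minimal sketch of what the enumerate approach builds. Plain lists stand in for the NumPy arrays (my assumption, to keep the example dependency-free), and the out-of-range index 7 is a made-up value for illustration.

```python
# Plain lists stand in for the NumPy arrays so the sketch has no dependencies.
D1 = {0: [1, 2, 3], 1: [0, 4]}
D2 = {'Jack': 1, 'Mike': 2, 'Tim': 3, 'Paul': 4, 'Tommy': 5}

# Iterating a dict yields its keys, so enumerate pairs each position with a name.
r = dict(enumerate(D2))
print(r)  # {0: 'Jack', 1: 'Mike', 2: 'Tim', 3: 'Paul', 4: 'Tommy'}

D3 = {k: [r[i] for i in v] for k, v in D1.items()}
print(D3)  # {0: ['Mike', 'Tim', 'Paul'], 1: ['Jack', 'Tommy']}

# r.get returns None for an index that D2 does not have; r[i] would raise KeyError.
print(r.get(7))  # None
```

Whether a silent None or a KeyError is preferable depends on how much you trust the indices in D1.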

This is untested, but I believe it should get you headed in the right direction. I sometimes find it helpful to break a complicated one-liner out into multiple lines:
names = list(D2)  # D2's keys in insertion order, so names[idx] is the key at position idx
D3 = {}
for d1k, d1v in D1.items():
    D3[d1k] = []
    for idx in d1v:
        D3[d1k].append(names[idx])

This might not be the best solution, but it works:
D3 = {}
for key in D1.keys():
    value_list = D1.get(key)
    value_list = [x + 1 for x in value_list]  # D2's values are 1-based
    temp = []
    for d2_key, value in D2.items():
        if value in value_list:
            temp.append(d2_key)
    D3[key] = temp
Output (the names follow D2's iteration order, which is insertion order on Python 3.6+):
{0: ['Mike', 'Tim', 'Paul'], 1: ['Jack', 'Tommy']}

Here you go!
D1 = {0: [1, 2, 3], 1: [0, 4]}
D2 = {'Jack': 1, 'Mike': 2, 'Tim': 3, 'Paul': 4, 'Tommy': 5}
D2_inverted = {v: k for k, v in D2.items()}  # map each value back to its name
D3 = {}
for key in D1:
    temp = []
    for value in D1[key]:
        # the indices in D1 are 0-based, the values in D2 are 1-based
        temp.append(D2_inverted[value + 1])
    D3[key] = temp
print(D3)
Iterate over the keys of D1;
Create a temporary list to hold the values for the new dict, and fill it with the matching names from D2 (whose keys and values were inverted first, for simplicity);
Assign the list to D3.

Related

Combine two dicts and replace missing values [duplicate]

I am looking to combine two dictionaries by grouping elements that share common keys, but I would also like to account for keys that are not shared between the two dictionaries. For instance, given the following two dictionaries:
d1 = {'a':1, 'b':2, 'c': 3, 'e':5}
d2 = {'a':11, 'b':22, 'c': 33, 'd':44}
The intended code would output
df = {'a':[1,11] ,'b':[2,22] ,'c':[3,33] ,'d':[0,44] ,'e':[5,0]}
Or some array like:
df = [['a',1,11] , ['b',2,22] , ['c',3,33] , ['d',0,44] , ['e',5,0]]
Using 0 specifically to mark a missing entry is not important in itself; any placeholder that marks the missing value would do.
I have tried using the following code:
from collections import defaultdict
df = defaultdict(list)
for d in (d1, d2):
    for key, value in d.items():
        df[key].append(value)
But I get the following result:
df = {'a':[1,11] ,'b':[2,22] ,'c':[3,33] ,'d':[44] ,'e':[5]}
This does not tell me which dict was missing the entry.
I could go back and look through both of them, but I was looking for a more elegant solution.
You can use a dict comprehension like so:
d1 = {'a':1, 'b':2, 'c': 3, 'e':5}
d2 = {'a':11, 'b':22, 'c': 33, 'd':44}
res = {k: [d1.get(k, 0), d2.get(k, 0)] for k in set(d1).union(d2)}
print(res)
Another solution:
d1 = {"a": 1, "b": 2, "c": 3, "e": 5}
d2 = {"a": 11, "b": 22, "c": 33, "d": 44}
df = [[k, d1.get(k, 0), d2.get(k, 0)] for k in sorted(d1.keys() | d2.keys())]
print(df)
Prints:
[['a', 1, 11], ['b', 2, 22], ['c', 3, 33], ['d', 0, 44], ['e', 5, 0]]
If you do not want sorted results, leave the sorted() out.
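Both answers generalize to any number of dicts. The following is only a sketch: the function name merge_with_fill and its fill parameter are my own, not from the answers, and the first-seen key order relies on dicts preserving insertion order (Python 3.7+).

```python
def merge_with_fill(*dicts, fill=0):
    """Collect the values for every key seen in any input, padding gaps with `fill`."""
    # dict.fromkeys removes duplicate keys while keeping first-seen order.
    keys = dict.fromkeys(k for d in dicts for k in d)
    return {k: [d.get(k, fill) for d in dicts] for k in keys}


d1 = {'a': 1, 'b': 2, 'c': 3, 'e': 5}
d2 = {'a': 11, 'b': 22, 'c': 33, 'd': 44}

print(merge_with_fill(d1, d2))
# {'a': [1, 11], 'b': [2, 22], 'c': [3, 33], 'e': [5, 0], 'd': [0, 44]}

# None is a safer placeholder when 0 could also be a real value.
print(merge_with_fill(d1, d2, fill=None)['d'])  # [None, 44]
```

The position of each value in the list tells you which input dict it came from, which is what the defaultdict attempt in the question loses.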

How to turn string into dictionary with conditionals?

I have a dataframe (very large, millions of rows). Here is how it looks:
id value
a1 0:0,1:10,2:0,3:0,4:7
b4 0:5,1:0,2:0,3:0,4:1
c5 0:0,1:3,2:2,3:0,4:0
k2 0:0,1:2,2:0,3:4,4:0
I want to turn those strings into dictionaries, keeping only the key-value pairs in which neither the key nor the value is 0. So the desired result is:
id value
a1 {1:10, 4:7}
b4 {4:1}
c5 {1:3, 2:2}
k2 {1:2, 3:4}
How can I do that? When I try to use the dict() function, it raises KeyError: 0:
df["value"] = dict(df["value"])
So I already have trouble turning it into a dictionary in the first place.
I have also tried this:
df["value"] = json.loads(df["value"])
but it gives the same error.
This could do the trick, simply using list comprehensions:
import pandas as pd
dt = pd.DataFrame({"id": ["a1", "b4", "c5", "k2"],
                   "value": ["0:0,1:10,2:0,3:0,4:7", "0:5,1:0,2:0,3:0,4:1", "0:0,1:3,2:2,3:0,4:0", "0:0,1:2,2:0,3:4,4:0"]})
def to_dict1(s):
    # keep only the pairs in which neither the key nor the value is "0"
    return [dict([map(int, y.split(":")) for y in x.split(",") if "0" not in y.split(":")]) for x in s]
dt["dict"] = to_dict1(dt["value"])
Another way to obtain the same result is to use regular expressions (the pattern (?!0{1})(\d) matches any single digit other than 0, so this version assumes single-digit keys):
import re
def to_dict2(s):
    # raw string, so that \d reaches the regex engine unchanged
    return [dict([map(int, y) for y in re.findall(r"(?!0{1})(\d):(?!0{1})(\d+)", x)]) for x in s]
In terms of performance, to_dict1 is almost 20% faster, according to my tests.
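As a quick check of what the regular expression actually captures, here is a small sketch that applies it to two rows from the question. The helper name parse and the two-digit-key inputs at the end are my own additions for illustration; they are not part of the question's data.

```python
import re

# Pattern copied from the answer, written as a raw string.
PAIR = re.compile(r"(?!0{1})(\d):(?!0{1})(\d+)")


def parse(s):
    """Return the pairs matched in one 'k:v,k:v,...' string as a dict of ints."""
    return {int(k): int(v) for k, v in PAIR.findall(s)}


print(parse("0:0,1:10,2:0,3:0,4:7"))  # {1: 10, 4: 7}
print(parse("0:5,1:0,2:0,3:0,4:1"))   # {4: 1}

# Hypothetical inputs with two-digit keys (not in the question's data):
print(parse("12:5"))  # {2: 5}  (the key 12 is misread as 2)
print(parse("10:5"))  # {}      (the pair is dropped entirely)
```

If keys above 9 can occur in the real data, the split-based version is the safer choice.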
This code produces the result you want. I built a sample input like the one you provided and printed the result at the end.
import pandas as pd
df = pd.DataFrame(
    {
        'id': ['a1', 'b4', 'c5', 'k2'],
        'value': ['0:0,1:10,2:0,3:0,4:7', '0:5,1:0,2:0,3:0,4:1', '0:0,1:3,2:2,3:0,4:0', '0:0,1:2,2:0,3:4,4:0']
    }
)
value = []  # temporary list that collects the dicts without zero keys or values
for i, row in df.iterrows():
    pairs = row['value'].split(',')
    d = dict()
    for pair in pairs:
        k, v = pair.split(':')
        k = int(k)
        v = int(v)
        if (k != 0) and (v != 0):
            d[k] = v
    value.append(d)
df['value'] = pd.Series(value)
print(df)
# id value
#0 a1 {1: 10, 4: 7}
#1 b4 {4: 1}
#2 c5 {1: 3, 2: 2}
#3 k2 {1: 2, 3: 4}
def make_dict(row):
    """Requires a list of strings of the shape
    ["0:0", "1:10", ...]"""
    return {key: val for key, val
            in map(lambda x: map(int, x.split(":")), row)
            if key != 0 and val != 0}
df["value"] = df.value.str.split(",").apply(make_dict)
This is how I would do it:
def string_to_dict(s):
    d = {}
    pairs = s.split(',')  # get each key:value pair
    for pair in pairs:
        key, value = pair.split(':')  # split key from value
        key, value = int(key), int(value)
        if key and value:  # skip the pairs with a zero key or a zero value
            d[key] = value
    return d
df['value'] = df['value'].apply(string_to_dict)
Use a dictionary comprehension to exclude items whose key or value is zero:
txt="""id value
a1 0:0,1:10,2:0,3:0,4:7
b4 0:5,1:0,2:0,3:0,4:1
c5 0:0,1:3,2:2,3:0,4:0
k2 0:0,1:2,2:0,3:4,4:0 """
import pandas as pd
df = pd.DataFrame({"id": ["a1", "b4", "c5", "k2"],
                   "value": ["0:0,1:10,2:0,3:0,4:7", "0:5,1:0,2:0,3:0,4:1", "0:0,1:3,2:2,3:0,4:0", "0:0,1:2,2:0,3:4,4:0"]})
# one dict per row, keeping only the pairs whose key and value are both non-zero
df['value'] = [{int(k): int(v)
                for k, v in (x.split(':') for x in s.split(','))
                if int(k) != 0 and int(v) != 0}
               for s in df['value']]
print(df)
output:
id value
0 a1 {1: 10, 4: 7}
1 b4 {4: 1}
2 c5 {1: 3, 2: 2}
3 k2 {1: 2, 3: 4}

How can I concatenate dicts (values to values of the same key and new key)? [duplicate]

I have a problem with concatenating dictionaries. I have a lot of code, so I will show the problem with an example.
d1 = {'the':3, 'fine':4, 'word':2}
+
d2 = {'the':2, 'fine':4, 'word':1, 'knight':1, 'orange':1}
+
d3 = {'the':5, 'fine':8, 'word':3, 'sequel':1, 'jimbo':1}
=
finald = {'the':10, 'fine':16, 'word':6, 'knight':1, 'orange':1, 'sequel':1, 'jimbo':1}
This prepares word counts for a word cloud. I don't know how to add up the values of matching keys; it is a puzzle for me. Please help.
Best regards
I would use a Counter from collections for this.
from collections import Counter
d1 = {'the':3, 'fine':4, 'word':2}
d2 = {'the':2, 'fine':4, 'word':1, 'knight':1, 'orange':1}
d3 = {'the':5, 'fine':8, 'word':3, 'sequel':1, 'jimbo':1}
c = Counter()
for d in (d1, d2, d3):
    c.update(d)
print(c)
Outputs:
Counter({'fine': 16, 'the': 10, 'word': 6, 'orange': 1, 'jimbo': 1, 'sequel': 1, 'knight': 1})
import itertools
d1 = {'the':3, 'fine':4, 'word':2}
d2 = {'the':2, 'fine':4, 'word':1, 'knight':1, 'orange':1}
d3 = {'the':5, 'fine':8, 'word':3, 'sequel':1, 'jimbo':1}
dicts = [d1, d2, d3]
In [31]: answer = {k:sum(d[k] if k in d else 0 for d in dicts) for k in itertools.chain.from_iterable(dicts)}
In [32]: answer
Out[32]:
{'sequel': 1,
'the': 10,
'fine': 16,
'jimbo': 1,
'word': 6,
'orange': 1,
'knight': 1}
def sumDicts(*dicts):
    summed = {}
    for subdict in dicts:
        for (key, value) in subdict.items():
            summed[key] = summed.get(key, 0) + value
    return summed
Shell example:
>>> d1 = {'the':3, 'fine':4, 'word':2}
>>> d2 = {'the':2, 'fine':4, 'word':1, 'knight':1, 'orange':1}
>>> d3 = {'the':5, 'fine':8, 'word':3, 'sequel':1, 'jimbo':1}
>>> sumDicts(d1, d2, d3)
{'orange': 1, 'the': 10, 'fine': 16, 'jimbo': 1, 'word': 6, 'knight': 1, 'sequel': 1}
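For comparison, Counter objects can also be added with +, so the whole merge fits in one sum() call. This is a sketch rather than a recommendation over the answers above; the one behavioral difference worth knowing is that + discards counts that end up at zero or below, while update() keeps them.

```python
from collections import Counter

d1 = {'the': 3, 'fine': 4, 'word': 2}
d2 = {'the': 2, 'fine': 4, 'word': 1, 'knight': 1, 'orange': 1}
d3 = {'the': 5, 'fine': 8, 'word': 3, 'sequel': 1, 'jimbo': 1}

# Counter() is the start value, so sum() adds Counters instead of integers.
total = sum((Counter(d) for d in (d1, d2, d3)), Counter())
finald = dict(total)  # same counts as finald in the question
print(finald)

# Difference between + and update(): + drops non-positive counts.
added = Counter({'a': 1}) + Counter({'a': -1})
updated = Counter({'a': 1})
updated.update({'a': -1})
print('a' in added, 'a' in updated)  # False True
```

For word counts, which are always positive, the two approaches give the same totals.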

Dictionary Containing list data, filter based on value in list

I have test data which is gathered based on multiple inputs, and results in a single output. I'm currently storing this data in a dictionary whose keys are my parameter/ results labels, and whose values are the test conditions and results. I would like to be able to filter the data so I can generate plots based on isolated conditions.
In my example below, my test conditions would be 'a' and 'b', and the result of the experiment would be 'c'. I want to filter my data so I get a dictionary with the same key, value structure and only my filtered results. However my current dictionary comprehension returns an empty dictionary. Any advice to get the desired result?
Current Code:
data = {'a': [0, 1, 2, 0, 1, 2], 'b': [10, 10, 10, 20, 20, 20], 'c': [1.3, 1.9, 2.3, 2.3, 2.9, 3.4]}
filtered_data = {k:v for k,v in data.iteritems() if v in data['b'] >= 20}
Desired Result:
{'a': [0, 1, 2], 'b': [20, 20, 20], 'c': [2.3, 2.9, 3.4]}
Current Result:
{}
Also, is this dictionary of lists a good schema to store data of this type, given that I'm going to want to filter the results, or is there a better way to accomplish this?
Use this:
filtered_data = {k: [x for i, x in enumerate(v) if data['b'][i] >= 20] for k, v in data.items()}
Result:
{'a': [0, 1, 2], 'c': [2.3, 2.9, 3.4], 'b': [20, 20, 20]}
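The same row selection can be written by computing a boolean mask once and applying it to every column. The sketch below uses itertools.compress from the standard library; the variable name mask is my own.

```python
from itertools import compress

data = {'a': [0, 1, 2, 0, 1, 2],
        'b': [10, 10, 10, 20, 20, 20],
        'c': [1.3, 1.9, 2.3, 2.3, 2.9, 3.4]}

# One True/False per row, computed once from column 'b'.
mask = [b >= 20 for b in data['b']]

# compress keeps the items of v whose mask entry is true.
filtered_data = {k: list(compress(v, mask)) for k, v in data.items()}
print(filtered_data)
# {'a': [0, 1, 2], 'b': [20, 20, 20], 'c': [2.3, 2.9, 3.4]}
```

Building the mask separately also makes it easy to combine several conditions before filtering.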
Consider using the pandas module for this type of work.
import pandas as pd
df = pd.DataFrame(data)
df = df[df["b"] >= 20]
print(df)
It appears like this will give you what you want. You are using the dictionary key to represent the column name and the values are just rows in a given column, so it is amenable to using a dataframe.
Result:
a b c
3 0 20 2.3
4 1 20 2.9
5 2 20 3.4
Are all of the dictionary value lists in matching orders? If so, you could just look at whichever list you want to filter by, say 'b' in this case, find the values you want, and then either use those indices or the same slice on the other values in the dictionary.
For example:
matching_indices = []
for i, b_value in enumerate(data['b']):  # i is the position, b_value the entry
    if b_value >= 20:
        matching_indices.append(i)
new_dict = {}
for key in data:
    # pick the same positions out of every list
    new_dict[key] = [data[key][i] for i in matching_indices]
You could probably figure out a dictionary comprehension for it if you wanted. Hopefully this is clear.
You can turn this into a function, which gives it more flexibility. Your current logic means that datasets a and c are neglected because they contain no values greater than or equal to 20:
data = {'a': [0, 1, 2, 0, 1, 2], 'b': [10, 10, 10, 20, 20, 20], 'c': [1.3, 1.9, 2.3, 2.3, 2.9, 3.4]}
filter_vals = ['a', 'b']
new_d = {}
for k, v in data.items():
    if k in filter_vals:
        new_d[k] = [i for i in v if i >= 20]
print(new_d)
Now, I'm not a big fan of many if statements, but something like this is straightforward and can be called many times:
def my_filter(operator, condition, filter_vals, my_dict):
    new_d = {}
    for k, v in my_dict.items():
        if k in filter_vals:
            if operator == '>':
                new_d[k] = [i for i in v if i > condition]
            elif operator == '<':
                new_d[k] = [i for i in v if i < condition]
            elif operator == '<=':
                new_d[k] = [i for i in v if i <= condition]
            elif operator == '>=':
                new_d[k] = [i for i in v if i >= condition]
    return new_d
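One way to avoid the chain of if statements is to look the comparison up in a table built from the standard operator module. This is a sketch with names of my own choosing (OPS, my_filter_ops); like my_filter, it filters each listed column by its own values.

```python
import operator

# Map each comparison symbol to the matching function from the operator module.
OPS = {'>': operator.gt, '<': operator.lt, '>=': operator.ge, '<=': operator.le}


def my_filter_ops(op, condition, filter_vals, my_dict):
    compare = OPS[op]  # raises KeyError for an unsupported symbol
    return {k: [i for i in v if compare(i, condition)]
            for k, v in my_dict.items() if k in filter_vals}


data = {'a': [0, 1, 2, 0, 1, 2],
        'b': [10, 10, 10, 20, 20, 20],
        'c': [1.3, 1.9, 2.3, 2.3, 2.9, 3.4]}

print(my_filter_ops('>=', 20, ['a', 'b'], data))  # {'a': [], 'b': [20, 20, 20]}
```

Adding another comparison, such as equality, then only needs one more entry in OPS.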
I agree with the pandas approach above.
If for some reason you hate pandas or are an old-school computer scientist, tuples are a good way to store relational data. In your example, the a, b, and c lists are columns rather than rows. For tuples, you would want to store the rows as:
data = {'a':(0,10,1.3),'b':(1,10,1.9),'c':(2,10,2.3),'d':(0,20,2.3),'e':(1,20,2.9),'f':(2,20,3.4)}
where the tuples are stored in the (condition1, condition2, outcome) format you described and you can call a single test or filter a set as you describe. From there you can get a filtered set of results as follows:
filtered_data = {k:v for k,v in data.items() if v[1]>=20}
which returns:
{'d': (0, 20, 2.3), 'e': (1, 20, 2.9), 'f': (2, 20, 3.4)}
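If the data already exists as a dict of column lists, zip can produce the row tuples and turn them back into columns after filtering. The sketch below assumes all lists have the same length and that the dict preserves insertion order (Python 3.7+); the variable names are my own.

```python
data = {'a': [0, 1, 2, 0, 1, 2],
        'b': [10, 10, 10, 20, 20, 20],
        'c': [1.3, 1.9, 2.3, 2.3, 2.9, 3.4]}

# zip(*columns) yields one (a, b, c) tuple per test run.
rows = list(zip(*data.values()))

# Index 1 is column 'b' because 'b' is the second key in data.
filtered_rows = [row for row in rows if row[1] >= 20]
print(filtered_rows)  # [(0, 20, 2.3), (1, 20, 2.9), (2, 20, 3.4)]

# Transpose back to columns; note this yields {} if no row matches.
filtered_data = {k: list(col) for k, col in zip(data, zip(*filtered_rows))}
print(filtered_data)  # {'a': [0, 1, 2], 'b': [20, 20, 20], 'c': [2.3, 2.9, 3.4]}
```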

For each value in dict?

I've got a dict with integer values, and I'd like to perform an operation on every value in the dict. I'd like to use a for loop for this, but I can't get it right. Something like:
>>>print(myDict)
{'ten': 10, 'fourteen': 14, 'six': 6}
>>>for value in myDict:
... value = value / 2
>>>print(myDict)
{'ten': 5, 'fourteen': 7, 'six': 3}
To iterate over keys and values:
for key, value in myDict.items():
    myDict[key] = value / 2
The default loop over a dictionary iterates over its keys, like
for key in myDict:
    myDict[key] /= 2
or you could use a map or a comprehension.
map:
myDict = dict(map(lambda item: (item[0], item[1] / 2), myDict.items()))
comprehension:
myDict = { k: v / 2 for k, v in myDict.items() }
for k in myDict:
    myDict[k] /= 2
Using the dict.items() method and a dict comprehension:
dic = {'ten': 10, 'fourteen': 14, 'six': 6}
print({k: v/2 for k, v in dic.items()})
Output:
{'ten': 5.0, 'six': 3.0, 'fourteen': 7.0}
Python 3:
>>> my_dict = {'ten': 10, 'fourteen': 14, 'six': 6}
>>> for key, value in my_dict.items():
...     my_dict[key] = value / 2
...
>>> my_dict
{'fourteen': 7.0, 'six': 3.0, 'ten': 5.0}
This changes the original dictionary. Use // instead of / to get floor division.
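The loop in the question has no effect because assigning to the loop variable only rebinds a local name. A short sketch of the difference, using floor division so the results stay integers as in the question's expected output:

```python
my_dict = {'ten': 10, 'fourteen': 14, 'six': 6}

# Rebinding the loop variable only changes the local name, not the dict.
for value in my_dict.values():
    value = value // 2
print(my_dict)  # {'ten': 10, 'fourteen': 14, 'six': 6}

# Assigning through the key updates the dict; // keeps the results as ints.
for key in my_dict:
    my_dict[key] //= 2
print(my_dict)  # {'ten': 5, 'fourteen': 7, 'six': 3}
```

Replacing the value of an existing key while iterating is fine; only adding or removing keys during iteration raises an error.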
