Merging two dictionaries in Python with a key consisting of two values - python

I have data like --
sample 1, domain 1, value 1
sample 1, domain 2, value 1
sample 2, domain 1, value 1
sample 2, domain 3, value 1
-- stored in a dictionary --
dict_1 = {('sample 1','domain 1'): value 1, ('sample 1', 'domain 2'): value 1}
-- etc.
Now, I have a different kind of value, named value 2 --
sample 1, domain 1, value 2
sample 1, domain 2, value 2
sample 2, domain 1, value 2
sample 2, domain 3, value 2
-- which I again put in a dictionary,
dict_2 = {('sample 1','domain 1'): value 2, ('sample 1', 'domain 2'): value 2}
How can I merge these two dictionaries in python? The keys, for instance ('sample 1', 'domain 1') are the same for both dictionaries.
I expect it to look like --
final_dict = {('sample 1', 'domain 1'): (value 1, value 2), ('sample 1', 'domain 2'): (value 1, value 2)}
-- etc.

The closest you're likely to get to this would be a dict of lists (or sets). For simplicity, you usually go with collections.defaultdict(list) so you're not constantly checking if the key already exists. You need to map to some collection type as a value because dicts have unique keys, so you need some way to group the multiple values you want to store for each key.
from collections import defaultdict
final_dict = defaultdict(list)
for d in (dict_1, dict_2):
    for k, v in d.items():
        final_dict[k].append(v)
Or equivalently with itertools.chain, you just change the loop to:
from itertools import chain
for k, v in chain(dict_1.items(), dict_2.items()):
    final_dict[k].append(v)
Side-note: If you really need it to be a proper dict at the end, and/or insist on the values being tuples rather than lists, a final pass can convert to such at the end:
final_dict = {k: tuple(v) for k, v in final_dict.items()}

You can use set intersection of keys to do this:
dict_1 = {('sample 1','domain 1'): 'value 1', ('sample 1', 'domain 2'): 'value 1'}
dict_2 = {('sample 1','domain 1'): 'value 2', ('sample 1', 'domain 2'): 'value 2'}
result = {k: (dict_1.get(k), dict_2.get(k)) for k in dict_1.keys() & dict_2.keys()}
print(result)
# {('sample 1', 'domain 1'): ('value 1', 'value 2'), ('sample 1', 'domain 2'): ('value 1', 'value 2')}
The above uses dict.get(), which returns None by default instead of raising a KeyError. With the key intersection that cannot actually happen, since every key is present in both dictionaries.
As @ShadowRanger suggests in the comments, if a key is missing from one dictionary, you can fall back to the value from the other one by iterating over the union of the keys:
{k: (dict_1.get(k, dict_2.get(k)), dict_2.get(k, dict_1.get(k))) for k in dict_1.keys() | dict_2.keys()}
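To see how the intersection and union of keys differ, here is a minimal sketch assuming one key is present only in dict_1. The ('sample 2', 'domain 3') entry and the numeric values are made up for illustration. The intersection drops that key, while the union keeps it; with plain get() instead of the fallback shown above, the missing slot is filled with None.

```python
# Illustrative data: the ('sample 2', 'domain 3') key exists only in dict_1.
dict_1 = {('sample 1', 'domain 1'): 10, ('sample 2', 'domain 3'): 30}
dict_2 = {('sample 1', 'domain 1'): 11}

# Intersection: only keys present in both dictionaries survive.
common = {k: (dict_1[k], dict_2[k]) for k in dict_1.keys() & dict_2.keys()}

# Union: every key survives; a missing value shows up as None.
merged = {k: (dict_1.get(k), dict_2.get(k)) for k in dict_1.keys() | dict_2.keys()}

print(common)                            # {('sample 1', 'domain 1'): (10, 11)}
print(merged[('sample 2', 'domain 3')])  # (30, None)
```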

Does something handcrafted like this work for you?
dict3 = {}
for i in dict_1:
    # Raises KeyError if a key of dict_1 is missing from dict_2.
    dict3[i] = (dict_1[i], dict_2[i])

from collections import defaultdict
from itertools import chain
dict_1 = {('sample 1','domain 1'): 1, ('sample 1', 'domain 2'): 2}
dict_2 = {('sample 1','domain 1'): 3, ('sample 1', 'domain 2'): 4}
new_dict_to_process = defaultdict(list)
dict_list=[dict_1.items(),dict_2.items()]
for k, v in chain(*dict_list):
    new_dict_to_process[k].append(v)
print(new_dict_to_process)
Output will be
defaultdict(<class 'list'>, {('sample 1', 'domain 1'): [1, 3], ('sample 1', 'domain 2'): [2, 4]})

Related

Convert first row of excel into dictionary

I am trying to convert the first header row of an Excel table into a dict with a value of 1. I am fairly new to Python and not able to execute this code. My table in the spreadsheet looks like:
Matrix   Column A   Column B
Row A    10         20
Row B    30         40
I would like my output as following dict:
{'Column A': 1,'Column B': 1}
I tried test_row = pd.read_excel("Test.xlsx", index_col=0).to_dict('index')
The number of columns will grow in the future, so it would be nice to have a solution that extracts any number of column headers into a dict with a value of 1. Many thanks!
Given your example Dataframe as
df = pd.DataFrame({'Matrix': {0: 'Row A', 1: 'Row B'}, 'Column A': {0: 10, 1: 30}, 'Column B': {0: 20, 1: 40}})
You can use:
cols_dict = {col: 1 for col in df.columns} # {'Matrix': 1, 'Column A': 1, 'Column B': 1}
rows_dict = {row: 1 for row in df.Matrix} # {'Row A': 1, 'Row B': 1}
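Note that cols_dict above also contains the 'Matrix' label, which the question did not ask for. A minimal sketch of one way to leave it out, assuming the header row is available as a plain list (headers is a hypothetical stand-in for list(df.columns)) and that the first entry is always the row-label column:

```python
# Hypothetical stand-in for list(df.columns); the first entry labels the rows.
headers = ['Matrix', 'Column A', 'Column B']

# dict.fromkeys maps every remaining header to the same value, 1.
cols_dict = dict.fromkeys(headers[1:], 1)

print(cols_dict)  # {'Column A': 1, 'Column B': 1}
```

Because it slices the list rather than naming columns, this keeps working as more columns are added.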

Finding all keys in a dictionary for which one or more value is repeated more than once in another dictionary

I have two dictionaries:
dict_1 = {'mother': ['mother', 'mom', 'mum', 'mommy', 'mummy', 'mamma', 'momma', 'ma', 'mama'],
'boy': ['boy', 'guy', 'dude', 'lad', 'son', 'schoolboy', 'young man'],
'girl': ['girl', 'daughter', 'lass', 'schoolgirl', 'young lady'],
'kitchen': ['kitchen'],
'exterior': ['exterior', 'outside', 'outdoor', 'outdoors'],
'car': ['car', 'vehicule', 'automobile'],
'water': ['water']
}
dict_2 = {'basket': 2,
'car' : 8,
'juice': 1,
'window': 6,
'outside': 2,
'outdoor': 4,
'road': 1,
'mom': 5,
'mother': 2,
'song': 1,
'vehicule': 1,
'fruits': 6
}
I'm looking for a way to find all keys in dict_1 for which one or more value is a key that has a value > 1 in dict_2 and the number of times a value associated with these keys is repeated in dict_2. Once I've found this, I would like to get another dictionary in which the keys are dict_1's keys (in this case, 'mother' and 'exterior') that are repeated more than once and the values are the number of times a value associated with these keys is repeated in dict_2 (in this case, 7 for 'mother' and 6 for 'exterior').
With the dictionaries I have, I would like my new dictionary to look something like this:
dict_final = {'mother': 7,
'exterior': 6,
'car': 9
}
Is there a way to do that in Python?
One approach:
res = {}
for k, vs in dict_1.items():
    total = sum(dict_2.get(v, 0) for v in vs)
    if total > 0:
        res[k] = total
print(res)
Output
{'mother': 7, 'exterior': 6, 'car': 9}
As an alternative, consider a dictionary comprehension (with a walrus operator in the mix, which requires Python 3.8 or later):
res = {k: total for k, vs in dict_1.items() if (total := sum(dict_2.get(v, 0) for v in vs)) > 0}
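The approach above sums every matching count. If the "value > 1" condition in the question is meant literally, so that matches with a count of 1 are ignored, a minimal sketch might filter before summing. This reading is an assumption, and the dictionaries below are shortened copies of the question's data:

```python
dict_1 = {'mother': ['mother', 'mom'], 'car': ['car', 'vehicule'], 'water': ['water']}
dict_2 = {'car': 8, 'mom': 5, 'mother': 2, 'vehicule': 1}

res = {}
for k, vs in dict_1.items():
    # Only count synonyms whose count in dict_2 is greater than 1.
    total = sum(dict_2[v] for v in vs if dict_2.get(v, 0) > 1)
    if total:
        res[k] = total

print(res)  # {'mother': 7, 'car': 8}
```

Here 'car' comes out as 8 rather than the 9 in the expected output, which suggests the plain sum is what the question is after.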

I have made a dictionary using a list as the values for each key, I want to print the values without square brackets

items = ['Item 1', 'Item 2', 'Item 3']
new_items = {'bacon': items, 'bread': items, 'cheese': items}
for key, value in new_items.items():
    print('{}: {}'.format(key, *value))
Output:
bacon: Item 1
bread: Item 1
cheese: Item 1
How do I get all of the items to print? If I remove the asterisk before value it prints all 3 items, but in square brackets.
You can create the string to be output right in the format() method call:
items = ['Item 1', 'Item 2', 'Item 3']
new_items = {'bacon': items, 'bread': items, 'cheese': items}
for key, values in new_items.items():
    print('{}: {}'.format(key, ', '.join(values)))
Output:
bacon: Item 1, Item 2, Item 3
bread: Item 1, Item 2, Item 3
cheese: Item 1, Item 2, Item 3
The question reduces to "how to convert a list to a string?". Some possible options which might suit you:
In [1]: items = ['Item 1', 'Item 2', 'Item 3']
In [2]: ' '.join(items)
Out[2]: 'Item 1 Item 2 Item 3'
In [3]: ', '.join(items)
Out[3]: 'Item 1, Item 2, Item 3'
I think your problem is in how you construct new_items. Try this instead, then print:
items = ['Item 1', 'Item 2', 'Item 3']
labels = ['bacon', 'bread', 'cheese']
new_items = dict(zip(labels, items))
Using f-strings:
items = ['Item 1', 'Item 2', 'Item 3']
new_items = {'bacon': items, 'bread': items, 'cheese': items}
for key, value in new_items.items():
    print(f'{key}: {", ".join(value)}')
Output:
bacon: Item 1, Item 2, Item 3
bread: Item 1, Item 2, Item 3
cheese: Item 1, Item 2, Item 3
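One caveat: str.join() only accepts strings, so it raises a TypeError if a list holds numbers. A minimal sketch of a way around that, assuming made-up data (the stock dictionary and the format_line helper are hypothetical names):

```python
def format_line(key, values):
    """Return 'key: a, b, c' for a list of arbitrary items."""
    # Convert each item to str before joining.
    return '{}: {}'.format(key, ', '.join(map(str, values)))

# Hypothetical data with non-string values in the lists.
stock = {'bacon': [1, 2, 3], 'bread': [4, 5]}

for key, values in stock.items():
    print(format_line(key, values))
```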

How to update tuple when I find a duplicate

I have a tuple which consists of a number of teams that I have looped through and stored. The next step for me is to find the duplicates and store only one entry per team, but update the number which indicates how many people are associated with the team.
_teamList = []
for obj in context['object_list']:
    name = obj.team.name
    number = 1
    _teamList.append((name, number))
An example of the input looks something like:
[("Team bobcat", 1), ("Team Coffe", 1)]
The code above just collects the teams and pairs each one with a count of 1.
I have tried something like this:
seen = set()
uniq = []
for x in _teamList:
    if x not in seen:
        x = 1 + x[1]
        uniq.append(x)
        seen.add(x)
Can anyone give me any tips?
You can refer to this solution:
x=('team a', 'team b', 'team a', 'team c', 'team b', 'team b', 'team b')
l = {}
for i in x:
    if i not in l:
        l[i] = 1
    else:
        l[i] = l[i] + 1
data = list(l.items())
print(data)
#output as: [('team a', 2), ('team b', 4), ('team c', 1)]
You can use Counter from the collections module.
https://docs.python.org/2/library/collections.html
This automatically groups identical names for you, so you do not need to count the occurrences yourself.
For example:
>>> from collections import Counter as c
>>> a = ('team a', 'team b', 'team a', 'team c', 'team b')
>>> c(a)
Counter({'team a': 2, 'team b': 2, 'team c': 1})
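Applied to a list of (name, number) tuples like the one in the question, a minimal sketch could look like this. team_list is a made-up sample, and summing the stored numbers rather than counting rows is an assumption about what the total should mean:

```python
from collections import Counter

# Made-up sample in the shape the question builds: (team name, people).
team_list = [("Team bobcat", 1), ("Team Coffe", 1), ("Team bobcat", 1)]

totals = Counter()
for name, number in team_list:
    # Add this entry's head count to the running total for the team.
    totals[name] += number

# Back to a list of (name, total) tuples, in first-seen order.
unique_teams = list(totals.items())
print(unique_teams)  # [('Team bobcat', 2), ('Team Coffe', 1)]
```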
Here's a base Python solution:
a = ('team a', 'team b', 'team a', 'team c', 'team b', 'team b', 'team b')
d = {}
for x in a:
    if x in d:
        d[x] += 1
    else:
        d[x] = 1
print(d)
# {'team a': 2, 'team b': 4, 'team c': 1}
If you want the output as a tuple, add:
tuple((name, ct) for name, ct in d.items())
# (('team a', 2), ('team b', 4), ('team c', 1))

How to add a string to all integer value of dictionary

I have made a dictionary with the following code. I want to add the string "English" to all values, but since the values are integers this is not accepted.
key = ["I", "you", "we", "us", "they", "their"]
value = list(range(len(key)))
dictionary = dict(zip(key,value))
print(dictionary)
Output:
{'they': 4, 'I': 0, 'you': 1, 'we': 2, 'us': 3, 'their': 5}
I want following output:
output = {'they': 'English 4', 'I': 'English 0', 'you': 'English 1', 'we': 'English 2', 'us': 'English 3', 'their': 'English 5'}
It sounds like you want two fields for every value (one is the index, one is the language). To do that, turn value into a list of tuples instead of a list of single values.
So value should contain ((0, "English"), (1, "English"), ... (len(key) - 1, "English"))
You can do that easily with enumerate:
value = enumerate(["English"] * len(key))
Output:
{'their': (5, 'English'), 'you': (1, 'English'), 'us': (3, 'English'), 'I': (0, 'English'), 'they': (4, 'English'), 'we': (2, 'English')}
(As you might have realized, enumerate(a) returns every item of a with its index attached, i.e. ((0, a[0]), (1, a[1]), (2, a[2]), ...).)
You can use a list comprehension. Change the code to:
value = range(len(key))
dictionary = dict(zip(key, ['English {}'.format(i) for i in value]))
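If the dictionary already exists, a dict comprehension can build the prefixed values directly from it. A minimal sketch, reusing the key list from the question (the name output follows the question's desired result):

```python
key = ["I", "you", "we", "us", "they", "their"]
dictionary = dict(zip(key, range(len(key))))

# Format each integer into the string; the f-string converts it to text.
output = {k: f'English {v}' for k, v in dictionary.items()}

print(output)
```

This sidesteps the original error, which comes from trying to concatenate a str and an int with +.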
