Merge values of a dictionary by key based on custom function - python

Assume you have two dictionaries and you want to merge them by applying a function to the values that have matching keys. Here I use the + operator as the binary function.
x = { 1: "a", 2: "b", 3: "c" }
y = { 1: "A", 2: "B", 3: "C" }
result = { t[0][0]: t[0][1] + t[1][1] for t in zip(sorted(x.items()), sorted(y.items())) }
print(result)  # gives { 1: "aA", 2: "bB", 3: "cC" }
I would prefer a self-contained expression instead of statements, but this one is unreadable.
So far I'm doing:
def dzip(f, a, b):
    least_keys = set.intersection(set(a.keys()), set(b.keys()))
    copy_dict = dict()
    # least_keys is a set, so iterate over it directly (sets have no .keys())
    for i in least_keys:
        copy_dict[i] = f(a[i], b[i])
    return copy_dict
print(dzip(lambda a, b: a + b, x, y))
Is there a more readable solution to this than the expression I gave?

In the first case, you can directly use a dict comprehension:
>>> x = { 1: "a", 2: "b", 3: "c" }
>>> y = { 1: "A", 2: "B", 3: "C" }
>>> {key: x.get(key, "") + y.get(key, "") for key in set.intersection(set(x.keys()), set(y.keys()))}
{1: 'aA', 2: 'bB', 3: 'cC'}
Your second piece of code can then be simplified to a one-liner:
def dzip(f, a, b):
    return {key: f(a.get(key, ""), b.get(key, "")) for key in set.intersection(set(a.keys()), set(b.keys()))}
You can even define dzip as a lambda:
dzip = lambda f, a, b: {key: f(a.get(key, ""), b.get(key, ""))
                        for key in set.intersection(set(a.keys()), set(b.keys()))}
In a single run, this becomes:
>>> dzip = lambda f, a, b: {key: f(a.get(key, ""), b.get(key, ""))
... for key in set.intersection(set(a.keys()), set(b.keys()))}
>>>
>>> print(dzip(lambda a, b: a + b, x, y))
{1: 'aA', 2: 'bB', 3: 'cC'}
Note that this works even if x and y have different sets of keys, which can break the first version of your code: only the keys present in both dictionaries are kept.
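As a further sketch, assuming Python 3, where dictionary key views support set operations directly, the intersection can be written as a.keys() & b.keys(). The name dzip follows the question; using operator.add in place of the lambda is just one option.

```python
from operator import add

def dzip(f, a, b):
    # a.keys() & b.keys() is the set of keys present in both dictionaries
    return {key: f(a[key], b[key]) for key in a.keys() & b.keys()}

x = {1: "a", 2: "b", 3: "c"}
y = {1: "A", 2: "B", 4: "D"}   # keys 3 and 4 have no partner and are dropped
merged = dzip(add, x, y)
print(merged)
```

Because only shared keys are visited, plain indexing is safe here and no default value is needed.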

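You can use Counter for this type of dict merging. Note that adding Counters with string values relies on Python 2 comparison rules; in Python 3 it raises a TypeError, because Counter addition compares each combined value with 0.
>>> from collections import Counter
>>> Counter(x) + Counter(y)
Counter({3: 'cC', 2: 'bB', 1: 'aA'})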
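For numeric values, Counter addition sums the counts key by key. A minimal sketch with made-up counts (the variable names are illustrative only); note that the result keeps keys found in either operand and drops any total that is not positive:

```python
from collections import Counter

stock_a = Counter({"apples": 2, "pears": 1})
stock_b = Counter({"apples": 3, "plums": 4})

# Sums the values for matching keys and keeps the remaining keys as they are
total = stock_a + stock_b
print(total)
```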

Related

comparing inner dictionaries in Python

I am trying to create a Python function that receives a dictionary whose values are inner dictionaries. If the keys of the inner dictionaries are the same, it should return 1; if not, it should return 0.
This is the code I tried:
def f(dct: dict) -> int:
    for i in range(len(dct)):
        for j in range(len(dct)):
            dct1 = list(dct.values())
            if dct1[i].keys() == dct1[j].keys():
                return 1
            else:
                return 0
It worked when the input dictionary had only two inner dictionaries, but not when it had three.
For example:
f(
{
"A": {1: "a", 2: "b"},
"B": {2: "c", 3: "d"},
}
)
returned 0 (which is the result I wanted)
but
f(
{
"A": {1: "a", 2: "b"},
"B": {2: "c", 3: "d"},
"C": {1: "c", 2: "d"},
}
)
returned 1, which is not the result I wanted.
How do I fix it, please?
So you want to ensure all of the dictionaries that are the values of dct have the same keys (ignoring the values)?
def all_key_sets_equal(dct: dict) -> bool:
    key_sets = [set(nd) for nd in dct.values()]
    return all(key_set == key_sets[0] for key_set in key_sets)
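A quick check of this approach against the two inputs from the question. The function is repeated so the sketch runs on its own, the third input is an added example, and int() is applied only because the question asked for 1 or 0:

```python
def all_key_sets_equal(dct: dict) -> bool:
    key_sets = [set(nd) for nd in dct.values()]
    return all(key_set == key_sets[0] for key_set in key_sets)

two = {"A": {1: "a", 2: "b"}, "B": {2: "c", 3: "d"}}
three = {"A": {1: "a", 2: "b"}, "B": {2: "c", 3: "d"}, "C": {1: "c", 2: "d"}}
same = {"A": {1: "a", 2: "b"}, "B": {1: "x", 2: "y"}}  # added example

print(int(all_key_sets_equal(two)))    # keys differ
print(int(all_key_sets_equal(three)))  # keys differ
print(int(all_key_sets_equal(same)))   # keys match
```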

how to write python to replace the next perl code?

I have just encountered Perl code similar to the following:
my @keys = qw/ 1 2 3 /;
my @vals = qw/ a b c /;
my %hash;
@hash{@keys} = @vals;
This code populates an associative array given a list of keys and a list of values. For example, the above code creates the following data structure (expressed as JSON):
{
"1": "a",
"2": "b",
"3": "c"
}
How would one go about doing this in Python?
Like this:
import json
keys = [1, 2, 3]
vals = ['a', 'b', 'c']
hash = dict(zip(keys, vals))
json.dumps(hash)
=> '{"1": "a", "2": "b", "3": "c"}'
That json is pretty much a polyglot with Python. Once you assign it to a name, though, it stops being a polyglot.
hf = {
"1": "a",
"2": "b",
"3": "c"
}
You can also iteratively align items into a dictionary.
letters = ('a', 'b', 'c', )
numbers = ('1', '2', '3', )
hf = { n : l for n, l in zip(numbers, letters) }
You can do:
>>> keys='123'
>>> vals='abc'
>>> dict(zip(keys,vals))
{'1': 'a', '3': 'c', '2': 'b'}
(Python note: strings are iterable, so list('abc') is the rough equivalent of my @vals = qw/ a b c /; in Perl)
Then if you want JSON:
>>> import json
>>> json.dumps(dict(zip(keys,vals)))
'{"1": "a", "3": "c", "2": "b"}'
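Two details are worth a small sketch: zip stops at the shorter input, and json.dumps turns non-string keys such as integers into strings. This assumes Python 3.7+, where dicts keep insertion order; the extra key 4 is added only to show the truncation.

```python
import json

keys = [1, 2, 3, 4]          # one more key than there are values
vals = ["a", "b", "c"]

mapping = dict(zip(keys, vals))   # zip stops at the shorter list, so 4 is dropped
encoded = json.dumps(mapping)     # integer keys become JSON strings
print(mapping)
print(encoded)
```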

identify a set containing correct values in dictionary of sets

I have a dictionary of sets, and two values to test. I need to identify the set containing both values (there's only one correct set) and return the key of that set. I thought I could get away with a one-liner like that below, but no success this far.
d = {"set1": {"A", "B", "C"}, "set2": {"D", "E", "F"}, "set3":{"A", "D", "C"}}
value1 = "A"
value2 = "B"
def do_values_belong_in_same_set(value1, value2):
    if all(x in v for k, v in d.items() for x in [value1, value2]) is True:
        return True, k
    else:
        return False
The desired output here would be: True, "set1"
The "v for k, v in d.items()" part doesn't do the trick, and neither does the simpler "x in d.values()". What would work? Or will I just need to construct a proper for-loop for this? Thanks for your help!
>>> value1 = "A"
>>> value2 = "B"
>>> d = {"set1": {"A", "B", "C"}, "set2": {"D", "E", "F"}, "set3":{"A", "D", "C"}}
>>> [k for k, v in d.items() if value1 in v and value2 in v]
['set1']
You can use set.issubset (equivalent to the <= operator) by combining your needle characters into a set.
d = {"set1": {"A", "B", "C"}, "set2": {"D", "E", "F"}, "set3":{"A", "D", "C"}}
value1 = "A"
value2 = "B"
needle_set = set([value1, value2])
result = next(k for k,v in d.items() if needle_set.issubset(v))
# or needle_set <= v, or v >= needle_set, or
# v.issuperset(needle_set), all are the same condition
You could roll it into a function with your requested output like:
def do_values_belong_in_same_set(source_d, *values):
    # The variadic argument `values` lets you check any number of values,
    # and the source dict is passed in by name as best practice
    needle_set = set(values)
    # Default to None so next() does not raise StopIteration when nothing matches
    result = next((k for k, v in source_d.items() if needle_set <= v), None)
    if result is not None:
        return True, result
    else:
        return False
You could slightly change the logic in your function to achieve this:
def do_values_belong_in_same_set(value1, value2):
    r = next((k for k, v in d.items() if all(i in v for i in {value1, value2})), False)
    if r:
        return True, r
    else:
        return r
Calling next on the generator will return the first k (set name) if one exists and a default value of False is assigned if the value is not present. Then you return accordingly.
This yields the following results for different runs:
do_values_belong_in_same_set(value1, value2)
(True, 'set1')
do_values_belong_in_same_set(value1, 'F')
False
do_values_belong_in_same_set(value1, 'E')
False
do_values_belong_in_same_set('F', 'E')
(True, 'set2')
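The next-with-default pattern described above can be sketched on its own. Here find_set_key is a hypothetical helper name, and None is used as the default rather than False so that a falsy key is not mistaken for a miss:

```python
def find_set_key(sets_by_key, *values):
    needles = set(values)
    # next() returns the first matching key, or None if nothing matches
    return next((k for k, s in sets_by_key.items() if needles <= s), None)

d = {"set1": {"A", "B", "C"}, "set2": {"D", "E", "F"}, "set3": {"A", "D", "C"}}
hit = find_set_key(d, "A", "B")
miss = find_set_key(d, "A", "F")
print(hit)
print(miss)
```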
You can filter the sets for which the set composed of value1 and value2 is a subset:
def do_values_belong_in_same_set(value1, value2):
    sets = [key for key, s in d.items() if {value1, value2} <= s]
    if sets:
        return True, sets[0]
    else:
        return False

Iterate over a list inside a nested dictionary

Let's say I have a dictionary like this:
myDict = {
1: {
"a": "something",
"b": [0, 1, 2],
"c": ["a", "b", "c"]
},
2: {
"a": "somethingElse",
"b": [3, 4, 5],
"c": ["d", "e", "f"]
},
3: {
"a": "another",
"b": [6, 7, 8],
"c": ["g", "h", "i"]
}
}
And this is my code:
for id, obj in myDict.items():
    for key, val in obj.items():
        if key is "b":
            for item in val:
                # apply some function to item
Is there a better way to iterate over a list inside a nested dict? Or is there a pythonic way to do this?
You absolutely do not need to iterate the list to print it (unless this is a functional requirement for the code you are writing).
Very simply, you could do this:
for id, obj in myDict.items():
    if "b" in obj:
        print(obj["b"])
To map the list object, represented by obj['b'] to another function, you can use the map function:
map(foo, obj["b"])
If your dictionary is always two levels deep, I don't see anything wrong with your approach. In your implementation, I would use key == "b" rather than key is "b". Using is tests for identity (e.g. id(a) == id(b)), while == tests for equality (e.g. a.__eq__(b)). The two behave the same way when I test it in IDLE, but relying on is here is not a good habit to get into. There's more info on it here: How is the 'is' keyword implemented in Python?
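To illustrate the difference, here is a small sketch using lists, since whether two equal strings happen to be the same object is an implementation detail that can vary:

```python
first = [0, 1, 2]
second = [0, 1, 2]   # equal contents, but a separate object
alias = first        # another name for the same object

print(first == second)   # equality: compares contents
print(first is second)   # identity: checks whether both names refer to one object
print(first is alias)
```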
If you want to deal with varying level dictionaries, you could use something like:
def test_dict_for_key(dictionary, key, function):
    for test_key, value in dictionary.items():
        if key == test_key:
            dictionary[key] = type(value)(map(function, value))
        if isinstance(value, dict):
            test_dict_for_key(value, key, function)
An example usage might be something like:
myDict = {
1: {
"a": "something",
"b": [0, 1, 2],
"c": ["a", "b", "c"]
},
2: {
"a": "somethingElse",
"b": [3, 4, 5],
"c": ["d", "e", "f"]
},
3: {
"a": "another",
"b": [6, 7, 8],
"c": ["g", "h", "i"]
}
}
# adds 1 to every entry in each b
test_dict_for_key(myDict, "b", lambda x: x + 1)
# prints [1, 2, 3]
print(myDict[1]["b"])
I'm a fan of generator expressions.
inner_lists = (inner_dict['b'] for inner_dict in myDict.values())
# if 'b' is not guaranteed to exist,
# replace inner_dict['b'] with inner_dict.get('b', [])
items = (item for ls in inner_lists for item in ls)
Now you can either use a for loop
for item in items:
    # apply function
or map
transformed_items = map(func, items)
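An equivalent sketch uses itertools.chain.from_iterable to flatten the inner lists. The trimmed-down myDict and the doubling step are placeholders for your own data and function:

```python
from itertools import chain

myDict = {
    1: {"a": "something", "b": [0, 1, 2]},
    2: {"a": "somethingElse", "b": [3, 4, 5]},
    3: {"a": "another"},   # no "b" here, to exercise the .get default
}

# Lazily yield every item of every "b" list, skipping dicts without "b"
items = chain.from_iterable(inner.get("b", []) for inner in myDict.values())
doubled = [item * 2 for item in items]
print(doubled)
```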
A couple of fixes could be made.
Don't use is when comparing two strings (if key is "b":)
Simply say print(item) instead of using .format(), since you only have one variable that you're printing, with no additional string formatting
Revised code:
for id, obj in myDict.items():
    for key, val in obj.items():
        if key == "b":
            for item in val:
                print(item)
If you are sure that you will have a b key in every case, you can simply do:
for id, obj in myDict.items():
    for item in obj["b"]:
        print(item)

How to dynamically move inside a nested dict in Python

I have a dict with a variable number of nested dicts inside of it, something like:
my_dict = {"a": {"b": {"c": {...}}}}
I need to dynamically move inside this dict, for instance I'd like to do the following:
levels = ["a", "b", "c"]
my_dict[levels[0]][levels[1]][levels[2]] = "something"
where the number of items inside "levels" may vary.
I can partially achieve the same result for a limited number of items inside "levels" by writing something like this:
if len(levels) == 1:
    my_dict[levels[0]] = "something"
elif len(levels) == 2:
    my_dict[levels[0]][levels[1]] = "something"
elif len(levels) == 3:
    my_dict[levels[0]][levels[1]][levels[2]] = "something"
(...)
but I'm looking for a more general and elegant solution.
Is there a way to do this?
There isn't a lot of code here to go on, but for what you have given, you can define
def get(d, keys):
    for key in keys:
        d = d[key]
    return d
def set(d, keys, value):
    d = get(d, keys[:-1])
    d[keys[-1]] = value
And then use it like this
my_dict = {"a":{"b":{"c":{}}}}
set(my_dict, ["a", "b", "c"], "something")
print(get(my_dict, ["a", "b", "c"]))
A functional alternative for get:
from functools import reduce  # reduce is not a builtin in Python 3
def get(d, keys):
    return reduce(lambda d, key: d[key], keys, d)
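If intermediate levels may be missing, one possible variation creates them on the way down with dict.setdefault. This goes slightly beyond what the question asks, and set_nested is a hypothetical name chosen to avoid shadowing the builtin set:

```python
def set_nested(d, keys, value):
    # Walk down to the parent of the last key, creating empty dicts as needed
    for key in keys[:-1]:
        d = d.setdefault(key, {})
    d[keys[-1]] = value

my_dict = {"a": {}}
set_nested(my_dict, ["a", "b", "c"], "something")
print(my_dict)
```

setdefault returns the existing inner dict when the key is present, so existing branches are reused rather than overwritten.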
