Compare two dicts and print non-equal values in python - python

I need to compare dictionary b against a to check whether the keys of b are in a.
If present check the values of a[key]==b[key]. If not equal print the key:value pair of both dictionaries for reference. How can I do that?
a = {'key_1': 1,'key_2': 2, 'key_3': 3}
b = {'key_1': 1,'key_2': 5}
[k for key in b if key in a if b[k]!=a[k]]
I used the above code, but I am not able to print the keys and values of both dictionaries, like this:
not equal: b[key_2]=5 and a[key_2]=2

You want to find the intersecting keys and then check their values:
a = {'key_1': 1,'key_2': 2, 'key_3': 3}
b = {'key_1': 1,'key_2': 5}
# find keys common to both
inter = a.keys() & b
diff_vals = [(k, a[k], b[k]) for k in inter if a[k] != b[k]]
for k, av, bv in diff_vals:
    print("key = {}, val_a = {}, val_b = {}".format(k, av, bv))
key = key_2, val_a = 2, val_b = 5
You can use many different set operations on the dict view objects:
# find key/value pairings that are unique to either dict
symmetric = a.items() ^ b.items()
{('key_2', 2), ('key_2', 5), ('key_3', 3)}
# key/values in b that are not in a
difference = b.items() - a.items()
{('key_2', 5)}
# key/values in a that are not in b
difference = a.items() - b.items()
{('key_3', 3), ('key_2', 2)}
# get unique set of all keys from a and b
union = a.keys() | b
{'key_1', 'key_2', 'key_3'}
# get keys common to both dicts
inter = a.keys() & b
{'key_1', 'key_2'}

What you want is likely this:
result = [(k, a[k], b[k]) for k in a if k in b and a[k]!=b[k]]
In other words, "generate a list of tuples composed of the key, the first value and the second, whenever a key in a is also in b and the corresponding values are not equal".
Since boolean expressions with "and" short-circuit (they evaluate from left to right and stop as soon as a false value is found), you don't have to worry that "b[k]!=a[k]" could raise a KeyError when k is missing from b.
This raises another question: what if the key is in a and not b or vice-versa, e.g. ('car', 2, None) or ('car', None, 2)? Should that also be a valid answer?
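If keys that exist on only one side should count as differences too, a minimal sketch could look like the following. The helper name diff_all_keys and the _MISSING sentinel are made up for illustration, and the sketch assumes the keys can be sorted against each other:

```python
# Hypothetical helper (not from the answers above): reports every key whose
# values differ, treating a key that is absent on one side as a difference.
_MISSING = object()  # sentinel, so a stored None is not confused with "absent"


def diff_all_keys(a, b):
    """Return (key, a_value, b_value) tuples; None marks an absent key."""
    result = []
    for k in sorted(a.keys() | b.keys()):  # assumes mutually sortable keys
        av = a.get(k, _MISSING)
        bv = b.get(k, _MISSING)
        if av is _MISSING or bv is _MISSING or av != bv:
            result.append((k,
                           None if av is _MISSING else av,
                           None if bv is _MISSING else bv))
    return result


a = {'key_1': 1, 'key_2': 2, 'key_3': 3}
b = {'key_1': 1, 'key_2': 5}
print(diff_all_keys(a, b))  # [('key_2', 2, 5), ('key_3', 3, None)]
```

Reporting None for an absent key follows the ('car', 2, None) shape suggested above. If None is a legitimate value in your data, you may prefer to return the sentinel itself.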

I think there is a small error in the code you posted. First, you mix k and key for the same object. Second, although two chained if-clauses are legal in a list comprehension, combining the conditions with and reads more clearly. Here is how it could look: [k for k in b if k in a and a[k]!=b[k]]
This will produce a list with all the keys for which the values don't match. Given such a key, you can simply use e.g. "a[{k}]={a} and b[{k}]={b}".format(k=k,a=a[k],b=b[k]) to get a human-readable string describing that mismatch. Of course, if you are only going to create that list of keys to loop over it afterwards (for printing), then there is no need to actually create that list in the first place. Simply iterate directly over the dictionary keys.
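As a rough sketch of that direct iteration, a generator can yield one message per mismatch without building an intermediate list of keys. The function name iter_mismatches is only illustrative, and the message format is the one requested in the question:

```python
def iter_mismatches(a, b):
    """Yield one message for each key of b that is also in a with a different value."""
    for k in b:
        if k in a and a[k] != b[k]:
            yield "not equal: b[{k}]={bv} and a[{k}]={av}".format(k=k, bv=b[k], av=a[k])


a = {'key_1': 1, 'key_2': 2, 'key_3': 3}
b = {'key_1': 1, 'key_2': 5}

for line in iter_mismatches(a, b):
    print(line)  # not equal: b[key_2]=5 and a[key_2]=2
```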

This might work
a = {'key_1': 1,'key_2': 2, 'key_3': 3}
b = {'key_1': 1,'key_2': 5}
i = [k for k in b if k in a if b[k] != a[k]]
if i:
    for k in i:
        print('not equal:b[', k, ']=', b[k], 'and a[', k, ']=', a[k])
Output
not equal:b[ key_2 ]= 5 and a[ key_2 ]= 2

Related

python iterate through an array and access the same value in a dictionary

I have a dictionary that consists of numbers and their value
dict = {1:5, 2:5, 3:5}
I have an array with some numbers
arr = [1,2]
What I want to do is:
iterate through the dict and the array
where the dictionary value is equal to the number in the array, set the dictionary value to zero
any value in the dictionary for which there isn't a value in the array matching it, add 1
so in the above example, I should end up with
arr = [1,2]
dict = {1:0, 2:0, 3:6}
The bit I am getting stuck on is creating a variable from the array value and accessing that particular number in the dictionary - using dict[i] for example
arr = [1,2]
data = {1:5, 2:5, 3:5} # don't call it dict because that shadows the built-in class
unique = set(arr) # speeds up lookups if arr is big
# readable
for k, v in data.items():
    if k in unique:
        data[k] = 0
    else:
        data[k] += 1
# one-liner (an alternative to the loop above; use one or the other)
data = {k: (0 if k in unique else v + 1) for k, v in data.items()}
Additional example:
for a, b, c in [(1,2,3), (4,5,6)]:
    print('-', a, b, c)
# will print:
# - 1 2 3
# - 4 5 6
You just need a dict-comprehension that will re-built your dictionary with an if condition for the value part.
my_dict = {1:5, 2:5, 3:5}
arr = [1,2]
my_dict = {k: (0 if k in arr else v+1) for k, v in my_dict.items()}
print(my_dict) # {1: 0, 2: 0, 3: 6}
Note that I have re-named the dictionary from dict to my_dict. That is because by using dict you are overwriting the Python built-in called dict. And you do not want to do that.
There's always the dict(map()) approach, which builds a new dictionary with a new value for each of the keys:
>>> d = {1:5, 2:5, 3:5}
>>> arr = {1, 2}
>>> dict(map(lambda x: (x[0], 0) if x[0] in arr else (x[0], x[1]+1), d.items()))
{1: 0, 2: 0, 3: 6}
This works because wrapping dict() will automatically convert mapped 2-tuples to a dictionary.
Also you should not use dict as a variable name, since it shadows the builtin dict.
Just use .update method :
dict_1 = {1:5, 2:5, 3:5}
arr = [1,2]
for i in dict_1:
    if i in arr:
        dict_1.update({i:0})
    else:
        dict_1.update({i:dict_1.get(i)+1})
print(dict_1)
output:
{1: 0, 2: 0, 3: 6}
P.S : don't use dict as variable
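The in-place variants above can also be wrapped in a small function. This is only a sketch, and the name reset_or_increment is made up for illustration. Assigning to keys that already exist does not change the size of the dictionary, so doing it while iterating is safe:

```python
def reset_or_increment(counts, keys):
    """Set counts[k] to 0 for every k in keys and add 1 to all other values, in place."""
    wanted = set(keys)  # set membership tests are fast even for a large list
    for k in counts:    # only values change, never the set of keys
        counts[k] = 0 if k in wanted else counts[k] + 1
    return counts


data = {1: 5, 2: 5, 3: 5}
reset_or_increment(data, [1, 2])
print(data)  # {1: 0, 2: 0, 3: 6}
```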

Combine pop() and setdefault() in python

I'm trying to build a method where if an item is not in a dictionary then it uses the last member of a list and updates the dictionary accordingly. Sort of like a combination of the pop and setdefault method. What I tried was the following:
dict1 = {1:2,3:4,5:6}
b = 7
c = [8,9,10]
e = dict1.setdefault(b, {}).update(pop(c))
So I would like the output to be where {7:10} gets updated to dict1, that is to say, if b is not in the keys of dict1 then the code updates dict1 with an item using b and the last item of c.
It might be possible for you to abuse a defaultdict:
from collections import defaultdict
c = [8, 9, 10]
dict1 = defaultdict(c.pop, {1: 2, 3: 4, 5: 6})
b = 7
e = dict1[b]
This will pop an item from c and make it a value of dict1 whenever a key missing from dict1 is accessed. (That means the expression dict1[b] on its own has side-effects.) There are many situations where that behaviour is more confusing than helpful, though, in which case you can opt for explicitness:
if b in dict1:
    e = dict1[b]
else:
    e = dict1[b] = c.pop()
which can of course be wrapped up in a function:
def get_or_pop(mapping, key, source):
    if key in mapping:
        v = mapping[key]
    else:
        v = mapping[key] = source.pop()
    return v
⋮
e = get_or_pop(dict1, b, c)
Considering your variables, you could use the following code snippet
dict1[b] = dict1.pop(b, c.pop())
where you are updating the dictionary "dict1" with the key "b" and the value c.pop() (the last value of the list c, which is also removed from the list). Note that this is possible because the key value b=7 is not in your original dictionary.
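One thing to be aware of with this one-liner: Python evaluates the arguments of a call before making the call, so c.pop() runs even when the key is already present. The sketch below reuses the variables from the question to show the effect:

```python
dict1 = {1: 2, 3: 4, 5: 6}
c = [8, 9, 10]

# Key 7 is missing, so the popped 10 becomes its value.
dict1[7] = dict1.pop(7, c.pop())

# Key 1 is present, so its value 2 is kept, but c.pop() still runs
# and the popped 9 is silently discarded.
dict1[1] = dict1.pop(1, c.pop())

print(dict1)  # {3: 4, 5: 6, 7: 10, 1: 2}
print(c)      # [8]
```

Popping and re-inserting also moves key 1 to the end of the dictionary. The explicit if/else version shown earlier avoids both side effects.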

assign values to list of variables in python

I have made a small demo of a more complex problem
def f(a):
    return tuple([x for x in range(a)])
d = {}
[d['1'],d['2']] = f(2)
print d
# {'1': 0, '2': 1}
# Works
Now suppose the keys are programmatically generated
How do i achieve the same thing for this case?
n = 10
l = [x for x in range(n)]
[d[x] for x in l] = f(n)
print d
# SyntaxError: can't assign to list comprehension
You can't, it's a syntactical feature of the assignment statement. If you do something dynamic, it'll use different syntax, and thus not work.
If you have some function results f() and a list of keys keys, you can use zip to create an iterable of keys and results, and loop over them:
d = {}
for key, value in zip(keys, f()):
    d[key] = value
That is easily rewritten as a dict comprehension:
d = {key: value for key, value in zip(keys, f())}
Or, in this specific case as mentioned by @JonClements, even as
d = dict(zip(keys, f()))
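Applied to the question's example with generated keys, a minimal sketch might look like this. The keys list is a made-up stand-in for whatever generates the keys, and dict.update is used so that an existing dictionary is extended rather than replaced:

```python
def f(n):
    return tuple(range(n))


n = 10
keys = [str(i + 1) for i in range(n)]  # hypothetical generated keys: '1' .. '10'

d = {}
d.update(zip(keys, f(n)))  # update() accepts an iterable of (key, value) pairs

print(d['1'], d['10'])  # 0 9
```

Keep in mind that zip stops at the shorter input, so a mismatch between the number of keys and the number of results is silently truncated rather than reported.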

Ordering a nested dictionary by the frequency of the nested value

I have this list made from a csv which is massive.
For every item in the list, I have broken it into its id and details. The id is always between 0 and 3 characters long and details is of variable length.
I created an empty dictionary, D...(rest of code below):
D={}
for v in list:
    id = v[0:3]
    details = v[3:]
    if id not in D:
        D[id] = {}
    if details not in D[id]:
        D[id][details] = 0
    D[id][details] += 1
aside: Can you help me understand what the two if statements are doing? Very new to python and programming.
Anyway, it produces something like this:
{'KEY1_1': {'key2_1' : value2_1, 'key2_2' : value2_2, 'key2_3' : value2_3},
'KEY1_2': {'key2_1' : value2_1, 'key2_2' : value2_2, 'key2_3' : value2_3},
and many more KEY1's with variable numbers of key2's
Each 'KEY1' is unique but each 'key2' isn't necessarily. The value2_x values are all different.
Ok so, right now I found a way to sort by the first KEY
for k, v in sorted(D.items()):
    print k, ':', v
I have done enough research to know that dictionaries can't really be sorted but I don't care about sorting, I care about ordering or more specifically frequencies of occurrence. In my code value2_x is the number of times its corresponding key2_x occurs for that particular KEY1_x. I am starting to think I should have used better variable names.
Question: How do I order the top-level/overall dictionary by the number in value2_x which is in the nested dictionary? I want to do some statistics to those numbers like...
How many times does the most frequent KEY1_x:key2_x pair show up?
What are the 10, 20, 30 most frequent KEY1_x:key2_x pairs?
Can I only do that by each KEY1 or can I do it overall? Bonus: If I could order it that way for presentation/sharing that would be very helpful because it is such a large data set. So much thanks in advance and I hope I've made my question and intent clear.
You could use Counter to order the key pairs based on their frequency. It also provides an easy way to get x most frequent items:
from collections import Counter
d = {
'KEY1': {
'key2_1': 5,
'key2_2': 1,
'key2_3': 3
},
'KEY2': {
'key2_1': 2,
'key2_2': 3,
'key2_3': 4
}
}
c = Counter()
# Python 3 syntax; on Python 2 use iteritems() and the print statement
for k, v in d.items():
    c.update({(k, k1): v1 for k1, v1 in v.items()})
print(c.most_common(3))
Output:
[(('KEY1', 'key2_1'), 5), (('KEY2', 'key2_3'), 4), (('KEY2', 'key2_2'), 3)]
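The question also asks whether this can be done for each KEY1 separately. One possible sketch wraps each nested dictionary in its own Counter. The name top_per_key is made up for illustration, and the data is the sample dictionary from above:

```python
from collections import Counter

d = {
    'KEY1': {'key2_1': 5, 'key2_2': 1, 'key2_3': 3},
    'KEY2': {'key2_1': 2, 'key2_2': 3, 'key2_3': 4},
}

# A Counter built from a mapping uses the mapping's values as the counts.
top_per_key = {k: Counter(inner).most_common(2) for k, inner in d.items()}

print(top_per_key['KEY1'])  # [('key2_1', 5), ('key2_3', 3)]
print(top_per_key['KEY2'])  # [('key2_3', 4), ('key2_2', 3)]
```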
If you only care about the most common key pairs and have no other reason to build nested dictionary you could just use the following code:
from collections import Counter
l = ['foobar', 'foofoo', 'foobar', 'barfoo']
D = Counter((v[:3], v[3:]) for v in l)
print(D.most_common()) # [(('foo', 'bar'), 2), (('foo', 'foo'), 1), (('bar', 'foo'), 1)]
Short explanation: ((v[:3], v[3:]) for v in l) is a generator expression that will generate tuples where first item is the same as top level key in your original dict and second item is the same as key in nested dict.
>>> x = list((v[:3], v[3:]) for v in l)
>>> x
[('foo', 'bar'), ('foo', 'foo'), ('foo', 'bar'), ('bar', 'foo')]
Counter is a subclass of dict. It accepts an iterable as an argument; each unique element of the iterable becomes a key, and its value is the number of times that element occurs in the iterable.
>>> c = Counter(x)
>>> c
Counter({('foo', 'bar'): 2, ('foo', 'foo'): 1, ('bar', 'foo'): 1})
Since generator expression is an iterable there's no need to convert it to list in between so construction can simply be done with Counter((v[:3], v[3:]) for v in l).
The if statements you asked about are checking if the key exists in dict:
>>> d = {1: 'foo'}
>>> 1 in d
True
>>> 2 in d
False
So the following code will check if key with value of id exists in dict D and if it doesn't it will assign empty dict there.
if id not in D:
    D[id] = {}
The second if does exactly the same for nested dictionaries.
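If the nested structure is still wanted, both checks can be delegated to the standard library. The following is a sketch, not the original poster's code: a defaultdict creates the inner container on first access and a Counter starts every missing key at zero. The rows list reuses the sample strings from the answer above:

```python
from collections import Counter, defaultdict

rows = ['foobar', 'foofoo', 'foobar', 'barfoo']  # sample data, stands in for the CSV rows

D = defaultdict(Counter)  # a missing id gets a fresh Counter automatically
for v in rows:
    D[v[:3]][v[3:]] += 1  # a missing details key starts at 0

print(dict(D['foo']))  # {'bar': 2, 'foo': 1}
print(dict(D['bar']))  # {'foo': 1}
```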

Remove the smallest element(s) from a dictionary

I have a function that takes a dictionary as its parameter, where each value is an integer. I'm trying to remove the minimum element(s) and return a set of the remaining keys.
I am programming in Python. I can't seem to remove key-value pairs that share the same value. My code does not work for the 2nd and 3rd examples.
This is how it would work:
remaining({A: 1, B: 2, C: 2})
{B, C}
remaining({B: 2, C : 2})
{}
remaining({A: 1, B: 1, C: 1, D: 4})
{D}
This is what I have:
def remaining(d : {str:int}) -> {str}:
    Remaining = set(d)
    Remaining.remove(min(d, key=d.get))
    return Remaining
One approach is to take the minimum value, then find the keys whose values are equal to it and remove them from dict.viewkeys(), which has set-like behaviour.
d = {'A': 1, 'B': 1, 'C': 1, 'D': 4}
# Use .values() and .keys() and .items() for Python 3.x
min_val = min(d.itervalues())
remaining = d.viewkeys() - (k for k, v in d.iteritems() if v == min_val)
# set(['D'])
On a side note, I find it odd that {B: 2, C : 2} should be {} as there's not actually anything greater for those to be the minimum as it were.
That's because you're trying to map values to keys, and a map allows different keys to have the same value but not the other way around. You should implement a map "reversal" as described here, remove the minimum key, and then reverse the map back to its original form.
# your example
l = {'A': 1, 'B': 1, 'C': 1, 'D': 4}
# reverse the dict: value -> list of keys that have it
d1 = {}
for k, v in l.items():
    d1[v] = d1.get(v, []) + [k]
# remove the entry for the minimum value
del d1[min(d1)]
# rebuild the original mapping without the minimum entries
res = {}
for k, v in d1.items():
    for e in v:
        res[e] = k
print(res)
Comment:
@Jon Clements's solution is more elegant and should be accepted as the answer
Take the minimum value and construct a set with all the keys which are not associated to that value:
def remaining(d):
    m = min(d.values())
    return {k for k,v in d.items() if v != m}
If you don't like set comprehensions that's the same as:
def remaining(d):
    m = min(d.values())
    s = set()
    for k,v in d.items():
        if v != m:
            s.add(k)
    return s
This removes all the items with the minimum value.
def remaining(dic):
    minimum = min(dic.values())
    # iterate over a snapshot, because the dict is modified inside the loop
    for k, v in list(dic.items()):
        if v == minimum:
            dic.pop(k)
    return set(dic.keys())
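To check the three examples from the question, here is a small sketch based on the set-comprehension approach. Unlike the version that pops items, it leaves the input dictionary untouched. Returning an empty set for an empty dictionary is an assumption, since the question does not cover that case:

```python
def remaining(d):
    """Return the keys whose value is not the minimum value of d."""
    if not d:  # assumption: min() would raise ValueError on an empty dict
        return set()
    smallest = min(d.values())
    return {k for k, v in d.items() if v != smallest}


print(remaining({'A': 1, 'B': 2, 'C': 2}))          # contains 'B' and 'C'
print(remaining({'B': 2, 'C': 2}))                  # set()
print(remaining({'A': 1, 'B': 1, 'C': 1, 'D': 4}))  # {'D'}
```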
An easier way would be to use pd.Series.idxmin() or pd.Series.min(). These functions allow you to find the index of the minimum value or the minimum value in a series, plus pandas allows you to create a named index.
import pandas as pd
import numpy as np
A = pd.Series(np.zeros(5), index=['a','b','c','d','e'])  # series of 0.0 with a labelled index, similar to dictionary keys
A['a'] = 2
print(A.max())
#output 2.0
print(A.idxmax())#you can also pop by index without changing other indices
#output a
