I know how to remove an entry, 'key' from my dictionary d, safely. You do:
if d.has_key('key'):
del d['key']
However, I need to remove multiple entries from a dictionary safely. I was thinking of defining the entries in a tuple as I will need to do this more than once.
entities_to_remove = ('a', 'b', 'c')
for x in entities_to_remove:
if x in d:
del d[x]
However, I was wondering if there is a smarter way to do this?
Using dict.pop:
d = {'some': 'data'}
entries_to_remove = ('any', 'iterable')
for k in entries_to_remove:
d.pop(k, None)
Using dict comprehensions:
final_dict = {key: value for key, value in d.items() if key not in [key1, key2]}
where key1 and key2 are to be removed.
In the example below, keys "b" and "c" are to be removed, and they are kept in a keys list.
>>> a
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
>>> keys = ["b", "c"]
>>> print {key: a[key] for key in a if key not in keys}
{'a': 1, 'd': 4}
>>>
Why not like this:
entries = ('a', 'b', 'c')
the_dict = {'b': 'foo'}
def entries_to_remove(entries, the_dict):
for key in entries:
if key in the_dict:
del the_dict[key]
A more compact version was provided by mattbornski using dict.pop()
A solution is to use the map and filter functions:
python 2
d={"a":1,"b":2,"c":3}
l=("a","b","d")
map(d.__delitem__, filter(d.__contains__,l))
print(d)
python 3
d={"a":1,"b":2,"c":3}
l=("a","b","d")
list(map(d.__delitem__, filter(d.__contains__,l)))
print(d)
you get:
{'c': 3}
If you also need to retrieve the values for the keys you are removing, this would be a pretty good way to do it:
values_removed = [d.pop(k, None) for k in entities_to_remove]
You could of course still do this just for the removal of the keys from d, but you would be unnecessarily creating the list of values with the list comprehension. It is also a little unclear to use a list comprehension just for the function's side effect.
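If you want the removed values keyed by name rather than in a bare list, here is a small sketch of my own (reusing entities_to_remove from the question) that collects the popped pairs into a dict:
d = {'a': 1, 'b': 2, 'z': 26}
entities_to_remove = ('a', 'b', 'c')
# The filter runs before the pop, so missing keys ('c' here) are simply skipped
removed = {k: d.pop(k) for k in entities_to_remove if k in d}
print(removed)  # {'a': 1, 'b': 2}
print(d)        # {'z': 26}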
Found a solution with pop and map
d = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'b', 'c']
list(map(d.pop, keys))
print(d)
The output of this:
{'d': 'valueD'}
I am answering this question late simply because I think it may help anyone who searches for the same thing in the future.
Update
The above code will raise a KeyError if a key does not exist in the dict.
DICTIONARY = {'a': 'valueA', 'b': 'valueB', 'c': 'valueC', 'd': 'valueD'}
keys = ['a', 'l', 'c']
def remove_key(key):
DICTIONARY.pop(key, None)
list(map(remove_key, keys))
print(DICTIONARY)
output:
DICTIONARY = {'b': 'valueB', 'd': 'valueD'}
Some timing tests for CPython 3 show that a simple for loop is the fastest way, and it's quite readable. Adding a function doesn't cause much overhead either:
timeit results (10k iterations):
all(x.pop(v) for v in r) # 0.85
all(map(x.pop, r)) # 0.60
list(map(x.pop, r)) # 0.70
all(map(x.__delitem__, r)) # 0.44
del_all(x, r) # 0.40
<inline for loop>(x, r) # 0.35
def del_all(mapping, to_remove):
"""Remove list of elements from mapping."""
for key in to_remove:
del mapping[key]
For small numbers of iterations, doing that loop 'inline' was a bit faster, because of the overhead of the function call. But del_all is lint-safe, reusable, and faster than all the Python comprehension and mapping constructs.
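For clarity, the '<inline for loop>' entry above just means writing the same loop body at the call site instead of calling del_all; a minimal sketch with made-up data:
x = {i: str(i) for i in range(100)}
r = list(range(0, 100, 2))
# Inline variant: identical body to del_all, without the function-call overhead
for key in r:
    del x[key]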
I have no problem with any of the existing answers, but I was surprised to not find this solution:
keys_to_remove = ['a', 'b', 'c']
my_dict = {k: v for k, v in zip("a b c d e f g".split(' '), [0, 1, 2, 3, 4, 5, 6])}
for k in keys_to_remove:
try:
del my_dict[k]
except KeyError:
pass
assert my_dict == {'d': 3, 'e': 4, 'f': 5, 'g': 6}
Note: I stumbled across this question coming from here. And my answer is related to this answer.
I have tested the performance of three methods:
# Method 1: `del`
for key in remove_keys:
if key in d:
del d[key]
# Method 2: `pop()`
for key in remove_keys:
d.pop(key, None)
# Method 3: comprehension
{key: v for key, v in d.items() if key not in remove_keys}
Here are the results of 1M iterations:
del: 2.03s 2.0 µs/iter (100%)
pop(): 2.38s 2.4 µs/iter (117%)
comprehension: 4.11s 4.1 µs/iter (202%)
So del and pop() are the fastest; the comprehension is about 2x slower.
But anyway, we are talking microseconds here :) Dicts in Python are ridiculously fast.
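For reference, here is a rough sketch of how such a comparison could be timed with timeit (the data sizes and iteration counts are my own, so absolute numbers will differ; the dict copy is included in the timing for the two mutating methods):
import timeit

setup = "d = {i: str(i) for i in range(1000)}; remove_keys = set(range(0, 1000, 10))"

method_del = '''
c = dict(d)
for key in remove_keys:
    if key in c:
        del c[key]
'''

method_pop = '''
c = dict(d)
for key in remove_keys:
    c.pop(key, None)
'''

method_comp = "{k: v for k, v in d.items() if k not in remove_keys}"

for name, stmt in [("del", method_del), ("pop()", method_pop), ("comprehension", method_comp)]:
    print(name, timeit.timeit(stmt, setup=setup, number=10_000))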
Why not:
entriestoremove = (2,5,1)
for e in entriestoremove:
    if e in d:
del d[e]
I don't know what you mean by "smarter way". Surely there are other ways, maybe with dictionary comprehensions:
entriestoremove = (2,5,1)
newdict = {x: d[x] for x in d if x not in entriestoremove}
inline
import functools
#: note: key 'c' is not in d, so pop's default of None avoids a KeyError
d = {"a": "avalue", "b": "bvalue", "d": "dvalue"}
entitiesToRemove = ('a', 'b', 'c')
#: python2
map(lambda x: functools.partial(d.pop, x, None)(), entitiesToRemove)
#: python3
list(map(lambda x: functools.partial(d.pop, x, None)(), entitiesToRemove))
print(d)
# output: {'d': 'dvalue'}
I think using the fact that the keys can be treated as a set is the nicest way if you're on python 3:
def remove_keys(d, keys):
to_remove = set(keys)
filtered_keys = d.keys() - to_remove
filtered_values = map(d.get, filtered_keys)
return dict(zip(filtered_keys, filtered_values))
Example:
>>> remove_keys({'k1': 1, 'k3': 3}, ['k1', 'k2'])
{'k3': 3}
It would be nice to have full support for set methods for dictionaries (and not the unholy mess we're getting with Python 3.9) so that you could simply "remove" a set of keys. However, as long as that's not the case, and you have a large dictionary with potentially a large number of keys to remove, you might want to know about the performance. So I've created some code that builds something large enough for meaningful comparisons: a 100,000 x 1,000 matrix, roughly 100 million items in total.
from itertools import product
from time import perf_counter
# make a complete worksheet 100000 * 1000
start = perf_counter()
prod = product(range(1, 100000), range(1, 1000))
cells = {(x,y):x for x,y in prod}
print(len(cells))
print(f"Create time {perf_counter()-start:.2f}s")
clock = perf_counter()
# remove rows 50,000 and above (first 99 columns only)
keys = product(range(50000, 100000), range(1, 100))
# for x,y in keys:
# del cells[x, y]
for n in map(cells.pop, keys):
pass
print(len(cells))
stop = perf_counter()
print(f"Removal time {stop-clock:.2f}s")
10 million items or more is not unusual in some settings. Comparing the two methods on my local machine, I see a slight improvement when using map and pop, presumably because of fewer function calls, but both take around 2.5s. This pales in comparison to the time required to create the dictionary in the first place (55s), or to the cost of including existence checks within the loop. If missing keys are likely, it's best to create a set that is the intersection of the dictionary keys and your filter:
keys = cells.keys() & keys
In summary: del is already heavily optimised, so don't worry about using it.
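As a small self-contained illustration of that intersection trick (toy data rather than the worksheet benchmark above):
cells = {('r1', 'c1'): 1, ('r1', 'c2'): 2, ('r2', 'c1'): 3}
to_remove = {('r1', 'c2'), ('r9', 'c9')}  # may contain keys that do not exist
# Intersect first, so every key we loop over is guaranteed to be present
for key in cells.keys() & to_remove:
    del cells[key]
print(cells)  # {('r1', 'c1'): 1, ('r2', 'c1'): 3}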
Another map()-based way to remove a list of keys from a dictionary while avoiding a KeyError:
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, keys_to_remove))
print('k=', k)
print('dic after = \n', dic)
This will produce the output:
k= ['key_not_exist', 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
Passing keys_to_remove twice looks artificial: the second iterable supplies the default values for dict.pop(), so missing keys don't raise a KeyError.
You can pass any sequence with the same length as keys_to_remove instead.
For example:
import numpy as np
dic = {
'key1': 1,
'key2': 2,
'key3': 3,
'key4': 4,
'key5': 5,
}
keys_to_remove = ['key_not_exist', 'key1', 'key2', 'key3']
k = list(map(dic.pop, keys_to_remove, np.zeros(len(keys_to_remove))))
print('k=', k)
print('dic after = ', dic)
This will produce the output:
k= [0.0, 1, 2, 3]
dic after = {'key4': 4, 'key5': 5}
def delete_keys_from_dict(dictionary, keys):
"""
    Recursively removes the unwanted keys from the dictionary.
    :param dictionary: dict
    :param keys: list of keys to remove
    :return: a new dict without those keys
"""
from collections.abc import MutableMapping
keys_set = set(keys)
modified_dict = {}
for key, value in dictionary.items():
if key not in keys_set:
if isinstance(value, list):
modified_dict[key] = list()
for x in value:
if isinstance(x, MutableMapping):
modified_dict[key].append(delete_keys_from_dict(x, keys_set))
else:
modified_dict[key].append(x)
elif isinstance(value, MutableMapping):
modified_dict[key] = delete_keys_from_dict(value, keys_set)
else:
modified_dict[key] = value
return modified_dict
_d = {'a': 1245, 'b': 1234325, 'c': {'a': 1245, 'b': 1234325}, 'd': 98765,
'e': [{'a': 1245, 'b': 1234325},
{'a': 1245, 'b': 1234325},
{'t': 767}]}
_output = delete_keys_from_dict(_d, ['a', 'b'])
_expected = {'c': {}, 'd': 98765, 'e': [{}, {}, {'t': 767}]}
print(_expected)
print(_output)
I'm late to this discussion, but for anyone else: a solution may be to create a list of the keys to remove, like this.
k = ['a','b','c','d']
Then use pop() in a list comprehension, or in a for loop, to iterate over the keys and pop them one at a time:
new_dictionary = [dictionary.pop(x, 'n/a') for x in k]
The 'n/a' default is returned when a key does not exist, so no KeyError is raised. Note that the result is a list of the removed values, not a dictionary.
I have created three dictionaries: dict1, dict2, and dict3. I want to update dict1 with dict2 first, and then the resulting dictionary with dict3. I am not sure why the counts are not adding up.
def wordcount_directory(directory):
dict = {}
filelist=[os.path.join(directory,f) for f in os.listdir(directory)]
dicts=[wordcount_file(file) for file in filelist]
dict1=dicts[0]
dict2=dicts[1]
dict3=dicts[2]
for k,v in dict1.iteritems():
if k in dict2.keys():
dict1[k]+=1
else:
dict1[k]=v
for k1,v1 in dict1.iteritems():
if k1 in dict3.keys():
dict1[k1]+=1
else:
dict1[k1]=v1
return dict1
print wordcount_directory("C:\\Users\\Phil2040\\Desktop\\Word_count")
Maybe I am not understanding your question right, but are you trying to add all the values from each of the dictionaries together into one final dictionary? If so:
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}
def add_dict(to_dict, from_dict):
for key, value in from_dict.iteritems():
to_dict[key] = to_dict.get(key, 0) + value
result = dict(dict1)
add_dict(result, dict2)
add_dict(result, dict3)
print result
This yields: {'a': 1, 'c': 4, 'b': 7, 'e': 7, 'd': 10}
It would be really helpful to post what the expected outcome should be for your question.
EDIT:
For an arbitrary amount of dictionaries:
result = dict(dicts[0])
for dict_sum in dicts[1:]:
add_dict(result, dict_sum)
print(result)
If you really want to fix the code from your original question in the format it is in:
You are using dict1[k] += 1 when you should be performing dict1[k] += dict2.get(k, 0).
The introduction of get removes the need to check for the key's existence with an if statement.
You need to iterate through dict2 and dict3 to introduce new keys from them into dict1 (see the sketch after this list).
(not really a problem, but worth mentioning) In the if statement that checks whether the key is in the dictionary, it is recommended to simplify the check to if k in dict2: (see this post for more details)
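To make those fixes concrete, here is a minimal sketch of the question's loops rewritten with them applied (using items() so it runs on both Python 2.7 and 3; the sample dicts are the ones from above):
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}
for k, v in dict2.items():          # iterate over dict2, not dict1
    dict1[k] = dict1.get(k, 0) + v  # add dict2's count, not a constant 1
for k, v in dict3.items():          # same for dict3; get() handles new keys
    dict1[k] = dict1.get(k, 0) + v
print(dict1)  # {'a': 1, 'b': 7, 'c': 4, 'd': 10, 'e': 7}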
With the amazing built-in library found by @DisplacedAussie, the answer can be simplified even further:
from collections import Counter
print(Counter(dict1) + Counter(dict2) + Counter(dict3))
The result yields: Counter({'d': 10, 'b': 7, 'e': 7, 'c': 4, 'a': 1})
The Counter object is a sub-class of dict, so it can be used in the same way as a standard dict.
Hmmm, here is a simple function that might help:
def dictsum(dict1, dict2):
'''Modify dict1 to accumulate new sums from dict2
'''
k1 = set(dict1.keys())
k2 = set(dict2.keys())
for i in k1 & k2:
dict1[i] += dict2[i]
for i in k2 - k1:
dict1[i] = dict2[i]
return None
... for the intersection of keys, update each entry by adding the value from the second dict to the existing one; then for the keys that are only in the second dict, add those key/value pairs.
With that defined you'd simply call:
dictsum(dict1, dict2)
dictsum(dict1, dict3)
... and be happy.
(I will note that functions that modify the contents of dictionaries in this fashion are not all that common. I'm returning None explicitly to follow the convention established by the list.sort() method ... functions which modify the contents of a container, in Python, do not normally return copies of the container).
If I understand your question correctly, you are iterating on the wrong dictionary. You want to iterate over dict2 and update dict1 with matching keys or add non-matching keys to dict1.
If so, here's how you need to update the for loops:
for k,v in dict2.iteritems(): # Iterate over dict2
if k in dict1.keys():
dict1[k]+=1 # Update dict1 for matching keys
else:
dict1[k]=v # Add non-matching keys to dict1
for k1,v1 in dict3.iteritems(): # Iterate over dict3
if k1 in dict1.keys():
dict1[k1]+=1 # Update dict1 for matching keys
else:
dict1[k1]=v1 # Add non-matching keys to dict1
I assume that wordcount_file(file) returns a dict of the words found in file, with each key being a word and the associated value being the count for that word. If so, your updating algorithm is wrong. You should do something like this:
keys1 = dict1.keys()
for k,v in dict2.iteritems():
if k in keys1:
dict1[k] += v
else:
dict1[k] = v
If there's a lot of data in these dicts you can make the key lookup faster by storing the keys in a set:
keys1 = set(dict1.keys())
You should probably put that code into a function, so you don't need to duplicate the code when you want to update dict1 with the data in dict3.
You should take a look at collections.Counter, a subclass of dict that supports counting; using Counters would simplify this task considerably. But if this is an assignment (or you're using Python 2.6 or older) you may not be able to use Counters.
Given a dictionary like so:
my_map = {'a': 1, 'b': 2}
How can one invert this map to get:
inv_map = {1: 'a', 2: 'b'}
Python 3+:
inv_map = {v: k for k, v in my_map.items()}
Python 2:
inv_map = {v: k for k, v in my_map.iteritems()}
Assuming that the values in the dict are unique:
Python 3:
dict((v, k) for k, v in my_map.items())
Python 2:
dict((v, k) for k, v in my_map.iteritems())
If the values in my_map aren't unique:
Python 3:
inv_map = {}
for k, v in my_map.items():
inv_map[v] = inv_map.get(v, []) + [k]
Python 2:
inv_map = {}
for k, v in my_map.iteritems():
inv_map[v] = inv_map.get(v, []) + [k]
To do this while preserving the type of your mapping (assuming that it is a dict or a dict subclass):
def inverse_mapping(f):
return f.__class__(map(reversed, f.items()))
Try this:
inv_map = dict(zip(my_map.values(), my_map.keys()))
(Note that the Python docs on dictionary views explicitly guarantee that .keys() and .values() have their elements in the same order, which allows the approach above to work.)
Alternatively:
inv_map = dict((my_map[k], k) for k in my_map)
or using python 3.0's dict comprehensions
inv_map = {my_map[k] : k for k in my_map}
Another, more functional, way:
my_map = { 'a': 1, 'b':2 }
dict(map(reversed, my_map.items()))
We can also invert a dictionary with duplicate values using defaultdict:
from collections import Counter, defaultdict
def invert_dict(d):
d_inv = defaultdict(list)
for k, v in d.items():
d_inv[v].append(k)
return d_inv
text = 'aaa bbb ccc ddd aaa bbb ccc aaa'
c = Counter(text.split()) # Counter({'aaa': 3, 'bbb': 2, 'ccc': 2, 'ddd': 1})
dict(invert_dict(c)) # {1: ['ddd'], 2: ['bbb', 'ccc'], 3: ['aaa']}
See the defaultdict documentation:
This technique is simpler and faster than an equivalent technique using dict.setdefault().
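For comparison, the setdefault()-based equivalent that the quote refers to would look roughly like this:
def invert_dict_setdefault(d):
    # Same grouping as invert_dict above, but with a plain dict and setdefault()
    d_inv = {}
    for k, v in d.items():
        d_inv.setdefault(v, []).append(k)
    return d_inv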
This expands upon the answer by Robert, applying to when the values in the dict aren't unique.
class ReversibleDict(dict):
# Ref: https://stackoverflow.com/a/13057382/
def reversed(self):
"""
Return a reversed dict, with common values in the original dict
grouped into a list in the returned dict.
Example:
>>> d = ReversibleDict({'a': 3, 'c': 2, 'b': 2, 'e': 3, 'd': 1, 'f': 2})
>>> d.reversed()
{1: ['d'], 2: ['c', 'b', 'f'], 3: ['a', 'e']}
"""
revdict = {}
for k, v in self.items():
revdict.setdefault(v, []).append(k)
return revdict
The implementation is limited in that you cannot use reversed twice and get the original back. It is not symmetric as such. It was tested with Python 2.6. One use case where I use it is to print the resulting dict.
If you'd rather use a set than a list, and there could exist unordered applications for which this makes sense, instead of setdefault(v, []).append(k), use setdefault(v, set()).add(k).
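A quick sketch of that set-based variant, written outside the class for brevity:
d = {'a': 3, 'c': 2, 'b': 2, 'e': 3, 'd': 1, 'f': 2}
revdict = {}
for k, v in d.items():
    revdict.setdefault(v, set()).add(k)
print(revdict)  # e.g. {3: {'a', 'e'}, 2: {'b', 'c', 'f'}, 1: {'d'}}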
A combination of list and dictionary comprehensions. It can handle duplicate values:
{v: [k for k in d if d[k] == v] for v in d.values()}
A case where the dictionary values are sets:
some_dict = {"1":{"a","b","c"},
"2":{"d","e","f"},
"3":{"g","h","i"}}
The inverse would look like:
inverted_dict = {vi: k for k, v in some_dict.items() for vi in v}
The output is like this:
{'c': '1',
'b': '1',
'a': '1',
'f': '2',
'd': '2',
'e': '2',
'g': '3',
'h': '3',
'i': '3'}
For instance, you have the following dictionary:
my_dict = {'a': 'fire', 'b': 'ice', 'c': 'fire', 'd': 'water'}
And you want to get it in this inverted form:
inverted_dict = {'fire': ['a', 'c'], 'ice': ['b'], 'water': ['d']}
First Solution. For inverting key-value pairs in your dictionary use a for-loop approach:
# Use this code to invert dictionaries that have non-unique values
inverted_dict = dict()
for key, value in my_dict.items():
inverted_dict.setdefault(value, list()).append(key)
Second Solution. Use a dictionary comprehension approach for inversion:
# Use this code to invert dictionaries that have unique values
inverted_dict = {value: key for key, value in my_dict.items()}
Third Solution. Revert the inversion (this relies on the second solution):
# Use this code to invert dictionaries that have lists of values
my_dict = {value: key for key in inverted_dict for value in inverted_dict[key]}
There are a lot of answers, but I didn't find anything clean for the case of a dictionary with non-unique values.
A solution would be:
from collections import defaultdict
inv_map = defaultdict(list)
for k, v in my_map.items():
inv_map[v].append(k)
Example:
If initial dict my_map = {'c': 1, 'd': 5, 'a': 5, 'b': 10}
then, running the code above will give:
{5: ['a', 'd'], 1: ['c'], 10: ['b']}
I found that this version is more than 10% faster than the accepted answer for a dictionary with 10,000 keys.
d = {i: str(i) for i in range(10000)}
new_d = dict(zip(d.values(), d.keys()))
In addition to the other functions suggested above, if you like lambdas:
invert = lambda mydict: {v:k for k, v in mydict.items()}
Or, you could do it this way too:
invert = lambda mydict: dict( zip(mydict.values(), mydict.keys()) )
I think the best way to do this is to define a class. Here is an implementation of a "symmetric dictionary":
class SymDict:
def __init__(self):
self.aToB = {}
self.bToA = {}
def assocAB(self, a, b):
# Stores and returns a tuple (a,b) of overwritten bindings
currB = None
        if a in self.aToB: currB = self.aToB[a]
        currA = None
        if b in self.bToA: currA = self.bToA[b]
self.aToB[a] = b
self.bToA[b] = a
return (currA, currB)
def lookupA(self, a):
if a in self.aToB:
return self.aToB[a]
return None
def lookupB(self, b):
if b in self.bToA:
return self.bToA[b]
return None
Deletion and iteration methods are easy enough to implement if they're needed.
This implementation is way more efficient than inverting an entire dictionary (which seems to be the most popular solution on this page). Not to mention, you can add or remove values from your SymDict as much as you want, and your inverse-dictionary will always stay valid -- this isn't true if you simply reverse the entire dictionary once.
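For instance, a sketch of what deletion could look like (removeA and removeB are hypothetical names of my own, building on the SymDict class above):
class SymDictWithDelete(SymDict):
    def removeA(self, a):
        # Remove the a -> b binding and its b -> a mirror, if present
        if a in self.aToB:
            b = self.aToB.pop(a)
            self.bToA.pop(b, None)
    def removeB(self, b):
        # Remove the b -> a binding and its a -> b mirror, if present
        if b in self.bToA:
            a = self.bToA.pop(b)
            self.aToB.pop(a, None)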
If the values aren't unique, and you're a little hardcore:
inv_map = dict(
    (v, [k for (k, xx) in filter(lambda kv: kv[1] == v, my_map.items())])
for v in set(my_map.values())
)
Especially for a large dict, note that this solution is far less efficient than the answer Python reverse / invert a mapping because it loops over items() multiple times.
This handles non-unique values and retains much of the look of the unique case.
inv_map = {v:[k for k in my_map if my_map[k] == v] for v in my_map.itervalues()}
For Python 3.x, replace itervalues with values.
I am aware that this question already has many good answers, but I wanted to share this very neat solution that also takes care of duplicate values:
def dict_reverser(d):
seen = set()
return {v: k for k, v in d.items() if v not in seen or seen.add(v)}
This relies on the fact that set.add always returns None in Python.
Here is another way to do it.
my_map = {'a': 1, 'b': 2}
inv_map = {}
for key in my_map.keys():
    val = my_map[key]
    inv_map[val] = key
dict([(value, key) for key, value in d.items()])
The function is symmetric for values of type list; tuples are converted to lists when performing reverse_dict(reverse_dict(dictionary)).
def reverse_dict(dictionary):
reverse_dict = {}
for key, value in dictionary.iteritems():
if not isinstance(value, (list, tuple)):
value = [value]
for val in value:
reverse_dict[val] = reverse_dict.get(val, [])
reverse_dict[val].append(key)
for key, value in reverse_dict.iteritems():
if len(value) == 1:
reverse_dict[key] = value[0]
return reverse_dict
Since dictionaries require keys to be unique, unlike values, we have to append the inverted values into a list of sorts under the new keys.
def r_maping(dictionary):
    Map = {}
    for z, x in dictionary.iteritems():  # iterate through the keys and values
        # setdefault returns the list already stored for this value, or inserts
        # a fresh empty list first; the key is then appended to that list.
        Map.setdefault(x, []).append(z)
return Map
Fast functional solution for non-bijective maps (values not unique):
from itertools import imap, groupby
def fst(s):
return s[0]
def snd(s):
return s[1]
def inverseDict(d):
"""
input d: a -> b
output : b -> set(a)
"""
return {
v : set(imap(fst, kv_iter))
for (v, kv_iter) in groupby(
sorted(d.iteritems(),
key=snd),
key=snd
)
}
In theory this should be faster than adding to the set (or appending to the list) one by one like in the imperative solution.
Unfortunately the values have to be sortable; the sorting is required by groupby.
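Since imap and iteritems only exist in Python 2, a rough Python 3 port of the same idea could look like this (itemgetter replaces the snd helper; this is my sketch, not the original code):
from itertools import groupby
from operator import itemgetter

def inverse_dict_py3(d):
    """input d: a -> b; output: b -> set(a)"""
    snd = itemgetter(1)
    return {
        v: {k for k, _ in kv_iter}
        for v, kv_iter in groupby(sorted(d.items(), key=snd), key=snd)
    }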
Try this for Python 2.7/3.x:
inv_map = {}
for i in my_map:
    inv_map[my_map[i]] = i
print(inv_map)
def invertDictionary(d):
myDict = {}
for i in d:
value = d.get(i)
myDict.setdefault(value,[]).append(i)
return myDict
print invertDictionary({'a':1, 'b':2, 'c':3 , 'd' : 1})
This will produce the output: {1: ['a', 'd'], 2: ['b'], 3: ['c']}
A lambda solution for current python 3.x versions:
d1 = dict(alice='apples', bob='bananas')
d2 = dict(map(lambda key: (d1[key], key), d1.keys()))
print(d2)
Result:
{'apples': 'alice', 'bananas': 'bob'}
This solution does not check for duplicates.
Some remarks:
The lambda construct can access d1 from the outer scope, so we only pass in the current key. It returns a tuple.
The dict() constructor accepts a list of tuples. It also accepts the result of a map, so we can skip the conversion to a list.
This solution has no explicit for loop. It also avoids using a list comprehension for those who are bad at math ;-)
Taking up the highly voted answer starting with "If the values in my_map aren't unique:", I had a case where not only were the values not unique, but in addition each value was a list, with each item in the list consisting of three elements: a string value, a number, and another number.
Example:
mymap['key1'] gives you:
[('xyz', 1, 2),
('abc', 5, 4)]
I wanted to swap only the string value with the key, keeping the two number elements in place. You simply need another nested for loop then:
inv_map = {}
for k, v in my_map.items():
for x in v:
        # x[1:3] is (x[1], x[2]); prepend the key so each entry is one tuple
        inv_map[x[0]] = inv_map.get(x[0], []) + [(k,) + x[1:3]]
Example:
inv_map['abc'] now gives you:
[('key1', 1, 2),
('key1', 5, 4)]
This works even if you have non-unique values in the original dictionary.
def dict_invert(d):
'''
d: dict
Returns an inverted dictionary
'''
inv_d = {}
for k, v in d.items():
if v not in inv_d.keys():
inv_d[v] = [k]
else:
inv_d[v].append(k)
inv_d[v].sort()
print(f"{inv_d[v]} are the values")
return inv_d
I would do it this way in Python 2:
inv_map = {my_map[x] : x for x in my_map}
Not something completely different, just a slightly rewritten recipe from the Python Cookbook. It's furthermore optimized by binding the setdefault method once, instead of looking it up through the instance each time:
def inverse(mapping):
'''
    A function to invert a mapping, collecting keys with similar values
    in a list. Careful to retain the original type and to be fast.
>> d = dict(a=1, b=2, c=1, d=3, e=2, f=1, g=5, h=2)
>> inverse(d)
{1: ['f', 'c', 'a'], 2: ['h', 'b', 'e'], 3: ['d'], 5: ['g']}
'''
res = {}
setdef = res.setdefault
for key, value in mapping.items():
setdef(value, []).append(key)
return res if mapping.__class__==dict else mapping.__class__(res)
Designed to be run under CPython 3.x, for 2.x replace mapping.items() with mapping.iteritems()
On my machine it runs a bit faster than the other examples here.
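A short usage sketch of the type-preservation behaviour (the OrderedDict example is mine, assuming the inverse() function above):
from collections import OrderedDict
d = OrderedDict(a=1, b=2, c=1)
inv = inverse(d)           # inverse() as defined above
print(type(inv).__name__)  # OrderedDict
print(dict(inv))           # {1: ['a', 'c'], 2: ['b']}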