By default, assigning to a key in a Python dictionary creates that key if it does not already exist. For example:
d = {}
d['did not exist before'] = 'now it does'
This is all well and good for most purposes, but what if I'd like Python to do nothing when the key isn't already in the dictionary? In my situation:
for x in exceptions:
    if masterlist.has_key(x):
        masterlist[x] = False
In other words, I don't want incorrect elements in exceptions to corrupt my masterlist. Is this as simple as it gets? It FEELS like I should be able to do this in one line inside the for loop (i.e., without explicitly checking that x is a key of masterlist).
UPDATE:
To me, my question is asking about the lack of a parallel between a list and a dict. For example:
l = []
l[0] = 2 #fails
l.append(2) #works
With the subclassing answer, you could modify the dictionary (maybe call it "safe_dict" or "explicit_dict") to do something similar:
d = {}
d['a'] = '1' #would fail in my world
d.insert('a','1') #what my world is missing
You could use .update:
masterlist.update((x, False) for x in exceptions if masterlist.has_key(x))
You can subclass dict and override its __setitem__ to check for the existence of the key (or do the same with monkey-patching only one instance).
Sample class:
class a(dict):
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        dict.__setitem__(self, 'a', 'b')
    def __setitem__(self, key, value):
        if self.has_key(key):
            dict.__setitem__(self, key, value)
a = a()
print a['a'] # prints 'b'
a['c'] = 'd'
# print a['c'] - would fail
a['a'] = 'e'
print a['a'] # prints 'e'
You could also use some function to make setting values without checking for existence simpler.
However, I thought it would be shorter... Don't use it unless you need it in many places.
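As a rough sketch of the "some function" idea mentioned above: the helper name set_if_present is made up for illustration, and the sample data is hypothetical. It is written so that it also runs on Python 3 (it uses in rather than has_key).

```python
def set_if_present(d, key, value):
    """Assign d[key] = value only if key already exists.

    Returns True if the assignment happened, False otherwise.
    (Hypothetical helper, not part of the dict API.)
    """
    if key in d:
        d[key] = value
        return True
    return False

# Hypothetical sample data: 'zzz' is a bad entry that must not be added.
masterlist = {'a': True, 'b': True}
exceptions = ['a', 'zzz']
for x in exceptions:
    set_if_present(masterlist, x, False)
print(masterlist)
```

The loop body is then a single line, at the cost of one extra function definition.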
You can also use in instead of has_key, which is a little nicer.
for x in exceptions:
    if x in masterlist:
        masterlist[x] = False
But I don't see the issue with having an if statement for this purpose.
For long lists, try using the & operator on two set() objects, wrapped in parentheses:
for x in (set(exceptions) & set(masterlist)):
    masterlist[x] = False
    #or masterlist[x] = exceptions[x]
This improves both readability and iteration, since the masterlist's keys are read only once.
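If you happen to be on Python 3 (an assumption, since the code in this thread is Python 2), the keys view of a dict supports & directly, so the explicit set() calls can be dropped. A small sketch with made-up data:

```python
# Hypothetical sample data: 'typo' is not a key of masterlist.
masterlist = {'a': True, 'b': True, 'c': True}
exceptions = ['a', 'c', 'typo']

# dict.keys() returns a set-like view in Python 3; & accepts any iterable
# on the right-hand side and yields a plain set of the common keys.
for x in masterlist.keys() & exceptions:
    masterlist[x] = False

print(masterlist)
```

Only values are reassigned here, and the loop runs over the intersection set rather than over the dict itself, so the dict's size never changes during iteration.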
I am trying to provide a function as the default argument for the dictionary's get function, like this
def run():
    print "RUNNING"
test = {'store':1}
test.get('store', run())
However, when this is run, it displays the following output:
RUNNING
1
So my question is, as the title says: is there a way to provide a callable as the default value for the get method without it being called if the key exists?
Another option, assuming you don't intend to store falsy values in your dictionary:
test.get('store') or run()
In python, the or operator does not evaluate arguments that are not needed (it short-circuits)
If you do need to support falsy values, then you can use get_or_run(test, 'store', run) where:
def get_or_run(d, k, f):
    sentinel = object() # guaranteed not to be in d
    v = d.get(k, sentinel)
    return f() if v is sentinel else v
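To make the difference between the two approaches concrete, here is a small sketch, written for Python 3, that stores a falsy value (0) under the key. The calls list is only there to record when run is invoked, and run returning 'default' is an assumption for illustration.

```python
calls = []

def run():
    # Record each invocation so we can see when the fallback fires.
    calls.append('run')
    return 'default'

def get_or_run(d, k, f):
    sentinel = object()  # guaranteed not to be in d
    v = d.get(k, sentinel)
    return f() if v is sentinel else v

test = {'store': 0}

# 0 is falsy, so `or` falls through and calls run() although the key exists.
via_or = test.get('store') or run()

# The sentinel version returns the stored 0 and does not call run().
via_sentinel = get_or_run(test, 'store', run)

print(via_or, via_sentinel, calls)
```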
See the discussion in the answers and comments of "dict.get() method returns a pointer". You have to break it into two steps.
Your options are:
1. Use a defaultdict with the callable if you always want that value as the default, and want to store it in the dict.
2. Use a conditional expression:
item = test['store'] if 'store' in test else run()
3. Use try / except:
try:
    item = test['store']
except KeyError:
    item = run()
4. Use get:
item = test.get('store')
if item is None:
    item = run()
And variations on those themes.
glglgl shows a way to subclass defaultdict; you can also just subclass dict for some situations:
def run():
    print "RUNNING"
    return 1
class dict_nokeyerror(dict):
    def __missing__(self, key):
        return run()
test = dict_nokeyerror()
print test['a']
# RUNNING
# 1
Subclassing only really makes sense if you always want the dict to have some nonstandard behavior; if you generally want it to behave like a normal dict and just want a lazy get in one place, use one of my methods 2-4.
I suppose you want to have the callable applied only if the key does not exist.
There are several approaches to do so.
One would be to use a defaultdict, which calls run() if key is missing.
from collections import defaultdict
def run():
    print "RUNNING"
test = {'store':1}
test.get('store', run())
test = defaultdict(run, store=1) # provides a value for store
test['store'] # gets 1
test['runthatstuff'] # gets None
Another, rather ugly, approach would be to store only callables in the dict, each returning the appropriate value.
test = {'store': lambda:1}
test.get('store', run)() # -> 1
test.get('runrun', run)() # -> None, prints "RUNNING".
If you want to have the return value depend on the missing key, you have to subclass defaultdict:
class mydefaultdict(defaultdict):
    def __missing__(self, key):
        val = self[key] = self.default_factory(key)
        return val
d = mydefaultdict(lambda k: k*k)
d[10] # yields 100
@mydefaultdict # decorators are fine
def d2(key):
    return -key
d2[5] # yields -5
And if you don't want to add this value to the dict for the next call, use
def __missing__(self, key): return self.default_factory(key)
instead, which calls the default factory every time a key: value pair was not explicitly added.
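For what it's worth, here is a sketch of that non-storing variant, written for Python 3. The class name KeyedDefault and the square function are made up for illustration. It shows that the factory runs on every miss and that the dict stays empty.

```python
from collections import defaultdict

class KeyedDefault(defaultdict):
    """Hypothetical name: derives a default from the key without storing it."""
    def __missing__(self, key):
        # No assignment to self[key], so nothing is cached.
        return self.default_factory(key)

calls = []

def square(k):
    calls.append(k)  # record each time the factory runs
    return k * k

d = KeyedDefault(square)
first = d[3]
second = d[3]
print(first, second, calls, len(d))
```

Whether you want this or the caching version depends on whether the computed default is cheap and whether later lookups should see it as a real entry.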
If you only know what the callable is likely to be at the get call site, you could subclass dict, something like this:
class MyDict(dict):
    def get_callable(self,key,func,*args,**kwargs):
        '''Like ordinary get but uses a callable to
        generate the default value'''
        if key not in self:
            val = func(*args,**kwargs)
        else:
            val = self[key]
        return val
This can then be used like so:
>>> d = MyDict()
>>> d.get_callable(1,complex,2,3)
(2+3j)
>>> d[1] = 2
>>> d.get_callable(1,complex,2,3)
2
>>> def run(): print "run"
>>> repr(d.get_callable(1,run))
'2'
>>> repr(d.get_callable(2,run))
run
'None'
This is probably most useful when the callable is expensive to compute.
I have a util directory in my project with qt.py, general.py, geom.py, etc. In general.py I have a bunch of python tools like the one you need:
# Use whenever you need a lambda default
def dictGet(dict_, key, default):
    if key not in dict_:
        return default()
    return dict_[key]
Add *args, **kwargs if you want to support calling default more than once with differing args:
def dictGet(dict_, key, default, *args, **kwargs):
    if key not in dict_:
        return default(*args, **kwargs)
    return dict_[key]
Here's what I use:
def lazy_get(d, k, f):
    return d[k] if k in d else f(k)
The fallback function f takes the key as an argument, e.g.
lazy_get({'a': 13}, 'a', lambda k: k) # --> 13
lazy_get({'a': 13}, 'b', lambda k: k) # --> 'b'
You would obviously use a more meaningful fallback function, but this illustrates the flexibility of lazy_get.
Here's what the function looks like with type annotation:
from typing import Callable, Mapping, TypeVar
K = TypeVar('K')
V = TypeVar('V')
def lazy_get(d: Mapping[K, V], k: K, f: Callable[[K], V]) -> V:
    return d[k] if k in d else f(k)
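To show the laziness itself, here is a small sketch contrasting lazy_get with plain dict.get. The expensive function and the calls list are hypothetical stand-ins for a costly fallback.

```python
calls = []

def expensive(key):
    # Stand-in for a costly computation; records each invocation.
    calls.append(key)
    return key.upper()

def lazy_get(d, k, f):
    return d[k] if k in d else f(k)

data = {'a': 'A'}

# dict.get evaluates its default eagerly, so expensive('a') runs here
# even though 'a' is present.
eager = data.get('a', expensive('a'))

# lazy_get only calls the fallback on a miss.
lazy = lazy_get(data, 'a', expensive)
missing = lazy_get(data, 'b', expensive)

print(eager, lazy, missing, calls)
```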
I'm trying to write a function right now, and its purpose is to go through an object's __dict__ and add an item to a dictionary if the item is not a function.
Here is my code:
def dict_into_list(self):
    result = {}
    for each_key,each_item in self.__dict__.items():
        if inspect.isfunction(each_key):
            continue
        else:
            result[each_key] = each_item
    return result
If I'm not mistaken, inspect.isfunction is supposed to recognize lambdas as functions as well, correct? However, if I write
c = some_object(3)
c.whatever = lambda x : x*3
then my function still includes the lambda. Can somebody explain why this is?
For example, if I have a class like this:
class WhateverObject:
    def __init__(self,value):
        self._value = value
    def blahblah(self):
        print('hello')
a = WhateverObject(5)
So if I say print(a.__dict__), it should give back {'_value': 5}
You are actually checking whether each_key is a function, which it most likely is not (the keys of __dict__ are attribute names, i.e. strings). You have to check the value instead, like this
if inspect.isfunction(each_item):
You can confirm this, by including a print, like this
def dict_into_list(self):
    result = {}
    for each_key, each_item in self.__dict__.items():
        print(type(each_key), type(each_item))
        if inspect.isfunction(each_item) == False:
            result[each_key] = each_item
    return result
Also, you can write your code with a dictionary comprehension, like this
def dict_into_list(self):
    return {key: value for key, value in self.__dict__.items()
            if not inspect.isfunction(value)}
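Here is a self-contained sketch that puts the fix together with the class from the question. The method name non_function_attrs is made up for illustration. It shows the lambda attribute being filtered out once the value, rather than the key, is tested.

```python
import inspect

class WhateverObject:
    def __init__(self, value):
        self._value = value

    def non_function_attrs(self):
        # Test the attribute value, not its name.
        return {key: value for key, value in self.__dict__.items()
                if not inspect.isfunction(value)}

a = WhateverObject(5)
a.whatever = lambda x: x * 3   # a lambda is a function object
result = a.non_function_attrs()
print(result)
```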
I can think of an easy way to find the variables of an object using Python's dir and callable built-ins instead of the inspect module.
{var: getattr(self, var) for var in dir(self) if not callable(getattr(self, var))}
Please note that this assumes you have not overridden the __getattr__ method of the class to do something other than getting the attributes.
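One caveat worth sketching: dir also lists non-callable double-underscore attributes such as __dict__ and __module__, so the comprehension above returns more than the instance variables. A possible follow-up filter (the name-based check is my assumption about what you want to exclude) could look like this:

```python
class WhateverObject:
    def __init__(self, value):
        self._value = value

    def blahblah(self):
        print('hello')

a = WhateverObject(5)

# Everything dir() reports that is not callable, including dunder attributes.
unfiltered = {name: getattr(a, name) for name in dir(a)
              if not callable(getattr(a, name))}

# Drop names of the form __xxx__ to keep only ordinary attributes.
filtered = {name: value for name, value in unfiltered.items()
            if not (name.startswith('__') and name.endswith('__'))}

print(filtered)
```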
One minor annoyance with dict.setdefault is that it always evaluates its second argument (when given, of course), even when the first argument is already a key in the dictionary.
For example:
import random
def noisy_default():
    ret = random.randint(0, 10000000)
    print 'noisy_default: returning %d' % ret
    return ret
d = dict()
print d.setdefault(1, noisy_default())
print d.setdefault(1, noisy_default())
This produces output like the following:
noisy_default: returning 4063267
4063267
noisy_default: returning 628989
4063267
As the last line confirms, the second execution of noisy_default is unnecessary, since by this point the key 1 is already present in d (with value 4063267).
Is it possible to implement a subclass of dict whose setdefault method evaluates its second argument lazily?
EDIT:
Below is an implementation inspired by BrenBarn's comment and Pavel Anossov's answer. While at it, I went ahead and implemented a lazy version of get as well, since the underlying idea is essentially the same.
class LazyDict(dict):
    def get(self, key, thunk=None):
        return (self[key] if key in self else
                thunk() if callable(thunk) else
                thunk)
    def setdefault(self, key, thunk=None):
        return (self[key] if key in self else
                dict.setdefault(self, key,
                                thunk() if callable(thunk) else
                                thunk))
Now, the snippet
d = LazyDict()
print d.setdefault(1, noisy_default)
print d.setdefault(1, noisy_default)
produces output like this:
noisy_default: returning 5025427
5025427
5025427
Notice that the second argument to d.setdefault above is now a callable, not a function call.
When the second argument to LazyDict.get or LazyDict.setdefault is not a callable, they behave the same way as the corresponding dict methods.
If one wants to pass a callable as the default value itself (i.e., not meant to be called), or if the callable to be called requires arguments, prepend lambda: to the appropriate argument. E.g.:
d1.setdefault('div', lambda: div_callback)
d2.setdefault('foo', lambda: bar('frobozz'))
Those who don't like the idea of overriding get and setdefault, and/or the resulting need to test for callability, etc., can use this version instead:
class LazyButHonestDict(dict):
    def lazyget(self, key, thunk=lambda: None):
        return self[key] if key in self else thunk()
    def lazysetdefault(self, key, thunk=lambda: None):
        return (self[key] if key in self else
                self.setdefault(key, thunk()))
This can be accomplished with defaultdict, too. It is instantiated with a callable which is then called when a nonexisting element is accessed.
from collections import defaultdict
d = defaultdict(noisy_default)
d[1] # noise
d[1] # no noise
The caveat with defaultdict is that the callable gets no arguments, so you can not derive the default value from the key as you could with dict.setdefault. This can be mitigated by overriding __missing__ in a subclass:
from collections import defaultdict
class defaultdict2(defaultdict):
    def __missing__(self, key):
        value = self.default_factory(key)
        self[key] = value
        return value
def noisy_default_with_key(key):
    print key
    return key + 1
d = defaultdict2(noisy_default_with_key)
d[1] # prints 1, sets 2, returns 2
d[1] # does not print anything, does not set anything, returns 2
For more information, see the collections module.
You can do that in a one-liner using a ternary operator:
value = cache[key] if key in cache else cache.setdefault(key, func(key))
If you are sure that the cache will never store falsy values, you can simplify it a little bit:
value = cache.get(key) or cache.setdefault(key, func(key))
No, evaluation of arguments happens before the call. You can implement a setdefault-like function that takes a callable as its second argument and calls it only if it is needed.
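A minimal sketch of such a function, assuming a zero-argument callable as the default. The name lazy_setdefault is made up, and noisy_default here is a simplified stand-in that records its calls instead of printing.

```python
def lazy_setdefault(d, key, factory):
    """Like dict.setdefault, but calls factory() only when key is missing.

    (Hypothetical helper, not part of the dict API.)
    """
    try:
        return d[key]
    except KeyError:
        value = d[key] = factory()
        return value

calls = []

def noisy_default():
    # Stand-in for the noisy_default in the question.
    calls.append('called')
    return 42

d = {}
first = lazy_setdefault(d, 1, noisy_default)
second = lazy_setdefault(d, 1, noisy_default)
print(first, second, len(calls))
```

Using try/except keeps it to a single lookup when the key is present.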
There seems to be no one-liner that doesn't require an extra class or extra lookups. For the record, here is an easy (if not concise) way of achieving that without either of them.
try:
    value = dct[key]
except KeyError:
    value = noisy_default()
    dct[key] = value
return value
In reading the specifications for the with statement (link), I have some things I'd like to play around with. This isn't for any production code or anything, I'm just exploring, so please don't be too harsh if this is a bad idea.
What I'd like to do is grab the piece called "BLOCK" in the linked docs above, and actually tinker around with it inside of the call to __enter__. (See the linked doc, just after the start of the motivation and summary section.)
The idea is to create my own sort of on-the-fly local namespace. Something like this:
with MyNameSpace(some_object):
    print a #Should print some_object.a
    x = 4 #Should set some_object.x=4
Basically, I want the statements inside of the with block to be subordinate to the local variables and assignment conventions of some_object.
In my specific case, some_object might be a special data array that has my own column-wise operations or something. In which case saying something like x = y + 5 if y > 4 else y - 2 might be some fancy NumPy vectorized operation under the hood, but I don't need to explicitly call some_object's interface to those methods. In the namespace, the expressions should "just work" (however I define them to be inferred in the MyNameSpace class).
My first idea is to somehow interrupt the with process and get a hold of the code that goes in the try block. Then interpret that code when __enter__ gets called, and replace the code in the try block with something else (perhaps pass if that would work, but possibly something that restores some_object back to the original variable scope with its new changed variables preserved).
A simple test case would be something like this:
my_dict = {'a':3, 'b':2}
with MyNameSpace(my_dict):
    print a # Should print 3
    x = 5 # When the block finishes, my_dict['x'] should now be 5
I'm interested if this idea exists somewhere already.
I am aware of best practices things for assigning variables. This is a pet project, so please assume that, just for the sake of this idea, we can ignore best practices. Even if you wouldn't like assigning variables this way, it could be useful in my current project.
Edit
To clarify the kinds of tricky stuff I might want to do, and to address the answer below claiming that it can't be done, consider the example file testLocals.py below:
my_dict = {'a':1, 'b':2}
m = locals()
print m["my_dict"]['a']
m["my_dict"]['c'] = 3
print my_dict
class some_other_scope(object):
    def __init__(self, some_scope):
        x = 5
        g = locals()
        some_scope.update(g)
        some_scope["my_dict"]["d"] = 4
sos = some_other_scope(m)
print my_dict
print x
which gives the following when I run it non-interactively:
ely@AMDESK:~/Desktop/Programming/Python$ python testLocals.py
1
{'a': 1, 'c': 3, 'b': 2}
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
5
Try this.
import sys
class MyNameSpace(object):
    def __init__(self,ns):
        self.ns = ns
    def __enter__(self):
        globals().update(self.ns)
    def __exit__(self, exc_type,exc_value,traceback):
        self.ns.update(sys._getframe(1).f_locals)
my_dict = {'a':3, 'b':2}
with MyNameSpace(my_dict) as ns:
    print(a) # Should print 3
    x = 5 # When the block finishes, my_dict['x'] should now be 5
print(my_dict['x'])
Here is something similar I tried a few months ago:
import sys
import inspect
import collections
iscallable = lambda x: isinstance(x, collections.Callable)
class Namespace(object):
    def __enter__(self):
        """store the pre-contextmanager scope"""
        f = inspect.currentframe(1)
        self.scope_before = dict(f.f_locals)
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        """determine the locally declared objects"""
        f = inspect.currentframe(1)
        scope_after = dict(f.f_locals)
        scope_context = set(scope_after) - set(self.scope_before)
        # capture the local scope, ignoring the context manager itself
        self.locals = dict(
            (k, scope_after[k]) for k in scope_context if not isinstance(scope_after[k], self.__class__)
        )
        for name in self.locals:
            obj = scope_after[name]
            if iscallable(obj):
                # closure around the func_code with the appropriate locals
                _wrapper = type(lambda: 0)(obj.func_code, self.locals)
                self.__dict__[name] = _wrapper
                # update locals so the calling functions refer to the wrappers too
                self.locals[name] = _wrapper
            else:
                self.__dict__[name] = obj
            # remove from module scope
            del sys.modules[__name__].__dict__[name]
        return self
with Namespace() as Spam:
    x = 1
    def ham(a):
        return x + a
    def cheese(a):
        return ham(a) * 10
It uses inspect to modify locals while within the context manager and then to re-assign back to the original values when done.
It's not perfect - I can't remember where it hits issues, but I'm sure it does - but it might help you get started.
I have some python code that's throwing a KeyError exception. So far I haven't been able to reproduce outside of the operating environment, so I can't post a reduced test case here.
The code that's raising the exception is iterating through a loop like this:
for k in d.keys():
    if condition:
        del d[k]
The del d[k] line throws the exception. I've added a try/except clause around it and have been able to determine that k in d is False, but k in d.keys() is True.
The keys of d are bound methods of old-style class instances.
The class implements __cmp__ and __hash__, so that's where I've been focusing my attention.
k in d.keys() will test equality iteratively for each key, while k in d uses __hash__, so your __hash__ may be broken (i.e. it returns different hashes for objects that compare equal).
Simple example of what's broken, for interest:
>>> count = 0
>>> class BrokenHash(object):
...     def __hash__(self):
...         global count
...         count += 1
...         return count
...
...     def __eq__(self, other):
...         return True
...
>>> foo = BrokenHash()
>>> bar = BrokenHash()
>>> foo is bar
False
>>> foo == bar
True
>>> baz = {bar:1}
>>> foo in baz
False
>>> foo in baz.keys()
True
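For contrast, here is a sketch of a key type whose __hash__ is derived from the same data that __eq__ compares, so equal objects always hash alike. The class Key and its name field are hypothetical, and since the question uses __cmp__ on old-style classes, this is only an illustration of the principle.

```python
class Key(object):
    """Hypothetical key type: equality and hash use the same field."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Equal objects have equal names, hence equal hashes.
        return hash(self.name)

foo = Key('k')
bar = Key('k')
baz = {bar: 1}
print(foo is bar, foo == bar, foo in baz, foo in list(baz.keys()))
```

With this invariant in place, the hash-based lookup and the element-by-element comparison agree.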
Don't delete items from d while iterating over it; store the keys you want to delete in a list and delete them in another loop:
deleted = []
for k in d.keys():
    if condition:
        deleted.append(k)
for k in deleted:
    del d[k]
What you're doing would throw a concurrent modification exception in Java. d.keys() creates a list of the keys as they exist when you call it, but that list is now static: modifications to d will not change a stored version of d.keys(). So when you iterate over d.keys() but delete items, you end up with the possibility of modifying a key that is no longer there.
You can use d.pop(k, None), which will return either the value mapped to k or None, if k is not present. This avoids the KeyError problem.
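A short sketch of the pop approach, with made-up data and a made-up condition (odd values get removed). It iterates over list(d), a snapshot of the keys, so it also runs on Python 3, where d.keys() is a live view.

```python
d = {'a': 1, 'b': 2, 'c': 3}

# Iterate over a snapshot of the keys so deleting from d is safe.
for k in list(d):
    if d[k] % 2 == 1:      # hypothetical condition
        d.pop(k, None)     # no KeyError even if k were already gone

# Popping an absent key just returns the default.
removed = d.pop('missing', None)
print(d, removed)
```

Note that this hides the KeyError rather than explaining it; if the hash of a key is inconsistent, the entry may still be left behind in the dict.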
EDIT: For clarification, to prevent more phantom downmods (no problem with negative feedback, just make it constructive and leave a comment so we can have a potentially informative discussion - I'm here to learn as well as help):
It's true that in this particular condition it shouldn't get messed up. I was just bringing it up as a potential issue, because if he's using the same kind of coding scheme in another portion of the program where he isn't so careful/lucky about how he's treating the data structure, such problems could arise. He isn't even using a dictionary, as well, but rather a class that implements certain methods so you can treat it in a similar fashion.