Python getattr equivalent for dictionaries? - python

What's the most succinct way of saying, in Python, "Give me dict['foo'] if it exists, and if not, give me this other value bar"? If I were using an object rather than a dictionary, I'd use getattr:
getattr(obj, 'foo', bar)
but this raises a key error if I try using a dictionary instead (a distinction I find unfortunate coming from JavaScript/CoffeeScript). Likewise, in JavaScript/CoffeeScript I'd just write
dict['foo'] || bar
but, again, this yields a KeyError. What to do? Something succinct, please!

dict.get(key, default) returns dict[key] if key in dict, else returns default.
Note that the default for default is None so if you say dict.get(key) and key is not in dict then this will just return None rather than raising a KeyError as happens when you use the [] key access notation.
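A minimal sketch of the difference between the two lookups. The dictionary `settings` and its keys are made up for illustration:

```python
settings = {"foo": 1}  # hypothetical example data

found = settings.get("foo", "bar")       # key present: the stored value
missing = settings.get("nope", "bar")    # key absent: the supplied default
no_default = settings.get("nope")        # key absent, no default given: None

try:
    settings["nope"]                     # [] access raises instead
    raised = False
except KeyError:
    raised = True
```

Note that `.get()` never inserts the key; the dictionary is left unchanged by all three calls.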

Also take a look at collections module's defaultdict class. It's a dict for which you can specify what it must return when the key is not found. With it you can do things like:
class MyDefaultObj:
    def __init__(self):
        self.a = 1
from collections import defaultdict
d = defaultdict(MyDefaultObj)
i = d['NonExistentKey']
type(i)
<instance of class MyDefaultObj>
which allows you to use the familiar d[i] convention.
However, as mikej said, .get() also works, but here is the form closer to your JavaScript example:
d = {}
i = d.get('NonExistentKey') or MyDefaultObj()
# The reason this is slightly better than d.get('NonExistentKey', MyDefaultObj())
# is that the default value is instantiated only when the lookup comes back empty
# (note that `or` also replaces falsy stored values such as 0 or '').
# With d.get('NonExistentKey', MyDefaultObj()) you spin up a default every time you .get()
type(i)
<instance of class MyDefaultObj>
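One caveat with the `or` form is worth spelling out: it treats every falsy stored value as if the key were missing. The sketch below contrasts it with an explicit membership test, which also builds the default lazily. The key names are made up for illustration:

```python
class MyDefaultObj:
    def __init__(self):
        self.a = 1

d = {"count": 0}  # hypothetical data with a falsy but valid value

# `or` treats any falsy stored value (0, "", [], None) as missing.
via_or = d.get("count") or MyDefaultObj()

# An explicit membership test keeps falsy values and still constructs
# the default only when the key is really absent.
via_in = d["count"] if "count" in d else MyDefaultObj()
absent = d["other"] if "other" in d else MyDefaultObj()
```

Here `via_or` ends up as a `MyDefaultObj` even though `"count"` is present, while `via_in` keeps the stored `0`.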

Related

How to avoid repetition in a ternary operator assignment?

When fetching a number of config values from os.environ, it's nice to have defaults in the python code to easily allow the application to start in a number of contexts.
A typical django settings.py has a number of
SOME_SETTING = os.environ.get('SOME_SETTING')
lines.
To provide sensible defaults we opted for
SOME_SETTING = os.environ.get('SOME_SETTING') or "theValue"
However, this is error prone because calling the application with
SOME_SETTING="" manage.py
will set SOME_SETTING to theValue instead of the explicitly defined "".
Is there a way to assign values in python using the ternary a = b if b else d without repeating b or assigning it to a shorthand variable before?
The repetition becomes obvious if we look at
SOME_VERY_LONG_VAR_NAME = os.environ.get('SOME_VERY_LONG_VAR_NAME') if os.environ.get('SOME_VERY_LONG_VAR_NAME') else 'meh'
It would be much nicer to be able to do something like
SOME_VERY_LONG_VAR_NAME = if os.environ.get('SOME_VERY_LONG_VAR_NAME') else 'meh'
Just like Python's built-in mapping class dict, os.environ.get has a second argument, and it seems like you want it:
SOME_SETTING = os.environ.get('SOME_SETTING', "theValue")
This is the same as
try:
    SOME_SETTING = os.environ['SOME_SETTING']
except KeyError:
    SOME_SETTING = "theValue"
If you read dict.get()'s doc, you'll find that the method's signature is get(self, key, default=None). The default argument is what gets returned if the key is not found in the dict (and it defaults to a sensible None). So you can use this second argument instead of doing an erroneous boolean test:
SOME_SETTING = os.environ.get('SOME_SETTING', "theValue")
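To make the difference concrete, here is a small sketch that uses a plain dict as a stand-in for os.environ (an assumption made so the example does not touch the real environment; os.environ offers the same .get() method):

```python
# A plain dict standing in for os.environ.
fake_environ = {"SOME_SETTING": ""}   # the variable is set, but to an empty string

with_or = fake_environ.get("SOME_SETTING") or "theValue"      # empty string is falsy
with_default = fake_environ.get("SOME_SETTING", "theValue")   # key exists, value kept
unset = fake_environ.get("OTHER_SETTING", "theValue")         # key absent, default used
```

Only the second form preserves the explicitly defined empty string; the default is used solely when the variable is not set at all.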

Static methods for recursive functions within a class?

I'm working with nested dictionaries on Python (2.7) obtained from YAML objects and I have a couple of questions that I've been trying to get an answer to by reading, but have not been successful. I'm somewhat new to Python.
One of the simplest functions is one that reads the whole dictionary and outputs a list of all the keys that exist in it. I use an underscore at the beginning since this function is later used by others within a class.
class Myclass(object):
    @staticmethod
    def _get_key_list(d,keylist):
        for key,value in d.iteritems():
            keylist.append(key)
            if isinstance(value,dict):
                Myclass._get_key_list(d.get(key),keylist)
        return list(set(keylist))
    def diff(self,dict2):
        keylist = []
        all_keys1 = self._get_key_list(self.d,keylist)
        all_keys2 = self._get_key_list(dict2,keylist)
        ... # More code
Question 1: Is this a correct way to do this? I am not sure whether it's good practice to use a static method for this reason. Since self._get_key_list(d,keylist) is recursive, I don't want "self" to be the first argument once the function is recursively called, which is what would happen for a regular instance method.
I have a bunch of static methods that I'm using, but I've read in a lot of places that they may not be good practice when used a lot. I also thought I could make them module functions, but I wanted them to be tied to the class.
Question 2: Instead of passing the argument keylist to self._get_key_list(d,keylist), how can I initialize an empty list inside the recursive function and update it? Initializing it inside would reset it to [] every time.
I would eliminate keylist as an explicit argument:
def _get_keys(d):
    keyset = set()
    for key, value in d.iteritems():
        keyset.add(key)
        if isinstance(value, dict):
            keyset.update(_get_keys(value))
    return keyset
Let the caller convert the set to a list if they really need a list, rather than an iterable.
Often, there is little reason to declare something as a static method rather than a function outside the class.
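As a rough illustration, here is the set-returning approach as a plain module-level function, written for Python 3 (so .items() replaces the .iteritems() used in the question's Python 2.7 code). The name get_keys and the sample data are made up for this sketch:

```python
def get_keys(d):
    """Return the set of all keys in d, including keys of nested dicts."""
    keyset = set()
    for key, value in d.items():  # Python 3 spelling of .iteritems()
        keyset.add(key)
        if isinstance(value, dict):
            keyset.update(get_keys(value))
    return keyset

config = {"a": 1, "b": {"c": 2, "d": {"a": 3}}}  # hypothetical nested data
all_keys = sorted(get_keys(config))  # the caller decides to turn the set into a list
```

Because each call builds and returns its own set, no state is shared between separate top-level calls.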
If you are concerned about efficiency (e.g., getting lots of repeat keys from a dict), you can go back to threading a single set/list through the calls as an explicit argument, but don't make it optional; just require that the initial caller supply the set/list to update. To emphasize that the second argument will be mutated, just return None when the function returns.
def _get_keys(d, result):
    for key, value in d.iteritems():
        result.add(key)
        if isinstance(value, dict):
            _get_keys(value, result)
result = set()
_get_keys(d1, result)
_get_keys(d2, result)
# etc
There's no good reason to make a recursive function in a class a static method unless it is meant to be invoked outside the context of an instance.
To initialize a parameter, we usually assign it a default value in the parameter list, but when it needs to be a mutable object, such as the empty list in this case, you need to default it to None and then initialize it inside the function, so that the same list object won't get reused in the next call:
class Myclass(object):
    def _get_key_list(self, d, keylist=None):
        if keylist is None:
            keylist = []
        for key, value in d.iteritems():
            keylist.append(key)
            if isinstance(value, dict):
                self._get_key_list(d.get(key), keylist)
        return list(set(keylist))
    def diff(self, dict2):
        all_keys1 = self._get_key_list(self.d)
        all_keys2 = self._get_key_list(dict2)
        ... # More code
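The reason for the None default can be seen in a short sketch. Both function names are hypothetical and exist only to show the contrast:

```python
def collect_shared(item, bucket=[]):      # the list is created once, at definition time
    bucket.append(item)
    return bucket

def collect_fresh(item, bucket=None):     # None sentinel: a new list for each call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

collect_shared("a")
shared_second = collect_shared("b")       # still holds "a" from the earlier call

collect_fresh("a")
fresh_second = collect_fresh("b")         # starts from an empty list
```

With the mutable default, results from unrelated calls leak into each other, which is exactly what would happen to keylist across the two calls in diff.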

Initializing a dictionary of dictionaries for integer values with defaultdict [duplicate]

The error comes from publishDB = defaultdict(defaultdict({})). I want to build a database shaped like {subject1: {student_id: {assignment1: marks, assignment2: marks, finals: marks}, student_id: {assignment1: marks, assignment2: marks, finals: marks}}, subject2: {student_id: {assignment1: marks, assignment2: marks, finals: marks}, student_id: {assignment1: marks, assignment2: marks, finals: marks}}}. I was trying to populate it as DB[math][10001] = a dict and later read it out as d = DB[math][10001]. Since I am on my office computer, I cannot try a different module.
Am I on the right track?
Such a nested dict structure can be achieved using a recursive defaultdict "tree":
def tree():
    return defaultdict(tree)
publishDB = tree()
At each level, the defaultdicts are instantiated with tree which is a zero-argument callable, as required.
Then you can simply assign marks:
publishDB[subject][student][assignment] = mark
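A short, self-contained sketch of the recursive tree in use. The subject names, student ids and marks are made-up sample data:

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

publishDB = tree()
publishDB["math"][10001]["assignment1"] = 85
publishDB["math"][10001]["finals"] = 90
publishDB["physics"][10002]["assignment1"] = 70

marks = dict(publishDB["math"][10001])   # plain-dict copy of one student's marks

# Caveat: merely reading a missing key creates it as an empty subtree.
_ = publishDB["chemistry"]
created = "chemistry" in publishDB
```

The last two lines show the main trade-off of this approach: lookups never fail, so a mistyped key silently adds an empty branch instead of raising a KeyError.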
defaultdict() requires that its first argument be callable: it must be a class that you want an instance of, or a function that returns an instance.
defaultdict({}) passes an empty dictionary instance, which is not callable.
You likely want defaultdict(dict), as dict is a class that returns a dictionary when instantiated (called).
But that still doesn't solve the problem... just moves it to a different level. The outer defaultdict(...) in defaultdict(defaultdict(dict)) has the exact same issue because defaultdict(dict) isn't callable.
You can use a lambda expression to solve this, creating a one-line function that, when called, creates a defaultdict(dict):
defaultdict(lambda: defaultdict(dict))
You could also use the lambda at the lower level if you wanted:
defaultdict(lambda: defaultdict(lambda: {}))
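As an illustration, the lambda form can be populated and read back the way the question describes. The ids and marks below are sample data, and the final lines simply confirm that passing a dict instance is rejected:

```python
from collections import defaultdict

# Two auto-created levels (subject, student); the innermost value is a plain dict.
publishDB = defaultdict(lambda: defaultdict(dict))
publishDB["math"][10001] = {"assignment1": 85, "finals": 90}   # assign a whole dict
publishDB["math"][10002]["assignment1"] = 70                    # or fill it key by key

d = publishDB["math"][10001]

try:
    defaultdict({})          # a dict instance is not callable
    error = None
except TypeError as exc:
    error = exc
```

Unlike a fully recursive structure, this one stops auto-creating after the student level, so a missing assignment name still raises a KeyError on read.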

python getattr() with multiple params

The construction getattr(obj, 'attr1.attr2', None) does not work.
What are the best practices to replace this construction?
Divide that into two getattr statements?
You can use operator.attrgetter() in order to get multiple attributes at once:
from operator import attrgetter
my_attrs = attrgetter('attr1', 'attr2')(obj)
As stated in this answer, the most straightforward solution would be to use operator.attrgetter (more info in this python docs page).
If for some reason, this solution doesn't make you happy, you could use this code snippet:
def multi_getattr(obj, attr, default = None):
    """
    Get a named attribute from an object; multi_getattr(x, 'a.b.c.d') is
    equivalent to x.a.b.c.d. When a default argument is given, it is
    returned when any attribute in the chain doesn't exist; without
    it, an exception is raised when a missing attribute is encountered.
    """
    attributes = attr.split(".")
    for i in attributes:
        try:
            obj = getattr(obj, i)
        except AttributeError:
            # Compare against None so that falsy defaults such as 0 or '' are honored
            if default is not None:
                return default
            else:
                raise
    return obj
# Example usage
obj = [1,2,3]
attr = "append.__doc__.capitalize.__doc__"
multi_getattr(obj, attr) #Will return the docstring for the
                         #capitalize method of the builtin string
                         #object
from this page, which does work. I tested and used it.
I would suggest using something like this:
from operator import attrgetter
attrgetter('attr0.attr1.attr2.attr3')(obj)
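A brief sketch of how attrgetter handles dotted paths and multiple names. The object layout and the helper nested_getattr are hypothetical; the helper is one possible way to approximate the default argument that getattr offers:

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical nested object: obj.attr1.attr2 == 42
obj = SimpleNamespace(attr1=SimpleNamespace(attr2=42), name="demo")

nested = attrgetter("attr1.attr2")(obj)            # follows the dotted path
several = attrgetter("name", "attr1.attr2")(obj)   # several names give a tuple

def nested_getattr(target, path, default=None):
    """Like attrgetter(path)(target), but return default if the chain breaks."""
    try:
        return attrgetter(path)(target)
    except AttributeError:
        return default

fallback = nested_getattr(obj, "attr1.missing.deeper", default="n/a")
```

attrgetter itself has no default parameter, so a missing attribute anywhere in the chain raises AttributeError unless it is caught as above.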
If you have the attribute names you want to get in a list, you can do the following:
my_attrs = [getattr(obj, attr) for attr in attr_list]
A simple, but not very elegant, way to get multiple attributes would be to use a tuple, with or without parentheses, something like
aval, bval = getattr(myObj,"a"), getattr(myObj,"b")
but I think that, given the way you are using dot notation, you may instead want to get an attribute of a contained object. In that case it would be something like
getattr(myObj.contained, "c")
where contained is an object contained within the myObj object and c is an attribute of contained. Let me know if this is not what you want.

Python - Dictionary - Modify __getitem__?

OK, so I've built my own variable handler which has a __getitem__ function for use when accessing data via data[key]. It works great except when trying to access a chain of items:
data["key"]["subkey"]
def __getitem__(self, key, **args):
    print key
    ...
    return self.dict[key]
When trying to access a subkey that doesn't exist, Python simply raises a KeyError without printing "subkey". Why is this, and how can I get Python to print out what I'm actually trying to get?
I know that I've probably misunderstood the mechanics but is there a way to emulate a dictionary AND follow the string of data that's being requested?
Mainly so I can dynamically log the missing variables in a dictionary flow...
This obviously works (but it's not the native syntax that I like):
data["key:subkey"]
def __getitem__(self, key, **args):
    for slice in key.split(':'):
        print key
        ...
The goal is to emulate the following,
Works:
data = {'key' : {'subkey' : 1}}
print data["key"]["subkey"]
Will not work, but I want to catch the exception within __getitem__ and then create the missing key automatically or just log the missing subkey:
data = {'key' : {}}
print data["key"]["subkey"]
Solution:
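class Var(object):
    def __init__(self, d=None):
        # Wrap the given dict, or start from the sample data
        self.dict = {'test' : {}} if d is None else d
    def __getitem__(self, var, **args):
        print ':',var
        if var in self.dict:
            v = Var(self.dict[var])
            return v
vHandle = Var()
print vHandle['test']['down']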
Output:
: test
: down
None
The fact is that when Python encounters an expression such as data["key"]["subkey"], what is done internally is (data["key"])["subkey"]. That is, the first part of the expression is resolved first: the retrieval of the item "key" from the object "data". Then, Python tries to call __getitem__ on the resulting object of that expression.
If that resulting object does not have a __getitem__ method itself, there is your error.
There are two possible workarounds: you should either work with "tuple indexes", like data["key", "subkey"] (and then test in your __getitem__ method whether you got a tuple instance as the key), or make __getitem__ return a specialized object that also features a __getitem__ method, even if all it does is log the requested keys.
Remember: tmp = foo['bar']['baz'] is the same as tmp = foo['bar']; tmp = tmp['baz']
So to allow arbitrary depths your __getitem__ method must return a new object that also contains such a __getitem__ method.
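One possible shape for such a wrapper, sketched in Python 3. The class name LoggingDict and its attributes are made up for this example; it records missing key paths instead of raising:

```python
class LoggingDict:
    """Wraps a nested dict and records every key path that could not be found."""

    def __init__(self, data, path=(), missing=None):
        self._data = data
        self._path = path
        # The same list is shared by all wrappers derived from one root.
        self.missing = [] if missing is None else missing

    def __getitem__(self, key):
        path = self._path + (key,)
        if isinstance(self._data, dict) and key in self._data:
            value = self._data[key]
            if isinstance(value, dict):
                # Return a wrapper so the next [...] is intercepted as well.
                return LoggingDict(value, path, self.missing)
            return value
        self.missing.append(path)
        # An empty wrapper keeps deeper lookups from failing.
        return LoggingDict({}, path, self.missing)

data = LoggingDict({"key": {"subkey": 1}})
hit = data["key"]["subkey"]        # found: the plain value
data["key"]["nope"]["deeper"]      # not found: both steps are recorded
```

After these lookups, data.missing holds the full path of each failed step, which is the "follow the string of data being requested" behavior the question asks about. Returning an empty wrapper rather than raising is a design choice of this sketch; raising a KeyError after logging would work equally well.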
