The Zen of Python tells us:
There should be one-- and preferably only one --obvious way to do it.
This is difficult to put into practice in the following situation.
A class receives a list of documents.
The output is one dictionary per document, holding a variety of key/value pairs.
Every pair depends on a previously calculated one, or even on key/value pairs in other dictionaries of the list.
This is a very simplified example of such a class.
What is the "obvious" way to go? Every method adds a key/value pair to each of the dictionaries.
class T():
    def __init__(self, mylist):
        # build list of dicts
        self.J = [{str(i): mylist[i]} for i in range(len(mylist))]
        # enhancement 1: upper
        self.method1()
        # enhancement 2: lower
        self.J = self.static2(self.J)

    def method1(self):
        newdict = []
        for i, mydict in enumerate(self.J):
            mydict['up'] = mydict[str(i)].upper()
            newdict.append(mydict)
        self.J = newdict

    @staticmethod
    def static2(alist):
        J = []
        for i, mydict in enumerate(alist):
            mydict['down'] = mydict[str(i)].lower()
            J.append(mydict)
        return J

    @property
    def propmethod(self):
        J = []
        for i, mydict in enumerate(self.J):
            mydict['prop'] = mydict[str(i)].title()
            J.append(mydict)
        return J

    # more methods extracting info out of every doc in the list
    # ...
self.method1() is simply run and a new key/value pair is added to every dict.
The static method static2 can also be used,
and so can the property.
Of the three ways, I discard @property because I am not adding another attribute.
Of the other two, which one would you choose?
Remember that the class will be composed of tens of these methods, which do not add attributes. They only update (add key/value pairs to) the dictionaries in a list.
I cannot see the difference between method1 and static2.
Thanks.
I was wondering whether using @property in Python to update an attribute overwrites it or simply updates it, as the speed is very different in the two cases.
And in case it gets overwritten, what alternative can I use? Example:
class sudoku:
    def __init__(self,puzzle):
        self.grid={(i,j):puzzle[i][j] for i in range(9) for j in range(9)}
        self.elements
        self.forbidden=set()

    @property
    def elements(self):
        self.rows=[[self.grid[(i,j)] for j in range(9)] for i in range(9)]
        self.columns=[[self.grid[(i,j)] for i in range(9)] for j in range(9)]
        self.squares={(i,j): [self.grid[(3*i+k,3*j+l)] for k in range(3) for l in range(3)] for i in range(3) for j in range(3) }
        self.stack=[self.grid]
        self.empty={k for k in self.grid.keys() if self.grid[k]==0}
Basically, I work with the grid attribute, and whenever I need to update the other attributes I call elements. I prefer to call it manually, though. The question, however, is this: if I change self.grid[(i,j)], does Python calculate each attribute from scratch because self.grid was changed, or does it only change the i-th row, j-th column, etc.?
Thank you
edit: added example code
As is, your question is totally unclear - but anyway, since you don't seem to understand what a property is and how it works...
class Obj(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def x(self):
        return self._x / 2

    @x.setter
    def x(self, value):
        self._x = value * 2
Here we have a class with a get/set ("binding") property x, backed by a protected attribute _x.
The "@property" syntax here is mainly syntactic sugar; you could actually write this code as
class Obj(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def get_x(self):
        return self._x / 2

    def set_x(self, value):
        self._x = value * 2

    x = property(fget=get_x, fset=set_x)
The only difference with the previous version being that the get_x and set_x functions remain available as methods. Then if we have an obj instance:
obj = Obj(2, 4)
Then
x = obj.x
is just a shortcut for
x = obj.get_x()
and
obj.x = 42
is just a shortcut for
obj.set_x(42)
How this "shortcut" works is fully documented here, with a whole chapter dedicated to the property type.
As you can see there's nothing magical here, and once you get (no pun intended) the descriptor protocol and how the property class uses it, you can answer the question by yourself.
Note that properties will ALWAYS add some overhead (vs plain attributes or direct method calls) since you have more levels of indirection and more method calls involved, so it's best to use them only when it really makes sense.
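To make the "shortcut" concrete, here is a minimal sketch of what the attribute lookup amounts to under the descriptor protocol. It assumes a reduced Obj that only has the x property (the y argument is dropped for brevity); the variable names are illustrative only.

```python
class Obj(object):
    def __init__(self, x):
        self.x = x  # goes through the setter below

    @property
    def x(self):
        return self._x / 2

    @x.setter
    def x(self, value):
        self._x = value * 2


obj = Obj(2)

# The property object lives on the class, not on the instance.
prop = Obj.__dict__['x']

via_attribute = obj.x                    # normal attribute access
via_descriptor = prop.__get__(obj, Obj)  # what that access resolves to

# Assignment goes through the descriptor's __set__, i.e. the setter.
prop.__set__(obj, 42)                    # same effect as obj.x = 42
```

Both reads return the same value because obj.x is resolved by calling the property's __get__, and the final call stores 84 in obj._x exactly as obj.x = 42 would.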
EDIT: now you posted your code, I confirm that you don't understand Python's "properties" - not only the technical side of it but even the basic concept of a "computed attribute".
The point of computed attributes in general (the builtin property type being just one generic implementation of them) is to have the interface of a plain attribute (something you can get the value of with value = obj.attrname and possibly set the value of with obj.attrname = somevalue) while actually invoking a getter (and possibly a setter) under the hood.
Your elements "property", while technically implemented as a read-only property, is really a method that initializes half a dozen attributes of your class, doesn't return anything (well, it implicitly returns None), and whose return value is never actually used (of course). This is definitely not what computed attributes are for. This should NOT be a property; it should be a plain method (with some explicit name such as "setup_elements" or whatever makes sense here).
# nb1 : class names should be CamelCased
# nb2 : in Python 2.x, you want to inherit from 'object'
class Sudoku(object):
    def __init__(self,puzzle):
        self.grid={(i,j):puzzle[i][j] for i in range(9) for j in range(9)}
        self.setup_elements()
        self.forbidden=set()

    def setup_elements(self):
        self.rows=[[self.grid[(i,j)] for j in range(9)] for i in range(9)]
        self.columns=[[self.grid[(i,j)] for i in range(9)] for j in range(9)]
        self.squares={(i,j): [self.grid[(3*i+k,3*j+l)] for k in range(3) for l in range(3)] for i in range(3) for j in range(3) }
        self.stack=[self.grid]
        self.empty={k for k, v in self.grid.items() if v==0}
Now to answer your question:
if I change self.grid[(i,j)], does python calculate each attribute from scratch because self.grid was changed
self.grid is a plain attribute, so just rebinding self.grid[(i, j)] doesn't make "python" calculate anything else, of course. None of your object's other attributes will be impacted. Actually Python (the interpreter) has no mind-reading ability and will only do exactly what you asked for, nothing less, nothing more, period.
or does it only change the i-th row, j-th column
This :
obj = Sudoku(some_puzzle)
obj.grid[(1, 1)] = "WTF did you expect ?"
will NOT (I repeat: "NOT") do anything else than assigning the literal string "WTF did you expect ?" to obj.grid[(1, 1)]. None of the other attributes will be updated in any way.
Now if your question was: "if I change something to self.grid and call self.setup_elements() after, will Python recompute all attributes or only update self.rows[xxx] and self.columns[yyy]", then the answer is plain simple: Python will do exactly what you asked for: it will execute self.setup_elements(), line after line, statement after statement. Plain and simple. No magic here, and the only thing you'll get from making it a property instead of a plain method is that you won't have to type the () after to invoke the method.
So if what you expected from making this elements() method a property was to have some impossible magic happening behind the scenes to detect that you actually only wanted to recompute the impacted elements, then bad news: this is not going to happen, and you will have to explicitly tell the interpreter how to do so. Computed attributes might be part of the solution here, but not by any magic. You will have to write all the code needed to intercept assignments to any of those attributes and recompute what needs to be recomputed.
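One way to "explicitly tell the interpreter" is to route every change through a single method that updates only the affected row, column and square. The following is a rough sketch under that assumption; set_cell is a hypothetical name, and the stack and forbidden attributes are left out to keep it short.

```python
class Sudoku(object):
    def __init__(self, puzzle):
        self.grid = {(i, j): puzzle[i][j] for i in range(9) for j in range(9)}
        self.setup_elements()

    def setup_elements(self):
        # Full rebuild of every derived attribute from self.grid.
        self.rows = [[self.grid[(i, j)] for j in range(9)] for i in range(9)]
        self.columns = [[self.grid[(i, j)] for i in range(9)] for j in range(9)]
        self.squares = {(i, j): [self.grid[(3 * i + k, 3 * j + l)]
                                 for k in range(3) for l in range(3)]
                        for i in range(3) for j in range(3)}
        self.empty = {k for k, v in self.grid.items() if v == 0}

    def set_cell(self, i, j, value):
        # Hypothetical helper: update one cell and only the structures it touches.
        self.grid[(i, j)] = value
        self.rows[i][j] = value
        self.columns[j][i] = value
        # Inside a square, cells are stored row by row: index = 3 * k + l.
        self.squares[(i // 3, j // 3)][3 * (i % 3) + (j % 3)] = value
        if value == 0:
            self.empty.add((i, j))
        else:
            self.empty.discard((i, j))


s = Sudoku([[0] * 9 for _ in range(9)])
s.set_cell(4, 7, 5)
```

The trade-off is that callers must never write to s.grid directly, otherwise the derived attributes go stale until setup_elements() is called again.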
Beware, since all those attributes are mutable containers, just wrapping each of them into properties won't be enough - consider this:
class Foo(object):
    def __init__(self):
        self._bar = {"a":1, "b": 2}

    @property
    def bar(self):
        print("getting self._bar")
        return self._bar

    @bar.setter
    def bar(self, value):
        print("setting self._bar to {}".format(value))
        self._bar = value
>>> f = Foo()
>>> f.bar
getting self._bar
{'a': 1, 'b': 2}
>>> f.bar['z'] = "WTF ?"
getting self._bar
>>> f.bar
getting self._bar
{'a': 1, 'b': 2, 'z': 'WTF ?'}
>>> bar = f.bar
getting self._bar
>>> bar
{'a': 1, 'b': 2, 'z': 'WTF ?'}
>>> bar["a"] = 99
>>> f.bar
getting self._bar
{'a': 99, 'b': 2, 'z': 'WTF ?'}
As you can see, we could mutate self._bar without the bar.setter function ever being invoked, because f.bar["x"] = "y" is actually NOT assigning to f.bar (which would need f.bar = "something else") but getting the f._bar dict through the Foo.bar getter, then invoking __setitem__() on this dict.
So if you want to intercept something like f.bar["x"] = "y", you will also have to write some dict-like object that intercepts all mutating access on the dict itself (__setitem__, but also __delitem__ etc.) and notifies f of those changes, and change your property so that it returns an instance of this dict-like object instead.
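A minimal sketch of such a dict-like object, assuming a plain callback is enough as the notification mechanism. NotifyingDict and the changed list are hypothetical names introduced for illustration only.

```python
class NotifyingDict(dict):
    """Hypothetical dict subclass that reports item changes to a callback."""

    def __init__(self, data, on_change):
        super(NotifyingDict, self).__init__(data)
        self._on_change = on_change

    def __setitem__(self, key, value):
        super(NotifyingDict, self).__setitem__(key, value)
        self._on_change(key)

    def __delitem__(self, key):
        super(NotifyingDict, self).__delitem__(key)
        self._on_change(key)


class Foo(object):
    def __init__(self):
        self.changed = []  # records which keys were touched
        self._bar = NotifyingDict({"a": 1, "b": 2}, self.changed.append)

    @property
    def bar(self):
        return self._bar


f = Foo()
f.bar["x"] = "y"   # intercepted by NotifyingDict.__setitem__
del f.bar["a"]     # intercepted by NotifyingDict.__delitem__
```

Note that this sketch is incomplete on purpose: methods such as update(), pop() and setdefault() on a dict subclass do not go through __setitem__ or __delitem__, so each of them would need its own override.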
Suppose I have a list of inputs that will generate O objects, of the following form:
inps = [['A', 5], ['B', 2]]
and O has subclasses A and B. A and B each are initiated with a single integer --
5 or 2 in the example above -- and have a method update(self, t), so I believe it makes sense to group them under an O superclass. I could complete the program with a loop:
Os = []
for inp in inps:
    if inp[0] == 'A':
        Os.append(A(inp[1]))
    elif inp[0] == 'B':
        Os.append(B(inp[1]))
and then at runtime,
for O in Os: O.update(t)
I'm wondering, however, if there is a more object oriented way to accomplish this. One way, I suppose, might be to make a fake "O constructor" outside of the O class:
def initO(inp):
    if inp[0] == 'A':
        return A(inp[1])
    elif inp[0] == 'B':
        return B(inp[1])
Os = [initO(inp) for inp in inps]
This is more elegant, in my opinion, and for all intents and purposes gives me the result I want; but it feels like a complete abuse of the class system in Python. Is there a better way to do this, perhaps by instantiating A and B from the O constructor?
EDIT: The ideal would be to be able to use
Os = [O(inp) for inp in inps]
while maintaining O as a superclass of A and B.
You could use a dict to map the names to the actual classes:
dct = {'A': A, 'B': B}
[dct[name](argument) for name, argument in inps]
Or if you don't want the list-comprehension:
dct = {'A': A, 'B': B}
Os = []
for inp in inps:
    cls = dct[inp[0]]
    Os.append(cls(inp[1]))
Although it is technically possible to perform a call by name in Python, I strongly advise against doing that. The cleanest way is probably to use a dictionary:
trans = { 'A' : A, 'B' : B }
def initO(inp):
    cons = trans.get(inp[0])
    if cons is not None:
        return cons(*inp[1:])
So here trans is a dictionary that maps names on classes (and thus corresponding constructors).
In the initO we perform a lookup, if the lookup succeeds, we call the constructor cons with the remaining arguments of inp.
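As written, a failed lookup silently yields None. If that is not what you want, a variant that fails loudly could look like the sketch below. The A and B classes here are stand-ins (their real definitions are not shown in the question), and raising ValueError is just one possible choice.

```python
class A(object):
    # Stand-in for the real class A from the question.
    def __init__(self, n):
        self.n = n


class B(object):
    # Stand-in for the real class B from the question.
    def __init__(self, n):
        self.n = n


trans = {'A': A, 'B': B}


def initO(inp):
    name, args = inp[0], inp[1:]
    try:
        cons = trans[name]
    except KeyError:
        # Unknown names are reported instead of silently returning None.
        raise ValueError("unknown class name: {!r}".format(name))
    return cons(*args)


inps = [['A', 5], ['B', 2]]
Os = [initO(inp) for inp in inps]
```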
In case you really want to create a (direct) subclass from within a parent class you could use the special __subclasses__ method:
class O(object):
    def __init__(self, integer):
        self.value = integer

    @classmethod
    def get_subclass(cls, subclassname, value):
        # probably not a really good name for that method - I'm out of creativity...
        subcls = next(sub for sub in cls.__subclasses__() if sub.__name__ == subclassname)
        return subcls(value)

    def __repr__(self):
        return '{self.__class__.__name__}({self.value})'.format(self=self)

class A(O):
    pass

class B(O):
    pass
This acts like a factory:
>>> O.get_subclass('A', 1)
A(1)
Or as list-comprehension:
>>> [O.get_subclass(*inp) for inp in inps]
In case you want to optimize it and you know that you won't add subclasses while the program is running, you could put the subclasses in a dictionary that maps from __name__ to the subclass:
class O(object):
    __subs = {}

    def __init__(self, integer):
        self.value = integer

    @classmethod
    def get_subclass(cls, subclassname, value):
        # build the name -> subclass mapping on first use only
        if not cls.__subs:
            cls.__subs = {sub.__name__: sub for sub in cls.__subclasses__()}
        return cls.__subs[subclassname](value)
You could probably also use __new__ to implement that behavior or a metaclass but I think a classmethod may be more appropriate here because it's easy to understand and allows for more flexibility.
In case you not only want direct subclasses you might want to check this recipe to find even subclasses of your subclasses (I also implemented it in a 3rd party extension package of mine: iteration_utilities.itersubclasses).
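For reference, walking the whole subclass tree can be sketched in a few lines. This is only an illustration of the idea, not the linked recipe or the iteration_utilities implementation; iter_all_subclasses is a hypothetical name.

```python
def iter_all_subclasses(cls, _seen=None):
    """Yield direct and indirect subclasses of cls, each one only once."""
    if _seen is None:
        _seen = set()
    for sub in cls.__subclasses__():
        if sub in _seen:
            continue  # can happen with diamond-shaped inheritance
        _seen.add(sub)
        yield sub
        for subsub in iter_all_subclasses(sub, _seen):
            yield subsub


class O(object):
    pass

class A(O):
    pass

class B(O):
    pass

class C(A):  # a subclass of a subclass
    pass


names = sorted(sub.__name__ for sub in iter_all_subclasses(O))
```

Here names also contains 'C', which a lookup restricted to O.__subclasses__() would miss.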
Without knowing more about your A and B, it's hard to say. But this looks like a classic case for a switch in a language like C. Python doesn't have a switch statement, so a dict or dict-like construct is used instead.
If you're sure your inputs are clean, you can directly get your classes using the globals() function:
Os = [globals()[f](x) for (f, x) in inps]
If you want to sanitize, you can do something like this:
allowed = {'A', 'B'}
Os = [globals()[f](x) for (f, x) in inps if f in allowed]
This solution can also be changed if you prefer to have a fixed dictionary and sanitized inputs:
allowed = {'A', 'B'}
classname_to_class = {k: v for (k, v) in list(globals().items()) if k in allowed}
# Now, you can have a dict mapping class names to classes without writing 'A': A, 'B': B ...
Alternately, if you can prefix all your class definitions, you could even do something like this:
classname_to_class = {k[13:]: v for (k, v) in list(globals().items()) if k.startswith('SpecialPrefix')} # 13 is the length of 'SpecialPrefix'
This solution allows you to just name your classes with a prefix and have the dictionary automatically populate (after stripping out the special prefix if you so choose). These dictionaries are equivalent to trans and dct in the other solutions posted here, except without having to manually generate the dictionary.
Unlike the other solutions posted so far, these reduce the likelihood of a transcription error (and the amount of boilerplate code required) in cases where you have a lot more classes than A and B.
At the risk of drawing more negative fire... we can use metaclasses. This may or may not be suitable for your particular application. Every time you define a subclass of class O, you always have an up-to-date list (well, dict) of O's subclasses. Oh, and this is written for Python 2 (but can be ported to Python 3).
class OMetaclass(type):
    '''This metaclass adds a 'subclasses' attribute to its classes that
    maps subclass name to the class object.'''
    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}
        else:
            cls.subclasses[name] = cls
        super(OMetaclass, cls).__init__(name, bases, dct)

class O(object):
    __metaclass__ = OMetaclass

### Now, define the rest of your subclasses of O as usual.
class A(O):
    def __init__(self, x): pass

class B(O):
    def __init__(self, x): pass
Now, you have a dictionary, O.subclasses, that contains all the subclasses of O. You can now just do this:
Os = [O.subclasses[cls](arg) for (cls, arg) in inps]
Now, you don't have to worry about weird prefixes for your classes and you won't need to change your code if you're subclassing O already, but you've introduced magic (metaclasses) that may make your program harder to grok.
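For Python 3, the port mostly comes down to the metaclass= keyword in the class statement, since the __metaclass__ attribute is ignored there. A sketch of that port follows; the bodies of __init__ and update are made up for the sake of a runnable example, since the question does not show them.

```python
class OMetaclass(type):
    """Adds a 'subclasses' dict mapping subclass names to class objects."""

    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}          # first class created: the base O
        else:
            cls.subclasses[name] = cls   # every later class registers itself
        super().__init__(name, bases, dct)


class O(metaclass=OMetaclass):  # Python 3 spelling of __metaclass__
    def __init__(self, x):
        self.x = x

    def update(self, t):
        raise NotImplementedError


class A(O):
    def update(self, t):   # illustrative behaviour only
        return self.x + t


class B(O):
    def update(self, t):   # illustrative behaviour only
        return self.x * t


inps = [['A', 5], ['B', 2]]
Os = [O.subclasses[name](arg) for name, arg in inps]
results = [o.update(3) for o in Os]
```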
I need a way to inspect a class so I can safely identify which attributes are user-defined class attributes. The problem is that functions like dir(), inspect.getmembers() and friends return all class attributes, including the pre-defined ones like __class__, __doc__, __dict__ and __hash__. This is of course understandable, and one could argue that I could just make a list of named members to ignore, but unfortunately these pre-defined attributes are bound to change with different versions of Python, which makes my project vulnerable to changes in the Python project, and I don't like that.
example:
>>> class A:
...     a=10
...     b=20
...     def __init__(self):
...         self.c=30
>>> dir(A)
['__doc__', '__init__', '__module__', 'a', 'b']
>>> get_user_attributes(A)
['a','b']
In the example above I want a safe way to retrieve only the user-defined class attributes ['a','b'] not 'c' as it is an instance attribute. So my question is... Can anyone help me with the above fictive function get_user_attributes(cls)?
I have spent some time trying to solve the problem by parsing the class in AST level which would be very easy. But I can't find a way to convert already parsed objects to an AST node tree. I guess all AST info is discarded once a class has been compiled into bytecode.
Below is the hard way. Here's the easy way. Don't know why it didn't occur to me sooner.
import inspect
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    return [item
            for item in inspect.getmembers(cls)
            if item[0] not in boring]
Here's a start
import inspect
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    attrs = {}
    bases = reversed(inspect.getmro(cls))
    for base in bases:
        if hasattr(base, '__dict__'):
            attrs.update(base.__dict__)
        elif hasattr(base, '__slots__'):
            if hasattr(base, base.__slots__[0]):
                # We're dealing with a non-string sequence or one char string
                for item in base.__slots__:
                    attrs[item] = getattr(base, item)
            else:
                # We're dealing with a single identifier as a string
                attrs[base.__slots__] = getattr(base, base.__slots__)
    for key in boring:
        attrs.pop(key, None)  # tolerate keys that are absent (e.g. no __dict__ entry with __slots__)
    return attrs
This should be fairly robust. Essentially, it works by getting the attributes that are on a default subclass of object to ignore. It then gets the mro of the class that's passed to it and traverses it in reverse order so that subclass keys can overwrite superclass keys. It returns a dictionary of key-value pairs. If you want a list of key, value tuples like in inspect.getmembers then just return either attrs.items() or list(attrs.items()) in Python 3.
If you don't actually want to traverse the mro and just want attributes defined directly on the subclass then it's easier:
def get_user_attributes(cls):
    boring = dir(type('dummy', (object,), {}))
    if hasattr(cls, '__dict__'):
        attrs = cls.__dict__.copy()
    elif hasattr(cls, '__slots__'):
        attrs = {}
        if hasattr(cls, cls.__slots__[0]):
            # We're dealing with a non-string sequence or one char string
            for item in cls.__slots__:
                attrs[item] = getattr(cls, item)
        else:
            # We're dealing with a single identifier as a string
            attrs[cls.__slots__] = getattr(cls, cls.__slots__)
    for key in boring:
        attrs.pop(key, None)  # most inherited names are not in cls.__dict__, so don't assume presence
    return attrs
Double underscores on both ends of 'special attributes' have been a part of Python since before 2.0. It is very unlikely that this would change any time in the near future.
class Foo(object):
    a = 1
    b = 2

def get_attrs(klass):
    return [k for k in klass.__dict__.keys()
            if not k.startswith('__')
            and not k.endswith('__')]

print(get_attrs(Foo))
['a', 'b']
Thanks aaronasterling, you gave me the expression i needed :-)
My final class attribute inspector function looks like this:
def get_user_attributes(cls,exclude_methods=True):
    base_attrs = dir(type('dummy', (object,), {}))
    this_cls_attrs = dir(cls)
    res = []
    for attr in this_cls_attrs:
        # skip names every class has, and (optionally) anything callable
        if attr in base_attrs or (callable(getattr(cls,attr)) and exclude_methods):
            continue
        res.append(attr)
    return res
It either returns class attribute variables only (exclude_methods=True) or also retrieves the methods.
My initial tests of the above function show that it supports both old and new-style Python classes.
/ Jakob
If you use new style classes, could you simply subtract the attributes of the parent class?
class A(object):
    a = 10
    b = 20
    #...

def get_attrs(Foo):
    return [k for k in dir(Foo) if k not in dir(super(Foo))]
Edit: Not quite. __dict__,__module__ and __weakref__ appear when inheriting from object, but aren't there in object itself. You could special case these--I doubt they'd change very often.
Sorry for necro-bumping the thread. I'm surprised that there's still no simple function (or a library) to handle such common usage as of 2019.
I'd like to thank aaronasterling for the idea. Actually, set container provides a more straightforward way to express it:
class dummy: pass

def abridged_set_of_user_attributes(obj):
    return set(dir(obj))-set(dir(dummy))

def abridged_list_of_user_attributes(obj):
    return list(abridged_set_of_user_attributes(obj))
The original solution using a list comprehension is actually two levels of loops, because it combines two in keywords: the single for keyword makes it look like less work than it is, while the in membership test against a list is itself a linear scan.
This worked for me to include user-defined attributes with __ that might be found in cls.__dict__
import inspect

class A:
    __a = True

    def __init__(self, _a, b, c):
        self._a = _a
        self.b = b
        self.c = c

    def test(self):
        return False

cls = A(1, 2, 3)  # note: despite the name, this is an instance of A
members = inspect.getmembers(cls, predicate=lambda x: not inspect.ismethod(x))
attrs = set(dict(members).keys()).intersection(set(cls.__dict__.keys()))
__attrs = {m[0] for m in members if m[0].startswith(f'_{cls.__class__.__name__}')}
attrs.update(__attrs)
This will correctly yield: {'_A__a', '_a', 'b', 'c'}
You can update to clean the cls.__class__.__name__ if you wish
Say I have a Python function that returns multiple values in a tuple:
def func():
    return 1, 2
Is there a nice way to ignore one of the results rather than just assigning to a temporary variable? Say if I was only interested in the first value, is there a better way than this:
x, temp = func()
You can use x = func()[0] to return the first value, x = func()[1] to return the second, and so on.
If you want to get multiple values at a time, use something like x, y = func()[2:4].
One common convention is to use a "_" as a variable name for the elements of the tuple you wish to ignore. For instance:
def f():
    return 1, 2, 3

_, _, x = f()
If you're using Python 3, you can use a star before a variable (on the left side of an assignment) to have it collect the remaining items into a list when unpacking.
# Example 1: a is 1 and b is [2, 3]
a, *b = [1, 2, 3]
# Example 2: a is 1, b is [2, 3], and c is 4
a, *b, c = [1, 2, 3, 4]
# Example 3: b is [1, 2] and c is 3
*b, c = [1, 2, 3]
# Example 4: a is 1 and b is []
a, *b = [1]
The common practice is to use the dummy variable _ (single underscore), as many have indicated here before.
However, to avoid collisions with other uses of that variable name (see this response) it might be a better practice to use __ (double underscore) instead as a throwaway variable, as pointed by ncoghlan. E.g.:
x, __ = func()
Remember, when you return more than one item, you're really returning a tuple. So you can do things like this:
def func():
    return 1, 2

print(func()[0]) # prints 1
print(func()[1]) # prints 2
The best solution probably is to name things instead of returning meaningless tuples (unless there is some logic behind the order of the returned items). You can for example use a dictionary:
def func():
    return {'lat': 1, 'lng': 2}

latitude = func()['lat']
You could even use namedtuple if you want to add extra information about what you are returning (it's not just a dictionary, it's a pair of coordinates):
from collections import namedtuple

Coordinates = namedtuple('Coordinates', ['lat', 'lng'])

def func():
    return Coordinates(lat=1, lng=2)

latitude = func().lat
If the objects within your dictionary/tuple are strongly tied together then it may be a good idea to even define a class for it. That way you'll also be able to define more complex operations. A natural question that follows is: When should I be using classes in Python?
Recent versions of Python (≥ 3.7) have dataclasses, which you can use to define classes with very few lines of code:
from dataclasses import dataclass

@dataclass
class Coordinates:
    lat: float = 0
    lng: float = 0

def func():
    return Coordinates(lat=1, lng=2)

latitude = func().lat
The primary advantage of dataclasses over namedtuple is that they are easier to extend, but there are other differences. Note that by default, dataclasses are mutable, but you can use @dataclass(frozen=True) instead of @dataclass to make them immutable.
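A short sketch of the frozen variant, reusing the Coordinates example from above; the mutated flag exists only to make the outcome visible.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Coordinates:
    lat: float = 0
    lng: float = 0


point = Coordinates(lat=1, lng=2)

try:
    point.lat = 5          # assignment is rejected on a frozen dataclass
    mutated = True
except FrozenInstanceError:
    mutated = False
```

With frozen=True the assignment raises FrozenInstanceError, so point keeps its original values.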
Here is a video that might help you pick the right data class for your use case.
Three simple choices.
Obvious
x, _ = func()
x, junk = func()
Hideous
x = func()[0]
And there are ways to do this with a decorator.
def val0( aFunc ):
    def pick0( *args, **kw ):
        return aFunc(*args,**kw)[0]
    return pick0

func0= val0(func)
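The same idea can be generalized to any position. The sketch below is one possible way to do it, assuming a decorator factory is acceptable; pick is a hypothetical name, not something from the standard library.

```python
import functools


def pick(index):
    """Build a decorator that keeps only result[index] of the wrapped function."""
    def decorator(a_func):
        @functools.wraps(a_func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kw):
            return a_func(*args, **kw)[index]
        return wrapper
    return decorator


@pick(0)
def first_only():
    return 1, 2


@pick(1)
def second_only():
    return 1, 2
```

Calling first_only() gives 1 and second_only() gives 2, at the cost of hiding the fact that the underlying function computes more than it returns.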
This seems like the best choice to me:
val1, val2, ignored1, ignored2 = some_function()
It's not cryptic or ugly (like the func()[index] method), and clearly states your purpose.
If this is a function that you use all the time but always discard the second return value, I would argue that it is less messy to create an alias for the function without the second return value using lambda.
def func():
    return 1, 2

func_ = lambda: func()[0]

func_() # returns 1
This is not a direct answer to the question. Rather it answers this question: "How do I choose a specific function output from many possible options?".
If you are able to write the function (ie, it is not in a library you cannot modify), then add an input argument that indicates what you want out of the function. Make it a named argument with a default value so in the "common case" you don't even have to specify it.
def fancy_function( arg1, arg2, return_type=1 ):
    ret_val = None
    if( 1 == return_type ):
        ret_val = arg1 + arg2
    elif( 2 == return_type ):
        ret_val = [ arg1, arg2, arg1 * arg2 ]
    else:
        ret_val = ( arg1, arg2, arg1 + arg2, arg1 * arg2 )
    return( ret_val )
This method gives the function "advance warning" regarding the desired output. Consequently it can skip unneeded processing and only do the work necessary to get your desired output. Also, because Python is dynamically typed, the return type can change. Notice how the example returns a scalar, a list or a tuple... whatever you like!
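For illustration, here is how the three modes behave when called. The function is repeated so the snippet runs on its own; the argument values 2 and 3 are arbitrary.

```python
def fancy_function(arg1, arg2, return_type=1):
    if return_type == 1:
        return arg1 + arg2                      # scalar
    elif return_type == 2:
        return [arg1, arg2, arg1 * arg2]        # list
    else:
        return (arg1, arg2, arg1 + arg2, arg1 * arg2)  # tuple


total = fancy_function(2, 3)                       # common case, default mode
as_list = fancy_function(2, 3, return_type=2)
everything = fancy_function(2, 3, return_type=3)
```

The downside of this design is that callers can no longer tell the return type from the function name alone, so it is worth documenting each mode.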
When you have many output from a function and you don't want to call it multiple times, I think the clearest way for selecting the results would be :
results = fct()
a,b = [results[i] for i in list_of_index]
As a minimum working example, also demonstrating that the function is called only once :
def fct(a):
    b=a*2
    c=a+2
    d=a+b
    e=b*2
    f=a*a
    print("fct called")
    return[a,b,c,d,e,f]
results=fct(3)
> fct called
x,y = [results[i] for i in [1,4]]
And the values are as expected :
results
> [3,6,5,9,12,9]
x
> 6
y
> 12
For convenience, negative list indexes can also be used:
x,y = [results[i] for i in [0,-2]]
This gives x = 3 and y = 12.
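An alternative to the list comprehension is operator.itemgetter from the standard library, which builds a callable that picks several positions at once. The sketch below reuses the result values from the example above as a literal list.

```python
from operator import itemgetter

results = [3, 6, 5, 9, 12, 9]   # same values as returned by fct(3) above

# itemgetter with several indexes returns a tuple of the selected items.
x, y = itemgetter(1, 4)(results)

# Negative indexes work as well.
first, second_to_last = itemgetter(0, -2)(results)
```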
It is possible to ignore every variable except the first with less syntax if you like. If we take your example,
# The function you are calling.
def func():
    return 1, 2
# You seem to only be interested in the first output.
x, temp = func()
I have found the following to work:
x, *_ = func()
This approach "unpacks" with * all other variables into a "throwaway" variable _. This has the benefit of assigning the one variable you want and ignoring all variables behind it.
However, in many cases you may want an output that is not the first output of the function. In these cases, it is probably best to indicate this by using the func()[i] where i is the index location of the output you desire. In your case,
# i == 0 because of zero-index.
x = func()[0]
As a side note, if you want to get fancy in Python 3, you could do something like this,
# This works the other way around.
*_, y = func()
Your function only outputs two potential variables, so this does not look too powerful until you have a case like this,
def func():
    return 1, 2, 3, 4

# I only want the first and last.
x, *_, d = func()