I want to add __add__ and __radd__ to the python base class set.
The code can be as simple as
def __add__(self, other) :
assert isinstance(other, set), \
"perhaps additional type checking or argument validation can go here but" + \
" strictly only defined for pure python sets"
return set( list(self) + list(other) )
def __radd__(self, other) :
assert isinstance(other, set), \
"perhaps additional type checking or argument validation can go here but" + \
" strictly only defined for pure sets"
return set( list(other) + list(self) )
What is the pythonic implementation of this and how can I extend the base class without creating my own MySet class that takes set as a parent class? Can I just use set.__add__ = some_function_I_defined?
What you should do imho is subclass the built-in set class.
In Python (contrary to e.g. Ruby or JavaScript) monkey-patching a built-in is not allowed.
So e.g. trying to add a non-existent method:
x = [1,2,3]
x.my_new_method_added_in_runtime = lambda: "whatever"
is not going to work, you'd get AttributeError: 'list' object has no attribute 'my_new_method_added_in_runtime'
You cannot also modify the existing methods of objects instantiated using those built-ins:
x = [1,2,3]
x.sort = lambda: "returning some string instead of sorting..."
will result in AttributeError: 'list' object attribute 'sort' is read-only
And
list.append = None
# OR
del list.append
will result in:
TypeError: can't set attributes of built-in/extension type 'list'
All of the above is true for set as well an so on.
You could try to look for some libraries to achieve that e.g. https://pypi.org/project/forbiddenfruit/0.1.0/, but it's strongly discouraged.
#chepner correctly answered my question without realizing it. I was trying to reimplement already existing functionality since I was not familiar with python sets.
The act of joining two sets, 'taking their union' is implemented with the python __or__ and __ior__ methods. The operation I needed specifically was | and |=.
In principle we should be able to set operations as chepner suggested with set.__add__ = set.__or__, but as chepner points out this results in an error:
TypeError: cannot set '__add__' attribute of immutable type 'set'
Thank you all.
Related
For me what I do is detect what is unpickable and make it into a string (I guess I could have deleted it too but then it will falsely tell me that field didn't exist but I'd rather have it exist but be a string). But I wanted to know if there was a less hacky more official way to do this.
Current code I use:
def make_args_pickable(args: Namespace) -> Namespace:
"""
Returns a copy of the args namespace but with unpickable objects as strings.
note: implementation not tested against deep copying.
ref:
- https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
"""
pickable_args = argparse.Namespace()
# - go through fields in args, if they are not pickable make it a string else leave as it
# The vars() function returns the __dict__ attribute of the given object.
for field in vars(args):
field_val: Any = getattr(args, field)
if not dill.pickles(field_val):
field_val: str = str(field_val)
setattr(pickable_args, field, field_val)
return pickable_args
Context: I think I do it mostly to remove the annoying tensorboard object I carry around (but I don't think I will need the .tb field anymore thanks to wandb/weights and biases). Not that this matters a lot but context is always nice.
Related:
What does it mean for an object to be picklable (or pickle-able)?
Python - How can I make this un-pickleable object pickleable?
Edit:
Since I decided to move away from dill - since sometimes it cannot recover classes/objects (probably because it cannot save their code or something) - I decided to only use pickle (which seems to be the recommended way to be done in PyTorch).
So what is the official (perhaps optimized) way to check for pickables without dill or with the official pickle?
Is this the best:
def is_picklable(obj):
try:
pickle.dumps(obj)
except pickle.PicklingError:
return False
return True
thus current soln:
def make_args_pickable(args: Namespace) -> Namespace:
"""
Returns a copy of the args namespace but with unpickable objects as strings.
note: implementation not tested against deep copying.
ref:
- https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
"""
pickable_args = argparse.Namespace()
# - go through fields in args, if they are not pickable make it a string else leave as it
# The vars() function returns the __dict__ attribute of the given object.
for field in vars(args):
field_val: Any = getattr(args, field)
# - if current field value is not pickable, make it pickable by casting to string
if not dill.pickles(field_val):
field_val: str = str(field_val)
elif not is_picklable(field_val):
field_val: str = str(field_val)
# - after this line the invariant is that it should be pickable, so set it in the new args obj
setattr(pickable_args, field, field_val)
return pickable_args
def make_opts_pickable(opts):
""" Makes a namespace pickable """
return make_args_pickable(opts)
def is_picklable(obj: Any) -> bool:
"""
Checks if somehting is pickable.
Ref:
- https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
"""
import pickle
try:
pickle.dumps(obj)
except pickle.PicklingError:
return False
return True
Note: one of the reasons I want something "offical"/tested is because I am getting pycharm halt on the try catch: How to stop PyCharm's break/stop/halt feature on handled exceptions (i.e. only break on python unhandled exceptions)? which is not what I want...I want it to only halt on unhandled exceptions.
What is the proper way to make an object with unpickable fields pickable?
I believe the answer to this belongs in the question you linked -- Python - How can I make this un-pickleable object pickleable?. I've added a new answer to that question explaining how you can make an unpicklable object picklable the proper way, without using __reduce__.
So what is the official (perhaps optimized) way to check for pickables without dill or with the official pickle?
Objects that are picklable are defined in the docs as follows:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose dict or the result of calling getstate() is picklable (see section Pickling Class Instances for details).
The tricky parts are (1) knowing how functions/classes are defined (you can probably use the inspect module for that) and (2) recursing through objects, checking against the rules above.
There are a lot of caveats to this, such as the pickle protocol versions, whether the object is an extension type (defined in a C extension like numpy, for example) or an instance of a 'user-defined' class. Usage of __slots__ can also impact whether an object is picklable or not (since __slots__ means there's no __dict__), but can be pickled with __getstate__. Some objects may also be registered with a custom function for pickling. So, you'd need to know if that has happened as well.
Technically, you can implement a function to check for all of this in Python, but it will be quite slow by comparison. The easiest (and probably most performant, as pickle is implemented in C) way to do this is to simply attempt to pickle the object you want to check.
I tested this with PyCharm pickling all kinds of things... it doesn't halt with this method. The key is that you must anticipate pretty much any kind of exception (see footnote 3 in the docs). The warnings are optional, they're mostly explanatory for the context of this question.
def is_picklable(obj: Any) -> bool:
try:
pickle.dumps(obj)
return True
except (pickle.PicklingError, pickle.PickleError, AttributeError, ImportError):
# https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled
return False
except RecursionError:
warnings.warn(
f"Could not determine if object of type {type(obj)!r} is picklable"
"due to a RecursionError that was supressed. "
"Setting a higher recursion limit MAY allow this object to be pickled"
)
return False
except Exception as e:
# https://docs.python.org/3/library/pickle.html#id9
warnings.warn(
f"An error occurred while attempting to pickle"
f"object of type {type(obj)!r}. Assuming it's unpicklable. The exception was {e}"
)
return False
Using the example from my other answer I linked above, you could make your object picklable by implementing __getstate__ and __setstate__ (or subclassing and adding them, or making a wrapper class) adapting your make_args_pickable...
class Unpicklable:
"""
A simple marker class so we can distinguish when a deserialized object
is a string because it was originally unpicklable
(and not simply a string to begin with)
"""
def __init__(self, obj_str: str):
self.obj_str = obj_str
def __str__(self):
return self.obj_str
def __repr__(self):
return f'Unpicklable(obj_str={self.obj_str!r})'
class PicklableNamespace(Namespace):
def __getstate__(self):
"""For serialization"""
# always make a copy so you don't accidentally modify state
state = self.__dict__.copy()
# Any unpicklables will be converted to a ``Unpicklable`` object
# with its str format stored in the object
for key, val in state.items():
if not is_picklable(val):
state[key] = Unpicklable(str(val))
return state
def __setstate__(self, state):
self.__dict__.update(state) # or leave unimplemented
In action, I'll pickle a namespace whose attributes contain a file handle (normally not picklable) and then load the pickle data.
# Normally file handles are not picklable
p = PicklableNamespace(f=open('test.txt'))
data = pickle.dumps(p)
del p
loaded_p = pickle.loads(data)
# PicklableNamespace(f=Unpicklable(obj_str="<_io.TextIOWrapper name='test.txt' mode='r' encoding='cp1252'>"))
Yes, a try/except is the best way to go about this.
Per the docs, pickle is capable of recursively pickling objects, that is to say, if you have a list of objects that are pickleable, it will pickle all objects inside of that list if you attempt to pickle that list. This means that you cannot feasibly test to see if an object is pickleable without pickling it. Because of that, your structure of:
def is_picklable(obj):
try:
pickle.dumps(obj)
except pickle.PicklingError:
return False
return True
is the simplest and easiest way to go about checking this. If you are not working with recursive structures and/or you can safely assume that all recursive structures will only contain pickleable objects, you could check the type() value of the object against the list of pickleable objects:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose dict or the result of calling getstate() is picklable (see section Pickling Class Instances for details).
This is likely faster than using a try:... except:... like you showed in your question.
To me no matter the error I want my function to tell me it's not pickable. So it seems to work if I do this:
def is_picklable(obj: Any) -> bool:
"""
Checks if somehting is pickable.
Ref:
- https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
- pycharm halting all the time issue: https://stackoverflow.com/questions/70761481/how-to-stop-pycharms-break-stop-halt-feature-on-handled-exceptions-i-e-only-b
"""
import pickle
try:
pickle.dumps(obj)
except:
return False
return True
plus as an added bonus it doesn't freak pycharm out see How to stop PyCharm's break/stop/halt feature on handled exceptions (i.e. only break on python unhandled exceptions)? for details.
Construction getattr(obj, 'attr1.attr2', None) does not work.
What are the best practices to replace this construction?
Divide that into two getattr statements?
You can use operator.attrgetter() in order to get multiple attributes at once:
from operator import attrgetter
my_attrs = attrgetter(attr1, attr2)(obj)
As stated in this answer, the most straightforward solution would be to use operator.attrgetter (more info in this python docs page).
If for some reason, this solution doesn't make you happy, you could use this code snippet:
def multi_getattr(obj, attr, default = None):
"""
Get a named attribute from an object; multi_getattr(x, 'a.b.c.d') is
equivalent to x.a.b.c.d. When a default argument is given, it is
returned when any attribute in the chain doesn't exist; without
it, an exception is raised when a missing attribute is encountered.
"""
attributes = attr.split(".")
for i in attributes:
try:
obj = getattr(obj, i)
except AttributeError:
if default:
return default
else:
raise
return obj
# Example usage
obj = [1,2,3]
attr = "append.__doc__.capitalize.__doc__"
multi_getattr(obj, attr) #Will return the docstring for the
#capitalize method of the builtin string
#object
from this page, which does work. I tested and used it.
I would suggest using something like this:
from operator import attrgetter
attrgetter('attr0.attr1.attr2.attr3')(obj)
If you have the attribute names you want to get in a list, you can do the following:
my_attrs = [getattr(obj, attr) for attr in attr_list]
A simple, but not very eloquent way, to get multiple attr would be to use tuples with or without brackets something like
aval, bval = getattr(myObj,"a"), getattr(myObj,"b")
but I think you might be wanting instead to get atrribute of a contained object with the way you are using dot notation. In which case it would be something like
getattr(myObj.contained, "c")
where contained is an object cotained within myObj object and c is an attribute of contained. Let me know if this is not what you want.
This question already has answers here:
Access to class attributes by using a variable in Python? [duplicate]
(2 answers)
Closed 7 years ago.
I got a class:
class thing:
def __init__(self, inp, inp2):
self.aThing = inp
self.bThing = inp2
def test(self, character):
print(self.eval(character+"Thing"))
(Important is the last line)
The test script would look like this:
x = thing(1, 2)
x.test("b")
Trying to run this it screams at me:
AttributeError: 'thing' object has no attribute 'eval'
Makes sense, so I tried this:
print(self.(eval(character+"Thing")))
Now its a Syntax error: .(
So is there a way Í could do this? And if not, is there a way to do this kind of case distinction without having to list all possible cases? (like using if-statements or the switch equivalent)
Oh, and I know that in this case I could just use a dictionary instead of a class, but that is not what I am searching for...
Your question is very broad and you're probably better off watching a talk on polymorphism and the super function(behaviour distinction without if) or python design patterns.
For the specific problem you posted you want to use getattr:
getattr(object, name[, default])
Return the value of the named
attribute of object. name must be a string. If the string is the name
of one of the object’s attributes, the result is the value of that
attribute. For example, getattr(x, 'foobar') is equivalent to
x.foobar. If the named attribute does not exist, default is returned
if provided, otherwise AttributeError is raised.
class thing:
def __init__(self, inp, inp2):
self.aThing = inp
self.bThing = inp2
def test(self, character):
print(getattr(self,character+"Thing"))
Example Demo
>>> t = thing(1, 2)
>>> t.test("a")
1
I need to determine if a given Python variable is an instance of native type: str, int, float, bool, list, dict and so on. Is there elegant way to doing it?
Or is this the only way:
if myvar in (str, int, float, bool):
# do something
This is an old question but it seems none of the answers actually answer the specific question: "(How-to) Determine if Python variable is an instance of a built-in type". Note that it's not "[...] of a specific/given built-in type" but of a.
The proper way to determine if a given object is an instance of a buil-in type/class is to check if the type of the object happens to be defined in the module __builtin__.
def is_builtin_class_instance(obj):
return obj.__class__.__module__ == '__builtin__'
Warning: if obj is a class and not an instance, no matter if that class is built-in or not, True will be returned since a class is also an object, an instance of type (i.e. AnyClass.__class__ is type).
The best way to achieve this is to collect the types in a list of tuple called primitiveTypes and:
if isinstance(myvar, primitiveTypes): ...
The types module contains collections of all important types which can help to build the list/tuple.
Works since Python 2.2
Not that I know why you would want to do it, as there isn't any "simple" types in Python, it's all objects. But this works:
type(theobject).__name__ in dir(__builtins__)
But explicitly listing the types is probably better as it's clearer. Or even better: Changing the application so you don't need to know the difference.
Update: The problem that needs solving is how to make a serializer for objects, even those built-in. The best way to do this is not to make a big phat serializer that treats builtins differently, but to look up serializers based on type.
Something like this:
def IntSerializer(theint):
return str(theint)
def StringSerializer(thestring):
return repr(thestring)
def MyOwnSerializer(value):
return "whatever"
serializers = {
int: IntSerializer,
str: StringSerializer,
mymodel.myclass: MyOwnSerializer,
}
def serialize(ob):
try:
return ob.serialize() #For objects that know they need to be serialized
except AttributeError:
# Look up the serializer amongst the serializer based on type.
# Default to using "repr" (works for most builtins).
return serializers.get(type(ob), repr)(ob)
This way you can easily add new serializers, and the code is easy to maintain and clear, as each type has its own serializer. Notice how the fact that some types are builtin became completely irrelevant. :)
You appear to be interested in assuring the simplejson will handle your types. This is done trivially by
try:
json.dumps( object )
except TypeError:
print "Can't convert", object
Which is more reliable than trying to guess which types your JSON implementation handles.
What is a "native type" in Python? Please don't base your code on types, use Duck Typing.
you can access all these types by types module:
`builtin_types = [ i for i in types.__dict__.values() if isinstance(i, type)]`
as a reminder, import module types first
def isBuiltinTypes(var):
return type(var) in types.__dict__.values() and not isinstance(var, types.InstanceType)
It's 2020, I'm on python 3.7, and none of the existing answers worked for me. What worked instead is the builtins module. Here's how:
import builtins
type(your_object).__name__ in dir(builtins)
Built in type function may be helpful:
>>> a = 5
>>> type(a)
<type 'int'>
building off of S.Lott's answer you should have something like this:
from simplejson import JSONEncoder
class JSONEncodeAll(JSONEncoder):
def default(self, obj):
try:
return JSONEncoder.default(self, obj)
except TypeError:
## optionally
# try:
# # you'd have to add this per object, but if an object wants to do something
# # special then it can do whatever it wants
# return obj.__json__()
# except AttributeError:
##
# ...do whatever you are doing now...
# (which should be creating an object simplejson understands)
to use:
>>> json = JSONEncodeAll()
>>> json.encode(myObject)
# whatever myObject looks like when it passes through your serialization code
these calls will use your special class and if simplejson can take care of the object it will. Otherwise your catchall functionality will be triggered, and possibly (depending if you use the optional part) an object can define it's own serialization
For me the best option is:
allowed_modules = set(['numpy'])
def isprimitive(value):
return not hasattr(value, '__dict__') or \
value.__class__.__module__ in allowed_modules
This fix when value is a module and value.__class__.__module__ == '__builtin__' will fail.
The question asks to check for non-class types. These types don't have a __dict__ member (You could also test for __repr__ member, instead of checking for __dict__) Other answers mention to check for membership in types.__dict__.values(), but some of the types in this list are classes.
def isnonclasstype(val):
return getattr(val,"__dict__", None) != None
a=2
print( isnonclasstype(a) )
a="aaa"
print( isnonclasstype(a) )
a=[1,2,3]
print( isnonclasstype(a) )
a={ "1": 1, "2" : 2 }
print( isnonclasstype(a) )
class Foo:
def __init__(self):
pass
a = Foo()
print( isnonclasstype(a) )
gives me:
> python3 t.py
False
False
False
False
True
> python t.py
False
False
False
False
True
I need to determine if a given Python variable is an instance of native type: str, int, float, bool, list, dict and so on. Is there elegant way to doing it?
Or is this the only way:
if myvar in (str, int, float, bool):
# do something
This is an old question but it seems none of the answers actually answer the specific question: "(How-to) Determine if Python variable is an instance of a built-in type". Note that it's not "[...] of a specific/given built-in type" but of a.
The proper way to determine if a given object is an instance of a buil-in type/class is to check if the type of the object happens to be defined in the module __builtin__.
def is_builtin_class_instance(obj):
return obj.__class__.__module__ == '__builtin__'
Warning: if obj is a class and not an instance, no matter if that class is built-in or not, True will be returned since a class is also an object, an instance of type (i.e. AnyClass.__class__ is type).
The best way to achieve this is to collect the types in a list of tuple called primitiveTypes and:
if isinstance(myvar, primitiveTypes): ...
The types module contains collections of all important types which can help to build the list/tuple.
Works since Python 2.2
Not that I know why you would want to do it, as there isn't any "simple" types in Python, it's all objects. But this works:
type(theobject).__name__ in dir(__builtins__)
But explicitly listing the types is probably better as it's clearer. Or even better: Changing the application so you don't need to know the difference.
Update: The problem that needs solving is how to make a serializer for objects, even those built-in. The best way to do this is not to make a big phat serializer that treats builtins differently, but to look up serializers based on type.
Something like this:
def IntSerializer(theint):
return str(theint)
def StringSerializer(thestring):
return repr(thestring)
def MyOwnSerializer(value):
return "whatever"
serializers = {
int: IntSerializer,
str: StringSerializer,
mymodel.myclass: MyOwnSerializer,
}
def serialize(ob):
try:
return ob.serialize() #For objects that know they need to be serialized
except AttributeError:
# Look up the serializer amongst the serializer based on type.
# Default to using "repr" (works for most builtins).
return serializers.get(type(ob), repr)(ob)
This way you can easily add new serializers, and the code is easy to maintain and clear, as each type has its own serializer. Notice how the fact that some types are builtin became completely irrelevant. :)
You appear to be interested in assuring the simplejson will handle your types. This is done trivially by
try:
json.dumps( object )
except TypeError:
print "Can't convert", object
Which is more reliable than trying to guess which types your JSON implementation handles.
What is a "native type" in Python? Please don't base your code on types, use Duck Typing.
you can access all these types by types module:
`builtin_types = [ i for i in types.__dict__.values() if isinstance(i, type)]`
as a reminder, import module types first
def isBuiltinTypes(var):
return type(var) in types.__dict__.values() and not isinstance(var, types.InstanceType)
It's 2020, I'm on python 3.7, and none of the existing answers worked for me. What worked instead is the builtins module. Here's how:
import builtins
type(your_object).__name__ in dir(builtins)
Built in type function may be helpful:
>>> a = 5
>>> type(a)
<type 'int'>
building off of S.Lott's answer you should have something like this:
from simplejson import JSONEncoder
class JSONEncodeAll(JSONEncoder):
def default(self, obj):
try:
return JSONEncoder.default(self, obj)
except TypeError:
## optionally
# try:
# # you'd have to add this per object, but if an object wants to do something
# # special then it can do whatever it wants
# return obj.__json__()
# except AttributeError:
##
# ...do whatever you are doing now...
# (which should be creating an object simplejson understands)
to use:
>>> json = JSONEncodeAll()
>>> json.encode(myObject)
# whatever myObject looks like when it passes through your serialization code
these calls will use your special class and if simplejson can take care of the object it will. Otherwise your catchall functionality will be triggered, and possibly (depending if you use the optional part) an object can define it's own serialization
For me the best option is:
allowed_modules = set(['numpy'])
def isprimitive(value):
return not hasattr(value, '__dict__') or \
value.__class__.__module__ in allowed_modules
This fix when value is a module and value.__class__.__module__ == '__builtin__' will fail.
The question asks to check for non-class types. These types don't have a __dict__ member (You could also test for __repr__ member, instead of checking for __dict__) Other answers mention to check for membership in types.__dict__.values(), but some of the types in this list are classes.
def isnonclasstype(val):
return getattr(val,"__dict__", None) != None
a=2
print( isnonclasstype(a) )
a="aaa"
print( isnonclasstype(a) )
a=[1,2,3]
print( isnonclasstype(a) )
a={ "1": 1, "2" : 2 }
print( isnonclasstype(a) )
class Foo:
def __init__(self):
pass
a = Foo()
print( isnonclasstype(a) )
gives me:
> python3 t.py
False
False
False
False
True
> python t.py
False
False
False
False
True