Python: Checking if an object is atomically pickleable - python

What's an accurate way of checking whether an object can be atomically pickled? When I say "atomically pickled", I mean without considering other objects it may refer to. For example, this list:
l = [threading.Lock()]
is not a pickleable object, because it refers to a Lock, which is not pickleable. But atomically, this list itself is pickleable.
So how do you check whether an object is atomically pickleable? (I'm guessing the check should be done on the class, but I'm not sure.)
I want it to behave like this:
>>> is_atomically_pickleable(3)
True
>>> is_atomically_pickleable(3.1)
True
>>> is_atomically_pickleable([1, 2, 3])
True
>>> is_atomically_pickleable(threading.Lock())
False
>>> is_atomically_pickleable(open('whatever', 'r'))
False
Etc.

Given that you're willing to break encapsulation, I think this is the best you can do:
from pickle import Pickler
import os
class AtomicPickler(Pickler):
    def __init__(self, protocol):
        # You may want to replace this with a fake file object that just
        # discards writes.
        blackhole = open(os.devnull, 'w')
        Pickler.__init__(self, blackhole, protocol)
        self.depth = 0
    def save(self, o):
        self.depth += 1
        if self.depth == 1:
            return Pickler.save(self, o)
        self.depth -= 1
        return
def is_atomically_pickleable(o, protocol=None):
    pickler = AtomicPickler(protocol)
    try:
        pickler.dump(o)
        return True
    except:
        # Hopefully this exception was actually caused by dump(), and not
        # something like a KeyboardInterrupt
        return False
In Python the only way you can tell if something will work is to try it. That's the nature of a language as dynamic as Python. The difficulty with your question is that you want to distinguish between failures at the "top level" and failures at deeper levels.
Pickler.save is essentially the control-center for Python's pickling logic, so the above creates a modified Pickler that ignores recursive calls to its save method. Any exception raised while in the top-level save is treated as a pickling failure. You may want to add qualifiers to the except statement. Unqualified excepts in Python are generally a bad idea as exceptions are used not just for program errors but also for things like KeyboardInterrupt and SystemExit.
This can give what are arguably false negatives for types with odd custom pickling logic. For example, if you create a custom list-like class that instead of causing Pickler.save to be recursively called it actually tried to pickle its elements on its own somehow, and then created an instance of this class that contained an element that its custom logic could not pickle, is_atomically_pickleable would return False for this instance even though removing the offending element would result in an object that was pickleable.
Also, note the protocol argument to is_atomically_pickleable. Theoretically an object could behave differently when pickled with different protocols (though that would be pretty weird) so you should make this match the protocol argument you give to dump.
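For readers on Python 3, here is a minimal sketch of the same idea. It rests on one loud assumption: it subclasses `pickle._Pickler`, the private pure-Python pickler, because overriding `save` on the C-accelerated `pickle.Pickler` has no effect. That name is not a documented API and may change. An in-memory `io.BytesIO` buffer stands in for the discarded output, and `except Exception` replaces the bare `except` so that KeyboardInterrupt and SystemExit still propagate.

```python
import io
import pickle
import threading


class AtomicPickler(pickle._Pickler):
    """Pickle only the top-level object; nested save() calls are skipped.

    Assumption: pickle._Pickler is the private pure-Python pickler class.
    """

    def __init__(self, protocol=None):
        # Output is written to a throwaway in-memory buffer.
        super().__init__(io.BytesIO(), protocol)
        self.depth = 0

    def save(self, obj, *args, **kwargs):
        self.depth += 1
        try:
            if self.depth == 1:
                # Only the outermost call does real pickling work.
                return super().save(obj, *args, **kwargs)
            return None
        finally:
            self.depth -= 1


def is_atomically_pickleable(obj, protocol=None):
    try:
        AtomicPickler(protocol).dump(obj)
    except Exception:
        return False
    return True


print(is_atomically_pickleable([threading.Lock()]))
print(is_atomically_pickleable(threading.Lock()))
```

The list is reported as atomically pickleable because the lock inside it is never visited; the bare lock is rejected because its own reduction fails at the top level.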

Given the dynamic nature of Python, I don't think there's really a well-defined way to do what you're asking aside from heuristics or a whitelist.
If I say:
x = object()
is x "atomically pickleable"? What if I say:
x.foo = threading.Lock()
? is x "atomically pickleable" now?
What if I made a separate class that always had a lock attribute? What if I deleted that attribute from an instance?

I think the persistent_id interface is a poor match for what you are attempting to do. It is designed for cases where your object should refer to equivalent objects in the new program rather than copies of the old ones. You are instead attempting to filter out every object that cannot be pickled, which is a different problem, and it raises the question of why you are attempting to do this.
I think this is a sign of a problem in your code. The fact that you want to pickle objects which refer to GUI widgets, files, and locks suggests that you are doing something strange. The kind of objects you typically persist shouldn't be related to or hold references to that sort of object.
Having said that, I think your best option is the following:
from pickle import Pickler, PicklingError
class MyPickler(Pickler):
    def save(self, obj):
        try:
            Pickler.save(self, obj)
        except PicklingError:
            # FilteredObject is your own picklable stand-in class.
            Pickler.save(self, FilteredObject(obj))
This should work for the Python implementation; I make no guarantees as to what will happen in the C implementation. Every object which gets saved will be passed to the save method. This method will raise PicklingError when it cannot pickle the object. At this point, you can step in and call save again, asking it to pickle your own stand-in object, which should pickle just fine.
EDIT
From my understanding, you have essentially a user-created dictionary of objects. Some objects are picklable and some aren't. I'd do this:
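import cPickle
from cPickle import PicklingError
class saveable_dict(dict):
    def __getstate__(self):
        data = {}
        for key, value in self.items():
            try:
                encoded = cPickle.dumps(value)
            except PicklingError:
                # Unpicklable is your own picklable marker class.
                encoded = cPickle.dumps(Unpicklable())
            # Store the encoded value; without this the state stays empty.
            data[key] = encoded
        return data
    def __setstate__(self, state):
        for key, value in state.items():
            self[key] = cPickle.loads(value)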
Then use that dictionary when you want to hold that collection of objects. The user should be able to get any picklable objects back, but everything else will come back as an Unpicklable() object. The difference between this and the previous approach is in objects which are themselves picklable but have references to unpicklable objects. But those objects are probably going to come back broken regardless.
This approach also has the benefit that it remains entirely within the defined API and thus should work in either cPickle or pickle.

I ended up coding my own solution to this.
Here's the code. Here are the tests. It's part of GarlicSim, so you can use it by installing garlicsim and doing from garlicsim.general_misc import pickle_tools.
If you want to use it on Python 3 code, use the Python 3 fork of garlicsim.
Here is an excerpt from the module (may be outdated):
import re
import cPickle as pickle_module
import pickle # Importing just to get dispatch table, not pickling with it.
import copy_reg
import types
from garlicsim.general_misc import address_tools
from garlicsim.general_misc import misc_tools
def is_atomically_pickleable(thing):
    '''
    Return whether `thing` is an atomically pickleable object.
    "Atomically-pickleable" means that it's pickleable without considering any
    other object that it contains or refers to. For example, a `list` is
    atomically pickleable, even if it contains an unpickleable object, like a
    `threading.Lock()`.
    However, the `threading.Lock()` itself is not atomically pickleable.
    '''
    my_type = misc_tools.get_actual_type(thing)
    return _is_type_atomically_pickleable(my_type, thing)
def _is_type_atomically_pickleable(type_, thing=None):
    '''Return whether `type_` is an atomically pickleable type.'''
    try:
        return _is_type_atomically_pickleable.cache[type_]
    except KeyError:
        pass
    if thing is not None:
        assert isinstance(thing, type_)
    # Sub-function in order to do caching without crowding the main algorithm:
    def get_result():
        # We allow a flag for types to painlessly declare whether they're
        # atomically pickleable:
        if hasattr(type_, '_is_atomically_pickleable'):
            return type_._is_atomically_pickleable
        # Weird special case: `threading.Lock` objects don't have `__class__`.
        # We assume that objects that don't have `__class__` can't be pickled.
        # (With the exception of old-style classes themselves.)
        if not hasattr(thing, '__class__') and \
           (not isinstance(thing, types.ClassType)):
            return False
        if not issubclass(type_, object):
            return True
        def assert_legit_pickling_exception(exception):
            '''Assert that `exception` reports a problem in pickling.'''
            message = exception.args[0]
            segments = [
                "can't pickle",
                'should only be shared between processes through inheritance',
                'cannot be passed between processes or pickled'
            ]
            assert any((segment in message) for segment in segments)
            # todo: turn to warning
        if type_ in pickle.Pickler.dispatch:
            return True
        reduce_function = copy_reg.dispatch_table.get(type_)
        if reduce_function:
            try:
                reduce_result = reduce_function(thing)
            except Exception, exception:
                assert_legit_pickling_exception(exception)
                return False
            else:
                return True
        reduce_function = getattr(type_, '__reduce_ex__', None)
        if reduce_function:
            try:
                reduce_result = reduce_function(thing, 0)
                # (The `0` is the protocol argument.)
            except Exception, exception:
                assert_legit_pickling_exception(exception)
                return False
            else:
                return True
        reduce_function = getattr(type_, '__reduce__', None)
        if reduce_function:
            try:
                reduce_result = reduce_function(thing)
            except Exception, exception:
                assert_legit_pickling_exception(exception)
                return False
            else:
                return True
        return False
    result = get_result()
    _is_type_atomically_pickleable.cache[type_] = result
    return result
_is_type_atomically_pickleable.cache = {}

dill has the pickles method for such a check.
>>> import threading
>>> l = [threading.Lock()]
>>>
>>> import dill
>>> dill.pickles(l)
True
>>>
>>> dill.pickles(threading.Lock())
True
>>> f = open('whatever', 'w')
>>> f.close()
>>> dill.pickles(open('whatever', 'r'))
True
Well, dill atomically pickles all of your examples, so let's try something else:
>>> l = [iter([1,2,3]), xrange(5)]
>>> dill.pickles(l)
False
Ok, this fails. Now, let's investigate:
>>> dill.detect.trace(True)
>>> dill.pickles(l)
T4: <type 'listiterator'>
False
>>> map(dill.pickles, l)
T4: <type 'listiterator'>
Si: xrange(5)
F2: <function _eval_repr at 0x106991cf8>
[False, True]
Ok. we can see the iter fails, but the xrange does pickle. So, let's replace the iter.
>>> l[0] = xrange(1,4)
>>> dill.pickles(l)
Si: xrange(1, 4)
F2: <function _eval_repr at 0x106991cf8>
Si: xrange(5)
True
Now our object atomically pickles.

Related

What is the proper way to make an object with unpickable fields pickable?

For me what I do is detect what is unpickable and make it into a string (I guess I could have deleted it too but then it will falsely tell me that field didn't exist but I'd rather have it exist but be a string). But I wanted to know if there was a less hacky more official way to do this.
Current code I use:
import argparse
from argparse import Namespace
from typing import Any
import dill
def make_args_pickable(args: Namespace) -> Namespace:
    """
    Returns a copy of the args namespace but with unpickable objects as strings.
    note: implementation not tested against deep copying.
    ref:
        - https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
    """
    pickable_args = argparse.Namespace()
    # - go through fields in args, if they are not pickable make it a string else leave as is
    # The vars() function returns the __dict__ attribute of the given object.
    for field in vars(args):
        field_val: Any = getattr(args, field)
        if not dill.pickles(field_val):
            field_val: str = str(field_val)
        setattr(pickable_args, field, field_val)
    return pickable_args
Context: I think I do it mostly to remove the annoying tensorboard object I carry around (but I don't think I will need the .tb field anymore thanks to wandb/weights and biases). Not that this matters a lot but context is always nice.
Related:
What does it mean for an object to be picklable (or pickle-able)?
Python - How can I make this un-pickleable object pickleable?
Edit:
Since I decided to move away from dill - since sometimes it cannot recover classes/objects (probably because it cannot save their code or something) - I decided to only use pickle (which seems to be the recommended way to be done in PyTorch).
So what is the official (perhaps optimized) way to check for pickables without dill or with the official pickle?
Is this the best:
def is_picklable(obj):
    try:
        pickle.dumps(obj)
    except pickle.PicklingError:
        return False
    return True
thus current soln:
def make_args_pickable(args: Namespace) -> Namespace:
    """
    Returns a copy of the args namespace but with unpickable objects as strings.
    note: implementation not tested against deep copying.
    ref:
        - https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
    """
    pickable_args = argparse.Namespace()
    # - go through fields in args, if they are not pickable make it a string else leave as is
    # The vars() function returns the __dict__ attribute of the given object.
    for field in vars(args):
        field_val: Any = getattr(args, field)
        # - if current field value is not pickable, make it pickable by casting to string
        if not dill.pickles(field_val):
            field_val: str = str(field_val)
        elif not is_picklable(field_val):
            field_val: str = str(field_val)
        # - after this line the invariant is that it should be pickable, so set it in the new args obj
        setattr(pickable_args, field, field_val)
    return pickable_args
def make_opts_pickable(opts):
    """ Makes a namespace pickable """
    return make_args_pickable(opts)
def is_picklable(obj: Any) -> bool:
    """
    Checks if something is pickable.
    Ref:
        - https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
    """
    import pickle
    try:
        pickle.dumps(obj)
    except pickle.PicklingError:
        return False
    return True
Note: one of the reasons I want something "official"/tested is that PyCharm halts on the try/except (see: How to stop PyCharm's break/stop/halt feature on handled exceptions (i.e. only break on python unhandled exceptions)?), which is not what I want. I want it to halt only on unhandled exceptions.
What is the proper way to make an object with unpickable fields pickable?
I believe the answer to this belongs in the question you linked -- Python - How can I make this un-pickleable object pickleable?. I've added a new answer to that question explaining how you can make an unpicklable object picklable the proper way, without using __reduce__.
So what is the official (perhaps optimized) way to check for pickables without dill or with the official pickle?
Objects that are picklable are defined in the docs as follows:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose dict or the result of calling getstate() is picklable (see section Pickling Class Instances for details).
The tricky parts are (1) knowing how functions/classes are defined (you can probably use the inspect module for that) and (2) recursing through objects, checking against the rules above.
There are a lot of caveats to this, such as the pickle protocol versions, whether the object is an extension type (defined in a C extension like numpy, for example) or an instance of a 'user-defined' class. Usage of __slots__ can also impact whether an object is picklable or not (since __slots__ means there's no __dict__), but can be pickled with __getstate__. Some objects may also be registered with a custom function for pickling. So, you'd need to know if that has happened as well.
Technically, you can implement a function to check for all of this in Python, but it will be quite slow by comparison. The easiest (and probably most performant, as pickle is implemented in C) way to do this is to simply attempt to pickle the object you want to check.
I tested this with PyCharm pickling all kinds of things... it doesn't halt with this method. The key is that you must anticipate pretty much any kind of exception (see footnote 3 in the docs). The warnings are optional, they're mostly explanatory for the context of this question.
import pickle
import warnings
from typing import Any
def is_picklable(obj: Any) -> bool:
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, pickle.PickleError, AttributeError, ImportError):
        # https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled
        return False
    except RecursionError:
        warnings.warn(
            f"Could not determine if object of type {type(obj)!r} is picklable "
            "due to a RecursionError that was suppressed. "
            "Setting a higher recursion limit MAY allow this object to be pickled"
        )
        return False
    except Exception as e:
        # https://docs.python.org/3/library/pickle.html#id9
        warnings.warn(
            f"An error occurred while attempting to pickle "
            f"object of type {type(obj)!r}. Assuming it's unpicklable. The exception was {e}"
        )
        return False
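To see why catching only pickle.PicklingError is not enough, here is a small sketch that reports which exception class pickle.dumps actually raises. The helper name `pickling_error_type` is made up for this illustration; on CPython 3 a lock fails with a plain TypeError rather than a PicklingError.

```python
import pickle
import threading


def pickling_error_type(obj):
    """Return the exception class raised by pickle.dumps(obj), or None on success."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return type(exc)
    return None


# A lock is rejected with TypeError, which `except pickle.PicklingError` would miss.
print(pickling_error_type(threading.Lock()))
# Plain containers of plain values pickle without error.
print(pickling_error_type([1, 2, 3]))
```

Running this against the object types you care about is a cheap way to decide which exception classes your own is_picklable needs to handle.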
Using the example from my other answer I linked above, you could make your object picklable by implementing __getstate__ and __setstate__ (or subclassing and adding them, or making a wrapper class) adapting your make_args_pickable...
class Unpicklable:
    """
    A simple marker class so we can distinguish when a deserialized object
    is a string because it was originally unpicklable
    (and not simply a string to begin with)
    """
    def __init__(self, obj_str: str):
        self.obj_str = obj_str
    def __str__(self):
        return self.obj_str
    def __repr__(self):
        return f'Unpicklable(obj_str={self.obj_str!r})'
class PicklableNamespace(Namespace):
    def __getstate__(self):
        """For serialization"""
        # always make a copy so you don't accidentally modify state
        state = self.__dict__.copy()
        # Any unpicklables will be converted to a ``Unpicklable`` object
        # with its str format stored in the object
        for key, val in state.items():
            if not is_picklable(val):
                state[key] = Unpicklable(str(val))
        return state
    def __setstate__(self, state):
        self.__dict__.update(state) # or leave unimplemented
In action, I'll pickle a namespace whose attributes contain a file handle (normally not picklable) and then load the pickle data.
# Normally file handles are not picklable
p = PicklableNamespace(f=open('test.txt'))
data = pickle.dumps(p)
del p
loaded_p = pickle.loads(data)
# PicklableNamespace(f=Unpicklable(obj_str="<_io.TextIOWrapper name='test.txt' mode='r' encoding='cp1252'>"))
Yes, a try/except is the best way to go about this.
Per the docs, pickle is capable of recursively pickling objects, that is to say, if you have a list of objects that are pickleable, it will pickle all objects inside of that list if you attempt to pickle that list. This means that you cannot feasibly test to see if an object is pickleable without pickling it. Because of that, your structure of:
def is_picklable(obj):
    try:
        pickle.dumps(obj)
    except pickle.PicklingError:
        return False
    return True
is the simplest and easiest way to go about checking this. If you are not working with recursive structures and/or you can safely assume that all recursive structures will only contain pickleable objects, you could check the type() value of the object against the list of pickleable objects:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose dict or the result of calling getstate() is picklable (see section Pickling Class Instances for details).
This is likely faster than using a try:... except:... like you showed in your question.
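As a rough sketch of that type-based shortcut, the check below covers only the non-container entries of the list. The names `_SCALAR_TYPES` and `is_scalar_picklable` are invented for this example, and the exact-type comparison is a deliberate assumption: subclasses may customize pickling, so they are not treated as known-safe.

```python
# Exact types from the documented list that never contain other objects.
_SCALAR_TYPES = frozenset(
    {type(None), bool, int, float, complex, str, bytes, bytearray}
)


def is_scalar_picklable(obj):
    """Return True only for objects whose exact type is a known picklable scalar.

    A False result means "unknown", not "unpicklable"; fall back to
    attempting pickle.dumps for anything this check does not recognize.
    """
    return type(obj) in _SCALAR_TYPES


print(is_scalar_picklable(3.1))
print(is_scalar_picklable([1, 2, 3]))
```

Containers, functions, classes, and class instances still need the try/except route, since their picklability depends on what they contain or where they are defined.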
To me no matter the error I want my function to tell me it's not pickable. So it seems to work if I do this:
def is_picklable(obj: Any) -> bool:
    """
    Checks if something is pickable.
    Ref:
        - https://stackoverflow.com/questions/70128335/what-is-the-proper-way-to-make-an-object-with-unpickable-fields-pickable
        - pycharm halting all the time issue: https://stackoverflow.com/questions/70761481/how-to-stop-pycharms-break-stop-halt-feature-on-handled-exceptions-i-e-only-b
    """
    import pickle
    try:
        pickle.dumps(obj)
    except:
        return False
    return True
plus as an added bonus it doesn't freak pycharm out see How to stop PyCharm's break/stop/halt feature on handled exceptions (i.e. only break on python unhandled exceptions)? for details.

Python weakref.proxy object considered iterable?

I came across the following (using Python 3.8.3):
from collections.abc import Iterable
from weakref import proxy
class Dummy:
    pass
isinstance(d := Dummy, Iterable)
# False (as expected)
isinstance(p := proxy(d), Iterable)
# True (why?!)
for _ in p:
    # raises TypeError
    pass
How could the proxy object pass the iterable-test?
It provides __iter__ in case the underlying type provides it. __iter__ must be implemented on the type to work, not the instance, so it can't conditionally define __iter__ without fragmenting into many different classes, not just a single proxy class.
Unfortunately, the Iterable test just checks if the class defines __iter__, not whether it works, and proxy doesn't know if it really works without calling the wrapped class's __iter__. If you want to check for iterability, just iterate it, and catch the TypeError if it fails. If you can't iterate it immediately, while it's subject to time-of-check/time-of-use race conditions, you could write your own simple tester that covers whether it can actually be iterated:
def is_iterable(maybeiter):
    try:
        iter(maybeiter)
    except TypeError:
        return False
    else:
        return True
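A short sketch of how that tester behaves on a proxy, assuming the proxied object is an instance of a class without `__iter__`. The variable names are illustrative only, and a strong reference to the instance is kept so the proxy stays valid.

```python
from collections.abc import Iterable
from weakref import proxy


class Dummy:
    pass


def is_iterable(maybeiter):
    try:
        iter(maybeiter)
    except TypeError:
        return False
    else:
        return True


d = Dummy()   # strong reference; the proxy dies if this is collected
p = proxy(d)

# The ABC check looks only at whether the class defines __iter__.
print(isinstance(p, Iterable))
# Actually calling iter() reveals that the wrapped object cannot be iterated.
print(is_iterable(p))
print(is_iterable([1, 2, 3]))
```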

Why doesn't python Queue have falsey behavior

In python, I have gotten quite used to container objects having truthy behavior when they are populated, and falsey behavior when they are not:
# list
a = []
not a
True
a.append(1)
not a
False
# deque
from collections import deque
d = deque()
not d
True
d.append(1)
not d
False
# and so on
However, queue.Queue does not have this behavior. To me, this seems odd and a contradiction against almost any other container data type that I can think of. Furthermore, the method empty on queue seem to go against coding conventions that avoid race conditions on any other object (checking if a file exists, checking if a list is empty, etc). For example, we would generally say the following is bad practice:
_queue = []
if not len(_queue):
    # do something
And should be replaced with
_queue = []
if not _queue:
    # do something
or to handle an IndexError, which we might still argue would be better with the if not _queue statement:
try:
    x = _queue.pop()
except IndexError as e:
    logger.exception(e)
    # do something else
Yet, Queue requires someone to do one of the following:
_queue = queue.Queue()
if _queue.empty():
    # do something
    # though this smells like a race condition
# or handle an exception
try:
    _queue.get(timeout=5)
except Empty as e:
    # do something else
    # maybe logger.exception(e)
Is there documentation somewhere that might point to why this design choice was made? It seems odd, especially when the source code shows that it was built on top of collections.deque (noted that Queue does not inherit from deque)
According to the definition of the truth value testing procedure, the behavior is expected:
Any object can be tested for truth value, for use in an if or while
condition or as operand of the Boolean operations below.
By default, an object is considered true unless its class defines
either a __bool__() method that returns False or a __len__() method
that returns zero, when called with the object.
As Queue implements neither __bool__() nor __len__(), its truth value is always True. As to why Queue does not implement __len__(), a clue can be found in the docstring of the qsize function:
'''Return the approximate size of the queue (not reliable!).'''
The same can be said of the __bool__() function.
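A minimal sketch of what this means in practice: an empty Queue is still truthy, so the non-blocking get plus exception handling is the reliable check. The helper name `get_or_default` is hypothetical and exists only for this illustration.

```python
import queue


def get_or_default(q, default=None):
    """Take one item without blocking, or return `default` if none is available."""
    try:
        return q.get_nowait()
    except queue.Empty:
        return default


q = queue.Queue()

# No __bool__ or __len__ is defined, so even an empty queue is truthy.
empty_is_truthy = bool(q)
print(empty_is_truthy)

# The outcome of the get itself is the only trustworthy view of the state.
print(get_or_default(q, default="nothing queued"))
q.put(1)
print(get_or_default(q))
```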
I'm going to leave the accepted answer as is, but as far as I can tell, the reason is that if _queue: # do something would be a race condition, since Queue is designed to be passed between threads and therefore possesses dubious state as far as tasks go.
From the source:
class Queue:
    ~snip~
    def qsize(self):
        '''Return the approximate size of the queue (not reliable!).'''
        with self.mutex:
            return self._qsize()
    def empty(self):
        '''Return True if the queue is empty, False otherwise (not reliable!).
        This method is likely to be removed at some point. Use qsize() == 0
        as a direct substitute, but be aware that either approach risks a race
        condition where a queue can grow before the result of empty() or
        qsize() can be used.
        To create code that needs to wait for all queued tasks to be
        completed, the preferred technique is to use the join() method.
        '''
        with self.mutex:
            return not self._qsize()
    ~snip~
Must have missed this helpful docstring when I was originally looking. The qsize bool is not tied to the state of the queue once it's evaluated. So the user is doing processing against a queue based on an already out-of-date state.
Like checking the existence of a file, it's more pythonic to just handle the exception:
try:
    task = _queue.get(timeout=4)
except Empty as e:
    # do something
since the exception/success against get is the state of the queue.
Likewise, we would not do:
if os.path.exists(file):
    with open(file) as fh:
        # do processing
Instead, we would do:
try:
    with open(file) as fh:
        # do processing
except FileNotFoundError as e:
    # do something else
I suppose the intentional leaving-out of the __bool__ method by the author is to steer the developer away from leaning against such a paradigm, and treating the queue like you would any other object that might be of questionable state.

How to check if a variable is instance of any class [duplicate]

I need to determine if a given Python variable is an instance of native type: str, int, float, bool, list, dict and so on. Is there elegant way to doing it?
Or is this the only way:
if myvar in (str, int, float, bool):
    # do something
This is an old question but it seems none of the answers actually answer the specific question: "(How-to) Determine if Python variable is an instance of a built-in type". Note that it's not "[...] of a specific/given built-in type" but of a.
The proper way to determine if a given object is an instance of a built-in type/class is to check if the type of the object happens to be defined in the module __builtin__.
def is_builtin_class_instance(obj):
    return obj.__class__.__module__ == '__builtin__'
Warning: if obj is a class and not an instance, no matter if that class is built-in or not, True will be returned since a class is also an object, an instance of type (i.e. AnyClass.__class__ is type).
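On Python 3 the module was renamed from `__builtin__` to `builtins`, so a sketch of the same check for Python 3 might look like the following. Using `type(obj)` instead of `obj.__class__` is a small choice made for this example, not part of the original answer.

```python
def is_builtin_class_instance(obj):
    """Return True if the type of `obj` is defined in the `builtins` module."""
    return type(obj).__module__ == "builtins"


class Foo:
    pass


print(is_builtin_class_instance(3))
print(is_builtin_class_instance(Foo()))
# As the warning above says, a class is itself an instance of `type`,
# which is a built-in, so any class object passes this check.
print(is_builtin_class_instance(Foo))
```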
The best way to achieve this is to collect the types in a list or tuple called primitiveTypes and:
if isinstance(myvar, primitiveTypes): ...
The types module contains collections of all important types which can help to build the list/tuple.
Works since Python 2.2
Not that I know why you would want to do it, as there isn't any "simple" types in Python, it's all objects. But this works:
type(theobject).__name__ in dir(__builtins__)
But explicitly listing the types is probably better as it's clearer. Or even better: Changing the application so you don't need to know the difference.
Update: The problem that needs solving is how to make a serializer for objects, even those built-in. The best way to do this is not to make a big phat serializer that treats builtins differently, but to look up serializers based on type.
Something like this:
def IntSerializer(theint):
    return str(theint)
def StringSerializer(thestring):
    return repr(thestring)
def MyOwnSerializer(value):
    return "whatever"
serializers = {
    int: IntSerializer,
    str: StringSerializer,
    mymodel.myclass: MyOwnSerializer,
}
def serialize(ob):
    try:
        return ob.serialize() #For objects that know they need to be serialized
    except AttributeError:
        # Look up the serializer amongst the serializer based on type.
        # Default to using "repr" (works for most builtins).
        return serializers.get(type(ob), repr)(ob)
This way you can easily add new serializers, and the code is easy to maintain and clear, as each type has its own serializer. Notice how the fact that some types are builtin became completely irrelevant. :)
You appear to be interested in assuring the simplejson will handle your types. This is done trivially by
try:
    json.dumps( object )
except TypeError:
    print "Can't convert", object
Which is more reliable than trying to guess which types your JSON implementation handles.
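The same try-it-and-see approach, sketched for Python 3 with the standard library json module. The function name `is_json_serializable` is made up for this example; ValueError is caught as well because json.dumps raises it for circular references.

```python
import json


def is_json_serializable(obj):
    """Return True if json.dumps accepts `obj` with the default encoder."""
    try:
        json.dumps(obj)
    except (TypeError, ValueError):
        # TypeError: unsupported type; ValueError: circular reference.
        return False
    return True


print(is_json_serializable({"a": [1, 2, 3]}))
print(is_json_serializable({1, 2, 3}))
```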
What is a "native type" in Python? Please don't base your code on types, use Duck Typing.
you can access all these types by types module:
`builtin_types = [ i for i in types.__dict__.values() if isinstance(i, type)]`
as a reminder, import module types first
def isBuiltinTypes(var):
    return type(var) in types.__dict__.values() and not isinstance(var, types.InstanceType)
It's 2020, I'm on python 3.7, and none of the existing answers worked for me. What worked instead is the builtins module. Here's how:
import builtins
type(your_object).__name__ in dir(builtins)
Built in type function may be helpful:
>>> a = 5
>>> type(a)
<type 'int'>
building off of S.Lott's answer you should have something like this:
from simplejson import JSONEncoder
class JSONEncodeAll(JSONEncoder):
    def default(self, obj):
        try:
            return JSONEncoder.default(self, obj)
        except TypeError:
            ## optionally
            # try:
            #     # you'd have to add this per object, but if an object wants to do something
            #     # special then it can do whatever it wants
            #     return obj.__json__()
            # except AttributeError:
            ##
            # ...do whatever you are doing now...
            # (which should be creating an object simplejson understands)
to use:
>>> json = JSONEncodeAll()
>>> json.encode(myObject)
# whatever myObject looks like when it passes through your serialization code
These calls will use your special class, and if simplejson can take care of the object it will. Otherwise your catch-all functionality will be triggered, and possibly (depending on whether you use the optional part) an object can define its own serialization.
For me the best option is:
allowed_modules = set(['numpy'])
def isprimitive(value):
    return not hasattr(value, '__dict__') or \
        value.__class__.__module__ in allowed_modules
This fixes the case where value is a module, for which the value.__class__.__module__ == '__builtin__' check fails.
The question asks to check for non-class types. These types don't have a __dict__ member (You could also test for __repr__ member, instead of checking for __dict__) Other answers mention to check for membership in types.__dict__.values(), but some of the types in this list are classes.
def isnonclasstype(val):
    return getattr(val, "__dict__", None) is not None
a = 2
print( isnonclasstype(a) )
a = "aaa"
print( isnonclasstype(a) )
a = [1, 2, 3]
print( isnonclasstype(a) )
a = { "1": 1, "2" : 2 }
print( isnonclasstype(a) )
class Foo:
    def __init__(self):
        pass
a = Foo()
print( isnonclasstype(a) )
gives me:
> python3 t.py
False
False
False
False
True
> python t.py
False
False
False
False
True

