Detecting if an iterator will be consumed - python

Is there a uniform way of knowing whether an iterable object will be consumed by iterating over it?
Suppose you have a function crunch which takes an iterable object as a parameter and iterates over it more than once. Something like:
def crunch(vals):
    for v in vals:
        chomp(v)
    for v in vals:
        yum(v)
(note: merging together the two for loops is not an option).
An issue arises if the function is called with an iterable which is not a list. In the following call the yum function is never executed:
crunch(iter(range(4)))
We could in principle fix this by redefining the crunch function as follows:
def crunch(vals):
    vals = list(vals)
    for v in vals:
        chomp(v)
    for v in vals:
        yum(v)
But this would result in using twice the memory if the call to crunch is:
hugeList = list(longDataStream)
crunch(hugeList)
We could fix this by defining crunch like this:
def crunch(vals):
    if type(vals) is not list:
        vals = list(vals)
    for v in vals:
        chomp(v)
    for v in vals:
        yum(v)
But there could still be a case in which the calling code stores the data in something which cannot be consumed but is not a list. For instance:
from collections import deque
hugeDeque = deque(longDataStream)
crunch(hugeDeque)
It would be nice to have an isconsumable predicate, so that we can define crunch like this:
def crunch(vals):
    if isconsumable(vals):
        vals = list(vals)
    for v in vals:
        chomp(v)
    for v in vals:
        yum(v)
Is there a solution for this problem?

One possibility is to test whether the item is a Sequence, using isinstance(val, collections.Sequence) (collections.abc.Sequence in Python 3). Non-consumability still isn't totally guaranteed, but I think it's about the best you can get. A Python sequence has to have a length, which means that at least it can't be an open-ended iterator, and in general implies that the elements have to be known ahead of time, which in turn implies that they can be iterated over without consuming them. It's still possible to write pathological classes that fit the sequence protocol but aren't re-iterable, but you'll never be able to handle those.
Note that neither Iterable nor Iterator is the appropriate choice, because these types don't guarantee a length, and hence can't guarantee that the iteration will even be finite, let alone repeatable. You could, however, check for both Sized and Iterable.
The important thing is to document that your function will iterate over its argument twice, thus warning users that they must pass in an object that supports this.
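As a rough illustration of the Sequence check described above, here is a minimal sketch for Python 3, where the ABCs live in collections.abc. Passing chomp and yum in as parameters is an assumption made only to keep the example self-contained; in the question they are ordinary functions.

```python
from collections.abc import Sequence


def crunch(vals, chomp, yum):
    """Iterate over vals twice.

    Anything that is not a Sequence (iterators, generators, sets, ...)
    is copied into a list first, so the second pass still sees the data.
    """
    if not isinstance(vals, Sequence):
        vals = list(vals)
    for v in vals:
        chomp(v)
    for v in vals:
        yum(v)


# Usage with a one-shot iterator: both passes see every value.
seen_chomp, seen_yum = [], []
crunch(iter(range(4)), seen_chomp.append, seen_yum.append)
```

The trade-off is that re-iterable non-sequences such as sets are copied even though they would not strictly need it.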

Another, additional option is to check whether the iterable is its own iterator:
if iter(vals) is vals:
    vals = list(vals)
because in that case it is just an iterator.
This works with generators, iterators, files and many other objects which are designed for "one run"; in other words, all iterables which are themselves iterators, because an iterator returns self from its __iter__().
But this might not be enough, because there are objects which empty themselves on iteration without being their own iterator.
Normally, a self-consuming object will be its own iterator, but there are cases where this might not be allowed.
Imagine a class which wraps a list and empties this list on iteration, such as
class ListPart(object):
    """Take a list apart piece by piece."""
    def __init__(self, data=None):
        if data is None: data = []
        self.data = data
    def next(self):
        try:
            return self.data.pop(0)
        except IndexError:
            raise StopIteration
    __next__ = next  # Python 3 spelling of the same method
    def __iter__(self):
        return self
    def __len__(self): # doesn't work with __getattr__...
        return len(self.data)
which you call like
l = [1, 2, 3, 4]
lp = ListPart(l)
for i in lp: process(i)
# now l is empty.
If I now add more data to that list and iterate over the same object again, I'll get the new data, which is a breach of the protocol:
The intention of the protocol is that once an iterator’s next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken. (This constraint was added in Python 2.3; in Python 2.2, various iterators are broken according to this rule.)
So in this case, the object would have to return an iterator distinct from itself despite being self-consuming. Here, this could be done with
def __iter__(self):
    while True:
        try:
            yield self.data.pop(0)
        except IndexError:  # pop from empty list
            return
which returns a new generator on each iteration, something which would slip through the net in the case we are discussing.
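To make that blind spot concrete, here is a small sketch (Python 3). The class name Draining is hypothetical; it stands in for any object whose __iter__ is a generator method that empties the underlying data.

```python
class Draining:
    """Hypothetical class: looks re-iterable, but each pass empties it."""

    def __init__(self, data):
        self.data = list(data)

    def __iter__(self):
        # A generator method: every call returns a fresh generator object,
        # so the object is never its own iterator.
        while self.data:
            yield self.data.pop(0)


d = Draining([1, 2, 3])
is_own_iterator = iter(d) is d   # the "iter(vals) is vals" test
first_pass = list(d)
second_pass = list(d)            # nothing left after the first pass
```

Here the identity test reports that the object is not its own iterator, yet the second pass comes back empty.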

import itertools
def crunch(vals):
    vals1, vals2 = itertools.tee(vals, 2)
    for v in vals1:
        chomp(v)
    for v in vals2:
        yum(v)
In this case tee will end up storing the entirety of vals internally, since one iterator is exhausted before the other one is started.

Many answers come close to the point but miss it.
An Iterator is an object that is consumed by iterating over it. There is no way around it. Examples of iterator objects are those returned by calls to iter(), or those returned by the functions in the itertools module.
The proper way to check whether an object is an iterator is to call isinstance(obj, Iterator). This basically checks whether the object implements the next() method (__next__() in Python 3) but you don't need to care about this.
So, remember, an iterator is always consumed. For example:
# suppose you have a list
my_list = [10, 20, 30]
# and build an iterator on the list
my_iterator = iter(my_list)
# iterate the first time over the object
for x in my_iterator:
    print(x)
# then again
for x in my_iterator:
    print(x)
This will print the content of the list just once.
Then there are Iterable objects. When you call iter() on an iterable it returns an iterator. I made a mistake about this when commenting on this page, so I will clarify here. Iterable objects are not required to return a new iterator on every call. Many iterators are themselves iterables (i.e. you can call iter() on them), and they return the object itself.
A simple example of this is list iterators. If it = iter(my_list), then iter(it) is it, and this is basically what @glglgl's answer is checking for.
The iterator protocol requires iterator objects to return themselves as their own iterator (and thus be iterable). This is not required for the iteration mechanics to work, but you wouldn't be able to loop over the iterator object.
All of this said, what you should do is check whether you're given an Iterator, and if that's the case, make a copy of the result of the iteration (with list()). Your isconsumable(obj) is (as someone already said) isinstance(obj, Iterator).
Note that this also works for xrange(). xrange(10) returns an xrange object. Every time you iter over the xrange objects it returns a new iterator starting from the start, so you're fine and don't need to make a copy.
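A minimal sketch of that predicate for Python 3 follows (collections.abc instead of collections, range instead of xrange). The name isconsumable comes from the question; the results dictionary is only there for illustration.

```python
from collections.abc import Iterator


def isconsumable(obj):
    """Return True if iterating over obj uses it up, i.e. obj is an iterator."""
    return isinstance(obj, Iterator)


results = {
    "list": isconsumable([1, 2, 3]),
    "list iterator": isconsumable(iter([1, 2, 3])),
    "generator": isconsumable(x for x in range(3)),
    "range": isconsumable(range(3)),
}
```

Only the list iterator and the generator are reported as consumable, so only those would need to be copied with list().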

Here is a summary of definitions.
container
    An object with a __contains__ method.
generator
    A function which returns an iterator.
iterable
    An object with an __iter__() or __getitem__() method. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict and file. When an iterable object is passed as an argument to the builtin function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values.
iterator
    An iterable which has a next() method. Iterators are required to have an __iter__() method that returns the iterator object itself. An iterator is good for one pass over the set of values.
sequence
    An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and unicode. Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary immutable keys rather than integers.
Now there is a multitude of ways of testing if an object is an iterable, or iterator, or sequence of some sort. Here is a summary of these ways, and how they classify various kinds of objects:
object          Iterable  Iterator  iter_is_self  Sequence  MutableSeq
[]              True      False     False         True      True
()              True      False     False         True      False
set([])         True      False     False         False     False
{}              True      False     False         False     False
deque([])       True      False     False         False     False
<listiterator>  True      True      True          False     False
<generator>     True      True      True          False     False
string          True      False     False         True      False
unicode         True      False     False         True      False
<open>          True      True      True          False     False
xrange(1)       True      False     False         True      False
Foo.__iter__    True      False     False         False     False

object          Sized  has_len  has_iter  has_contains
[]              True   True     True      True
()              True   True     True      True
set([])         True   True     True      True
{}              True   True     True      True
deque([])       True   True     True      False
<listiterator>  False  False    True      False
<generator>     False  False    True      False
string          True   True     False     True
unicode         True   True     False     True
<open>          False  False    True      False
xrange(1)       True   True     True      False
Foo.__iter__    False  False    True      False
Each column refers to a different way of classifying iterables; each row refers to a different kind of object. The tables were produced by the following script:
# Python 2 code (xrange, types.ListType, bare print); df.ix is the old pandas indexer
import pandas as pd
import collections
import os
import types
def col_iterable(obj):
    return isinstance(obj, collections.Iterable)
def col_iterator(obj):
    return isinstance(obj, collections.Iterator)
def col_sequence(obj):
    return isinstance(obj, collections.Sequence)
def col_mutable_sequence(obj):
    return isinstance(obj, collections.MutableSequence)
def col_sized(obj):
    return isinstance(obj, collections.Sized)
def has_len(obj):
    return hasattr(obj, '__len__')
def listtype(obj):
    return isinstance(obj, types.ListType)
def tupletype(obj):
    return isinstance(obj, types.TupleType)
def has_iter(obj):
    "Could this be a way to distinguish basestrings from other iterables?"
    return hasattr(obj, '__iter__')
def has_contains(obj):
    return hasattr(obj, '__contains__')
def iter_is_self(obj):
    "Seems identical to col_iterator"
    return iter(obj) is obj
def gen():
    yield
def short_str(obj):
    text = str(obj)
    if text.startswith('<'):
        text = text.split()[0] + '>'
    return text
def isiterable():
    class Foo(object):
        def __init__(self):
            self.data = [1, 2, 3]
        def __iter__(self):
            while True:
                try:
                    yield self.data.pop(0)
                except IndexError:  # pop from empty list
                    return
        def __repr__(self):
            return "Foo.__iter__"
    filename = 'mytestfile'
    f = open(filename, 'w')
    objs = [list(), tuple(), set(), dict(),
            collections.deque(), iter([]), gen(), 'string', u'unicode',
            f, xrange(1), Foo()]
    tests = [
        (short_str, 'object'),
        (col_iterable, 'Iterable'),
        (col_iterator, 'Iterator'),
        (iter_is_self, 'iter_is_self'),
        (col_sequence, 'Sequence'),
        (col_mutable_sequence, 'MutableSeq'),
        (col_sized, 'Sized'),
        (has_len, 'has_len'),
        (has_iter, 'has_iter'),
        (has_contains, 'has_contains'),
    ]
    funcs, labels = zip(*tests)
    data = [[test(obj) for test in funcs] for obj in objs]
    f.close()
    os.unlink(filename)
    df = pd.DataFrame(data, columns=labels)
    df = df.set_index('object')
    print(df.ix[:, 'Iterable':'MutableSeq'])
    print
    print(df.ix[:, 'Sized':])
isiterable()

Related

How does a for loop evaluate its argument

My question is a very simple one.
Does a for loop evaluate its argument every time?
Such as:
for i in range(300):
Does Python create a list of 300 items for every iteration of this loop?
If it does, is this a way to avoid it?
lst = range(300)
for i in lst:
    # loop body
The same goes for code like this:
for i in reversed(lst):
for k in range(len(lst)):
Is the reverse process applied every single time, or the length calculated at every iteration? (I am asking about both Python 2 and Python 3.)
If not, how does Python handle changes to the iterable while iterating over it?
No fear, the expression after in is only evaluated once. It ends up being roughly equivalent to code like this:
it = iter(range(300))
while True:
    try:
        i = next(it)
    except StopIteration:
        break
    ... body of loop ...
Note that it's not quite equivalent, because break will work differently. Remember that you can add an else to a for loop, but that won't work in the above code.
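To show where the else clause of a for loop would have to go in such a hand-written expansion, here is a sketch. The function find_first_even is a made-up example, not something from the question; the point is only that the else body ends up in the StopIteration handler, while a real break skips it.

```python
def find_first_even(iterable):
    """Hand-expanded version of:

        for i in iterable:
            if i % 2 == 0:
                result = i
                break
        else:
            result = None
    """
    it = iter(iterable)          # the iterable expression is evaluated once
    while True:
        try:
            i = next(it)
        except StopIteration:
            result = None        # plays the role of the for loop's else
            break
        if i % 2 == 0:
            result = i
            break                # a real break bypasses the else part
    return result
```

For example, find_first_even([1, 3, 4, 5]) returns 4, and an input without even numbers falls through to the else part and returns None.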
What objects are created depends on what the __iter__ method of the Iterable you are looping over returns.
Usually Python creates one Iterator when iterating over an Iterable which itself is not an Iterator. In Python2, range returns a list, which is an Iterable and has an __iter__ method which returns an Iterator.
>>> from collections import Iterable, Iterator
>>> isinstance(range(300), Iterable)
True
>>> isinstance(range(300), Iterator)
False
>>> isinstance(iter(range(300)), Iterator)
True
The for x in sequence: do something syntax is basically a shorthand for doing this:
it = iter(some_iterable)  # get Iterator from Iterable; if some_iterable is already an Iterator, __iter__ returns self by convention
while True:
    try:
        next_item = next(it)
        # do something with the item
    except StopIteration:
        break
Here is a demo with some print statements to clarify what's happening when using a for loop:
class CapitalIterable(object):
    'when iterated over, yields capitalized words of string initialized with'
    def __init__(self, stri):
        self.stri = stri
    def __iter__(self):
        print('__iter__ has been called')
        return CapitalIterator(self.stri)
        # instead of returning a custom CapitalIterator, we could also
        # return iter(self.stri.title().split())
        # because the built in list has an __iter__ method
class CapitalIterator(object):
    def __init__(self, stri):
        self.items = stri.title().split()
        self.index = 0
    def __next__(self):
        print('__next__ has been called')
        try:
            item = self.items[self.index]
            self.index += 1
            return item
        except IndexError:
            raise StopIteration
    next = __next__  # Python 2 spelling of the same method
    def __iter__(self):
        return self
c = CapitalIterable('The quick brown fox jumps over the lazy dog.')
for x in c:
    print(x)
Output:
__iter__ has been called
__next__ has been called
The
__next__ has been called
Quick
__next__ has been called
Brown
__next__ has been called
Fox
__next__ has been called
Jumps
__next__ has been called
Over
__next__ has been called
The
__next__ has been called
Lazy
__next__ has been called
Dog.
__next__ has been called
As you can see, __iter__ is being called only once, therefore only one Iterator object is created.
In Python 2, range creates a list of 300 ints in that case either way. It does NOT create a list of 300 ints 300 times. It is not very efficient. If you use xrange, it will create an iterable object that won't take up nearly as much memory (in Python 3, range itself behaves this way). https://docs.python.org/2/library/functions.html#xrange
example.py
for i in xrange(300):  # low memory footprint, similar to a normal loop
    print(i)

How do I check if an iterator is actually an iterator container?

I have a dummy example of an iterator container below (the real one reads a file too large to fit in memory):
class DummyIterator:
    def __init__(self, max_value):
        self.max_value = max_value
    def __iter__(self):
        for i in range(self.max_value):
            yield i
def regular_dummy_iterator(max_value):
    for i in range(max_value):
        yield i
This allows me to iterate over the value more than once so that I can implement something like this:
def normalise(data):
    total = sum(i for i in data)
    for val in data:
        yield val / total
# this works when I call next()
normalise(DummyIterator(100))
# this doesn't work when I call next()
normalise(regular_dummy_iterator(100))
How do I check in the normalise function that I am being passed an iterator container rather than a normal generator?
First of all: There is no such thing as an iterator container. You have an iterable.
An iterable produces an iterator. Any iterator is also an iterable, but produces itself as the iterator:
>>> list_iter = iter([])
>>> iter(list_iter) is list_iter
True
You don't have an iterator if the iter(ob) is ob test is false.
You can test whether you have an iterator (which is consumed once next raises the StopIteration exception) vs just an iterable (which can probably be iterated over multiple times) by using the collections.abc module. Here is an example:
from collections.abc import Iterable, Iterator
def my_iterator():
    yield 1
i = my_iterator()
a = []
isinstance(i, Iterator) # True
isinstance(a, Iterator) # False
What makes my_iterator() an Iterator is the presence of both the __next__ and __iter__ magic methods (and by the way, basically what is happening behind the scenes when you call isinstance on a collections.abc abstract base class is a test for the presence of certain magic methods).
Note that an iterator is also an Iterable, as is the empty list (i.e., both have the __iter__ magic method):
isinstance(i, Iterable) # True
isinstance(a, Iterable) # True
Also note, as was pointed out in Martijn Pieters' answer, that when you apply the generic iter() function to both, you get an iterator:
isinstance(iter(my_iterator()), Iterator) # True
isinstance(iter([]), Iterator) # True
The difference here between [] and my_iterator() is that iter(my_iterator()) returns itself as the iterator, whereas iter([]) produces a new iterator every time.
As was already mentioned in MP's same answer, your object above is not an "iterator container." It is an iterable object, i.e., "an iterable". Whether or not it "contains" something isn't really related; the concept of containing is represented by the abstract base class Container. A Container may be iterable, but it doesn't necessarily have to be.
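Putting the answers above together, one possible shape for normalise is sketched below. It is written as a plain function that returns a generator expression, rather than as a generator function, so that the check runs at call time instead of on the first next(). The Numbers class is hypothetical and only stands in for the question's DummyIterator.

```python
def normalise(data):
    """Return a lazy iterator over the values of data divided by their sum.

    data is traversed twice, so a one-shot iterator is rejected up front.
    """
    if iter(data) is data:
        raise TypeError("normalise() needs a re-iterable object, not an iterator")
    total = sum(data)                       # first pass
    return (val / total for val in data)    # second pass, still lazy


class Numbers:
    """Hypothetical re-iterable: each iter() call starts a fresh pass."""

    def __init__(self, max_value):
        self.max_value = max_value

    def __iter__(self):
        return iter(range(self.max_value))


ok = list(normalise(Numbers(5)))
try:
    normalise(iter(range(5)))
    rejected = False
except TypeError:
    rejected = True
```

As discussed in the first thread, this check cannot catch objects that drain themselves without being their own iterator.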

How do I check if my loop never ran at all?

This somehow looks too complicated to me:
x = _empty = object()
for x in data:
    ... # process x
if x is _empty:
    raise ValueError("Empty data iterable: {!r:100}".format(data))
Isn't there an easier solution?
The above solution is from curiousefficiency.org
Update
data can contain None items.
data is an iterator, and I don't want to use it twice.
By "never ran", do you mean that data had no elements?
If so, the simplest solution is to check it before running the loop:
if not data:
    raise Exception('Empty iterable')
for x in data:
    ...
However, as mentioned in the comments below, it will not work with some iterables, like files, generators, etc., so should be applied carefully.
The original code is best.
x = _empty = object()
_empty is called a sentinel value. In Python it's common to create a sentinel with object(), since it makes it obvious that the only purpose of _empty is to be a dummy value. But you could have used any mutable, for instance an empty list [].
Mutable objects are always guaranteed to be unique when you compare them with is, so you can safely use them as sentinel values, unlike immutables such as None or 0.
>>> None is None
True
>>> object() is object()
False
>>> [] is []
False
I propose the following:
loop_has_run = False
for x in data:
    loop_has_run = True
    ... # process x
if not loop_has_run:
    raise ValueError("Empty data iterable: {!r:100}".format(data))
I contend that this is better than the example in the question, because:
The intent is clearer (since the variable name specifies its meaning directly).
No objects are created or destroyed (which can have a negative performance impact).
It doesn't require paying attention to the subtle point that object() always returns a unique value.
Note that the loop_has_run = True assignment should be put at the start of the loop, in case (for example) the loop body contains break.
The following simple solution works with any iterable. It is based on the idea that we can check if there is a (first) element, and then keep iterating if there was one. The result is much clearer:
import itertools
try:
    first_elmt = next(data)
except StopIteration:
    raise ValueError("Empty data iterator: {!r:100}".format(data))
for x in itertools.chain([first_elmt], data):
    …
PS: Note that it assumes that data is an iterator (as in the question). If it is merely an iterable, the code should be run on data_iter = iter(data) instead of on data (otherwise, say if data is a list, the loop would duplicate the first element).
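The same idea can be wrapped in a small helper that calls iter() itself, so that it behaves the same for iterators and for plain iterables. This is only a sketch for Python 3; the name require_nonempty is made up for the example.

```python
import itertools


def require_nonempty(data):
    """Return an iterator over data, raising ValueError if data is empty."""
    data_iter = iter(data)   # harmless if data is already an iterator
    try:
        first = next(data_iter)
    except StopIteration:
        raise ValueError("Empty data iterable: {!r}".format(data)) from None
    # Put the element we peeked at back in front of the remaining ones.
    return itertools.chain([first], data_iter)
```

It would be used as for x in require_nonempty(data): ..., and it copes with None items because it never compares values against a sentinel.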
The intent of that code isn't immediately obvious. Sure people would understand it after a while, but the code could be made clearer.
The solution I offer requires more lines of code, but that code is in a class that can be stored elsewhere. In addition this solution will work for iterables and iterators as well as sized containers.
Your code would be changed to:
it = HadItemsIterable(data)
for x in it:
    ...
if it.had_items:
    ...
The code for the class is as follows:
from collections.abc import Iterable
class HadItemsIterable(Iterable):
    def __init__(self, iterable):
        self._iterator = iter(iterable)
    @property
    def had_items(self):
        try:
            return self._had_items
        except AttributeError:
            raise ValueError("Not iterated over items yet")
    def __iter__(self):
        try:
            first = next(self._iterator)
        except StopIteration:
            # Inside a generator, end with return: re-raising StopIteration
            # becomes a RuntimeError since Python 3.7 (PEP 479).
            if not hasattr(self, "_had_items"):
                self._had_items = False
            return
        self._had_items = True
        yield first
        yield from self._iterator
You can add a loop_flag that defaults to False and set it to True when the loop executes:
loop_flag = False
x = _empty = object()
for x in data:
    loop_flag = True
    ... # process x
if loop_flag:
    print("loop executed...")
What about this solution?
data = []
count = None
for count, item in enumerate(data):
    print(item)
if count is None:
    raise ValueError('data is empty')

Get one or None from collection

It is quite common to fall into the issue where you have a N sized collection but want to work with a singular item (conceptually a 0 or 1 sized collection).
I could write the traditional if:
def singular_item(collection):
    if collection:
        return collection[0]
    else:
        return None
and simplify to:
def singular_item(collection):
    return collection[0] if collection else None
But it would not work with iterables, only collections with a defined size. Passing a generator for instance would fail:
singular_item((_ for _ in range(10)))
=> TypeError: 'generator' object has no attribute '__getitem__'
So what I normally do is this:
def singular_item(collection):
    return next((_ for _ in collection), None)
singular_item([1]) -> 1
singular_item([1,2,3]) -> 1
singular_item([]) -> None
This works well for any collection (or iterable), but it feels somewhat clumsy creating a generator for getting just one item. Also the readability is somewhat killed in it: the two other examples are much more explicit about what the code is trying to do.
So my questions are:
Is there a better way to do this, maybe by using a builtin function?
Do you waste resources when creating a generator for getting just one item?
Use the iter() function to create an iterator instead:
def singular_item(collection):
    return next(iter(collection), None)
iter() calls collection.__iter__() to obtain an iterator for next() to advance, which could be the collection object itself.
Iterators are very efficient; this approach is just the right way to handle any iterable or sequence.
For the zero or one case, I'd go for (based on the (conceptually a 0 or 1 sized collection)):
def one(iterable, default=None):
    i = iter(iterable)
    fst = next(i, default)
    try:
        next(i)
        raise ValueError('Must be only 0 or 1 values')
    except StopIteration:
        return fst

how to tell a variable is iterable but not a string

I have a function that takes an argument which can be either a single item or a double item:
def iterable(arg):
    if ...:  # arg is an iterable
        print("yes")
    else:
        print("no")
so that:
>>> iterable( ("f","f") )
yes
>>> iterable( ["f","f"] )
yes
>>> iterable("ff")
no
The problem is that string is technically iterable, so I can't just catch the ValueError when trying arg[1]. I don't want to use isinstance(), because that's not good practice (or so I'm told).
Use isinstance (I don't see why it's bad practice)
import types
if not isinstance(arg, types.StringTypes):
Note the use of StringTypes. It ensures that we don't forget about some obscure type of string.
On the upside, this also works for derived string classes.
class MyString(str):
    pass
isinstance(MyString(" "), types.StringTypes) # true
Also, you might want to have a look at this previous question.
Cheers.
NB: behavior changed in Python 3 as StringTypes and basestring are no longer defined. Depending on your needs, you can replace them in isinstance by str, or a subset tuple of (str, bytes, unicode), e.g. for Cython users.
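For Python 3 only, the replacement described in the note above might look like the following sketch. The function name is_nonstring_iterable is made up here, and treating bytes as string-like is a choice you may or may not want.

```python
from collections.abc import Iterable


def is_nonstring_iterable(arg):
    """True for iterables other than str and bytes (Python 3)."""
    return isinstance(arg, Iterable) and not isinstance(arg, (str, bytes))


checks = [
    is_nonstring_iterable(("f", "f")),   # tuple
    is_nonstring_iterable(["f", "f"]),   # list
    is_nonstring_iterable("ff"),         # str
    is_nonstring_iterable(b"ff"),        # bytes
    is_nonstring_iterable(44),           # int
]
```

Subclasses of str are excluded as well, because isinstance follows inheritance.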
As @Theron Luhn mentioned, you can also use six.
As of 2017, here is a portable solution that works with both Python 2 and Python 3 (on Python 3.10 and later, use collections.abc.Iterable, since the collections.Iterable alias was removed):
#!/usr/bin/env python
import collections
import six
def iterable(arg):
    return (
        isinstance(arg, collections.Iterable)
        and not isinstance(arg, six.string_types)
    )
# non-string iterables
assert iterable(("f", "f")) # tuple
assert iterable(["f", "f"]) # list
assert iterable(iter("ff")) # iterator
assert iterable(range(44)) # range object (a list in Python 2)
assert iterable(b"ff") # bytes (holds on Python 3 only; Python 2 calls this a string)
# strings or non-iterables
assert not iterable(u"ff") # string
assert not iterable(44) # integer
assert not iterable(iterable) # function
Since Python 2.6, with the introduction of abstract base classes, isinstance (used on ABCs, not concrete classes) is now considered perfectly acceptable. Specifically:
from abc import ABCMeta, abstractmethod
class NonStringIterable:
    __metaclass__ = ABCMeta
    @abstractmethod
    def __iter__(self):
        while False:
            yield None
    @classmethod
    def __subclasshook__(cls, C):
        if cls is NonStringIterable:
            if any("__iter__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented
This is an exact copy (changing only the class name) of Iterable as defined in _abcoll.py (an implementation detail of collections.py)... the reason this works as you wish, while collections.Iterable doesn't, is that the latter goes the extra mile to ensure strings are considered iterable, by calling Iterable.register(str) explicitly just after this class statement.
Of course it's easy to augment __subclasshook__ by returning False before the any call for other classes you want to specifically exclude from your definition.
In any case, after you have imported this new module as myiter, isinstance('ciao', myiter.NonStringIterable) will be False, and isinstance([1,2,3], myiter.NonStringIterable) will be True, just as you request. In Python 2.6 and later this is considered the proper way to embody such checks: define an abstract base class and check isinstance on it.
By combining previous replies, I'm using:
import types
import collections
#[...]
if isinstance(var, types.StringTypes) \
        or not isinstance(var, collections.Iterable):
    #[Do stuff...]
Not 100% foolproof, but if an object is not an iterable you can still let it pass and fall back to duck typing.
Edit: Python3
types.StringTypes == (str, unicode). The Python 3 equivalent is:
if isinstance(var, str) \
        or not isinstance(var, collections.Iterable):
Edit: Python3.3
From Python 3.3 on, Iterable lives in collections.abc:
if isinstance(var, str) \
        or not isinstance(var, collections.abc.Iterable):
I realise this is an old post but thought it was worth adding my approach for Internet posterity. The function below seems to work for me under most circumstances with both Python 2 and 3:
def is_collection(obj):
    """ Returns true for any iterable which is not a string or byte sequence.
    """
    try:
        if isinstance(obj, unicode):
            return False
    except NameError:
        pass
    if isinstance(obj, bytes):
        return False
    try:
        iter(obj)
    except TypeError:
        return False
    try:
        hasattr(None, obj)
    except TypeError:
        return True
    return False
This checks for a non-string iterable by (mis)using the built-in hasattr which will raise a TypeError when its second argument is not a string or unicode string.
2.x
I would have suggested:
hasattr(x, '__iter__')
or in view of David Charles' comment tweaking this for Python3, what about:
hasattr(x, '__iter__') and not isinstance(x, (str, bytes))
3.x
the builtin basestring abstract type was removed. Use str instead. The str and bytes types don’t have functionality enough in common to warrant a shared base class.
To explicitly expand on Alex Martelli's excellent hack of collections.py and address some of the questions around it: The current working solution in python 3.6+ is
import collections
import _collections_abc as cabc
import abc
class NonStringIterable(metaclass=abc.ABCMeta):
    __slots__ = ()
    @abc.abstractmethod
    def __iter__(self):
        while False:
            yield None
    @classmethod
    def __subclasshook__(cls, c):
        if cls is NonStringIterable:
            if issubclass(c, str):
                return False
            return cabc._check_methods(c, "__iter__")
        return NotImplemented
and demonstrated
>>> typs = ['string', iter(''), list(), dict(), tuple(), set()]
>>> [isinstance(o, NonStringIterable) for o in typs]
[False, True, True, True, True, True]
If you want to add iter('') into the exclusions, for example, modify the line
if issubclass(c, str):
    return False
to be
# `str_iterator` is just a shortcut for `type(iter(''))`*
if issubclass(c, (str, cabc.str_iterator)):
    return False
to get
[False, False, True, True, True, True]
If you want to test whether a variable is an iterable object and not a "string-like" object (str, bytes, ...), you can use the fact that the __mod__() method exists on such "string-like" objects for formatting purposes. So you can do a check like this:
>>> def is_not_iterable(item):
...     return hasattr(item, '__trunc__') or hasattr(item, '__mod__')
>>> is_not_iterable('')
True
>>> is_not_iterable(b'')
True
>>> is_not_iterable(())
False
>>> is_not_iterable([])
False
>>> is_not_iterable(1)
True
>>> is_not_iterable({})
False
>>> is_not_iterable(set())
False
>>> is_not_iterable(range(19)) #considers also Generators or Iterators
False
As you point out correctly, a single string is a character sequence.
So the thing you really want to do is to find out what kind of sequence arg is by using isinstance or type(a)==str.
If you want to realize a function that takes a variable amount of parameters, you should do it like this:
def function(*args):
    # args is a tuple
    for arg in args:
        do_something(arg)
function("ff") and function("ff", "ff") will work.
I can't see a scenario where an isiterable() function like yours is needed. It isn't isinstance() that is bad style but situations where you need to use isinstance().
Adding another answer here that doesn't require extra imports and is maybe more "pythonic", relying on duck typing and the fact that str has had a casefold method since Python 3.3.
def iterable_not_string(x):
    '''
    Check if input has an __iter__ method and then determine if it's a
    string by checking for a casefold method.
    '''
    try:
        assert x.__iter__
        try:
            assert x.casefold
            # could do the following instead for python 2.7 because
            # str and unicode types both had a splitlines method
            # assert x.splitlines
            return False
        except AttributeError:
            return True
    except AttributeError:
        return False
Python 3.X
Notes:
You need to implement an "isListable" function.
In my case dict is not iterable because iter(obj_dict) returns an iterator of just the keys.
Sequences are iterables, but not all iterables are sequences (immutable, mutable).
set, dict are iterables but not sequence.
list is iterable and sequence.
str is an iterable and immutable sequence.
Sources:
https://docs.python.org/3/library/stdtypes.html
https://opensource.com/article/18/3/loop-better-deeper-look-iteration-python
See this example:
from typing import Iterable, Sequence, MutableSequence, Mapping, Text
class Custom():
    pass
def isListable(obj):
    if(isinstance(obj, type)): return isListable(obj.__new__(obj))
    return isinstance(obj, MutableSequence)
try:
    # Listable
    #o = [Custom()]
    #o = ["a","b"]
    #o = [{"a":"va"},{"b":"vb"}]
    #o = list # class type
    # Not listable
    #o = {"a" : "Value"}
    o = "Only string"
    #o = 1
    #o = False
    #o = 2.4
    #o = None
    #o = Custom()
    #o = {1, 2, 3} #type set
    #o = (n**2 for n in {1, 2, 3})
    #o = bytes("Only string", 'utf-8')
    #o = Custom # class type
    if isListable(o):
        print("Is Listable[%s]: %s" % (o.__class__, str(o)))
    else:
        print("Not Listable[%s]: %s" % (o.__class__, str(o)))
except Exception as exc:
    raise exc
