When a Python list is known to always contain a single item, is there a way to access it other than:
mylist[0]
You may ask, 'Why would you want to?'. Curiosity alone. There seems to be an alternative way to do everything in Python.
Raises exception if not exactly one item:
Sequence unpacking:
singleitem, = mylist
# Identical in behavior (byte code produced is the same),
# but arguably more readable since a lone trailing comma could be missed:
[singleitem] = mylist
Rampant insanity, unpack the input to the identity lambda function:
# The only even semi-reasonable way to retrieve a single item and raise an exception on
# failure for too many, not just too few, elements as an expression, rather than a
# statement, without resorting to defining/importing functions elsewhere to do the work
singleitem = (lambda x: x)(*mylist)
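To make the strict behaviour concrete, here is a minimal sketch that wraps the list-style unpacking in a helper. The name `only` is hypothetical; it is not a standard-library function.

```python
def only(items):
    """Return the sole element of `items`; raise ValueError otherwise."""
    [item] = items  # unpacking demands exactly one element
    return item


def outcome(items):
    """Hypothetical helper: report what `only` does for `items`."""
    try:
        return ("ok", only(items))
    except ValueError:
        return ("ValueError", None)


results = [outcome([7]), outcome([]), outcome([1, 2])]
```

Both the empty list and the two-item list end up in the `ValueError` branch, which is the property the lenient approaches lack.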
All others silently ignore spec violation, producing first or last item:
Explicit use of iterator protocol:
singleitem = next(iter(mylist))
Destructive pop:
singleitem = mylist.pop()
Negative index:
singleitem = mylist[-1]
Set via a single-iteration for loop (because the loop variable remains available with its last value when a loop terminates):
for singleitem in mylist: break
There are many others (combining or varying bits of the above, or otherwise relying on implicit iteration), but you get the idea.
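As a quick illustration of "silently ignore", this sketch runs the lenient approaches on a list that violates the one-item assumption. The variable names are made up for the example.

```python
items = ["first", "last"]  # deliberately violates the "exactly one item" assumption

via_next = next(iter(items))   # takes the first item
via_negative_index = items[-1]  # takes the last item
via_pop = list(items).pop()     # pop on a copy, so `items` stays intact; takes the last item

for via_loop in items:          # loop variable keeps the first item after break
    break
```

None of these lines raises, so a spec violation would go unnoticed.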
I will add that the more_itertools library has a tool that returns one item from an iterable.
from more_itertools import one
iterable = ["foo"]
one(iterable)
# "foo"
In addition, more_itertools.one raises an error if the iterable is empty or has more than one item.
iterable = []
one(iterable)
# ValueError: not enough values to unpack (expected 1, got 0)
iterable = ["foo", "bar"]
one(iterable)
# ValueError: too many values to unpack (expected 1)
more_itertools is a third-party package > pip install more-itertools
(This is an adjusted repost of my answer to a similar question related to sets.)
One way is to use reduce with lambda x: x.
from functools import reduce
> reduce(lambda x: x, [3])
3
> reduce(lambda x: x, [1, 2, 3])
TypeError: <lambda>() takes 1 positional argument but 2 were given
> reduce(lambda x: x, [])
TypeError: reduce() of empty sequence with no initial value
Benefits:
Fails for multiple and zero values
Doesn't change the original list
Doesn't need a new variable and can be passed as an argument
Cons: "API misuse" (see comments).
As zip yields as many values as the shortest iterable given, I would have expected passing zero arguments to zip to return an iterable yielding infinitely many tuples, instead of returning an empty iterable.
This would have been consistent with how other monoidal operations behave:
>>> sum([]) # sum
0
>>> math.prod([]) # product
1
>>> all([]) # logical conjunction
True
>>> any([]) # logical disjunction
False
>>> list(itertools.product()) # Cartesian product
[()]
For each of these operations, the value returned when given nothing to combine is the identity value for the operation, which is to say, one that does not modify the result when included in the operation:
sum(xs) == sum([*xs, 0]) == sum([*xs, sum([])])
math.prod(xs) == math.prod([*xs, 1]) == math.prod([*xs, math.prod([])])
all(xs) == all([*xs, True]) == all([*xs, all([])])
any(xs) == any([*xs, False]) == any([*xs, any([])])
Or at least, one that gives a trivially isomorphic result:
itertools.product(*xs, itertools.product()) ≡
≡ itertools.product(*xs, [()]) ≡
≡ (*x, ()) for x in itertools.product(*xs)
In the case of zip, this would have been:
zip(*xs, zip()) ≡ f(x) for x in zip(*xs)
Because zip returns an n-tuple when given n arguments, it follows that zip() with 0 arguments must yield 0-tuples, i.e. (). This forces f to return (*x, ()) and therefore zip() to be equivalent to itertools.repeat(()). Another, more general law is:
((*x, *y) for x, y in zip(zip(*xs), zip(*ys))) ≡ zip(*xs, *ys)
which would have then held for all xs and ys, including when either xs or ys is empty (and does hold for itertools.product).
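The general law can be checked mechanically. The sketch below uses two hypothetical helpers, `merged` and `direct`, for the two sides of the equivalence and evaluates them with an empty `xs`.

```python
from itertools import product


def merged(op, xs, ys):
    """Left-hand side: combine op(*xs) with op(*ys), then flatten each pair."""
    return [(*x, *y) for x, y in op(op(*xs), op(*ys))]


def direct(op, xs, ys):
    """Right-hand side: apply op to all inputs at once."""
    return list(op(*xs, *ys))


xs, ys = [], [[1, 2], [3, 4]]
product_law_holds = merged(product, xs, ys) == direct(product, xs, ys)
zip_law_holds = merged(zip, xs, ys) == direct(zip, xs, ys)
```

With an empty `xs`, the law holds for `itertools.product` but not for `zip`, because `zip()` yields nothing and so the outer `zip` stops immediately.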
Yielding empty tuples indefinitely is also the behaviour that falls out of this straightforward reimplementation:
def my_zip(*iterables):
    iterables = tuple(map(iter, iterables))
    while True:
        item = []
        for it in iterables:
            try:
                item.append(next(it))
            except StopIteration:
                return
        yield tuple(item)
which means that the case of zip with no arguments must have been specifically special-cased not to do that.
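A small sketch makes the difference observable. `naive_zip` is a hypothetical name for the same straightforward reimplementation, and `islice` is needed because the zero-argument case never terminates.

```python
from itertools import islice


def naive_zip(*iterables):
    """Straightforward zip: stop as soon as any input is exhausted."""
    iterators = tuple(map(iter, iterables))
    while True:
        item = []
        for it in iterators:
            try:
                item.append(next(it))
            except StopIteration:
                return
        # With zero iterators the loop body never runs, so this yields () forever.
        yield tuple(item)


first_three = list(islice(naive_zip(), 3))  # take a finite prefix of the infinite stream
builtin_result = list(zip())                # the builtin yields nothing at all
```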
Why is zip() not equivalent to itertools.repeat(()) despite all the above?
PEP 201 and related discussion show that zip() with no arguments originally raised an exception. It was changed to return an empty list because this is more convenient for some cases of zip(*s) where s turns out to be an empty list. No consideration was given to what might be the 'identity', which in any case appears difficult to define with respect to zip - there is nothing you can zip with arbitrary x that will return x.
The original reasons for certain commutative and associative mathematical functions applied to an empty list to return the identity by default are not clear, but may have been driven by convenience, principle of least astonishment, and the history of earlier languages like Perl or ABC. Explicit reference to the concept of mathematical identity is rarely if ever made (see e.g. Reason for "all" and "any" result on empty lists). So there is no reason to rely on functions in general to do this. In many cases it would be less surprising for them to raise an exception instead.
I am passing the result of itertools.zip_longest to itertools.product, however I get errors when it gets to the end and finds None.
The error I get is:
Error: (, TypeError('sequence item 0: expected str instance, NoneType found',), )
If I use zip instead of itertools.zip_longest then I don't get all the items.
Here is the code I am using to generate the zip:
def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    print(args)
    #return zip(*args)
    return itertools.zip_longest(*args)
sCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~`!##$%^&*()_-+={[}]|\"""':;?/>.<,"
for x in grouper(sCharacters, 4):
    print(x)
Here is the output. The first one is itertools.zip_longest and the second is just zip. You can see the first with the None items and the second is missing the final item, the comma: ','
How can I get a zip of all characters in a string without the None values at the end?
Or how can I avoid this error?
Thanks for your time.
I've had to solve this in a performance critical case before, so here is the fastest code I've found for doing this (works no matter the values in iterable):
from itertools import zip_longest
def grouper(n, iterable):
    fillvalue = object()  # Guaranteed unique sentinel, cannot exist in iterable
    for tup in zip_longest(*(iter(iterable),) * n, fillvalue=fillvalue):
        if tup[-1] is fillvalue:
            yield tuple(v for v in tup if v is not fillvalue)
        else:
            yield tup
The above is, as far as I can tell, unbeatable when the input is long enough and the chunk sizes are small enough. For cases where the chunk size is fairly large, it can lose out to this even uglier case, but usually not by much:
from future_builtins import map # Only on Py2, and required there
from itertools import islice, repeat, starmap, takewhile
from operator import truth # Faster than bool when guaranteed non-empty call
def grouper(n, iterable):
    '''Returns a generator yielding n sized groups from iterable
    For iterables not evenly divisible by n, the final group will be undersized.
    '''
    # Can add tests to special case other types if you like, or just
    # use tuple unconditionally to match `zip`
    rettype = ''.join if type(iterable) is str else tuple
    # Keep islicing n items and converting to groups until we hit an empty slice
    return takewhile(truth, map(rettype, starmap(islice, repeat((iter(iterable), n)))))
Either approach seamlessly leaves the final element incomplete if there aren't sufficient items to complete the group. It runs extremely fast because literally all of the work is pushed to the C layer in CPython after "set up", so however long the iterable is, the Python level work is the same, only the C level work increases. That said, it does a lot of C work, which is why the zip_longest solution (which does much less C work, and only trivial Python level work for all but the final chunk) usually beats it.
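For a quick check of the sentinel idea on Python 3, here is a minimal sketch. The name `chunked` is a stand-in chosen for this example, and the inputs are made up.

```python
from itertools import zip_longest


def chunked(n, iterable):
    """Yield n-sized tuples; the final tuple is shorter if items run out."""
    sentinel = object()  # unique object, so it cannot collide with real data
    for group in zip_longest(*[iter(iterable)] * n, fillvalue=sentinel):
        if group[-1] is sentinel:
            # Only the last group can contain padding; strip it.
            yield tuple(v for v in group if v is not sentinel)
        else:
            yield group


letters = list(chunked(4, "abcdefghij"))
with_nones = list(chunked(2, [None, 1, None]))
```

Because the padding is a private `object()`, genuine `None` values in the data survive, which would not be the case with the default fill value.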
The slower, but more readable equivalent code to option #2 (but skipping the dynamic return type in favor of just tuple) is:
def grouper(n, iterable):
    iterable = iter(iterable)
    while True:
        x = tuple(islice(iterable, n))
        if not x:
            return
        yield x
Or more succinctly with Python 3.8+'s walrus operator:
def grouper(n, iterable):
    iterable = iter(iterable)
    while x := tuple(islice(iterable, n)):
        yield x
The length of sCharacters is 93 (note that 92 % 4 == 0). Since zip stops as soon as the shortest input is exhausted, the final, incomplete group (here the last element) is dropped.
Beware that the Nones added by itertools.zip_longest are artificial values, which may not be the desired behaviour for everyone. That's why zip simply ignores the unnecessary extra values.
EDIT:
to be able to use zip you could append some whitespace to your string:
n=4
sCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~`!##$%^&*()_-+={[}]|\"""':;?/>.<,"
if len(sCharacters) % n > 0:
    sCharacters = sCharacters + (" "*(n-len(sCharacters) % n))
EDIT2:
to obtain the missing tail when using zip use code like this:
tail = '' if len(sCharacters)%n == 0 else sCharacters[-(len(sCharacters)%n):]
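Putting the two pieces together, a small sketch with a hypothetical helper `split_with_tail` shows zip dropping the incomplete group and the slice recovering it:

```python
def split_with_tail(s, n):
    """Return (full n-sized groups from zip, leftover characters zip dropped)."""
    groups = list(zip(*[iter(s)] * n))  # zip discards a trailing partial group
    remainder = len(s) % n
    tail = s[-remainder:] if remainder else ""
    return groups, tail


groups, tail = split_with_tail("abcdefg", 3)  # 7 characters, 7 % 3 == 1
```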
I have an iterator it that I'm assuming is already sorted, but I would like to raise an exception if it isn't.
Data from iterator is not in memory so I do not want to use sorted() builtin because AFAIK it puts the whole iterator in a list.
The solution I'm using now is to wrap the iterator in a generator function like this:
def checkSorted(it):
    prev_v = it.next()
    yield prev_v
    for v in it:
        if v >= prev_v:
            yield v
            prev_v = v
        else:
            raise ValueError("Iterator is not sorted")
So that I can use it like this:
myconsumer(checkSorted(it))
Does someone know if there are better solutions?
I know that my solution works but it seems quite strange (at least to me) writing a module on my own to accomplish such a trivial task. I'm looking for a simple one liner or builtin solution (If it exists)
Basically your solution is almost as elegant as it gets (you could of course put it in a utility module if you find it generally useful). If you wanted, you could use an infinity object to cut the code down a bit, but then you have to include a class definition as well, which grows the code again (unless you inline the class definition):
def checkSorted(it):
    prev = type("", (), {"__lt__": lambda a, b: False})()
    for x in it:
        if prev < x:
            raise ValueError("Not sorted")
        prev = x
        yield x
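The first line uses type to create a class and then instantiate it. Objects of this class never compare less than anything (an infinity object). Note that, as written, the check accepts non-increasing input; for the ascending order checked in the question, define __gt__ instead of __lt__ and test prev > x.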
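For readers on Python 3, a plain generator without any sentinel object might look like the sketch below. It checks ascending order as in the question, handles the empty iterator explicitly, and the name `check_sorted` is just an illustrative choice.

```python
def check_sorted(iterable):
    """Yield items unchanged; raise ValueError at the first out-of-order item."""
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return  # an empty input is trivially sorted
    yield prev
    for value in it:
        if value < prev:
            raise ValueError("Iterator is not sorted")
        yield value
        prev = value


seen = []
try:
    for item in check_sorted(iter([1, 3, 2])):
        seen.append(item)
    failed = False
except ValueError:
    failed = True
```

The consumer sees every item up to the violation, so it has to be prepared for the exception arriving mid-stream.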
The problem with doing a one-liner is that you have to deal with three constructs: you have to update state (assignment), throw an exception, and loop. You could easily perform these using statements, but making them into a one-liner means putting the statements on the same line, which in turn causes problems with the loop and if constructs.
If you want to put the whole thing into an expression you will have to use dirty tricks: itertools can provide the assignment and the looping, and the throwing can be done with the throw method of a generator (which can be created in an expression too):
imap( lambda i, j: (i >= j and j or (_ for _ in ()).throw(ValueError("Not sorted"))), *(lambda pre, it: (chain([type("", (), {"__ge__": lambda a, b: True})()], pre), it))(*tee(it)))
The last it is the iterator you want to check and the expression evaluates to a checked iterator. I agree it's not good looking and not obvious what it does, but you asked for it (and I don't think you wanted it).
As an alternative, I suggest using itertools.izip_longest (zip_longest in Python 3) to create a generator of consecutive pairs:
You can use tee to create 2 independent iterators from the original iterable.
from itertools import izip_longest,tee
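def checkSorted(it):
    pre,it=tee(it)
    next(it)
    for i,j in izip_longest(pre,it):
        if j is not None:  # None is the fill value for the final pair; a bare `if j:` would misread falsy items such as 0
            if i >= j:
                yield i
            else:
                raise ValueError("Iterator is not sorted")
        else:
            yield i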
Demo :
it=iter([5,4,3,2,1])
print list(checkSorted(it))
[5, 4, 3, 2, 1]
it=iter([5,4,3,2,3])
print list(checkSorted(it))
Traceback (most recent call last):
File "/home/bluebird/Desktop/ex2.py", line 19, in <module>
print list(checkSorted(it))
File "/home/bluebird/Desktop/ex2.py", line 10, in checkSorted
raise ValueError("Iterator is not sorted")
ValueError: Iterator is not sorted
Note: Actually, I think there is no need to yield the values of your iterable when you already have them. So, as a more elegant way, I suggest using a generator expression within the all function and returning a bool value:
from itertools import izip,tee
def checkSorted(it):
    pre,it=tee(it)
    next(it)
    return all(i>=j for i,j in izip(pre,it))
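A Python 3 rendering of the same idea could look like this sketch. It keeps the non-increasing comparison used in the demo above, uses the builtin zip in place of izip, and passes a default to next so that an empty input does not raise. The function name is hypothetical.

```python
from itertools import tee


def is_non_increasing(iterable):
    """Return True if every item is >= the item that follows it."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one; tolerate empty input
    return all(a >= b for a, b in zip(first, second))


ok = is_non_increasing(iter([5, 4, 3, 2, 1]))
bad = is_non_increasing(iter([5, 4, 3, 2, 3]))
```

Since all short-circuits, the check stops reading from the iterator at the first violation.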
I'm trying to parse a tuple of the form:
a=(1,2)
or
b=((1,2), (3,4)...)
where for a's case the code would be:
x, y = a
and b would be:
for element in b:
    x, y = element
Is there a fast and clean way to accept both forms? This is in a MIDI receive callback
(x is a pointer to a function to run, and y is intensity data to be passed to a light).
# If your input is in in_seq...
if hasattr(in_seq[0], "__iter__"):
    # b case
else:
    # a case
This basically checks to see if the first element of the input sequence is iterable. If it is, then it's your second case (since a tuple is iterable), if it's not, then it's your first case.
If you know for sure that the inputs will be tuples, then you could use this instead:
if isinstance(in_seq[0], tuple):
    # b case
else:
    # a case
Depending on what you want to do, your handling for the 'a' case could be as simple as bundling the single tuple inside a larger tuple and then calling the same code on it as the 'b' case, e.g...
b_case = (a_case,)
Edit: as pointed out in the comments, a better version might be...
from collections.abc import Iterable  # the plain collections.Iterable alias was removed in Python 3.10
if isinstance(in_seq[0], Iterable):
    # ...
The right way to do that would be:
a = ((1,2),) # note the difference
b = ((1,2), (3,4), ...)
for pointer, intensity in a:
pass # here you do what you want
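If the caller cannot be changed to always send the nested form, the wrapping trick can be folded into a small normalising helper. This is only a sketch: `iter_pairs` is a hypothetical name, and it assumes the inputs are always tuples as in the question.

```python
def iter_pairs(data):
    """Iterate over (x, y) pairs whether `data` is one pair or a tuple of pairs."""
    if data and not isinstance(data[0], tuple):
        data = (data,)  # a lone pair: wrap it so both forms look alike
    return iter(data)


single = [(x, y) for x, y in iter_pairs((1, 2))]
several = [(x, y) for x, y in iter_pairs(((1, 2), (3, 4)))]
```

The callback body then needs only one loop, regardless of which form arrives.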
Simply put: there is a list, say LST = [[12,1],[23,2],[16,3],[12,4],[14,5]], and I want to get all the minimum elements of this list according to the first element of each inner list. So for the above example the answer would be [12,1] and [12,4]. Is there a typical way of doing this in Python?
Thanking you in advance.
Two passes:
minval = min(LST)[0]
result = [x for x in LST if x[0] == minval]
One pass:
def all_minima(iterable, key=None):
    if key is None:
        key = lambda x: x  # identity function; the builtin `id` would compare memory addresses instead
    hasminvalue = False
    minvalue = None
    minlist = []
    for entry in iterable:
        value = key(entry)
        if not hasminvalue or value < minvalue:
            minvalue = value
            hasminvalue = True
            minlist = [entry]
        elif value == minvalue:
            minlist.append(entry)
    return minlist
from operator import itemgetter
result = all_minima(LST, key=itemgetter(0))
A compact single-pass solution requires sorting the list -- that's technically O(N log N) for an N-long list, but Python's sort is so good, and so many sequences "just happen" to have some embedded order in them (which timsort cleverly exploits to go faster), that sorting-based solutions sometimes have surprisingly good performance in the real world.
Here's a solution requiring 2.6 or better:
import itertools
import operator
f = operator.itemgetter(0)
def minima(lol):
    return list(next(itertools.groupby(sorted(lol, key=f), key=f))[1])
To understand this approach, looking "from the inside, outwards" helps.
f, i.e., operator.itemgetter(0), is a key-function that picks the first item of its argument for ordering purposes -- the very purpose of operator.itemgetter is to easily and compactly build such functions.
sorted(lol, key=f) therefore returns a sorted copy of the list-of-lists lol, ordered by increasing first item. If you omit the key=f the sorted copy will be ordered lexicographically, so it will also be in order of increasing first item, but that acts only as the "primary key" -- items with the same first sub-item will in turn be sorted among them by the values of their second sub-items, and so forth -- while with the key=f you're guaranteed to preserve the original order among items with the same first sub-item. You don't specify which behavior you require (and in your example the two behaviors happen to produce the same result, so we cannot distinguish from that example) which is why I'm carefully detailing both possibilities so you can choose.
itertools.groupby(sorted(lol, key=f), key=f) performs the "grouping" task that is the heart of the operation: it yields groups from the sequence (in this case, the sequence sorted provides) based on the key ordering criteria. That is, a group with all adjacent items producing the same value among themselves when you call f with the item as an argument, then a group with all adjacent items producing a different value from the first group (but the same among themselves), and so forth. groupby respects the ordering of the sequence it takes as its argument, which is why we had to sort the lol first (and this behavior of groupby makes it very useful in many cases in which the sequence's ordering does matter).
Each result yielded by groupby is a pair k, g: a key k, which is the result of f(i) on each item in the group, and an iterator g, which yields each item in the group in sequence.
The next built-in (the only bit in this solution which requires Python 2.6) given an iterator produces its next item -- in particular, the first item when called on a fresh, newly made iterator (and, every generator of course is an iterator, as is groupby's result). In earlier Python versions, it would have to be groupby(...).next() (since next was only a method of iterators, not a built-in), which is deprecated since 2.6.
So, summarizing, the result of our next(...) is exactly the pair k, g where k is the minimum (i.e., first after sorting) value for the first sub-item, and g is an iterator for the group's items.
So, with that [1] we pick just the iterator, so we have an iterator yielding just the subitems we want.
Since we want a list, not an iterator (per your specs), the outermost list(...) call completes the job.
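The walk-through can be followed step by step with the list from the question. The intermediate variable names below are only for illustration.

```python
from itertools import groupby
from operator import itemgetter

LST = [[12, 1], [23, 2], [16, 3], [12, 4], [14, 5]]
first = itemgetter(0)

# Step 1: stable sort by first sub-item, so equal keys keep their original order.
ordered = sorted(LST, key=first)

# Step 2: take only the first (key, group-iterator) pair that groupby produces.
key, group = next(groupby(ordered, key=first))

# Step 3: materialise the group iterator into a list.
minima = list(group)
```

The group iterator is consumed right away here; it would be invalidated if the groupby object were advanced to the next group first.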
Is all of this worth it, performance-wise? Not on the tiny example list you give -- minima is actually slower than either code in @Kenny's answer (of which the first, "two-pass" solution is speedier). I just think it's worth keeping the ideas in mind for the next sequence processing problem you may encounter, where the details of typical inputs may be quite different (longer lists, rarer minima, partial ordering in the input, &c, &c;-).
import operator
m = min(LST, key=operator.itemgetter(0))[0]
print([x for x in LST if x[0] == m])
minval = min(x[0] for x in LST)
result = [x for x in LST if x[0]==minval]