Creating a generator expression from a list in Python

What is the best way to do the following in Python:
for item in [x.attr for x in some_list]:
    do_something_with(item)
This may be a noob question, but isn't the list comprehension building a new list that we don't need, just taking up memory? Wouldn't it be better if we could make an iterator-like list comprehension?

Yes (to both of your questions).
By using parentheses instead of brackets you can make what's called a "generator expression" for that sequence, which does exactly what you've proposed. It lets you iterate over the sequence without allocating a list to hold all the elements simultaneously.
for item in (x.attr for x in some_list):
    do_something_with(item)
The details of generator expressions are documented in PEP 289.
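To make the memory difference concrete, here is a minimal sketch. The Record class and the 10,000-element some_list are made up for illustration, and sys.getsizeof only measures the container object itself, not the items it refers to.

```python
import sys


class Record:
    """Hypothetical stand-in for the objects in some_list."""

    def __init__(self, attr):
        self.attr = attr


some_list = [Record(n) for n in range(10_000)]

# The list comprehension allocates a second list holding every attr.
as_list = [x.attr for x in some_list]

# The generator expression allocates only a small generator object.
as_gen = (x.attr for x in some_list)

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

# Items are produced one at a time, only when something iterates.
total = sum(as_gen)
```

The generator's size stays the same no matter how long some_list is, while the list's size grows with the number of elements.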

Why not just:
for x in some_list:
    do_something_with(x.attr)

This question is tagged functional-programming without an appropriate answer, so here's a functional solution:
from operator import attrgetter
map(do_something_with, map(attrgetter('attr'), some_list))
Python 3's map() returns a lazy iterator, so nothing is called until the result is iterated; Python 2's map() builds a list immediately. For lazy behavior in Python 2, use itertools.imap() instead.
If you're returning the results for some_list, you can simplify it further using a generator expression and lazy evaluation:
def foo(some_list):
    return (do_something_with(item.attr) for item in some_list)
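Here is a small sketch of the functional version under Python 3, showing that nothing runs until the map object is consumed. Record, the seen list and the body of do_something_with are assumptions for the demo; attrgetter is the operator helper that reads x.attr (itemgetter would do x['attr'] instead).

```python
from operator import attrgetter


class Record:
    """Hypothetical stand-in for the objects in some_list."""

    def __init__(self, attr):
        self.attr = attr


seen = []


def do_something_with(value):
    # Placeholder body: record the call so laziness is observable.
    seen.append(value)
    return value * 2


some_list = [Record(1), Record(2), Record(3)]

# Python 3: this only builds a lazy map object; no calls happen yet.
lazy = map(do_something_with, map(attrgetter("attr"), some_list))
calls_before = list(seen)

# Iterating (here via list()) is what actually triggers the calls.
results = list(lazy)
```

If you only care about the side effects, the plain for loop from the other answer is the clearer choice.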

Related

Generator expression vs list comprehension for adding values to a set [duplicate]

I am a tutor for an intermediate Python course at a university and I recently had some students come to me for the following problem (code is supposed to add all the values in a list to a set):
mylist = [10, 20, 30, 40]
my_set = set()
(my_set.add(num) for num in mylist)
print(my_set)
Their output was:
set()
Now, I realized their generator expression is the reason nothing is being added to the set, but I am unsure as to why.
I also realized that using a list comprehension rather than a generator expression:
[my_set.add(num) for num in mylist]
actually adds all the values to the set (although I realize this is memory-inefficient, as it involves allocating a list that is never used; the same could be done with just a for loop and no additional memory).
My question is essentially why does the list comprehension add to the set, while the generator expression does not? Also would the generator expression be in-place, or would it allocate more memory?
Generator expressions are lazy: if you don't actually iterate over them, they do nothing (aside from computing the iterator for the outermost loop, e.g. in this case, doing work equivalent to iter(mylist) and storing the result for when the genexpr is actually iterated). To make it work, you'd have to run out the generator, e.g. using the consume recipe from the itertools documentation:
consume(my_set.add(num) for num in mylist)
# Unoptimized equivalent:
for _ in (my_set.add(num) for num in mylist):
    pass
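Note that consume isn't importable from itertools; it's one of the recipes in the itertools documentation. A sketch of that recipe as I understand it (check the current docs for the canonical version), applied to the students' example:

```python
from collections import deque
from itertools import islice


def consume(iterator, n=None):
    """Advance the iterator n steps ahead; if n is None, exhaust it entirely."""
    if n is None:
        # A zero-length deque pulls every item and stores none of them.
        deque(iterator, maxlen=0)
    else:
        # Jump to the empty slice that starts at position n.
        next(islice(iterator, n, n), None)


mylist = [10, 20, 30, 40]
my_set = set()

# Creating the generator expression runs none of the add() calls.
pending = (my_set.add(num) for num in mylist)
snapshot_before = set(my_set)

# Exhausting it is what finally performs the side effects.
consume(pending)
```

This only demonstrates the laziness; it isn't a recommendation to write code this way.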
In any event, this is an insane thing to do; comprehensions and generator expressions are functional programming tools, and should not have side-effects, let alone be written solely for the purpose of producing side-effects. Code maintainers (reasonably) expect that comprehensions will trigger no "spooky action at a distance"; don't violate that expectation. Just use a set comprehension:
my_set = {num for num in mylist}
or since the comprehension does nothing in this case, the set constructor:
my_set = set(mylist) # Or with modern unpacking generalizations, my_set = {*mylist}
Your students (and yourself perhaps) are using comprehension expressions as shorthand for loops - that's a bad pattern.
The answer to your question is that the list comprehension needs to be evaluated immediately, as the results are needed to populate the list, while the generator expression is only evaluated as it's being used.
You're interested in the side effect of that evaluation, but if the side effect is really the main goal, the code should just be:
my_set = set(mylist)

What type of comprehension is this?

I came across the following Python code and am having trouble understanding it:
''.join(random.choice(string.ascii_lowercase + string.ascii_uppercase + string.digits) for i in range(length))
The for loop tells me it's a comprehension, but of what type? It's not a list comprehension, because the [] are missing (unless there's a special syntax at work here). I tried to work it out by running
random.choice(string.ascii_lowercase + string.ascii_uppercase + string.digits) for i in range(length)
directly in the interpreter but got syntax error at for.
I did some digging around and came to a not-so-sure conclusion that this is what's called a generator comprehension, but I didn't find any examples that look like this one; they all use the () notation for creating the generator object.
So, is it like join() works on iterators (and therefore generators) and we actually have a generator syntax here? If yes, can we omit the surrounding () when creating generator objects in function calls?
You need join() because the expression yields individual characters and you want a single string.
random.choice() selects a random character from its argument.
The argument contains the ASCII upper- and lowercase letters and the digits.
The length of the resulting string is length.
Putting it all together, this line of code generates a random string of length length that contains upper/lower case letters and digits.
This is a generator expression, not a list comprehension: the surrounding () can be omitted when a generator expression is the sole argument of a function call, as it is here with join().
It creates an iterator, unlike a list comprehension, which builds the whole list at once. Take this example from pythonwiki:
# list comprehension
doubles = [2 * n for n in range(50)]
# same result as the list comprehension above
doubles = list(2 * n for n in range(50))
Both produce the same list, but only the former is a list comprehension; the latter passes a generator expression to list(). I believe your example relies on the latter form. The wiki I linked calls this a generator expression.
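To answer the last part of the question with something you can run: a short sketch showing when the parentheses can be dropped. The names alphabet, token and total are just for illustration.

```python
import random
import string
import types

alphabet = string.ascii_lowercase + string.ascii_uppercase + string.digits
length = 12

# Sole argument of a call: the generator expression needs no extra ().
token = "".join(random.choice(alphabet) for _ in range(length))

# With a second argument, the generator expression must be parenthesized;
# sum(n * n for n in range(4), 10) would be a SyntaxError.
total = sum((n * n for n in range(4)), 10)

# Standing alone, the parenthesized form gives you a generator object.
gen = (2 * n for n in range(50))
is_generator = isinstance(gen, types.GeneratorType)
```

That also explains the syntax error in the interpreter: the bare expression without any enclosing () or function call isn't valid on its own.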

Python : how to create list of single tuple?

I needed a list containing only one tuple, like this: [(1,2,3,4,5,6)].
I tried this:
>>> [( i for i in range(1,10))]
[<generator object <genexpr> at 0x7fbf7ad94cd0>]
What is that generator object, and how do I use it?
How can I generate this kind of list?
You need this:
[tuple( i for i in range(1,10))]
The (i for i in something) notation is called a generator expression in Python and returns a generator object. You are simply capturing this object in a list. See PEP 289 for more on generator expressions.
Also, I am assuming you plan to do much more than i for i in range(1,10), as this is completely redundant; you can just as well do [tuple(range(1,10))].
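A quick sketch of the difference, using the same range as the question (the variable names are only for illustration):

```python
import types

# Brackets around a generator expression: a list holding one generator object.
wrapped = [(i for i in range(1, 10))]
holds_generator = isinstance(wrapped[0], types.GeneratorType)

# tuple() consumes the generator, so the list holds one real tuple.
single = [tuple(i for i in range(1, 10))]

# Equivalent here, since the expression does nothing to each element.
shorter = [tuple(range(1, 10))]
```

To use a generator object directly, iterate over it (for loop, tuple(), list(), next()); it can only be consumed once.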

Python concatenate list

I'm new to python and this is just to automate something on my PC. I want to concatenate all the items in a list. The problem is that
''.join(list)
won't work as it isn't a list of strings.
This site http://www.skymind.com/~ocrow/python_string/ says the most efficient way to do it is
''.join([`num` for num in xrange(loop_count)])
but that isn't valid python...
Can someone explain the correct syntax for including this sort of loop in a string.join()?
You need to turn everything in the list into strings, using the str() constructor:
''.join(str(elem) for elem in lst)
Note that it's generally not a good idea to use list as a variable name, because it shadows the built-in list constructor.
I've used a generator expression there to apply the str() constructor on each and every element in the list. An alternative method is to use the map() function:
''.join(map(str, lst))
The backticks in your example are another spelling of calling repr() on a value, which is subtly different from str(); you probably want the latter. Because it violates the Python principle of "There should be one-- and preferably only one --obvious way to do it.", the backticks syntax has been removed from Python 3.
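A small Python 3 sketch of how str() and repr() differ when joining mixed values; the items list is made up for illustration:

```python
items = [1, "two", 3.0, None]

# str() gives the "plain" text of each element.
joined_str = "".join(map(str, items))

# repr() keeps the quotes around strings, which is what the old
# backtick syntax did in Python 2.
joined_repr = "".join(repr(item) for item in items)

# Joining the raw items fails because not all of them are strings.
try:
    "".join(items)
    raw_join_ok = True
except TypeError:
    raw_join_ok = False
```

For numbers the two usually look the same, which is why the difference is easy to miss until a string shows up in the list.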
Here is another way (discussion is about Python 2.x):
''.join(map(str, my_list))
This solution will have the fastest performance, and it looks nice and simple imo. Using a generator won't be more efficient. In fact this will be more efficient, as ''.join has to allocate the exact amount of memory for the string based on the total length of the elements, so it needs to consume the whole generator before creating the string anyway.
Note that `` has been removed in Python 3 and it's not good practice to use it anymore; be more explicit by using str() if you have to, e.g. str(num).
Just use this; there's no need for the [], and use str(num):
''.join(str(num) for num in xrange(loop_count))
For a list, just replace xrange(loop_count) with the list name.
Example:
>>> ''.join(str(num) for num in xrange(10)) #use range() in python 3.x
'0123456789'
If your Python is too old for "list comprehensions" (the odd [x for x in ...] syntax), use map():
''.join(map(str, list))

simple python list comprehension question

I am trying to select the elements of a list without the very first element. The following code works, but it looks kind of ugly to me:
[s[i] for i in range(len(s)) if i>0]
Is there a better way to write it? Thanks.
Use the slicing notation:
s[1:]
Alternatively, you can avoid copying the list thus:
itertools.islice(s, 1, None)
The result isn't a list — it doesn't support random access, for instance — but you can pass it to anything that accepts an iterator.
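A minimal sketch comparing the two, with a made-up four-element list:

```python
from itertools import islice

s = ["a", "b", "c", "d"]

# Slicing copies everything after the first element into a new list.
copied = s[1:]

# islice wraps the original list lazily; nothing is copied up front.
lazy = islice(s, 1, None)

# The islice object is an iterator, so indexing it raises TypeError.
try:
    lazy[0]
    indexable = True
except TypeError:
    indexable = False

# It can still be passed to anything that accepts an iterable.
rest = list(lazy)
```

For small lists s[1:] is the simplest answer; islice only pays off when the copy itself is a cost you want to avoid.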
Wouldn't s[1:] be correct?
