Compound conditions in a for loop - python

Python allows an "if" condition in list comprehensions, e.g.:
[l for l in lines if l.startswith('example')]
This feature is missing from the regular "for" loop, so in the absence of:
for line in lines if line.startswith('example'):
    statements
one needs to test the condition inside the loop:
for line in lines:
    if line.startswith('example'):
        statements
or to embed a list comprehension, like:
for line in [l for l in lines if l.startswith('example')]:
    statements
Is my understanding correct? Is there a better or more pythonic way than the ones I listed above to achieve the same result of adding a condition to the for loop?
Please notice "lines" was chosen just as an example, any collection or generator could be there.

Several nice ideas came from other answers and comments, but I think this recent discussion on Python-ideas and its continuation are the best answer to this question.
To summarize: the idea was already discussed in the past, and the benefits did not seem enough to motivate the syntax change, considering:
increased language complexity and impact on learning curve
technical changes in all implementations: CPython, Jython, PyPy, ...
possible weird situations that extreme use of the syntax could lead to
One point that people seem to weigh heavily is avoiding Perl-like complexity that compromises maintainability.
This message and this one nicely summarize possible alternatives (most of which have already appeared on this page as well) to a compound if-statement in a for-loop:
# nested if
for l in lines:
    if l.startswith('example'):
        body
# continue, to put an accent on exceptional case
for l in lines:
    if not l.startswith('example'):
        continue
    body
# hacky way of generator expression
# (better than comprehension as does not store a list)
for l in (l for l in lines if l.startswith('example')):
    body()
# and its named version
def gen(lines):
    return (l for l in lines if l.startswith('example'))
for line in gen(lines):
    body
# functional style
for line in filter(lambda l: l.startswith('example'), lines):
    body()
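As a quick sanity check that these alternatives select the same items, here is a minimal sketch. The sample data and the result names (nested, with_continue, from_genexp, from_filter) are made up for illustration, and each loop body is replaced by collecting the item.

```python
lines = ["example 1", "skip me", "example 2"]  # made-up sample data

# nested if
nested = []
for l in lines:
    if l.startswith("example"):
        nested.append(l)

# continue on the exceptional case
with_continue = []
for l in lines:
    if not l.startswith("example"):
        continue
    with_continue.append(l)

# generator expression embedded in the for statement
from_genexp = []
for l in (l for l in lines if l.startswith("example")):
    from_genexp.append(l)

# functional style
from_filter = []
for l in filter(lambda l: l.startswith("example"), lines):
    from_filter.append(l)

print(nested == with_continue == from_genexp == from_filter)
```

All four loops visit the same two lines, so the choice between them is mostly about readability.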

Maybe not Pythonic, but you could filter the lines.
for line in filter(lambda l: l.startswith('example'), lines):
    print(line)
And you could define your own filter function, of course, if that lambda bothers you, or you want more complex filtering.
def my_filter(line):
    return line.startswith('example') or line.startswith('sample')
for line in filter(my_filter, lines):
    print(line)
I would say that having the condition within the loop is better because you aren't maintaining the "filtered" list in memory as you iterate over the lines.
So, that'd just be
for line in file:
    if not my_filter(line):
        continue
    # statements
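On the memory point, it may be worth checking which Python version is in use: in Python 3, filter() returns a lazy iterator rather than a list (in Python 2 it returned a list). The sketch below assumes Python 3; the sample data and the seen list are made up to show when the predicate actually runs.

```python
lines = ["example one", "other", "example two"]  # made-up sample data

seen = []  # records every line the predicate was asked about

def my_filter(line):
    seen.append(line)
    return line.startswith("example")

filtered = filter(my_filter, lines)
calls_before_iteration = len(seen)   # predicate has not run yet

first = next(filtered)               # pulls items only until the first match
calls_after_first = len(seen)

rest = list(filtered)                # consumes the remaining items

print(calls_before_iteration, first, calls_after_first, rest)
```

Under that assumption, no filtered list is kept in memory unless you build one yourself, for example with list(...).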

It's not that the feature is missing; I can't think of any way it could be done except in some special cases. (l for l in lines if l.startswith('example')) is a generator object and the l variable is local to that object. The for only sees what was returned by the generator's __next__ method.
The for is very different because the result of the generator needs to be bound to a variable in the caller's scope. You could have written
for line in (line for line in lines if line.startswith('example')):
    foo(line)
safely because those two line variables are in different scopes.
Further, the generator doesn't have to return just its local variable. It can evaluate any expression. How would you shortcut this?
for line in (foo(line)+'bar' for line in lines if line.startswith('example')):
    statements
Suppose you have a list of lists
for l in (l[:] for l in list_of_lists if l):
    l.append('modified')
That shouldn't append to the original lists.
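A small sketch of that last case, with made-up sample data and a hypothetical copies list added only to inspect the results:

```python
list_of_lists = [["a"], [], ["b", "c"]]  # made-up sample data

copies = []
for l in (l[:] for l in list_of_lists if l):
    l.append("modified")   # mutates the shallow copy, not the original
    copies.append(l)

print(copies)          # the modified copies of the non-empty lists
print(list_of_lists)   # the originals, untouched
```

The loop variable is bound to whatever the generator yields (here a copy), which is why a simple "for x in xs if cond" shorthand could not express it.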

Is there a better or more pythonic way than ones I listed above to achieve the same result of adding a condition in the for loop?
No, there is not, and there shouldn't be; that was the rationale for why list comprehensions got here in the first place. From the corresponding PEP:
List comprehensions provide a more concise way to create lists in situations where map() and filter() and/or nested loops would currently be used.
List comprehensions constitute an alternative for nested for, ifs; why would you want an alternative to the alternative?
If you need to use an if with a for, you nest it inside the loop; if you don't want to do that, you use a list comprehension. Flat is better than nested, but readability counts; allowing an if there would result in long, ugly lines that are harder to parse visually.

Related

How does this implementation of the Python sum function work? [duplicate]

This question already has answers here:
Understanding generators in Python
(13 answers)
What does "list comprehension" and similar mean? How does it work and how can I use it?
(5 answers)
Closed last year.
I have found an example online of how to count items in a list with the sum() function in Python; however, when I search for how to use the sum() function on the internet, all I can find is the basic sum(iterable, start), which adds numbers together from each element of the list/array.
Code I found, where each line of the file contains one word, and file = open("words.txt", "r"):
wordsInFile = sum(1 for line in file)
this works in my program, and I kind of see what is happening, but I would like to learn more about this kind of syntax, and what it can or can't recognize besides line. It seems pretty efficient, but I can't find any website explaining how it works, which prevents me from using this in the future in other contexts.
This is a generator expression.
First, let's write it a bit differently
wordsInFile = sum([1 for line in file])
In this form, [1 for line in file] is called a list comprehension. It's basically a for loop which produces a list, wrapped up into one line. It's similar to
wordsInFile = []
for line in file:
    wordsInFile.append(1)
but a lot more concise.
Now, when we remove the brackets
wordsInFile = sum(1 for line in file)
we get what's called a generator expression. It's basically the same as what I wrote before except that it doesn't produce an intermediate list. It produces a special iterator object that supplies the values on-demand, which is more efficient for a function like sum that only uses its input once (and hence doesn't need us to waste a bunch of memory producing a big list).
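To show what else the same syntax can express, here is a small sketch. It uses io.StringIO as a stand-in for the opened words.txt (an assumption, so the example runs without a file); the contents and the variable names are made up.

```python
import io

# Stand-in for open("words.txt", "r"); contents are made up
file = io.StringIO("apple\nbanana\n\ncherry\n")

# Count every line, as in the original snippet
total_lines = sum(1 for line in file)

# A file object is exhausted after one pass, so rewind before reusing it
file.seek(0)
# An optional trailing "if" filters which items are counted
non_blank = sum(1 for line in file if line.strip())

file.seek(0)
# The leading expression can be anything computed from the loop variable
total_chars = sum(len(line.strip()) for line in file)

print(total_lines, non_blank, total_chars)
```

The general shape is (expression for name in iterable if condition), where the if part is optional and name can be any variable name, not just line.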

While loops where only conditions are executed? [duplicate]

This question already has answers here:
Remove all occurrences of a value from a list?
(26 answers)
Closed 2 years ago.
So I want to execute only while loop statements, without putting anything inside them. For example, I have an array arr from which I have to remove multiple occurrences of some element. The instant the condition statement returns an error, while loop should end.
arr=[1,2,4,2,4,2,2]
This removes only one 2:
arr.remove(2)
I need to run this as long as it does not return error. (C++ has a semicolon put after while to do this).
I want something like this
while(arr.remove(2));
Three things.
First, it's not considered good practice in Python – it's not "pythonic" – to use an expression for its side effects. This is why, for example, the Python assignment operator is not itself an expression. (Although you can do something like a = b = 1 to set multiple variables to the same value, that statement doesn't break down as a = (b = 1); any such attempt to use an assignment statement as a value is a syntax error.)
Second, modifying data in place is also discouraged; it's usually better to make a copy and make the changes as the copy is constructed.
Third, even if this were a good way to do things, it wouldn't work in this case. When the remove method succeeds, it returns None, which is not truthy, so your loop exits immediately. On the other hand, when it fails, instead of returning a false value, it throws an exception, which will abort your whole program instead of just the loop, unless you wrap it in a try block.
So the list comprehension probably is the best solution here.
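The third point can be checked directly. A minimal sketch; the names result, raised, and cleaned are made up for illustration:

```python
arr = [1, 2, 4, 2, 4, 2, 2]

# remove() deletes only the first match and returns None
result = arr.remove(2)
after_one_remove = list(arr)

# When the value is absent, remove() raises ValueError instead of returning False
try:
    [1, 4].remove(2)
    raised = False
except ValueError:
    raised = True

# Building a new list without the unwanted value
cleaned = [x for x in arr if x != 2]

print(result, after_one_remove, raised, cleaned)
```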
The way you are looking to solve this does not yield the results you are looking for. Since you are looking to create a new list, you are not going to want to use the remove function, as per @Matthias's comment. The idiomatic way to do it would be something along the lines of this:
arr=[1,2,4,2,4,2,2]
arr = [x for x in arr if x != 2]
So I want to execute only while loop statements, without putting anything inside them.
That's really not necessary. Don't try to copy other language's syntax in Python. Different languages are designed with different objectives and hence, they have different syntax (or grammar of the language). Python has a different way of doing things than C++.
If you want to focus on the efficiency of the program, that's a different story. See this for more information on this.
Unfortunately, remove doesn't return anything (it returns None). So, you can't have anything that would look neat and clean without putting anything inside while.
Pythonic ways to remove all occurrences of an element from a list:
list(filter((2).__ne__, arr))
Or
arr = [x for x in arr if x != 2]
Or
while 2 in arr:
    arr.remove(2)
You can use:
arr = [1,2,4,2,4,2,2]
try:
    while arr.pop(arr.index(2)):
        pass
except ValueError:
    pass
print(arr)
#[1, 4, 4]
I am assuming you want to remove all occurrences of an element. This link might help you.
click here

Generator expression vs list comprehension for adding values to a set [duplicate]

This question already has answers here:
Understanding generators in Python
(13 answers)
Why does it work when I append a new element to a TUPLE?
(2 answers)
Closed 3 years ago.
I am a tutor for an intermediate Python course at a university and I recently had some students come to me for the following problem (code is supposed to add all the values in a list to a set):
mylist = [10, 20, 30, 40]
my_set = set()
(my_set.add(num) for num in mylist)
print(my_set)
Their output was:
set()
Now, I realized their generator expression is the reason nothing is being added to the set, but I am unsure as to why.
I also realized that using a list comprehension rather than a generator expression:
[my_set.add(num) for num in mylist]
actually adds all the values to the set (although I realize this is memory inefficient as it involves allocating a list that is never used. The same could be done with just a for loop and no additional memory.).
My question is essentially why does the list comprehension add to the set, while the generator expression does not? Also would the generator expression be in-place, or would it allocate more memory?
Generator expressions are lazy; if you don't actually iterate over them, they do nothing (aside from computing the iterator for the outermost loop, e.g. in this case, doing work equivalent to iter(mylist) and storing the result for when the genexpr is actually iterated). To make it work, you'd have to run out the generator, e.g. using the consume recipe from the itertools documentation:
consume(my_set.add(num) for num in mylist)
# Unoptimized equivalent:
for _ in (my_set.add(num) for num in mylist):
    pass
In any event, this is an insane thing to do; comprehensions and generator expressions are functional programming tools, and should not have side-effects, let alone be written solely for the purpose of producing side-effects. Code maintainers (reasonably) expect that comprehensions will trigger no "spooky action at a distance"; don't violate that expectation. Just use a set comprehension:
myset = {num for num in mylist}
or since the comprehension does nothing in this case, the set constructor:
myset = set(mylist) # Or with modern unpacking generalizations, myset = {*mylist}
Your students (and yourself perhaps) are using comprehension expressions as shorthand for loops - that's a bad pattern.
The answer to your question is that the list comprehension needs to be evaluated immediately, as the results are needed to populate the list, while the generator expression is only evaluated as it's being used.
You're interested in the side effect of that evaluation, but if the side effect is really the main goal, the code should just be:
myset = set(mylist)
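The laziness is easy to observe. In this sketch the names lazy_set, size_before, and size_after are made up; exhausting the generator with collections.deque(..., maxlen=0) is the approach the itertools consume recipe takes when asked to consume everything.

```python
from collections import deque

mylist = [10, 20, 30, 40]

lazy_set = set()
gen = (lazy_set.add(num) for num in mylist)  # creating it runs no add() calls
size_before = len(lazy_set)

# Exhaust the generator, discarding the None results of add()
deque(gen, maxlen=0)
size_after = len(lazy_set)

# The side-effect-free way to get the same set
direct = set(mylist)

print(size_before, size_after, lazy_set == direct)
```

As for memory: the generator expression does not build a list of results, but it still allocates a small generator object, so a plain for loop or set(mylist) remains the simpler choice.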

Why am I getting a syntax error for this conditional statement?

I've recently been practicing using map() in Python 3.5.2, and when I tried to run the module it said the comma separating the function and the iterable was a SyntaxError. Here's the code:
eng_swe = {"merry":"god", "christmas":"jul", "and":"och", "happy":"gott",
           "new":"nytt", "year":"år"}
def map_translate(l):
    """Translates English words into Swedish using the dictionary above."""
    return list(map(lambda x: eng_swe[x] if x in eng_swe.keys(), l))
I noticed that if I eliminate the conditional statement like this:
return list(map(lambda x: eng_swe[x], l))
it works fine, but it sacrifices the ability to avoid attempting to add items to the list that aren't in the dictionary. Interestingly enough, there also weren't any problems when I tried using a conditional statement with reduce(), as shown here:
from functools import reduce
def reduce_max_in_list(l):
    """Returns maximum integer in list using the 'reduce' function."""
    return reduce(lambda x, y: x if x > y else y, l)
Yes, I know I could do the exact same thing more cleanly and easily with a list comprehension, but I consider it worth my time to at least learn how to use map() correctly, even if I end up never using it again.
You're getting the SyntaxError because you're using a conditional expression without supplying the else clause which is mandatory.
The grammar for conditional expressions (i.e if statements in an expression form) always includes an else clause:
conditional_expression ::= or_test ["if" or_test "else" expression]
                                                 ^^^^^^
In your reduce example you do supply it and, as a result, no errors are being raised.
In your first example, you don't specify what should be returned if the condition isn't true. Since python can't yield nothing from an expression, that is a syntax error. e.g:
a if b # SyntaxError.
a if b else c # Ok.
You might argue that it could be useful to implicitly yield None in this case, but I doubt that a proposal of that sort would get any traction within the community... (I wouldn't vote for it ;-)
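To make the difference concrete, here is a minimal sketch with a shortened dictionary and a made-up word list. Adding an else branch makes the lambda valid, but it still produces one result per input, so untranslatable words become None rather than being skipped; skipping needs a filter step.

```python
eng_swe = {"merry": "god", "christmas": "jul", "and": "och"}
words = ["merry", "xmas", "and"]  # "xmas" is deliberately not in the dictionary

# Valid conditional expression: every input yields a value, possibly None
with_else = list(map(lambda x: eng_swe[x] if x in eng_swe else None, words))

# Filtering drops the unknown words instead
filtered = [eng_swe[x] for x in words if x in eng_swe]

# The else-less form is rejected at compile time
try:
    compile("a if b", "<sketch>", "eval")
    missing_else_ok = True
except SyntaxError:
    missing_else_ok = False

print(with_else, filtered, missing_else_ok)
```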
While the others' explanations of why your code is causing a SyntaxError are completely accurate, the goal of my answer is to aid you in your goal "to at least learn how to use map() correctly."
Your use of map in this context does not make much sense. As you noted in your question, it would be much cleaner if you used a list comprehension:
[eng_swe[x] for x in l if x in eng_swe]
As you can see, this looks awfully similar to your map expression, minus some of the convolution. Generally, this is a sign that you're using map incorrectly. map(lambda... is pretty much a code smell. (Note that I am saying this as an ardent supporter of the use of map in Python. I know many people think it should never be used, but I am not one of those people, as long as it is used properly.)
So, you might be wondering, what is an example of a good time to use map? Well, one use case I can think of off the top of my head is converting a list of strs to ints. For example, if I am reading a table of data stored in a file, I might do:
with open('my_file.txt', 'r') as f:
    # In Python 3, map returns an iterator, so wrap it in list() to get the ints
    data = [list(map(int, line.split(' '))) for line in f]
Which would leave me with a 2d-array of ints, perfect for further manipulation or analysis. What makes this a better use of map than your code is that it uses a built-in function. I am not writing a lambda expressly to be used by map (as this is a sign that you should use a list comprehension).
Getting back to your code, however... if you want to write your code functionally, you should really be using filter, which is just as important to know as map.
map(lambda x: eng_swe[x], filter(lambda x: eng_swe.get(x), l))
Note that I was unable to get rid of the map(lambda... code smell in my version, but at least I broke it down into smaller parts. The filter finds the words that can be translated and the map performs the actual translation. (Still, in this case, a list comprehension is probably better.) I hope that this explanation helps you more than it confuses you in your quest to write Python code functionally.
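One possible way to drop the lambdas entirely, offered as a sketch rather than a recommendation, is to pass the dictionary's own bound methods to filter and map. The shortened dictionary and word list below are made up.

```python
eng_swe = {"merry": "god", "christmas": "jul", "and": "och"}
words = ["merry", "xmas", "and", "christmas"]  # made-up input

# filter keeps the words that are keys; map looks each one up
translated = list(map(eng_swe.__getitem__,
                      filter(eng_swe.__contains__, words)))

# The list comprehension doing the same job
comprehension = [eng_swe[x] for x in words if x in eng_swe]

print(translated, comprehension)
```

Testing membership with __contains__ also avoids a subtle difference in the eng_swe.get(x) version, which would drop a word whose translation happened to be an empty string.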

pythonic way to combine a lot of startswith statements

What is the "pythonic" way to combine a lot of startswith statements?
Here are the details:
I receive various types of messages from a server, which sends them with different first letters in order for receiver to quickly identify and sort them. I wrote a code with a lot of
if message.startswith('A'):
    do_A()
elif message.startswith('B'):
    do_B()
- like statements. However, I feel there is more pythonic way to write the code without many statements, like maybe to make a list of all possible first letters and have one startswith statement.
Other variants with if message[0]=='A' are even better, since it appears to be faster per this, and speed matters to me.
Use a dictionary mapping first letter to a function:
message_map = {'A': do_A, 'B': do_B}
dispatch = message_map.get(message[:1])
if dispatch is not None:
    dispatch()
Functions in Python are first-class objects, so you can store them in a dictionary like this.
Note that I used a slice to get the first character; it'll result in an empty string if message happens to be empty, rather than throw an IndexError exception.
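A self-contained sketch of this dispatch pattern. The handler bodies, the handle wrapper, and the sample messages are hypothetical stand-ins for the real do_A and do_B.

```python
def do_A():
    return "handled A"   # placeholder for the real handler

def do_B():
    return "handled B"   # placeholder for the real handler

message_map = {"A": do_A, "B": do_B}

def handle(message):
    # message[:1] is "" for an empty message, so no IndexError is raised
    dispatch = message_map.get(message[:1])
    if dispatch is None:
        return None      # unknown or empty message type
    return dispatch()

results = [handle(m) for m in ["Alpha", "Bravo", "Zulu", ""]]
print(results)
```

Adding a new message type then only means adding one entry to message_map, and the lookup cost does not grow with the number of types the way an if/elif chain does.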
