I'd like to filter a list of values. Depending on the state of a variable, I'd like to return the positive or negative result of the filter. Example:
def foo(it, my_condition):
    return [s for s in it if (s.startswith("q") if my_condition else not s.startswith("q"))]
foo(["The", "quick", "brown", "fox"], my_condition=True)
So on my_condition=True I get ["quick"] and on my_condition=False I get ["The", "brown", "fox"].
What I don't like about the implementation is this part: (s.startswith("q") if filter else not s.startswith("q")). It contains duplicate code and takes up a lot of space in an otherwise concise list comprehension. What I really want is just to insert a not after the if, depending on the state of the filter variable.
Is there a prettier / cleaner solution to this? If possible, I'd like to avoid the computational overhead of lambda expressions in this case.
Just compare the result of startswith with the boolean parameter:
def foo(it, keep_matches):
    return [s for s in it if s.startswith("q") == keep_matches]
Note: don't name your variable filter, since that shadows the built-in function for filtering iterables. I changed it to a more explicit name (not sure it's the best choice, but it's better than flag or filter).
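A minimal sketch of this approach with both flag values, using the list from the question. The comparison works because str.startswith returns a real bool; for a predicate that only returns truthy or falsy values, wrapping it in bool() first would probably be needed (that caveat is an assumption beyond the original answer).

```python
def foo(it, keep_matches):
    # Keep s when "starts with q" equals the requested outcome.
    return [s for s in it if s.startswith("q") == keep_matches]


words = ["The", "quick", "brown", "fox"]
matches = foo(words, keep_matches=True)
non_matches = foo(words, keep_matches=False)
print(matches)      # ['quick']
print(non_matches)  # ['The', 'brown', 'fox']
```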
I was trying to check for palindromes and wanted to eliminate non alphanumeric characters. I can use filter for this like so:
filteredChars = filter(lambda ch: ch.isalnum(), s)
However, I also need the comparison to ignore case, so what I would really like to get back is ch.lower(). I tried this:
filteredChars = filter(lambda ch.lower() : ch.isalnum(), s)
but I got an error.
Is it possible to write a lambda to do this without a list comprehension or a user defined function?
I can already get my answer with:
filteredChars = [ch.lower() for ch in s if ch.isalnum()]
However, this DigitalOcean filter() tutorial says that list comprehensions use up more space
... a list comprehension will make a new list, which will increase the run time for that processing. This means that after our list comprehension has completed its expression, we’ll have two lists in memory. However, filter() will make a simple object that holds a reference to the original list, the provided function, and an index of where to go in the original list, which will take up less memory
Does filter only hold references to the filtered values in the original sequence? When I think of this though, I conclude (maybe not correctly) that if I have to lower the cases, then I would actually need a new list with the modified characters hence, filter can't be used for this task at all.
First of all, the reason this didn't work is simply syntax. A lambda's parameter list is only a declaration of names, just like a regular function's, so you can't apply operations such as .lower() there.
Next, you can't modify the returned values with filter: it needs the function to return a boolean that says which values to keep and which to drop. It can't change the values themselves. So if you want to use filter you need to "normalize" its input to be lowercase:
filteredChars = filter(lambda ch: ch.isalnum(), s.lower())
Alternatively, you can convert the exact list comprehension you used to a generator expression, simply by changing the brackets [...] to parentheses (...):
filteredChars = (ch.lower() for ch in s if ch.isalnum())
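As a rough check that the two forms agree, the sketch below runs both on a sample string (the value of s is made up for illustration). Both are lazy iterators, so each can be consumed only once; list() is used here only to display the results.

```python
s = "A man, a plan!"  # sample input, chosen for illustration

via_filter = filter(lambda ch: ch.isalnum(), s.lower())
via_genexp = (ch.lower() for ch in s if ch.isalnum())

chars_from_filter = list(via_filter)
chars_from_genexp = list(via_genexp)
print(chars_from_filter == chars_from_genexp)  # True

# A second pass yields nothing: the iterator is already exhausted.
second_pass = list(via_filter)
print(second_pass)  # []
```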
Lastly, as one-liners like these can be confusing, you can also write a generator function and loop over that:
def filter_chars(s):
    for ch in s:
        if ch.isalnum():
            yield ch.lower()
And now going in line with the previous methods you can do:
filteredChars = filter_chars(s)
Or as you are probably going to iterate over filteredChars anyway, just do directly:
for ch in filter_chars(s):
    ...  # do stuff
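To tie this back to the palindrome check from the question, here is a minimal sketch built on the generator function. The helper name is_palindrome and the sample strings are assumptions for illustration, not part of the answer.

```python
def filter_chars(s):
    # Yield only alphanumeric characters, lowercased.
    for ch in s:
        if ch.isalnum():
            yield ch.lower()


def is_palindrome(s):
    # Hypothetical helper. The generator is materialized into a list
    # because a generator cannot be reversed or indexed.
    chars = list(filter_chars(s))
    return chars == chars[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("Not a palindrome"))                # False
```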
According to the documentation, filter does the following:
Construct an iterator from those elements of iterable for which
function returns true.(...)
Since you want to both select and alter elements, this is not a task for filter alone but rather for a composition of filter and map. In this particular case it might be written the following way:
s = "Xy1!"
filteredChars = map(str.lower, filter(str.isalnum, s))
for c in filteredChars:
    print(c)
gives output
x
y
1
filter and map are two members of a trio whose third member is reduce (a built-in in Python 2, functools.reduce in Python 3).
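For completeness, a small sketch of all three working together on the same input. Folding characters with reduce is shown only to illustrate the trio; "".join(...) would normally be the more idiomatic way to build the string.

```python
from functools import reduce

s = "Xy1!"

# filter selects, map transforms, reduce folds the results into one value.
selected_and_lowered = map(str.lower, filter(str.isalnum, s))
cleaned = reduce(lambda acc, ch: acc + ch, selected_and_lowered, "")
print(cleaned)  # xy1
```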
One way to do that is applying filter(), then joining the string, then applying lower():
"".join(filter(lambda ch: ch.isalnum(), s)).lower()
Another is using map() and a ternary operator:
"".join(map(lambda ch: ch.lower() if ch.isalnum() else "", s))
Let's say I have a list of lists like this
lol = [[1, 'e_r_i'], [2, 't_u_p']]
and I want to apply a function to the string elements which returns several values from which I need only a subset (which ones differ per use-case). For illustration purposes, I just make a simple split() operation:
def dummy(s):
return s.split('_')
Now, let's say I only want the last two letters and concatenate those; there is the straightforward option
positions = []
for _, s in lol:
    stuff = dummy(s)
    positions.append(f"{stuff[1]}{stuff[2]}")
and doing the same in a list comprehension
print([f"{dummy(s)[1]}{dummy(s)[2]}" for _, s in lol])
both give the identical, desired outcome
['ri', 'up']
Is there a way to use the walrus operator here in the list comprehension to avoid calling dummy twice?
PS: Needless to say that in reality the dummy function is far more complex, so I don't look for a better solution regarding the split but it is fully about the structure and potential usage of the walrus operator.
I will have to say that your first explicit loop is the best option here. It is clear, readable code and you're not repeating any calls.
Still, as you asked for it, you could always do:
print([f"{(y:=dummy(s))[1]}{y[2]}" for _, s in lol])
You could also wrap the processing in another function:
def dummy2(l):
    return f"{l[1]}{l[2]}"
And this removes the need of walrus altogether and simplifies the code further:
print([dummy2(dummy(s)) for _, s in lol])
Yes. This is what you want
output = [f"{(stuff := dummy(s))[1]}{stuff[2]}" for _, s in lol]
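The sketch below checks that the walrus form really calls dummy once per item, and shows a common alternative that binds a name by iterating over a one-element list. The calls counter is added purely for illustration, and the := form assumes Python 3.8 or later.

```python
calls = 0  # counts how often dummy runs, purely for illustration


def dummy(s):
    global calls
    calls += 1
    return s.split('_')


lol = [[1, 'e_r_i'], [2, 't_u_p']]

# Walrus form: bind the result once, reuse it in the same expression.
with_walrus = [f"{(parts := dummy(s))[1]}{parts[2]}" for _, s in lol]

# Alternative without := : loop over a one-element list to bind a name.
with_inner_for = [f"{p[1]}{p[2]}" for _, s in lol for p in [dummy(s)]]

print(with_walrus)     # ['ri', 'up']
print(with_inner_for)  # ['ri', 'up']
print(calls)           # 4, i.e. one call per item per comprehension
```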
In Python, there is a feature whereby True and False can be added, subtracted, etc.
Are there any examples where this can be useful?
Is there any real benefit from this feature, for example, when:
it increases productivity
it makes the code more concise (without losing speed)
etc
While in most cases it would just be confusing and completely unwarranted to (ab)use this functionality, I'd argue that there are a few cases that are exceptions.
One example would be counting. True behaves as 1 in arithmetic (bool is a subclass of int), so you can count the number of elements that pass some criterion in this fashion, while remaining concise and readable. An example of this would be:
valid_elements = sum(is_valid(element) for element in iterable)
As mentioned in the comments, this could be accomplished via:
valid_elements = list(map(is_valid, iterable)).count(True)
but to use .count(...), the object must be a list, which imposes a linear space complexity (iterable may have been a constant space generator for all we know).
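A minimal sketch of the counting idiom on a generator, so that no intermediate list is built. The is_valid predicate here is a hypothetical stand-in (it treats even numbers as valid).

```python
def is_valid(n):
    # Hypothetical predicate for illustration: even numbers are "valid".
    return n % 2 == 0


numbers = (n for n in range(10))  # a generator, never stored as a list

# bool is a subclass of int, so True adds 1 and False adds 0.
valid_elements = sum(is_valid(n) for n in numbers)
print(valid_elements)              # 5
print(isinstance(True, int))       # True
print(True + True)                 # 2
```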
Another case where this functionality might be usable is as a play on the ternary operator for sequences, where you either want the sequence or an empty sequence depending on the value. Say you want to return the resulting list if a condition holds, otherwise an empty list:
return result_list * (not return_empty)
or if you are doing a conditional string concatenation
result = str1 + str2 * do_concatenate
Of course, both of these could be solved by using Python's ternary operator:
return [] if return_empty else result_list
...
result = str1 + str2 if do_concatenate else str1
The point being, this behavior does provide other options in a few scenarios, and those options aren't all that unreasonable. It's just a matter of using your best judgement as to whether it will cause confusion for future readers (yourself included).
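A short sketch of what multiplying a sequence by a boolean actually produces, and a check that the multiplication and ternary forms of the string example agree. The sample values are made up for illustration.

```python
result_list = [1, 2, 3]

# A sequence times True is a copy of itself; times False it is empty.
kept = result_list * True
emptied = result_list * False
print(kept)     # [1, 2, 3]
print(emptied)  # []

str1, str2 = "foo", "bar"
agreement = []
for do_concatenate in (True, False):
    via_multiply = str1 + str2 * do_concatenate
    via_ternary = str1 + str2 if do_concatenate else str1
    agreement.append(via_multiply == via_ternary)
print(agreement)  # [True, True]
```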
I would avoid it at all cost. It is confusing and goes against typing. Python being permissive does not mean you should do it ...
I want to pass a lambda expression to a filter that captures an outside variable l. I want l to be the list passed to filter. Assume that the list comes out of some other list comprehension / mapping / filtering. Is it possible to assign an identifier (in this case l) to that list? Like so:
filter((lambda x : len([z for z in l if z == x]) == 1), l#[1,1,2,3,4,4,5,6,6] )
I just used # because in Haskell you can use # in a similar way.
Is there some succinct syntax for this or do I need to break up the operation into several lines and assign l in a normal manner?
Before Python 3.8 it was not possible to give a name 'on the fly' to the second parameter of filter. Assignment expressions (:=) now make it technically possible.
Even so, I think that Python code should be explicit (see "The Zen of Python"), and I suggest splitting the code into two or more lines.
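A minimal sketch of the split-into-lines version, using the list from the question. Here l.count(x) stands in for the question's len([z for z in l if z == x]); that substitution is mine and gives the same count.

```python
l = [1, 1, 2, 3, 4, 4, 5, 6, 6]  # in practice, the output of an earlier step

# The lambda looks up l when it is called, so l only has to exist by then.
uniques = list(filter(lambda x: l.count(x) == 1, l))
print(uniques)  # [2, 3, 5]
```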
I have a string list
[str1, str2, str3.....]
and I also have a def to check the format of the strings, something like:
def CheckIP(strN):
    if formatCorrect(strN):
        return True
    return False
Now I want to check every string in list, and of course I can use for to check one by one. But could I use map() to make code more readable...?
You can map your list to your function and then use all to check if it returns True for every item:
if all(map(CheckIP, list_of_strings)):
    ...  # All strings are good
Actually, it would be cleaner to just get rid of the CheckIP function and use formatCorrect directly:
if all(map(formatCorrect, list_of_strings)):
    ...  # All strings are good
Also, as an added bonus, all short-circuits: it only checks as many items as are necessary before returning a result, stopping at the first false one.
Note however that a more common approach would be to use a generator expression instead of map:
if all(formatCorrect(x) for x in list_of_strings):
In my opinion, generator expressions are always better than map because:
1. They are slightly more readable.
2. They are just as fast as map, if not faster. Also, in Python 2.x, map creates a list object that is often unnecessary (wastes memory). Only in Python 3.x does map use lazy computation like a generator expression.
3. They are more powerful. In addition to just mapping items to a function, generator expressions allow you to perform operations on each item as it is produced. For example:
sum(x * 2 for x in (1, 2, 3))
4. They are preferred by most Python programmers. Keeping with convention is important when programming because it eases maintenance and makes your code more understandable.
5. There is talk of removing functions like map, filter, etc. from a future version of the language. Though this is not set in stone, it has come up many times in the Python community.
Of course, if you are a fan of functional programming, there isn't much chance you'll agree to points one and four. :)
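For a self-contained illustration of the generator-expression form: formatCorrect is never defined in the question, so the sketch below substitutes a hypothetical format_correct built on the standard-library ipaddress module. That choice, and the sample addresses, are assumptions on my part.

```python
import ipaddress


def format_correct(text):
    # Stand-in for the question's formatCorrect: is this a valid
    # IPv4 or IPv6 address literal?
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


list_of_strings = ["192.168.0.1", "10.0.0.256", "::1"]

all_good = all(format_correct(x) for x in list_of_strings)
print(all_good)  # False

# The same comprehension syntax can also filter, e.g. to list the offenders.
bad = [x for x in list_of_strings if not format_correct(x)]
print(bad)  # ['10.0.0.256']
```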
An example of how you could do it:
in_str = ['str1', 'str2', 'str3', 'not']
in_str2 = ['str1', 'str2', 'str3']
def CheckIP(strN):
    # different than yours, just to show example.
    if 'str' in strN:
        return True
    else:
        return False
print(all(map(CheckIP, in_str)))   # gives False
print(all(map(CheckIP, in_str2)))  # gives True
L = [str1, str2, str3.....]
answer = list(map(CheckIP, L))
answer is a list of booleans such that answer[i] is CheckIP(L[i]). If you want to further check if all of those values are True, you could use all:
all(answer)
This returns True if and only if all the values in answer are True. However, you may do this without listifying:
all(map(CheckIP, L))
This works because, in Python 3, map returns an iterator, not a list. This way, you don't waste space turning everything into a list. You also save time, as the first False value makes all return False, stopping map from computing any remaining values.
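The short-circuit behaviour can be observed with a small sketch. The check function is a hypothetical stand-in for CheckIP that records each item it inspects, and the sample list is made up.

```python
checked = []  # records which items were actually inspected


def check(item):
    # Hypothetical stand-in for CheckIP: anything containing "str" passes.
    checked.append(item)
    return "str" in item


L = ["str1", "not", "str2", "str3"]
result = all(map(check, L))
print(result)   # False
print(checked)  # ['str1', 'not']; 'str2' and 'str3' were never checked
```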