Making a complicated list comprehension conditional


Here is my current one-liner:
leader = [server.get_member(x) for x in self.rosters[server.id][clan]['members'] if discord.utils.get(server.get_member(x).roles, id="463226598351699968")]
I want to only run this if server.get_member(x) is not False. How can I add this extra logic into this list comprehension? I understand how to do a basic for in statement, but nesting it deeper than that becomes a bit confusing for me.

In general, do not sacrifice readability for the sake of writing a one-liner. If it is not immediately obvious how to do it with a list comprehension, then use a for loop.
leader = []
for x in self.rosters[server.id][clan]['members']:
    member = server.get_member(x)
    if member and discord.utils.get(member.roles, id="463226598351699968"):
        leader.append(member)
Although, in this specific case, since you do not need x, you can use map to apply server.get_member while iterating.
leader = [m for m in map(server.get_member, self.rosters[server.id][clan]['members'])
if m and discord.utils.get(m.roles, id="463226598351699968")]

You can't. The item in a list comprehension can't be saved, so you'll have to evaluate it twice. Even if you could, don't. List comprehensions are for filtering, not for running code as a side effect. It's unreadable and prone to mistakes.

In general, you can achieve the effect of a temporary variable assignment with a nested list comprehension that iterates through a 1-tuple:
leader = [m for x in self.rosters[server.id][clan]['members'] for m in (server.get_member(x),) if m and discord.utils.get(m.roles, id="463226598351699968")]
But in this particular case, as @OlivierMelançon pointed out in the comments, since the additional assignment is simply mapping a value to a function call, you can achieve the desired result with the map function instead:
leader = [m for m in map(server.get_member, self.rosters[server.id][clan]['members']) if m and discord.utils.get(m.roles, id="463226598351699968")]
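To check that the 1-tuple form and the map form give the same result, here is a minimal self-contained sketch. The get_member function, the members table and the startswith test are hypothetical stand-ins for server.get_member, the roster and the role lookup; none of them come from discord.py.

```python
# Hypothetical stand-in for server.get_member: returns None for unknown ids.
members = {1: "alice", 2: "bob"}

def get_member(member_id):
    return members.get(member_id)

ids = [1, 3, 2]  # id 3 is unknown, so get_member(3) is None

# Bind the call result to m by iterating over a 1-tuple.
via_tuple = [m for x in ids for m in (get_member(x),) if m and m.startswith("a")]

# Same filter using map, since x is only ever passed to get_member.
via_map = [m for m in map(get_member, ids) if m and m.startswith("a")]

print(via_tuple, via_map)
```

In both forms the `m and ...` check short-circuits, so the second condition is never evaluated for a missing member.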

While I agree with the comments suggesting that, for readability, you should not write this as a comprehension, you could try the following. The check on server.get_member(x) has to come first, so that .roles is never accessed on a missing member:
leader = [server.get_member(x) for x in self.rosters[server.id][clan]['members'] if server.get_member(x) if discord.utils.get(server.get_member(x).roles, id="463226598351699968")]
Similar to this answer.

Related

Convert for loop into list comprehension with assignment?

I am trying to convert a for loop with an assignment into a list comprehension.
More precisely I am trying to only replace one element from a list with three indexes.
Can it be done?
for i in range(len(data)):
    data[i][0] = data[i][0].replace('+00:00','Z').replace(' ','T')
Best

If you really, really want to convert it to a list comprehension, you could try something like this, assuming the sub-lists have three elements, as you stated in the question:
new_data = [[a.replace('+00:00','Z').replace(' ','T'), b, c] for (a, b, c) in data]
Note that this does not modify the existing list but creates a new one. In this case I'd just stick with a regular for loop, which conveys much better what you are actually doing. Instead of iterating over the indices, though, you could iterate over the elements directly:
for x in data:
    x[0] = x[0].replace('+00:00','Z').replace(' ','T')
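The difference between building a new list and mutating the rows in place can be seen in a small sketch. The sample row and the alias name are made up for illustration:

```python
data = [["2020-01-01 10:00:00+00:00", 1, 2]]
alias = data  # a second reference to the same list object

# The comprehension builds a new list; the original rows stay untouched.
new_data = [[a.replace('+00:00', 'Z').replace(' ', 'T'), b, c] for a, b, c in data]
before = alias[0][0]  # still the unconverted string

# The loop mutates each row in place, so every reference sees the change.
for row in data:
    row[0] = row[0].replace('+00:00', 'Z').replace(' ', 'T')

print(before, new_data[0][0], alias[0][0])
```

If other parts of the program hold references to data, only the loop version updates what they see.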

I believe it could be done, but it is not the best way to do it.
First, you would create high Jones complexity for anyone else reading your code.
Second, you would exceed the preferred line length of 80 characters, which again makes the code harder to read.
Third, list comprehensions are meant to build and return new lists, whereas here you are modifying your original list, which is not good practice either.

List comprehension is useful when making lists. So, it is not recommended here. But still, you can try this simple solution -
print([ele[0].replace('+00:00','Z').replace(' ','T') for ele in data])

Although I don't recommend using a list comprehension in this case, here is an example if you really want to use one.
It can handle different length of data, if you need it.
code:
data = [["1 +00:00",""],["2 +00:00","",""],["3 +00:00"]]
print([[i[0].replace('+00:00','Z').replace(' ','T'),*i[1:]] for i in data])
result:
[['1TZ', ''], ['2TZ', '', ''], ['3TZ']]

Can the walrus operator be used to avoid multiple function calls within a list comprehension?

Let's say I have a list of lists like this
lol = [[1, 'e_r_i'], [2, 't_u_p']]
and I want to apply a function to the string elements which returns several values from which I need only a subset (which ones differ per use-case). For illustration purposes, I just make a simple split() operation:
def dummy(s):
    return s.split('_')
Now, let's say I only want the last two letters and concatenate those; there is the straightforward option
positions = []
for _, s in lol:
    stuff = dummy(s)
    positions.append(f"{stuff[1]}{stuff[2]}")
and doing the same in a list comprehension
print([f"{dummy(s)[1]}{dummy(s)[2]}" for _, s in lol])
both give the identical, desired outcome
['ri', 'up']
Is there a way to use the walrus operator here in the list comprehension to avoid calling dummy twice?
PS: Needless to say that in reality the dummy function is far more complex, so I don't look for a better solution regarding the split but it is fully about the structure and potential usage of the walrus operator.

I will have to say that your first explicit loop is the best option here. It is clear, readable code and you're not repeating any calls.
Still, as you asked for it, you could always do:
print([f"{(y:=dummy(s))[1]}{y[2]}" for _, s in lol])
You could also wrap the processing in another function:
def dummy2(l):
    return f"{l[1]}{l[2]}"
And this removes the need of walrus altogether and simplifies the code further:
print([dummy2(dummy(s)) for _, s in lol])

Yes. This is what you want
output = [f"{(stuff := dummy(s))[1]}{stuff[2]}" for _, s in lol]
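To confirm that the walrus version really calls the function only once per item, here is a small sketch (Python 3.8+). The call_log list is a hypothetical counter added purely for illustration:

```python
call_log = []  # hypothetical counter: records every call to dummy

def dummy(s):
    call_log.append(s)
    return s.split('_')

lol = [[1, 'e_r_i'], [2, 't_u_p']]

# Plain comprehension: dummy runs twice per item.
twice = [f"{dummy(s)[1]}{dummy(s)[2]}" for _, s in lol]
calls_twice = len(call_log)

call_log.clear()

# Walrus version: the result is bound to y and reused.
once = [f"{(y := dummy(s))[1]}{y[2]}" for _, s in lol]
calls_once = len(call_log)

print(twice, calls_twice, once, calls_once)
```

Note that the name bound by := lives in the scope enclosing the comprehension, so y remains defined after the comprehension finishes.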

Is there a more elegant way to remove Nones from list after list comprehension

I'm using a list comprehension to map a function that either returns a value or None.
My function looks like this (this is extremely simplified, just to give you a general idea)
def convertline(x):
    if x == 'undesirablevalue':
        return None
    else:
        # do some logic
        # do some logic
        # do some logic
        return somecalculatedvalue
and I have it iterating over a list in a list comprehension, like so. To filter out the Nones, I use a second list comprehension.
items = [convertline(line) for line in sample2.splitlines()]
items = [x for x in items if x is not None]
But the above code seems bulky.
I realized I could also do this:
items = [convertline(line) for line in sample2.splitlines() if convertline(line) is not None]
But this seems garbled, and I also do the math twice. Is there a better, more elegant way to do this? Both solutions seem kind of bulky. Thanks

There's really nothing wrong with your original approach. I would greatly prefer it over the approach that calls the function twice, which seems wasteful, especially if the function does a lot of work.
If you are using Python 3.8 or later, you can use an assignment expression:
[result for x in data if (result := foo(x)) is not None]
Alternatively, the following which uses map, only does a single pass and doesn't build an intermediate list:
[x for x in map(foo, data) if x is not None]
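A short sketch of both variants side by side (Python 3.8+ for the walrus form). The convertline body here is a hypothetical stand-in that returns the line length. It also shows why the explicit `is not None` test matters: a plain truthiness check would drop legitimate falsy results such as 0.

```python
def convertline(x):
    # Hypothetical converter: rejects one marker value, otherwise returns the length.
    if x == 'undesirablevalue':
        return None
    return len(x)

lines = ['abc', 'undesirablevalue', '', 'de']

# Assignment expression: one call per line, no intermediate list.
with_walrus = [r for line in lines if (r := convertline(line)) is not None]

# map: also one call per line, evaluated lazily.
with_map = [r for r in map(convertline, lines) if r is not None]

# Truthiness check: also discards the valid result 0.
with_truthiness = [r for r in map(convertline, lines) if r]

print(with_walrus, with_map, with_truthiness)
```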

You could do it the other way around:
def convertline(x):
    # do some logic
    # do some logic
    # do some logic
    return somecalculatedvalue
items = [convertline(line) for line in sample2.splitlines() if line != 'undesirablevalue']
A function that returns None is a bit weird in my opinion.

How to remove all duplicates from an iterable by attributes?

Given an iterable, e.g.
results = [ref_a,  # references big object A
           ref_b,  # references big object B
           ref_c,  # references big object A
           ref_d,  # references big object D
           ]
The references are each unique objects, but some reference the same (bigger) object.
I only want a set(or list) of references for unique objects.
My desired result is e.g.
custom_set = (ref_a,
              ref_b,
              ref_d,
              )
Remarks
The Python builtin set is not applicable, as the objects from the input are all different. This means set would return all elements.
I cannot change the class definition for the references, so I cannot implement a custom cmp/hash function or similar.
It does not matter if the final result contains ref_a or ref_c.
The initial result is a combination of the results of different APIs, which act independently - this is also the reason that the combined list can have references to the same (big) object.
I cannot store the result.reference only, as after the filtering, I need to access other attributes of the result. If I'd only store result.reference I would have to instantiate the costly object.
Sorry for using result as the input parameter, but I do not want to change it afterwards, as the answers would not fit to the question any more. I will remember this for a future question.
Maybe reference was also not the best naming - it is more like a lightweight proxy object.

Your code is fine, although you can solve this using itertools.groupby.
from itertools import groupby
from operator import attrgetter
f = attrgetter('reference')
custom_set = set(next(x) for _, x in groupby(sorted(results, key=f), f))
Both sorted and groupby are stable, so next(x) is guaranteed to be the first element in results with a particular value of the reference attribute.
One drawback to this approach is that sorted() takes O(n lg n) time, compared to your O(n) traversal of the list.
You could also write your code as a (mostly) one-liner, though I wouldn't recommend it:
known = set()
custom_set = set(known.add(r.reference) or r for r in results if r.reference not in known)
known.add(r.reference) always returns None, so the value of the or expression is always r, but the expression itself is only evaluated if r.reference isn't already in known. The or expression is just a way to work the side effect of updating known into the generator expression.

I came up with this solution, but there must be a better/ more pythonic one.
known = set()
custom_set = set()
for result in results:
    if result.reference not in known:
        known.add(result.reference)
        custom_set.add(result)
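A variation on the same idea, sketched under the assumption that the reference attribute is hashable (the loop above already relies on that): a dict keyed by reference does the bookkeeping and keeps the first proxy seen for each big object. The SimpleNamespace proxies and their names are hypothetical test data.

```python
from types import SimpleNamespace

# Hypothetical proxies: 'a' and 'c' point at the same big object.
big_a, big_b, big_d = object(), object(), object()
results = [SimpleNamespace(name=n, reference=r)
           for n, r in [('a', big_a), ('b', big_b), ('c', big_a), ('d', big_d)]]

unique = {}
for r in results:
    # setdefault only stores r if this reference has not been seen yet.
    unique.setdefault(r.reference, r)

custom_list = list(unique.values())
print([r.name for r in custom_list])
```

Because dicts preserve insertion order (Python 3.7+), the result keeps the order of first appearance, and the proxies themselves never need to be hashable.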

Try this
a = []
for i in results:
    if i not in a:
        a.append(i)
print(a)

Creating a Python list comprehension with an if and break with nested for loops

I noticed from this answer that the code
for i in userInput:
    if i in wordsTask:
        a = i
        break
can be written as a list comprehension in the following way:
next(i for i in userInput if i in wordsTask)
I have a similar problem which is that I would like to write the following (simplified from original problem) code in terms of a list comprehension:
for i in xrange(N):
    point = Point(long_list[i],lat_list[i])
    for feature in feature_list:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            new_list.append(feature['properties'])
            break
I expect each point to be associated with a single polygon from the feature list. Hence, once a polygon that contains the point is found, break is used to move on to the next point. Therefore, new_list will have exactly N elements.
I wrote it as a list comprehension as follows:
new_list = [feature['properties'] for i in xrange(1000) for feature in feature_list if shape(feature['geometry']).contains(Point(long_list[i],lat_list[i]))]
Of course, this doesn't take into account the break in the if statement, and therefore takes significantly longer than using nested for loops. Using the advice from the above-linked post (which I probably don't fully understand), I did
new_list2 = next(feature['properties'] for i in xrange(1000) for feature in feature_list if shape(feature['geometry']).contains(Point(long_list[i],lat_list[i])))
However, new_list2 has far fewer than N elements (in my case, N=1000 and new_list2 had only 5 elements).
Question 1: Is it even worth doing this as a list comprehension? The only reason is that I read that list comprehensions are usually a bit faster than nested for loops. With 2 million data points, every second counts.
Question 2: If so, how would I go about incorporating the break statement in a list comprehension?
Question 3: What was the error going on with using next in the way I was doing?
Thank you so much for your time and kind help.

List comprehensions are not necessarily faster than a for loop. If you have a pattern like:
some_var = []
for ...:
    if ...:
        some_var.append(some_other_var)
then yes, the list comprehension is faster than the bunch of .append()s. You have extenuating circumstances, however. For one thing, it is actually a generator expression in the case of next(...) because it doesn't have the [ and ] around it.
You aren't actually creating a list (and therefore not using .append()). You are merely getting one value.
Your generator calls Point(long_list[i], lat_list[i]) once for each feature for each i in xrange(N), whereas the loop calls it only once for each i.
And, of course, your generator expression doesn't work.
Why doesn't your generator expression work? Because it finds only the first value overall. The loop, on the other hand, finds the first value for each i. You see the difference? The generator expression breaks out of both loops, but the for loop breaks out of only the inner one.
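The difference is easy to reproduce with toy data. In this sketch, matches is a hypothetical stand-in for polygon.contains(point), and the integer lists stand in for the points and features:

```python
points = [1, 2, 3]
features = [10, 20, 30]

def matches(feature, point):
    # Hypothetical stand-in for polygon.contains(point).
    return feature == point * 10

# A single next() over the flattened double loop stops at the first match overall.
first_overall = next(f for p in points for f in features if matches(f, p))

# One next() per point only cuts the inner search short.
first_per_point = [next(f for f in features if matches(f, p)) for p in points]

print(first_overall, first_per_point)
```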
If you want a slight improvement in performance, use itertools.izip() (or just zip() in Python 3):
from itertools import izip
for long, lat in izip(long_list, lat_list):
    point = Point(long, lat)
    ...

I don't know that complex list comprehensions or generator expressions are that much faster than nested loops if they're running the same algorithm (e.g. visiting the same number of values). To get a definitive answer you should probably try to implement a solution both ways and test to see which is faster for your real data.
As for how to short-circuit the inner loop but not the outer one, you'll need to put the next call inside the main list comprehension, with a separate generator expression inside of it:
new_list = [next(feature['properties'] for feature in feature_list
                 if shape(feature['geometry']).contains(Point(long, lat)))
            for long, lat in zip(long_list, lat_list)]
I've changed up one other thing: Rather than indexing long_list and lat_list with indexes from a range I'm using zip to iterate over them in parallel.
Note that if creating the Point objects over and over ends up taking too much time, you can streamline that part of the code by adding in another nested generator expression that creates the points and lets you bind them to a (reusable) name:
new_list = [next(feature['properties'] for feature in feature_list
                 if shape(feature['geometry']).contains(point))
            for point in (Point(long, lat) for long, lat in zip(long_list, lat_list))]
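One caveat with this pattern: if some point lies in no polygon, next() with a single argument raises StopIteration. Passing a default as the second argument avoids that. The sketch below uses hypothetical one-dimensional "features" with lo/hi bounds in place of real polygons, purely to keep it self-contained:

```python
# Hypothetical 1-D stand-ins for polygons: each feature covers [lo, hi).
features = [{'lo': 0, 'hi': 10, 'properties': 'A'},
            {'lo': 10, 'hi': 20, 'properties': 'B'}]
values = [5, 15, 99]  # 99 falls outside every feature

def contains(feature, v):
    return feature['lo'] <= v < feature['hi']

# The second argument to next() is returned when no feature matches.
new_list = [next((f['properties'] for f in features if contains(f, v)), None)
            for v in values]

print(new_list)
```

With a default in place, the result always has one entry per input point, and unmatched points show up as None instead of aborting the whole comprehension.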
