Reverse indexing a list of lists in Python - python

This should be trivial. Yet I don't feel 100% sure about my trick.
I have a list of lists (lol ;)) that captures edge relationships between nodes of a graph. Let's say I have a directed graph with 4 nodes labeled 0, 1, 2, 3. The edges are {(0,2),(0,3),(1,0),(1,3),(2,1)} and so the adjacency lol (call it a) is
a = [[2,3],[0,3],[1],[]]
I want to find the incidence lol now, i.e. a list of lists which indicate which nodes are incident on which nodes. For this example, the incidence lol (call it b) would be:
[[1], [2], [0], [0, 1]]
I tried the following code:
b = [[],[],[],[]]
[b[j].append(i) for i,x in enumerate(a) for j in x]
This gives me the right incidence matrix b.
The second step, although works, should ideally be b[j].append(i) for i,x in enumerate(a) for j in x, without the opening [ and closing ]. But Python interpreter cries syntax error without it. Is there a better way of phrasing it?

Your question is essentially about using list comprehensions for side effects. As, e.g. the answers to this question say, breaking it down into a for loop (or loops) is the way to go:
for i, x in enumerate(a):
for j in x:
b[j].append(i)
Also, please note that list comprehensions are used to construct lists in a very natural, easy way, like a mathematician is used to do. That is why, in Python, syntax requires square brackets (in your case).

Related

Writing list comprehensions dependent on previous values?

I've seen this question before, but it only deals with recursions that are linear in nature. I'm looking for something more general.
Suppose I have the following code
n = 10
num_bits = [0]
for i in range(n):
nums_bits.append(num_bits[i>>1]+i%2)
This code will compute num_bits, a 11 element array of value with num_bits[i] representing the number of bits it takes to represent i.
Is it possible to write this as a list comprehension? Something like this doesn't work
num_bits = [0]*11
num_bits = [num_bits[i>>1]+i%2 for i in range(11)]
since the comprehension doesn't update the value of num_bits in the middle of evaluation. Is there a canonical way to do something like this, besides a for loop?
P.S. I'm aware there are other ways to solve this problem: I'm just using it as a vehicle to understand Python's features better.
Edit: To summarize, I'd like to know what the proper way of generating lists of values that are dependent on previous values is. For a simpler example, consider the Fibonacci Numbers
fibonacci = [0,1]
for i in range(10):
fibonacci.append(fibonacci[-1]+fibonacci[-2])
Is there a way to generate these numbers in a comprehension? If not, what tools are there for this other than for loops (or are for/while loops my only option)?
Given it is not a piece of code I'd recommend, for the reasons discussed in the comments above and in the other answer, this comprehension should be faster than the for loop:
fibonacci = [0,1]
deque((fibonacci.append(fibonacci[-1]+fibonacci[-2]) for _ in range(10)), maxlen=0)
as it fills the list consuming the generator and discarding the result (an empty queue, it's the fastest recommended way to consume an iterator)
It produces:
>>> fibonacci
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
No.
There is no nice way to do this with a list comprehension, and that is not what they’re for. The purpose of list comprehensions is to offer a more readable alternative to maps and filters, and this isn’t that, so it’s not possible to do this in a sensible way.

How does the syntax `self.game_map.tiles["walkable"][x,y]` work? [duplicate]

I'm completely new at Python and have an assignment coming up. The professor has asked that we look at examples of users coding Pascal's Triangle in Python for something that will be 'similar'.
I managed to find several ways to code it but I found several people using some code that I don't understand.
Essentially, I'm looking to find out what it means (or does) when you see a list or variable that has two square brackets side by side. Example code:
pascalsTriangle = [[1]]
rows = int(input("Number of rows:"))
print(pascalsTriangle[0])
for i in range(1,rows+1):
pascalsTriangle.append([1])
for j in range(len(pascalsTriangle[i-1])-1):
pascalsTriangle[i].append(pascalsTriangle[i-1][j]+ pascalsTriangle[i-1][j+1])
pascalsTriangle[i].append(1)
print(pascalsTriangle[i])
You'll see that line 7 has this:
pascalsTriangle[i].append(pascalsTriangle[i-1][j]+pascalsTriangle[i-1][j+1])
I know that square brackets are lists. I know that square brackets within square brackets are lists within/of lists. Can anyone describe what a square bracket next to a square bracket is doing?
If you have a list
l = ["foo", "bar", "buz"]
Then l[0] is "foo", l[1] is "bar", l[2] is buz.
Similarly you could have a list in it instead of strings.
l = [ [1,2,3], "bar", "buz"]
Now l[0] is [1,2,3].
What if you want to access the second item in that list of numbers? You could say:
l[0][1]
l[0] first gets you the list, then [1] picks out the second number in it. That's why you have "square bracket next to square bracket".
Square brackets are used to define lists, but also to get things from lists.
When you have a list of lists and want something from an inner list, you need to get that inner list (using brackets) and then get the desired thing inside (using brackets again).
lol = [[1, 2, 3], [4, 5, 6]]
lol[1]
# [4, 5, 6]
lol[1][0]
# 4

How to eliminate items in list of lists in fast way while imposing some restrictions on its items?

My very first post and question here...
So, let list_a be the list of lists:
list_a = [[2,7,8], [3,4,2], [5,10], [4], [2,3,5]...]
Let list_b be another list of integers: list_b = [5,7]
I need to exclude all lists in list_a, whose items include at least one item from list_b. The result from example above schould look like list_c = [[3,4,2], [4]...]
If list_b was not a list but a single number b, then one could define list_c in one line as:
list_c = [x for x in list_a if not b in x]
I am wondering, if it is possible to write an elegant one-liner also for the list list_b with several values in it. Of course, I can just loop through all list_b's values, but may be there exists a faster option?
Let's first consider the task of checking an individual element of list_a - such as [2,7,8] - because no matter what, we're conceptually doing to need a way to do that, and then we're going to apply that to the list with a list comprehension. I'll use a as the name for such a list, and b for an element of list_b.
The straightforward way to write this is using the any builtin, which works elegantly in combination with generator expressions: any(b in a for b in list_b).
The logic is simple: we create a generator expression (like a lazily-evaluated list comprehension) to represent the result of the b in a check applied to each b in list_b. We create those by replacing the [] with (); but due to a special syntax rule we may drop these when using it as the sole argument to a function. Then any does exactly what it sounds like: it checks (with early bail-out) whether any of the elements in the iterable (which includes generator expressions) is truthy.
However, we can likely do better by taking advantage of set intersection. The key insight is that the test we are trying to do is symmetric; considering the test between a and list_b (and coming up with another name for elements of a), we could equally have written any(x in list_b for x in a), except that it's harder to understand that.
Now, it doesn't help to make a set from a, because we have to iterate over a anyway in order to do that. (The generator expression does that implicitly; in used for list membership requires iteration.) However, if we make a set from list_b, then we can do that once, ahead of time, and just have any(x in set_b for x in a).
But that, in turn, is a) as described above, hard to understand; and b) overlooking the built-in machinery of sets. The operator & normally used for set intersection requires a set on both sides, but the named method .intersection does not. Thus, set_b.intersection(a) does the trick.
Putting it all together, we get:
set_b = set(list_b)
list_c = [a for a in list_a if not set_b.intersection(a)]
You can write the logic all sublists in A where none of the elements of B are in the sublist with a list comprehension like:
A = [[2,7,8], [3,4,2], [5,10], [4], [2,3,5]]
B = [5,7]
[l for l in A if not any(n in l for n in B)]
# [[3, 4, 2], [4]]
The condition any(n in l for n in B) will be true if any element, n, of B is in the sublist, l, from A. Using not we can take the opposite of that.
Mark's answer is good but hard to read.
FYI, you can also leverage sets:
>>> set_b = set(list_b)
>>> [l for l in list_a if not set_b.intersection(l)]
[[3, 4, 2], [4]]

Divide a Python list every n-element

Recently, I've found a code for developing a set of lists from a list, that code was written by user Mai, answering question but i have not understood it yet. Could somebody help me to understand it? And... is there a way to rewrite that code that be easier? The code is:
def even_divide(lst, num_piece=4):
return [
[lst[i] for i in range(len(lst)) if (i % num_piece) == r]
for r in range(num_piece)
]
Thanks!
It's pretty simple actually. Just follow through the values of the two loops:
Starting with the outer loop, r would be 0, then 1, then 2, etc. Let's look at the case for which r == 1. When running through the different values of i, (which would be 0, 1, 2, ... len(lst), the value of i % 4, meaning the remainder of dividing i by 4, would be 0, 1, 2, 3, 0, 1, 2, 3, .... So the i % 4 would be equal to r, for every 4 values of i!
For our chosen r == 1, that would mean we're choosing lst[1], lst[5], lst[9], ..., etc.
And for r == 2? You guessed it! You'd be picking up lst[2], lst[6], lst[10],....
So over all you'd get 4 lists, with non-overlapping elements of the original list, by just "jumping" 4 elements every time, but starting at different values.
Which naturally leads to the more simple solution:
def even_divide(lst, num_piece=4):
return [lst[r::num_piece] for r in range(num_piece)]
Could somebody help me to understand it?
Sure! It's a list comprehension. A list comprehension takes a list and does something to or with every element in that list. Let's say I want to multiply every element in my list by 2:
new_list = [element*2 for element in my_list]
What makes it a list comprehension is the bracket syntax. For those new to it, that's usually the part that takes a moment to get used to. With that said, I assume that is what is giving you difficulty in understanding the code in your question, as you have a list comprehension in a list comprehension. It might be difficult to understand now, but list comprehensions are a wonderful thing in python.
But, as this post mentions, there's a lot of discussions around list comprehension, lambda's, map, reduce, and filter. Ultimately, its up to you to decide what's best for your project. I'm not a fan of anything else but list comprehensions, so I use those religiously.
Based on the question you've linked, the list comprehension takes a 1d list of length x and turns it into a 2d list of (length x, width y). It's like numpy.reshape.
And... is there a way to rewrite that code [to] be easier?
I would not recommend it. List comprehensions are considered very pythonic and you will see them everywhere. Best to use them and get used to them.

Python: Fast extraction of intersections among all possible 2-combinations in a large number of lists

I have a dataset of ca. 9K lists of variable length (1 to 100K elements). I need to calculate the length of the intersection of all possible 2-list combinations in this dataset. Note that elements in each list are unique so they can be stored as sets in python.
What is the most efficient way to perform this in python?
Edit I forgot to specify that I need to have the ability to match the intersection values to the corresponding pair of lists. Thanks everybody for the prompt response and apologies for the confusion!
If your sets are stored in s, for example:
s = [set([1, 2]), set([1, 3]), set([1, 2, 3]), set([2, 4])]
Then you can use itertools.combinations to take them two by two, and calculate the intersection (note that, as Alex pointed out, combinations is only available since version 2.6). Here with a list comrehension (just for the sake of the example):
from itertools import combinations
[ i[0] & i[1] for i in combinations(s,2) ]
Or, in a loop, which is probably what you need:
for i in combinations(s, 2):
inter = i[0] & i[1]
# processes the intersection set result "inter"
So, to have the length of each one of them, that "processing" would be:
l = len(inter)
This would be quite efficient, since it's using iterators to compute every combinations, and does not prepare all of them in advance.
Edit: Note that with this method, each set in the list "s" can actually be something else that returns a set, like a generator. The list itself could simply be a generator if you are short on memory. It could be much slower though, depending on how you generate these elements, but you wouldn't need to have the whole list of sets in memory at the same time (not that it should be a problem in your case).
For example, if each set is made from a function gen:
def gen(parameter):
while more_sets():
# ... some code to generate the next set 'x'
yield x
with open("results", "wt") as f_results:
for i in combinations(gen("data"), 2):
inter = i[0] & i[1]
f_results.write("%d\n" % len(inter))
Edit 2: How to collect indices (following redrat's comment).
Besides the quick solution I answered in comment, a more efficient way to collect the set indices would be to have a list of (index, set) instead of a list of set.
Example with new format:
s = [(0, set([1, 2])), (1, set([1, 3])), (2, set([1, 2, 3]))]
If you are building this list to calculate the combinations anyway, it should be simple to adapt to your new requirements. The main loop becomes:
with open("results", "wt") as f_results:
for i in combinations(s, 2):
inter = i[0][1] & i[1][1]
f_results.write("length of %d & %d: %d\n" % (i[0][0],i[1][0],len(inter))
In the loop, i[0] and i[1] would be a tuple (index, set), so i[0][1] is the first set, i[0][0] its index.
As you need to produce a (N by N/2) matrix of results, i.e., O(N squared) outputs, no approach can be less than O(N squared) -- in any language, of course. (N is "about 9K" in your question). So, I see nothing intrinsically faster than (a) making the N sets you need, and (b) iterating over them to produce the output -- i.e., the simplest approach. IOW:
def lotsofintersections(manylists):
manysets = [set(x) for x in manylists]
moresets = list(manysets)
for s in reversed(manysets):
moresets.pop()
for z in moresets:
yield s & z
This code's already trying to add some minor optimization (e.g. by avoiding slicing or popping off the front of lists, which might add other O(N squared) factors).
If you have many cores and/or nodes available and are looking for parallel algorithms, it's a different case of course -- if that's your case, can you mention the kind of cluster you have, its size, how nodes and cores can best communicate, and so forth?
Edit: as the OP has casually mentioned in a comment (!) that they actually need the numbers of the sets being intersected (really, why omit such crucial parts of the specs?! at least edit the question to clarify them...), this would only require changing this to:
L = len(manysets)
for i, s in enumerate(reversed(manysets)):
moresets.pop()
for j, z in enumerate(moresets):
yield L - i, j + 1, s & z
(if you need to "count from 1" for the progressive identifiers -- otherwise obvious change).
But if that's part of the specs you might as well use simpler code -- forget moresets, and:
L = len(manysets)
for i xrange(L):
s = manysets[i]
for j in range(i+1, L):
yield i, j, s & manysets[z]
this time assuming you want to "count from 0" instead, just for variety;-)
Try this:
_lists = [[1, 2, 3, 7], [1, 3], [1, 2, 3], [1, 3, 4, 7]]
_sets = map( set, _lists )
_intersection = reduce( set.intersection, _sets )
And to obtain the indexes:
_idxs = [ map(_i.index, _intersection ) for _i in _lists ]
Cheers,
José María García
PS: Sorry I misunderstood the question

Categories

Resources