Python: Different methods to initialize 2D arrays gives different outputs [duplicate] - python

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 2 years ago.
I was solving a few questions involving dynamic programming. I initialized the dp table as -
n = 3
dp = [[False]*n]*n
print(dp)
#Output - [[False, False, False], [False, False, False], [False, False, False]]
Followed by which I set the diagonal elements to True using -
for i in range(n):
    dp[i][i] = True
print(dp)
#Output - [[True, True, True], [True, True, True], [True, True, True]]
However, the above sets every value in dp to True. But when I initialize dp as -
dp = [[False]*n for i in range(n)]
Followed by setting diagonal elements to True, I get the correct output - [[True, False, False], [False, True, False], [False, False, True]]
So how exactly does the star operator generate values of the list?

When you do dp = [[False]*n]*n, you get a list containing the same inner list n times, so when you modify one row, all of them appear modified. That's why, with that loop over n, you seemingly modify all n^2 elements.
You can check it like this:
[id(x) for x in dp]
> [1566380391432, 1566380391432, 1566380391432] # you'll see the same value n times
With dp = [[False]*n for i in range(n)] you are creating different lists n times. Let's try again for this dp:
[id(x) for x in dp]
[1566381807176, 1566381801160, 1566381795912] # all different
In general, use * to replicate immutable values, and use for ... in a comprehension to build mutable values (like lists).
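For the dp example in the question, a minimal sketch of the comprehension-based initialization followed by the diagonal assignment:
n = 3
dp = [[False] * n for _ in range(n)]  # n independent rows
for i in range(n):
    dp[i][i] = True
print(dp)  # [[True, False, False], [False, True, False], [False, False, True]]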

Your issue is that in the first example, you are not actually creating more lists.
To explain, let's go through the example line by line.
First, you create a new list [False]*3. This creates a list containing the value False three times.
Next, you create another list holding a reference to that first list. Note that the first list is not copied; only a reference is stored.
Next, by multiplying by 3 you create a list with 3 references to the same list. Since these are only references, a change made through one of them is visible through the others too.
This is why assigning dp[i][i] = True actually sets element i of all three "rows" to True, since all three are the same list. So if you do this for every i, every value in the single underlying list ends up set to True.
The second option actually creates 3 separate lists, so the code works properly.
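To make the sharing concrete, here is a small sketch that gives the inner list an explicit name (inner is just an illustrative name):
inner = [False] * 3
dp = [inner] * 3                 # three references to the same inner list
print(dp[0] is dp[1] is dp[2])   # True: all "rows" are the same object
dp[0][1] = True
print(inner)                     # [False, True, False] - the one shared list changed
print(dp)                        # every row shows the change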

Related

Get all possible boolean sequences for a given list length

Given an arbitrary list length, how can I efficiently get all possible sequences of boolean values, except the sequences that are all True or all False?
For instance, given the number 3 it should return something like the following.
[True, False, False],
[True, True, False],
[True, False, True],
[False, True, False],
[False, True, True],
[False, False, True],
Is there already a known function that does this?
The order that it returns the sequences in is not important. I mainly just need a count of how many sequences are possible for a given list length.
This is mostly a maths question, unless you need the sequences themselves. If you do, there is a neat python solution:
from itertools import product
[seq for seq in product((True, False), repeat=3)][1:-1]
The list comprehension will contain all possible sequences, but we don't want (True, True, True) and (False, False, False). Conveniently, these will be the first and last element respectively, so we can simply discard them, using slicing from 1 to -1.
For sequences with different lengths, just change the "repeat" optional argument of the itertools.product function.
You don't need a function to determine this. Simple math will do the trick.
2**n - 2
2 because there are only two options (True/False)
n is your list length
-2 because you want to exclude the all True and all False results
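As a quick cross-check, a small sketch comparing the formula with a brute-force enumeration via itertools.product:
from itertools import product
n = 3
sequences = list(product((True, False), repeat=n))[1:-1]  # drop all-True and all-False
print(len(sequences))  # 6
print(2**n - 2)        # 6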
This is more of a maths question, but here goes:
The total number of options is the product of the number of options per position, so, if you receive 3 as input:
index[0] could be true or false - 2
index[1] could be true or false - 2
index[2] could be true or false - 2
That gives 2 * 2 * 2 = 8 combinations in total; excluding the all-True and all-False sequences leaves 6 options.

Why do '*' vs list comprehension when making list of lists work differently? [duplicate]

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 2 years ago.
My purpose was to make a 2D list using *
list2 = [[False]*2]*3 #[[False,False],[False,False],[False,False]]
list3 = [[False]*2 for _ in range(3)] #same as list2
list2[0][0] = True # [[True, False], [True, False], [True, False]]
list3[0][0] = True # [[True, False], [False, False], [False, False]]
list3 works as expected but list2 doesn't: assigning to list2[z][x] changes column x in every row of list2.
What happened?
The * operator doesn't copy the inner list; it stores the same reference to it multiple times, so a change made through one reference shows up everywhere. The comprehension's for _ in range(3) evaluates [False]*2 on each iteration, producing three independent lists, hence only list3[0][0] changes.
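One way to see the difference is to build each row through a helper with a visible side effect (make_row is just an illustrative name):
def make_row():
    print("building a row")
    return [False] * 2

list2 = [make_row()] * 3                # prints once: one row, referenced three times
list3 = [make_row() for _ in range(3)]  # prints three times: three separate rows
print(list2[0] is list2[1])             # True
print(list3[0] is list3[1])             # False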

Nested List declaration [duplicate]

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 3 years ago.
mytest=[[False]*3]*2
In [46]: mytest
Out[46]: [[False, False, False], [False, False, False]]
In [47]: mytest[0][1]=True
In [48]: mytest
Out[48]: [[False, True, False], [False, True, False]]
On the other hand
mytest=[ [False]*3 for i in range(2)]
In [53]: mytest[0][1]=True
In [54]: mytest
Out[54]: [[False, True, False], [False, False, False]]
In the first case, when I set [0][1], it gets set in two places, but in the second it is set correctly. What is wrong with the first assignment?
This is how Python handles objects. In your first example, the list mytest contains the same [False, False, False] list twice (i.e., both items point to the same object in memory). When you change one, the other changes as well, simply because both refer to the same list in memory.
In the second example, and when you are using list comprehension the two lists [False, False, False] are two different objects pointing to different memory locations.
Proof
>>> mytest=[[False]*3]*2
>>> id(mytest[0])
4340367304
>>> id(mytest[1])
4340367304
>>> mytest=[ [False]*3 for i in range(2)]
>>> id(mytest[0])
4340436936
>>> id(mytest[1])
4340498512
The difference between the first and the second statement is that the first evaluates [False] * 3 once, giving [False, False, False], and then * 2 creates two references to that single object. In the second example, a new [False, False, False] is created on each iteration.

How can I check that a Python list contains only True and then only False using one or two lines?

I would like to only allow lists where the first contiguous group of elements are True and then all of the remaining elements are False. I want lists like these examples to return True:
[True]
[False]
[True, False]
[True, False, False]
[True, True, True, False]
And lists like these to return False:
[False, True]
[True, False, True]
I am currently using this function, but I feel like there is probably a better way of doing this:
def my_function(x):
    n_trues = sum(x)
    should_be_true = x[:n_trues]  # get the first n items
    should_be_false = x[n_trues:len(x)]  # get the remaining items
    # return True only if all of the first n elements are True and the remaining
    # elements are all False
    return all(should_be_true) and all([not element for element in should_be_false])
Testing:
test_cases = [[True], [False],
              [True, False],
              [True, False, False],
              [True, True, True, False],
              [False, True],
              [True, False, True]]
print([my_function(test_case) for test_case in test_cases])
# expected output: [True, True, True, True, True, False, False]
Is it possible to use a comprehension instead to make this a one/two line function? I know I could not define the two temporary lists and instead put their definitions in place of their names on the return line, but I think that would be too messy.
Method 1
You could use itertools.groupby. This would avoid doing multiple passes over the list and would also avoid creating the temp lists in the first place:
from itertools import groupby

def check(x):
    status = list(k for k, g in groupby(x))
    return len(status) <= 2 and (status[0] is True or status[-1] is False)
This assumes that your input is non-empty and already all boolean. If that's not always the case, adjust accordingly:
def check(x):
    status = list(k for k, g in groupby(map(bool, x)))
    return status and len(status) <= 2 and (status[0] or not status[-1])
If you want to have empty arrays evaluate to True, either special case it, or complicate the last line a bit more:
return not status or (len(status) <= 2 and (status[0] or not status[-1]))
Method 2
You can also do this in one pass using an iterator directly. This relies on the fact that any and all are guaranteed to short-circuit:
def check(x):
    iterator = iter(x)
    # process the true elements
    all(iterator)
    # check that there are no true elements left
    return not any(iterator)
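For reference, a quick sketch running this check against the test cases from the question:
test_cases = [[True], [False],
              [True, False],
              [True, False, False],
              [True, True, True, False],
              [False, True],
              [True, False, True]]
print([check(tc) for tc in test_cases])
# [True, True, True, True, True, False, False]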
Personally, I think method 1 is total overkill. Method 2 is much nicer and simpler, and achieves the same goals faster. It also stops immediately if the test fails, rather than having to process the whole group. It also doesn't allocate any temporary lists at all, even for the group aggregation. Finally, it handles empty and non-boolean inputs out of the box.
Since I'm writing on mobile, here's an IDEOne link for verification: https://ideone.com/4MAYYa

itertools and strided list assignment

Given a list, e.g. x = [True]*20, I want to assign False to every other element.
x[::2] = False
raises TypeError: must assign iterable to extended slice
So I naively assumed you could do something like this:
x[::2] = itertools.repeat(False)
or
x[::2] = itertools.cycle([False])
However, as far as I can tell, this results in an infinite loop. Why is there an infinite loop? Is there an alternative approach that does not involve knowing the number of elements in the slice before assignment?
EDIT: I understand that x[::2] = [False] * (len(x) // 2) works in this case, or you can come up with an expression for the multiplier on the right side in the more general case. I'm trying to understand what causes itertools to cycle indefinitely and why list assignment behaves differently from numpy array assignment. I think there must be something fundamental about Python I'm misunderstanding. I was also thinking originally there might be performance reasons to prefer itertools to list comprehension or creating another n-element list.
What you are attempting to do in this code is not what you think (I suspect).
For instance:
x[::2] returns a slice containing every other element of x, starting with the first; since x is of size 20,
the slice is of size 10, but you are trying to assign a single non-iterable False to it.
To successfully use the code you have, you will need to do:
x = [True]*20
x[::2] = [False]*10
which assigns an iterable of size 10 to a slice of size 10.
Why work in the dark with the number of elements? Use
len(x[::2])
which would be equal to 10, and then use
x[::2] = [False]*len(x[::2])
you could also do something like:
x = [True if (index & 0x1 == 0) else False for index, element in enumerate(x)]
EDIT: Due to OP edit
The documentation on cycle says that it "repeats indefinitely", which means it will continuously cycle through the iterable it has been given.
repeat behaves similarly; its documentation states that it
"runs indefinitely unless the times argument is specified",
which has not been done in the question's code. Thus both lead to infinite loops.
About the comment that itertools is faster: yes, the itertools functions are generally faster than other implementations because they are optimised to be as fast as the authors could make them.
However if you do not want to recreate a list you can use generator expressions such as the following:
x = (True if (index & 0x1 == 0) else False for index, element in enumerate(x))
which do not store all of their elements in memory but produce them as they are needed; however, generators can be exhausted.
for instance:
x = [True]*20
print(x)
y = (True if (index & 0x1 == 0) else False for index, element in enumerate(x))
print ([a for a in y])
print ([a for a in y])
will print x, then the elements of the generator y, then an empty list, because the generator has been exhausted.
As Mark Tolonen pointed out in a concise comment, the reason why your itertools attempts are cycling indefinitely is because, for the list assignment, python is checking the length of the right hand side.
Now to really dig in...
When you say:
x[::2] = itertools.repeat(False)
The left hand side (x[::2]) is a list, and you are assigning a value to a list where the value is the itertools.repeat(False) iterable, which will iterate forever since it wasn't given a length (as per the docs).
If you dig into the list assignment code in the cPython implementation, you'll find the unfortunately/painfully named function list_ass_slice, which is at the root of a lot of list assignment stuff. In that code you'll see this segment:
v_as_SF = PySequence_Fast(v, "can only assign an iterable");
if (v_as_SF == NULL)
    goto Error;
n = PySequence_Fast_GET_SIZE(v_as_SF);
Here it is trying to get the length (n) of the iterable you are assigning to the list. However, before even getting there it is getting stuck on PySequence_Fast, where it ends up trying to convert your iterable to a list (with PySequence_List), within which it ultimately creates an empty list and tries to simply extend it with your iterable.
To extend the list with the iterable, it uses listextend(), and in there you'll see the root of the problem:
/* Run iterator to exhaustion. */
for (;;) {
and there you go.
Or at least I think so... :) It was an interesting question, so I thought I'd have some fun and dig through the source to see what was up, and ended up there.
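A small sketch of the same point from the Python side: because the right-hand side is materialized into a sequence first, a bounded iterator works fine, while an unbounded one never finishes:
import itertools

x = [True] * 20
x[::2] = itertools.islice(itertools.repeat(False), 10)  # bounded: materializes to 10 items
print(x[:4])  # [False, True, False, True]

# x[::2] = itertools.repeat(False)  # unbounded: "run iterator to exhaustion" never returns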
As to the different behaviour with numpy arrays, it will simply be a difference in how the numpy.array assignments are handled.
Note that using itertools.repeat doesn't work in numpy, but it doesn't hang up (I didn't check the implementation to figure out why):
>>> import numpy, itertools
>>> x = numpy.ones(10,dtype='bool')
>>> x[::2] = itertools.repeat(False)
>>> x
array([ True, True, True, True, True, True, True, True, True, True], dtype=bool)
>>> #but the scalar assignment does work as advertised...
>>> x = numpy.ones(10,dtype='bool')
>>> x[::2] = False
>>> x
array([False, True, False, True, False, True, False, True, False, True], dtype=bool)
Try this:
l = len(x)
x[::2] = itertools.repeat(False, (l + 1) // 2)  # the slice has ceil(len(x) / 2) elements
Your original solution ends up in an infinite loop because that's what repeat is supposed to do, from the documentation:
Make an iterator that returns object over and over again. Runs indefinitely unless the times argument is specified.
The slice x[::2] here is exactly len(x)//2 elements long (x has an even length of 20), so you could achieve what you want with:
x[::2] = [False]*(len(x)//2)
The itertools.repeat and itertools.cycle functions are designed to yield values indefinitely. However, you can specify a limit on repeat(). Like this:
x[::2] = itertools.repeat(False, len(x)//2)
The right hand side of an extended slice assignment needs to be an iterable of the right size (ten, in this case).
Here is it with a regular list on the righthand side:
>>> x = [True] * 20
>>> x[::2] = [False] * 10
>>> x
[False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True]
And here it is with itertools.repeat on the righthand side.
>>> from itertools import repeat
>>> x = [True] * 20
>>> x[::2] = repeat(False, 10)
>>> x
[False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True, False, True]
