Weird list behavior in python

qtd_packs = 2
size_pack = 16
pasta = []
pasta.append('packs/krun/')
pasta.append('packs/parting2/')
samples = []
samples_in = []
for k in range(0, qtd_packs):
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in)
    del samples_in[0:len(samples_in)]
print(samples)
I'm basically trying to add samples_in to the samples list, then clear the old samples_in list to build a new one. This should happen twice, since qtd_packs = 2. But in the end, what I get is two empty lists:
[[], []]
I appended samples_in to samples BEFORE deleting its contents. So what happened?
Thank you

In Python, appending a list does not copy it. When you append samples_in to samples, Python stores a reference to the same list object in samples, so clearing samples_in also clears what samples holds. If you want to append a copy of samples_in to samples, you can do:
samples.append(samples_in[:])
This effectively creates a new list from all the items in samples_in and passes that new list into samples.append(). So now when you clear the items in samples_in, you're not clearing the items in the list that was appended to samples as well.
Also, note that samples_in[:] is equivalent to samples_in[0:len(samples_in)].
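As a small sketch of the copying idea above (the file names are made-up sample data), here are three common ways to make a shallow copy of a list. Any of them could be passed to samples.append():

```python
samples_in = ['a.wav', 'b.wav']  # made-up sample data

# Three common ways to make a shallow copy of a list.
by_slice = samples_in[:]
by_constructor = list(samples_in)
by_method = samples_in.copy()  # list.copy() is available in Python 3.3+

# Clearing the original leaves every copy untouched.
del samples_in[:]

print(samples_in)      # []
print(by_slice)        # ['a.wav', 'b.wav']
print(by_constructor)  # ['a.wav', 'b.wav']
print(by_method)       # ['a.wav', 'b.wav']
```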

The problem is that after this:
samples.append(samples_in)
The newly-appended value in samples is not a copy of samples_in, it's the exact same value. You can see this from the interactive interpreter:
>>> samples_in = [0]
>>> samples = []
>>> samples.append(samples_in)
>>> samples[-1] is samples_in
True
>>> id(samples[-1]), id(samples_in)
(12345678, 12345678)
Using an interactive visualizer might make it even easier to see what's happening.
So, when you modify the value through one name, like this:
>>> del samples_in[0:len(samples_in)]
The same modification is visible through both names:
>>> samples[-1]
[]
Once you realize that both names refer to the same value, that should be obvious.
As a side note, del samples_in[:] would do the exact same thing as del samples_in[0:len(samples_in)], because those are already the defaults for a slice.
What if you don't want the two names to refer to the same value? Then you have to explicitly make a copy.
The copy module has functions that can make a copy of (almost) anything, but many types have a simpler way to do it. For example, samples_in[:] asks for a new list, which copies the slice from 0 to the end (again, those are the defaults). So, if you'd done this:
>>> samples.append(samples_in[:])
… you would have a new value in samples[-1]. Again, you can test that easily:
>>> samples[-1], samples_in
([0], [0])
>>> samples[-1] == samples_in
True
>>> samples[-1] is samples_in
False
>>> id(samples[-1]), id(samples_in)
(23456789, 12345678)
And if you change one value, that doesn't affect the other—after all, they're separate values:
>>> del samples_in[:]
>>> samples[-1], samples_in
([0], [])
However, in this case, you really don't even need to make a copy. The only reason you're having a problem is that you're trying to reuse samples_in over and over. There's no reason to do that, and if you just created a new samples_in value each time, the problem wouldn't have come up in the first place. Instead of this:
samples_in = []
for k in range(0, qtd_packs):
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in)
    del samples_in[0:len(samples_in)]
Do this:
for k in range(0, qtd_packs):
    samples_in = []
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in)
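Taking "a new list each time" one step further, the same result can be sketched with a nested list comprehension. This is only one possible way to write it, not necessarily what the asker needs; the variable names and paths are the ones from the question:

```python
qtd_packs = 2
size_pack = 16
pasta = ['packs/krun/', 'packs/parting2/']

# The inner comprehension builds a brand-new list for every pack,
# so nothing is shared between the two sublists.
samples = [
    [pasta[k] + str(n) + '.wav' for n in range(1, size_pack + 1)]
    for k in range(qtd_packs)
]

print(len(samples))     # 2
print(samples[0][0])    # packs/krun/1.wav
print(samples[1][-1])   # packs/parting2/16.wav
```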

beetea's answer below offers the solution if you want samples to contain two lists, each of which has the strings for one of your two packs:
qtd_packs = 2
size_pack = 16
pasta = []
pasta.append('packs/krun/')
pasta.append('packs/parting2/')
samples = []
samples_in = []
for k in range(0, qtd_packs):
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in[:])
    del samples_in[0:len(samples_in)]
print(samples)
produces this output:
[['packs/krun/1.wav', 'packs/krun/2.wav', 'packs/krun/3.wav', 'packs/krun/4.wav',
'packs/krun/5.wav', 'packs/krun/6.wav', 'packs/krun/7.wav', 'packs/krun/8.wav',
'packs/krun/9.wav', 'packs/krun/10.wav', 'packs/krun/11.wav', 'packs/krun/12.wav',
'packs/krun/13.wav', 'packs/krun/14.wav', 'packs/krun/15.wav', 'packs/krun/16.wav'],
['packs/parting2/1.wav', 'packs/parting2/2.wav', 'packs/parting2/3.wav',
'packs/parting2/4.wav', 'packs/parting2/5.wav', 'packs/parting2/6.wav',
'packs/parting2/7.wav', 'packs/parting2/8.wav', 'packs/parting2/9.wav',
'packs/parting2/10.wav', 'packs/parting2/11.wav', 'packs/parting2/12.wav',
'packs/parting2/13.wav', 'packs/parting2/14.wav', 'packs/parting2/15.wav',
'packs/parting2/16.wav']]
Now, when I originally read your question, I thought you were trying to make a single list containing all the strings. In that instance, you could use
samples.extend(samples_in)
instead of
samples.append(samples_in[:])
and you would get a flat list containing only the strings.
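To make the append/extend difference concrete, here is a minimal sketch with two made-up file names (nested and flat are illustrative names, not from the question):

```python
samples_in = ['packs/krun/1.wav', 'packs/krun/2.wav']

nested = []
nested.append(samples_in[:])  # adds ONE element: a copy of the whole list

flat = []
flat.extend(samples_in)       # adds each string as its own element

print(nested)  # [['packs/krun/1.wav', 'packs/krun/2.wav']]
print(flat)    # ['packs/krun/1.wav', 'packs/krun/2.wav']
```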

Related

Why does Python list comprehension seem to behave differently than list "multiplication"? [duplicate]

Asking out of curiosity. For the sake of making a point I was trying to make a function that returns an "identity matrix" of n dimensions and then printing it in the most concise way.
First I came up with this:
def identity(n):
    zeros = [[0 for j in range(n)] for i in range(n)]
    for i in range(n):
        zeros[i][i] = 1
    return zeros
for i in range(5):
    print(identity(5)[i])
This works as intended, however then I tried making the syntax shorter by doing this:
def identity(n):
    zeros = [[0]*n]*n
    for i in range(n):
        zeros[i][i] = 1
    return zeros
for i in range(5):
    print(identity(5)[i])
And this for some reason changes every single element to a one, but I can't seem to figure out why. This isn't an important question, but help is much appreciated!

Lists are handled by reference in Python.
This means that if you have:
list_a = [1,2,3,4]
list_b = list_a
list_a and list_b are actually two names for the same object in memory. So if you change an element of list_a:
list_a[2] = 9
then the same element of list_b will change, because both names point to the same object, i.e. list_a is list_b, not merely equal to it.
That's what's happening in your code as well.
In the first piece of code, the comprehension builds each row separately, as if you were explicitly creating a new list and appending it to your outer list each time:
l = []
l.append([1,2,3,4])
l.append([1,2,3,4])
...
but in the second piece of code, it is as if you are repeatedly appending the same value to the list:
l = []
la = [1,2,3,4]
l.append(la)
l.append(la)
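The difference between the two cases can be checked directly with is. The following is a minimal sketch, reusing the identity name from the question; shared is just an illustrative name. It keeps the short [0] * n for each row but lets a comprehension create a fresh row every time:

```python
def identity(n):
    # [0] * n is evaluated once per row, so every row is a new list.
    rows = [[0] * n for _ in range(n)]
    for i in range(n):
        rows[i][i] = 1
    return rows

shared = [[0] * 3] * 3
print(shared[0] is shared[1])  # True: the same row object repeated

m = identity(3)
print(m[0] is m[1])            # False: separate row objects
for row in m:
    print(row)
# [1, 0, 0]
# [0, 1, 0]
# [0, 0, 1]
```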

It's because list multiplication only makes a shallow copy: it repeats the reference to the inner list rather than copying the list.
All elements in zeros are referring to the same list.
Try these and see the results:
n = 4
zeros = [[0]*n]*n
zeros[0][0] = 1
print(zeros)
n = 4
lst = [0]*n
zeros = [lst]*n
print(zeros)
lst[0] = 1
print(zeros)
You can learn more about difference between shallow and deep copy here.
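As a quick illustration of shallow versus deep copies, here is a small sketch using the standard copy module (grid, shallow and deep are illustrative names):

```python
import copy

grid = [[0, 0], [0, 0]]
shallow = copy.copy(grid)    # new outer list, but the SAME inner lists
deep = copy.deepcopy(grid)   # new outer list AND new inner lists

grid[0][0] = 1

print(shallow[0][0])          # 1, the inner list is shared with grid
print(deep[0][0])             # 0, the deep copy is fully independent
print(shallow[0] is grid[0])  # True
print(deep[0] is grid[0])     # False
```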

This is because when you multiply a list by a number, it duplicates it, but it does so by copying the reference to each element.
If you compare id(zeros[0]) and id(zeros[1]) for the second case, you will see that they are the same, i.e. every row is the same inner list, and modifying one row modifies them all.
This is not true for the first case, where the comprehension creates each row separately and every row has its own id. (Comparing the ids of the 0 elements themselves tells you nothing: CPython caches small integers, so every 0 is the same object in both cases.)
Edit:
def zeros(n):
    return [[0]*n]*n
x = zeros(5)
for i in range(5):
    assert x[0] is x[i]  # every row is the same list object

What's the difference between if not myList and if myList is [] in Python?

I was working on some code when I ran into a little problem. I originally had something like this:
if myList is []:
    # do things if list is empty
    pass
else:
    # do other things if list is not empty
    pass
When I ran the program (and had it so myList was empty), the program would go straight to the else statement, much to my surprise. However, after looking at this question, I changed my code to this:
if not myList:
    # do things if list is empty
    pass
else:
    # do other things if list is not empty
    pass
This made my program work as I'd expected it to (it ran the 'if not myList' part and not the 'else' statement).
My question is what changed in the logic of this if-statement? My debugger (I use Pycharm) said that myList was an empty list both times.

is compares object identity, so a is b gives the same result as id(a) == id(b). It is true only when the two names refer to the same object: not only do they have the same value, they also occupy the same memory region.
>>> myList = []
>>> myList is []
False
>>> id([]), id(myList)
(130639084, 125463820)
>>> id([]), id(myList)
(130639244, 125463820)
As you can see, [] has a different ID every time because evaluating [] creates a new list object every time, so it can never be the same object as myList.
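The three kinds of check can be put side by side in a short sketch. The names my_list, fresh, other and describe are made up for illustration:

```python
my_list = []
fresh = []       # a second, separate empty list
other = my_list  # a second name for the SAME list

print(my_list == fresh)  # True: equal contents
print(my_list is fresh)  # False: two different objects
print(my_list is other)  # True: one object, two names
print(not my_list)       # True: an empty list is falsy

def describe(items):
    # Truthiness is the usual way to test for an empty list.
    if not items:
        return 'empty'
    return 'not empty'

print(describe(my_list))  # empty
print(describe([1, 2]))   # not empty
```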

In Python, is compares for identity (the same object). CPython caches small integers (-5 through 256) at startup, which is an implementation detail, and as such is returns True in that case, e.g.
>>> a = 1
>>> b = 1
>>> a is b
True
And None is a singleton. You are creating a new list object when you do []. Generally speaking, you should only use is for None or when checking explicitly for a sentinel value. You see that pattern in libraries using _sentinel = object() as a sentinel value.
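Here is a minimal sketch of that sentinel pattern. The get_setting helper and its parameters are hypothetical, made up for illustration; the point is that is can tell "no default was passed" apart from "the default is None":

```python
_sentinel = object()  # a unique object; nothing else is identical to it

def get_setting(settings, key, default=_sentinel):
    """Hypothetical helper: return settings[key], or default if one was given."""
    if key in settings:
        return settings[key]
    if default is _sentinel:
        # The caller passed no default at all, so a missing key is an error.
        raise KeyError(key)
    return default

print(get_setting({'a': 1}, 'a'))         # 1
print(get_setting({}, 'a', default=None)) # None
```

Calling get_setting({}, 'a') with no default raises KeyError, which a plain default=None signature could not distinguish from an explicit None.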

Array of tuples in Python

This is driving me nuts. The following code is basically the question. The first part halts almost immediately. The second seems to get stuck. However, they should be doing the exact same thing, since one == two.
one = [[(0,)]] + [[], [], [], [], [], [], [], [], []]
n = 9
two = [[(0,)]] + ([[]] * n)
if (one == two):
    print("It is indeed the same thing!")
print("Fast:")
strings = one
print(strings)
for j in range(0, n):
    for i in strings[j]:
        strings[j + 1].append(i + (0,))
        strings[j + 1].append(i + (1,))
print("Slow:")
strings = two
print(strings)
for j in range(0, n):
    for i in strings[j]:
        strings[j + 1].append(i + (0,))
        strings[j + 1].append(i + (1,))

Equals == tests whether the lists "look" the same (using standard tests of equality).
However, they don't contain the same objects! I.e. the object references are not the same.
is tests whether two names actually refer to the same list.
[[],[],[]] has three different lists.
[[]]*3 has three references to the same list.
So in one, for each j you're adding to a different sub-list. But in two you are adding to the same sub-list all the time.
Note re: testing with ==: equality is implemented by a method belonging to the list (__eq__), and a subclass can override it to compare in the way you want. It usually does "the sensible thing", which in this case involves seeing if the lists are the same length, and if so, whether each of the elements is the same, also using ==. But is, on the other hand, is a more primitive thing: it literally checks whether you are referring to the same object in memory.
To create multiple new (different) objects, you could do
[ [] for i in range(9) ]
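Applied to the code in the question, that might look like the sketch below. It assumes a smaller n than the question's 9, purely to keep the output short:

```python
n = 3  # smaller than the question's 9, only to keep the output short

# One distinct empty list per level, instead of n references to one list.
strings = [[(0,)]] + [[] for _ in range(n)]

for j in range(n):
    for i in strings[j]:
        strings[j + 1].append(i + (0,))
        strings[j + 1].append(i + (1,))

print([len(level) for level in strings])  # [1, 2, 4, 8]
print(strings[1])                         # [(0, 0), (0, 1)]
```

Each level ends up with twice as many tuples as the one before it, and the loop terminates because it never appends to the list it is iterating over.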

The two options don't do the same thing. You can simplify your problem by the following two code snippets:
s = {"hello": "world"}
a = [s] * 2
print(a)
> [{'hello': 'world'}, {'hello': 'world'}]
Obviously you created a list with two dictionaries. But you created a list with the multiplier. That means, instead of two new separate dictionaries you create a list with two references to a single dictionary. Any change to this dictionary will affect all items in your list. Let's go on with our example:
s["hello"] = "python"
print(a)
> [{'hello': 'python'}, {'hello': 'python'}]
So we changed the initial dictionary, which affects all elements. To summarize: in your fast example you create a list with 10 individual list items. In your slow example, the last 9 items all refer to one and the same list. Appending to that list inside the for loop, as you do, grows every one of those items at once, and because the loop iterates over the same list it is appending to, it never reaches the end. That is why you should see massive memory consumption: each iteration adds new elements to the list that is still being iterated.

My guess is that Python has no problem looping over the first array because it has been initialized as it's supposed to be, while the other one only exists "theoretically" in memory.
Comparing them initializes the second one, and the result is that both arrays are the same.
Looping has the same effect. While looping over the first one Python knows what to do, but looping over the other one makes Python initialize the array over and over again, which may take a while because it's a new array with every iteration.
But it's only a guess.
And your code almost made my computer freeze :/

idiomatic python, manage default arguments in functions

I usually see that most people handle default argument values in functions or methods like this:
def foo(L=None):
    if L is None:
        L = []
However, I see other people doing something like:
def foo(L=None):
    L = L or []
I don't know if I am missing something, but why do most people use the first approach instead of the second? Are they the same thing? It seems that the second is clearer and shorter.

They are not equal.
The first approach checks that the given argument L is exactly None.
The second checks whether L is truthy. When you use a list as a condition in Python, the rules are the following:
An empty list is false
Any other list is true
So what's the difference between the two approaches? Compare this code.
First:
def foo(L=None):
    if L is None:
        L = []
    L.append('x')
    return L
>>> my_list = []
>>> foo(my_list)
['x']
>>> my_list
['x']
Second:
def foo(L=None):
    L = L or []
    L.append('x')
    return L
>>> my_list = []
>>> foo(my_list)
['x']
>>> my_list
[]
So the first version didn't create a new list; it used the given list. But the second created a new one, because the empty list that was passed in is falsy.

The two are not equivalent if the argument is a false-y value. This doesn't matter often, as many false-y values aren't suitable arguments to most functions where you'd do this. Still, there are conceivable situations where it can matter. For example, if a function is supposed to fill a dictionary (creating a new one if none is given), and someone passes an empty ordered dictionary instead, then the latter approach would incorrectly return an ordinary dictionary.
That's not my primary reason for always using the is None version though. I prefer it as it is more explicit and the fact that or returns one of its operands isn't intuitive to me. I like to forget about it as long as I can ;-) The extra line is not a problem, this is relatively rare.
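The ordered-dictionary scenario can be sketched as follows. Both fill_* functions are hypothetical, written only to illustrate the point:

```python
from collections import OrderedDict

def fill_with_is_none(d=None):
    # Hypothetical function: only replaces d when it is exactly None.
    if d is None:
        d = {}
    d['x'] = 1
    return d

def fill_with_or(d=None):
    # Hypothetical function: replaces d whenever it is falsy,
    # which includes an EMPTY OrderedDict.
    d = d or {}
    d['x'] = 1
    return d

first = OrderedDict()
print(type(fill_with_is_none(first)).__name__)  # OrderedDict
print(len(first))                               # 1, the caller's object was filled

second = OrderedDict()
print(type(fill_with_or(second)).__name__)      # dict
print(len(second))                              # 0, the caller's object was ignored
```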

Maybe they don't know of the second one? I tend to use the first.
Actually, there is a difference. The second one will set L = [] if you pass anything that evaluates to false in a Boolean context: 0, an empty string, an empty list, and so on. The first will only do that if no L is passed or it was passed as None.

Why does this python code not replace deleted list elements?

Here is my code:
for each in range(0, number_of_trials):
    temp_list = source_list
    for i in range(10):
        x = random.randrange(0, len(temp_list))
        board[i] = temp_list[x]
        del temp_list[x]
This code is deleting each element from temp_list, as would be expected. But temp_list is not being reset each time the initial for loop runs, setting it back to source_list. As a result, every delete from temp_list is permanent, lasting for every following iteration of the for loop. How can I avoid this and have temp_list "reset" back to its initial status each time?

The statement temp_list = source_list does not create a new list. It gives a new name temp_list to an existing list. It doesn't matter what name you use to access the list—any changes made via one name will be visible via another.
Instead, you need to copy the list, like this:
temp_list = source_list[:]
This creates a new list that starts with the same contents as source_list. Now you can change the new list without affecting the original.

>>> a = [ 1 ]
>>> b = a
>>> del a[0]
>>> print(b)
[]
Basically, when you use "=", both variables point to the same object. To make a copy, use the copy module.

Copy the list elements instead of the list reference.
temp_list = source_list[:]

This is because:
temp_list = source_list # Doesn't copy the list, just adds another reference to it.
So, on each iteration you are only refreshing the reference.
To copy the list, you can use the [:] trick. This slices the list with nothing left out and produces a new list with exactly the same contents as the list being sliced.
Therefore,
for each in range(0, number_of_trials):
    temp_list = source_list[:] # Changed
    for i in range(10):
        x = random.randrange(0, len(temp_list))
        board[i] = temp_list[x]
        del temp_list[x]
This should work as expected. :)
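For completeness, here is a self-contained version of the fixed loop that can be run as-is. The contents of source_list, the number of trials and the size of board are assumptions, since the question doesn't show them:

```python
import random

source_list = list(range(20))  # assumed sample data
number_of_trials = 3           # assumed value
board = [None] * 10            # assumed to have 10 slots

for each in range(number_of_trials):
    temp_list = source_list[:]  # fresh copy on every trial
    for i in range(10):
        x = random.randrange(0, len(temp_list))
        board[i] = temp_list[x]
        del temp_list[x]

print(len(source_list))  # 20, the original list is untouched
print(len(set(board)))   # 10, no element was picked twice in a trial
```

If all you need is 10 distinct random elements per trial, random.sample(source_list, 10) from the standard library might be a shorter alternative.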
