idiomatic python, manage default arguments in functions - python

I often see people handle default argument values in functions or methods like this:
def foo(L=None):
    if L is None:
        L = []
However, I see other people doing something like:
def foo(L=None):
    L = L or []
I don't know if I am missing something, but why do most people use the first approach instead of the second? Are they equivalent? The second seems clearer and shorter.

They are not equal.
The first approach checks exactly that the given argument L is None.
The second checks whether L is truthy. In Python, when a list is used in a condition, the rules are the following:
If the list is empty, it is False
Otherwise it is True
So what's the difference between the two approaches? Compare this code.
First:
def foo(L=None):
    if L is None:
        L = []
    L.append('x')
    return L
>>> my_list = []
>>> foo(my_list)
['x']
>>> my_list
['x']
Second:
def foo(L=None):
    L = L or []
    L.append('x')
    return L
>>> my_list = []
>>> foo(my_list)
['x']
>>> my_list
[]
So the first version didn't create a new list; it used the given list. The second created a new one, because the empty list that was passed in is falsy.

The two are not equivalent if the argument is a false-y value. This doesn't matter often, as many false-y values aren't suitable arguments to most functions where you'd do this. Still, there are conceivable situations where it can matter. For example, if a function is supposed to fill a dictionary (creating a new one if none is given), and someone passes an empty ordered dictionary instead, then the latter approach would incorrectly return an ordinary dictionary.
That's not my primary reason for always using the is None version though. I prefer it as it is more explicit and the fact that or returns one of its operands isn't intuitive to me. I like to forget about it as long as I can ;-) The extra line is not a problem, this is relatively rare.
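The ordered-dictionary case described above can be sketched roughly as follows. The function names fill_or and fill_is_none are made up for illustration; only collections.OrderedDict comes from the standard library.

```python
from collections import OrderedDict

def fill_or(d=None):
    # Truthiness test: an empty OrderedDict is falsy, so it is replaced by {}.
    d = d or {}
    d["key"] = "value"
    return d

def fill_is_none(d=None):
    # Identity test: only a missing or None argument triggers the default.
    if d is None:
        d = {}
    d["key"] = "value"
    return d

empty = OrderedDict()
print(type(fill_or(empty)).__name__)       # dict
print(type(fill_is_none(empty)).__name__)  # OrderedDict
```

With the or version the caller's empty OrderedDict is silently ignored and never filled.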

Maybe they don't know of the second one? I tend to use the first.
Actually there is a difference. The second one will set L = [] if you pass anything that evaluates to Boolean false, such as 0, an empty string, or others. The first will only do that if no L is passed or it was passed as None.
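A small sketch of that difference with a non-list argument. The functions connect_or and connect_is_none and the default of 30 are hypothetical, chosen only to show a case where 0 is a legitimate value.

```python
def connect_or(timeout=None):
    # `or` returns its first truthy operand, so a legitimate 0 is replaced.
    timeout = timeout or 30
    return timeout

def connect_is_none(timeout=None):
    # Only None (or no argument) falls back to the default.
    if timeout is None:
        timeout = 30
    return timeout

print(connect_or(0), connect_is_none(0))  # 30 0
print(0 or [], "" or [], None or [])      # [] [] []
print([1] or [])                          # [1]
```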

Related

What's the difference between if not myList and if myList is [] in Python?

I was working on some code when I ran into a little problem. I originally had something like this:
if myList is []:
    # do things if list is empty
else:
    # do other things if list is not empty
When I ran the program (and had it so myList was empty), the program would go straight to the else statement, much to my surprise. However, after looking at this question, I changed my code to this:
if not myList:
    # do things if list is empty
else:
    # do other things if list is not empty
This made my program work as I'd expected it to (it ran the 'if not myList' part and not the 'else' statement).
My question is what changed in the logic of this if-statement? My debugger (I use Pycharm) said that myList was an empty list both times.
is compares object identities, so a is b gives the same result as id(a) == id(b). It is true only when the two are the same object: not only do they have the same value, they also occupy the same memory region.
>>> myList = []
>>> myList is []
False
>>> id([]), id(myList)
(130639084, 125463820)
>>> id([]), id(myList)
(130639244, 125463820)
As you can see, every evaluation of [] creates a new list object, so its ID never matches that of myList.
In Python, is compares for identity (whether both operands are the same object). CPython caches small integers (an implementation detail), so is returns True in that case, e.g.:
>>> a = 1
>>> b = 1
>>> a is b
True
And None is a singleton. You are creating a new list object when you do []. Generally speaking, you should only use is for None or when checking explicitly for a sentinel value. You see that pattern in libraries using _sentinel = object() as a sentinel value.
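A minimal sketch of the sentinel pattern mentioned above. The get_setting function and the settings dictionary are invented for the example; the point is only that is works for a unique marker object, while == is the right tool for comparing list contents.

```python
_sentinel = object()  # unique marker object; nothing else is identical to it

def get_setting(settings, key, default=_sentinel):
    """Return settings[key]; raise KeyError only if no default was supplied."""
    value = settings.get(key, _sentinel)
    if value is not _sentinel:
        return value
    if default is _sentinel:
        raise KeyError(key)
    return default

settings = {"timeout": None}
print(get_setting(settings, "timeout"))     # None (a stored None is a real value)
print(get_setting(settings, "retries", 3))  # 3

a, b = [], []
print(a is b)  # False: two distinct list objects
print(a == b)  # True: equal contents
print(not a)   # True: an empty list is falsy
```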

This works BUT WHY?

I'm learning Python on Codecademy and came across this solution for a function that's meant to remove duplicates from a list of numbers:
x = [1, 1, 2, 2]
def remove_duplicates(x):
    p = []
    for i in x:
        if i != i:
            p.append(i)
    return i
I ran this in PyCharm with some print statements and just got an empty list. I'm only curious because when I do this in my head, it makes no sense, but Codecademy accepts this as an answer. Is it just a fluke? Or is this on a level I don't understand yet?
You are correct: it doesn't make any sense. First, it creates a list called p that gets each item that is not equal to itself. The only object that I know of that is not equal to itself is NaN, but you don't have any of those, so p is just an empty list. Defining p is useless, however, because it isn't even returned. What is returned is i, which is assigned to each item in the list in turn, so it is the last item in the list by the end of the function. In short, that function is equivalent to this:
def remove_duplicates(x):
    return x[-1]
I haven't heard what the function is supposed to return, but perhaps it is supposed to return the number of non-duplicate items. If it is, it "works" just because the last item in the list happens to be the number of non-duplicate items.
Take a look at this snippet to see the Pythonic way to remove duplicates (good_result) and also to understand why your code doesn't make any sense:
x = [1, 1, 2, 2]
def remove_duplicates(x):
    p = []
    for i in x:
        if i != i:
            p.append(i)
    return i
good_result = list(set(x))
print(good_result)
print(remove_duplicates(x))
As you can see, your function does not return the list with the duplicate values filtered out; it just returns the last element of the list (index -1). So Codecademy certainly shouldn't accept that snippet as a valid answer to the question of how to remove duplicates from a list.
Now, if we assume that what Codecademy was really asking for is the number of unique values in a list, then it is a coincidence that your broken code gives the right answer, which is the same as len(good_result). It worked just by luck; that doesn't mean your code is correct :)
Your code just returns the last element of the list, which is the same as
return x[-1]
It doesn't return a list.
I think you need to check the question; they may be asking for something like:
a) a function to return one of the duplicated elements in a list.
b) a function to return the number of duplicated elements in a list.
For both of those questions the answer is 2, so by luck your answer is correct.
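For comparison, here is a sketch of what a working version of the exercise might look like, assuming the goal is a list of unique values in first-seen order (the thread doesn't state the exact requirement). Note that list(set(x)) does not guarantee order, which is why this sketch tracks seen items instead.

```python
def remove_duplicates(items):
    """Return a new list with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # compare against earlier items, not against itself
            seen.add(item)
            result.append(item)
    return result

x = [1, 1, 2, 2]
print(remove_duplicates(x))    # [1, 2]
print(list(dict.fromkeys(x)))  # [1, 2] (dicts keep insertion order in Python 3.7+)

nan = float("nan")
print(nan != nan)              # True: NaN is the usual value not equal to itself
```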

Remove all elements that satisfy a predicate from a set

Given a mutable set of objects,
A = {1, 2, 3, 4, 5, 6}
I can construct a new set containing only those objects that don't satisfy a predicate ...
B = set(x for x in A if not (x % 2 == 0))
... but how do I modify A in place to contain only those objects? If possible, do this in linear time, without constructing O(n)-sized scratch objects, and without removing anything from A, even temporarily, that doesn't satisfy the predicate.
(Integers are used here only to simplify the example. In the actual code they are Future objects and I'm trying to pull out those that have already completed, which is expected to be a small fraction of the total.)
Note that it is not, in general, safe in Python to mutate an object that you are iterating over. I'm not sure of the precise rules for sets (the documentation doesn't make any guarantee either way).
I only need an answer for 3.4+, but will take more general answers.
(Not actually O(1) space due to implementation details, but I'm loath to delete it as it's quite clean.)
Use symmetric_difference_update.
>>> A = {1,2,3,4,5,6}
>>> A.symmetric_difference_update(x for x in A if not (x % 2))
>>> A
{1, 3, 5}
With a horrible time complexity (quadratic), but in O(1) space:
>>> A = {1,2,3,4,5,6}
>>> modified = True
>>> while modified:
...     modified = False
...     for x in A:
...         if not x%2:
...             A.remove(x)
...             modified = True
...             break
...
>>> A
{1, 3, 5}
For the very specific use case you showed, there is a way to do this in O(1) space, but it doesn't generalize very well to sets containing anything other than int objects:
A = {1, 2, 3, 4, 5, 6}
for i in range(min(A), max(A) + 1):
    if i % 2 == 0:  # i satisfies the predicate, so remove it
        A.discard(i)
It also wastes time since it will check numbers that aren't even in the set. For anything other than int objects, I can't yet think of a way to do this without creating an intermediate set or container of some sort.
For a more general solution, it would be better to simply initially construct your set using the predicate (if you don't need to use the set for anything else first). Something like this:
source = range(10)  # hypothetical stand-in for the real data source

def items():
    # maybe this is a file or a stream or something,
    # wherever your initial values are coming from.
    for thing in source:
        yield thing

def predicate(item):
    return bool(item)

A = set(item for item in items() if predicate(item))
To keep the total memory use constant, this is the only thing that comes to mind:
def filter_Set(predicate,origen:set) -> set:
    resul = set()
    while origen:
        elem = origen.pop()
        if predicate( elem ):
            resul.add( elem )
    return resul

def filter_Set_inplace(predicate,origen:set):
    resul = set()
    while origen:
        elem = origen.pop()
        if predicate( elem ):
            resul.add( elem )
    while resul:
        origen.add(resul.pop())
With these functions I move elements from one set to the other, keeping only those that satisfy the predicate.
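One more option, not taken from the answers above: since the question expects the matching elements to be a small fraction of the set, it may be acceptable to collect only the matches first and then remove them in one call. This is a sketch under that assumption; remove_matching is a made-up name, and the scratch list is proportional to the number of matches rather than to the size of the set. Nothing that fails the predicate is ever removed, and the set is not mutated while it is being iterated.

```python
def remove_matching(items, predicate):
    """Remove from `items` (in place) every element satisfying `predicate`.

    Returns the removed elements. The scratch list holds only the matches.
    """
    matches = [x for x in items if predicate(x)]  # iteration ends before mutation
    items.difference_update(matches)
    return matches

A = {1, 2, 3, 4, 5, 6}
removed = remove_matching(A, lambda x: x % 2 == 0)
print(sorted(A))        # [1, 3, 5]
print(sorted(removed))  # [2, 4, 6]
```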

How to remove and return element in python list

In python you can do list.pop(i) which removes and returns the element in index i, but is there a built in function like list.remove(e) where it removes and returns the first element equal to e?
Thanks
I mean, there is list.remove, yes.
>>> x = [1,2,3]
>>> x.remove(1)
>>> x
[2, 3]
I don't know why you need it to return the removed element, though. You've already passed it to list.remove, so you know what it is... I guess if you've overloaded __eq__ on the objects in the list so that it doesn't actually correspond to some reasonable notion of equality, you could have problems. But don't do that, because that would be terrible.
If you have done that terrible thing, it's not difficult to roll your own function that does this:
def remove_and_return(lst, item):
    return lst.pop(lst.index(item))
Is there a builtin? No. Probably because if you already know the element you want to remove, then why bother returning it? [1]
The best you can do is get the index, and then pop it. Ultimately, this isn't such a big deal -- Chaining 2 O(n) algorithms is still O(n), so you still scale roughly the same ...
def extract(lst, item):
    idx = lst.index(item)
    return lst.pop(idx)
[1] Sure, there are pathological cases where the item returned might not be the item you already know... but they aren't important enough to warrant a new method which takes only 3 lines to write yourself :-)
Strictly speaking, you would need something like:
def remove(lst, e):
    i = lst.index(e)  # raises ValueError if e is not in lst
    a = lst[i]
    lst.pop(i)
    return a
Which would make sense only if e == a is true but e is a is false, and you really need a instead of e.
In most cases, though, I would say that this suggests something suspicious in your code.
A short version would be:
a = lst.pop(lst.index(e))
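A quick sketch of a case where the returned element is equal to, but not the same object as, the one you searched for. It reuses the remove_and_return helper from the earlier answer; the sample values are arbitrary.

```python
def remove_and_return(lst, item):
    """Remove the first element equal to `item` and return that element."""
    return lst.pop(lst.index(item))  # raises ValueError if item is absent

values = [1.0, 2.0, 3.0]
removed = remove_and_return(values, 1)  # 1 == 1.0, but they are different objects
print(repr(removed))  # 1.0
print(values)         # [2.0, 3.0]
```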

Weird list behavior in python

qtd_packs = 2
size_pack = 16
pasta = []
pasta.append('packs/krun/')
pasta.append('packs/parting2/')
samples = []
samples_in = []
for k in range(0, qtd_packs):
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in)
    del samples_in[0:len(samples_in)]
print(samples)
I'm basically trying to add the samples_in inside the samples list, then delete the old samples_in list to create a new one. This will happen 2 times, as the qtd_packs =2. But in the end, what I get is two empty lists:
[[], []]
I've append'ed the samples_in inside samples BEFORE deleting it. So what happened?
Thank you
In Python, names and list elements hold references to objects. When you append samples_in to samples, Python appends a reference to the same list object, not a copy. If you want to append a copy of samples_in to samples, you can do:
samples.append(samples_in[:])
This effectively creates a new list from all the items in samples_in and passes that new list into samples.append(). So now when you clear the items in samples_in, you're not clearing the items in the list that was appended to samples as well.
Also, note that samples_in[:] is equivalent to samples_in[0:len(samples_in)].
The problem is that after this:
samples.append(samples_in)
The newly-appended value in samples is not a copy of samples_in, it's the exact same value. You can see this from the interactive interpreter:
>>> samples_in = [0]
>>> samples = []
>>> samples.append(samples_in)
>>> samples[-1] is samples_in
True
>>> id(samples[-1]), id(samples_in)
(12345678, 12345678)
Using an interactive visualizer might make it even easier to see what's happening.
So, when you modify the value through one name, like this:
>>> del samples_in[0:len(samples_in)]
The same modification is visible through both names:
>>> samples[-1]
[]
Once you realize that both names refer to the same value, that should be obvious.
As a side note, del samples_in[:] would do the exact same thing as del samples_in[0:len(samples_in)], because those are already the defaults for a slice.
What if you don't want the two names to refer to the same value? Then you have to explicitly make a copy.
The copy module has functions that can make a copy of (almost) anything, but many types have a simpler way to do it. For example, samples_in[:] asks for a new list, which copies the slice from 0 to the end (again, those are the defaults). So, if you'd done this:
>>> samples.append(samples_in[:])
… you would have a new value in samples[-1]. Again, you can test that easily:
>>> samples[-1], samples_in
([0], [0])
>>> samples[-1] == samples_in
True
>>> samples[-1] is samples_in
False
>>> id(samples[-1]), id(samples_in)
(23456789, 12345678)
And if you change one value, that doesn't affect the other—after all, they're separate values:
>>> del samples_in[:]
>>> samples[-1], samples_in
([0], [])
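To round out the mention of the copy module above, here is a small sketch contrasting an alias, a shallow copy (which is what [:] gives you), and a deep copy. The variable names and file names are made up for the example.

```python
import copy

inner = ['a.wav']
outer = [inner]

alias = outer                # same object, no copy at all
shallow = outer[:]           # new outer list, same inner list (like copy.copy)
deep = copy.deepcopy(outer)  # new outer list and new inner list

inner.append('b.wav')
print(alias is outer)        # True
print(shallow is outer)      # False
print(shallow[0] is inner)   # True
print(shallow[0])            # ['a.wav', 'b.wav']
print(deep[0])               # ['a.wav']
```

For a flat list of strings, as in the question, the shallow copy is all that's needed.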
However, in this case, you really don't even need to make a copy. The only reason you're having a problem is that you're trying to reuse samples_in over and over. There's no reason to do that, and if you just created a new samples_in value each time, the problem wouldn't have come up in the first place. Instead of this:
samples_in = []
for k in range(0, qtd_packs):
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in)
    del samples_in[0:len(samples_in)]
Do this:
for k in range(0, qtd_packs):
    samples_in = []
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in)
beetea's answer below offers the solution if you want samples to contain two lists, each of which has the strings for one of your two packs:
qtd_packs = 2
size_pack = 16
pasta = []
pasta.append('packs/krun/')
pasta.append('packs/parting2/')
samples = []
samples_in = []
for k in range(0, qtd_packs):
    for n in range(1, size_pack+1):
        samples_in.append(pasta[k]+str(n)+'.wav')
    samples.append(samples_in[:])
    del samples_in[0:len(samples_in)]
print(samples)
produces this output:
[['packs/krun/1.wav', 'packs/krun/2.wav', 'packs/krun/3.wav', 'packs/krun/4.wav',
'packs/krun/5.wav', 'packs/krun/6.wav', 'packs/krun/7.wav', 'packs/krun/8.wav',
'packs/krun/9.wav', 'packs/krun/10.wav', 'packs/krun/11.wav', 'packs/krun/12.wav',
'packs/krun/13.wav', 'packs/krun/14.wav', 'packs/krun/15.wav', 'packs/krun/16.wav'],
['packs/parting2/1.wav', 'packs/parting2/2.wav', 'packs/parting2/3.wav',
'packs/parting2/4.wav', 'packs/parting2/5.wav', 'packs/parting2/6.wav',
'packs/parting2/7.wav', 'packs/parting2/8.wav', 'packs/parting2/9.wav',
'packs/parting2/10.wav', 'packs/parting2/11.wav', 'packs/parting2/12.wav',
'packs/parting2/13.wav', 'packs/parting2/14.wav', 'packs/parting2/15.wav',
'packs/parting2/16.wav']]
Now, when I originally read your question, I thought you were trying to make a single list containing all the strings. In that instance, you could use
samples.extend(samples_in)
instead of
samples.append(samples_in[:])
and you would get a flat list containing only the strings.
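As a rough alternative to both variants, list comprehensions build the nested and the flat result without any shared, reused list. This is only a sketch using the question's own names; flat is a name introduced here for the single-list version.

```python
qtd_packs = 2
size_pack = 16
pasta = ['packs/krun/', 'packs/parting2/']

# One fresh inner list per pack, so no list object is shared or reused.
samples = [
    [pasta[k] + str(n) + '.wav' for n in range(1, size_pack + 1)]
    for k in range(qtd_packs)
]

# Flat variant: a single list containing every path.
flat = [path for pack in samples for path in pack]

print(len(samples), len(samples[0]), len(flat))  # 2 16 32
print(samples[1][0])                             # packs/parting2/1.wav
```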
