sort() in Python using cmp - python

I am trying to sort a list so that all zeros move to the end of the list.
Example: [0,1,0,2,3,0,4] -> [1,2,3,4,0,0,0]
I saw someone do it in one line:
list.sort(cmp=lambda a,b:-1 if b==0 else 0)
But I don't understand what the code inside the parentheses means.
Could anyone explain it? Thank you.

Preface:
Sort a list according to the normal comparison:
some_list.sort()
Supply a custom comparator:
some_list.sort(cmp=my_comparator)
A lambda function:
x = lambda a, b: a - b
# is roughly the same as
def x(a, b):
    return a - b
An if-else-expression:
value = truthy_case if condition else otherwise
# is roughly the same as
if condition:
    value = truthy_case
else:
    value = otherwise
The line list.sort(cmp=lambda a,b:-1 if b==0 else 0) itself:
The comparator tests whether b == 0. If so, it returns a negative value, which tells the sort that b has a bigger value than a. Otherwise it returns zero, which tells the sort that the two values compare the same.
Whilst Python's list.sort() is stable, this code is not sane, because the comparator needs to test a, too, not only b. A proper implementation would use the key argument:
some_list.sort(key=lambda a: 0 if a == 0 else -1)
Fixed list.sort(cmp=...) implementation:
If you want to use list.sort(cmp=...) (you don't) or if you are just curious, this is a sane implementation:
some_list.sort(cmp=lambda a, b: 0 if a == b else
                               +1 if a == 0 else
                               -1 if b == 0 else 0)
But notice:
In Py3.0, the cmp parameter was removed entirely (as part of a larger effort to simplify and unify the language, eliminating the conflict between rich comparisons and the __cmp__ methods).
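For code that has to run on Python 3, a comparator can still be used by wrapping it with functools.cmp_to_key. The sketch below restates the fixed comparator from above as a named function; the name zeros_last and the sample list are illustrative and not part of the original answer.

```python
from functools import cmp_to_key

def zeros_last(a, b):
    """Old-style comparator: zeros compare greater than any non-zero value."""
    if a == b:
        return 0
    if a == 0:
        return 1   # a is zero, b is not: a goes after b
    if b == 0:
        return -1  # b is zero, a is not: a goes before b
    return 0       # two different non-zero values: keep their current order

nums = [0, 1, 0, 2, 3, 0, 4]
result = sorted(nums, key=cmp_to_key(zeros_last))
print(result)  # [1, 2, 3, 4, 0, 0, 0]
```

Because the sort is stable, the non-zero values keep their original relative order.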
An alternative:
Sorting a list takes O(𝘯 log 𝘯) time. I do not know whether sorting actually runs faster for this simple problem, but I wouldn't think so. An O(𝘯) solution is filtering:
new_list = [x for x in some_list if x != 0]
new_list.extend([0] * (len(some_list) - len(new_list)))
The difference will probably only matter for quite long lists, though.

>>> sorted(l, key=lambda x:str(x) if x == 0 else x)
[1, 3, 4, 8, 0, 0, 0]
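Guess what's happening here? I am exploiting the fact that, when comparing mixed types, Python 2 orders integers first, then strings. So I converted 0 into '0'. (In Python 3, comparing an int with a str raises a TypeError, so this trick only works in Python 2.)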
Here's the proof.
>>> ll = [3,2,3, '1', '3', '0']
>>> sorted(ll)
[2, 3, 3, '0', '1', '3']
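The int-before-str ordering only exists in Python 2. A sketch of the same idea that also runs on Python 3 uses a tuple key, so that zeros sort after everything else while the remaining values sort normally. The sample list is an assumption, since the answer does not show how l was defined.

```python
sample = [0, 3, 0, 1, 8, 0, 4]

# (False, x) sorts before (True, x), so non-zero values come first;
# within each group the second tuple element gives ascending order.
result = sorted(sample, key=lambda x: (x == 0, x))
print(result)  # [1, 3, 4, 8, 0, 0, 0]
```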

You can answer this yourself, and here is a plan:
The ternary expression description is available here:
https://docs.python.org/3/reference/expressions.html?highlight=ternary%20operator#conditional-expressions
You can find a lot of expression description in that document:
https://docs.python.org/3/reference/expressions.html
Q: What does lambda mean?
Please spend just 5 days reading the Python tutorial, which is a fork of the original Guido van Rossum book.
https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions

Related

Python's list comprehension: Modify list elements if a certain value occurs

How can I do the following in Python's list comprehension?
nums = [1,1,0,1,1]
oFlag = 1
res = []
for x in nums:
    if x == 0:
        oFlag = 0
    res.append(oFlag)
print(res)
# Output: [1,1,0,0,0]
Essentially in this example, zero out the rest of the list once a 0 occurs.
Some context: a list comprehension is a sort of "imperative" syntax for the map and filter functions that exist in many functional programming languages. What you're trying to do is usually referred to as an accumulate, which is a slightly different operation. You can't implement an accumulate in terms of a map and filter except by using side effects. Python allows you to have side effects in a list comprehension, so it's definitely possible, but list comprehensions with side effects are a little wonky. Here's how you could implement this using accumulate:
from itertools import accumulate
nums = [1,1,0,1,1]
def accumulator(last, cur):
    return 1 if (last == 1 and cur == 1) else 0
list(accumulate(nums, accumulator))
or in one line:
list(accumulate(nums, lambda last, cur: 1 if (last == 1 and cur == 1) else 0))
Of course there are several ways to do this using an external state and a list comprehension with side effects. Here's an example, it's a bit verbose but very explicit about how state is being manipulated:
class MyState:
    def __init__(self, initial_state):
        self.state = initial_state
    def getNext(self, cur):
        self.state = accumulator(self.state, cur)
        return self.state
mystate = MyState(1)
[mystate.getNext(x) for x in nums]
nums = [1,1,0,1,1]
[int(all(nums[:i+1])) for i in range(len(nums))]
This steps through the list, applying the all operator to the entire sub-list up to that point.
Output:
[1, 1, 0, 0, 0]
Granted, this is O(n^2), but it gets the job done.
Even more effective is simply to find the index of the first 0.
Make a new list made of that many 1s, padded with the appropriate quantity of zeros.
if 0 in nums:
    idx = nums.index(0)
    new_list = [1] * idx + [0] * (len(nums) - idx)
... or if the original list can contain elements other than 0 and 1, copy the list that far rather than repeating 1s:
new_list = nums[:idx] + [0] * (len(nums) - idx)
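The snippets above leave new_list undefined when the list contains no 0. A minimal sketch that wraps the same index-based idea in a function and covers that case (the function name is hypothetical):

```python
def zero_after_first_zero(nums):
    """Return a copy of nums with everything from the first 0 onward set to 0."""
    if 0 not in nums:
        return list(nums)  # nothing to zero out
    idx = nums.index(0)
    # Keep the prefix before the first 0, then pad with zeros.
    return list(nums[:idx]) + [0] * (len(nums) - idx)

print(zero_after_first_zero([1, 1, 0, 1, 1]))  # [1, 1, 0, 0, 0]
print(zero_after_first_zero([1, 1, 1]))        # [1, 1, 1]
```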
I had an answer using list comprehension, but @Prune beat me to it. It was really just a cautionary tale, showing how it would be done while making an argument against that approach.
Here's an alternative approach that might fit your needs:
import itertools
import operator
nums = [1,1,0,1,1]
res = itertools.accumulate(nums, operator.and_)
In this case res is an iterable. If you need a list, then
res = list(itertools.accumulate(nums, operator.and_))
Let's break this down. The accumulate() function can be used to generate a running total, or 'accumulated sums'. If only one argument is passed the default function is addition. Here we pass in operator.and_. The operator module exports a set of efficient functions corresponding to the intrinsic operators of Python. When an accumulated and is run on a list of 0's and 1's the result is a list that has 1's up till the first 0 is found, then all 0's after.
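To make the difference between the default behaviour and operator.and_ concrete, here is a small side-by-side sketch on the same input (the variable names are illustrative):

```python
import itertools
import operator

nums = [1, 1, 0, 1, 1]

# With no function argument, accumulate() produces running sums.
running_sum = list(itertools.accumulate(nums))
# With operator.and_, each output is the bitwise AND of everything seen so far.
running_and = list(itertools.accumulate(nums, operator.and_))

print(running_sum)  # [1, 2, 2, 3, 4]
print(running_and)  # [1, 1, 0, 0, 0]
```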
Of course we're not limited to using functions defined in the operator module. You can use any function that accepts 2 parameters of the type of the elements in the first parameter (and probably returns the same type). You can get creative, but here I'll keep it simple and just implement and:
import itertools
nums = [1,1,0,1,1]
res = itertools.accumulate(nums, lambda a, b: a and b)
Note: using operator.and_ probably runs faster. Here we're just providing an example using the lambda syntax.
While a list comprehension is not used, to me it has a similar feel. It fits in one line and isn't too hard to read.
For a list comprehension approach, you could use index with enumerate:
firstIndex = nums.index(0) if 0 in nums else len(nums)
[1 if i < firstIndex else 0 for i, x in enumerate(nums)]
Another approach using numpy:
import numpy as np
print(np.cumprod(np.array(nums) != 0).tolist())
#[1, 1, 0, 0, 0]
Here we convert nums to a numpy array and check which values are not equal to 0. We then take the cumulative product of that array, knowing that once a 0 is found we will multiply by 0 from that point forward.
Here is a linear-time solution that doesn't mutate global state, doesn't require any other iterators except the nums, and that does what you want, albeit requiring some auxiliary data-structures, and using a seriously hacky list-comprehension:
>>> nums = [1,1,0,1,1]
>>> [f for f, ns in [(1, nums)] for n in ns for f in [f & (n==1)]]
[1, 1, 0, 0, 0]
Don't use this. Use your original for-loop. It is more readable, and almost certainly faster. Don't strive to put everything in a list-comprehension. Strive to make your code simple, readable, and maintainable, which your code already was, and the above code is not.

Using XOR operator to determine if there are duplicates in a list of integers

I have a list of numbers:
a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124, ...]
#this is an abbreviated version of the list
I need to determine if there are duplicates in the list or not using the XOR ("^") operator.
Can anyone give me any tips? I'm a newbie and have never encountered this problem or used the XOR operator before.
I've tried several approaches (which have amounted to blind stabs in the dark). The last one was this:
MyDuplicatesList = [1,5,12,156,166,2656,6,4,5,9] #changed the list to make it easer
for x in MyDuplicatesList:
    if x^x:
        print("True")
I realize I'm probably violating a protocol by asking such an open-ended question, but I'm completely stumped.
Why XOR?
# true if there are duplicates
print(len(set(a)) != len(a))
Ok, THIS is pythonic. It finds all the duplicates and makes a list of them.
a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124,1]
b = [a[i] for i in range(len(a)) for j in range(i+1,len(a)) if i ^ j > 0 if a[i] ^ a[j] < 1]
print(b)
Simplified version of Dalen's idea:
a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124,1]
def simplifyDalen(source):
    dup = 0
    for x in source:
        for y in source:
            # count every pair (including each item with itself) that XORs to 0
            dup += x ^ y == 0
    return dup ^ len(source) > 0
result = simplifyDalen(a)
if result:
    print("There are duplicates!")
else:
    print("There are no duplicates!")
So far my bit index idea is the fastest (because it's one pass algorithm, I guess, not many to many)
When you xor two same numbers, you get 0. As you should know.
from operator import xor
def check (lst):
    dup = 0
    for x in lst:
        for y in lst:
            dup += xor(x, y)!=0
    l = len(lst)
    return dup!=(l**2 -l)
c = check([0,1,2,3,4,5,4,3,5])
if c:
    print("There are duplicates!")
else:
    print("There are no duplicates!")
BTW, this is an extremely stupid way to do it. XORing is fast, but O(n**2) (always going through the whole set) is an unnecessary loss.
For one thing, the iteration should be stopped when the first duplicate is encountered.
For another, this really should be done using set() or dict().
But you can get the idea.
Also, using xor() function instead of bitwise operator '^' slows the thing down a bit. I did it for clarity as I complicated the rest of the code. And so that people know that it exists as an alternative.
Here is an example on how to do it better.
It's a slight modification of code suggested by Organis in comments.
def check (lst):
    l = len(lst)
    # I skip comparing first with first and last with last
    # to prevent 4 unnecessary iterations. It's not much, but it makes sense.
    for x in range(1, l):
        for y in range(l-1):
            # Skip elements on same position as they will always xor to 0 :D
            if x!=y: # Can be (if you insist): if x^y != 0:...
                if (lst[x] ^ lst[y])==0:
                    return 1 # Duplicate found
    return 0 # No duplicates
c = check([0,1,2,3,4,5,4,3,5])
if c:
    print("There are duplicates!")
else:
    print("There are no duplicates!")
To repeat, XOR is not to be used for comparison, at least not in high-level programming languages. It's possible you would need similar thing in assembly for some reason, but eq works nicely there as anywhere. If we simply used == here instead, we would be able to check for duplicates in lists containing anything, not just integers.
XOR is fancy for other uses, as for ordinary bitwising (masking, unmasking, changing bits...), so in encryption systems and similar stuff.
XOR takes the binary representation of a number, compares it bit by bit with another, and outputs a 1 where the two bits differ and a 0 otherwise. Example: 1 ^ 2 = 3, because in binary 1 is 01 and 2 is 10, so comparing bit by bit we get 11, or 3. XORing a number with itself always gives 0, so we can use this property to check if two numbers are the same. There are myriad better ways to check for duplicates in a list than using XOR, but if you want or need to do it this way, hopefully the above information gives you an idea of where to start.
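As one possible starting point, the sketch below first shows the two XOR properties described above and then applies them to a sorted copy of the list, where any duplicates end up next to each other. The function name has_duplicates_xor is hypothetical, and the approach assumes the list holds only integers.

```python
print(bin(1), bin(2), bin(1 ^ 2))  # 0b1 0b10 0b11
print(7 ^ 7)                       # 0

def has_duplicates_xor(values):
    """Return True if two equal integers exist, using XOR as the equality test."""
    ordered = sorted(values)  # duplicates become neighbours
    # Equal neighbours XOR to 0; any such pair means a duplicate.
    return any((left ^ right) == 0 for left, right in zip(ordered, ordered[1:]))

print(has_duplicates_xor([1, 5, 12, 156, 166, 2656, 6, 4, 5, 9]))  # True
print(has_duplicates_xor([1, 5, 12, 156]))                         # False
```

Sorting first keeps the work at O(n log n) instead of comparing every pair.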
Well, let's use list items as bit indices:
def dupliXor(source):
    bigs = 0
    for x in source:
        b = 1<<x
        nexts = bigs ^ b
        # if xor removes the bit instead
        # of adding it then it is duplicate
        if nexts < bigs:
            print(True)
            return
        bigs = nexts
    print(False)
a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124,1]
dupliXor(a) # True
a = [1,2,3,4,5,19,22,25,17,6,73,72,71,77,899,887,44,124]
dupliXor(a) # False

Reverse a list without using built-in functions

I'm using Python 3.5.
As part of a problem, I'm trying to design a function that takes a list as input and reverses it. So if x = [a, b, c] the function would make x = [c, b, a].
The problem is, I'm not allowed to use any built-in functions, and it has got me stuck. My initial thought was the following loop inside a function:
for revert in range(1, len(x) + 1):
    y.append(x[-revert])
And it works. But the problem is I'm using len(x), which I believe is a built-in function, correct?
So I searched around and have made the following very simple code:
y = x[::-1]
Which does exactly what I wanted, but it just seems almost too simple/easy and I'm not sure whether "::" counts as a function.
So I was wondering if anyone had any hints/ideas how to manually design said function? It just seems really hard when you can't use any built-in functions and it has me stuck for quite some time now.
range and len are both built-in functions. Since list methods are accepted, you could do this with insert. It is reeaallyy slow* but it does the job for small lists without using any built-ins:
def rev(l):
    r = []
    for i in l:
        r.insert(0, i)
    return r
By continuously inserting at the zero-th position you end up with a reversed version of the input list:
>>> print(rev([1, 2, 3, 4]))
[4, 3, 2, 1]
Doing:
def rev(l):
    return l[::-1]
could also be considered a solution. ::-1 (:: has a different result) isn't a function (it's a slice) and [] is, again, a list method. Also, contrasting insert, it is faster and way more readable; just make sure you're able to understand and explain it. A nice explanation of how it works can be found in this S.O answer.
*Reeaaalllyyyy slow, see juanpa.arrivillaga's answer for cool plot and append with pop and take a look at in-place reverse on lists as done in Yoav Glazner's answer.
:: is not a function; it's part of Python's syntax, and so is [].
How can you check whether :: and [] are functions or not? Simple:
import dis
a = [1,2]
dis.dis(compile('a[::-1]', '', 'eval'))
  1           0 LOAD_NAME                0 (a)
              3 LOAD_CONST               0 (None)
              6 LOAD_CONST               0 (None)
              9 LOAD_CONST               2 (-1)
             12 BUILD_SLICE              3
             15 BINARY_SUBSCR
             16 RETURN_VALUE
If :: and [] were functions, you would find a CALL_FUNCTION instruction among the Python instructions executed for the a[::-1] expression. There is none, so they aren't.
Look at what the Python instructions look like when you call a function, let's say the list() function:
>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE
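The same check can be done programmatically instead of reading the listing by eye. This is a sketch under one assumption: that call instructions have names starting with CALL, which holds for the CALL_FUNCTION opcode shown above and for the CALL opcode used by newer CPython releases, though the exact names vary by version. The helper name uses_call is hypothetical.

```python
import dis

def uses_call(expression):
    """Return True if the compiled expression contains a call instruction."""
    code = compile(expression, "<expr>", "eval")
    # Compiling does not evaluate the expression, so 'a' need not exist.
    return any(ins.opname.startswith("CALL") for ins in dis.get_instructions(code))

slice_calls = uses_call("a[::-1]")
list_calls = uses_call("list()")
print(slice_calls, list_calls)
```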
So, basically
def rev(f):
    return f[::-1]
works fine. But I think you should do something like what Jim suggested in his answer if your question is homework or was set by your teacher. You can still add this quickest way as a side note.
If your teacher complains about the [::-1] notation, show him the example I gave you.
Another way ( just for completeness :) )
def another_reverse(lst):
    new_lst = lst.copy() # make a copy if you don't want to ruin lst...
    new_lst.reverse() # notice! this will reverse it in place
    return new_lst
Here's a solution that doesn't use built-in functions but relies on list methods. It reverses in place, as implied by your specification:
>>> x = [1,2,3,4]
>>> def reverse(seq):
...     temp = []
...     while seq:
...         temp.append(seq.pop())
...     seq[:] = temp
...
>>> reverse(x)
>>> x
[4, 3, 2, 1]
>>>
ETA
Jim, your answer using insert at position 0 was driving me nuts! That solution is quadratic time! You can use append and pop with a temporary list to achieve linear time using simple list methods. [Timing plot: reverse is in blue, rev is green.]
If it feels a little bit like "cheating" using seq[:] = temp, we could always loop over temp and append every item into seq and the time complexity would still be linear but probably slower since it isn't using the C-based internals.
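For an in-place reversal that needs neither a temporary list nor any built-in function, one option is to swap items from both ends toward the middle. This is only a sketch: the function name is hypothetical, and the length is counted by hand on the assumption that len() is off limits.

```python
def reverse_in_place(seq):
    # Count the items manually instead of calling len().
    count = 0
    for _ in seq:
        count += 1
    left, right = 0, count - 1
    # Swap the outermost pair, then move both positions inward.
    while left < right:
        seq[left], seq[right] = seq[right], seq[left]
        left += 1
        right -= 1

x = [1, 2, 3, 4]
reverse_in_place(x)
print(x)  # [4, 3, 2, 1]
```

Each item is touched once, so this stays linear in the length of the list.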
Your example that works:
y = x[::-1]
uses Python slices notation which is not a function in the sense that I assume you're requesting. Essentially :: acts as a separator. A more verbose version of your code would be:
y = x[len(x):None:-1]
or
y = x[start:end:step]
I probably wouldn't be complaining that Python makes your life really, really easy.
Edit to be super pedantic. Someone could argue that calling [] at all is using an inbuilt python function because it's really syntactical sugar for the method __getitem__().
x.__getitem__(0) == x[0]
And using :: does make use of the slice() object.
x.__getitem__(slice(len(x), None, -1)) == x[::-1]
But... if you were to argue this, anything you write in python would be using inbuilt python functions.
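The equivalence can be checked directly. A small sketch with an illustrative list, using slice(None, None, -1), which is the slice object that the x[::-1] syntax builds:

```python
x = ['a', 'b', 'c']

# Indexing and slicing both go through __getitem__.
first = x.__getitem__(0)
reversed_copy = x.__getitem__(slice(None, None, -1))

print(first == x[0])             # True
print(reversed_copy == x[::-1])  # True
```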
Another way for completeness, range() takes an optional step parameter that will allow you to step backwards through the list:
def reverse_list(l):
    return [l[i] for i in range(len(l)-1, -1, -1)]
The most Pythonic and efficient way to achieve this is list slicing. And since you mentioned you cannot use any built-in function, it completely satisfies your requirement. For example:
>>> def reverse_list(list_obj):
...     return list_obj[::-1]
...
>>> reverse_list([1, 3, 5 , 3, 7])
[7, 3, 5, 3, 1]
Just iterate the list from right to left to get the items..
a = [1,2,3,4]
def reverse_the_list(a):
    reversed_list = []
    for i in range(0, len(a)):
        reversed_list.append(a[len(a) - i - 1])
    return reversed_list
new_list = reverse_the_list(a)
print(new_list)

Python command working but why?

I have a simple code that goes like this in Python:
a = [1,2,3]
b = [2,4,6]
def union(a,b):
    pos = 0
    while pos < len(b):
        n = b[pos]
        if n in a is not 'True':
            a = a
        else:
            a.append(n)
        pos = pos +1
    return a
print(union(a,b))
As you can see, the first IF statement makes no sense. However, if I code it this way:
if n in a is 'True':
    a.append(n)
it does not work. The first code segment changes a = [1,2,4,6] - only adding numbers from list 'b' that are not in list 'a' already. If I change the 'IF' snippet to "is 'True" as suggested, it does not work.
While this function does what I intended it to do, I feel it is not clean and I have no idea why "if n in a is 'True':" would not behave equal to the else part of the "if n in a is not 'True':" function.
Can somebody please help me understand this?
Doing a boolean check and then comparing the result with a string is not very Pythonic, so it would be better to do it this way:
a = [1,2,3]
b = [2,4,6]
def union(x,y):
    for v in y:
        if v not in x:
            x.append(v)
    return x
print(union(a,b))
OR:
a.extend(set(b).difference(set(a)))
print(a)
# [1, 2, 3, 4, 6]
OR, in case you don't mind creating new objects:
print(list(set(a).union(b)))
in and is/is not are both relational operators, and in Python relational operators are chained. Therefore n in a is not 'True' is equivalent to n in a and a is not 'True', and n in a is 'True' is equivalent to n in a and a is 'True'. Clearly these are not negations of each other since they both have n in a.
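The chaining can be demonstrated with a short sketch. The variable flag stands in for the string literal 'True' from the question (an assumption made only to avoid the warning newer Python versions emit for is with a literal), and the function names are illustrative.

```python
a = [1, 2, 3]
flag = 'True'  # stands in for the string literal used in the question

def chained_is_not(n):
    return n in a is not flag            # what the question wrote

def expanded_is_not(n):
    return (n in a) and (a is not flag)  # what Python actually evaluates

def chained_is(n):
    return n in a is flag                # same as (n in a) and (a is flag)

for n in (2, 4):
    print(n, chained_is_not(n), expanded_is_not(n), chained_is(n))
# 2 True True False
# 4 False False False
```

So the first version only reaches its else branch when n is not in a, while the second version's condition is never true, which is why nothing gets appended.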
But don't use is unless you know you need it, and never compare against a boolean either (unless yadda yadda).
You should just use True not the string 'True'
or better yet, just
if n not in a:
    a.append(n)
If you are a beginner, you may not realise that Python has a builtin type called set
set objects already have methods for intersection/union etc.
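A brief sketch of those set operations on the lists from the question. Sets are unordered, so the results are sorted here only to make the printed output predictable.

```python
a = [1, 2, 3]
b = [2, 4, 6]

merged = set(a) | set(b)  # union, same as set(a).union(b)
common = set(a) & set(b)  # intersection, same as set(a).intersection(b)

print(sorted(merged))  # [1, 2, 3, 4, 6]
print(sorted(common))  # [2]
```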
You can use
if n in a
or
if n not in a
instead of the is 'True'.

Equivalent for inject() in Python?

In Ruby, I'm used to using Enumerable#inject for going through a list or other structure and coming back with some conclusion about it. For example,
[1,3,5,7].inject(true) {|allOdd, n| allOdd && n % 2 == 1}
to determine if every element in the array is odd. What would be the appropriate way to accomplish the same thing in Python?
To determine if every element is odd, I'd use all()
def is_odd(x):
    return x%2==1
result = all(is_odd(x) for x in [1,3,5,7])
In general, however, Ruby's inject is most like Python's reduce() (in Python 3 it lives in functools, so use from functools import reduce):
result = reduce(lambda x,y: x and y%2==1, [1,3,5,7], True)
all() is preferred in this case because it will be able to escape the loop once it finds a False-like value, whereas the reduce solution would have to process the entire list to return an answer.
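The early exit can be made visible by recording which elements each approach actually visits. A sketch with illustrative names, using a list that has an even number in second position:

```python
from functools import reduce

values = [1, 2, 3, 5, 7]

seen_by_all = []
def is_odd_logged(n):
    seen_by_all.append(n)     # record every element all() looks at
    return n % 2 == 1

all_result = all(is_odd_logged(n) for n in values)

seen_by_reduce = []
def step(acc, n):
    seen_by_reduce.append(n)  # record every element reduce() looks at
    return acc and n % 2 == 1

reduce_result = reduce(step, values, True)

print(all_result, seen_by_all)        # False [1, 2]
print(reduce_result, seen_by_reduce)  # False [1, 2, 3, 5, 7]
```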
Sounds like reduce in Python or fold(r|l)'?' from Haskell.
reduce(lambda x, y: x and y % 2 == 1, [1, 3, 5], True)
I think you probably want to use all, which is less general than inject. reduce is the Python equivalent of inject, though.
all(n % 2 == 1 for n in [1, 3, 5, 7])
