Problem with quicksort and python

Problem with quicksort and python - python

I'm writing a python program that plays poker for a class, and I need to sort a list of five card hands. I have a function called wins(), which takes two hands and returns True if the first one beats the second one, False if it doesn't. I wrote an implementation of quicksort to sort the list of hands, and I noticed that it was taking much longer than expected, so I programmed it to print the length of each list it was sorting. The function looks like this:
def sort(l):
if len(l) <= 1:
return l
print len(l)
pivot = choice(l)
l.remove(pivot)
left = []
right = []
for i in l:
if wins(i, pivot) == True:
right.append(i)
else:
left.append(i)
return sort(left) + [pivot] + sort(right)
and when I had it sort 64 hands, it printed:
64,
53,
39,
26,
25,
24,
23,
22,
21,
20,
19,
18,
17,
16,
15,
14,
13,
12,
11,
10,
9,
8,
7,
6,
5,
4,
3,
2,
12,
7,
3,
2,
3,
2,
4,
3,
2,
13,
9,
6,
2,
3,
2,
2,
3,
10,
8,
2,
5,
4,
3,
2. Notice the consecutive sequence in the middle? I can't figure out why it does that, but it's causing quicksort to behave like an O(n^2). It doesn't make sense for it to choose the strongest hand as a pivot on every iteration, but that's what seems to be happening. What am I overlooking?

Edited answer after comments:
Could you try the following code and share your results. It introduces an equivalence class to reduce the population in the "lose or tie" (or "smaller than or equal to") group while forming an answer via recursion.
# modified quicksort with equivalence class.
def sort(l):
if len(l) <= 1:
return l
print len(l)
pivot = choice(l)
l.remove(pivot)
left = []
right = []
# I added an equivalence class
pivotEquivalence = [pivot]
# and an equivalence partitioning
# to reduce number of elements
# in the upcoming recursive calls
for i in l:
if wins(i, pivot) == True:
right.append(i)
elif wins(pivot,i) == True:
left.append(i)
else
pivotEquivalence.append(i)
return sort(left) + pivotEquivalence + sort(right)
======
Instead of choosing the pivot randomly, try taking the middle indexed element. on average it will guarantee the O(N log N).
Also for O notation, you need to make many simulations to collect empirical data. Just one example might be very misleading.
Try to print the pivot, and it's index as well. Shuffle your list by random.shuffle(list) and try again to see the distribution. Just ensure system time seeding as well.
Could you send post the complete code, and your data for community to repeat the problem?
Sincerely,
Umut

Related

If/else statement with a triplet that sum to a given value

I'm writing a sum up game where two players will take turns picking a random number in the range (1,9), no repeated number allowed. So I'm struggling at
If at any point exactly three of the player's numbers sum to 15, then that player has won.
If the first player picks [7, 2, 3, 5], he will win because 7+3+5 = 15
So my question is why doesn't the program stop when first_player has inputs == 15
I want to avoid importing any libs.

Instead of generating all permutations at each step, maintain a map of what each permutation sums to, then add two branches to each branch at each move.
Think of each entry as a set of bits, ie with each permutation you either include a given entry or not, eg if the numbers are [7, 3, 2] you might store [1, 0, 1] for the combination of the 7 and the 2.
You can make a hashmap of 101->9 etc and when someone adds a 3 to it you add an entry for 1010->9 and 1011->12. As soon as you see the target you know the game is over.
So the evolution of [7, 3, 2] would be
0->0
1->7
00->0
01->3
10->7
11->10
000->0
001->2
010->3
011->5
100->7
101->9
110->10
111->12

A more efficient way would be to find only those numbers whose sum is equal to the target that is 15.
entry = [7, 5, 1, 3]
def is_sum_15(nums):
res = []
search_numbers(nums, 3, 15, 0, [], res)
return len(res) != 0
def search_numbers(nums, k, n, index, path, res):
if k < 0 or n < 0:
return
if k == 0 and n == 0:
res.append(path)
for i in range(index, len(nums)):
search_numbers(nums, k-1, n-nums[i], i+1, path+[nums[i]], res)
print(is_sum_15(entry)) # True

An inefficient but easy way is to use itertools.permutations:
>>> entry = [7, 2, 3, 5]
>>> import itertools
>>> [sum(triplet) for triplet in itertools.permutations(entry, r=3) if sum(tr]
[12, 14, 12, 15, 14, 15, 12, 14, 12, 10, 14, 10, 12, 15, 12, 10, 15, 10, 14, 15, 14, 10, 15, 10]
>>> any(sum(triplet) == 15 for triplet in itertools.permutations(entry, r=3))
True
It's inefficient, because you would be trying all permutations every time entry gets expanded with a new number.

Returning a list of the ordered elements with a conition in while loop structure

I am trying to solve a assignment where are 13 lights and starting from 1, light is turned off at every 5th light, when the count reaches 13, start from 1st item again. The function should return the order of lights turned off. In this case, for a list of 13 items, the return list would be [5, 10, 2, 8, 1, 9, 4, 13, 12, 3, 7, 11, 6]. Also, turned off lights would not count again.
So the way I was going to approach this problem was to have a list named turnedon, which is [1,2,3,4,5,6,7,8,9,10,11,12,13] and an empty list called orderoff and append to this list whenever a light gets turned off in the turnedon list. So while the turnedon is not empty, iterate through the turnedon list and append the light getting turned off and remove that turnedoff light from the turnedon list, if that makes sense. I cannot figure out what should go into the while loop though. Any idea would be really appreciated.
def orderoff():
n=13
turnedon=[]
for n in range(1,n+1):
turnedon.append(n)
orderoff=[]
while turneon !=[]:

This problem is equivalent to the well-known Josephus problem, in which n prisoners stand in a circle, and they are killed in a sequence where each time, the next person to be killed is k steps around the circle from the previous person; the steps are only counted over the remaining prisoners. A sample solution in Python can be found on the Rosetta code website, which I've adapted slightly below:
def josephus(n, k):
p = list(range(1, n+1))
i = 0
seq = []
while p:
i = (i+k-1) % len(p)
seq.append(p.pop(i))
return seq
Example:
>>> josephus(13, 5)
[5, 10, 2, 8, 1, 9, 4, 13, 12, 3, 7, 11, 6]

This works, but the results are different from yours:
>>> pos = 0
>>> result = []
>>> while len(result) < 13 :
... pos += 5
... pos %= 13
... if pos not in result :
... result.append(pos)
...
>>> result = [i+1 for i in result] # make it 1-based, not 0-based
>>> result
[6, 11, 3, 8, 13, 5, 10, 2, 7, 12, 4, 9, 1]
>>>

I think a more optimal solution would be to use a loop, add the displacement each time, and use modules to keep the number in range
def orderoff(lights_num,step):
turnd_off=[]
num =0
for i in range(max):
num =((num+step-1)%lights_num)+1
turnd_off.append(num)
return turnd_off
print(orderoff(13))

Is there a way to find the nᵗʰ entry in itertools.combinations() without converting the entire thing to a list?

I am using the itertools library module in python.
I am interested the different ways to choose 15 of the first 26000 positive integers. The function itertools.combinations(range(1,26000), 15) enumerates all of these possible subsets, in a lexicographical ordering.
The binomial coefficient 26000 choose 15 is a very large number, on the order of 10^54. However, python has no problem running the code y = itertools.combinations(range(1,26000), 15) as shown in the sixth line below.
If I try to do y[3] to find just the 3rd entry, I get a TypeError. This means I need to convert it into a list first. The problem is that trying to convert it into a list gives a MemoryError. All of this is shown in the screenshot above.
Converting it into a list does work for smaller combinations, like 6 choose 3, shown below.
My question is:
Is there a way to access specific elements in itertools.combinations() without converting it into a list?
I want to be able to access, say, the first 10000 of these ~10^54 enumerated 15-element subsets.
Any help is appreciated. Thank you!

You can use a generator expression:
comb = itertools.combinations(range(1,26000), 15)
comb1000 = (next(comb) for i in range(1000))
To jump directly to the nth combination, here is an itertools recipe:
def nth_combination(iterable, r, index):
"""Equivalent to list(combinations(iterable, r))[index]"""
pool = tuple(iterable)
n = len(pool)
if r < 0 or r > n:
raise ValueError
c = 1
k = min(r, n-r)
for i in range(1, k+1):
c = c * (n - k + i) // i
if index < 0:
index += c
if index < 0 or index >= c:
raise IndexError
result = []
while r:
c, n, r = c*r//n, n-1, r-1
while index >= c:
index -= c
c, n = c*(n-r)//n, n-1
result.append(pool[-1-n])
return tuple(result)
It's also available in more_itertools.nth_combination
>>> import more_itertools # pip install more-itertools
>>> more_itertools.nth_combination(range(1,26000), 15, 123456)
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 18, 19541)
To instantly "fast-forward" a combinations instance to this position and continue iterating, you can set the state to the previously yielded state (note: 0-based state vector) and continue from there:
>>> comb = itertools.combinations(range(1,26000), 15)
>>> comb.__setstate__((0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 17, 19540))
>>> next(comb)
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 18, 19542)

If you want to access the first few elements, it's pretty straightforward with islice:
import itertools
print(list(itertools.islice(itertools.combinations(range(1,26000), 15), 1000)))
Note that islice internally iterates the combinations up to the specified point, so it can't magically give you the middle elements without iterating all the way there. You'd have to go down the route of computing the elements you want combinatorially in that case.

Building up a counting function

I need to build up a counting function starting from a dictionary. The dictionary is a classical Bag_of_Words and looks like as follows:
D={'the':5, 'pow':2, 'poo':2, 'row':2, 'bub':1, 'bob':1}
I need the function that for a given integer returns the number of words with at least that number of occurrences. In the example F(2)=4, all words but 'bub' and 'bob'.
First of all I build up the inverse dictionary of D:
ID={5:1, 2:3, 1:2}
I think I'm fine with that. Then here is the code:
values=list(ID.keys())
values.sort(reverse=True)
Lk=[]
Nw=0
for val in values:
Nw=Nw+ID[val]
Lk.append([Nw, val])
The code works fine but I do not like it. The point is that I would prefer to use a list comprehension to build up Lk; also I really ate the Nw variable I have used. It does not seems pythonic at all

you can create a sorted array of your word counts then find the insertion point with np.searchsorted to get how many items are to either side of it... np.searchsorted is very efficient and fast. If your dictionary doesn't change often this call is basically free compared to other methods
import numpy as np
def F(n, D):
#creating the array each time would be slow if it doesn't change move this
#outside the function
arr = np.array(D.values())
arr.sort()
L = len(arr)
return L - np.searchsorted(arr, n) #this line does all the work...
what's going on....
first we take just the word counts (and convert to a sorted array)...
D = {"I'm": 12, "pretty": 3, "sure":12, "the": 45, "Donald": 12, "is": 3, "on": 90, "crack": 11}
vals = np.arrau(D.values())
#vals = array([90, 12, 12, 3, 11, 12, 45, 3])
vals.sort()
#vals = array([ 3, 3, 11, 12, 12, 12, 45, 90])
then if we want to know how many values are greater than or equal to n, we simply find the length of the list beyond the first number greater than or equal to n. We do this by determining the leftmost index where n would be inserted (insertion sort) and subtracting that from the total number of positions (len)
# how many are >= 10?
# insertion point for value of 10..
#
# | index: 2
# v
# array([ 3, 3, 11, 12, 12, 12, 45, 90])
#find how many elements there are
#len(arr) = 8
#subtract.. 2-8 = 6 elements that are >= 10

A fun little trick for counting things: True has a numerical value of 1 and False has a numerical value of 0. SO we can do things like
sum(v >= k for v in D.values())
where k is the value you're comparing against.

collections.Counter() is ideal choice for this. Use them on dict.values() list. Also, you need not to install them explicitly like numpy. Sample example:
>>> from collections import Counter
>>> D = {'the': 5, 'pow': 2, 'poo': 2, 'row': 2, 'bub': 1, 'bob': 1}
>>> c = Counter(D.values())
>>> c
{2: 3, 1: 2, 5: 1}

Generating a list of prime numbers using list comprehension

I'm trying to create a list of all the prime numbers less than or equal to a given number. I did that successfully using for loops. I was trying to achieve the same using list comprehension using python. But my output has some unexpected values.
Here is my code..
pr=[2]
pr+=[i for i in xrange(3,num+1) if not [x for x in pr if i%x==0]]
where num is the number I had taken as input from user.
The output of the above code for
num=20 is this: [2, 3, 5, 7, 9, 11, 13, 15, 17, 19]
I'm puzzled as to why 9 and 15 are there in the output. What am I doing wrong here?

It simply doesn’t work that way. List comprehensions are evaluated separately, so you can imagine it like this:
pr = [2]
tmp = [i for i in xrange(3,num+1) if not [x for x in pr if i%x==0]]
pr += tmp
By the time tmp is evaluated, pr only contains 2, so you only ever check if a number is divisible by 2 (i.e. if it’s even). That’s why you get all uneven numbers.
You simply can’t solve this nicely† using list comprehensions.
† Not nicely, but ugly and in a very hackish way, by abusing that you can call functions inside a list comprehension:
pr = [2]
[pr.append(i) for i in xrange(3,num+1) if not [x for x in pr if i%x==0]]
print(pr) # [2, 3, 5, 7, 11, 13, 17, 19]
This abuses list comprehensions and basically collects a None value for each prime number you add to pr. So it’s essentially like your normal for loop except that we unnecessarily collect None values in a list… so you should rather allow yourself to use a line break and just use a normal loop.

Your list pr doesn't update until after your entire list comprehension is done. This means your list only contains 2, so every number dividable by 2 is not in the list (as you can see). You should update the list whenever you found a new prime number.

This is because the pr += [...] is evaluated approximately as this:
pr = [2]
tmp = [i for i in xrange(3,num+1) if not [x for x in pr if i%x==0]]
pr.extend(tmp)
So while tmp is generated, contents of pr remains the same ([2]).
I would go with function like this:
>>> import itertools
>>> def primes():
... results = []
... for i in itertools.count(2):
... if all(i%x != 0 for x in results):
... results.append(i)
... yield i
...
# And then you can fetch first 10 primes
>>> list(itertools.islice(primes(), 10))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
# Or get primes smaller than X
>>> list(itertools.takewhile(lambda x: x < 50, primes()))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
Note, that using all is more efficient than creating array and testing whether it's empty.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Problem with quicksort and python - python

Related

If/else statement with a triplet that sum to a given value

Returning a list of the ordered elements with a conition in while loop structure

Is there a way to find the nᵗʰ entry in itertools.combinations() without converting the entire thing to a list?

Building up a counting function

Generating a list of prime numbers using list comprehension

Categories

Resources