Sorting algorithm to keep equal values separated - python

Psychology experiments often require you to pseudo-randomize the trial order, so that the trials are apparently random, but you don't get too many similar trials consecutively (which could happen with a purely random ordering).
Let's say that the visual display on each trial has a colour and a size:
display_list = []
colours = {0: 'red', 1: 'blue', 2: 'green', 3: 'yellow'}
sizes = [1] * 20 + [2] * 20 + [3] * 20 + [4] * 20 + [5] * 20 + [6] * 20
for i in range(120):
    display_list.append({'colour': colours[i % 4], 'size': sizes[i]})
print(display_list)
And we can look at the maximum number of consecutive trials that have the same value for either property using this function:
def consecutive_properties(seq, field):
    longest_run = 0
    prev_value = None
    current_run = 0
    for d in seq:
        if d[field] == prev_value:
            current_run += 1
        else:
            current_run = 1
        if current_run > longest_run:
            longest_run = current_run
        prev_value = d[field]
    return longest_run
Output:
>>> print("Consecutive colours: ", consecutive_properties(display_list, 'colour'))
Consecutive colours:  1
>>> print("Consecutive sizes: ", consecutive_properties(display_list, 'size'))
Consecutive sizes:  20
Are there any algorithms you know of that would allow minimizing the consecutive runs of either or both properties, or at least keep these runs below a specified length? If the latter, let's say no more than 4 in a row of the same colour or size.
What I've tried:
The solution I have now basically does a slightly intelligent bogosort, which has to be horribly inefficient. Basically:
You break the entire list into chunks containing all the permutations of the properties: if you break down display_list into chunks of length 24, each chunk has each colour paired with each size. Let's assume that the trial list can always be broken down into these permutation chunks, since you know what the permutations are from the design of the experiment.
You choose a maximum run length per chunk
You shuffle each chunk until the run lengths for each chunk are below the maximum value (this actually means that in the overall trial list, your runs might be double that length, since you could have a run of this length at the end of one chunk and the start of the next); a sketch of this appears below.
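A minimal sketch of that chunk-based reshuffle, under the stated assumption that each chunk of 24 trials contains every colour/size pairing (max_run and shuffle_chunk are names introduced here for illustration; note that the display_list built above does not itself satisfy that assumption, since its sizes come in blocks of 20):
import random

def max_run(chunk, field):
    # length of the longest run of equal values for one property
    longest = run = 1
    for a, b in zip(chunk, chunk[1:]):
        run = run + 1 if a[field] == b[field] else 1
        longest = max(longest, run)
    return longest

def shuffle_chunk(chunk, max_len=4, max_tries=10000):
    # reshuffle until neither property runs longer than max_len
    chunk = list(chunk)
    for _ in range(max_tries):
        if all(max_run(chunk, f) <= max_len for f in ('colour', 'size')):
            return chunk
        random.shuffle(chunk)
    raise RuntimeError('no acceptable shuffle found for this chunk')

# e.g. a trial list broken into permutation chunks of length 24:
# pseudo_random = []
# for start in range(0, len(trial_list), 24):
#     pseudo_random.extend(shuffle_chunk(trial_list[start:start + 24]))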

Question: Are there any algorithms you know of that would allow minimizing the
consecutive runs of either or both properties, or at least keep these
runs below a specified length?
Yes. There is an easy algorithm for doing this by simply reducing the probability of a color or size being chosen if it is already occurring in a run.
from random import randrange

def choose(colors, numselections, maxrun):
    'Repeatedly choose colors. Gradually reduce selection probability to avoid runs.'
    colors = list(colors)
    n = len(colors)
    total = n * maxrun
    current_run = 0
    for _ in range(numselections):
        i = randrange(total - current_run) // maxrun
        yield colors[i]
        colors[i], colors[-1] = colors[-1], colors[i]
        current_run = current_run + 1 if i == n - 1 else 1

if __name__ == '__main__':
    colors = ['red', 'blue', 'green', 'yellow']
    for color in choose(colors, 100, maxrun=4):
        print(color)
Note that this approach requires less effort than the other answers, which use reselection techniques to avoid runs. Also note that the runs are faded out gradually rather than cut off all at once as in the other answers.
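A quick sanity check of the run lengths (a sketch; it assumes the choose generator above is in scope and reuses itertools.groupby to measure runs; since the colour values are distinct, the same colour can only repeat through the last slot, so runs never exceed maxrun):
from itertools import groupby

seq = list(choose(['red', 'blue', 'green', 'yellow'], 10000, maxrun=4))
longest = max(len(list(g)) for _, g in groupby(seq))
print(longest)  # never exceeds 4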

You're clearly not concerned with anything like true randomness, so if you define a distance metric and draw your sequence randomly, you can reject any new draw whose distance is "too close" to the previous draw, and simply draw again.
If you're drawing from a finite set (say, a pack of cards), then the whole set can be the draw pile, and your sort would consist of swapping two elements whenever a close pair is found, while also rejecting a swap partner if the swapped element would itself become unacceptable, so that each swap step leaves the whole set improved.
If your criteria are not too hard to satisfy, this will terminate very quickly.
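A minimal sketch of that draw-and-reject idea; too_close is a hypothetical distance test you would supply (for the displays above it could compare the colour and size of adjacent trials):
import random

def draw_sequence(pool, too_close, max_tries=10000):
    # repeatedly draw from the pool, rejecting draws too close to the last one
    pool = list(pool)
    result = []
    tries = 0
    while pool:
        candidate = random.choice(pool)
        if result and too_close(result[-1], candidate):
            tries += 1
            if tries > max_tries:
                raise RuntimeError('criteria too hard to satisfy near the end')
            continue
        pool.remove(candidate)
        result.append(candidate)
        tries = 0
    return result
With easy criteria this terminates quickly, as the answer notes; the max_tries guard covers the case where only "too close" items remain at the end.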

If the likelihood of consecutive elements is not very high (as in your example), I would simply reshuffle if the condition is not met. As you can see, most of the time you get away with one try, so it is quite efficient.
In [1]: from random import shuffle
In [2]: from itertools import groupby
In [3]: from collections import Counter
In [4]: def pseudo_shuffle(lst, limit, tries=1):
   ...:     temp = list(lst)
   ...:     shuffle(temp)
   ...:     if max(sum(1 for x in g) for k, g in groupby(temp)) <= limit:
   ...:         return tries  # return temp to get the shuffled list itself
   ...:     return pseudo_shuffle(lst, limit, tries=tries+1)
In [5]: colors = 30*['red', 'blue', 'green', 'yellow']
In [6]: sizes = [1] * 20 + [2] * 20 + [3] * 20 + [4] * 20 + [5] * 20 + [6] * 20
In [7]: Counter([pseudo_shuffle(colors, 4) for _ in range(1000)])
Out[7]: Counter({1: 751, 2: 200, 3: 38, 4: 10, 5: 1})
In [8]: Counter([pseudo_shuffle(sizes, 4) for _ in range(1000)])
Out[8]: Counter({1: 954, 2: 44, 3: 2})

As ddyer said, you are interested in randomness rather than sorting. My solution here would be:
Pick a random element A from your source list
Pick a random position I in your destination list
Insert A at position I in the destination list
Check if the destination list is valid. If not, restore the previous state and repeat
A working snippet:
from random import randint
from operator import itemgetter

def reshuffle(_items, max_consequent):
    items = _items[:]
    new_order = []
    while items:
        src_pos = randint(0, len(items) - 1)
        dest_pos = randint(0, len(new_order))
        item = items[src_pos]
        new_order.insert(dest_pos, item)
        if is_new_order_fine(new_order, max_consequent):
            items.pop(src_pos)
        else:
            new_order.pop(dest_pos)
    return new_order

def is_new_order_fine(items, n_max):
    return (
        not has_consecutive_keys(items, n_max, key=itemgetter('colour')) and
        not has_consecutive_keys(items, n_max, key=itemgetter('size')))

# can be optimised - don't check all items, just the surrounding N
def has_consecutive_keys(items, n_max, key):
    _has_n_keys = False
    if len(items) >= n_max:
        last_value = key(items[0])
        n_consequent = 1
        for item in items[1:]:  # can optimize by using an iterator
            if key(item) == last_value:
                n_consequent += 1
            else:
                last_value = key(item)
                n_consequent = 1
            if n_consequent >= n_max:
                _has_n_keys = True
                break
    return _has_n_keys
Note that you don't need to check all items in the destination list each time, only the K items to the left and right of the newly inserted item (not implemented in the snippet).
Edit:
You can use groupby in has_consecutive_keys (but without sorting!)
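For instance, a possible groupby-based rewrite (a sketch with the same signature as has_consecutive_keys above):
from itertools import groupby

def has_consecutive_keys(items, n_max, key):
    # True if any run of equal key values is at least n_max long
    return any(sum(1 for _ in g) >= n_max for _, g in groupby(items, key=key))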

Sorry, it's not an answer, but it's hard to post code in comments. Here is a simpler way to write the consecutive_properties function:
from operator import itemgetter
from itertools import groupby

def consecutive_properties(seq, field):
    return max(sum(1 for x in g) for k, g in groupby(seq, key=itemgetter(field)))
When I understand your question properly I'll try to turn this into an answer :)


Find minimum number of elements in a list that covers specific values

A recruiter wants to form a team with different skills, and he wants to pick the minimum number of persons who can cover all the required skills.
N represents the number of persons and K is the number of distinct skills that need to be included. The list spec_skill = [[1,3],[0,1,2],[0,2,4]] provides information about the skills of each person, e.g. person 0 has skills 1 and 3, person 1 has skills 0, 1 and 2, and so on.
The code should output the size of the smallest team the recruiter could find (the minimum number of persons) and values indicating the specific IDs of the people to recruit onto the team.
I implemented the code with brute force as below, but since some inputs contain more than a thousand persons, it seems it needs to be solved with heuristic approaches. In this case an approximate answer is acceptable.
Any suggestion on how to solve it with heuristic methods will be appreciated.
import itertools

N, K = 3, 5
spec_skill = [[1, 3], [0, 1, 2], [0, 2, 4]]
A = list(range(K))
set_a = set(A)
solved = False
for L in range(0, len(spec_skill) + 1):
    for subset in itertools.combinations(spec_skill, L):
        s = set(item for sublist in subset for item in sublist)
        if set_a.issubset(s):
            print(str(len(subset)) + '\n' + ' '.join([str(spec_skill.index(item)) for item in subset]))
            solved = True
            break
    if solved:
        break
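For reference, the classic greedy heuristic for set cover (repeatedly pick the person who covers the most still-missing skills) is a natural starting point; greedy_cover below is a minimal sketch, not an exact solver, so it can return more people than the optimum:
def greedy_cover(spec_skill, K):
    # repeatedly pick the person covering the most still-missing skills
    needed = set(range(K))
    team = []
    while needed:
        best = max(range(len(spec_skill)),
                   key=lambda p: len(needed & set(spec_skill[p])))
        gained = needed & set(spec_skill[best])
        if not gained:
            return None  # the required skills cannot all be covered
        team.append(best)
        needed -= gained
    return team

print(greedy_cover([[1, 3], [0, 1, 2], [0, 2, 4]], 5))
# prints [1, 0, 2]; note greedy can overshoot (the optimal team here is [0, 2])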
Here is my way of doing this. There might be potential optimization possibilities in the code, but the base idea should be understandable.
import random
import time

def man_power(lst, K, iterations=None, period=0):
    """
    Specify a fixed number of iterations
    or a period in seconds to limit the total computation time.
    """
    # mapping each sublist into a (sublist, original_index) tuple
    lst2 = [(lst[i], i) for i in range(len(lst))]
    mini_sample = [0] * (len(lst) + 1)
    if period < 0 or (period == 0 and iterations is None):
        raise AttributeError("You must specify iterations or a positive period")

    def shuffle_and_pick(lst, iterations):
        mini = [0] * len(lst)
        for _ in range(iterations):
            random.shuffle(lst2)
            skillset = set()
            chosen_ones = []
            idx = 0
            fullset = True
            # Breaks from the loop when all skillsets are found
            while len(skillset) < K:
                # No need to go further, we didn't find a better combination
                if len(chosen_ones) >= len(mini):
                    fullset = False
                    break
                before = len(skillset)
                skillset.update(lst2[idx][0])
                after = len(skillset)
                if after > before:
                    # We append with the original index of the sublist
                    chosen_ones.append(lst2[idx][1])
                idx += 1
            if fullset:
                mini = chosen_ones.copy()
        return mini

    # Estimates how many iterations we can do in the specified period
    if iterations is None:
        t0 = time.perf_counter()
        mini_sample = shuffle_and_pick(lst, 1)
        iterations = int(period / (time.perf_counter() - t0)) - 1
    mini_result = shuffle_and_pick(lst, iterations)
    if len(mini_sample) < len(mini_result):
        return mini_sample, len(mini_sample)
    else:
        return mini_result, len(mini_result)
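A possible invocation with the question's data (the search is randomized, so repeated runs can return the team indices in a different order):
spec_skill = [[1, 3], [0, 1, 2], [0, 2, 4]]
team, size = man_power(spec_skill, K=5, iterations=1000)
print(team, size)  # e.g. [0, 2] 2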

Compute counts of set partitions with multiplicity and without order

I have a piece of code that computes partitions of a set of (potentially duplicated) integers, but I am interested in the set of possible partitions and their multiplicities.
You can, for example, run the following code:
import numpy as np
from collections import Counter
import pandas as pd

def _B(i):
    # for a given multi-index i, we define _B(i) as the set of integers containing i_j times the number j
    if len(i) != 1:
        B = []
        for j in range(len(i)):
            B.extend(i[j] * [j])
    else:
        B = i * [0]
    return B

def _partition(collection):
    # from here: https://stackoverflow.com/a/62532969/8425270
    if len(collection) == 1:
        yield (collection,)
        return
    first = collection[0]
    for smaller in _partition(collection[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + ((first,) + subset,) + smaller[n + 1:]
        # put `first` in its own subset
        yield ((first,),) + smaller

def to_list(tpl):
    # the final hierarchy is a list of lists
    return list(list(i) if isinstance(i, tuple) else i for i in tpl)

def _Pi(inst_B):
    # inst_B must be a tuple
    if type(inst_B) != tuple:
        inst_B = tuple(inst_B)
    pp = [tuple(sorted(p)) for p in _partition(inst_B)]
    c = Counter(pp)
    Pi = c.keys()
    N = list()
    for pi in Pi:
        N.append(c[pi])
    Pi = [to_list(pi) for pi in Pi]
    return Pi, N

if __name__ == "__main__":
    import cProfile
    pr = cProfile.Profile()
    pr.enable()
    sh = (3, 3, 3)
    rez = list()
    rez_sorted = list()
    rez_ref = list()
    for idx in np.ndindex(sh):
        if sum(idx) > 0:
            print(idx)
            Pi, N = _Pi(_B(idx))
            print(pd.DataFrame({'Pi': Pi, 'N': N * np.array([np.math.factorial(len(pi) - 1) for pi in Pi])}))
    pr.disable()
    # after your program ends
    pr.print_stats(sort="tottime")
This code computes, for several example tuples of integers (generated by np.ndindex), the partitions and counts I need. Everything happens in the _partition and _Pi functions; that is where you should look.
If you look closely at how these two functions work, you'll see that they compute every potential partition and THEN count how many times each appeared. For small problems this is fine, but as the size of the problem increases, this starts to take a lot of time. Try setting sh = (5, 5, 5) and you'll see what I mean.
So the problem is the following:
Is there a way to compute the partitions and their numbers of occurrences directly instead?
Edit: I cross-posted on MathOverflow, and they proposed a solution in this article, in Corollary 2.10 (page 10 of the PDF). The problem could be solved by implementing the sets p(v, r) from this corollary.
I was hoping, as in the univariate case, that those sets would have a nice recursive expression, but I could not find one yet.
More edit: This problem is equivalent to finding all (multiset-)partitions of a multiset. If the solution for finding (set-)partitions of a set is given by Bell partial polynomials, here we need a multivariate version of these polynomials.
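For comparison, sympy ships a generator for exactly this enumeration, sympy.utilities.iterables.multiset_partitions, which yields each distinct multiset partition once instead of producing duplicates and counting them afterwards (a separate counting step would still be needed for the multiplicities N):
from sympy.utilities.iterables import multiset_partitions

# the multiset {0, 0, 1} has four distinct partitions
for p in multiset_partitions([0, 0, 1]):
    print(p)
# [[0, 0, 1]], [[0, 0], [1]], [[0, 1], [0]], [[0], [0], [1]] (in some order)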

Fill order from smaller packages?

The input is an integer that specifies the amount to be ordered.
There are predefined package sizes that have to be used to create that order.
e.g.
Packs
3 for $5
5 for $9
9 for $16
for an input order 13 the output should be:
2x5 + 1x3
So far I have the following approach:
remaining_order = 13
package_numbers = [9, 5, 3]
required_packages = []

while remaining_order > 0:
    found = False
    for pack_num in package_numbers:
        if pack_num <= remaining_order:
            required_packages.append(pack_num)
            remaining_order -= pack_num
            found = True
            break
    if not found:
        break
But this will lead to the wrong result:
1x9 + 1x3
remaining: 1
So, you need to fill the order with packages such that the total price is maximal? This is known as the knapsack problem. In that Wikipedia article you'll find several solutions written in Python.
To be more precise, you need a solution for the unbounded knapsack problem, in contrast to the popular 0/1 knapsack problem (where each item can be packed at most once). Here is working code from Rosetta Code:
from itertools import product

NAME, SIZE, VALUE = range(3)
items = (
    # NAME, SIZE, VALUE
    ('A', 3, 5),
    ('B', 5, 9),
    ('C', 9, 16))
capacity = 13

def knapsack_unbounded_enumeration(items, C):
    # find max count of any one item
    max1 = [int(C / item[SIZE]) for item in items]
    itemsizes = [item[SIZE] for item in items]
    itemvalues = [item[VALUE] for item in items]

    def totvalue(itemscount):
        totsize = sum(n * size for n, size in zip(itemscount, itemsizes))
        totval = sum(n * val for n, val in zip(itemscount, itemvalues))
        return (totval, -totsize) if totsize <= C else (-1, 0)

    # Try all combinations of items from 0 up to max1
    bagged = max(product(*[range(n + 1) for n in max1]), key=totvalue)
    numbagged = sum(bagged)
    value, size = totvalue(bagged)
    size = -size
    # convert to (item, count) pairs in name order
    bagged = ['%dx%d' % (n, items[i][SIZE]) for i, n in enumerate(bagged) if n]
    return value, size, numbagged, bagged

if __name__ == '__main__':
    value, size, numbagged, bagged = knapsack_unbounded_enumeration(items, capacity)
    print(value)
    print(bagged)
Output is:
23
['1x3', '2x5']
Keep in mind that this is an NP-hard problem, so it will blow up as you enter larger values :)
You can use itertools.product:
import itertools

remaining_order = 13
package_numbers = [9, 5, 3]
required_packages = []
# at most remaining_order // smallest + 1 packages are ever useful
a = min([x for i in range(1, remaining_order // min(package_numbers) + 2)
         for x in itertools.product(package_numbers, repeat=i)],
        key=lambda x: abs(sum(x) - remaining_order))
remaining_order -= sum(a)
print(a)
print(remaining_order)
Output:
(5, 5, 3)
0
This simply does the following steps:
Generate all products of the package sizes and pick the combination whose sum is closest to 13.
Then subtract that sum from remaining_order.
If you want the output formatted with 'x':
import itertools
from collections import Counter

remaining_order = 13
package_numbers = [9, 5, 3]
required_packages = []
a = min([x for i in range(1, remaining_order // min(package_numbers) + 2)
         for x in itertools.product(package_numbers, repeat=i)],
        key=lambda x: abs(sum(x) - remaining_order))
remaining_order -= sum(a)
print(' + '.join(['{0}x{1}'.format(v, k) for k, v in Counter(a).items()]))
print(remaining_order)
Output:
2x5 + 1x3
0
For your problem, I tried two implementations depending on what you want; in both solutions I assumed you absolutely need the remaining order to reach 0. Otherwise the algorithm will return -1. If you need it, tell me and I can adapt my algorithm.
As the algorithm is implemented via dynamic programming, it handles large inputs well, at least more than 130 packages!
In the first solution, I assumed we fill with the biggest package each time.
In the second solution, I try to minimize the price, while the remaining order must still end at 0.
remaining_order = 13
package_numbers = sorted([9, 5, 3], reverse=True)  # To make sure the biggest package is the first element
prices = {9: 16, 5: 9, 3: 5}
required_packages = []

# First solution: using the biggest package each time, and making the total order remaining at 0 each time
ans = [[] for _ in range(remaining_order + 1)]
ans[0] = [0, 0, 0]
for i in range(1, remaining_order + 1):
    for index, package_number in enumerate(package_numbers):
        if i - package_number > -1:
            tmp = ans[i - package_number]
            if tmp != -1:
                ans[i] = [tmp[x] if x != index else tmp[x] + 1 for x in range(len(tmp))]
                break
    else:  # Using for-else instead of a boolean value `found`
        ans[i] = -1  # -1 marks the not-found combinations

print(ans[13])  # [0, 2, 1]
print(ans[9])   # [1, 0, 0]

# Second solution: minimizing the price with order at 0
def price(x):
    return 16 * x[0] + 9 * x[1] + 5 * x[2]

ans = [[] for _ in range(remaining_order + 1)]
ans[0] = ([0, 0, 0], 0)  # combination + price
for i in range(1, remaining_order + 1):
    # The not-found packages will be (-1, float('inf'))
    minimal_price = float('inf')
    minimal_combinations = -1
    for index, package_number in enumerate(package_numbers):
        if i - package_number > -1:
            tmp = ans[i - package_number]
            if tmp != (-1, float('inf')):
                tmp_price = price(tmp[0]) + prices[package_number]
                if tmp_price < minimal_price:
                    minimal_price = tmp_price
                    minimal_combinations = [tmp[0][x] if x != index else tmp[0][x] + 1 for x in range(len(tmp[0]))]
    ans[i] = (minimal_combinations, minimal_price)

print(ans[13])  # ([0, 2, 1], 23)
print(ans[9])   # ([0, 0, 3], 15) because three packages of 3 are cheaper than one package of 9
In case you need a solution for a small number of possible package_numbers but a possibly very big remaining_order, in which case all the other solutions would fail, you can use this to reduce remaining_order:
import numpy as np

remaining_order = 13
package_numbers = [9, 5, 3]
required_packages = []

sub_max = np.sum([(np.prod(package_numbers) // i - 1) * i for i in package_numbers])
while remaining_order > sub_max:
    remaining_order -= np.prod(package_numbers)
    required_packages.append([max(package_numbers)] * (np.prod(package_numbers) // max(package_numbers)))
This works because if any package i appeared in required_packages more often than (np.prod(package_numbers)/i - 1) times, its sum would equal np.prod(package_numbers), so that part of the order can be served by whole multiples of a single package instead. In case max(package_numbers) isn't the package with the smallest price per unit, take the one with the smallest price per unit instead.
Example:
remaining_order = 100
package_numbers = [5,3]
Any part of remaining_order bigger than 5*2 plus 3*4 = 22 can be dealt with by adding 5 three times to the solution and taking remaining_order - 5*3.
So the remaining order that actually needs to be calculated is 10, which can then be solved as 2 times 5. The rest is filled with 6 times 15, which is 18 times 5.
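A quick check of those numbers (np.prod([5, 3]) is 15):
import numpy as np

package_numbers = [5, 3]
sub_max = np.sum([(np.prod(package_numbers) // i - 1) * i for i in package_numbers])
print(sub_max)  # (3 - 1) * 5 + (5 - 1) * 3 = 22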
In case the number of possible package_numbers is bigger than just a handful, I recommend building a lookup table (with one of the other answers' code) for all numbers below sub_max, which will make this immensely fast for any input.
Since no objective function is specified, I assume your goal is to maximize the package value within the pack's capacity.
Explanation: the time complexity is fixed. The optimal solution may not be to pack the highest-valued item as many times as possible; you have to search all possible combinations. However, you can reuse the possible optimal solutions you have already searched to save space. For example, [5,5,3] is derived from adding 3 to a previous [5,5] try, so the intermediate result can be "cached". You may either use an array or a set to store possible solutions. The code below has the same performance as the Rosetta code, but I think it's clearer.
To further optimize, use a priority queue for opts.
costs = [3, 5, 9]
value = [5, 9, 16]
volume = 130

# solutions
opts = set()
opts.add(tuple([0]))

# calc total value
cost_val = dict(zip(costs, value))

def total_value(opt):
    return sum([cost_val.get(cost, 0) for cost in opt])

def possible_solutions():
    solutions = set()
    for opt in opts:
        for cost in costs:
            if cost + sum(opt) > volume:
                continue
            cnt = (volume - sum(opt)) // cost
            for _ in range(1, cnt + 1):
                sol = tuple(list(opt) + [cost] * _)
                solutions.add(sol)
    return solutions

def optimize_max_return(opts):
    if not opts:
        return tuple([])
    cur = list(opts)[0]
    for sol in opts:
        if total_value(sol) > total_value(cur):
            cur = sol
    return cur

while sum(optimize_max_return(opts)) <= volume - min(costs):
    opts = opts.union(possible_solutions())

print(optimize_max_return(opts))
If your requirement is "just fill the pack" it'll be even simpler using the volume for each item instead.

python itertools permutations with tied values

I want to efficiently find the permutations of a vector which has tied values.
E.g., if perm_vector = [0,0,1,2] I would want to obtain as output all distinct orderings [0,0,1,2], [0,0,2,1], [0,1,2,0] and so on, but I don't want to obtain [0,0,1,2] twice, which is what the standard itertools.permutations(perm_vector) would give.
I tried the following, but it works really SLOWLY when perm_vector grows in length:
import itertools
import pandas as pd

vectors_list = []
for it in itertools.permutations(perm_vector):
    vectors_list.append(list(it))
df_vectors_list = pd.DataFrame(vectors_list)
df_gb = df_vectors_list.groupby(list(df_vectors_list.columns))
vectors_list = pd.DataFrame(df_gb.groups.keys()).T
The question is of a more general "speed-up" nature, actually. Most of the time is spent on creating the permutations of long vectors - even without the duplicates, creating the permutations of a vector of 12 unique values takes an "infinity". Is there a way to call itertools iteratively, without materializing the entire permutations data, but working on batches of it?
Try this if perm_vector is small:
import itertools
{x for x in itertools.permutations(perm_vector)}
This should give you the unique values, because it becomes a set, which by default removes duplicates.
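If a third-party dependency is acceptable, the more-itertools package provides distinct_permutations, which yields each distinct ordering once without materializing duplicates (noted here as an alternative, not part of the original answer):
from more_itertools import distinct_permutations

print(list(distinct_permutations([0, 0, 1])))
# [(0, 0, 1), (0, 1, 0), (1, 0, 0)]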
If perm_vector is large, you might want to try backtracking:
def permu(L, left, right, cache):
    for i in range(left, right):
        L[left], L[i] = L[i], L[left]
        L_tuple = tuple(L)
        if L_tuple not in cache:
            permu(L, left + 1, right, cache)
        L[left], L[i] = L[i], L[left]
        cache[L_tuple] = 0

cache = {}
permu(perm_vector, 0, len(perm_vector), cache)
cache.keys()
How about this:
from collections import Counter

def starter(l):
    cnt = Counter(l)
    res = [None] * len(l)
    return worker(cnt, res, len(l) - 1)

def worker(cnt, res, n):
    if n < 0:
        yield tuple(res)
    else:
        for k in cnt.keys():
            if cnt[k] != 0:
                cnt[k] = cnt[k] - 1
                res[n] = k
                for r in worker(cnt, res, n - 1):
                    yield r
                cnt[k] = cnt[k] + 1
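A quick usage check of that generator (output order depends on Counter iteration order):
perms = list(starter([0, 0, 1, 2]))
print(len(perms))        # 12, i.e. 4! / 2! distinct permutations
print(sorted(perms)[0])  # (0, 0, 1, 2)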

Find the nth lucky number generated by a sieve in Python

I'm trying to make a program in Python which will generate the nth lucky number according to the lucky number sieve. I'm fairly new to Python so I don't know how to do all that much yet. So far I've figured out how to make a function which determines all lucky numbers below a specified number:
def lucky(number):
    l = list(range(1, number + 1, 2))
    i = 1
    while i < len(l):
        del l[l[i] - 1::l[i]]
        i += 1
    return l
Is there a way to modify this so that I can instead find the nth lucky number? I thought about increasing the specified number gradually until a list of the appropriate length to find the required lucky number was created, but that seems like a really inefficient way of doing it.
Edit: I came up with this, but is there a better way?
def lucky(number):
    f = 2
    n = number * f
    while True:
        l = list(range(1, n + 1, 2))
        i = 1
        while i < len(l):
            del l[l[i] - 1::l[i]]
            i += 1
        if len(l) >= number:
            return l[number - 1]
        f += 1
        n = number * f
I came up with this, but is there a better way?
Truth is, there will always be a better way, the remaining question being: is it good enough for your need?
One possible improvement would be to turn all this into a generator function. That way, you would only compute new values as they are consumed. I came up with this version, which I only validated up to about 60 terms:
import itertools

def _idx_after_removal(removed_indices, value):
    for removed in removed_indices:
        value -= value // removed
    return value

def _should_be_excluded(removed_indices, value):
    for j in range(len(removed_indices) - 1):
        value_idx = _idx_after_removal(removed_indices[:j + 1], value)
        if value_idx % removed_indices[j + 1] == 0:
            return True
    return False

def lucky():
    yield 1
    removed_indices = [2]
    for i in itertools.count(3, 2):
        if not _should_be_excluded(removed_indices, i):
            yield i
            removed_indices.append(i)
            removed_indices = list(set(removed_indices))
            removed_indices.sort()
If you want to extract, for example, the 100th term from this generator, you can use the itertools nth recipe:
def nth(iterable, n, default=None):
    "Returns the nth item or a default value"
    return next(itertools.islice(iterable, n, None), default)

print(nth(lucky(), 100))
I hope this works, and there's without any doubt more room for code improvement (but as stated previously, there's always room for improvement!).
With numpy arrays, you can make use of boolean indexing, which may help. For example:
>>> import numpy as np
>>> a = np.arange(10)
>>> print(a)
[0 1 2 3 4 5 6 7 8 9]
>>> print(a[a > 3])
[4 5 6 7 8 9]
>>> mask = np.array([True, False, True, False, True, False, True, False, True, False])
>>> print(a[mask])
[0 2 4 6 8]
Here is a lucky number function using numpy arrays:
import numpy as np

class Didnt_Findit(Exception):
    pass

def lucky(n):
    '''Return the nth lucky number.

    n --> int
    returns int
    '''
    # initial seed
    lucky_numbers = [1]
    # how many numbers do you need to get to n?
    candidates = np.arange(1, n * 100, 2)
    # use numpy array boolean indexing
    next_lucky = candidates[candidates > lucky_numbers[-1]][0]
    # accumulate lucky numbers till you have n of them
    while next_lucky < candidates[-1]:
        lucky_numbers.append(next_lucky)
        # print(lucky_numbers)
        if len(lucky_numbers) == n:
            return lucky_numbers[-1]
        mask_start = next_lucky - 1
        mask_step = next_lucky
        mask = np.array([True] * len(candidates))
        mask[mask_start::mask_step] = False
        # print(mask)
        candidates = candidates[mask]
        next_lucky = candidates[candidates > lucky_numbers[-1]][0]
    raise Didnt_Findit('n = ', n)
>>> print(lucky(10))
33
>>> print(lucky(50))
261
>>> print(lucky(500))
3975
Checked mine and @icecrime's output for 10, 50 and 500 - they matched.
Yours is much faster than mine and scales better with n.
n = int(input('enter n '))
a = list(range(1, n))
x = a[1]
for i in range(1, n):
    del a[x - 1::x]
    x = a[i]
    l = len(a)
    if i == l - 1:
        break
print("lucky numbers till %d" % n)
print(a)
Let's do this with an example: let's print the lucky numbers up to 100.
Put n = 100.
Firstly a = 1, 2, 3, 4, 5, ..., 99.
x = a[1] = 2.
del a[1::2] leaves
a = 1, 3, 5, 7, ..., 99.
Now l = 50,
and now x = 3.
Then del a[2::3] leaves a = 1, 3, 7, 9, 13, 15, ...,
and the loop continues till i == l-1.
