Finite permutations of a list - Python

I have a list and would like to generate a finite number of permutations with no repeated elements.
itertools.permutations(x)
gives all possible orderings, but I only need a specific number of permutations. (My initial list contains ~200 elements, so 200! permutations would take an unreasonable amount of time, and I don't need all of them.)
What I have done so far:
import random

def createList(My_List):
    New_List = random.sample(My_List, len(My_List))
    return New_List

def createManyList(My_List, Nb_of_Lists):
    list_of_list = []
    for i in range(0, Nb_of_Lists):
        list_of_list.append(createList(My_List))
    return list_of_list
It works, but my list_of_list may contain repeated permutations, or at least I have no guarantee that it won't.
Is there any way to ensure uniqueness? Thanks.

Just use islice, which allows you to take a number of elements from an iterable:
from itertools import permutations, islice
n_elements = 1000
list(islice(permutations(x), n_elements))
This will return a list of (the first) 1000 permutations.
The reason this works is that permutations returns an iterator, an object that generates values as they are needed rather than all at once. Therefore, the process goes something like this:
The calling function (in this case, list) asks for the next value from islice
islice checks if 1000 values have been returned; if not, it asks for the next value from permutations
permutations returns the next value, in order
Because of this, the full list of permutations never needs to be generated; we take only as many as we want.
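To see the laziness directly, here is a small illustration (not part of the original answer):

from itertools import permutations

perms = permutations(range(200))  # returns immediately; nothing is computed yet
print(next(perms))                # only now is the first permutation produced
print(next(perms))                # ...then the second, and so on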

You can do:
list_of_list = []
i = 0
while i < Nb_of_Lists:
    candidate = createList(My_List)
    if candidate not in list_of_list:
        list_of_list.append(candidate)
        i += 1
This checks whether a permutation was already used and only keeps (and counts) it if it is new.
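A side note, not from the original answer: candidate not in list_of_list scans the whole list on every check, which gets slow as the list grows. Here is a minimal sketch of the same idea using a set of tuples for average O(1) membership tests (the function name is illustrative):

import random

def create_unique_lists(my_list, nb_of_lists):
    seen = set()
    result = []
    while len(result) < nb_of_lists:
        candidate = random.sample(my_list, len(my_list))
        key = tuple(candidate)  # lists aren't hashable; tuples are
        if key not in seen:
            seen.add(key)
            result.append(candidate)
    return result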

You don't need to roll your own permutation logic. You just need to halt the generator once you get enough:
# Python 2.7
import random
import itertools

def createList(My_List):
    New_List = random.sample(My_List, len(My_List))
    return New_List

x = createList(xrange(20))

def getFirst200():
    for i, result in enumerate(itertools.permutations(x)):
        if i == 200:
            raise StopIteration
        yield result

print list(getFirst200())  # print the first 200 results
This is faster and more memory-efficient than the 'generate the full set, then take the first 200' approach.
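One caveat worth noting: the snippet above targets Python 2.7, and since Python 3.7 (PEP 479) raising StopIteration inside a generator is converted into a RuntimeError. A Python 3 version of the same idea simply returns instead:

import itertools
import random

x = random.sample(range(20), 20)

def get_first_200():
    for i, result in enumerate(itertools.permutations(x)):
        if i == 200:
            return  # ends the generator cleanly; PEP 479 forbids raising StopIteration here
        yield result

print(list(get_first_200()))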

Related

Stopping a Python iterator/generator after a given number of times

I have a generator with many elements, say
long_generator = (i**2 for i in range(10**1000))
I would like to extract the first n elements (without, obviously, consuming the generator to the end): what Pythonic way is there to do this?
The function iter has a second parameter, a sentinel: iteration stops when the returned value equals it:
numbers = iter(lambda: next(long_generator), 81)  # but this assumes we know the results
So is there an equivalent based on the number of iterations instead?
I came up with the following function:
def first_elements(iterable, n: int):
    """Iterates over n elements of an iterable"""
    for _ in range(n):
        yield next(iterable)
And you could get a list as follows: first_10 = list(first_elements(long_generator, 10))
Is there some built-in or better/more elegant way?
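As the islice answer earlier on this page shows, itertools.islice is exactly that built-in:

from itertools import islice

long_generator = (i**2 for i in range(10**1000))
first_10 = list(islice(long_generator, 10))  # consumes only the first 10 elements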

Code taking forever to execute under certain conditions. Needs to run faster

This code is part of a challenge that requires the code to give back permutations of a string with no duplicates. The code executes, but for some of the challenge's test cases it doesn't pass the time limit, and I don't know a way to make it execute faster.
from itertools import permutations as perm

def permutations(string):
    permList = list(perm(string))
    joinedList = [''.join(tups) for tups in permList]
    ans = []
    [ans.append(x) for x in joinedList if x not in ans]
    return ans
Again, the code runs for certain examples, but for examples with large strings and a lot of matches it takes too long and fails the challenge.
If you want to prevent duplicates, use a set, not lists. Your code takes forever because you're constantly scanning the list while you are inserting new data. With a set you get a constant-time lookup instead.
And you can save on storage costs by using a generator expression rather than a list comprehension:
def permutations(string):
    permList = (''.join(p) for p in perm(string))
    result = set()
    for p in permList:
        result.add(p)
    return list(result)
Ideally, you should keep the output in a generator for as long as possible; evaluating it takes time and space.
Here we maintain a set of seen elements in order to avoid yielding them again, thus keeping each result unique.
import itertools

def unique_permutations(seq):
    seen = set()
    for p in itertools.permutations(seq):
        if p not in seen:
            seen.add(p)
            yield p

for p in unique_permutations('aaab'):
    print(p)
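A side note, assuming the third-party more-itertools package is an acceptable dependency (an assumption, not part of the original answer): its distinct_permutations generates only the distinct permutations in the first place, instead of generating all of them and filtering:

# pip install more-itertools
from more_itertools import distinct_permutations

for p in distinct_permutations('aaab'):
    print(''.join(p))  # 4 distinct results, instead of filtering down from 24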

How to make this code execute faster?

def lookfor(alist, number):
    if number in alist:
        return alist.index(number)
    else:
        return "no"
So basically I input hundreds of thousands of numbers, and I have to send each one of them through lookfor to get as output either the index of number in alist, or "no" if the number isn't there.
It computes quickly when I don't input as many numbers, but takes several minutes when I input xx,xxx-xxx,xxx numbers.
Any suggestions?
Your code iterates through the list until it finds the number you seek (or until it reaches the end), and if it does find the number, it has to iterate the exact same amount to return the index. Why not take advantage of the behavior of the .index method? Just keep in mind that it raises a ValueError if the number is not present in the list.
def lookfor(alist, number):
    try:
        return alist.index(number)
    except ValueError:
        return "no"
Afterword: use the timeit module to find the most efficient solution, but be sure to use a variety of inputs so that you can find the overall fastest solution.
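A minimal timeit sketch of such a comparison (the list size and lookup value here are illustrative assumptions):

import timeit

setup = """
alist = list(range(100000))
indexmap = {number: index for index, number in enumerate(alist)}
"""

# Worst case for the list: the sought value sits at the very end.
print(timeit.timeit("alist.index(99999)", setup=setup, number=100))
print(timeit.timeit("indexmap.get(99999, 'no')", setup=setup, number=100))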
def index_on(lst):
    index = {val: i for i, val in enumerate(lst)}
    def lookup(val):
        return index.get(val, 'no')
    return lookup

search = index_on(alist)
search('123-4567')  # => 293 (index in alist)
search('123-4500')  # => 'no' (not found)
Your code currently needs to search through the entire list for each call to lookfor. This can be very slow if alist is big enough.
Instead, you should create a dictionary that maps each element to its index in alist. For example, for alist = [7,4,88], you'd have: indexmap = {7:0, 4:1, 88:2}. Then you can search the dictionary with:
def lookfor(indexmap, number):
    return indexmap.get(number, "no")
If alist is constant, you can create indexmap during initialization:
indexmap = {number: index for index,number in enumerate(alist)}
If alist changes over time, you can maintain this dictionary together with alist. For example, if you normally add items with append, you can use:
alist.append(number)
if number not in indexmap:
    indexmap[number] = len(alist) - 1

What is the average case performance of this permutation-generating algorithm?

I'm trying to determine the average case performance of this permutation-generating algorithm. It uses the recursive approach, in which the first element is swapped with each other element, producing a new set of permutations - these sets then go through the same routine, but with the first element fixed.
Here's the code in Python:
# Returns a list of all permutations of the given list.
def permutate(set):
    # A list of all found permutations, to be returned
    permutations = []
    # Takes a set which has all elements below index i fixed and finds all permutations
    def recurse(set, i):
        # If all elements are fixed, store the current permutation
        if i + 1 == len(set):
            permutations.append(set)
        else:
            # Swap the "first" element with each other element to generate new permutations
            for element in xrange(i, len(set)):
                set[element], set[i] = set[i], set[element]
                recurse(set, i + 1)
                set[element], set[i] = set[i], set[element]
    # Use the recursive algorithm to find all permutations, starting with no fixed elements
    recurse(set, 0)
    return permutations

print permutate([1, 2, 3])
I don't have much experience with analyzing the performance of recursive functions, so I don't know how to solve this. If I had to guess, I would say that the runtime is Θ(n!), because a set with n elements has n! permutations (so the algorithm must do at least that much work, right?).
Any help would be appreciated.
First of all, the complexity is O(n!) for the reason mentioned in the comment to the question.
But there are two other things worth noting:
Do not use set as a variable name, because you shadow a built-in data type
Your algorithm is not correct because of python implementation details
At the bottom of the recursion, you append the resulting permutation to the permutations variable. But a list in Python is not passed by value, so you actually append a reference to the input list. After recurse finishes its work, the input list is back in the order it started in, so the permutations variable ends up holding n! references to the same list. To fix that, you can use the deepcopy function from the copy module. The resulting code follows (notice that you can also stop the recursion when i == len(s)):
import copy

# Returns a list of all permutations of the given list.
def permutate(s):
    # A list of all found permutations, to be returned
    permutations = []
    # Takes a list which has all elements below index i fixed and finds all permutations
    def recurse(s, i):
        # If all elements are fixed, store the current permutation
        if i == len(s):
            # append a deepcopy of s
            permutations.append(copy.deepcopy(s))
        else:
            # Swap the "first" element with each other element to generate new permutations
            for element in xrange(i, len(s)):
                s[element], s[i] = s[i], s[element]
                recurse(s, i + 1)
                s[element], s[i] = s[i], s[element]
    # Use the recursive algorithm to find all permutations, starting with no fixed elements
    recurse(s, 0)
    return permutations

print permutate([1, 2, 3])
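For anyone unsure why the original version fails, here is a small demonstration of the aliasing described above (illustrative only):

perms = []
s = [1, 2]
perms.append(s)          # stores a reference, not a snapshot
s[0], s[1] = s[1], s[0]  # mutating s later...
print(perms)             # [[2, 1]] -- ...changes the "stored" permutation too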

Nicest, most efficient way to get a result tuple of sequence items fulfilling and not fulfilling a condition

(This is a question about professional best practice/patterns, not a homework request.)
INPUT: any unordered sequence or generator of items, plus a function myfilter(item) that returns True if the filter condition is fulfilled.
OUTPUT: a (filter_true, filter_false) tuple of sequences of the original type, containing the elements partitioned according to the filter, in the original sequence order.
How would you express this without filtering twice, or should I just filter twice? Maybe filter plus a loop/generator/list comprehension with next could be the answer?
Should I drop the requirement of keeping the original type, or change the requirement to return a tuple of tuples/generators? I cannot easily return a generator for generator input, or can I? (The requirements are self-made.)
Here is a test of the best candidate at the moment, which offers two streams instead of a tuple:
import itertools as it
from sympy.ntheory import isprime as myfilter

mylist = xrange(1000001, 1010000, 2)
left, right = it.tee((myfilter(x), x) for x in mylist)
filter_true = (x for p, x in left if p)
filter_false = (x for p, x in right if not p)

print 'Hundred primes and non-primes odd numbers'
print '\n'.join(" Prime %i, not prime %i" %
                (next(filter_true), next(filter_false))
                for i in range(100))
Here is a way to do it which calls myfilter only once for each item and will also work if mylist is a generator:
import itertools as it

left, right = it.tee((myfilter(x), x) for x in mylist)
filter_true = (x for p, x in left if p)
filter_false = (x for p, x in right if not p)
Let's suppose that your problem is not memory but CPU: myfilter is heavy and you don't want to iterate over and filter the original dataset twice. Here are some single-pass ideas.
The simple and versatile version (memory-hungry):
filter_true = []
filter_false = []
for item in items:
    if myfilter(item):
        filter_true.append(item)
    else:
        filter_false.append(item)
The memory-friendly version (doesn't work with generators, unless used with list(items)):
while items:
    item = items.pop()
    if myfilter(item):
        filter_true.append(item)
    else:
        filter_false.append(item)
The generator-friendly version:
while True:
    try:
        item = next(items)
        if myfilter(item):
            filter_true.append(item)
        else:
            filter_false.append(item)
    except StopIteration:
        break
The easy way (but less efficient) is to tee the iterable and filter both of them:
import itertools

left, right = itertools.tee(mylist)
filter_true = (x for x in left if myfilter(x))
filter_false = (x for x in right if not myfilter(x))
This is less efficient than the optimal solution, because myfilter will be called repeatedly for each element. That is, if you have tested an element in left, you shouldn't have to re-test it in right because you already know the answer. If you require this optimisation, it shouldn't be hard to implement: have a look at the implementation of tee for clues. You'll need a deque for each returned iterable which you stock with the elements of the original sequence that should go in it but haven't been asked for yet.
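A sketch of that deque-based approach, under the stated assumptions (the helper name partition and its exact structure are illustrative, not something itertools provides):

from collections import deque

def partition(pred, iterable):
    # Split `iterable` into (true_items, false_items) iterators,
    # calling `pred` exactly once per element.
    it = iter(iterable)
    true_q, false_q = deque(), deque()

    def gen(my_q, other_q, want):
        while True:
            if my_q:
                yield my_q.popleft()  # an item the sibling already classified
            else:
                try:
                    item = next(it)
                except StopIteration:
                    return
                if bool(pred(item)) == want:
                    yield item
                else:
                    other_q.append(item)  # stash it for the other iterator

    return gen(true_q, false_q, True), gen(false_q, true_q, False)

evens, odds = partition(lambda x: x % 2 == 0, range(10))
print(list(evens))  # [0, 2, 4, 6, 8]
print(list(odds))   # [1, 3, 5, 7, 9]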
I think your best bet will be constructing two separate generators:
filter_true = (x for x in mylist if myfilter(x))
filter_false = (x for x in mylist if not myfilter(x))
