Generating a list of repetitions regardless of the order - python

I want to generate combinations that associate indices in a list with "slots". For instance, (0, 0, 1) means that 0 and 1 belong to the same slot while 2 belongs to another. (0, 1, 1, 1) means that 1, 2, 3 belong to the same slot while 0 is by itself. In this example, 0 and 1 are just ways of identifying these slots but do not carry information for my usage.
Consequently, (0, 0, 0) is absolutely identical to (1, 1, 1) for my purposes, and (0, 0, 1) is equivalent to (1, 1, 0).
The classical Cartesian product generates a lot of these repetitions that I'd like to get rid of.
This is what I obtain with itertools.product:
>>> import itertools
>>> LEN, SIZE = (3, 1)
>>> list(itertools.product(range(SIZE+1), repeat=LEN))
[(0, 0, 0),
(0, 0, 1),
(0, 1, 0),
(0, 1, 1),
(1, 0, 0),
(1, 0, 1),
(1, 1, 0),
(1, 1, 1)]
And this is what I'd like to get:
[(0, 0, 0),
(0, 0, 1),
(0, 1, 0),
(0, 1, 1)]
It is easy with small lists but I don't quite see how to do this with bigger sets. Do you have a suggestion?
If it's unclear, please tell me so that I can clarify my question. Thank you!
Edit: based on Sneftel's answer, this function seems to work, but I don't know if it actually yields all the results:
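from itertools import product

def test():
    for p in product(range(2), repeat=3):
        j = -1  # largest slot number seen so far
        good = True
        for k in p:
            if k > j and (k - j) > 1:
                # a slot number was skipped, so p is not in canonical form
                good = False
            elif k > j:
                j = k
        if good:
            yield p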

I would start by making the following observations:
The first element of each combination must be 0.
The second element must be 0 or 1.
The third element must be 0, 1 or 2, but it can only be 2 if the second element was 1.
These observations suggest the following algorithm:
def assignments(n, m, used=0):
    """Generate assignments of `n` items to `m` indistinguishable
    buckets, where `used` buckets have been used so far.

    >>> list(assignments(3, 1))
    [(0, 0, 0)]
    >>> list(assignments(3, 2))
    [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
    >>> list(assignments(3, 3))
    [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 1, 2)]
    """
    if n == 0:
        yield ()
        return
    aa = list(assignments(n - 1, m, used))
    for first in range(used):
        for a in aa:
            yield (first,) + a
    if used < m:
        for a in assignments(n - 1, m, used + 1):
            yield (used,) + a
This handles your use case (12 items, 5 buckets) in a few seconds:
>>> from timeit import timeit
>>> timeit(lambda:list(assignments(12, 5)), number=1)
4.513746023178101
>>> sum(1 for _ in assignments(12, 5))
2079475
This is substantially faster than the function you give at the end of your question (the one that calls product and then drops the invalid assignments) would be if it were modified to handle the (12, 5) use case:
>>> timeit(lambda:list(test(12, 5)), number=1)
540.693009853363
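The question also asks whether every result is actually produced. One independent check is to count: the number of ways to place n labelled items into at most m unlabelled buckets is the sum of the Stirling numbers of the second kind S(n, k) for k = 1..m. The following is only a sketch of that cross-check; the names stirling2 and count_assignments are made up for illustration and are not part of the answer above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to split n labelled items into exactly k non-empty groups."""
    if n == k:
        return 1  # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    # The last item either joins one of the k existing groups or starts a new one.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def count_assignments(n, m):
    """Expected number of canonical assignments of n items to at most m buckets."""
    return sum(stirling2(n, k) for k in range(1, m + 1))


print(count_assignments(3, 2))   # 4
print(count_assignments(12, 5))  # 2079475
```

The second value agrees with the count reported for assignments(12, 5), which suggests that no assignment is missed or duplicated.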

Before checking for duplicates, you should harmonize the notation (assuming you don't want to set up some fancy AI): iterate through the lists and assign set-affiliation numbers for differing elements starting at 0, counting upwards. That is, you create a temporary dictionary per line that you are processing.
An exemplary output would be
(0,0,0) -> (0,0,0)
(0,1,0) -> (0,1,0)
but
(1,0,1) -> (0,1,0)
Removing the duplicates can then easily be performed as the problem is reduced to the problem of the solved question at Python : How to remove duplicate lists in a list of list?
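A minimal sketch of this relabelling idea, assuming the input tuples come from itertools.product; the helper names canonical and unique_assignments are hypothetical. Each tuple is rewritten so that slots are numbered in order of first appearance, and a set filters out the tuples whose canonical form was already seen.

```python
from itertools import product


def canonical(labels):
    """Relabel slots in order of first appearance, e.g. (1, 0, 1) -> (0, 1, 0)."""
    mapping = {}  # temporary dictionary: original label -> canonical label
    return tuple(mapping.setdefault(label, len(mapping)) for label in labels)


def unique_assignments(n_items, n_slots):
    """Yield each canonical assignment once, in the order product() reaches it."""
    seen = set()
    for labels in product(range(n_slots), repeat=n_items):
        key = canonical(labels)
        if key not in seen:
            seen.add(key)
            yield key


print(canonical((1, 0, 1)))
# (0, 1, 0)
print(list(unique_assignments(3, 2)))
# [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
```

This still walks the whole Cartesian product, so it is simple rather than fast.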

If you only consider the elements of the cartesian product where the first occurrences of all indices are sorted and consecutive from zero, that should be sufficient. itertools.combinations_with_replacement() will eliminate those that are not sorted, so you'll only need to check that indices aren't being skipped.
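The "first occurrences sorted and consecutive from zero" condition can be sketched as a small predicate. This example filters the plain Cartesian product rather than combinations_with_replacement(), and the function name first_occurrences_in_order is an assumption made for illustration.

```python
from itertools import product


def first_occurrences_in_order(labels):
    """True if new slot numbers appear as 0, 1, 2, ... without skipping any."""
    next_new = 0  # the only new slot number allowed to appear next
    for label in labels:
        if label > next_new:
            return False  # a slot number was skipped
        if label == next_new:
            next_new += 1
    return True


kept = [p for p in product(range(2), repeat=3) if first_occurrences_in_order(p)]
print(kept)
# [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
```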

In your specific case you could simply take the first or the second half of the list of those items produced by a cartesian product.
import itertools
alphabet = '01'
words3Lettered = [''.join(letter) for letter in itertools.product(alphabet,repeat=3)]
For n-lettered words, use repeat=n.
words3Lettered looks like this:
['000', '001', '010', '011', '100', '101', '110', '111']
Next,
usefulWords = words3Lettered[:len(words3Lettered)//2]
which looks like this:
['000', '001', '010', '011']
You might be interested in the other half, i.e. words3Lettered[len(words3Lettered)//2:], though the other half is supposed to "fold" onto the first half.
Most probably you want to use the combinations of letters in numeric form, so:
indexes = [tuple(int(j) for j in word) for word in usefulWords]
which gives us:
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]

Related

Product of prime factors of a number, less than that number

First of all, I apologise for the title, I did not know how to put my problem in words. Well, here it is:
For an integer a greater than 1, let F be a sorted list of prime factors of a. I need to find all tuples c (filled with whole numbers), such that length of each tuple is equal to the size of F and (F[0] ** c[0]) * (F[1] ** c[1]) * (...) < a. I should add that I write in Python.
Example:
a = 60
F = [2,3,5]
# expected result:
C = {(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 1), (0, 2, 0),
(0, 2, 1), (0, 3, 0), (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1),
(1, 2, 0), (1, 3, 0), (2, 0, 0), (2, 0, 1), (2, 1, 0), (2, 2, 0), (3, 0, 0),
(3, 0, 1), (3, 1, 0), (4, 0, 0), (4, 1, 0), (5, 0, 0)}
I generated this result using itertools.product(), specifically:
import itertools
import math
m = math.floor(round(math.log(a, min(F)), 12))
for i in itertools.product(range(m + 1), repeat=len(F)):
    if math.prod([F[j] ** i[j] for j in range(len(F))]) < a:
        print(i)
I think it works but it's inefficient. For example number 5 appears only in one tuple, but was checked many more times! Is there any way to make it faster? I would use multiple while loops (with break statements) but since I don't know what is the length of F, I don't think that is possible.
You base all your range limits on just min(F). Let's customize each to the log(a, factor) to reduce the cases:
from math import ceil, log, prod
from itertools import product
a = 60
F = [2, 3, 5]
ranges = [range(0, ceil(log(a, factor))) for factor in F]
C = []
for powers in product(*ranges):
    if prod(F[i] ** power for i, power in enumerate(powers)) < a:
        C.append(powers)
print(C)
By my measure, your code generates 216 test cases to come up with 25 results, but the above code only generates 1/3 of those test cases.
You could iterate over all the "valid" tuples with a generator, like so:
import math

def exponent_tuples(prime_factors, limit):
    def next_tuple(t):
        # n is the product represented by the current tuple
        n = math.prod(f ** tt for f, tt in zip(prime_factors, t))
        for idx, (f, tt) in enumerate(zip(prime_factors, t)):
            n *= f
            if n < limit:
                # increment this position and reset everything before it
                return (0,) * idx + (tt + 1,) + t[idx + 1 :]
            # roll this position over to zero and carry to the next one
            n //= f**(tt+1)
        return None
    t = (0,) * len(prime_factors)
    while t is not None:
        yield t
        t = next_tuple(t)

for t in exponent_tuples([2, 3, 5], 60):
    print(t)
The idea here is to basically increment the tuple entries like digits of a number and have the respective digit roll over to zero and carry the 1 whenever you reach the defined limit.
I'm pretty sure this does exactly what you want, except for maybe the order in which it yields the tuples (can be adjusted by modifying the next_tuple function)
EDIT: Simplified the code a bit
The almost cooked proposition would go like this (shell execution)
>>> max_exponents(42,[2,3,7])
[5, 3, 1]
>>> #pick 2
>>> max_exponents(42//2**2,[3,7])
[2, 1]
>>> #pick 1
>>> max_exponents(42//(2**2*3**1),[7])
[0]
I'm almost done. This will adapt to any number of factors !
Somehow your proposition reduces to this (more readable form ?)
import math as m
import pprint
a = 60
prime_factors = [2,3,5]
exponents =list(map(lambda x:m.floor(m.log(a,x)),prime_factors))
rez = []
for i in range(exponents[0]+1):
    for j in range(exponents[1]+1):
        for k in range(exponents[2]+1):
            if 2**i*3**j*5**k < a:
                rez.append((i,j,k))
pprint.pprint(rez)
and you would like to know whether there's a way to make it faster (with fewer tests). So this is less about the implementation and more about the design (algorithm) side?
For example, once the first exponent c[0] has been chosen, the next ones should be selected among those fitting in a//(2**c[0]), as the other answerer proposed, I guess.
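The idea of shrinking the remaining budget after each exponent is chosen can be sketched recursively, which also removes the need to know the length of F in advance. This is only an illustration under the assumption that the condition is a strict product < limit, as in the question; the name bounded_exponents is hypothetical.

```python
def bounded_exponents(factors, limit):
    """Yield tuples c such that the product of factors[i] ** c[i] is < limit."""
    if limit <= 1:
        return  # even the empty product (1) is not below the limit
    if not factors:
        yield ()
        return
    first, rest = factors[0], factors[1:]
    power, exponent = 1, 0
    while power < limit:
        # The remaining factors must satisfy product * power < limit, which for
        # integers is the same as product < ceil(limit / power).
        remaining_limit = -(-limit // power)
        for tail in bounded_exponents(rest, remaining_limit):
            yield (exponent,) + tail
        power *= first
        exponent += 1


result = list(bounded_exponents([2, 3, 5], 60))
print(len(result))  # 25
```

Every tuple the generator visits is a valid result, so no candidate is tested and then thrown away.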

Generating all possible combinations of n-sized vector that follow certain conditions on each element

I have a list d of length r such that d = (d_1, d_2,..., d_r).
I would like to generate all possible vectors of length r such that for any i (from 1 to r), v_i is between 0 and d_i inclusive.
For example,
if r =2 and d= (1,2), v_1 can be 0 or 1 and v_2 can be 0,1 or 2.
Hence there are 6 possible vectors:
[0,0] , [0,1], [0,2], [1,0] , [1,1], [1,2]
I have looked into Itertools and combinations and I have a feeling I will have to use recursion however I have not managed to solve it yet and was hoping for some help or advice into the right direction.
Edit:
I have written the following code for my problem and it works however I did it in a very inefficient way by disregarding the condition and generating all possible vectors then pruning the invalid ones. I took the largest d_i and generated all vectors of size r from (0,0,...0) all the way to (max_d_i,max_d_i,....max_d_i) and then eliminated those that were invalid.
Code:
import itertools
import copy

def main(d):
    arr = []
    correct_list = []
    curr = []
    r = len(d)
    greatest = max(d)
    for i in range(0, greatest + 1):
        arr = arr + [i]
    # all_poss_arr is a list that holds all possible vectors of length r from (0,0,...,0) to (max,max,...,max)
    # for example if greatest was 3 and r = 4, all_poss_arr would have (0,0,0,0), then (0,0,0,1) and so on,
    # all the way to (3,3,3,3)
    all_poss_arr = list(itertools.product(arr, repeat=r))
    # Now I am going to remove all the vectors that don't follow the rule that v_i is between 0 and d_i
    for i in range(0, len(all_poss_arr)):
        curr = all_poss_arr[i]
        cnt = 0
        for j in range(0, len(curr)):
            if curr[j] <= d[j]:
                cnt = cnt + 1
        if cnt == r:
            curr = list(curr)
            currcopy = copy.copy(curr)
            correct_list = correct_list + [currcopy]
        cnt = 0
    return correct_list
If anyone knows a better way, let me know, it is much appreciated.
You basically want a Cartesian product. I'll demonstrate a basic, functional and iterative approach.
Given
import operator as op
import functools as ft
import itertools as it
def compose(f, g):
    """Return a function composed of two functions."""
    def h(*args, **kwargs):
        return f(g(*args, **kwargs))
    return h
d = (1, 2)
Code
Option 1: Basic - Manual Unpacking
list(it.product(range(d[0] + 1), range(d[1] + 1)))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
Option 2: Functional - Automated Mapping
def vector_combs(v):
    """Return a Cartesian product of unpacked elements from `v`."""
    plus_one = ft.partial(op.add, 1)
    range_plus_one = compose(range, plus_one)
    res = list(it.product(*map(range_plus_one, v)))
    return res
vector_combs(d)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
Option 3: Iterative - Range Replication (Recommended)
list(it.product(*[range(x + 1) for x in d]))
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
Details
Option 1
The basic idea is illustrated in Option 1:
Make a Cartesian product using a series of modified ranges.
Note that each range bound is taken from d by index and incremented by hand. The remaining options automate this.
Option 2
We apply a functional approach to handle the various arguments and functions:
Partial the 1 argument to the add() function. This returns a function that will increment any number.
Let's pass this function into range through composition. This allows us to have a modified range function that auto increments the integer passed in.
Finally we map the latter function to each element in tuple d. Now d works with any length r.
Example (d = (1, 2, 1), r = 3):
vector_combs((1, 2, 1))
# [(0, 0, 0),
# (0, 0, 1),
# (0, 1, 0),
# (0, 1, 1),
# (0, 2, 0),
# (0, 2, 1),
# (1, 0, 0),
# (1, 0, 1),
# (1, 1, 0),
# (1, 1, 1),
# (1, 2, 0),
# (1, 2, 1)]
Option 3
Perhaps most elegantly, just use a list comprehension to create r ranges. ;)

Python removing tuples from list that satisfy given conditions

I have a list of tuples and I want to remove tuples so that there is only one tuple in the list that has a given length and sum.
That's a bad explanation, so for example:
[(0,1,2), (0,2,1), (0,0,1)]
remove (0,1,2) or (0,2,1)
I want to be able to iterate through the list and remove any tuples that satisfy the following conditions:
len(tuple1) == len(tuple2) and sum(tuple1) == sum(tuple2)
but keep either tuple1 or tuple2 in the list.
I tried:
for t1 in list:
    for t2 in list:
        if len(t1) == len(t2) and sum(t1) == sum(t2):
            list.remove(t1)
but I'm pretty sure this removes all tuples, and the console crashed.
In essence this is a "uniqueness filter", but one where we specify a function f, and we filter an element x out only if f(x) has already occurred.
We can implement such a uniqueness filter, given that f(x) produces hashable values, with:
def uniq(iterable, key=lambda x: x):
    seen = set()
    for item in iterable:
        u = key(item)
        if u not in seen:
            yield item
            seen.add(u)
Then we can use this filter as:
result = list(uniq(data, lambda x: (len(x), sum(x))))
for example:
>>> list(uniq(data, lambda x: (len(x), sum(x))))
[(0, 1, 2), (0, 0, 1)]
Here we will always retain the first occurrence of the "duplicates".
Let me offer a slightly different solution. Note that this is not something I'd use for a one-off script, but for a real project. Because your [(0, 0, 1)] actually represents something logical/physical.
set(..) removes duplicates. How about we use that? The only thing to keep in mind is that the hash value and equality of the elements need to be modified.
class Converted(object):
    def __init__(self, tup):
        self.tup = tup
        self.transformed = len(tup), sum(tup)
    def __eq__(self, other):
        return self.transformed == other.transformed
    def __hash__(self):
        return hash(self.transformed)
inp = [(0,1,2), (0,2,1), (0,0,1)]
out = [x.tup for x in set(map(Converted, inp))]
print(out)
# [(0, 0, 1), (0, 1, 2)]  (set iteration order is not guaranteed)
You can also use groupby to group elements by sum and len and fetch 1 element from each group to create a new list:
from itertools import groupby
def _key(t):
    return (len(t), sum(t))
data = [(0, 1, 2), (0, 2, 1), (0, 0, 1), (1, 0, 0), (0, 1, 0), (3, 0, 0, 0)]
result = []
for k, g in groupby(sorted(data, key=_key), key=_key):
    result.append(next(g))
print(result)
# [(0, 0, 1), (0, 1, 2), (3, 0, 0, 0)]
The complexity of your problem comes mainly from the fact that you have two independent filters you want to implement. A good way to go about filtering on data with this sort of requirement is to use groupby. However, before you can do that you need to sort first. Since you normally sort over one key, you'll need to sort twice before you can group:
from itertools import groupby
def lensumFilter(data):
    return [next(g) for _, g in groupby(sorted(sorted(data, key = len), key = sum),
                                        key = lambda x: (len(x), sum(x)))]
>>> print(lensumFilter( [(0, 1, 2), (0, 2, 1), (0, 0, 1)] ))
[(0, 0, 1), (0, 1, 2)]
>>> print(lensumFilter( [(0, 1, 2), (0, 2, 1), (0, 0, 0, 3), (0, 0, 1)] ))
[(0, 0, 1), (0, 1, 2), (0, 0, 0, 3)]
>>> print(lensumFilter( [(0, 1, 2), (0, 2, 2), (0, 4), (0, 0, 0, 5), (0, 0, 3)] ))
[(0, 1, 2), (0, 4), (0, 2, 2), (0, 0, 0, 5)]
Note that if you change how the sorts work, you change how the output will look. For instance, I sorted on length and then sum so my results are in order with respect to sum (smallest sum first) and then in order with respect to length (fewest number of elements first) within sum-groupings. That's why (0, 1, 2) comes before (0, 4) but (0, 4) comes before (0, 2, 2).
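As a side note, sorting first by length and then by sum with Python's stable sort gives the same order as a single sort on the tuple key (sum, len). A minimal sketch of that variant follows; the names len_sum_key and lensum_filter_single_sort are made up for this example.

```python
from itertools import groupby


def len_sum_key(t):
    """Sort and group key: sum first, then length."""
    return (sum(t), len(t))


def lensum_filter_single_sort(data):
    """Keep the first tuple of every (sum, length) group after one sort."""
    ordered = sorted(data, key=len_sum_key)
    return [next(group) for _, group in groupby(ordered, key=len_sum_key)]


data = [(0, 1, 2), (0, 2, 2), (0, 4), (0, 0, 0, 5), (0, 0, 3)]
print(lensum_filter_single_sort(data))
# [(0, 1, 2), (0, 4), (0, 2, 2), (0, 0, 0, 5)]
```

Swapping the two elements of the key tuple changes the output order in the same way that swapping the two sorts would.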
If you want something concise and more Pythonic, you could use the built-in filter function.
It keeps all the elements that match your requirement (here: not having both the same sum and the same length as a given tuple):
tup_remove = (0,2,1)
list(filter(lambda current_tup: not (sum(tup_remove) == sum(current_tup) and len(tup_remove) == len(current_tup)), tup_list))
For better readability and extensibility, I would encourage you to use a function:
def not_same_sum_len_tuple(tup_to_check, current_tuple):
    """Return True unless both the sum AND the length are the same."""
    same_sum = sum(tup_to_check) == sum(current_tuple) # Check the sum
    same_len = len(tup_to_check) == len(current_tuple) # Check the length
    return not (same_sum and same_len)
tup_remove = (0,2,1)
list(filter(lambda current_tup: not_same_sum_len_tuple(tup_remove, current_tup), tup_list))
It's probably easier to just make a new list that meets your conditions.
old_list = [(0,1,2), (0,2,1), (0,0,1)]
new_list = []
for old_t in old_list:
    for new_t in new_list:
        if len(old_t) == len(new_t) and sum(old_t) == sum(new_t):
            break
    else:
        # no tuple with the same length and sum has been kept yet
        new_list.append(old_t)
# new_list == [(0, 1, 2), (0, 0, 1)]
This is a simpler solution but may not be performant. Just make a dict with (len(t), sum(t)) as keys and the tuples as values. The last tuple stays.
lst = [(0,1,2), (0,2,1), (0,0,1)]
d = {(len(t), sum(t)): t for t in lst}
list(d.values())
In one line:
list({(len(t), sum(t)): t for t in lst}.values())
To make it performant just memoize len and sum.
from functools import lru_cache
mlen, msum = (lru_cache(maxsize=None)(f) for f in (len, sum))
list({(mlen(t), msum(t)): t for t in lst}.values())
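If the first tuple of each group should stay instead of the last, dict.setdefault can be used in place of plain assignment. This is a small sketch that relies on dictionaries preserving insertion order (Python 3.7 and later); the function name unique_by_len_and_sum is hypothetical.

```python
def unique_by_len_and_sum(tuples):
    """Keep the first tuple seen for every (length, sum) pair."""
    first_seen = {}
    for t in tuples:
        # setdefault only stores t if the key is not present yet
        first_seen.setdefault((len(t), sum(t)), t)
    return list(first_seen.values())


print(unique_by_len_and_sum([(0, 1, 2), (0, 2, 1), (0, 0, 1)]))
# [(0, 1, 2), (0, 0, 1)]
```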

All possible combination of 3 numbers in a set in Python

I want to print all possible combination of 3 numbers from the set (0 ... n-1), while each one of those combinations is unique. I get the variable n via this code:
n = int(raw_input("Please enter n: "))
But I'm stuck at coming up with the algorithm. Any help please?
from itertools import combinations
list(combinations(range(n),3))
This works as long as you are using Python 2.6 or later.
If you want all possible combinations where values may repeat and position matters, you need to use product like this:
from itertools import product
t = range(n)
print set(product(set(t),repeat = 3))
For example, if n = 2, the output will be:
set([(0, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1), (0, 0, 0), (0, 1, 0), (1, 1, 1)])
Hope this helps.
itertools is your friend here, specifically permutations.
Demo:
from itertools import permutations
for item in permutations(range(n), 3):
    print item
This is assuming you have Python 2.6 or newer.
combos = []
for x in xrange(n):
    for y in xrange(n):
        for z in xrange(n):
            combos.append([x,y,z])
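The answers above produce three different things, so it may help to compare them side by side. The sketch below is written for Python 3 and picks n = 4 arbitrarily: combinations ignores order and never repeats a value, permutations treats every ordering as distinct, and product additionally allows repeated values (like the triple loop).

```python
from itertools import combinations, permutations, product

n = 4  # example value chosen for illustration
values = range(n)

unordered = list(combinations(values, 3))       # order ignored, no repeats
ordered = list(permutations(values, 3))         # order matters, no repeats
with_repeats = list(product(values, repeat=3))  # order matters, repeats allowed

print(len(unordered), len(ordered), len(with_repeats))  # 4 24 64
print(unordered)  # [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
```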

How can I write a function to return all the binary numbers with N digits, and in sorted order? [closed]

Each binary number should be represented as a tuple. When the function is called, the result should be a tuple containing 2^N binary numbers.
Ex. Binary(2)----> ((0,0), (0,1), (1,0), (1,1))
I am trying to use a while loop to do this.
Just some advice on where I could begin would be very helpful.
You can use itertools.product, to get what you want
print [item for item in itertools.product([0, 1], repeat = 4)]
Output
[(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1), (0, 1, 0, 0),
(0, 1, 0, 1), (0, 1, 1, 0), (0, 1, 1, 1), (1, 0, 0, 0), (1, 0, 0, 1),
(1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 0, 0), (1, 1, 0, 1), (1, 1, 1, 0),
(1, 1, 1, 1)]
Change the repeat to the desired value.
Edit:
Performance comparison with list and comprehension.
print timeit.timeit("[item for item in itertools.product([0, 1], repeat = 4)]", number = 1000000)
print timeit.timeit("list(itertools.product([0, 1], repeat = 4))", number = 1000000)
List comprehension is slightly faster than list.
You'll probably need two loops -- an outer one to loop thru the values, and an inner one to process the binary digits for each value.
You can either loop thru the values as integers & convert them to binary -- or you can carry a "current value" in binary around the loop, copying & incrementing it.
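The first variant (loop over the values as integers and extract their binary digits) could look like the sketch below. It is written for Python 3, uses a while loop for the outer loop as the question requests, and the function name binary_tuples is an assumption.

```python
def binary_tuples(n_digits):
    """Return a tuple of all n_digits-wide binary numbers, each as a tuple of bits."""
    result = []
    value = 0
    while value < 2 ** n_digits:      # outer loop: every value from 0 to 2^N - 1
        digits = []
        remaining = value
        for _ in range(n_digits):     # inner loop: peel off one bit at a time
            digits.append(remaining % 2)
            remaining //= 2
        # bits were collected least significant first, so reverse them
        result.append(tuple(reversed(digits)))
        value += 1
    return tuple(result)


print(binary_tuples(2))  # ((0, 0), (0, 1), (1, 0), (1, 1))
```

Because the values are visited in increasing order, the result is already sorted.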
try this:
def binary(n):
    num_digits = len(bin(n).replace('0b',''))
    all_bin_numbers=()
    for i in range(n):
        bin_num=()
        for digit in str(bin(i)).replace('0b','').rjust(num_digits, '0'):
            bin_num += (int(digit),)
        all_bin_numbers += (bin_num,)
    return all_bin_numbers
print binary(2)
e: holy cow, that itertools answer.
e2: so it appears I didn't fully read the question, I was thinking you wanted all the binary numbers up to and including your specified n.
from numpy import binary_repr
[map(int, binary_repr(i, N)) for i in range(2**N)]
