How to iterate over lists from the middle out - python

I would like to iterate over all lists/tuples of length n with elements from -s...s. Currently I do this with:
for k in itertools.product(range(-s, s+1), repeat=n):
    # process k and maybe print out the result
However, this is not useful for me, as there are a huge number of such tuples and my code may never terminate. I would really like to start with the most interesting ones first. In this case the order I would like for the iteration is:
All tuples that contain only 0 (there is only one)
All tuples that contain only 0, 1 and -1 excluding those tuples we have already seen.
All tuples that contain only 0, 1, -1, 2 and -2, excluding those tuples we have already seen.
And so on...
How can one do this?

How about this:
import itertools
def sorted_tuples(length, max_s):
    nums = [0]
    for s in range(max_s):
        for p in itertools.combinations_with_replacement(nums, length):
            if s in p or -s in p:
                yield p
        nums = [-(s+1)] + nums + [s+1]
for i in sorted_tuples(3,2):
    print(i)
# prints the following
(0, 0, 0)
(-1, -1, -1)
(-1, -1, 0)
(-1, -1, 1)
(-1, 0, 0)
(-1, 0, 1)
(-1, 1, 1)
(0, 0, 1)
(0, 1, 1)
(1, 1, 1)
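Note that the generator above is built on combinations_with_replacement, so it yields only sorted tuples, and as the output shows, sorted_tuples(3, 2) stops at the values -1..1. If every ordered tuple over -s..s is needed, as in the question's itertools.product loop, a variant along the following lines might do. This is only a sketch: the name tuples_by_max_abs is made up here, it assumes n >= 1, and it regenerates smaller tuples at each level and filters them out, which is simple but not the cheapest approach.

```python
import itertools

def tuples_by_max_abs(n, s):
    """Yield every length-n tuple over -s..s, grouped by largest absolute value.

    Hypothetical helper (not from the original answer); assumes n >= 1.
    """
    for m in range(s + 1):
        # All tuples over -m..m, keeping only those that actually reach |m|,
        # so tuples seen at a smaller level are not repeated.
        for t in itertools.product(range(-m, m + 1), repeat=n):
            if max(abs(x) for x in t) == m:
                yield t

for t in tuples_by_max_abs(2, 1):
    print(t)
```

Because it is a generator, processing can stop at any point and the tuples with the smallest entries will already have been handled.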

So processing each tuple is much more expensive than sorting them?
In that case you can sort the list of tuples with a key argument. (What you call lists are actually tuples, at least with itertools in my Python 2.7.) I would convert each one to a NumPy array, since I think you cannot apply abs to a tuple directly. The sort call is then:
import numpy as np
lists.sort(key = lambda t: np.max(np.abs(np.array(t))))
Does this work fast enough?
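For what it's worth, the same sort key can probably be written without NumPy by taking abs element by element. A small sketch, where n and s are just example values and the whole product is materialised in a list first (so it only suits cases where that list fits in memory):

```python
import itertools

n, s = 2, 1  # example values, not from the question

# Build all tuples, then sort by the largest absolute value in each tuple.
tuples = list(itertools.product(range(-s, s + 1), repeat=n))
tuples.sort(key=lambda t: max(abs(x) for x in t))

for t in tuples:
    print(t)
```

Since list.sort is stable, tuples with the same largest absolute value keep the order in which product generated them.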

Related

Python Bug? What I'm doing wrong?

I'm trying to make a simple iterator which cycles through a list and returns three consecutive numbers from the list in python, but I get really weird result - code works fine only when numbers in the list are in ascending order.
import itertools
c=[0,1,2,3,0,5,6]
counter=itertools.cycle(c)
def func(x):
    if x==len(c)-1:
        return c[x],c[0],c[1]
    elif x==len(c)-2:
        return c[x],c[len(c)-1],c[0]
    else:
        return c[x],c[x+1],c[x+2]
for i in range(len(c)+2):
    print(func(next(counter)))
Atom prints the following. The problem shows up in the 5th tuple. Please help.
(0, 1, 2)
(1, 2, 3)
(2, 3, 0)
(3, 0, 5)
(0, 1, 2)
(5, 6, 0)
(6, 0, 1)
(0, 1, 2)
(1, 2, 3)
I believe you are confusing the values of c and the indices. It seems in func you expect that an index is passed but you are in fact passing a value from c. NOTE: counter is cycling over the values of c not over indices.
Also please note that in Python you can use negative indices, so you can write c[-1] as shorthand for c[len(c) - 1].
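One way to act on this is to cycle over the indices rather than the values and let the modulo operator handle the wrap-around. This is a sketch, not the asker's code; triple_at and index_cycle are names invented for the example.

```python
import itertools

c = [0, 1, 2, 3, 0, 5, 6]

def triple_at(i):
    """Return three consecutive items of c starting at index i, wrapping around."""
    n = len(c)
    return c[i % n], c[(i + 1) % n], c[(i + 2) % n]

# Cycle over positions 0..len(c)-1, not over the values stored in c.
index_cycle = itertools.cycle(range(len(c)))
triples = [triple_at(next(index_cycle)) for _ in range(len(c) + 2)]

for t in triples:
    print(t)
```

With indices, the repeated 0 in the list no longer matters: the 5th tuple comes out as (0, 5, 6).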

Sort a complex Python dictionary by just one of its values

I am writing a little optimization tool for purchasing stamps at the post office.
In the process I am using a dictionary, which I am sorting according to what I learned in this other "famous" question:
Sort a Python dictionary by value
In my case my dictionary is mildly more complex:
- one four-item-tuple to make the key
- and another five-item-tuple to make the data.
The origin of this dictionary is an iteration, where each successful loop is adding one line:
MyDicco[A, B, C, D] = eval, post, number, types, over
This is just a tiny example of a trivial run, trying for 75 cents:
{
    (0, 0, 1, 1): (22, 75, 2, 2, 0),
    (0, 0, 0, 3): (31, 75, 3, 1, 0),
    (0, 0, 2, 0): (2521, 100, 2, 1, 25),
    (0, 1, 0, 0): (12511, 200, 1, 1, 125),
    (1, 0, 0, 0): (27511, 350, 1, 1, 275),
}
So far I am using this code to sort (it is working):
MyDiccoSorted = sorted(MyDicco.items(), key=operator.itemgetter(1))
I am sorting by my evaluation-score, because the sorting is all about bringing the best solution to the top. The evaluation-score is just one datum out of a five-item-tuple (in the example those are the evaluation-scores: 22, 31, 2521, 12511 and 27511).
As you can see in the example above, it is sorting (as I want it) by the second tuple, index 1. But I had to (grumpily) bring my "evaluation-score" to the front of my second tuple. The code is obviously using the entire second-tuple for the sorting-process, which is heavy and not needed.
Here is my question: How can I sort more precisely? I do not want to sort by the entire second tuple of my dictionary: I want to target the first item precisely.
And ideally I would like to put this value back to its original position, namely to be the last item in the second tuple - and still sort by it.
I have read-up on and experimented with the syntax of operator.itemgetter() but have not managed to just "grab" the "first item of my second item".
https://docs.python.org/3/library/operator.html?highlight=operator.itemgetter#operator.itemgetter
(note: It is permissible to use tuples as keys and values, according to:
https://docs.python.org/3/tutorial/datastructures.html?highlight=dictionary
and those are working fine for my project; this question is just about better sorting)
For those who like a little background (you will yell at me that I should use some other method, but I am learning about dictionaries right now (which is one of the purposes of this project)):
This optimization is for developing countries, where often certain values of stamps are not available, or are limited in stock at any given post office. It will later run on Android phones.
We are doing regular mailings (yes, letters). Figuring out the exact postage for each destination with the available values and finding solutions with low stocks of certain values is a not-trivial process, if you consider six different destination-based-postages and hundreds of letters to mail.
There are other modules which help turning the theoretical optimum solution into something that can actually be purchased on any given day, by strategic dialog-guidance...
About my dictionary in this question:
I iterate over all reasonable (high enough to make the needed postage and only overpaying up to a fraction of one stamp) combinations of stamp-values.
Then I calculate a "success" value, which is based on the number of stamps needed (priority), the number of types needed (lower priority)(because purchasing different stamps takes extra time at the counter) and a very high penalty for paying-over. So lowest value means highest success.
I collect all reasonable "solutions" in a dictionary where the tuple of needed-stamps serves as the key, and another tuple of some results-data makes up the values. It is mildly over-defined because a human needs to read it at this phase in the project (for debugging).
If you are curious and want to read the example (first line):
The columns are:
number of stamps of 350 cents
number of stamps of 200 cents
number of stamps of 50 cents
number of stamps of 25 cents
evaluation-score
calculated applied postage
total number of stamps applied
total number of stamp-types
over-payment in cents if any
Or in words: (Assuming a postal service is offering existing stamps of 350, 200, 50 and 25 cents), I can apply postage of 75 cents by using 1x 50 cents and 1x 25 cents. This gives me a success-rating of 22 (the best in this list), postage is 75 cents, needing two stamps of two different values and having 0 cents overpayment.
You can just use a double index, something like this should work:
MyDiccoSorted = sorted(MyDicco.items(), key=lambda s: s[1][2])
Just set 2 to whatever the index is of the ID in the tuple.
I find it easier to use lambda expressions than to remember the various operator functions.
Assuming, for the moment, that your eval score is the 3rd item of your value tuple, i.e. (post, number, eval, types, over):
MyDiccoSorted = sorted(MyDicco.items(), key=lambda x: x[1][2])
Alternatively, you can create a named function to do the job:
def myKey(x): return x[1][2]
MyDiccoSorted = sorted(MyDicco.items(), key=myKey)
You can use a lambda expression instead of operator.itemgetter() to get the precise element to sort on. This assumes your eval is the first item in the tuple of values; otherwise, put the index of the element you want in place of the 0 in x[1][0]. Example:
MyDiccoSorted = sorted(MyDicco.items(), key=lambda x: x[1][0])
How this works -
A dict.items() returns something similar to a list of tuples (though not exactly that in Python 3.x) , Example -
>>> d = {1:2,3:4}
>>> d.items()
dict_items([(1, 2), (3, 4)])
Now, in sorted() function, the key argument accepts a function object (which can be lambda , or operator.itemgetter() which also return a function, or any simple function) , the function that you pass to key should accept one argument, which would be the element of the list being sorted.
Then that key function is called with each element, and you are expected to return the correct value to sort the list on. An example to help you understand this -
>>> def foo(x):
...     print('x =',x)
...     return x[1]
...
>>> sorted(d.items(),key=foo)
x = (1, 2)
x = (3, 4)
[(1, 2), (3, 4)]
Does this do what you need?
sorted(MyDicco.items(), key=lambda x: x[1][0])
index_of_evaluation_score = 0
MyDiccoSorted = sorted(MyDicco.items(), key=lambda key_value: key_value[1][index_of_evaluation_score])
Placing your evaluation score back at the end where you wanted it, you can use the following:
MyDicco = {
    (0, 0, 1, 1): (75, 2, 2, 0, 22),
    (0, 0, 0, 3): (75, 3, 1, 0, 31),
    (0, 0, 2, 0): (100, 2, 1, 25, 2521),
    (0, 1, 0, 0): (200, 1, 1, 125, 12511),
    (1, 0, 0, 0): (350, 1, 1, 275, 27511)}
MyDiccoSorted = sorted(MyDicco.items(), key=lambda x: x[1][4])
print(MyDiccoSorted)
Giving:
[((0, 0, 1, 1), (75, 2, 2, 0, 22)), ((0, 0, 0, 3), (75, 3, 1, 0, 31)), ((0, 0, 2, 0), (100, 2, 1, 25, 2521)), ((0, 1, 0, 0), (200, 1, 1, 125, 12511)), ((1, 0, 0, 0), (350, 1, 1, 275, 27511))]
I think one of the things you might be looking for is a stable sort.
Sorting functions in Python are generally "stable" sorts. For example, if you sort:
1 4 6
2 8 1
1 2 3
2 1 8
by its first column, you'll get:
1 4 6
1 2 3
2 8 1
2 1 8
The order of rows sharing the same value in column 1 does not change. 1 4 6 is sorted before 1 2 3 because that was the original order of these rows before the column 1 sort. Sorting has been 'stable' since version 2.2 of Python. More details here.
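The four-row example above can be reproduced directly with sorted() and a key that looks only at the first column. A minimal sketch (rows and by_first are names chosen for illustration):

```python
rows = [(1, 4, 6), (2, 8, 1), (1, 2, 3), (2, 1, 8)]

# Sort on column 0 only; rows that tie keep their original relative order.
by_first = sorted(rows, key=lambda row: row[0])

for row in by_first:
    print(row)
```

Here (1, 4, 6) stays ahead of (1, 2, 3) even though it would sort later if whole tuples were compared, which is what stability means in practice.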
On another note I'm interested in how much you had to explain your code. That is a sign that the code would benefit from refactoring to make its purpose clearer.
Named tuples could be used to remove the hard-to-read tuple indices you see in many answers here, e.g. key=lambda x: x[1][0] -- what does that actually mean? What is it doing?
Here's a version using named tuples that helps readers (most importantly, you!) understand what your code is trying to do. Note how the lambda now explains itself much better.
from collections import namedtuple
StampMix = namedtuple('StampMix', ['c350', 'c200', 'c50', 'c25'])
Stats = namedtuple('Stats', ['score', 'postage', 'stamps', 'types', 'overpayment'])
data = {
    (0, 0, 1, 1): (22, 75, 2, 2, 0),
    (0, 0, 0, 3): (31, 75, 3, 1, 0),
    (0, 0, 2, 0): (2521, 100, 2, 1, 25),
    (0, 1, 0, 0): (12511, 200, 1, 1, 125),
    (1, 0, 0, 0): (27511, 350, 1, 1, 275)
}
candidates = {}
for stampmix, stats in data.items():
    candidates[StampMix(*stampmix)] = Stats(*stats)
print(sorted(candidates.items(), key=lambda candidate: candidate[1].score))
You can see the benefits of this approach in the output:
$ python namedtuple.py
(prettied-up output follows...)
[
(StampMix(c350=0, c200=0, c50=1, c25=1), Stats(score=22, postage=75, stamps=2, types=2, overpayment=0)),
(StampMix(c350=0, c200=0, c50=0, c25=3), Stats(score=31, postage=75, stamps=3, types=1, overpayment=0)),
(StampMix(c350=0, c200=0, c50=2, c25=0), Stats(score=2521, postage=100, stamps=2, types=1, overpayment=25)),
(StampMix(c350=0, c200=1, c50=0, c25=0), Stats(score=12511, postage=200, stamps=1, types=1, overpayment=125)),
(StampMix(c350=1, c200=0, c50=0, c25=0), Stats(score=27511, postage=350, stamps=1, types=1, overpayment=275))
]
and it will help with your algorithms too. For example:
def score(stats):
    return stats.postage * stats.stamps * stats.types + 1000 * stats.overpayment

Efficient enumeration of ordered subsets in Python

I'm not sure of the appropriate mathematical terminology for the code I'm trying to write. I'd like to generate combinations of unique integers, where "ordered subsets" of each combination are used to exclude certain later combinations.
Hopefully an example will make this clear:
from itertools import chain, combinations
mylist = range(4)
max_depth = 3
rev = chain.from_iterable(combinations(mylist, i) for i in range(max_depth, 0, -1))
for el in list(rev):
    print(el)
That code results in output that contains all the subsets I want, but also some extra ones that I do not. I have manually inserted comments to indicate which elements I don't want.
(0, 1, 2)
(0, 1, 3)
(0, 2, 3)
(1, 2, 3)
(0, 1) # Exclude: (0, 1, _) occurs as part of (0, 1, 2) above
(0, 2) # Exclude: (0, 2, _) occurs above
(0, 3) # Keep
(1, 2) # Exclude: (1, 2, _) occurs above
(1, 3) # Keep: (_, 1, 3) occurs above, but (1, 3, _) does not
(2, 3) # Keep
(0,) # Exclude: (0, _, _) occurs above
(1,) # Exclude: (1, _, _) occurs above
(2,) # Exclude: (2, _) occurs above
(3,) # Keep
Thus, the desired output of my generator or iterator would be:
(0, 1, 2)
(0, 1, 3)
(0, 2, 3)
(1, 2, 3)
(0, 3)
(1, 3)
(2, 3)
(3,)
I know I could make a list of all the (wanted and unwanted) combinations and then filter out the ones I don't want, but I was wondering if there was a more efficient, generator or iterator based way.
You are trying to exclude any combination that is a prefix of a previously-returned combination. Doing so is straightforward.
If a tuple t has length max_depth, it can't be a prefix of a previously-returned tuple, since any tuple it's a prefix of would have to be longer.
If a tuple t ends with mylist[-1], then it can't be a prefix of a previously-returned tuple, since there are no elements that could legally be added to the end of t to extend it.
If a tuple t has length less than max_depth and does not end with mylist[-1], then t is a prefix of the previously-returned tuple t + (mylist[-1],), and t should not be returned.
Thus, the combinations you should generate are exactly the ones of length max_depth and the shorter ones that end with mylist[-1]. The following code does so, in exactly the same order as your original code, and correctly handles cases like max_depth > len(mylist):
from itertools import combinations
def nonprefix_combinations(iterable, maxlen):
    iterable = list(iterable)
    if not (iterable and maxlen):
        return
    for comb in combinations(iterable, maxlen):
        yield comb
    for length in range(maxlen-2, -1, -1):
        for comb in combinations(iterable[:-1], length):
            yield comb + (iterable[-1],)
(I've assumed here that in the case where max_depth == 0, you still don't want to include the empty tuple in your output, even though for max_depth == 0, it isn't a prefix of a previously-returned tuple. If you do want the empty tuple in this case, you can change if not (iterable and maxlen) to if not iterable.)
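As a cross-check on the prefix rule described above, the slow approach the question wanted to avoid can be written out and its output compared against the desired list. This is only a brute-force sketch for verification; filter_prefixes is a made-up name, and it keeps every returned tuple in memory.

```python
from itertools import chain, combinations

def filter_prefixes(items, max_depth):
    """Generate all combinations, longest first, and drop any tuple that is
    a prefix of one already returned. Hypothetical helper for checking only."""
    items = list(items)
    returned = []
    all_combos = chain.from_iterable(
        combinations(items, k) for k in range(max_depth, 0, -1))
    for comb in all_combos:
        # comb is a prefix of prev when prev starts with the same elements.
        if any(prev[:len(comb)] == comb for prev in returned):
            continue
        returned.append(comb)
        yield comb

for el in filter_prefixes(range(4), 3):
    print(el)
```

For range(4) and a depth of 3 this yields the same eight tuples, in the same order, as the desired output in the question.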
I noticed an interesting pattern in your desired output and I have a generator that produces that. Does this work for all your cases?
from itertools import combinations
def orderedSetCombination(iterable, r):
    # Get the last element of the iterable
    last = (iterable[-1], )
    # yield all the combinations of the iterable without the
    # last element
    for iter in combinations(iterable[:-1], r):
        yield iter
    # while r > 1 reduce r by 1 and yield all the combinations
    while r>1:
        r -= 1
        for iter in combinations(iterable[:-1], r):
            yield iter+last
    # yield the last item
    yield last
iter = [0,1,2,3]
for el in (list(orderedSetCombination(iter, 3))):
    print(el)
Here is my explanation of the logic:
# All combinations that do not include the last element of the iterable
# taking r = max_depth items at a time
(0,1,2)
# from here on, it's the combinations of all the elements except
# the last element and the last element is added to it.
# so here taking r = r -1 items at a time and adding the last element
# combinations([0,1,2], r=2)
(0,1,3)
(0,2,3)
(1,2,3)
# the only possible value right now at index r = 2 is the last element (3)
# since all possible values of (0,1,_) (0,2,_) (1,2,_) are already listed
# So reduce r by 1 again and continue: combinations([0,1,2], r=1)
(0, 3)
(1, 3)
(2, 3)
# continue until r == 0 and then yield the last element
(3,)

All possible combination of 3 numbers in a set in Python

I want to print all possible combination of 3 numbers from the set (0 ... n-1), while each one of those combinations is unique. I get the variable n via this code:
n = raw_input("Please enter n: ")
But I'm stuck at coming up with the algorithm. Any help please?
from itertools import combinations
list(combinations(range(n),3))
This works as long as you are using Python 2.6 or later.
If you want all possible combinations where values may repeat and position matters, you need to use product like this:
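One detail worth spelling out: the value read from the prompt is a string, so it has to be converted to an integer before it can be passed to range. A sketch assuming Python 3, where input() takes the place of raw_input(); unique_triples is a name invented here, and the hard-coded "4" stands in for whatever the user would type.

```python
from itertools import combinations

def unique_triples(n):
    """All 3-element combinations of 0..n-1, ignoring order."""
    return list(combinations(range(n), 3))

# In an interactive script this would be: n = int(input("Please enter n: "))
n = int("4")
print(unique_triples(n))
```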
from itertools import product
t = range(n)
print set(product(set(t),repeat = 3))
For example, if n = 2, the output will be:
set([(0, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1), (0, 0, 0), (0, 1, 0), (1, 1, 1)])
hope this helps
itertools is your friend here, specifically permutations.
Demo:
from itertools import permutations
for item in permutations(range(n), 3):
    print item
This is assuming you have Python 2.6 or newer.
combos = []
for x in xrange(n):
    for y in xrange(n):
        for z in xrange(n):
            combos.append([x,y,z])
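The answers above read "combination" in three different ways, and the result sizes differ a lot. A small side-by-side sketch (Python 3 syntax, n = 4 picked arbitrarily) may help in choosing the right one:

```python
from itertools import combinations, permutations, product

n = 4  # arbitrary example value

unordered = list(combinations(range(n), 3))       # distinct values, order ignored
ordered = list(permutations(range(n), 3))         # distinct values, order matters
with_repeats = list(product(range(n), repeat=3))  # values may repeat, order matters

print(len(unordered), len(ordered), len(with_repeats))
```

The triple nested loop builds the same triples as product, just as lists instead of tuples.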

Generating a list of repetitions regardless of the order

I want to generate combinations that associate indices in a list with "slots". For instance, (0, 0, 1) means that 0 and 1 belong to the same slot while 2 belongs to another. (0, 1, 1, 1) means that 1, 2, 3 belong to the same slot while 0 is by itself. In this example, 0 and 1 are just ways of identifying these slots but do not carry information for my usage.
Consequently, (0, 0, 0) is absolutely identical to (1, 1, 1) for my purposes, and (0, 0, 1) is equivalent to (1, 1, 0).
The classical cartesian product generates a lot of these repetitions I'd like to get rid of.
This is what I obtain with itertools.product :
>>> LEN, SIZE = (3,1)
>>> list(itertools.product(range(SIZE+1), repeat=LEN))
[(0, 0, 0),
 (0, 0, 1),
 (0, 1, 0),
 (0, 1, 1),
 (1, 0, 0),
 (1, 0, 1),
 (1, 1, 0),
 (1, 1, 1)]
And this is what I'd like to get:
[(0, 0, 0),
 (0, 0, 1),
 (0, 1, 0),
 (0, 1, 1)]
It is easy with small lists but I don't quite see how to do this with bigger sets. Do you have a suggestion?
If it's unclear, please tell me so that I can clarify my question. Thank you!
Edit: based on Sneftel's answer, this function seems to work, but I don't know if it actually yields all the results:
from itertools import product
def test():
    for p in product(range(2), repeat=3):
        j = -1
        good = True
        for k in p:
            if k > j and (k-j) > 1:
                good = False
            elif k > j:
                j = k
        if good:
            yield p
I would start by making the following observations:
The first element of each combination must be 0.
The second element must be 0 or 1.
The third element must be 0, 1 or 2, but it can only be 2 if the second element was 1.
These observations suggest the following algorithm:
def assignments(n, m, used=0):
    """Generate assignments of `n` items to `m` indistinguishable
    buckets, where `used` buckets have been used so far.
        >>> list(assignments(3, 1))
        [(0, 0, 0)]
        >>> list(assignments(3, 2))
        [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
        >>> list(assignments(3, 3))
        [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 1, 2)]
    """
    if n == 0:
        yield ()
        return
    aa = list(assignments(n - 1, m, used))
    for first in range(used):
        for a in aa:
            yield (first,) + a
    if used < m:
        for a in assignments(n - 1, m, used + 1):
            yield (used,) + a
This handles your use case (12 items, 5 buckets) in a few seconds:
>>> from timeit import timeit
>>> timeit(lambda:list(assignments(12, 5)), number=1)
4.513746023178101
>>> sum(1 for _ in assignments(12, 5))
2079475
This is substantially faster than the function you give at the end of your question (the one that calls product and then drops the invalid assignments) would be if it were modified to handle the (12, 5) use case:
>>> timeit(lambda:list(test(12, 5)), number=1)
540.693009853363
Before checking for duplicates, you should harmonize the notation (assuming you don't want to set up some fancy AI): iterate through the lists and assign set-affiliation numbers for differing elements starting at 0, counting upwards. That is, you create a temporary dictionary per line that you are processing.
An exemplary output would be
(0,0,0) -> (0,0,0)
(0,1,0) -> (0,1,0)
but
(1,0,1) -> (0,1,0)
Removing the duplicates can then easily be performed as the problem is reduced to the problem of the solved question at Python : How to remove duplicate lists in a list of list?
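The relabelling step described in this answer could look roughly like the sketch below. The function name canonical is invented for the example; it numbers the labels 0, 1, 2, ... in order of first appearance using a temporary dictionary, and a set then removes the duplicates.

```python
from itertools import product

def canonical(labels):
    """Relabel a tuple so labels are numbered by first appearance, from 0."""
    mapping = {}
    # setdefault returns the existing number for a known label, or stores
    # the next free number (the current dict size) for a new one.
    return tuple(mapping.setdefault(x, len(mapping)) for x in labels)

print(canonical((1, 0, 1)))

# Deduplicate the cartesian product by its canonical form.
unique = sorted({canonical(p) for p in product(range(2), repeat=3)})
print(unique)
```

This still walks the whole cartesian product, so it is a way of removing duplicates rather than a way of avoiding the work of generating them.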
If you only consider the elements of the cartesian product where the first occurrences of all indices are sorted and consecutive from zero, that should be sufficient. itertools.combinations_with_replacement() will eliminate those that are not sorted, so you'll only need to check that indices aren't being skipped.
In your specific case you could simply take the first or the second half of the list of those items produced by a cartesian product.
import itertools
alphabet = '01'
words3Lettered = [''.join(letter) for letter in itertools.product(alphabet,repeat=3)]
For n-lettered words, use repeat=n.
words3Lettered looks like this:
['000', '001', '010', '011', '100', '101', '110', '111']
next,
usefulWords = words3Lettered[:len(words3Lettered)//2]
which looks like this:
['000', '001', '010', '011']
You might be interested in the other half, i.e. words3Lettered[len(words3Lettered)//2:], though the other half was supposed to "fold" onto the first half.
Most probably you want to use the combination of letters in numeric form, so:
indexes = [tuple(int(j) for j in word) for word in usefulWords]
which gives us:
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
