Getting the endpoints of sets across a range - python

For the life of me, I can't see how to do this. I need to collect the non-overlapping endpoints of several sets within a range of numbers with python.
For example the user could input a range of 10 and two sets 2 and 3. I need to get the end points of these sets within this range such that:
set 2 groupings: 1-2,6-7
set 3 groupings: 3-5,8-10
The range, number of sets, and size of any individual set is arbitrary. I cannot fall outside the range, so no half sets.
I keep thinking there should be a simple formula for this, but I can't come up with it.
Edit
As requested for an example input of range 12, and sets 1, 2, and 3 the output should be:
set 1: 1,7
set 2: 2-3,8-9
set 3: 4-6,10-12
As near as I can figure, I'm looking at some kind of accumulator pattern. Something like this pseudocode:
for each miniRange in range:
    for each set in sets:
        listOfCurrSetEndpoints.append((start, end))

I don't think there's a good built-in solution to this. (It would be easier if there were a built-in equivalent to Haskell's scan function.) But this is concise enough:
>>> import itertools
>>> from collections import defaultdict
>>> partition_lengths = [1, 2, 3]
>>> range_start = 1
>>> range_end = 12
>>> endpoints = defaultdict(list)
>>> for p_len in itertools.cycle(partition_lengths):
...     end = range_start + p_len - 1
...     if end > range_end: break
...     endpoints[p_len].append((range_start, end))
...     range_start += p_len
...
>>> endpoints
defaultdict(<type 'list'>, {1: [(1, 1), (7, 7)], 2: [(2, 3), (8, 9)], 3: [(4, 6), (10, 12)]})
You can now format the endpoints dictionary for output however you like.
As an aside, I'm really confused by your use of "set" in this question, which is why I used "partition" instead.
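For reuse, the loop above can be wrapped in a function. The following is a minimal Python 3 sketch, not a definitive implementation: the name partition_endpoints and its parameters are made up for illustration, and it assumes every length is a positive integer (a zero length would loop forever) and that the lengths are distinct, since they serve as dictionary keys.

```python
from collections import defaultdict
from itertools import cycle


def partition_endpoints(lengths, range_end, range_start=1):
    """Map each partition length to its (start, end) pairs within the range.

    Hypothetical helper: assumes positive, distinct lengths.
    """
    endpoints = defaultdict(list)
    start = range_start
    for length in cycle(lengths):
        end = start + length - 1
        if end > range_end:
            # The next partition would fall outside the range: stop here.
            break
        endpoints[length].append((start, end))
        start += length
    return dict(endpoints)


for length, spans in sorted(partition_endpoints([2, 3], 10).items()):
    print("set {} groupings: {}".format(
        length, ",".join("{}-{}".format(a, b) for a, b in spans)))
```

With lengths [2, 3] and a range of 10, this prints the two groupings given in the question.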

I'm not exactly happy with it, but I did get a working program. If someone can come up with a better answer, I'll be happy to accept it instead.
import argparse, sys
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Take a number of pages, and the pages in several sets, to produce an output for copy and paste into the print file downloader', version='%(prog)s 2.0')
    parser.add_argument('pages', type=int, help='Total number of pages to break into sets')
    parser.add_argument('stapleset', nargs='+', type=int, help='number of pages in each set')
    args = parser.parse_args()
    data = {}
    for c,s in enumerate(args.stapleset):
        data[c] = []
    currPage = 0
    while currPage <= args.pages:
        for c,s in enumerate(args.stapleset):
            if currPage + 1 > args.pages:
                pass
            elif currPage + s > args.pages:
                data[c].append((currPage+1,args.pages))
            else:
                data[c].append((currPage+1,currPage+s))
            currPage = currPage + s
    for key in sorted(data.iterkeys()):
        for c,t in enumerate(data[key]):
            if c > 0:
                sys.stdout.write(",")
            sys.stdout.write("{0}-{1}".format(t[0],t[1]))
        sys.stdout.write("\n\n")

Related

Print FormatList Issue

I'm just having a small formatting issue right now. My current code prints my output in a wonky way and I'm trying to make it look smoother. How would I change my print formatting?
height = {}
length = len(preys)
rank = 0
while preys != [None]*length:
    for index,(animal,prey) in enumerate(zip(animals,preys)):
        if prey not in animals:
            try:
                if height[prey] < rank:
                    height[prey] = rank
            except KeyError:
                height[prey] = 0
            height[animal] = height[prey] + 1
            preys[index] = None
            animals[index] = None
    rank += 1
for arg in sys.argv:
    print (sorted (height.items(),key = lambda x:x[1],reverse=True))
if __name__ == "__main__":
    main()
The output looks like this:
[('Lobster', 4), ('Bird', 4), ('Fish', 3), ('Whelk', 3), ('Crab', 3), ('Mussels', 2), ('Prawn', 2), ('Zooplankton', 1), ('Limpets', 1), ('Phytoplankton', 0), ('Seaweed', 0)]
and I'm trying to make it look like:
Heights:
Bird: 4
Crab: 3
Fish: 3
Limpets: 1
Lobster: 4
Mussels: 2
Phytoplankton: 0
Prawn: 2
Seaweed: 0
Whelk: 3
Zooplankton: 1
I've attempted to use print(formatList(height)); however, printing anything before the "height" causes errors.
Since the output only prints once, we know that sys.argv wasn't passed any extra arguments. There's no need to create a loop that will only be executed once (and, if there were multiple arguments, it isn't generally useful to print the same output several times). Instead, loop over the height object itself.
You are also currently sorting by value, when it's obvious that you want to sort by key. Since the latter is the default sort anyway, I'm not sure why you added (messy) code to sort by value.
Use string formatting to express the appearance you want for each item.
print('Heights:')
for item in sorted(height.items()):
    print('{}: {}'.format(*item))
Note that sorting an iterable of tuple objects will sort by the first element first, and then, if there are two tuples with the same first item, they will be sorted by the second element. For example, ('Bird', 1) would come before ('Bird', 2). Since dictionaries can't have duplicate keys, this won't be an issue here, but it's something to keep in mind.
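As a small runnable illustration of both points, here is a sketch that uses the values from the question's output as sample data. The helper name format_heights is hypothetical and not part of the original code.

```python
# Sample data copied from the output shown in the question.
height = {'Lobster': 4, 'Bird': 4, 'Fish': 3, 'Whelk': 3, 'Crab': 3,
          'Mussels': 2, 'Prawn': 2, 'Zooplankton': 1, 'Limpets': 1,
          'Phytoplankton': 0, 'Seaweed': 0}


def format_heights(height):
    """Return the report as one string, sorted by animal name."""
    lines = ['Heights:']
    for name, value in sorted(height.items()):
        lines.append('{}: {}'.format(name, value))
    return '\n'.join(lines)


print(format_heights(height))

# Tuples compare element by element, so a tie on the first item
# falls back to the second item.
pairs = sorted([('Bird', 2), ('Crab', 3), ('Bird', 1)])
print(pairs)
```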
I think you can try something like this:
sorted_list = sorted(height.items(), key=lambda x: x[1], reverse=True)
print(''.join('%s : %s\n' % x for x in sorted_list))
Or in one expression (if that isn't too ugly for you):
print(''.join('%s : %s\n' % x
              for x in sorted(height.items(), key=lambda x: x[1], reverse=True)))

Is it possible to write a combination function with recursion technique?

Yesterday, I encountered a problem which requires calculating the combinations of an iterable with range 5 (that is, 5 elements per combination).
Instead of using itertools.combinations, I tried to write a primitive function of my own. It looks like:
def combine_5(elements):
    """Find all combinations in elements with range 5."""
    temp_list = []
    for i in elements:
        cur_index = elements.index(i)
        for j in elements[cur_index+1 : ]:
            cur_index = elements.index(j)
            for k in elements[cur_index+1 : ]:
                cur_index = elements.index(k)
                for n in elements[cur_index+1 : ]:
                    cur_index = elements.index(n)
                    for m in elements[cur_index+1 : ]:
                        temp_list.append((i,j,k,n,m))
    return temp_list
Then I thought maybe I can abstract it a bit, to make a combine_n function. And below is my initial blueprint:
# Unfinished version of combine_n
def combine_n(elements, r, cur_index=-1):
    """Find all combinations in elements with range n"""
    r -= 1
    target_list = elements[cur_index+1 : ]
    for i in target_list:
        cur_index = elements.index(i)
        if r > 0:
            combine_n(elements, r, cur_index)
            pass
        else:
            pass
I've been stuck there for a whole day. The major problem is that I can't pass a value properly inside the recursive function. I added some code that fixed one problem, but because it runs at every level of the recursion, new problems arose. More fixes led to more bugs, a vicious cycle.
Then I turned to the source code of itertools.combinations for help, and it turns out it doesn't use recursion.
Do you think it is possible to generalize this combine_5 function into a combine_n function using recursion? Do you have any ideas about how to implement it?
FAILURE SAMPLE 1:
def combine_n(elements, r, cur_index=-1):
    """Find all combinations in elements with range n"""
    r -= 1
    target_list = elements[cur_index+1 : ]
    for i in target_list:
        cur_index = elements.index(i)
        if r > 0:
            combine_n(elements, r, cur_index)
            print i
        else:
            print i
This is my most recent attempt after a bunch of overcomplicated experiments.
The core idea is: if I can print them correctly, I can collect them into a container later.
But the problem arises in a nested for loop, when an inner for-loop is handed an empty list.
In that case the temp_list.append((i,j,k,n,m)) clause of combine_5 never runs.
But in FAILURE SAMPLE 1, the outer for-loop still prints its element;
for example, combine_n([1,2], 2) will print 2, 1, 2.
I need to find a way to signal this empty result to the outer for-loop,
which I haven't figured out so far.
Yes, it's possible to do it with recursion. You can make combine_n return a list of tuples with all the combinations beginning at index cur_index, and starting with a partial combination of cur_combo, which you build up as you recurse:
def combine_n(elements, r, cur_index=0, cur_combo=()):
    r-=1
    temp_list = []
    for elem_index in range(cur_index, len(elements)-r):
        i = elements[elem_index]
        if r > 0:
            temp_list = temp_list + combine_n(elements, r, elem_index+1, cur_combo+(i,))
        else:
            temp_list.append(cur_combo+(i,))
    return temp_list
elements = list(range(1,6))
print(combine_n(elements, 3))
output:
[(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 5)]
The for loop only goes up to len(elements)-r, because if you go further than that then there aren't enough remaining elements to fill the remaining places in the tuple. The tuples only get added to the list with append at the last level of recursion, then they get passed back up the call stack by returning the temp_lists and concatenating at each level back to the top.
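The same recursion can also be written as a generator, which avoids building and concatenating intermediate lists. This is a Python 3 sketch of that variation, not the answer's own code: the name combine_n_gen and its parameters are chosen here for illustration, and the result is compared against itertools.combinations as a sanity check.

```python
from itertools import combinations


def combine_n_gen(elements, r, start=0, prefix=()):
    """Yield every r-element combination of elements, in index order."""
    if r == 0:
        # The partial combination is complete.
        yield prefix
        return
    # Stop early enough that r - 1 elements remain after the chosen one.
    for index in range(start, len(elements) - r + 1):
        yield from combine_n_gen(elements, r - 1, index + 1,
                                 prefix + (elements[index],))


result = list(combine_n_gen([1, 2, 3, 4, 5], 3))
print(result == list(combinations([1, 2, 3, 4, 5], 3)))
```

Because the loop bound leaves room for the remaining picks, an inner call never starts a combination it cannot finish, which is the "empty list" case the question was struggling with.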

How to split a list into subsets with no repeating elements in python

I need code that takes a list (up to n=31) and returns all possible subsets of n=3 without any two elements repeating in the same subset twice (think of people who are teaming up in groups of 3 with new people every time):
list=[1,2,3,4,5,6,7,8,9]
and returns
[1,2,3][4,5,6][7,8,9]
[1,4,7][2,5,8][3,6,9]
[1,6,8][2,4,9][3,5,7]
but not:
[1,5,7][2,4,8][3,6,9]
because 1 and 7 have appeared together already (likewise, 3 and 9).
I would also like to do this for subsets of n=2.
Thank you!!
Here's what I came up with:
from itertools import permutations, combinations, ifilter, chain
people = [1,2,3,4,5,6,7,8,9]
#get all combinations of 3 sets of 3 people
combos_combos = combinations(combinations(people,3), 3)
#filter out sets that don't contain all 9 people
valid_sets = ifilter(lambda combo:
                     len(set(chain.from_iterable(combo))) == 9,
                     combos_combos)
#a set of people that have already been paired
already_together = set()
for sets in valid_sets:
    #get all (sorted) combinations of pairings in this set
    pairings = list(chain.from_iterable(combinations(combo, 2) for combo in sets))
    pairings = set(map(tuple, map(sorted, pairings)))
    #if all of the pairings have never been paired before, we have a new one
    if len(pairings.intersection(already_together)) == 0:
        print sets
        already_together.update(pairings)
This prints:
~$ time python test_combos.py
((1, 2, 3), (4, 5, 6), (7, 8, 9))
((1, 4, 7), (2, 5, 8), (3, 6, 9))
((1, 5, 9), (2, 6, 7), (3, 4, 8))
((1, 6, 8), (2, 4, 9), (3, 5, 7))
real 0m0.182s
user 0m0.164s
sys 0m0.012s
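The code above targets Python 2 (ifilter, print statement). A rough Python 3 sketch of the same greedy filter follows; the function name greedy_rounds and its parameters are hypothetical, and it assumes the number of people is an exact multiple of the group size. Like the original, it is brute force and only practical for small inputs.

```python
from itertools import chain, combinations


def greedy_rounds(people, group_size=3):
    """Greedily collect rounds in which no pair of people meets twice."""
    groups_per_round = len(people) // group_size
    candidate_rounds = combinations(combinations(people, group_size),
                                    groups_per_round)
    already_together = set()
    rounds = []
    for groups in candidate_rounds:
        # Skip selections that do not cover every person exactly once.
        if len(set(chain.from_iterable(groups))) != len(people):
            continue
        # All pairs that would meet in this round, as sorted tuples.
        pairings = set(chain.from_iterable(
            combinations(sorted(group), 2) for group in groups))
        if pairings.isdisjoint(already_together):
            rounds.append(groups)
            already_together |= pairings
    return rounds


for groups in greedy_rounds(list(range(1, 10))):
    print(groups)
```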
Try this:
from itertools import permutations
lst = list(range(1, 10))
n = 3
triplets = list(permutations(lst, n))
triplets = [set(x) for x in triplets]
def array_unique(seq):
    checked = []
    for x in seq:
        if x not in checked:
            checked.append(x)
    return checked
triplets = array_unique(triplets)
result = []
m = n * 3
for x in triplets:
    for y in triplets:
        for z in triplets:
            if len(x.union(y.union(z))) == m:
                result += [[x, y, z]]
def groups(sets, i):
    result = [sets[i]]
    for x in sets:
        flag = True
        for y in result:
            for r in x:
                for p in y:
                    if len(r.intersection(p)) >= 2:
                        flag = False
                        break
                    else:
                        continue
                if flag == False:
                    break
        if flag == True:
            result.append(x)
    return result
for i in range(len(result)):
    print('%d:' % (i + 1))
    for x in groups(result, i):
        print(x)
Output for n = 10:
http://pastebin.com/Vm54HRq3
Here's my attempt of a fairly general solution to your problem.
from itertools import combinations
n = 3
l = range(1, 10)
def f(l, n, used, top):
    if len(l) == n:
        if all(set(x) not in used for x in combinations(l, 2)):
            yield [l]
    else:
        for group in combinations(l, n):
            if any(set(x) in used for x in combinations(group, 2)):
                continue
            for rest in f([i for i in l if i not in group], n, used, False):
                config = [list(group)] + rest
                if top:
                    # Running at top level, this is a valid
                    # configuration. Update used list.
                    for c in config:
                        used.extend(set(x) for x in combinations(c, 2))
                yield config
                break
for i in f(l, n, [], True):
    print i
However, it is very slow for high values of n, too slow for n=31. I don't have time right now to try to improve the speed, but I might try later. Suggestions are welcome!
My wife had this problem trying to arrange breakout groups for a meeting with nine people; she wanted no pairs of attendees to repeat.
I immediately busted out itertools and was stumped and came to StackOverflow. But in the meantime, my non-programmer wife solved it visually. The key insight is to create a tic-tac-toe grid:
1 2 3
4 5 6
7 8 9
And then simply take 3 groups going down, 3 groups going across, and 3 groups going diagonally wrapping around, and 3 groups going diagonally the other way, wrapping around.
You can do it just in your head then.
- : 123,456,789
| : 147,258,369
\ : 159,267,348
/ : 168,249,357
I suppose the next question is how far can you take a visual method like this? Does it rely on the coincidence that the desired subset size * the number of subsets = the number of total elements?
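The grid method can be checked mechanically. Here is a minimal Python 3 sketch, assuming a square grid numbered row by row; the function name grid_rounds is made up for illustration. It only verifies the 3 x 3 case described above. Whether the same four directions suffice for other grid sizes is not something this sketch establishes.

```python
from itertools import combinations


def grid_rounds(size=3):
    """Rows, columns, and both wrapped diagonals of a size x size grid."""
    grid = [[r * size + c + 1 for c in range(size)] for r in range(size)]
    rows = [tuple(grid[r]) for r in range(size)]
    cols = [tuple(grid[r][c] for r in range(size)) for c in range(size)]
    # Diagonals step one column right (or left) per row, wrapping around.
    down = [tuple(grid[r][(c + r) % size] for r in range(size))
            for c in range(size)]
    up = [tuple(grid[r][(c - r) % size] for r in range(size))
          for c in range(size)]
    return [rows, cols, down, up]


rounds = grid_rounds(3)
pairs = [frozenset(pair)
         for groups in rounds
         for group in groups
         for pair in combinations(group, 2)]
# No pair repeats if every pair collected is distinct.
print(len(pairs) == len(set(pairs)))
```

For nine people there are 36 possible pairs, and the four rounds use each of them exactly once, so no fifth round is possible.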

Number of elements in Python Set

I have a list of phone numbers that have been dialed (nums_dialed).
I also have a set of phone numbers which are the numbers in a client's office (client_nums).
How do I efficiently figure out how many times I've called a particular client (total)?
For example:
>>>nums_dialed=[1,2,2,3,3]
>>>client_nums=set([2,3])
>>>???
total=4
Problem is that I have a large-ish dataset: len(client_nums) ~ 10^5; and len(nums_dialed) ~10^3.
Which client has 10^5 numbers in his office? Do you do work for an entire telephone company?
Anyway:
print sum(1 for num in nums_dialed if num in client_nums)
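That will give you the count quickly: each membership test against the set takes constant time on average, so the whole pass is linear in len(nums_dialed).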
If you want to do it for multiple clients, using the same nums_dialed list, then you could cache the data on each number first:
import collections
nums_dialed_dict = collections.defaultdict(int)
for num in nums_dialed:
    nums_dialed_dict[num] += 1
Then just sum the ones on each client:
sum(nums_dialed_dict[num] for num in this_client_nums)
That would be a lot quicker than iterating over the entire list of numbers again for each client.
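A short Python 3 sketch of this caching idea follows. The client names and numbers are invented for illustration, and collections.Counter stands in for the defaultdict. Since the question says the dialed list (about 10^3 entries) is much smaller than a client's set (about 10^5), this version iterates over the distinct dialed numbers and tests membership in the client set, rather than iterating over the client set.

```python
from collections import Counter

nums_dialed = [1, 2, 2, 3, 3]
# Hypothetical client data for illustration only.
clients = {
    'client_a': {2, 3},
    'client_b': {1},
    'client_c': {7, 8},
}

# Count each dialed number once, up front.
dialed_count = Counter(nums_dialed)

# For each client, add up the counts of the dialed numbers it owns.
totals = {
    name: sum(count for num, count in dialed_count.items()
              if num in client_nums)
    for name, client_nums in clients.items()
}
print(totals)
```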
>>> client_nums = set([2, 3])
>>> nums_dialed = [1, 2, 2, 3, 3]
>>> count = 0
>>> for num in nums_dialed:
...     if num in client_nums:
...         count += 1
...
>>> count
4
>>>
Should be quite efficient even for the large numbers you quote.
Using collections.Counter from Python 2.7:
import collections
dialed_count = collections.Counter(nums_dialed)
count = sum(dialed_count[t] for t in client_nums)
Here's a very popular way to combine two sorted lists in a single pass:
nums_dialed = [1, 2, 2, 3, 3]
client_nums = [2,3]
nums_dialed.sort()
client_nums.sort()
c = 0
i = iter(nums_dialed)
j = iter(client_nums)
try:
    a = i.next()
    b = j.next()
    while True:
        if a < b:
            a = i.next()
            continue
        if a > b:
            b = j.next()
            continue
        # a == b
        c += 1
        a = i.next() # next dialed
except StopIteration:
    pass
print c
Because "set" is an unordered collection (I don't know why it uses hashes rather than a binary tree or a sorted list), it's not fair to use it there. You can implement your own "set" with "bisect" if you like lists, or with something more complicated that produces an ordered iterator.
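As a rough illustration of the bisect suggestion, here is a Python 3 sketch that treats a sorted list as the "set" and uses binary search for membership. The helper name contains_sorted is hypothetical. Each lookup is O(log n) rather than the average O(1) of a hash-based set, so this is shown for comparison, not as a faster alternative.

```python
from bisect import bisect_left


def contains_sorted(sorted_nums, value):
    """Binary-search membership test on an already sorted list."""
    index = bisect_left(sorted_nums, value)
    return index < len(sorted_nums) and sorted_nums[index] == value


client_nums = sorted([3, 2])
nums_dialed = [1, 2, 2, 3, 3]
total = sum(1 for num in nums_dialed if contains_sorted(client_nums, num))
print(total)
```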
The method I use is to simply convert the set into a list and then use the len() function to count its values.
set_var = {"abc", "cba"}
print(len(list(set_var)))
Output:
2

Learning Python and using dictionaries

I'm working through exercises in Building Skills in Python, which to my knowledge don't have any published solutions.
In any case, I'm attempting to have a dictionary count the number of occurrences of a certain number in the original list, before duplicates are removed. For some reason, despite a number of variations on the theme below, I can't seem to increment the value for each of the 'keys' in the dictionary.
How could I code this with dictionaries?
dv = list()
# arbitrary sequence of numbers
seq = [2,4,5,2,4,6,3,8,9,3,7,2,47,2]
# dictionary counting number of occurrences
seqDic = { }
for v in seq:
    i = 1
    dv.append(v)
    for i in range(len(dv)-1):
        if dv[i] == v:
            del dv[-1]
            seqDic.setdefault(v)
            currentCount = seqDic[v]
            currentCount += 1
            print currentCount # debug
            seqDic[v]=currentCount
print "orig:", seq
print "new: ", dv
print seqDic
defaultdict is not dict (it's a subclass, and may do too much of the work for you to help you learn via this exercise), so here's a simple way to do it with plain dict:
dv = list()
# arbitrary sequence of numbers
seq = [2,4,5,2,4,6,3,8,9,3,7,2,47,2]
# dictionary counting number of occurrences
seqDic = { }
for i in seq:
    if i in seqDic:
        seqDic[i] += 1
    else:
        dv.append(i)
        seqDic[i] = 1
This simple approach works particularly well here because you need the if i in seqDic test anyway for the purpose of building dv as well as seqDic. Otherwise, simpler would be:
for i in seq:
    seqDic[i] = 1 + seqDic.get(i, 0)
using the handy method get of dict, which returns the second argument if the first is not a key in the dictionary. If you like this idea, here's a solution that also builds dv:
for i in seq:
    seqDic[i] = 1 + seqDic.get(i, 0)
    if seqDic[i] == 1: dv.append(i)
Edit: If you don't care about the order of items in dv (rather than wanting dv to be in the same order as the first occurrence of each item in seq), then just using (after the simple version of the loop)
dv = seqDic.keys()
also works (in Python 2, where .keys returns a list), and so does
dv = list(seqDic)
which is fine in both Python 2 and Python 3. Under the same hypothesis (that you don't care about the order of items in dv) there are also other good solutions, such as
seqDic = dict.fromkeys(seq, 0)
for i in seq: seqDic[i] += 1
dv = list(seqDic)
here, we first use the fromkeys class method of dictionaries to build a new dict which already has 0 as the value corresponding to each key, so we can then just increment each entry without such precautions as .get or membership checks.
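Putting the .get variant together as one runnable piece, here is a Python 3 sketch. The name seq_counts is chosen for illustration, and dv keeps the order of first occurrence.

```python
seq = [2, 4, 5, 2, 4, 6, 3, 8, 9, 3, 7, 2, 47, 2]

seq_counts = {}  # number -> how many times it appears in seq
dv = []          # distinct values, in order of first occurrence

for item in seq:
    if item not in seq_counts:
        dv.append(item)
    # .get returns 0 the first time, so no KeyError is raised.
    seq_counts[item] = seq_counts.get(item, 0) + 1

print("orig:", seq)
print("new: ", dv)
print(seq_counts)
```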
defaultdict makes this easy:
>>> from collections import defaultdict
>>> seq = [2,4,5,2,4,6,3,8,9,3,7,2,47,2]
>>> seqDic = defaultdict(int)
>>> for v in seq:
...     seqDic[v] += 1
...
>>> print seqDic
defaultdict(<type 'int'>, {2: 4, 3: 2, 4: 2, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 47: 1})
I'm not really sure what you're trying to do... count how often each number appears?
#arbitrary sequence of numbers
seq = [2,4,5,2,4,6,3,8,9,3,7,2,47,2]
#dictionary counting number of occurrences
seqDic = {}
### what you want to do, spelled out
for number in seq:
    if number in seqDic: # we had the number before
        seqDic[number] += 1
    else: # first time we see it
        seqDic[number] = 1
#### or:
for number in seq:
    current = seqDic.get(number, 0) # current count in the dict, or 0
    seqDic[number] = current + 1
### or, to show you how setdefault works
for number in seq:
    seqDic.setdefault(number, 0) # set to 0 if it doesn't exist
    seqDic[number] += 1 # increase by one
print "orig:", seq
print seqDic
How about this:
#arbitrary sequence of numbers
seq = [2,4,5,2,4,6,3,8,9,3,7,2,47,2]
#dictionary counting number of occurrences
seqDic = { }
for v in seq:
    if v in seqDic:
        seqDic[v] += 1
    else:
        seqDic[v] = 1
dv = seqDic.keys()
print "orig:", seq
print "new: ", dv
print seqDic
It's clean and I think it demonstrates what you are trying to learn how to do in a simple manner. It is possible to do this using defaultdict as others have pointed out, but knowing how to do it this way is instructive too.
Or, if you use Python 2.7 or later (including Python 3), you can use collections.Counter, which is a dict subclass.
>>> from collections import Counter
>>> seq = [2,4,5,2,4,6,3,8,9,3,7,2,47,2]
>>> Counter(seq)
Counter({2: 4, 3: 2, 4: 2, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 47: 1})
for v in seq:
    try:
        seqDic[v] += 1
    except KeyError:
        seqDic[v] = 1
That's the way I've always done the inner loop of things like this.
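Apart from anything else, it can be significantly faster than testing membership before working on the element when most elements have been seen before, because the exception is raised only the first time each key appears; with a few hundred thousand elements that can save a lot of time.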
