I'm working on the 'two-sum' problem.
Input: An unsorted array A (of integers), and a target sum t
The goal: to return a list of tuple pairs (x,y) where x + y = t
I've implemented a hash table H to store the contents of A. By iterating over H with a nested loop, I get the desired output. However, in the spirit of learning the art of Python, I'd like to replace the nested loop with a nice one-liner using a comprehension and maybe a lambda function. Suggestions?
Source Code:
import csv

with open('/Users/xxx/Developer/Algorithms/Data Structures/_a.txt') as csvfile:
    csv_reader = csv.reader(csvfile, delimiter='\n')
    hash_table = {int(num[0]): int(num[0]) for num in csv_reader}  # {int: int}

def two_sum(hash_table, target):
    pairs = list()
    for x in hash_table.keys():
        for y in hash_table.keys():
            if x == y:
                continue
            if x + y == target:
                pairs.append((x, y))
    return pairs
When you have two ranges and you want to loop both of them separately to get all the combinations as in your case, you can combine the loops into one using itertools.product. You can replace the code below
range1 = [1,2,3,4]
range2 = [3, 4, 5]
for x in range1:
    for y in range2:
        print(x, y)
with
from itertools import product
for x, y in product(range1, range2):
    print(x, y)
Both code blocks produce
1 3
1 4
1 5
2 3
2 4
2 5
3 3
3 4
3 5
4 3
4 4
4 5
But you would still need the if check with this construct. However, product returns an iterator, and you can pass that as the iterable to map or filter along with a lambda function.
In your case you only want to include pairs that meet the criteria. Thus, filter is what you want. In my simple example, if we only want combinations whose sum is even, then we could do something like
gen = product(range1, range2)
f = lambda i: (i[0] + i[1]) % 2 == 0
desired_pairs = filter(f, gen)
This can be written as a one-liner like
desired_pairs = filter(lambda i: (i[0] + i[1]) % 2 == 0, product(range1, range2))
without becoming too complicated to understand.
Note that, like product and map, filter returns a lazy iterator (in Python 3), which is good if you are just going to loop over it later to do some other work. If you really need a list, convert it with list:
desired_pairs = list(filter(lambda i: (i[0] + i[1]) % 2 == 0, product(range1, range2)))
If we print this we get
[(1, 3), (1, 5), (2, 4), (3, 3), (3, 5), (4, 4)]
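Applied back to the two-sum question, the same filtering can be folded into a single list comprehension. This is only a sketch: it assumes hash_table is a dict keyed by the input integers, as in the question, and it uses product(keys, repeat=2) in place of the nested loop. The sample table is made up for illustration.

```python
from itertools import product


def two_sum(hash_table, target):
    # product(keys, repeat=2) yields every ordered pair of keys,
    # standing in for the two nested for-loops from the question.
    keys = list(hash_table)
    return [(x, y) for x, y in product(keys, repeat=2)
            if x != y and x + y == target]


# Small made-up table for illustration (not the asker's data).
sample = {n: n for n in [1, 2, 3, 4]}
pairs = two_sum(sample, 5)
print(pairs)
```

Like the original loop, this reports each pair in both orders and is still quadratic in the number of keys.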
I am very new to Python programming and have come across a problem statement I have no clue how to solve.
I have four lines of input:
0 1
2 4
6 7
3 5
To accept these 4 lines of input I can do the following:
for i in range(4):
    a, b = list(map(int, input().split(' ')))
I am supposed to merge the intervals into (output):
0 1
2 5
6 7
Intervals (2,4) and (3,5) overlap, so they should be merged into one: (2,5).
I am not sure how I should go about this.
Can someone point me in the right direction?
Thanks in advance.
If you're looking for a Python library that handles interval arithmetic, consider python-interval. Disclaimer: I'm the maintainer of that library.
import intervals as I
interval = I.empty()
for i, j in [(0, 1), (2, 4), (6, 7), (3, 5)]:
    interval = interval | I.closed(i, j)
print(interval)
results in
[0,1] | [2,5] | [6,7]
See its documentation for more information.
Try this
from functools import reduce
# inp = [(0,1),(2,9),(6,7),(3,5)]
inp = [(0,1),(2,4),(6,7),(3,5)]
print(inp)
def merge(li,item):
    if li:
        if li[-1][1] >= item[0]:
            li[-1] = li[-1][0], max(li[-1][1],item[1])
            return li
    li.append(item)
    return li
print(reduce(merge, sorted(inp), []))
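The same sort-then-merge idea can also be written as an explicit loop, which may be easier to follow than the reduce version. This is a sketch: merge_intervals is a name chosen here for illustration, and intervals that merely touch, such as (0, 1) and (1, 2), are treated as overlapping, just as in the reduce version.

```python
def merge_intervals(intervals):
    merged = []
    # Sorting by start point means an interval can only overlap
    # the most recently merged one.
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlap: extend the last interval if this one reaches further.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


result = merge_intervals([(0, 1), (2, 4), (6, 7), (3, 5)])
for start, end in result:
    print(start, end)
```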
I have a for loop that gives me the following output.
0.53125
0.4375
0.546875
0.578125
0.75
0.734375
0.640625
0.53125
0.515625
0.828125
0.5
0.484375
0.59375
0.59375
0.734375
0.71875
0.609375
0.484375
.
.
.
How do I find the mean of the first 9 values, the next 9 values and so on and store them into a list like [0.58,0.20,...]? I have tried a lot of things but the values seem to be incorrect. What is the correct way of doing this?
What I did:
matchedRatioList = []
matchedRatio = 0
i = 0
for feature in range(90):
    featureToCompare = featuresList[feature]
    number = labelsList[feature]
    match = difflib.SequenceMatcher(None,featureToCompare,imagePixList)
    matchingRatio = match.ratio()
    print(matchingRatio)
    matchedRatio += matchingRatio
    if i == 8:
        matchedRatioList.append(matchedRatio / 9)
        i = 0
        matchedRatio = 0
    i += 1
Once you have the list of numbers you can calculate the average of each group of 9 numbers using list comprehensions:
from statistics import mean
numbers = [0.53125, 0.4375, 0.546875, 0.578125, 0.75, 0.734375, 0.640625,
           0.53125, 0.515625, 0.828125, 0.5, 0.484375, 0.59375, 0.59375,
           0.734375, 0.71875, 0.609375, 0.484375]
group_len = 9
matched_ratios = [mean(group) for group in [numbers[i:i+group_len]
                  for i in range(0, len(numbers), group_len)]]
print(matched_ratios)
# [0.5850694444444444, 0.6163194444444444]
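Your solution is close. In your version the counter is reset to 0 and then immediately incremented to 1, so every group after the first only accumulates 8 values before being divided by 9. Start with i = 1 and check for i == 9: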
matchedRatioList = []
matchedRatio = 0
i = 1 # change here
for feature in range(90):
    ...
    matchedRatio += matchingRatio
    if i == 9: # change here
        matchedRatioList.append(matchedRatio / 9)
        i = 0
        matchedRatio = 0
    i += 1
I do not know what you have tried so far, but I can present you with one solution to the problem.
Save all values in your for-loop to a buffer array. Use an if-statement with iterator % 9 == 0 inside your for-loop, which will make some portion of code execute only every 9 values.
Inside the if-statement you can write the mean value of your buffer array to a different output array. Reset your buffer array inside this if-statement as well, then this process is repeated and should behave in the way you want.
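The buffer approach described above could be sketched as follows. The function name grouped_means and the group_len parameter are chosen here for illustration, and the sample input is made up; a trailing partial group is simply ignored in this sketch.

```python
def grouped_means(values, group_len=9):
    means = []
    buffer = []
    # Count from 1 so that count % group_len == 0 fires on every 9th value.
    for count, value in enumerate(values, start=1):
        buffer.append(value)
        if count % group_len == 0:
            means.append(sum(buffer) / group_len)
            buffer = []  # reset for the next group
    return means


print(grouped_means(range(1, 19)))
```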
Try this
r = []
c = 0  # running sum of the current group
i = 0  # number of values seen in the current group
for b in a:
    c += b
    i += 1
    if i == 9:
        r.append(c / 9)
        c = 0
        i = 0
Since nobody has used reduce so far :)
import functools
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
m = []
# len(l) + 1 so that a final full group of 9 is not skipped
for i in range(9, len(l) + 1, 9):
    m.append(functools.reduce(lambda x, y: x + y, l[i-9:i])/9)
print(m)
Using mean function from the statistics module of Python.
import statistics
# Sample Values list I created.
values_list = list()
for i in range(1,82):
    values_list.append(i)
mean_list = list()
for i in range(0, len(values_list), 9):
    mean_list.append(statistics.mean(values_list[i:i+9]))
for i in mean_list:
    print(i)
This is the simplest way in which you can do it.
https://docs.python.org/3/library/statistics.html#statistics.mean
One-line solution given loop output in numbers:
[float(sum(a))/len(a) for a in zip(*[iter(numbers)]*9)]
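How that one-liner groups the values may not be obvious, so here is a small sketch with made-up data. zip is handed nine references to the same iterator, so each output tuple consumes nine consecutive items. One consequence worth noting: trailing items that do not fill a whole group are silently dropped.

```python
numbers = list(range(1, 21))  # 20 made-up values: two full groups plus 2 leftovers

it = iter(numbers)
# [it] * 9 is a list of nine references to ONE iterator, so zip pulls
# nine consecutive items for every tuple it builds.
groups = list(zip(*[it] * 9))
means = [sum(g) / len(g) for g in groups]

print(groups)
print(means)
```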
Putting ideas from the other answers together, this could be the whole program:
import difflib
from statistics import mean
matching_ratios = (difflib.SequenceMatcher(None, feature, imagePixList).ratio()
                   for feature in featuresList[:90])
matchedRatioList = [mean(group) for group in zip(*[matching_ratios] * 9)]
See update below...
I'm writing a Python simulation that assigns an arbitrary number of imaginary players one goal from an arbitrary pool of goals. The goals have two different levels or proportions of scarcity, prop_high and prop_low, at approximately a 3:1 ratio.
For example, if there are 16 players and 4 goals, or 8 players and 4 goals, the two pools of goals would look like this:
{'A': 6, 'B': 6, 'C': 2, 'D': 2}
{'A': 3, 'B': 3, 'C': 1, 'D': 1}
...with goals A and B occurring 3 times as often as C and D. 6+6+2+2 = 16, which corresponds to the number of players in the simulation, which is good.
I want to have a pool of goals equal to the number of players and distributed so that there are roughly three times as many prop_high goals as there are prop_low goals.
What's the best way to build an allocation algorithm according to a rough or approximate ratio—something that can handle rounding?
Update:
Assuming 8 players, here's how the distributions from 2 to 8 goals should hopefully look (prop_high players are starred):
    A   B   C   D   E   F   G   H
2   6*  2
3   6*  1   1
4   3*  3*  1   1
5   3*  2*  1   1   1
6   2*  2*  1*  1   1   1
7   2*  1*  1*  1   1   1   1
8   1*  1*  1*  1*  1   1   1   1
These numbers don't correspond to players. For example, with 5 goals and 8 players, goals A and B have a high proportion in the pool (3 and 2 respectively) while goals C, D, and E are more rare (1 each).
When there's an odd number of goals, the last of the prop_high gets one less than the others. As the number of goals approaches the number of players, each of the prop_high items gets one less until the end, when there is one of each goal in the pool.
What I've done below is assign quantities to the high and low ends of the pool and then make adjustments to the high end, subtracting values according to how close the number of goals is to the number of players. It works well with 8 players (the number of goals in the pool is always equal to 8), but that's all.
I'm absolutely sure there's a better, more Pythonic way to handle this sort of algorithm, and I'm pretty sure it's a relatively common design pattern. I just don't know where to start googling to find a more elegant way to handle this sort of structure (instead of the brute force method I'm using for now)
import string
import math
letters = string.uppercase
num_players = 8
num_goals = 5
ratio = (3, 1)
prop_high = ratio[0] / float(sum(ratio)) / (float(num_goals)/2)
prop_low = ratio[1] / float(sum(ratio)) / (float(num_goals)/2)
if num_goals % 2 == 1:
    is_odd = True
else:
    is_odd = False
goals_high = []
goals_low = []
high = []
low = []
# Allocate the goals to the pool. Final result will be incorrect.
count = 0
for i in range(num_goals):
    if count < num_goals/2: # High proportion
        high.append(math.ceil(prop_high * num_players))
        goals_high.append(letters[i])
    else: # Low proportion
        low.append(math.ceil(prop_low * num_players))
        goals_low.append(letters[i])
    count += 1
# Make adjustments to the pool allocations to account for rounding and odd numbers
ratio_high_total = len(high)/float(num_players)
overall_ratio = ratio[1]/float(sum(ratio))
marker = (num_players / 2) + 1
offset = num_goals - marker
if num_players == num_goals:
    for i in high:
        high[int(i)] -= 1
elif num_goals == 1:
    low[0] = num_players
elif ratio_high_total == overall_ratio and is_odd:
    high[-1] -= 1
elif ratio_high_total >= overall_ratio: # Upper half of possible goals
    print offset
    for i in range(offset):
        index = -(int(i) + 1)
        high[index] -= 1
goals = goals_high + goals_low
goals_quantities = high + low
print "Players:", num_players
print "Types of goals:", num_goals
print "Total goals in pool:", sum(goals_quantities)
print "High pool:", goals_high, high
print "Low pool:", goals_low, low
print goals, goals_quantities
print "High proportion:", prop_high, " || Low proportion:", prop_low
Rather than try to get the fractions right, I'd just allocate the goals one at a time in the appropriate ratio. Here the allocate_goals generator yields each of the low-ratio goals once, then each of the high-ratio goals three times, and then starts over. The caller, allocate, cuts off this infinite generator at the required number (the number of players) using itertools.islice.
import collections
import itertools
import string
def allocate_goals(prop_low, prop_high):
    prop_high3 = prop_high * 3
    while True:
        for g in prop_low:
            yield g
        for g in prop_high3:
            yield g
def allocate(goals, players):
    letters = string.ascii_uppercase[:goals]
    high_count = goals // 2
    prop_high, prop_low = letters[:high_count], letters[high_count:]
    g = allocate_goals(prop_low, prop_high)
    return collections.Counter(itertools.islice(g, players))
for goals in xrange(2, 9):
    print goals, sorted(allocate(goals, 8).items())
It produces this answer:
2 [('A', 6), ('B', 2)]
3 [('A', 4), ('B', 2), ('C', 2)]
4 [('A', 3), ('B', 3), ('C', 1), ('D', 1)]
5 [('A', 3), ('B', 2), ('C', 1), ('D', 1), ('E', 1)]
6 [('A', 2), ('B', 2), ('C', 1), ('D', 1), ('E', 1), ('F', 1)]
7 [('A', 2), ('B', 1), ('C', 1), ('D', 1), ('E', 1), ('F', 1), ('G', 1)]
8 [('A', 1), ('B', 1), ('C', 1), ('D', 1), ('E', 1), ('F', 1), ('G', 1), ('H', 1)]
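For readers on Python 3, the same allocation can be sketched with itertools.cycle instead of a hand-written generator. This is a port written for illustration, not the answer's original code; the weight parameter is a hypothetical addition that defaults to the 3:1 ratio from the question.

```python
import collections
import itertools
import string


def allocate(goals, players, weight=3):
    letters = string.ascii_uppercase[:goals]
    high_count = goals // 2
    prop_high, prop_low = letters[:high_count], letters[high_count:]
    # One full cycle: every low goal once, then the high goals `weight` times.
    pattern = itertools.cycle(prop_low + prop_high * weight)
    # Cut the endless cycle off after one goal per player.
    return collections.Counter(itertools.islice(pattern, players))


for goals in range(2, 9):
    print(goals, sorted(allocate(goals, 8).items()))
```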
The great thing about this approach (apart from, I think, that it's easy to understand) is that it's quick to turn it into a randomized version.
Just replace allocate_goals with this (and add import random at the top):
def allocate_goals(prop_low, prop_high):
    all_goals = prop_low + prop_high * 3
    while True:
        yield random.choice(all_goals)
Some time ago (okay, two and a half years) I asked a question that I think would be relevant here. Here's how I think you could use this: first, build a list of the priorities assigned to each goal. In your example, where the first half of the goal pool (rounded down) gets priority 3 and the rest get priority 1, one way to do this is
priorities = [3] * (len(goals) // 2) + [1] * (len(goals) - len(goals) // 2)
Of course, you can create your list of priorities in any way you want; it doesn't have to be half 3s and half 1s. The only requirement is that all the entries be positive numbers.
Once you have the list, normalize it to have a sum equal to the number of players:
# Assuming num_players is already defined to be the number of players
normalized_priorities = [float(p) / sum(priorities) * num_players
                         for p in priorities]
Then apply one of the algorithms from my question to round these floating-point numbers to integers representing the actual allocations. Among the answers given, there are only two algorithms that do the rounding properly and satisfy the minimum variance criterion: adjusted fractional distribution (including the "Update" paragraph) and minimizing roundoff error. Conveniently, both of them appear to work for non-sorted lists. Here are my Python implementations:
import math, operator
from heapq import nlargest
from itertools import izip
item1 = operator.itemgetter(1)
item2 = operator.itemgetter(2)  # original index, used to restore input order
def floor(f):
    return int(math.floor(f))
def frac(f):
    return math.modf(f)[0]
def adjusted_fractional_distribution(fn_list):
    in_list = [floor(f) for f in fn_list]
    loss_list = [frac(f) for f in fn_list]
    fsum = math.fsum(loss_list)
    add_list = [0] * len(in_list)
    largest = nlargest(int(round(fsum)), enumerate(loss_list),
                       key=lambda e: (e[1], e[0]))
    for i, loss in largest:
        add_list[i] = 1
    return [i + a for i,a in izip(in_list, add_list)]
def minimal_roundoff_error(fn_list):
    # round before truncating, so float error cannot drop a unit from the total
    N = int(round(math.fsum(fn_list)))
    temp_list = [[floor(f), frac(f), i] for i, f in enumerate(fn_list)]
    temp_list.sort(key = item1)
    lower_sum = sum(floor(f) for f in fn_list)
    difference = N - lower_sum
    for i in xrange(len(temp_list) - difference, len(temp_list)):
        temp_list[i][0] += 1
    temp_list.sort(key = item2)
    return [t[0] for t in temp_list]
In all my tests, both these methods are exactly equivalent, so you can pick either one to use.
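The floor-then-distribute idea behind both functions can be condensed into a short Python 3 sketch. It is written here for illustration and is not the answer's code: largest_remainder is a hypothetical name, and ties between equal fractional parts go to the earlier entries, so the result corresponds to the reversed-list variant rather than to minimal_roundoff_error itself.

```python
import math


def largest_remainder(values):
    # Target total: the (rounded) sum of the fractional allocations.
    total = int(round(math.fsum(values)))
    floors = [math.floor(v) for v in values]
    shortfall = total - sum(floors)
    # Indices ordered by fractional part, largest first; sorted() is stable,
    # so equal fractions keep their original order.
    by_fraction = sorted(range(len(values)),
                         key=lambda i: values[i] - floors[i],
                         reverse=True)
    # Hand the missing units to the entries that lost the most to flooring.
    for i in by_fraction[:shortfall]:
        floors[i] += 1
    return floors


priorities = [3, 3, 1, 1, 1]
num_players = 17
normalized = [p / sum(priorities) * num_players for p in priorities]
allocation = largest_remainder(normalized)
print(allocation)
```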
Here's a usage example:
>>> goals = 'ABCDE'
>>> num_players = 17
>>> priorities = [3,3,1,1,1]
>>> normalized_priorities = [float(p) / sum(priorities) * num_players
...                          for p in priorities]
>>> normalized_priorities
[5.666666..., 5.666666..., 1.888888..., 1.888888..., 1.888888...]
>>> minimal_roundoff_error(normalized_priorities)
[5, 6, 2, 2, 2]
If you want to allocate the extra players to the first goals within a group of equal priority, rather than the last, probably the easiest way to do this is to reverse the list before and after applying the rounding algorithm.
>>> def rlist(l):
...     return list(reversed(l))
...
>>> rlist(minimal_roundoff_error(rlist(normalized_priorities)))
[6, 5, 2, 2, 2]
Now, this may not quite match the distributions you expect, because in my question I specified a "minimum variance" criterion that I used to judge the result. That might not be appropriate for your case. You could try the "remainder distribution" algorithm instead of one of the two I mentioned above and see if it works better for you.
def remainder_distribution(fn_list):
    N = math.fsum(fn_list)
    rn_list = [int(round(f)) for f in fn_list]
    remainder = N - sum(rn_list)
    first = 0
    last = len(fn_list) - 1
    while remainder > 0 and last >= 0:
        if abs(rn_list[last] + 1 - fn_list[last]) < 1:
            rn_list[last] += 1
            remainder -= 1
        last -= 1
    while remainder < 0 and first < len(rn_list):
        if abs(rn_list[first] - 1 - fn_list[first]) < 1:
            rn_list[first] -= 1
            remainder += 1
        first += 1
    return rn_list
I need code that takes a list (up to n=31) and returns all possible subsets of n=3 without any two elements repeating in the same subset twice (think of people who are teaming up in groups of 3 with new people every time):
list=[1,2,3,4,5,6,7,8,9]
and returns
[1,2,3][4,5,6][7,8,9]
[1,4,7][2,5,8][3,6,9]
[1,6,8][2,4,9][3,5,7]
but not:
[1,5,7][2,4,8][3,6,9]
because 1 and 7 have appeared together already (likewise, 3 and 9).
I would also like to do this for subsets of n=2.
Thank you!!
Here's what I came up with:
from itertools import permutations, combinations, ifilter, chain
people = [1,2,3,4,5,6,7,8,9]
#get all combinations of 3 sets of 3 people
combos_combos = combinations(combinations(people,3), 3)
#filter out sets that don't contain all 9 people
valid_sets = ifilter(lambda combo:
                     len(set(chain.from_iterable(combo))) == 9,
                     combos_combos)
#a set of people that have already been paired
already_together = set()
for sets in valid_sets:
    #get all (sorted) combinations of pairings in this set
    pairings = list(chain.from_iterable(combinations(combo, 2) for combo in sets))
    pairings = set(map(tuple, map(sorted, pairings)))
    #if all of the pairings have never been paired before, we have a new one
    if len(pairings.intersection(already_together)) == 0:
        print sets
        already_together.update(pairings)
This prints:
~$ time python test_combos.py
((1, 2, 3), (4, 5, 6), (7, 8, 9))
((1, 4, 7), (2, 5, 8), (3, 6, 9))
((1, 5, 9), (2, 6, 7), (3, 4, 8))
((1, 6, 8), (2, 4, 9), (3, 5, 7))
real 0m0.182s
user 0m0.164s
sys 0m0.012s
Try this:
from itertools import permutations
lst = list(range(1, 10))
n = 3
triplets = list(permutations(lst, n))
triplets = [set(x) for x in triplets]
def array_unique(seq):
    checked = []
    for x in seq:
        if x not in checked:
            checked.append(x)
    return checked
triplets = array_unique(triplets)
result = []
m = n * 3
for x in triplets:
    for y in triplets:
        for z in triplets:
            if len(x.union(y.union(z))) == m:
                result += [[x, y, z]]
def groups(sets, i):
    result = [sets[i]]
    for x in sets:
        flag = True
        for y in result:
            for r in x:
                for p in y:
                    if len(r.intersection(p)) >= 2:
                        flag = False
                        break
                    else:
                        continue
                if flag == False:
                    break
        if flag == True:
            result.append(x)
    return result
for i in range(len(result)):
    print('%d:' % (i + 1))
    for x in groups(result, i):
        print(x)
Output for n = 10:
http://pastebin.com/Vm54HRq3
Here's my attempt of a fairly general solution to your problem.
from itertools import combinations
n = 3
l = range(1, 10)
def f(l, n, used, top):
    if len(l) == n:
        if all(set(x) not in used for x in combinations(l, 2)):
            yield [l]
    else:
        for group in combinations(l, n):
            if any(set(x) in used for x in combinations(group, 2)):
                continue
            for rest in f([i for i in l if i not in group], n, used, False):
                config = [list(group)] + rest
                if top:
                    # Running at top level, this is a valid
                    # configuration. Update used list.
                    for c in config:
                        used.extend(set(x) for x in combinations(c, 2))
                yield config
                break
for i in f(l, n, [], True):
    print i
However, it is very slow for high values of n, too slow for n=31. I don't have time right now to try to improve the speed, but I might try later. Suggestions are welcome!
My wife had this problem trying to arrange breakout groups for a meeting with nine people; she wanted no pairs of attendees to repeat.
I immediately busted out itertools and was stumped and came to StackOverflow. But in the meantime, my non-programmer wife solved it visually. The key insight is to create a tic-tac-toe grid:
1 2 3
4 5 6
7 8 9
And then simply take 3 groups going down, 3 groups going across, and 3 groups going diagonally wrapping around, and 3 groups going diagonally the other way, wrapping around.
You can do it just in your head then.
- : 123,456,789
| : 147,258,369
\ : 159,267,348
/ : 168,249,357
I suppose the next question is how far can you take a visual method like this? Does it rely on the coincidence that the desired subset size * the number of subsets = the number of total elements?
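One way to explore that question is to write the grid method with modular arithmetic and count repeated pairs. The sketch below is an illustration, not a general solution: grid_rounds and count_repeated_pairs are names chosen here, and it assumes an n-by-n grid of n*n people. The construction is repeat-free when n is prime (3, 5, 7, ...); for a composite n such as 4, some wrapped diagonals share pairs with the columns.

```python
import collections
from itertools import combinations


def grid_rounds(n):
    # People 1..n*n laid out row by row in an n-by-n grid.
    grid = [[r * n + c + 1 for c in range(n)] for r in range(n)]
    rounds = [[tuple(row) for row in grid]]  # groups going across
    # slope 0 gives the columns; other slopes give wrapped diagonals.
    for slope in range(n):
        rounds.append([
            tuple(grid[r][(start + slope * r) % n] for r in range(n))
            for start in range(n)
        ])
    return rounds


def count_repeated_pairs(rounds):
    seen = collections.Counter()
    for groups in rounds:
        for group in groups:
            seen.update(combinations(sorted(group), 2))
    return sum(1 for count in seen.values() if count > 1)


for groups in grid_rounds(3):
    print(groups)
print(count_repeated_pairs(grid_rounds(3)), count_repeated_pairs(grid_rounds(4)))
```

For n = 3 this reproduces the four rounds listed above; checking other sizes with count_repeated_pairs shows where the visual trick stops working.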