how to group near points in list in python - python

I am getting a list using a list comprehension. lat say I am getting this list using this line of code bellow:
quality, angle, distance = measurements[i]
new_data = [each_value for each_value in measurements[i:i + 20] if angle <= each_value[1] <= angle + 30 and
distance - 150 <= each_value[2] <= distance + 150]
where measurements is a big data set which contains (quality, angle, distance) pair. from that, I am getting those value.
desired_list= [(1,2,3)(1,5,3),(1,8,3)(1,10,3),(1,16,3),(1,17,3)]]
Now how can I add a new condition in my list comprehension so that I will only get the value if the angle is within some offset value? let say if the difference between two respective angles is less then or equal to 5 then put them in desired_list.
with this condition my list should be like so:
desired_list= [(1,2,3)(1,5,3),(1,8,3)(1,10,3)]
cause from 2 to 5, 5 to 8, 8 to 10 the distance is less than or equal to 5.
But the last two points are not included as they break the condition after (1,10,3) and they don't need to check.
How can I achieve this? please help me
Note: it doesn't need to be in the same list comprehension.

You mention the data set is large. Depending how large you many wish to avoid creating a new list from scratch and just search for the relavant index.
data = [(1,2,3), (1,5,3), (1,8,3), (1,10,3), (1,16,3), (1,17,3)]
MAXIMUM_ANGLE = 5
def angles_within_range(x, y):
return abs(x[1] - y[1]) <= MAXIMUM_ANGLE
def first_angle_break_index():
for i in range(len(data) - 1):
if not angles_within_range(data[i], data[i+1]):
return i+1
def valid_angles_list():
return data[:first_angle_break_index()]
print(valid_angles_list())

If you means traverse from start to end, and break out when one neighor pairs break the rule.
here is a way without list comprehension:
desired_list = [(1, 2, 3), (1, 5, 3), (1, 8, 3), (1, 10, 3), (1, 16, 3), (1, 17, 3)]
res = [desired_list[0]]
for a, b in zip(desired_list[:-1], desired_list[1:]):
if abs(a[1] - b[1]) > 5:
break
res += [b]
print(res)
output:
[(1, 2, 3), (1, 5, 3), (1, 8, 3), (1, 10, 3)]
if you insist on using list comprehension with break, here is a solution of recording last pair:
res = [last.pop() and last.append(b) or b for last in [[desired_list[0]]] for a, b in
zip([desired_list[0]] + desired_list, desired_list) if abs(a[1] - b[1]) <= 5 and a == last[0]]
another version use end condition:
res = [b for end in [[]] for a, b in zip([desired_list[0]] + desired_list, desired_list) if
(False if end or abs(a[1] - b[1]) <= 5 else end.append(42)) or not end and abs(a[1] - b[1]) <= 5]
Note: This is a bad idea. (just for fun : ))

Related

Python Comprehension - Replace Nested Loop

Working on the 'two-sum' problem..
Input: An unsorted array A (of integers), and a target sum t
The goal: to return a list of tuple pairs (x,y) where x + y = t
I've implemented a hash-table H to store the contents of A. Through use of a nested loop to iterate through H, I'm achieving the desired output. However, in the spirit of learning the art of Python, I'd like to replace the nested loop with a nice 1-liner using comprehension & a maybe lambda function? Suggestions?
Source Code:
import csv
with open('/Users/xxx/Developer/Algorithms/Data Structures/_a.txt') as csvfile:
csv_reader = csv.reader(csvfile, delimiter ='\n')
hash_table = {int(num[0]):int(num[0]) for(num) in csv_reader} #{str:int}
def two_sum(hash_table, target):
pairs = list()
for x in hash_table.keys():
for y in hash_table.keys():
if x == y:
continue
if x + y == target:
pairs.append((x,y))
return pairs
When you have two ranges and you want to loop both of them separately to get all the combinations as in your case, you can combine the loops into one using itertools.product. You can replace the code below
range1 = [1,2,3,4]
range2 = [3, 4, 5]
for x in range1:
for y in range2:
print(x, y)
with
from itertools import product
for x, y in product(range1, range2):
print(x, y)
Both code blocks produce
1 3
1 4
1 5
2 3
2 4
2 5
3 3
3 4
3 5
4 3
4 4
4 5
But you would still need the if check with this construct. However, what product returns is a generator and you can pass that as the iterable to map or filter along with a lambda function.
In your case you only want to include pairs that meet the criteria. Thus, filter is what you want. In my simple example, if we only want combinations whose sum is even, then we could do something like
gen = product(range1, range2)
f = lambda i: (i[0] + i[1]) % 2 == 0
desired_pairs = filter(f, gen)
This can be written as a one-liner like
desired_pairs = filter(lambda i: (i[0] + i[1]) % 2 == 0, product(range1, range2))
without being too complicated for being understood.
Note that like product and map, what filter returns is a generator, which is good if you are just going to loop over it later to do some other work. If you really need a list just do convert it to a list as
desired_pairs = list(filter(lambda i: (i[0] + i[1]) % 2 == 0, product(range1, range2)))
If we print this we get
[(1, 3), (1, 5), (2, 4), (3, 3), (3, 5), (4, 4)]

Excluding intervals from a real number line

A long time ago I asked the following question:
Python: delete substring by indices
Last week I was asked a very similar question, but with continuous real-number line.
Imagine you are given an interval (X, Y), and a bunch of sub-intervals blocks=[(x1, y1), (x2, y2), ...]. Your goal is to find a list of intervals remaining=[(a1, b1), (a2, b2), ..] that is (1) in (X, Y), but (2) not in any of the sub-intervals in blocks. The intervals in blocks can overlap.
In other words, the function signature looks something like:
def delete_blocks_from_interval(X, Y, blocks):
```
X: start of the given interval
Y: end of the given interval
blocks: list of intervals (x, y) to be removed, can overlap
returns remaining = [(a, b), ...] intervals remaining after removal of blocks
```
pass
I can construct a graph of connected intervals in blocks, and find both the minimum of the start and the maximum of end for each connected component in the graph. But this is quadratic in length of blocks. I wonder if there is a better-runtime algorithm.
Please also discuss what code routine you think is more efficient for the algorithm if you will.
Many many thanks.
As requested, please consider the following illustrative inputs:
X = -1
Y = 20
blocks = [(1, 10), (4, 5), (9, 11), (16, 17.5)]
the expected output is remaining = [(-1, 1), (11, 16), (17.5, 20)]
This should be linear run-time in number of blocks
import bisect
def delete_blocks_from_interval(range_start, range_end, blocks):
blocks = sorted(blocks)
# check if the interval overlaps with blocks,
# if so, truncate the block lists, reset end points if required
start_idx = bisect.bisect_left([b[0] for b in blocks],range_start)
end_idx = bisect.bisect_left([b[0] for b in blocks],range_end)
blocks = blocks[start_idx:end_idx]
if blocks[0][0] < range_start:
blocks[0][0] = range_start
if blocks[-1][1] > range_end:
blocks[-1][1] = range_end
# emit the first gap, if any
if range_start < blocks[0][0]:
yield (range_start, blocks[0][0])
# loop through till the end of the blocks
end = blocks[0][1]
for block in blocks[1:]:
if end < block[0]:
yield (end, block[0])
end = block[1]
elif end < block[1]:
end = block[1]
# emit the last gap, if any
if range_end > blocks[-1][1]:
yield (blocks[-1][1], range_end)
blocks = [(1, 10), (4, 5), (9, 11), (16, 17.5)]
list(delete_blocks_from_interval(-1, 20, blocks))
python-ranges is a library I wrote that excels at this particular use case. It isn't the most efficient code you could possibly write (it essentially uses #wwii's algorithm below, in fact) but it is nice and terse.
from ranges import Range, RangeSet
...
def delete_blocks_from_interval(X, Y, blocks):
# make a Range
orig = Range(X, Y)
# make a RangeSet out of the 2-tuple blocks
# (using the unpacking operator to interpret 2-tuples as positional args for Range())
# and then find the difference from the original set
# (like with sets, the - operator is a shorthand for .difference())
remaining = orig - RangeSet(
Range(*block) for block in blocks
)
# return each range in the RangeSet as a tuple
return [(rng.start, rng.end) for rng in remaining.ranges()]
Would something like this work?
def delete_blocks_from_interval(a, b, blocks):
sorted_blocks = sorted(blocks)
for i, (c, d) in enumerate(sorted_blocks):
if a <= c <= b:
yield (a, c)
if d > a:
a = d
for (e, f) in sorted_blocks[i + 1:]:
if e <= a <= f:
a = f
elif e > a:
break
if a <= d <= b:
yield (d, b)
blocks = ((1, 10), (4, 5), (9, 11), (16, 17.5))
print(list(delete_blocks_from_interval(-1, 20, blocks)))
# (-1, 1), (11, 16), (17.5, 20)
Sort the blocks;
check if there is a gap between X and item 0 of the first block; save if there is;
iterate over blocks in pairs,
get the first and second item
if item 1 of the first block is less than item 0 of the second block,
save this interval (first[1],second[0]) ;
or if they overlap,
compare the last items,
if the last item of the second tuple is bigger than the last item of the first - expand the first tuple range,
get the third/next item, repeat the comparison(s)
repeat til you get to the end of the (X,Y) interval or you run out of blocks.

Is it possible to write a combination function with recursion technique?

Yesterday, I encountered a problem which requires calculating combinations in an iterable with range 5.
Instead of using itertools.combination, I tried to make a primitive function of my own. It looks like:
def combine_5(elements):
"""Find all combinations in elements with range 5."""
temp_list = []
for i in elements:
cur_index = elements.index(i)
for j in elements[cur_index+1 : ]:
cur_index = elements.index(j)
for k in elements[cur_index+1 : ]:
cur_index = elements.index(k)
for n in elements[cur_index+1 : ]:
cur_index = elements.index(n)
for m in elements[cur_index+1 : ]:
temp_list.append((i,j,k,n,m))
return temp_list
Then I thought maybe I can abstract it a bit, to make a combine_n function. And below is my initial blueprint:
# Unfinished version of combine_n
def combine_n(elements, r, cur_index=-1):
"""Find all combinations in elements with range n"""
r -= 1
target_list = elements[cur_index+1 : ]
for i in target_list:
cur_index = elements.index(i)
if r > 0:
combine_n(elements, r, cur_index)
pass
else:
pass
Then I've been stuck there for a whole day, the major problem is that I can't convey a value properly inside the recursive function. I added some code that fixed one problem. But as it works for every recursive loop, new problems arose. More fixes lead to more bugs, a vicious cycle.
And then I went for help to itertools.combination's source code. And it turns out it didn't use recursion technique.
Do you think it is possible to abstract this combine_5 function into a combine_n function with recursion technique? Do you have any ideas about its realization?
FAILURE SAMPLE 1:
def combine_n(elements, r, cur_index=-1):
"""Find all combinations in elements with range n"""
r -= 1
target_list = elements[cur_index+1 : ]
for i in target_list:
cur_index = elements.index(i)
if r > 0:
combine_n(elements, r, cur_index)
print i
else:
print i
This is my recent try after a bunch of overcomplicated experiments.
The core ideas is: if I can print them right, I can collect them into a container later.
But the problem is, in a nested for loop, when the lower for-loop hit with an empty list.
The temp_list.append((i,j,k,n,m)) clause of combine_5 will not work.
But in FAILURE SAMPLE 1, it still will print the content of the upper for-loop
like combine_n([0,1], 2) will print 2, 1, 2.
I need to find a way to convey this empty message to the superior for-loop.
Which I didn't figure out so far.
Yes, it's possible to do it with recursion. You can make combine_n return a list of tuples with all the combinations beginning at index cur_index, and starting with a partial combination of cur_combo, which you build up as you recurse:
def combine_n(elements, r, cur_index=0, cur_combo=()):
r-=1
temp_list = []
for elem_index in range(cur_index, len(elements)-r):
i = elements[elem_index]
if r > 0:
temp_list = temp_list + combine_n(elements, r, elem_index+1, cur_combo+(i,))
else:
temp_list.append(cur_combo+(i,))
return temp_list
elements = list(range(1,6))
print = combine_n(elements, 3)
output:
[(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 5)]
The for loop only goes up to len(elements)-r, because if you go further than that then there aren't enough remaining elements to fill the remaining places in the tuple. The tuples only get added to the list with append at the last level of recursion, then they get passed back up the call stack by returning the temp_lists and concatenating at each level back to the top.

Generate a random derangement of a list

How can I randomly shuffle a list so that none of the elements remains in its original position?
In other words, given a list A with distinct elements, I'd like to generate a permutation B of it so that
this permutation is random
and for each n, a[n] != b[n]
e.g.
a = [1,2,3,4]
b = [4,1,2,3] # good
b = [4,2,1,3] # good
a = [1,2,3,4]
x = [2,4,3,1] # bad
I don't know the proper term for such a permutation (is it "total"?) thus having a hard time googling. The correct term appears to be "derangement".
After some research I was able to implement the "early refusal" algorithm as described e.g. in this paper [1]. It goes like this:
import random
def random_derangement(n):
while True:
v = [i for i in range(n)]
for j in range(n - 1, -1, -1):
p = random.randint(0, j)
if v[p] == j:
break
else:
v[j], v[p] = v[p], v[j]
else:
if v[0] != 0:
return tuple(v)
The idea is: we keep shuffling the array, once we find that the permutation we're working on is not valid (v[i]==i), we break and start from scratch.
A quick test shows that this algorithm generates all derangements uniformly:
N = 4
# enumerate all derangements for testing
import itertools
counter = {}
for p in itertools.permutations(range(N)):
if all(p[i] != i for i in p):
counter[p] = 0
# make M probes for each derangement
M = 5000
for _ in range(M*len(counter)):
# generate a random derangement
p = random_derangement(N)
# is it really?
assert p in counter
# ok, record it
counter[p] += 1
# the distribution looks uniform
for p, c in sorted(counter.items()):
print p, c
Results:
(1, 0, 3, 2) 4934
(1, 2, 3, 0) 4952
(1, 3, 0, 2) 4980
(2, 0, 3, 1) 5054
(2, 3, 0, 1) 5032
(2, 3, 1, 0) 5053
(3, 0, 1, 2) 4951
(3, 2, 0, 1) 5048
(3, 2, 1, 0) 4996
I choose this algorithm for simplicity, this presentation [2] briefly outlines other ideas.
References:
[1] An analysis of a simple algorithm for random derangements. Merlini, Sprugnoli, Verri. WSPC Proceedings, 2007.
[2] Generating random derangements. Martínez, Panholzer, Prodinger.
Such permutations are called derangements. In practice you can just try random permutations until hitting a derangement, their ratio approaches the inverse of 'e' as 'n' grows.
As a possible starting point, the Fisher-Yates shuffle goes like this.
def swap(xs, a, b):
xs[a], xs[b] = xs[b], xs[a]
def permute(xs):
for a in xrange(len(xs)):
b = random.choice(xrange(a, len(xs)))
swap(xs, a, b)
Perhaps this will do the trick?
def derange(xs):
for a in xrange(len(xs) - 1):
b = random.choice(xrange(a + 1, len(xs) - 1))
swap(xs, a, b)
swap(len(xs) - 1, random.choice(xrange(n - 1))
Here's the version described by Vatine:
def derange(xs):
for a in xrange(1, len(xs)):
b = random.choice(xrange(0, a))
swap(xs, a, b)
return xs
A quick statistical test:
from collections import Counter
def test(n):
derangements = (tuple(derange(range(n))) for _ in xrange(10000))
for k,v in Counter(derangements).iteritems():
print('{} {}').format(k, v)
test(4):
(1, 3, 0, 2) 1665
(2, 0, 3, 1) 1702
(3, 2, 0, 1) 1636
(1, 2, 3, 0) 1632
(3, 0, 1, 2) 1694
(2, 3, 1, 0) 1671
This does appear uniform over its range, and it has the nice property that each element has an equal chance to appear in each allowed slot.
But unfortunately it doesn't include all of the derangements. There are 9 derangements of size 4. (The formula and an example for n=4 are given on the Wikipedia article).
This should work
import random
totalrandom = False
array = [1, 2, 3, 4]
it = 0
while totalrandom == False:
it += 1
shuffledArray = sorted(array, key=lambda k: random.random())
total = 0
for i in array:
if array[i-1] != shuffledArray[i-1]: total += 1
if total == 4:
totalrandom = True
if it > 10*len(array):
print("'Total random' shuffle impossible")
exit()
print(shuffledArray)
Note the variable it which exits the code if too many iterations are called. This accounts for arrays such as [1, 1, 1] or [3]
EDIT
Turns out that if you're using this with large arrays (bigger than 15 or so), it will be CPU intensive. Using a randomly generated 100 element array and upping it to len(array)**3, it takes my Samsung Galaxy S4 a long time to solve.
EDIT 2
After about 1200 seconds (20 minutes), the program ended saying 'Total Random shuffle impossible'. For large arrays, you need a very large number of permutations... Say len(array)**10 or something.
Code:
import random, time
totalrandom = False
array = []
it = 0
for i in range(1, 100):
array.append(random.randint(1, 6))
start = time.time()
while totalrandom == False:
it += 1
shuffledArray = sorted(array, key=lambda k: random.random())
total = 0
for i in array:
if array[i-1] != shuffledArray[i-1]: total += 1
if total == 4:
totalrandom = True
if it > len(array)**3:
end = time.time()
print(end-start)
print("'Total random' shuffle impossible")
exit()
end = time.time()
print(end-start)
print(shuffledArray)
Here is a smaller one, with pythonic syntax -
import random
def derange(s):
d=s[:]
while any([a==b for a,b in zip(d,s)]):random.shuffle(d)
return d
All it does is shuffles the list until there is no element-wise match. Also, be careful that it'll run forever if a list that cannot be deranged is passed.It happens when there are duplicates. To remove duplicates simply call the function like this derange(list(set(my_list_to_be_deranged))).
import random
a=[1,2,3,4]
c=[]
i=0
while i < len(a):
while 1:
k=random.choice(a)
#print k,a[i]
if k==a[i]:
pass
else:
if k not in c:
if i==len(a)-2:
if a[len(a)-1] not in c:
if k==a[len(a)-1]:
c.append(k)
break
else:
c.append(k)
break
else:
c.append(k)
break
i=i+1
print c
A quick way is to try to shuffle your list until you reach that state. You simply try to shuffle your list until you are left with a list that satisfies your condition.
import random
import copy
def is_derangement(l_original, l_proposal):
return all([l_original[i] != item for i, item in enumerate(l_proposal)])
l_original = [1, 2, 3, 4, 5]
l_proposal = copy.copy(l_original)
while not is_derangement(l_original, l_proposal):
random.shuffle(l_proposal)
print(l_proposal)

How to split a list into subsets with no repeating elements in python

I need code that takes a list (up to n=31) and returns all possible subsets of n=3 without any two elements repeating in the same subset twice (think of people who are teaming up in groups of 3 with new people every time):
list=[1,2,3,4,5,6,7,8,9]
and returns
[1,2,3][4,5,6][7,8,9]
[1,4,7][2,3,8][3,6,9]
[1,6,8][2,4,9][3,5,7]
but not:
[1,5,7][2,4,8][3,6,9]
because 1 and 7 have appeared together already (likewise, 3 and 9).
I would also like to do this for subsets of n=2.
Thank you!!
Here's what I came up with:
from itertools import permutations, combinations, ifilter, chain
people = [1,2,3,4,5,6,7,8,9]
#get all combinations of 3 sets of 3 people
combos_combos = combinations(combinations(people,3), 3)
#filter out sets that don't contain all 9 people
valid_sets = ifilter(lambda combo:
len(set(chain.from_iterable(combo))) == 9,
combos_combos)
#a set of people that have already been paired
already_together = set()
for sets in valid_sets:
#get all (sorted) combinations of pairings in this set
pairings = list(chain.from_iterable(combinations(combo, 2) for combo in sets))
pairings = set(map(tuple, map(sorted, pairings)))
#if all of the pairings have never been paired before, we have a new one
if len(pairings.intersection(already_together)) == 0:
print sets
already_together.update(pairings)
This prints:
~$ time python test_combos.py
((1, 2, 3), (4, 5, 6), (7, 8, 9))
((1, 4, 7), (2, 5, 8), (3, 6, 9))
((1, 5, 9), (2, 6, 7), (3, 4, 8))
((1, 6, 8), (2, 4, 9), (3, 5, 7))
real 0m0.182s
user 0m0.164s
sys 0m0.012s
Try this:
from itertools import permutations
lst = list(range(1, 10))
n = 3
triplets = list(permutations(lst, n))
triplets = [set(x) for x in triplets]
def array_unique(seq):
checked = []
for x in seq:
if x not in checked:
checked.append(x)
return checked
triplets = array_unique(triplets)
result = []
m = n * 3
for x in triplets:
for y in triplets:
for z in triplets:
if len(x.union(y.union(z))) == m:
result += [[x, y, z]]
def groups(sets, i):
result = [sets[i]]
for x in sets:
flag = True
for y in result:
for r in x:
for p in y:
if len(r.intersection(p)) >= 2:
flag = False
break
else:
continue
if flag == False:
break
if flag == True:
result.append(x)
return result
for i in range(len(result)):
print('%d:' % (i + 1))
for x in groups(result, i):
print(x)
Output for n = 10:
http://pastebin.com/Vm54HRq3
Here's my attempt of a fairly general solution to your problem.
from itertools import combinations
n = 3
l = range(1, 10)
def f(l, n, used, top):
if len(l) == n:
if all(set(x) not in used for x in combinations(l, 2)):
yield [l]
else:
for group in combinations(l, n):
if any(set(x) in used for x in combinations(group, 2)):
continue
for rest in f([i for i in l if i not in group], n, used, False):
config = [list(group)] + rest
if top:
# Running at top level, this is a valid
# configuration. Update used list.
for c in config:
used.extend(set(x) for x in combinations(c, 2))
yield config
break
for i in f(l, n, [], True):
print i
However, it is very slow for high values of n, too slow for n=31. I don't have time right now to try to improve the speed, but I might try later. Suggestions are welcome!
My wife had this problem trying to arrange breakout groups for a meeting with nine people; she wanted no pairs of attendees to repeat.
I immediately busted out itertools and was stumped and came to StackOverflow. But in the meantime, my non-programmer wife solved it visually. The key insight is to create a tic-tac-toe grid:
1 2 3
4 5 6
7 8 9
And then simply take 3 groups going down, 3 groups going across, and 3 groups going diagonally wrapping around, and 3 groups going diagonally the other way, wrapping around.
You can do it just in your head then.
- : 123,456,789
| : 147,258,368
\ : 159,267,348
/ : 168,249,357
I suppose the next question is how far can you take a visual method like this? Does it rely on the coincidence that the desired subset size * the number of subsets = the number of total elements?

Categories

Resources