Find gaps in list of range values - python

I found numerous similar questions in other programming languages (ruby, C++, JS, etc) but not for Python. Since Python has e.g. itertools I wonder whether we can do the same more elegantly in Python.
Let's say we have a "complete range", [1,100] and then a subset of ranges within/matching the "complete range":
[10,50]
[90,100]
How can we extract the not covered positions, in this case [1,9], [51,89]?
This is a toy example, in my real dataset I have ranges up to thousands.

Here is a neat solution using itertools.chain: I've assumed the input ranges don't overlap. If they do, they need to be simplified first using a union-of-ranges algorithm.
from itertools import chain
def range_gaps(a, b, ranges):
    ranges = sorted(ranges)
    flat = chain((a-1,), chain.from_iterable(ranges), (b+1,))
    return [[x+1, y-1] for x, y in zip(flat, flat) if x+1 < y]
Taking range_gaps(1, 100, [[10, 50], [90, 100]]) as an example:
First sort the ranges in case they aren't already in order. If they are guaranteed to be in order, this step is not needed.
Then flat is an iterable which will give the sequence 0, 10, 50, 90, 100, 101.
Since flat is lazily evaluated and is consumed by iterating over it, zip(flat, flat) gives a sequence of pairs like (0, 10), (50, 90), (100, 101).
The ranges required are then like (1, 9), (51, 89) and the case of (100, 101) should give an empty range so it is discarded.
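The answer assumes the input ranges do not overlap and mentions a union-of-ranges step for when they do. A minimal sketch of such a step follows, assuming inclusive integer ranges; the name merge_ranges is hypothetical and not part of the original answer. It also merges ranges that merely touch, such as [1, 5] and [6, 9].

```python
def merge_ranges(ranges):
    """Union of inclusive integer ranges; adjacent ranges are merged too."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Starts a new, separate range.
            merged.append([start, end])
    return merged


print(merge_ranges([[10, 50], [40, 60], [90, 100], [61, 70]]))
# [[10, 70], [90, 100]]
```

The merged list can then be passed as the ranges argument of range_gaps.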

Assuming the list contains only integers and the sub-ranges are in increasing order and do not overlap, you can use the code below.
It takes the sub-ranges one by one and compares each with the complete range and with the sub-range before it to find the missing ranges.
[start,end]=[1,100]
chunks=[[7,15],[25,31],[74,83]]
print([r for r in [[start,chunks[0][0]-1] if start!=chunks[0][0] else []] + [[chunks[i-1][1]+1, chunks[i][0]-1] for i in range(1,len(chunks))]+[[chunks[-1][1]+1,end] if end!=chunks[-1][1] else []] if r])
Input
[1,100]
[[7,15],[25,31],[74,83]]
Output
[[1, 6], [16, 24], [32, 73], [84, 100]]
If increasing order of the sub-ranges is not guaranteed, you can include the line below to sort chunks first.
chunks.sort(key=lambda x: x[0])
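The one-liner above is dense, so here is a sketch of the same idea unrolled into a loop. The function name find_gaps and the variable position are hypothetical; the sketch assumes inclusive integer ranges and sorts the chunks itself.

```python
def find_gaps(start, end, chunks):
    """Inclusive gaps in [start, end] that no chunk covers."""
    gaps = []
    position = start  # first position not yet known to be covered
    for lo, hi in sorted(chunks):
        if lo > position:
            # Everything from position up to just before this chunk is a gap.
            gaps.append([position, lo - 1])
        position = max(position, hi + 1)
    if position <= end:
        # Tail of the complete range after the last chunk.
        gaps.append([position, end])
    return gaps


print(find_gaps(1, 100, [[25, 31], [7, 15], [74, 83]]))
# [[1, 6], [16, 24], [32, 73], [84, 100]]
```

Because position only ever moves forward, this version also tolerates overlapping chunks, which the one-liner does not.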

This is a generic solution:
def gap(N, ranges):
    # ranges = [(min1, max1), (min2, max2), ..., (minn, maxn)]
    # range() is half-open: positions run from 0 to N-1 and each max is excluded.
    original = set(range(N))
    for i in ranges:
        original = original - set(range(i[0], i[1]))
    return original
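The set-based approach yields individual uncovered positions rather than ranges. As a rough sketch, the positions can be collapsed back into inclusive [start, end] pairs with itertools.groupby. The helper name positions_to_ranges is hypothetical, and the example uses range(10, 51) and range(90, 101) so that the question's inclusive ranges [10,50] and [90,100] are fully covered.

```python
from itertools import groupby


def positions_to_ranges(positions):
    """Collapse a collection of integers into sorted inclusive [start, end] runs."""
    ordered = sorted(positions)
    runs = []
    # Within a run of consecutive integers, (value - index) stays constant.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        runs.append([run[0], run[-1]])
    return runs


uncovered = set(range(1, 101)) - set(range(10, 51)) - set(range(90, 101))
print(positions_to_ranges(uncovered))  # [[1, 9], [51, 89]]
```

Note that building sets of every position costs memory proportional to the size of the complete range, so this suits small ranges better than very large ones.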

Related

Does zip operation conserve order?

I was reading the following example from geeksforgeeks:
# Python code to demonstrate the working of
# zip()
# initializing lists
name = [ "Manjeet", "Nikhil", "Shambhavi", "Astha" ]
roll_no = [ 4, 1, 3, 2 ]
marks = [ 40, 50, 60, 70 ]
# using zip() to map values
mapped = zip(name, roll_no, marks)
# converting values to print as set
mapped = set(mapped)
# printing resultant values
print ("The zipped result is : ",end="")
print (mapped)
but if you see the result:
The zipped result is : {('Shambhavi', 3, 60), ('Astha', 2, 70),
('Manjeet', 4, 40), ('Nikhil', 1, 50)}
I would have expected to see {('Manjeet', 4, 40), ('Nikhil', 1, 50), ('Shambhavi', 3, 60), ('Astha', 2, 70)}. This made me think: if I want to do a mathematical operation between two lists by using zip, will zip itself change the order? I tried this little code and it seems it doesn't, but I still have doubts. Did I just get lucky this time, or do I have to worry about it? I really need the position of the pairs in (A, B) not to change.
A = range(1,14)
B = range(2,15)
data = [x + y for x, y in zip(A, B)]
print(data)
zip makes use of the underlying iterators. It doesn't change the order.
Here is what the documentation says:
The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using zip(*[iter(s)]*n)
Zip does not change the order of the objects passed to it. However set does, as it forms an unordered set from the zip results.
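A small sketch to illustrate both points: collecting the zip result in a list keeps the input order, and the grouping idiom quoted from the documentation relies on that left-to-right evaluation. The variable names pairs, s, n and groups are chosen only for this example.

```python
names = ["Manjeet", "Nikhil", "Shambhavi", "Astha"]
roll_no = [4, 1, 3, 2]

# A list built from zip keeps the pairs in input order.
pairs = list(zip(names, roll_no))
print(pairs)
# [('Manjeet', 4), ('Nikhil', 1), ('Shambhavi', 3), ('Astha', 2)]

# The grouping idiom: the same iterator is passed n times,
# so each output tuple consumes n consecutive items.
s = [1, 2, 3, 4, 5, 6]
n = 2
groups = list(zip(*[iter(s)] * n))
print(groups)  # [(1, 2), (3, 4), (5, 6)]
```

Only the conversion to set in the original example discards the order; the zip object itself yields tuples in sequence.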

How to identify the maximum in list of lists

V = [[10,20,30,40],[30,40,50,-50,-70]]
V_max_result = max(V)
V_max_result_index = V.index(max(V))
print(V_max_result,V_max_result_index)
Presently, it is giving output like [30, 40, 50, -50, -70] 1
I wanted it to show something like [[40,3],[50,2]] where 40 is maximum in first list and it is located at 3.
V = [[10,20,30,40], [30,40,50,-50,-70]]
print([[max(per_v), per_v.index(max(per_v))] for per_v in V])
Output:
[[40, 3], [50, 2]]
While the answer by #atline works, it has to iterate three times over each sublist, once to get the maximum, a second time to get the maximum again and then a third time to find its position in the list.
Instead it is easier to find the maximum of tuples, where each tuple is like (value, index). This is almost what enumerate returns (index, value), just reversed. We can use zip(values, itertools.count()) for that. It works, because tuples sort by first sorting according to the first entry, then the second.
from itertools import count
V = [[10,20,30,40], [30,40,50,-50,-70]]
print([max(zip(per_v, count())) for per_v in V])
# [(40, 3), (50, 2)]
If you insist on the inner tuples being lists as well:
print([list(max(zip(per_v, count()))) for per_v in V])
# [[40, 3], [50, 2]]
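Another possible variant, sketched here as an alternative rather than a correction, uses enumerate with a key function so that only the value is compared. Unlike the zip/count version, which returns the last index when the maximum occurs more than once (the index takes part in the tuple comparison), this one returns the first index, matching list.index.

```python
from operator import itemgetter

V = [[10, 20, 30, 40], [30, 40, 50, -50, -70]]

result = []
for per_v in V:
    # enumerate yields (index, value); the key compares on the value only.
    index, value = max(enumerate(per_v), key=itemgetter(1))
    result.append([value, index])

print(result)  # [[40, 3], [50, 2]]
```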

Intersect dictionary items which are two-dimensional arrays

Here is my dictionary of n items.
{
"proceed": [[6,46] , [7,67], [12,217], [67,562], [67,89]],
"concluded": [[6,46] , [783,123], [121,521], [67,12351], [67,12351]],
...
}
Imagine a dictionary something like that, with n keys whose items are two-dimensional arrays.
I want to intersect all of them and get [6,46] as the result.
I tried something like this:
result=set.intersection(*map(set,output.values()))
However, it raised an error because the items are two-dimensional arrays.
Can someone please help me with how to do that?
Thanks.
So... sets don't work for lists because lists are not hashable. Instead you'll have to make them sets of tuples like so:
result = set.intersection(*({tuple(p) for p in v} for v in output.values()))
Edit: works in py version >= 2.7
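For illustration, here is the same expression run end to end. The dictionary below keeps only the two keys shown in the question (the elided entries are left out), which is an assumption made for this example.

```python
output = {
    "proceed": [[6, 46], [7, 67], [12, 217], [67, 562], [67, 89]],
    "concluded": [[6, 46], [783, 123], [121, 521], [67, 12351], [67, 12351]],
}

# Convert each inner list to a tuple so the pairs become hashable.
result = set.intersection(*({tuple(p) for p in v} for v in output.values()))
print(result)  # {(6, 46)}

# Back to a list of lists, if that shape is needed.
as_lists = [list(p) for p in sorted(result)]
print(as_lists)  # [[6, 46]]
```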
I completely agree with the answer of #FHTMitchell, but here is a bit more explanation, with an example, of why you can't build a set from lists and get TypeError: unhashable type instead.
Consider below values:
x = {'concluded': [[6, 46], [783, 123], [121, 521], [67, 12351], [67, 12351]],
'proceed': [[6, 46], [7, 1], [12, 217], [67, 562], [67, 89]]}
y = {'concluded': ((6, 46), (67, 12351), (121, 521), (783, 123)),
'proceed': ((6, 46), (7, 1), (12, 217), (67, 89), (67, 562))}
x is a dictionary whose values are lists of lists. The main thing to note is that these values are stored as lists, which are mutable; in y they are tuples of tuples (you could also keep them in a set of tuples), and tuples are immutable.
Now suppose you somehow managed to get your desired output [6,46]. Notice that it is itself a list stored inside another list, so if you change a value as below:
x['proceed'][0][0] = 9
it will change [6, 46] to [9, 46] under the proceed key, and your output may or may not change, depending on how you iterated over and stored it.
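The hashability point can be checked directly. This is a minimal sketch with made-up variable names; the exact wording of the error message may vary between Python versions, so it is printed rather than assumed.

```python
pair_as_tuple = (6, 46)
pair_as_list = [6, 46]

# A tuple of integers is hashable, so it can be a set element.
print({pair_as_tuple})  # {(6, 46)}

# A list is mutable and unhashable, so building a set from it fails.
try:
    {pair_as_list}
except TypeError as exc:
    print("TypeError:", exc)
```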

Sort 2 lists in Python based on the ratio of individual corresponding elements or based on a third list

I am trying to write different implementations for a fractional knapsack problem.
For this I have 2 arrays:
Values
Weights
The element values[n] corresponds to the element weights[n]. So we can calculate value_per_unit as:
for I in range(values):
    value_per_unit.append(values[I]/weights[I])
value_per_unit.sort()
I now need the 2 arrays (values and weights) to be sorted according to the value_per_unit array
eg:
If
values = [60, 100, 120]
weights = [20, 50, 30]
Then
values_per_unit = [3.0, 2.0, 4.0]
and so values_per_unit_sorted will be [2.0, 3.0, 4.0]
I need the values and weights arrays to become:
values_sorted = [100,60,120]
weights_sorted = [50,20,30]
Is there a way to achieve this using simple lambda functions?
I can still do something like this, but it seems highly inefficient every time I need to access the elements:
weights[(value_per_unit_sorted.index(max(value_per_unit_sorted)))]
In one line:
values, weights = zip(*sorted(zip(values, weights), key=lambda t: t[0]/t[1]))
To explain: First, zip the lists to pair them.
pairs = zip(values, weights)
# [(60, 20), (100, 50), (120, 30)]
Then, sort by the quotient of value to weight.
sorted_pairs = sorted(pairs, key=lambda t: t[0]/t[1])
# [(100, 50), (60, 20), (120, 30)]
Finally, unzip them back into separate lists.
values, weights = zip(*sorted_pairs)
# (100, 60, 120), (50, 20, 30)
An alternative is to construct tuples explicitly containing the ratio as the first element.
ratios, values, weights = zip(*sorted((v/w, v, w) for v, w in zip(values, weights)))
The former appears to be slightly faster in some quick testing. If you're looking for an optimal algorithm, you're probably going to have to unroll things and the solution will not be as concise.
And to address the comment from #TomWyllie, if you already have the list of ratios, you can use:
ratios, values, weights = zip(*sorted(zip(ratios, values, weights)))
Note that these last two solutions differ from the initial solution in the case where two pairs have an equal ratio. These solutions will sort secondarily by value, while the first solution will keep the items in the same order as the original list.
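A short sketch of that tie-breaking difference, using made-up data in which every item has the same ratio:

```python
values = [30, 10, 20]
weights = [15, 5, 10]  # every ratio is 2.0, so all items tie

# Key-based sort: sorted() is stable, so tied items keep their original order.
by_key = sorted(zip(values, weights), key=lambda t: t[0] / t[1])
print(by_key)  # [(30, 15), (10, 5), (20, 10)]

# Tuple-based sort: ties on the ratio fall through to the value.
by_tuple = sorted((v / w, v, w) for v, w in zip(values, weights))
print([(v, w) for _, v, w in by_tuple])  # [(10, 5), (20, 10), (30, 15)]
```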
An elegant way to do this is to make a multi-dimensional list with the values and weights:
values_and_weights = []
for i in range(len(values)):
    values_and_weights.append([values[i], weights[i]])
# The resulting list is [[60, 20], [100, 50], [120, 30]]
Then, use the sort method with a value divided by weight as the key.
values_and_weights.sort(key=(lambda x: x[0]/x[1]))
For a more explicit (but arguably less pythonic) solution, create a list of indices, sorted by the value at that index in value_per_unit, and reorder values and weights accordingly.
sorted_indices = [index for index, value in
                  sorted(enumerate(value_per_unit), key=lambda x: x[1])]
values = [values[i] for i in sorted_indices]
weights = [weights[i] for i in sorted_indices]
print(values, weights)
Outputs:
([100, 60, 120], [50, 20, 30])
You can tidy this up, eliminating the unnecessary extra loops using zip, and with a generator expression;
values, weights = zip(*((values[i], weights[i]) for i, value in
                        sorted(enumerate(value_per_unit), key=lambda x: x[1])))
print(values)
print(weights)
Which outputs;
(100, 60, 120)
(50, 20, 30)
Note these final values are tuples not lists. If you really need the output to be a list, a simple values, weights = map(list, (values, weights)) should suffice. You could even wrap that into the one liner, although by that point it's probably getting pretty hard to follow what's happening.
The problem you are having is caused by sorting on a calculated field for each element (element I has the calculated value values[I]/weights[I]).
To solve this and still keep it easy to understand, you can turn each element into a tuple of the form (calculated_value, (value, weight)).
Look at the following solution:
values = [60, 100, 120]
weights = [20, 50, 30]
value_per_unit = []
for I in range(len(values)):
    value_per_unit.append( (values[I]/weights[I], (values[I], weights[I])) )
sorted_value_per_unit = sorted(value_per_unit, key=lambda x: x[0])
sorted_values = []
sorted_weights = []
for I in range(len(values)):
    (value, weight) = sorted_value_per_unit[I][1]
    sorted_values.append(value)
    sorted_weights.append(weight)
print(str(sorted_values))
print(str(sorted_weights))
Also, note that I modified the loop from your original code:
range(values) was changed to range(len(values)),
since range needs the length of the list rather than the list itself.

Finding the difference between consecutive numbers in a list (Python)

Given a list of numbers, I am trying to write a code that finds the difference between consecutive elements. For instance, A = [1, 10, 100, 50, 40] so the output of the function should be [0, 9, 90, 50, 10]. Here is what I have so far trying to use recursion:
def deviation(A):
    if len(A) < 2:
        return
    else:
        return [abs(A[0]-A[1])] + [deviation(A[1: ])]
The output I get, however (using the above example of A as the input), is [9, [90, [50, [10, None]]]]. How do I properly format my brackets? (I've tried guessing and checking, but this is the closest I have gotten.) And how do I write this so that it subtracts the current element from the previous element without getting an index error for the first element? I still want the first element of the output list to be zero, but I do not know how to go about this using recursion, and for some reason that seems the best route to me.
You can do:
[y-x for x, y in zip(A[:-1], A[1:])]
>>> A = [1, 10, 100, 50, 40]
>>> [y-x for x, y in zip(A[:-1], A[1:])]
[9, 90, -50, -10]
Note that the difference will be negative if the right-hand element is smaller. You can easily fix this if you consider it wrong; I'll leave the solution to you.
Explanation:
The best explanation you can get is simply printing each part of the list comprehension.
A[:-1] returns the list without the last element: [1, 10, 100, 50]
A[1:] returns the list without the first element: [10, 100, 50, 40]
zip(A[:-1], A[1:]) returns [(1, 10), (10, 100), (100, 50), (50, 40)]
The last step is simply returning the difference in each tuple.
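For reference, one way to get the exact output requested in the question (absolute differences with a leading 0) from the same zip idea. This is a sketch under that reading of the question, not the answerer's own code.

```python
A = [1, 10, 100, 50, 40]

# Pair each element with its successor, take absolute differences,
# and prepend 0 for the first element, which has no predecessor.
result = [0] + [abs(y - x) for x, y in zip(A, A[1:])]
print(result)  # [0, 9, 90, 50, 10]
```

zip stops at the shorter input, so A does not need to be trimmed with A[:-1] here.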
The simplest (laziest) solution is to use the numpy function diff:
>>> import numpy as np
>>> A = [1, 10, 100, 50, 40]
>>> np.diff(A)
array([ 9, 90, -50, -10])
If you want the absolute value of the differences (as you've implied by your question), then take the absolute value of the array.
[abs(j-A[i+1]) for i,j in enumerate(A[:-1])]
You can do a list comprehension:
>>> A = [1, 10, 100, 50, 40]
>>> l=[A[0]]+A
>>> [abs(l[i-1]-l[i]) for i in range(1,len(l))]
[0, 9, 90, 50, 10]
For a longer recursive solution more in line with your original approach:
def deviation(A):
    if len(A) < 2:
        return []
    else:
        return [abs(A[0]-A[1])] + deviation(A[1:])
Your bracket issue is with your recursive call. Since you wrap deviation(A[1: ]) in its own [] brackets, every recursive call creates a new list, resulting in your many lists within lists.
To fix the None issue, just change your base case to return an empty list []. Now your function will add 'nothing' to the end of your recursively built list, as opposed to the implicit None that comes with a bare return.
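The question also asks for a leading 0 in the output. One way to keep the recursion and add it is to wrap the recursive part in an inner helper; the helper name diffs is hypothetical and this is only a sketch.

```python
def deviation(A):
    """Recursive sketch: absolute differences with a leading 0."""
    def diffs(rest):
        # Base case: fewer than two elements leaves nothing to compare.
        if len(rest) < 2:
            return []
        return [abs(rest[0] - rest[1])] + diffs(rest[1:])

    return [0] + diffs(A)


print(deviation([1, 10, 100, 50, 40]))  # [0, 9, 90, 50, 10]
```

Each call recurses once per element, so very long lists can hit Python's recursion limit (1000 frames by default).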
Actually recursion is an overkill:
def deviation(A):
    yield 0
    for i in range(len(A) - 1):
        yield abs(A[i+1] - A[i])
Example:
>>> A = [3, 5, 2]
>>> list(deviation(A))
[0, 2, 3]
EDIT: Yet another, even simpler and more efficient solution would be this:
def deviation(A):
    prev = A[0]
    for el in A:
        yield abs(el - prev)
        prev = el
