Grouping a list of integers with nearest values - python

I have a list:
d = [23, 67, 110, 25, 69, 24, 102, 109]
How can I group nearby values, using a dynamic gap, into tuples like this? What is the fastest method?
[(23, 24, 25), (67, 69), (102, 109, 110)]

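Like this:
d = [23, 67, 110, 25, 69, 24, 102, 109]
d.sort()
# differences within non-overlapping pairs: (d[0], d[1]), (d[2], d[3]), ...
diff = [y - x for x, y in zip(*[iter(d)] * 2)]
avg = sum(diff) / len(diff)
m = [[d[0]]]
for x in d[1:]:
    # compare against the first element of the current group
    if x - m[-1][0] < avg:
        m[-1].append(x)
    else:
        m.append([x])
print(m)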
## [[23, 24, 25], [67, 69], [102, 109, 110]]
First we calculate an average difference between paired sequential elements, and then we add an element to the current group when its distance from the group's first element is less than that average.
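A minimal sketch of a closely related variant, assuming the threshold should be the mean of all gaps between adjacent sorted values and that each value is compared with its immediate predecessor. The function name `group_nearest` is hypothetical and not part of the answer above.

```python
def group_nearest(values):
    """Group sorted values; a gap at or above the mean adjacent gap starts a new group."""
    ordered = sorted(values)
    if len(ordered) < 2:
        return [tuple(ordered)] if ordered else []
    # Gaps between every pair of adjacent sorted values.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    threshold = sum(gaps) / len(gaps)
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev < threshold:
            groups[-1].append(cur)   # close to the previous value: same group
        else:
            groups.append([cur])     # large gap: start a new group
    return [tuple(g) for g in groups]


d = [23, 67, 110, 25, 69, 24, 102, 109]
print(group_nearest(d))
# [(23, 24, 25), (67, 69), (102, 109, 110)]
```

Note that with this threshold, evenly spaced input puts every value in its own group, because no gap is strictly below the mean.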

Related

how to find max, min and average by using if technique in a 2d array in python

x=[[80,59,34,89],[31,11,47,64],[29,56,13,91],[55,61,48,0],[75,78,81,91]]
I want to find the maximum, minimum and average values of the 2D array above.
You can use numpy module to find min and max values easily:
import numpy as np
x = np.array([[80, 59, 34, 89], [31, 11, 47, 64], [29, 56, 13, 91], [55, 61, 48, 0], [75, 78, 81, 91]])
minValue = np.min(x)
maxValue = np.max(x)
print(minValue)
print(maxValue)
If you need to find them without built-in methods, you can use an approach like the following:
x = [[80, 59, 34, 89], [31, 11, 47, 64], [29, 56, 13, 91], [55, 61, 48, 0], [75, 78, 81, 91]]
minValue = x[0][0]
maxValue = x[0][0]
sumAll = 0
count = 0
for inner in x:
    for each in inner:
        if each > maxValue: maxValue = each
        if each < minValue: minValue = each
        sumAll += each
        count += 1
average = sumAll / count
In this approach, you compare each value against the current min and max. At the same time, you sum and count the elements to calculate the average.
You can get the maximum, minimum and average of a 2D array using map, like this:
def Average(lst):
    return sum(lst) / len(lst)
x=[[80,59,34,89],[31,11,47,64],[29,56,13,91],[55,61,48,0],[75,78,81,91]]
maximum = max(map(max, x))  # 91
minimum = min(map(min, x))  # 0
# mean of the row means; equals the overall mean only because all rows have the same length
average = Average(list(map(lambda idx: sum(idx)/float(len(idx)), x)))  # 54.65
You can use numpy to flatten the 2D array into a 1D array.
import numpy as np
x=[[80,59,34,89],[31,11,47,64],[29,56,13,91],[55,61,48,0],[75,78,81,91]]
x = np.array(x)
print(max(x.flatten()))
print(min(x.flatten()))
print(sum(x.flatten())/ len(x.flatten()))
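If numpy is not available, the same flattening idea can be sketched with the standard library. This assumes `itertools.chain.from_iterable` is an acceptable substitute for `flatten()`; the names `flat`, `minimum`, `maximum` and `average` are illustrative only.

```python
from itertools import chain

x = [[80, 59, 34, 89], [31, 11, 47, 64], [29, 56, 13, 91], [55, 61, 48, 0], [75, 78, 81, 91]]

# Flatten the nested lists into one list of 20 values.
flat = list(chain.from_iterable(x))

minimum = min(flat)
maximum = max(flat)
average = sum(flat) / len(flat)  # overall mean, correct even for rows of unequal length

print(minimum, maximum, average)
```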

Given 2 lists of integers, how to find the non-overlapping ranges?

Given
x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]
The goal is to iterate through each x_i and find the first value in y that is larger than x_i but not larger than x_i+1.
Assuming that both lists are sorted and all items are unique, the desired output for the given x and y is:
[(5, 8), (30, 35), (58, 60), (72, 73)]
I've tried:
def per_window(sequence, n=1):
    """
    From http://stackoverflow.com/q/42220614/610569
    >>> list(per_window([1,2,3,4], n=2))
    [(1, 2), (2, 3), (3, 4)]
    >>> list(per_window([1,2,3,4], n=3))
    [(1, 2, 3), (2, 3, 4)]
    """
    start, stop = 0, n
    seq = list(sequence)
    while stop <= len(seq):
        yield tuple(seq[start:stop])
        start += 1
        stop += 1
x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]
r = []
for xi, xiplus1 in per_window(x, 2):
    for j, yj in enumerate(y):
        if yj > xi and yj < xiplus1:
            r.append((xi, yj))
            break
# For the last x value.
for j, yj in enumerate(y):
    if yj > xiplus1:
        r.append((xiplus1, yj))
        break
But is there a simpler way to achieve the same with numpy, pandas or something else?
You can use numpy.searchsorted with side='right' to find the index of the first value in y that is larger than each element of x, and then extract the elements at those indices. A simple version, which assumes there is always a value in y larger than any element in x, could be:
import numpy as np
x = np.array([5, 30, 58, 72])
y = np.array([8, 35, 53, 60, 66, 67, 68, 73])
np.column_stack((x, y[np.searchsorted(y, x, side='right')]))
# array([[ 5,  8],
#        [30, 35],
#        [58, 60],
#        [72, 73]])
Given y is sorted:
np.searchsorted(y, x, side='right')
# array([0, 1, 3, 7])
returns the index of the first value in y that is larger than the corresponding value in x.
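The same lookup can be sketched without numpy using the standard library's `bisect_right`, which plays the role of `searchsorted(..., side='right')` on a sorted list. The function name `first_y_after_each_x` is hypothetical. Two assumptions are made here: an x value is skipped when no larger y exists, and the match must be strictly smaller than the next x, as in the asker's loop.

```python
from bisect import bisect_right


def first_y_after_each_x(x, y):
    """For each x[i], pair it with the first y that is > x[i] and < x[i+1].

    Both lists are assumed to be sorted.
    """
    pairs = []
    for i, xi in enumerate(x):
        j = bisect_right(y, xi)          # index of the first y value > xi
        if j == len(y):
            continue                     # no y value is larger than xi
        is_last = i + 1 == len(x)
        if is_last or y[j] < x[i + 1]:   # the last x has no upper bound
            pairs.append((xi, y[j]))
    return pairs


x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]
print(first_y_after_each_x(x, y))
# [(5, 8), (30, 35), (58, 60), (72, 73)]
```

Each lookup is a binary search, so the whole pass costs O(len(x) * log(len(y))) rather than scanning y from the start for every x.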
We can build a pd.DataFrame from each list and use merge_asof with direction='forward', i.e.:
new = pd.merge_asof(pd.DataFrame(x,index=x), pd.DataFrame(y,index=y),on=0,left_index=True,direction='forward')
out = list(zip(new[0],new.index))
If exact matches should not count as matches, pass allow_exact_matches=False to merge_asof.
Output :
[(5, 8), (30, 35), (58, 60), (72, 73)]
You can construct a new list by iterating over x zipped with itself (offset by one index and with the last element of y appended) and then iterating over y, checking the condition at each pass and breaking out of the innermost loop on the first match.
out = []
for x_low, x_high in zip(x, x[1:]+y[-1:]):
    for yy in y:
        if (yy>x_low) and (yy<=x_high):
            out.append((x_low,yy))
            break
out
# returns:
[(5, 8), (30, 35), (58, 60), (72, 73)]
def find(list1,list2):
    final = []
    for i in range(len(list1)):
        pos=0
        try:
            while True:
                if i+1==len(list1) and list1[i]<list2[pos]:
                    final.append((list1[i],list2[pos]))
                    raise Exception
                if list1[i]<list2[pos] and list1[i+1]>list2[pos]:
                    final.append((list1[i],list2[pos]))
                    raise Exception
                pos+=1
        except: pass
    return final

Finding max and min indices in lists in Python

I have a list that looks like:
trial_lst = [0.5, 3, 6, 40, 90, 130.8, 129, 111, 8, 9, 0.01, 9, 40, 90, 130.1, 112, 108, 90, 77, 68, 0.9, 8, 40, 90, 92, 130.4]
The list represents a series of experiments, each with a minimum and a maximum index. For example, in the list above, the minimum and maximum would be as follows:
Experiment 1:
Min: 0.5
Max: 130.8
Experiment 2:
Min: 0.01
Max: 130.1
Experiment 3:
Min: 0.9
Max: 130.4
I obtained the values for each experiment above because I know that each experiment starts at around zero (such as 0.4, 0.001, 0.009, etc.) and ends at around 130 (130, 131.2, 130.009, etc.). You can imagine a nozzle turning on and off. When it turns on, the pressure rises, and when it's turned off, the pressure dips. I am trying to calculate the minimum and maximum values for each experiment.
What I've tried so far is iterating through the list to first mark each index as max, but I can't seem to get that right.
Here is my code. Any suggestions on how I can change it?
result = []
for idx, item in enumerate(trial_lst):
    if idx > 0:
        prev = trial_lst[idx-1]
        curr = item
        if prev > curr:
            result.append((curr, "max"))
        else:
            result.append((curr, ""))
I am looking for a manual way to do this, no libraries.
The easiest way is to sort your list or array first:
trial_lst = [0.5, 3, 6, 40, 90, 130.8, 129, 111, 8, 9, 0.01, 9, 40, 90, 130.1, 112, 108, 90, 77, 68, 0.9, 8, 40, 90, 92, 130.4]
trial_lst.sort(key=float)
for count, items in enumerate(trial_lst):
    counter = count + 1
    last_object = (counter, trial_lst[count], trial_lst[(len(trial_lst)-1) - count])
    print( last_object )
You can easily get the index of the minimum value using the following:
my_list.index(min(my_list))
Here is an interactive demonstration which may help:
>>> trial_lst = [0.5, 3, 6, 40, 90, 130.8, 129, 111, 8, 9, 0.01, 9, 40, 90, 130.1, 112, 108, 90, 77, 68, 0.9, 8, 40, 90, 92, 130.4]
Use values below 1 to identify where one experiment ends and another begins
>>> indices = [x[0] for x in enumerate(map(lambda x:x<1, trial_lst)) if x[1]]
Break list into sublists at those values
>>> sublists = [trial_lst[i:j] for i,j in list(zip([0]+indices, indices+[None]))[1:]]
Compute max/min for each sublist
>>> for i,l in enumerate(sublists):
...     print("Experiment", i+1)
...     print("Min", min(l))
...     print("Max", max(l))
...     print()
...
Experiment 1
Min 0.5
Max 130.8
Experiment 2
Min 0.01
Max 130.1
Experiment 3
Min 0.9
Max 130.4
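Since the question asks for a manual approach, here is a single-pass sketch of the same idea with no libraries. It assumes, as the demonstration above does, that any value below 1 marks the start of a new experiment; the function name `experiment_ranges` and the `start_threshold` parameter are hypothetical.

```python
def experiment_ranges(values, start_threshold=1.0):
    """Split values into experiments and return a (min, max) pair for each.

    Assumption: a value below start_threshold begins a new experiment.
    """
    ranges = []
    current = []
    for v in values:
        if v < start_threshold and current:
            # A new experiment starts here, so close the previous one.
            ranges.append((min(current), max(current)))
            current = []
        current.append(v)
    if current:
        ranges.append((min(current), max(current)))
    return ranges


trial_lst = [0.5, 3, 6, 40, 90, 130.8, 129, 111, 8, 9, 0.01, 9, 40, 90, 130.1,
             112, 108, 90, 77, 68, 0.9, 8, 40, 90, 92, 130.4]
for number, (low, high) in enumerate(experiment_ranges(trial_lst), start=1):
    print("Experiment", number, "Min", low, "Max", high)
```

If real data can dip below the threshold in the middle of an experiment, this split rule would need to change, for example by detecting a fall followed by a rise.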

take sum of ints preserving specific information

I have a list of ints
list = [25, 50, 70, 32, 10, 20, 50, 40, 30]
And I would like to sum up the ints (from left to right) if their sum is smaller than 99. Let's say I write this output to a list; then this list should look like this:
#75 because 25+50 = 75. 25+50+70 would be > 99
new_list = [75, 70, 62, 90, 30]
#70 because 70+32 > 99
#62 because 32+10+20 = 62. 32+10+20+50 would be > 99
But that is not all. I want to save the ints the sum was made from as well. So what I actually want to have is a data structure that looks like this:
list0 = [ [(25,50),75], [(70),70], [(32, 10, 20),62], [(50, 40),90], [(30),30] ]
How can I do this?
Use a separate list to track your numbers:
results = []
result = []
for num in inputlist:
    if sum(result) + num < 100:
        result.append(num)
    else:
        results.append([tuple(result), sum(result)])
        result = [num]
if result:
    results.append([tuple(result), sum(result)])
For your sample input, this produces:
[[(25, 50), 75], [(70,), 70], [(32, 10, 20), 62], [(50, 40), 90], [(30,), 30]]
You can use a generator for this:
l = [25, 50, 70, 32, 10, 20, 50, 40, 30]
def sum_iter(lst):
    s = 0
    t = tuple()
    for i in lst:
        if s + i <= 99:
            s += i
            t += (i,)
        else:
            yield t, s
            s = i
            t = (i,)
    else:
        yield t, s
res = [[t, s] for t, s in sum_iter(l)]
On your data the result is:
[[(25, 50), 75], [(70,), 70], [(32, 10, 20), 62], [(50, 40), 90], [(30,), 30]]
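Both approaches above emit an empty group (`((), 0)`) when a single number is itself larger than the limit, because the running group is flushed while still empty. A minimal sketch that guards against this, assuming an oversized number should simply form its own group; the function name `group_by_sum` and the `limit` parameter are hypothetical.

```python
def group_by_sum(numbers, limit=99):
    """Greedily group numbers left to right so that each group's sum is <= limit.

    Assumption: a single number above the limit becomes a group of its own.
    """
    groups = []
    current = []
    for n in numbers:
        # Flush only a non-empty group, so no empty tuple is ever emitted.
        if current and sum(current) + n > limit:
            groups.append([tuple(current), sum(current)])
            current = []
        current.append(n)
    if current:
        groups.append([tuple(current), sum(current)])
    return groups


numbers = [25, 50, 70, 32, 10, 20, 50, 40, 30]
print(group_by_sum(numbers))
# [[(25, 50), 75], [(70,), 70], [(32, 10, 20), 62], [(50, 40), 90], [(30,), 30]]
```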

Finding the largest delta between two integers in a list

I have a list of integers, e.g.:
values = [55, 55, 56, 57, 57, 57, 57, 62, 63, 64, 79, 80]
I am trying to find the largest difference between two consecutive numbers.
In this case it would be 15 from 64->79.
The numbers can be negative or positive, increasing or decreasing or both. The important thing is I need to find the largest delta between two consecutive numbers.
What is the fastest way to do this? These lists can contain anywhere from hundreds to thousands of integers.
This is the code I have right now:
prev_value = values[0]
largest_delta = 0
for value in values:
    delta = value - prev_value
    if delta > largest_delta:
        largest_delta = delta
    prev_value = value
return largest_delta
Is there a faster way to do this? It takes a while.
max(abs(x - y) for (x, y) in zip(values[1:], values[:-1]))
Try timing some of these with the timeit module:
>>> values = [55, 55, 56, 57, 57, 57, 57, 62, 63, 64, 79, 80]
>>> max(values[i+1] - values[i] for i in range(len(values) - 1))
15
>>> max(v1 - v0 for v0, v1 in zip(values[:-1], values[1:]))
15
>>> # In Python 3, zip is already lazy, so itertools.izip is no longer needed.
>>> from itertools import islice
>>> max(v1 - v0 for v0, v1 in zip(values, islice(values, 1, None)))
15
>>>
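A minimal sketch of how such a comparison could be set up with `timeit`. The three function names are hypothetical wrappers around the expressions above, the repeat count of 10,000 is an arbitrary choice, and the timings depend on the machine and list size, so none are quoted here. Like the expressions above, these compute the signed delta; wrap the difference in `abs()` if decreasing steps should count too.

```python
import timeit
from itertools import islice

values = [55, 55, 56, 57, 57, 57, 57, 62, 63, 64, 79, 80]


def delta_by_index(vals):
    return max(vals[i + 1] - vals[i] for i in range(len(vals) - 1))


def delta_by_zip_slices(vals):
    return max(b - a for a, b in zip(vals[:-1], vals[1:]))


def delta_by_islice(vals):
    # islice avoids copying the list just to skip its first element.
    return max(b - a for a, b in zip(vals, islice(vals, 1, None)))


candidates = (delta_by_index, delta_by_zip_slices, delta_by_islice)
timings = {}
for func in candidates:
    # Time 10,000 calls of each candidate on the same input.
    timings[func.__name__] = timeit.timeit(lambda: func(values), number=10_000)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```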
This is more of an advertisement for the brilliant recipes in the Python itertools documentation.
In this case, use the pairwise recipe from that documentation.
from itertools import tee
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)
values = [55, 55, 56, 57, 57, 57, 57, 62, 63, 64, 79, 80]
print(max(b - a for a,b in pairwise(values)))
With reduce (ugly, I guess):
>>> from functools import reduce
>>> foo = [5, 5, 5, 5, 8, 8, 9]
>>> print(reduce(lambda x, y: (max(x[0], y - x[1]), y), foo, (0, foo[0]))[0])
3
Starting in Python 3.10, the new pairwise function provides a way to slide through pairs of consecutive elements, and thus find each of their differences:
from itertools import pairwise
# values = [55, 55, 56, 57, 57, 57, 57, 62, 63, 64, 79, 80]
max(abs(x-y) for x, y in pairwise(values))
# 15
The intermediate result of pairwise:
list(pairwise([55, 55, 56, 57, 57, 57, 57, 62, 63, 64, 79, 80]))
# [(55, 55), (55, 56), (56, 57), (57, 57), (57, 57), (57, 57), ...]
