Given an interval of time:
a = [20,40]
I need to convert it into equal intervals:
a = [[20,30],[30,40]]
I tried this code:
d = []
v1 = a[0]; v2 = a[1]
d.append(v1)
val = abs(v1-v2)
n = int(val/2)
for i in range(n):
    v1 += n
    d.append(v1)
print(d)
Can anyone suggest code to do this? It would be helpful.
I can point out a few things that are incorrect in what you've tried, instead of writing out the code for you.
for i in range(n):
    v1 += n
    d.append(v1)
Remember, from your example n is set to 10. So when you say for i in range(n), you will be iterating through your for loop 10 times.
And if you look at the way you append to d, this will not append a smaller list to an overall list. It will keep appending all the numbers to just one list.
I'm guessing this is the output you are currently getting: [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120].
With what I said, give it another shot :-)
You can solve the first part by calculating an interval and using a loop to create a tuple in each iteration using that interval. This will give an answer corresponding to the desired result you listed.
However, notice that in your example the end of the previous tuple is the same as the start of the next. If you don't want them to intersect, then you need similar logic to the first, but with + 1 applied at the right time.
Here is some code, the first using a for loop and the second using a post-test loop:
def interval_divide(min, max, intervals):
    assert intervals != 0
    # Rounded step size; it may not divide the range exactly.
    interval_size = round((max - min) / intervals)
    result = []
    start = min
    for start in range(min, max, interval_size):
        end = start + interval_size
        result.append([start, end])
    return result
a = [20, 40]
print("1 intervals", interval_divide(a[0], a[1], 1))
print("2 intervals", interval_divide(a[0], a[1], 2))
print("3 intervals", interval_divide(a[0], a[1], 3))
print("4 intervals", interval_divide(a[0], a[1], 4))
def interval_divide2(min, max, intervals):
    assert intervals != 0
    interval_size = round((max - min) / intervals)
    result = []
    start = min
    end = min + interval_size
    while True:
        result.append([start, end])
        # Start the next interval one past the previous end so they don't intersect.
        start = end + 1
        end = end + interval_size
        if len(result) == intervals:
            break
    return result
print("-----")
print("1 intervals", interval_divide2(a[0], a[1], 1))
print("2 intervals", interval_divide2(a[0], a[1], 2))
print("3 intervals", interval_divide2(a[0], a[1], 3))
print("4 intervals", interval_divide2(a[0], a[1], 4))
The results are as follows:
$ python3 intervals.py
1 intervals [[20, 40]]
2 intervals [[20, 30], [30, 40]]
3 intervals [[20, 27], [27, 34], [34, 41]]
4 intervals [[20, 25], [25, 30], [30, 35], [35, 40]]
-----
1 intervals [[20, 40]]
2 intervals [[20, 30], [31, 40]]
3 intervals [[20, 27], [28, 34], [35, 41]]
4 intervals [[20, 25], [26, 30], [31, 35], [36, 40]]
Note that when we use three intervals, the last one doesn't end at the upper bound. This is because 20 cannot be divided by 3 without a remainder, and thus it's not possible for all the intervals to have the same integer size.
We can still improve our answer though by removing the rounding when we calculate the interval as follows (and still keep the result in integer terms):
def interval_divide(min, max, intervals):
    assert intervals != 0
    # Keep the step as a float; only truncate when storing the result.
    interval_size = (max - min) / intervals
    result = []
    start = min
    end = start + interval_size
    while True:
        result.append([int(start), int(end)])
        start = end
        end = end + interval_size
        if len(result) == intervals:
            break
    return result
def interval_divide2(min, max, intervals):
    assert intervals != 0
    interval_size = (max - min) / intervals
    result = []
    start = min
    end = min + interval_size
    while True:
        result.append([int(start), int(end)])
        start = end + 1
        end = end + interval_size
        if len(result) == intervals:
            break
    return result
The new answers are:
1 intervals [[20, 40]]
2 intervals [[20, 30], [30, 40]]
3 intervals [[20, 26], [26, 33], [33, 40]]
4 intervals [[20, 25], [25, 30], [30, 35], [35, 40]]
-----
1 intervals [[20, 40]]
2 intervals [[20, 30], [31, 40]]
3 intervals [[20, 26], [27, 33], [34, 40]]
4 intervals [[20, 25], [26, 30], [31, 35], [36, 40]]
The three intervals are still not exactly equal, but they are pretty close given that the answer is displayed without decimal places.
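Another way to guarantee that the last interval ends exactly at the upper bound is to compute every boundary directly with integer arithmetic rather than accumulating a float step. The following is only a sketch under assumptions: the name split_evenly is made up for illustration, and it assumes integer bounds with the upper bound greater than the lower one.

```python
def split_evenly(lo, hi, intervals):
    """Split [lo, hi] into `intervals` touching sub-intervals with integer bounds."""
    if intervals <= 0:
        raise ValueError("intervals must be positive")
    span = hi - lo
    # Boundary i sits at i/intervals of the span, floored to an integer.
    bounds = [lo + (i * span) // intervals for i in range(intervals + 1)]
    # Pair each boundary with the one that follows it.
    return [[start, end] for start, end in zip(bounds, bounds[1:])]


print(split_evenly(20, 40, 2))  # [[20, 30], [30, 40]]
print(split_evenly(20, 40, 3))  # [[20, 26], [26, 33], [33, 40]]
```

Because each boundary is computed from the original bounds, no rounding error accumulates and the interval sizes differ by at most 1.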
Use this
import numpy as np
points = np.linspace(20, 40, num=2+1)
intervals = np.array([points[:-1], points[1:]]).transpose()
print(intervals)
and get a np.array:
[[20. 30.]
 [30. 40.]]
Of course, for
points = np.linspace(10, 40, num=6+1)
intervals = np.array([points[:-1], points[1:]]).transpose()
print(intervals)
we have
[[10. 15.]
 [15. 20.]
 [20. 25.]
 [25. 30.]
 [30. 35.]
 [35. 40.]]
Related
I have a dataframe of size 700x20. My data are pixel intensity coordinates for specific locations on an image; I have 14 people, each with 50 images. I am trying to perform dimensionality reduction, and one of the steps requires me to calculate the mean of each class, where I have two classes. In my dataframe, each consecutive block of 50 rows holds the features that belong to one class, so I'd have rows 0 to 50 for class A, 51 to 100 for class B, 101-150 for class A, 151-200 for class B, and so on.
What I want to do is calculate the mean over every nth block of rows, from N to M. Here's a link to the dataframe for better visualization of the problem: Dataframe pickle file
What I tried was ordering the dataframe and calculating the means separately, but it didn't work: it calculated the mean for every row and grouped them into 14 different classes.
class_feature_means = pd.DataFrame(columns=target_names)
for c, rows in df.groupby('class'):
    class_feature_means[c] = rows.mean()
class_feature_means
Minimal reproducible example:
import string
import numpy as np
import pandas as pd

my_array = np.asarray([[31, 25, 17, 62],
[31, 26, 19, 59,],
[31, 23, 17, 67,],
[31, 23, 19, 67,],
[31, 28, 17, 65,],
[32, 26, 19, 62,],
[32, 26, 17, 66,],
[30, 24, 17, 68],
[29, 24, 17, 68],
[33, 24, 17, 68],
[32, 52, 16, 68],
[29, 24, 17, 68],
[33, 24, 17, 68],
[32, 52, 16, 68],
[29, 24, 17, 68],
[33, 24, 17, 68],
[32, 52, 16, 68],
[30, 25, 16, 97]])
my_array = my_array.reshape(18, 4)
indices = sorted(list(range(0,int(my_array.shape[0]/3)))*3)
class_dict = dict(zip(range(0,int((my_array.shape[0]/3))), string.ascii_uppercase))
target_names = ["Index_" + c for c in class_dict.values()]
pixel_index = [1, 2, 3, 4]
X = pd.DataFrame(my_array, columns= pixel_index)
y = pd.Categorical.from_codes(indices,target_names)
df = X.join(pd.Series(y,name='class'))
df
Basically, what I want to do is group classes A, C, E into one class, take their sum and divide by 3, thereby obtaining the mean value for class A, or let's call it class 0.
Then group classes B, D, F into one class, take their sum and divide by 3, thereby obtaining the mean value for class B, or class 1.
Create a helper array with integer division and modulo for the groups, pass it to groupby to aggregate the sum, and finally divide:
N = 3
arr = np.arange(len(df)) // N % 2
print (arr)
[0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1]
df = df.groupby(arr).sum() / N
print (df)
           1          2          3           4
0  92.666667  82.666667  51.333333  198.000000
1  94.333333  92.666667  51.333333  210.333333
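To see why the helper array alternates in blocks, the same labels can be reproduced in plain Python. This is only an illustration of the arithmetic (the names n_rows and labels are made up here): i // N numbers the consecutive blocks of N rows, and % 2 folds those block numbers onto the two classes.

```python
N = 3        # rows per block, as in the answer above
n_rows = 18  # number of rows in the example dataframe

# i // N gives the block number (0, 0, 0, 1, 1, 1, 2, ...);
# % 2 maps even blocks to class 0 and odd blocks to class 1.
labels = [i // N % 2 for i in range(n_rows)]
print(labels)
# [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
```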
x=[[80,59,34,89],[31,11,47,64],[29,56,13,91],[55,61,48,0],[75,78,81,91]]
I want to find the maximum, minimum and average values of the above 2D array.
You can use the numpy module to find the min and max values easily:
import numpy as np
x = np.array([[80, 59, 34, 89], [31, 11, 47, 64], [29, 56, 13, 91], [55, 61, 48, 0], [75, 78, 81, 91]])
minValue = np.min(x)
maxValue = np.max(x)
print(minValue)
print(maxValue)
If you need to find them without built-in methods, you can use an approach as follows:
x = [[80, 59, 34, 89], [31, 11, 47, 64], [29, 56, 13, 91], [55, 61, 48, 0], [75, 78, 81, 91]]
minValue = x[0][0]
maxValue = x[0][0]
sumAll = 0
count = 0
for inner in x:
    for each in inner:
        if each > maxValue: maxValue = each
        if each < minValue: minValue = each
        sumAll += each
        count += 1
average = sumAll / count
In this approach, you compare each value against the current min and max, and at the same time accumulate the sum and the element count to calculate the average.
You can get the maximum, minimum and average of a 2D array using map, like this:
def Average(lst):
    return sum(lst) / len(lst)
x=[[80,59,34,89],[31,11,47,64],[29,56,13,91],[55,61,48,0],[75,78,81,91]]
maximum = max(map(max, x))  # 91
minimum = min(map(min, x))  # 0
average = Average(list(map(lambda idx: sum(idx)/float(len(idx)), x)))  # 54.65
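One caveat worth illustrating: averaging the row averages matches the overall average only when every row has the same length, as in the array above. The small sketch below uses made-up ragged data (the names ragged, mean_of_row_means and overall_mean are hypothetical) to show the two values diverging.

```python
def average(lst):
    return sum(lst) / len(lst)


ragged = [[1, 2, 3], [10]]  # made-up data with rows of different lengths

# Average of the per-row averages: (2.0 + 10.0) / 2
mean_of_row_means = average([average(row) for row in ragged])

# Average over every element: (1 + 2 + 3 + 10) / 4
overall_mean = sum(sum(row) for row in ragged) / sum(len(row) for row in ragged)

print(mean_of_row_means)  # 6.0
print(overall_mean)       # 4.0
```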
You can use numpy to flatten the 2D array into a 1D array.
import numpy as np
x=[[80,59,34,89],[31,11,47,64],[29,56,13,91],[55,61,48,0],[75,78,81,91]]
x = np.array(x)
print(max(x.flatten()))
print(min(x.flatten()))
print(sum(x.flatten())/ len(x.flatten()))
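If numpy is not available, the same flatten-then-aggregate idea can be sketched with the standard library. This is an alternative to the numpy call above, not a replacement for it; itertools.chain.from_iterable does the flattening and statistics.mean computes the average.

```python
from itertools import chain
from statistics import mean

x = [[80, 59, 34, 89], [31, 11, 47, 64], [29, 56, 13, 91], [55, 61, 48, 0], [75, 78, 81, 91]]

# Flatten the nested list into a single list of 20 values.
flat = list(chain.from_iterable(x))

print(max(flat))   # 91
print(min(flat))   # 0
print(mean(flat))  # 54.65
```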
I'm trying to find the number that I'm looking for in a 2D list. However, it has to be sorted first before searching.
Everything seems to be working fine when I'm trying to find a number in the 2D array. The only problem is sorting the 2D array in a way that keeps the search working. Let's assume I want to sort a 3x3 2D array that is displayed as:
[[8, 27, 6],
 [1, 0, 11],
 [10, 9, 3]]
Then, I will be looking for a number by using the binary search method on the sorted 2D array. My mid value will be in the middle of the array during the search.
This is just an example, but it is what I want to accomplish when I fill the array with randomized numbers and then sort the rows and columns. Using this idea, I'm using the random.randint() function from Python to randomize my numbers. Then I'm trying to sort my 2D array afterward, but it isn't really being sorted before continuing.
import random

n = 5
m = 5
def findnum_arr(array, num):
    low = 0
    high = n * m - 1
    while (high >= low):
        mid = (low + high) // 2
        i = mid // m
        j = mid % m
        if (num == array[i][j]):
            return True
        if (num < array[i][j]):
            high = mid - 1
        else:
            low = mid + 1
    return False
if __name__ == '__main__':
    multi_array = [[random.randint(0, 20) for x in range(n)] for y in range(m)]
    sorted(multi_array)
Sorted:
[[0, 1, 3],
 [6, 8, 9],
 [10, 11, 27]]
This should be the sorted 2D array. Is it possible to have both the rows and the columns sorted with the sorted function?
Calling sorted on a nested list just orders the inner lists by comparing them element by element, starting with the first element; it does not sort the values inside each inner list.
Example:
arr = [[8, 27, 6],[1, 0, 11],[10, 15, 3], [16, 12, 14], [4, 9, 13]]
Calling sorted(arr) on it is going to return
[[1, 0, 11], [4, 9, 13], [8, 27, 6], [10, 15, 3], [16, 12, 14]]
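A small sketch of that comparison rule, using made-up rows (the name rows is hypothetical): inner lists are ordered by their first element, ties are broken by the following elements, and the contents of each inner list are left as they were.

```python
rows = [[1, 5], [1, 2], [0, 9]]  # made-up rows; two share the same first element

# Lists compare element by element, so [1, 2] sorts before [1, 5].
print(sorted(rows))  # [[0, 9], [1, 2], [1, 5]]

# The values inside each inner list are not reordered.
print(sorted([[8, 27, 6], [1, 0, 11]]))  # [[1, 0, 11], [8, 27, 6]]
```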
To sort it the way you want, you are going to have to flatten, sort, and then reshape.
To do this, I would try introducing numpy.
import numpy as np
a = np.array(sorted(sum(arr, [])))
#sorted(sum(arr, [])) flattens the list
b = np.reshape(a, (-1,3)).tolist()
EDITED FOR CLARITY: You can use your m and n as parameters in np.reshape. The first parameter (m) sets the number of arrays (rows), while the second (n) sets the number of elements in each array.
Using -1 for either parameter means that dimension is inferred from the total number of elements and the other parameter.
b would return
[[0, 1, 3], [4, 6, 8], [9, 10, 11], [12, 13, 14], [15, 16, 27]]
I finally found a proper solution without using numpy and avoiding the sum() function.
if __name__ == '__main__':
    x = 7
    multi_array = [[random.randint(0, 200) for x in range(n)] for y in range(m)]
    # one_array = sorted(list(itertools.chain.from_iterable(multi_array))) Another way if you are using itertools
    one_array = sorted([value for row in multi_array for value in row])
    # Rows have length m, matching the i = mid // m, j = mid % m indexing in findnum_arr.
    sorted_2d = [one_array[i:i+m] for i in range(0, len(one_array), m)]
    print("multi_array list is: \n{0}\n".format(multi_array))
    print("sorted 2D array: \n{0}\n".format(sorted_2d))
    if not findnum_arr(sorted_2d, x):
        print("Not Found")
    else:
        print("Found")
output:
multi_array list is:
[[40, 107, 23, 27, 42], [150, 84, 108, 191, 172], [154, 22, 161, 26, 31], [18, 150, 197, 77, 191], [96, 124, 81, 125, 186]]
sorted 2D array:
[[18, 22, 23, 26, 27], [31, 40, 42, 77, 81], [84, 96, 107, 108, 124], [125, 150, 150, 154, 161], [172, 186, 191, 191, 197]]
Not Found
I wanted to find a standard library module with which I could flatten the 2D array into 1D and sort it. Then I would use a list comprehension on my 1D array to build it back into a 2D array. This sounds like a lot of work, but it seems to work fine. Let me know if there is a better and faster way to do it without numpy :)
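One possible simplification, offered as a sketch rather than a definitive answer: once the values are flattened and sorted, the standard library's bisect module can search the flat list directly, so the 2D index arithmetic is not needed for the lookup. The helper name find_in_sorted_flat and the sample data are assumptions made for illustration.

```python
from bisect import bisect_left


def find_in_sorted_flat(flat, num):
    """Binary search in an already sorted flat list."""
    i = bisect_left(flat, num)
    return i < len(flat) and flat[i] == num


multi_array = [[8, 27, 6], [1, 0, 11], [10, 9, 3]]  # sample data from the question

# Flatten and sort in one step.
flat = sorted(v for row in multi_array for v in row)

# Rebuild rows of the original width, only needed for display.
cols = len(multi_array[0])
sorted_2d = [flat[i:i + cols] for i in range(0, len(flat), cols)]

print(sorted_2d)                     # [[0, 1, 3], [6, 8, 9], [10, 11, 27]]
print(find_in_sorted_flat(flat, 9))  # True
print(find_in_sorted_flat(flat, 7))  # False
```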
I have a matrix M:
M = [[10, 1000],
[11, 200],
[15, 800],
[20, 5000],
[28, 100],
[32, 3000],
[35, 3500],
[38, 100],
[50, 5000],
[51, 100],
[55, 2000],
[58, 3000],
[66, 4000],
[90, 5000]]
And a matrix R:
[[10 20]
 [32 35]
 [50 66]
 [90 90]]
I want to use the values in column 0 of matrix R as start value of a slice and the value in column 1 as end of a slice.
I want to calculate the sum between and including the ranges of these slices from the right column in matrix M.
Basically doing
M[0:4][:,1].sum() # Upper index +1 as I need upper bound including
M[5:7][:,1].sum() # Upper index +1 as I need upper bound including
and so on. 0 is the index of 10 and 3 is the index of 20. 5 would be the index of 32, 6 the index of 35.
I'm stuck on how to turn the start/end values from matrix R into indices into column 0 of matrix M, and then calculate the sum over each index range, including the upper and lower bounds.
Expected output:
[[10, 20, 7000], # 7000 = 1000+200+800+5000
[32, 35, 6500], # 6500 = 3000+3500
[50, 66, 14100], # 14100 = 5000+100+2000+3000+4000
[90, 90, 5000]] # 5000 = just 5000 as upper=lower boundary
Update: I can get the indices now using searchsorted. Now I just need to sum column 1 of matrix M between each start and end index.
start_indices = [0,5,8,13]
end_indices = [3,6,12,13]
Wondering if there is a more efficient way than applying a for loop?
EDIT: Found the answer here. Numpy sum of values in subarrays between pairs of indices
Use searchsorted to determine the correct indices and add.reduceat to perform the summation:
>>> idx = M[:, 0].searchsorted(R) + (0, 1)
>>> idx = idx.ravel()[:-1] if idx[-1, 1] == M.shape[0] else idx.ravel()
>>> result = np.add.reduceat(M[:, 1], idx)[::2]
>>> result
array([ 7000, 6500, 14100, 5000])
Details:
Since you want to include the upper boundaries but Python slicing excludes them, we have to add 1.
reduceat cannot handle len(arg0) as an index, so we have to special-case that.
reduceat computes all stretches between consecutive boundaries, so we have to discard every other one.
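For comparison, the same inclusive range sums can be sketched without numpy. Note that this swaps in a different technique from the answer above: bisect from the standard library locates the boundaries, and a prefix-sum list replaces add.reduceat. The names keys, prefix and result are made up for this sketch, and it assumes column 0 of M is sorted.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate

M = [[10, 1000], [11, 200], [15, 800], [20, 5000], [28, 100], [32, 3000],
     [35, 3500], [38, 100], [50, 5000], [51, 100], [55, 2000], [58, 3000],
     [66, 4000], [90, 5000]]
R = [[10, 20], [32, 35], [50, 66], [90, 90]]

keys = [row[0] for row in M]  # column 0, assumed sorted
# prefix[k] holds the sum of the first k values of column 1.
prefix = [0] + list(accumulate(row[1] for row in M))

result = []
for start, end in R:
    lo = bisect_left(keys, start)  # first index with key >= start
    hi = bisect_right(keys, end)   # one past the last index with key <= end
    result.append([start, end, prefix[hi] - prefix[lo]])

print(result)
# [[10, 20, 7000], [32, 35, 6500], [50, 66, 14100], [90, 90, 5000]]
```

Using bisect_right for the upper bound makes the range inclusive without a separate + 1 step.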
I think it would be better to show an example of the output you are expecting. If what you want to calculate using M[0:4][:,1].sum() is the sum of 1000 + 200 + 800 + 5000, then this code might help:
import numpy as np
M = np.matrix([[10, 1000],
[11, 200],
[15, 800],
[20, 5000],
[28, 100],
[32, 3000],
[35, 3500],
[38, 100],
[50, 5000],
[51, 100],
[55, 2000],
[58, 3000],
[66, 4000],
[90, 5000]])
print(M[0:4][:,1].sum())
I am trying to change the second item of each entry in my two-dimensional list.
I have tried a few dozen different ways and can't seem to get it to append.
def main():
    values = [[10,0], [13, 0], [36, 0], [74,0], [22,0]]
    user = int(input('Enter a whole number'))
    for i in range(len(values)):
        print(values[i])
(current output)
10, 0
13, 0
36, 0
74, 0
22, 0
(desired output, where the second item is the first item + user input)
[10, 12]
[13, 15]
[36, 38]
[74, 76]
[22, 24]
With a list comprehension:
>>> user = 2
>>> [[x[0], sum(x)+user] for x in values]
[[10, 12], [13, 15], [36, 38], [74, 76], [22, 24]]
or using map (wrapped in list() because map returns an iterator in Python 3):
list(map(lambda x: [x[0], sum(x)+user], values))
First, you can nearly always avoid iterating over range(len(iterable)) - in this case, your loop can be written as the much nicer:
for value in values:
    print(value)
for exactly the same functionality.
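When the index really is needed, for example to replace each inner list, enumerate covers that case without range(len(...)). This is a minimal sketch; the fixed value user = 2 stands in for the number typed by the user.

```python
values = [[10, 0], [13, 0], [36, 0], [74, 0], [22, 0]]
user = 2  # stand-in for the number typed by the user

# enumerate yields (index, item) pairs, so the index comes for free.
for i, value in enumerate(values):
    values[i] = [value[0], value[0] + user]

print(values)
# [[10, 12], [13, 15], [36, 38], [74, 76], [22, 24]]
```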
I'm not sure from your description exactly how you want the code to behave, but it seems like you want something like this: each line of output will have the first item of the corresponding value, and then that item added to the user input, i.e. ignoring the second item of the existing input entirely:
for value in values:
    total = value[0] + user
    print((value[0], total))
or if you want it to overwrite the second item of each value for later use in your program:
values = [[10,0], [13, 0], [36, 0], [74,0], [22,0]]
for value in values:
    value[1] = value[0] + user
    print(value)
Shouldn't it be just like this?
>>> def f():
        values = [[10,0], [13, 0], [36, 0], [74,0], [22,0]]
        user = int(input('Enter a whole number'))
        for i in range(len(values)):
            values[i][1]=values[i][0]+user
            print(values[i])
>>> f()
Enter a whole number2
[10, 12]
[13, 15]
[36, 38]
[74, 76]
[22, 24]