Finding maximum per row when condition is met - python

I'm having the following relatively simple problem. I have two arrays storing x and y coordinates per timestep, e.g.
x = [[0, 1, 2, 3], [0.1, 1.1, 2.1, 3.1]]
y = [[0.5, 0.5, 0.5, 0.5], [0.51, 0.52, 0.49, 0.53]]
in which 2 timesteps are represented (2 rows). What I would like is to find the maximum y coordinate per row when the condition x >= 1 and x <= 2.5 is met.
How can I define a function which returns an array of 2 columns with just the max(y) per row when the spatial x condition is met?
I've tried np.where without luck. The result the function should return is:
[0.5, 0.52]

You can use NumPy's masked arrays via np.ma.masked_where. It masks the elements where the condition is True, so the condition you pass has to be the inverse of the one you want to keep.
import numpy as np
x = [[0, 1, 2, 3], [0.1, 1.1, 2.1, 3.1]]
y = [[0.5, 0.5, 0.5, 0.5], [0.51, 0.52, 0.49, 0.53]]
x = np.array(x)
y = np.array(y)
y_masked = np.ma.masked_where((x>2.5) | (x<1), y)
result = np.max(y_masked, axis = 1)
print(result)
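Since the question asks for a function, here is a minimal sketch of the same idea wrapped up as one. The name max_y_in_window and its lo/hi parameters are made up for illustration, and it swaps the masked array for np.where with -inf as a filler, so a row with no x inside the range comes back as -inf rather than as a masked value.

```python
import numpy as np

def max_y_in_window(x, y, lo=1.0, hi=2.5):
    """Row-wise max of y over the entries where lo <= x <= hi (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = (x >= lo) & (x <= hi)
    # Entries outside the window are replaced by -inf so they never win the max.
    return np.where(inside, y, -np.inf).max(axis=1)

x = [[0, 1, 2, 3], [0.1, 1.1, 2.1, 3.1]]
y = [[0.5, 0.5, 0.5, 0.5], [0.51, 0.52, 0.49, 0.53]]
result = max_y_in_window(x, y)
print(result)
```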

Not very pretty, but using pure Python (no numpy) you could combine zip, filter, and max:
>>> x = [[0,1,2,3],[0.1,1.1,2.1,3.1]]
>>> y = [[0.5,0.5,0.5,0.5],[0.51,0.52,0.49,0.53]]
>>> [max(filter(lambda t: 1.0 <= t[0] <= 2.5, zip(rx, ry)), key=lambda t: t[1])[1]
... for rx, ry in zip(x, y)]
...
[0.5, 0.52]
Or a bit shorter, using a list comprehension to filter and reverse order of the tuple so max can use natural ordering:
>>> [max((y, x) for (x, y) in zip(rx, ry) if 1.0 <= x <= 2.5)[0]
... for rx, ry in zip(x, y)]
...
[0.5, 0.52]

As you were suggesting a Numpy solution:
import numpy as np
x = np.array([[0, 1, 2, 3], [0.1, 1.1, 2.1, 3.1]])
y = np.array([[0.5, 0.5, 0.5, 0.5], [0.51, 0.52, 0.49, 0.53]])
print([np.max(y[i][(x[i] >= 1) & (x[i] <= 2.5)]) for i in range(len(x))])
gives
[0.5, 0.52]

Related

Solving equation with for loops python

I have arrays like this:
x = np.array([-1,-1,-1,1,-1,-1])
weight = np.array([[0.5,-0.5,0.5,-0.5,0.5,-0.5],
[-0.5,0.5,-0.5,0.5,-0.5,0.5],
[0.5,0.5,0.5,0.5,0.5,0.5]])
print(weight.shape)
bias=np.array([2, 2, 2])
print(bias)
weight = np.transpose(weight)
weight
Running the above code gives the following arrays bias, weight and x:
bias = [2 2 2]
weight = array([[ 0.5, -0.5, 0.5],
[-0.5, 0.5, 0.5],
[ 0.5, -0.5, 0.5],
[-0.5, 0.5, 0.5],
[ 0.5, -0.5, 0.5],
[-0.5, 0.5, 0.5]])
x = array([-1, -1, -1, 1, -1, -1])
Now I want to calculate this equation:
The y_in array should be like this:
y_in = np.zeros((1, len(bias)))
What I don't understand is how to compute that equation with a for loop, since I'm not really familiar with how to write for loops.
If you didn't understand the equation, see the example below:
I don't understand why you are required to use loops when you are already working with numpy, however the correct way would be:
>>> np.add(bias, np.dot(x[None, :], weight)).flatten()
array([1., 3., 0.])
But if you want loops:
y = []
for index_1, b in enumerate(bias):
    sum_ = b
    for index_2, val in enumerate(x):
        sum_ += x[index_2] * weight[index_2, index_1]
    y.append(sum_)
>>> y
[1.0, 3.0, 0.0]
# OR
>>> [b + sum(x_val * w for x_val, w in zip(x, weight[:,i])) for i, b in enumerate(bias)]
[1.0, 3.0, 0.0]
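If you want the result stored in the y_in array from the question, the same double loop can write into it directly. This is only a sketch: it assumes the equation is y_in[j] = bias[j] + sum over i of x[i] * weight[i, j], which is what the answers here compute, and it uses the already transposed weight of shape (6, 3). The name vectorized is just for the comparison.

```python
import numpy as np

x = np.array([-1, -1, -1, 1, -1, -1])
weight = np.array([[0.5, -0.5, 0.5, -0.5, 0.5, -0.5],
                   [-0.5, 0.5, -0.5, 0.5, -0.5, 0.5],
                   [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]]).T  # shape (6, 3)
bias = np.array([2, 2, 2])

y_in = np.zeros((1, len(bias)))
for j in range(len(bias)):          # one output per bias entry
    total = bias[j]
    for i in range(len(x)):         # accumulate x[i] * weight[i, j]
        total += x[i] * weight[i, j]
    y_in[0, j] = total

# The loop-free equivalent, for comparison.
vectorized = bias + x @ weight
print(y_in, vectorized)
```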
Posting answer for your screenshot problem. You can use the same code for your original problem:
x = np.array([1,1,-1,-1])
weight = np.array([[0.5,-0.5,-0.5,-0.5],
[-0.5,-0.5,-0.5,0.5],
])
bias=np.array([2, 2])
weight = np.transpose(weight)
One Liner:
np.add(bias, np.dot(weight.T, x))
Using Loop:
y_arr = []
for j in range(weight.shape[1]):
    y = (bias[j] + np.dot(weight[:,j].T, x))
    y_arr.append(y)
y_arr = np.array(y_arr)
y_arr:
array([3., 1.])

PyTorch: index 2D tensor with 2D tensor of row indices

I have a torch tensor a of shape (x, n) and another tensor b of shape (y, n) where y <= x. Every column of b contains a sequence of row indices for a, and what I would like to do is index a with b so that I obtain a tensor of shape (y, n) in which the ith column contains a[:, i][b[:, i]] (not quite sure if that's the correct way to express it).
Here's an example (where x = 5, y = 3 and n = 4):
import torch
a = torch.Tensor(
[[0.1, 0.2, 0.3, 0.4],
[0.6, 0.7, 0.8, 0.9],
[1.1, 1.2, 1.3, 1.4],
[1.6, 1.7, 1.8, 1.9],
[2.1, 2.2, 2.3, 2.4]]
)
b = torch.LongTensor(
[[0, 3, 1, 2],
[2, 2, 2, 0],
[1, 1, 0, 4]]
)
# How do I get from a and b to c
# (so that I can also assign to those elements in a)?
c = torch.Tensor(
[[0.1, 1.7, 0.8, 1.4],
[1.1, 1.2, 1.3, 0.4],
[0.6, 0.7, 0.3, 2.4]]
)
I can't get my head around this. What I'm looking for is a method that will not only yield the tensor c but also let me assign a tensor of the same shape as c to the elements of a which c is made up of.
I tried index_select, but it only supports a 1-D index tensor, so here it is applied one column at a time:
bt = b.transpose(0, 1)
at = a.transpose(0, 1)
ct = [torch.index_select(at[i], dim=0, index=bt[i]) for i in range(len(at))]
c = torch.stack(ct).transpose(0, 1)
print(c)
"""
tensor([[0.1000, 1.7000, 0.8000, 1.4000],
[1.1000, 1.2000, 1.3000, 0.4000],
[0.6000, 0.7000, 0.3000, 2.4000]])
"""
It might not be the best solution, but I hope it helps at least.
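For what it's worth, torch.gather along dim 0 seems to express this lookup directly, and scatter covers the assignment part of the question. The following is a sketch, not a tested drop-in: new_values and a_updated are illustrative names, and it relies on each column of b holding distinct row indices (true in the example), since scatter with duplicate indices is not deterministic.

```python
import torch

a = torch.tensor([[0.1, 0.2, 0.3, 0.4],
                  [0.6, 0.7, 0.8, 0.9],
                  [1.1, 1.2, 1.3, 1.4],
                  [1.6, 1.7, 1.8, 1.9],
                  [2.1, 2.2, 2.3, 2.4]])
b = torch.tensor([[0, 3, 1, 2],
                  [2, 2, 2, 0],
                  [1, 1, 0, 4]])  # int64 row indices, one sequence per column

# Reading: c[i, j] = a[b[i, j], j]
c = torch.gather(a, 0, b)

# Writing: a_updated[b[i, j], j] = new_values[i, j]; other entries keep a's values.
new_values = torch.zeros_like(c)
a_updated = a.scatter(0, b, new_values)
print(c)
print(a_updated)
```

Use a.scatter_(0, b, new_values) instead if you want to modify a in place.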

Piecewise function in numpy with multiple arguments

I tried to define a function (tent map) as following:
def f(r, x):
    return np.piecewise([r, x], [x < 0.5, x >= 0.5], [lambda r, x: 2*r*x, lambda r, x: 2*r*(1-x)])
And r, x will be numpy arrays:
no_r = 10001
r = np.linspace(0, 4, no_r)
x = np.random.rand(no_r)
I would like the result to be a numpy array matching the shapes of r and x, calculated using each pair of elements of arrays r and x with the same indices. For example if r = [0, 1, 2, 3] and x = [0.1, 0.7, 0.3, 1], the result should be [0, 0.6, 1.2, 0].
An error occurred: "boolean index did not match indexed array along dimension 0; dimension is 2 but corresponding boolean dimension is 10001"
So what should I do to achieve the intended purpose?
What you want to get as a result can be done with np.select, such as:
def f(r, x):
    return np.select([x < 0.5,x >= 0.5], [2*r*x, 2*r*(1-x)])
Then with
r = np.array([0, 1, 2, 3])
x = np.array([0.1, 0.7, 0.3, 1])
print (f(r,x))
[0. 0.6 1.2 0. ]
EDIT: in this case, with only 2 conditions that are exclusive, you can also use np.where:
def f(r,x):
    return np.where(x<0.5,2*r*x, 2*r*(1-x))
will give the same result.
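As a quick sanity check, the two versions can be compared on arrays of the size used in the question. This is just a sketch: the names tent_select and tent_where are made up here, and the seeded generator is only there to make the run repeatable. Note that both variants evaluate both branches for every element and then pick one, unlike np.piecewise.

```python
import numpy as np

def tent_select(r, x):
    # np.select picks, per element, the first choice whose condition holds.
    return np.select([x < 0.5, x >= 0.5], [2 * r * x, 2 * r * (1 - x)])

def tent_where(r, x):
    # With two exclusive conditions, np.where is equivalent.
    return np.where(x < 0.5, 2 * r * x, 2 * r * (1 - x))

no_r = 10001
r = np.linspace(0, 4, no_r)
x = np.random.default_rng(0).random(no_r)

same = np.array_equal(tent_select(r, x), tent_where(r, x))
print(same, tent_select(r, x).shape)
```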

How do I get frequency of each value and sum of the frequency if it is less or equal to the value in the list?

sample = [1,2,2,4,4,3,4,3,4,3]
depot_1=[]
depot_2=[]
tempdepot_1=[]
tempdepot_2 = []
def sample_calc (sample):
    for item_1 in sorted(sample):
        depot_1.append(item_1)
        tempdepot_1.append(sample.count(item_1))
    for item_2 in tempdepot_1:
        depot_2.append(item_2/len(sample))
    tempdepot_3=[ sum( depot_2[:x] ) for x in range( 1, len(depot_2)+1 ) ]
    print(depot_1)
    print(tempdepot_1)
    print(depot_2)
    print(tempdepot_3)
sample_calc (sample)
I am trying to get two lists: one is the sorted original list, and the second holds, for each value in the sorted list, the sum of the frequencies of all values less than or equal to it.
Desired output:
depot_1 = [1,2,2,3,3,3,4,4,4,4]
tempdepot_3 = [0.1, 0.3, 0.3, 0.6, 0.6, 0.6, 1.0, 1.0, 1.0, 1.0]
Could you help with [tempdepot_3] list? (without libraries)
Another, not so elegant, way of doing it without the collections library is:
sample = [1,2,2,4,4,3,4,3,4,3]
depot_1 = sorted(sample)
print(depot_1)
tempdepot_3 = []
freq = depot_1.count(depot_1[0])/len(depot_1)
tempdepot_3.append(freq)
for i in range(1,len(depot_1)):
    freq = depot_1.count(depot_1[i])/len(depot_1)
    if depot_1[i]==depot_1[i-1]:
        tempdepot_3.append(tempdepot_3[i-1])
    else:
        tempdepot_3.append(round(freq + tempdepot_3[i-1],1))
print(tempdepot_3)
The output is:
[1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
[0.1, 0.3, 0.3, 0.6, 0.6, 0.6, 1.0, 1.0, 1.0, 1.0]
However it is best for you to use Python's standard libraries to improve your code's performance.
So, just get a counter (I'm using collections, though it is simple enough to do yourself with a plain dict). Then I build a mapping of cumulative proportions and use that mapping to build the final list:
>>> import collections
>>> sample = [1,2,2,4,4,3,4,3,4,3]
>>> N = len(sample)
>>> depot = sorted(sample)
>>> counts = collections.Counter(depot)
>>> counts
Counter({4: 4, 3: 3, 2: 2, 1: 1})
>>> p = 0
>>> props = {}
>>> for k, v in sorted(counts.items()): # Necessary to sort, dicts are unordered
...     p += v
...     props[k] = p / N
...
>>> props
{1: 0.1, 2: 0.3, 3: 0.6, 4: 1.0}
Finally:
>>> tempdepot = [props[x] for x in depot]
>>> tempdepot
[0.1, 0.3, 0.3, 0.6, 0.6, 0.6, 1.0, 1.0, 1.0, 1.0]
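Since the question asks for a version without libraries, the same approach can be sketched with a plain dict in place of collections.Counter. The function name cumulative_proportions is made up for illustration, and the division assumes Python 3.

```python
def cumulative_proportions(sample):
    """Return the sorted sample and, per element, the share of values <= it."""
    ordered = sorted(sample)
    n = len(ordered)

    # Count occurrences with a plain dict.
    counts = {}
    for value in ordered:
        counts[value] = counts.get(value, 0) + 1

    # Walk the distinct values in ascending order, keeping a running total.
    running = 0
    cumulative = {}
    for value in sorted(counts):
        running += counts[value]
        cumulative[value] = running / n

    return ordered, [cumulative[value] for value in ordered]

depot_1, tempdepot_3 = cumulative_proportions([1, 2, 2, 4, 4, 3, 4, 3, 4, 3])
print(depot_1)
print(tempdepot_3)
```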

python; counting elements of vectors

I would like to count and save in a vector a the number of elements of an array that are greater than a certain value t. I want to do this for different ts.
For example:
My vector: c = [0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
I would like to count the number of elements of c that are greater than t=0.9, then t=0.8, then t=0.7, etc. I then want to save the counts for each value of t in a vector.
My code (not working) is:
for t in range(0,10,1):
    for j in range(0, len(c)):
        if c[j]>t/10:
            a.append(sum(c[j]>t))
my vector a should be of dimension 10, but it isn't!
Can anybody help me out?
I made a function that loops over the array and just counts whenever the value is greater than the supplied threshold
c=[0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
def num_bigger(threshold):
    count = 0
    for num in c:
        if num > threshold:
            count +=1
    return count
thresholds = [x/10.0 for x in range(10)]
for thresh in thresholds:
    print(thresh, num_bigger(thresh))
Note that the function checks for strictly greater, which is why, for example, the result is 0 when the threshold is .9.
There are few things wrong with your code.
my vector a should be of dimension 10, but it isn't!
That's because you don't append only 10 elements in your list. Look at your logic.
for t in range(0,10,1):
    for j in range(0, len(c)):
        if c[j]>t/10:
            a.append(sum(c[j]>t))
For each threshold, t, you iterate over all 12 items in c one at a time and append something for every item that passes the test, so you can end up with as many as 120 items instead of 10. What you should have been doing instead is (in pseudocode):
for each threshold:
    count = how many elements in c are greater than threshold
    a.append(count)
numpy.where() gives you the indices in an array where a condition is satisfied, so you just have to count how many indices you get each time. We'll get to the full solution in a moment.
Another potential error is t/10, which in Python 2 is integer division and returns 0 for every threshold. The fix is to force float division with t/10.0. If you're on Python 3, though, you get float division by default, so this might not be a problem. Notice, though, that inside sum() you compare c[j] > t, where t runs over the integers 0 to 9 rather than the thresholds 0.0 to 0.9. Overall, your c[j] > t logic is wrong. You want to use a counter for all elements, like other answers have shown you, or collapse it all down to a one-liner list comprehension.
Finally, here's a solution fully utilising numpy.
import numpy as np
c = np.array([0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1])
thresh = np.arange(0, 1, 0.1)
counts = np.empty(thresh.shape, dtype=int)
for i, t in enumerate(thresh):
    counts[i] = len(np.where(c > t)[0])
print(counts)
Output:
[12 10 8 5 5 3 2 1 1 0]
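The remaining Python-level loop over the thresholds can also be dropped by comparing every threshold against every element at once through broadcasting. This is a sketch under one assumption that differs from the code above: the thresholds are built as np.arange(10) / 10.0 so that each one is the closest float to 0.0, 0.1, ..., 0.9. It also materialises a 10 x 12 boolean array, which is fine here but grows with both sizes.

```python
import numpy as np

c = np.array([0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1])
thresh = np.arange(10) / 10.0

# Shape (10, 1) against shape (1, 12) broadcasts to a (10, 12) boolean table;
# summing along axis 1 counts the True entries per threshold.
counts = (c[np.newaxis, :] > thresh[:, np.newaxis]).sum(axis=1)
print(counts)
```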
Letting numpy take care of the loops under the hood is faster than Python-level loops. For demonstration:
import timeit
head = """
import numpy as np
c = np.array([0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1])
thresh = np.arange(0, 1, 0.1)
"""
numpy_where = """
for t in thresh:
    len(np.where(c > t)[0])
"""
python_loop = """
for t in thresh:
    len([element for element in c if element > t])
"""
n = 10000
for test in [numpy_where, python_loop]:
    print(timeit.timeit(test, setup=head, number=n))
Which on my computer results in the following timings.
0.231292377372
0.321743753994
Your problem is here:
if c[j]>t/10:
Notice that both t and 10 are integers, so in Python 2 you perform integer division.
The easiest solution with the fewest changes is to change it to:
if c[j]>float(t)/10:
to force float division.
So the whole code would look something like this:
a = []
c = [0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
for i in range(10):  # cutoffs 0.0, 0.1, ..., 0.9
    count = 0
    cutoff = float(i)/10
    for ele in c:
        if ele > cutoff:  # count the elements strictly greater than the cutoff
            count += 1
    a.append(count)
print(len(a)) # prints 10, one entry per cutoff from 0.0 to 0.9
print(a) # prints the counts for the cutoffs 0.0 through 0.9
You have to divide t / 10.0 so the result is a decimal; in Python 2 the result of t / 10 is an integer.
a = []
c=[0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
for t in range(0,10,1):
    count = 0
    for j in range(0, len(c)):
        if c[j]>t/10.0:
            count = count+1
    a.append(count)
for t in range(0,10,1):
    print(str(a[t]) + ' elements in c are bigger than ' + str(t/10.0))
Output:
12 elements in c are bigger than 0.0
10 elements in c are bigger than 0.1
8 elements in c are bigger than 0.2
5 elements in c are bigger than 0.3
5 elements in c are bigger than 0.4
3 elements in c are bigger than 0.5
2 elements in c are bigger than 0.6
1 elements in c are bigger than 0.7
1 elements in c are bigger than 0.8
0 elements in c are bigger than 0.9
If you simplify your code, bugs won't have places to hide!
c=[0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
a=[]
for t in [x/10 for x in range(10)]:
    a.append((t,len([x for x in c if x>t])))
a
[(0.0, 12),
(0.1, 10),
(0.2, 8),
(0.3, 5),
(0.4, 5),
(0.5, 3),
(0.6, 2),
(0.7, 1),
(0.8, 1),
(0.9, 0)]
or even this one-liner
[(r/10,len([x for x in c if x>r/10])) for r in range(10)]
It depends on the sizes of your arrays, but your current solution has O(m*n) complexity, m being the number of values to test and n the size of your array. You may be better off with O((m+n)*log(n)) by first sorting your array in O(n*log(n)) and then using binary search to find the m values in O(m*log(n)). Using numpy and your sample c list, this would be something like:
>>> c
[0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
>>> thresholds = np.linspace(0, 1, 10, endpoint=False)
>>> thresholds
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
>>> len(c) - np.sort(c).searchsorted(thresholds, side='right')
array([12, 10, 8, 5, 5, 3, 2, 1, 1, 0])
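The same sort-then-binary-search idea also works without numpy, using the standard library's bisect module. A minimal sketch, with sorted_c, thresholds and counts as illustrative names:

```python
from bisect import bisect_right

c = [0.3, 0.2, 0.3, 0.6, 0.9, 0.1, 0.2, 0.5, 0.3, 0.5, 0.7, 0.1]
sorted_c = sorted(c)                              # O(n log n), done once
thresholds = [t / 10.0 for t in range(10)]

# bisect_right returns how many elements are <= t, so the remainder
# is the number of elements strictly greater than t.
counts = [len(sorted_c) - bisect_right(sorted_c, t) for t in thresholds]
print(counts)
```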
