Get median value in each bin in a 2D grid - python

I have a 2-D array of coordinates, and each coordinate corresponds to a value z (like z=f(x,y)). I want to divide this whole 2-D coordinate set into, for example, 100 even bins and calculate the median value of z in each bin, then use the scipy.interpolate.griddata function to create an interpolated z surface. How can I achieve this in Python? I was thinking of using np.histogram2d, but I think there is no median function in it, and I am having a hard time understanding how scipy.stats.binned_statistic works. Can someone help me, please? Thanks.

With numpy.histogram2d you can both count the data points in each bin and sum their z values (via the weights argument), which lets you compute the average per bin. Note that this gives the mean, not the median.
I would try something like this:
import numpy as np
coo=np.array([np.arange(1000),np.arange(1000)]).T #your array coordinates
def func(x, y): return x*(1-x)*np.sin(np.pi*x) / (1.5+np.sin(2*np.pi*y**2)**2)
z = func(coo[:,0], coo[:,1])
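(n,ex,ey)=np.histogram2d(coo[:,0], coo[:,1],bins=100) # number of points in each bin
(tot,ex,ey)=np.histogram2d(coo[:,0], coo[:,1],bins=100,weights=z) # sum of z in each bin
with np.errstate(divide='ignore', invalid='ignore'): # silence the 0/0 warning for empty bins
    average=tot/n # mean of z per bin; empty bins give nan
average=np.nan_to_num(average) # cure 0/0: replace nan in empty bins with 0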
print(average)
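For the median that the question actually asks for, one option is scipy.stats.binned_statistic_2d with statistic='median'. The following is only a sketch under assumptions: the random coordinates, the test function z = x + y, the 10 x 10 binning (100 bins) and the 50 x 50 output grid are all made up for illustration. The per-bin medians are attached to the bin centres and then passed to scipy.interpolate.griddata.

```python
import numpy as np
from scipy import stats
from scipy.interpolate import griddata

# Hypothetical data: 5000 random points with z = f(x, y) = x + y
rng = np.random.default_rng(0)
x = rng.random(5000)
y = rng.random(5000)
z = x + y

# 10 x 10 = 100 even bins; med[i, j] is the median of z in x-bin i, y-bin j
med, ex, ey, _ = stats.binned_statistic_2d(x, y, z, statistic='median', bins=10)

# Bin centres, laid out so that XC[i, j], YC[i, j] match med[i, j]
xc = 0.5 * (ex[:-1] + ex[1:])
yc = 0.5 * (ey[:-1] + ey[1:])
XC, YC = np.meshgrid(xc, yc, indexing='ij')

# Empty bins come back as nan, so drop them before interpolating
valid = ~np.isnan(med)
points = np.column_stack([XC[valid], YC[valid]])
values = med[valid]

# Interpolate the medians onto a finer grid spanning the bin centres
gx, gy = np.mgrid[xc[0]:xc[-1]:50j, yc[0]:yc[-1]:50j]
surface = griddata(points, values, (gx, gy), method='linear')
print(med.shape, surface.shape)
```

griddata returns nan for grid points outside the convex hull of the bin centres, so the edges of the surface may need separate handling.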

You'll need one function or a few, depending on how you want to structure things:
A function to create the bins: it should take in your data, determine how big each bin is, and return an array or a list of arrays (one per bin).
Happy to help with this but would need more information about the data.
A function to get the median of the bins:
NumPy (which SciPy builds on) has a median function
http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.median.html
Essentially, the median of an array called
"bin"
would be:
numpy.median(bin)
Note: numpy.median accepts an axis argument, so if your bins all hold the same number of values and are stacked as the rows of a 2-D array called bins, numpy.median(bins, axis=1) returns an array with the median of each bin. Without axis, numpy.median(bins) returns a single median over all values.
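To make the behaviour of numpy.median on several bins concrete, here is a small sketch. The bin contents are invented for illustration. It contrasts the default call (one median over everything) with axis=1 (one median per row) and shows a plain loop for bins of unequal size.

```python
import numpy as np

# Hypothetical bins of equal size, stacked as rows of a 2-D array
bins = np.array([[1.0, 5.0, 3.0],
                 [10.0, 30.0, 20.0]])

overall = np.median(bins)          # single median over all six values
per_bin = np.median(bins, axis=1)  # one median per row (per bin)

# Bins of unequal size cannot be stacked, so take the median bin by bin
ragged = [np.array([1.0, 2.0]), np.array([4.0, 8.0, 6.0])]
per_bin_ragged = [float(np.median(b)) for b in ragged]

print(overall, per_bin, per_bin_ragged)
```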
Updated
Not 100% on your example code, so here goes:
import numpy as np
# added some parentheses as I wasn't sure of the math; also removed the ;'s
def bincalc(x, y):
    return x*(1-x)*(np.sin(np.pi*x))/(1.5+np.sin(2*(np.pi*y)**2)**2)
coo = np.random.rand(1000,2)  # 1000 random (x, y) pairs
z = bincalc(coo[:,0], coo[:,1])  # z value at every coordinate
z_med = np.median(z)  # median over all points (no binning yet)
print(z_med)

Related

How to extract specific parts of a numpy array?

I have a correlation function that looks like the following.
I want to extract only the main peak of the function into a separate array. The central peak has the form of a Gaussian. I want to separate the peak with a width around it of approximately four times the FWHM of the Gaussian peak. I have the correlation function stored in a numpy array. Any tips/ideas on how to approach this?
NumPy's argmax (Docs) function returns the index of the maximum value of a numpy array. With that index you can then take the values around it.
Example:
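import numpy
m = numpy.argmax(arr)  # index of the maximum value
values = arr[max(m-width, 0):m+width]  # clamp the start so a peak near the left edge does not wrap around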
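The width in the slice above still has to come from somewhere. One way to get it is sketched below, assuming a single dominant peak sitting on a background close to zero. The synthetic Gaussian is invented for illustration, and "four times the FWHM" is read here as the total window width. The idea is to walk outwards from the maximum until the signal falls below half its peak value, then cut a window of twice the FWHM on each side.

```python
import numpy as np

# Hypothetical data: a Gaussian peak (sigma = 20 samples) on a zero background
n = np.arange(1000)
arr = np.exp(-0.5 * ((n - 500) / 20.0) ** 2)

m = int(np.argmax(arr))   # index of the peak
half = arr[m] / 2.0       # half-maximum level

# Walk outwards from the peak until the signal drops below half maximum
left = m
while left > 0 and arr[left] >= half:
    left -= 1
right = m
while right < len(arr) - 1 and arr[right] >= half:
    right += 1

fwhm = right - left - 1   # number of samples at or above half maximum

# Window of about 4 * FWHM in total: 2 * FWHM on each side of the peak
half_window = 2 * fwhm
start = max(m - half_window, 0)
stop = min(m + half_window + 1, len(arr))
peak = arr[start:stop]
print(fwhm, peak.shape)
```

If the background is not near zero, the half-maximum level would need to be measured relative to the baseline instead.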

scipy interp1d extrapolation method

I am trying to extrapolate values from some endpoints as shown in the image below
extrapolated value illustration
I have tried using the scipy interp1d method as shown below
from scipy import interpolate
x = [1,2,3,4]
y = [0,1,2,0]
f = interpolate.interp1d(x,y,fill_value='extrapolate')
print(f(4.3))
Output: -0.5999999999999996
Though this is correct, I also need a second extrapolated value, which is the intersection of X on segment i=1. The estimated value I am expecting is ~3.3, as seen from the graph in the image above. But I need to get this programmatically; I am hoping there is a way of returning multiple values from interp1d(.....) or something similar. Any help will be much appreciated. Thanks in advance.
If you want to extrapolate based on all but the last pair of values, you can just build a second interpolator using x[:-1] and y[:-1].
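A minimal sketch of that suggestion, using the data from the question. The names f_all and f_head are made up here. The second interpolator ignores the last point, so it extends the segment through (2, 1) and (3, 2) instead of the final one.

```python
from scipy import interpolate

x = [1, 2, 3, 4]
y = [0, 1, 2, 0]

# Interpolator over all points: extrapolation continues the last segment
f_all = interpolate.interp1d(x, y, fill_value='extrapolate')

# Interpolator without the last point: extrapolation continues the
# segment through (2, 1) and (3, 2)
f_head = interpolate.interp1d(x[:-1], y[:-1], fill_value='extrapolate')

a = float(f_all(4.3))
b = float(f_head(4.3))
print(a, b)
```

Up to floating-point rounding, this gives -0.6 for the first value and 3.3 for the second, which matches the ~3.3 the question expects.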

Draw random selection of a numpy array following approximately a lognormal distribution

Suppose I have a grid of numbers in Python that I have created using
import numpy as np
h = np.linspace(0,20,100)
I am trying to make a random selection within the elements of h in a way that the distribution of the selections follows for example the log-normal distribution, with a given mean and standard deviation. How would I be able to do this?
It may be easier to just draw samples from a lognormal distribution directly (mean and sigma here are the parameters of the underlying normal distribution):
np.random.lognormal(mean=5,sigma=2,size=10)
This can be solved very fast. First, you have to find a way to draw random indices following your custom pdf. Once you have done this, you can use these indices (integers from 0 to 99, since h has 100 elements) to return the entries of the array at those positions.
There are a few ways to draw numbers randomly in this way in Python. Once you have drawn your random indices into an array called indices, you can use:
result = h[indices]
to create your desired numpy array.
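One possible way to draw such indices is sketched here, with assumptions: the parameters mu and sigma (of the underlying normal), the sample size and the seed are invented. The log-normal pdf is evaluated at every grid value, normalised into probabilities, and passed to the random generator's choice method. Because the grid is discrete, the result only approximates a log-normal distribution.

```python
import numpy as np

h = np.linspace(0, 20, 100)

# Hypothetical parameters of the underlying normal distribution
mu, sigma = 1.0, 0.5

# Log-normal pdf on the grid; it is zero at h = 0, so skip that point
pdf = np.zeros_like(h)
pos = h > 0
pdf[pos] = (np.exp(-(np.log(h[pos]) - mu) ** 2 / (2 * sigma ** 2))
            / (h[pos] * sigma * np.sqrt(2 * np.pi)))

# Normalise to probabilities and draw indices with those weights
p = pdf / pdf.sum()
rng = np.random.default_rng(0)
indices = rng.choice(len(h), size=1000, p=p)

result = h[indices]
print(result[:5])
```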

How to find peaks in 1d array

I am reading a csv file in python and preparing a dataframe out of it. I have a Microsoft Kinect which is recording Arm Abduction exercise and generating this CSV file.
I have this array of Y-Coordinates of ElbowLeft joint. You can visualize this here. Now, I want to come up with a solution which can count number of peaks or local maximum in this array.
Can someone please help me to solve this problem?
You can use the find_peaks_cwt function from the scipy.signal module to find peaks within 1-D arrays:
from scipy import signal
import numpy as np
y_coordinates = np.array(y_coordinates) # convert your 1-D array to a numpy array if it's not, otherwise omit this line
peak_widths = np.arange(1, max_peak_width) # max_peak_width: the largest peak width (in samples) you expect; set it for your data
peak_indices = signal.find_peaks_cwt(y_coordinates, peak_widths)
peak_count = len(peak_indices) # the number of peaks in the array
More information here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html
It's easy: put the data in a 1-D array and compare each value with its neighbours. The value at position n is a local maximum when the values at n-1 and n+1 are both smaller.
Read the data as Robert Valencia suggests, then:
max_local=0
for u in range(1,len(data)-1):
    # count data[u] as a peak when it is larger than both neighbours
    if data[u]>data[u-1] and data[u]>data[u+1]:
        max_local=max_local+1
You could try to smooth the data with a smoothing filter and then find all values where the values before and after are less than the current value. This assumes you want all peaks in the sequence. The reason you need the smoothing filter is to avoid counting spurious local maxima caused by noise. The level of smoothing required will depend on the noise present in your data.
A simple smoothing filter sets the current value to the average of the N values before and N values after the current value in your sequence along with the current value being analyzed.
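The smoothing-then-comparing idea can be sketched as follows. This is an illustration under assumptions, not a tuned solution: the function name count_peaks, the window half-width n and the synthetic sine data are all invented. The moving average is done with np.convolve, and a peak is any smoothed value strictly greater than both neighbours.

```python
import numpy as np

def count_peaks(data, n=2):
    """Count local maxima after a moving average over 2*n + 1 samples."""
    data = np.asarray(data, dtype=float)
    kernel = np.ones(2 * n + 1) / (2 * n + 1)
    # 'valid' keeps only positions where the full window fits
    smooth = np.convolve(data, kernel, mode='valid')
    inner = smooth[1:-1]
    is_peak = (inner > smooth[:-2]) & (inner > smooth[2:])
    return int(is_peak.sum())

# Hypothetical signal: three periods of a sine, with and without noise
t = np.linspace(0, 3, 600)
clean = np.sin(2 * np.pi * t)
rng = np.random.default_rng(0)
noisy = clean + 0.05 * rng.standard_normal(t.size)

print(count_peaks(clean, n=0))   # no smoothing on clean data
print(count_peaks(noisy, n=0))   # noise creates many spurious maxima
print(count_peaks(noisy, n=25))  # smoothing removes most of them
```

A plain moving average may still leave a few spurious maxima near the flat top of each peak, so the window size usually needs adjusting to the noise level.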

Visualization of Standard Deviation in an array

As a python newbie I need a little help.
I have an array with 100 rows and 100 columns. Each position stands for a temperature value. I now want to calculate the mean of the whole array (I have that so far) and then create a new array with the same dimensions as the first one, holding the standard deviation at each position. In the end I want to get an array with the deviation from the mean at each position, so I want to know how far each value spreads from the mean. I hope you understand what I mean. For better understanding: the array is an infrared thermography image of a house. By calculating the standard deviation I want to find the most reactive/sensitive pixels in the image. Maybe someone has done something like this before. In the end I want to export the file, so that I get an image that looks similar to the infrared image, but with the standard deviation temperatures instead of the raw temperatures.
Importing the file and calculating the mean like this:
data_mean = []
my_array = np.genfromtxt((line.replace(',','.') for line in data),skip_header=9,delimiter=";")
data_mean.append(np.nanmean(my_array))
Then I need to calculate the standard deviation at each position in the array.
Thank you so much in advance for any help!
data_mean = np.mean(my_array) #gets you the mean of the whole array (use np.nanmean if the data contain NaNs)
#build an array where every value is the mean of your data
meanArray = np.ones(my_array.shape)*data_mean
variationFromMean = my_array - meanArray #signed deviation of each position from the mean
Is this what you were looking for?
If you are keeping the data in an array format, here is a solution:
import numpy as np
#Find the mean of the array data values
mean_value = np.mean(my_array)
#Find the standard deviation of the array data values
standard_deviation = np.std(my_array)
#create an array with each value's deviation from the mean, in units of the standard deviation
array = (my_array - mean_value)/standard_deviation
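Putting both answers together, a sketch on synthetic data might look like this. The 100 x 100 array of temperatures around 20 degrees and the single missing pixel are invented. The NaN-aware functions np.nanmean and np.nanstd are used because the question itself computes the mean with np.nanmean.

```python
import numpy as np

# Hypothetical thermography image: 100 x 100 temperatures with one gap
rng = np.random.default_rng(0)
my_array = 20.0 + 2.0 * rng.standard_normal((100, 100))
my_array[0, 0] = np.nan  # a missing pixel

mean_value = np.nanmean(my_array)  # mean over all valid pixels
std_value = np.nanstd(my_array)    # standard deviation over all valid pixels

# Signed deviation of every pixel from the mean, same shape as the image
deviation = my_array - mean_value

# The same deviation expressed in units of the standard deviation
z_scores = deviation / std_value

print(z_scores.shape, np.nanmax(np.abs(z_scores)))
```

Pixels with a large absolute value in z_scores are the ones that differ most from the average temperature. The array has the same shape as the input, so it can be exported or plotted in the same way as the original image.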
