SciPy Medfilt returning only 0 values - python

Hi, I'm working on code that calculates the median values over a given window size in my data set. I'm using medfilt from SciPy, and I don't understand why the returned median array is all zeroes. Changing the kernel size made no difference, so I'm wondering whether the shape of my array affects medfilt. Here is my code:
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
from scipy import signal
filename = "hbond.txt"
hbond_val = []
with open(filename,'r') as f:
    for line in f:
        f_line = np.array(line.split())
        hbond_val.append(f_line)
bond_array = np.asarray(hbond_val)
bond_array_float = bond_array.astype(float)
bond_1d = np.reshape(bond_array_float,50001)
#print bond_1d.shape
median = sp.signal.medfilt(bond_array_float,101)
#plt.plot(range(len(bond_array)),bond_array,'b')
#plt.plot(range(len(median)),median,'r')
#plt.show()
print median #median returns array full of zeros
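A likely cause, sketched under the assumption that hbond.txt holds one value per line: bond_array_float then has shape (50001, 1), and the code above passes that 2-D array to medfilt rather than bond_1d. A scalar kernel size is applied along every axis, so each window spans 101 x 101 cells, nearly all of which are zero padding, and the median comes out as zero. The synthetic data and the smaller kernel below are stand-ins, not the original file:

```python
import numpy as np
from scipy import signal

# Synthetic stand-in for the file contents: one strictly positive value per line.
column = np.arange(1, 1001, dtype=float).reshape(-1, 1)   # shape (1000, 1)

# Scalar kernel on a 2-D array: an 11 x 11 window that is mostly zero padding.
# (Recent SciPy versions may warn that the kernel exceeds the volume extent.)
filtered_2d = signal.medfilt(column, 11)

# Same data flattened to 1-D: the window only runs along the data axis.
flat = column.ravel()
filtered_1d = signal.medfilt(flat, 11)

print(np.count_nonzero(filtered_2d))   # number of non-zero outputs in the 2-D case
print(filtered_1d[:5])
```

In the original script this would correspond to passing bond_1d (or bond_array_float.ravel()) to medfilt instead of the 2-D array.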

Related

Python: how can I read data from files and assign it to an array? Error: "could not broadcast input array from shape"

I have the following MATLAB code that works fine using text data files, now I am trying to rewrite it using Python but running into errors. I have results that I am trying to apply some calculations on (perform data analysis). My results are in the format of binary files and I have a specific package I am using to help me import the data. For example, here ne is a 1024x256 array with 159 number of files printed per each iteration. So, in MATLAB I can simply do the following:
% Load data:
frame = 6; % how many number of output files
ne_bg = load([DirPath '/ne_unpert.txt']);
ne_p = load([DirPath '/ne_' num2str(frame) '.txt']);
% perform calculations on data:
ne = ne_bg + ne_p;
dn_over_n = ne_p ./ ne;
MATLAB handles multi-dimensional arrays and matrices easily, but I am struggling to translate this to Python.
My Python code:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.gridspec as gridspec
import matplotlib.colors as colors
import matplotlib.patches as patches
import scipy.optimize as opt
from scipy.special import erf, comb, gamma, gammainc
import scipy.constants as const
from scipy.integrate import odeint
import sys
from glob import glob
from mpl_toolkits.axes_grid1 import make_axes_locatable
import Package as pg
# Initialize sizes
ne = np.zeros((1024,256))
ne_p = np.zeros((1024,256))
# Data
data = pg.GData('ne_p.bp')
dg = pg.GInterpModal(data, 2, 'ms')
#dg.interpolate(overwrite=True)
ne_p = data.getValues()
data = pg.GData('ne0.gkyl')
dg = pg.GInterpModal(data, 2, 'ms')
#dg.interpolate(overwrite=True)
ne_bg = data.getValues()
for i in range(1,159): # would like to look at files start from 1 to 159 not 0
    data = pg.GData('ne{:d}.gkyl'.format(i))
    dg = pg.GInterpModal(data, 2, 'ms')
    ne[i,:] = data.getValues() # ERROR HERE
dn_over_n = ne_p/ne # get
....
Error message:
ValueError Traceback (most recent call last)
<ipython-input-35-d6134fb807e8> in <module>
48 dg = pg.GInterpModal(data, 2, 'ms')
49 #dg.interpolate(overwrite=True)
---> 50 ne[i,:] = data.getValues()
ValueError: could not broadcast input array from shape (1024,256,1) into shape (256)
Can someone show me how to fix this and explain what it means?
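The message says that the right-hand side has shape (1024, 256, 1), i.e. a whole 2-D frame with a trailing length-1 axis, while ne[i,:] is a single row of shape (256,). One possible fix is to drop the trailing axis and store each frame in its own slot of a 3-D array. This is only a sketch: fake_get_values is a hypothetical stand-in for data.getValues(), since the package is not available here, and the frame count is reduced to keep the example small.

```python
import numpy as np

def fake_get_values(frame):
    # Hypothetical stand-in for data.getValues(): a (1024, 256, 1) array.
    return np.full((1024, 256, 1), float(frame))

n_frames = 3  # assumption: a small number of frames for illustration

# One slot of shape (1024, 256) per frame.
ne_all = np.zeros((n_frames, 1024, 256))

for i in range(n_frames):
    values = fake_get_values(i + 1)   # files numbered from 1
    ne_all[i] = values[..., 0]        # drop the trailing length-1 axis

print(ne_all.shape)
```

Each ne_all[i] is then a full 1024x256 frame, so element-wise expressions such as ne_p / ne_all[i] have matching shapes, as in the MATLAB version.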

How to find dominant frequency from FFT?

The peak suggestion, in the comments, does not find the peak that occurs most often. I need to find the frequency that occurs most often.
I need to find the dominant frequency in my Coefficient of Lift data. The frequency I am getting with the following code is quite large and not the dominant frequency. I know because the 2-D analysis is easy to analyze with a graph. It is sinusoidal. By dominant frequency, I mean the frequency of the signal with the most repeats.
#!/usr/bin/env python
import sys
import numpy
from numpy import sin
from math import pi
print("Hello World!")
data = numpy.loadtxt('CoefficientLiftData.dat', usecols= (0,3))
n = data.size
timestep = 0.000005
The peak data suggestion, in the comments, does not provide a count method to find how often a frequency occurs.
print(data)
fourier = numpy.fft.fft(data)
frequencies = numpy.fft.fftfreq(n, d=timestep)
positive_frequencies = frequencies[numpy.where(frequencies >= 0)]
magnitudes = abs(fourier[numpy.where(frequencies >= 0)])
peak_frequency = numpy.argmax(magnitudes)
print(peak_frequency)
Here is how to extract the dominant frequency from Coefficient of Lift data produced by, for example, OpenFOAM.
#!/usr/bin/env python
import sys
import numpy as np
import scipy.fftpack as fftpack
from numpy import sin
from math import pi
import matplotlib.pyplot as plt
print("Hello World!")
N = 2500
Nev = 1000
data = np.loadtxt('CoefficientLiftData.dat', usecols= (0,3))
times = data[:,0]
length = int(len(times)/2)
forcez= data[:,1]
t = np.linspace(times[length], times[-1], 2500)
forcezint = np.interp(t, times, forcez)
fourier = fftpack.fft(forcezint[Nev-1:N-1])
frequencies = fftpack.fftfreq(forcezint[Nev-1:N-1].size, d=t[1]-t[0])
#print(frequencies)
freq = frequencies[np.argmax(np.abs(fourier))]
print(freq)
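For comparison, the same idea on a synthetic signal, where the answer is known in advance. This is a sketch rather than a drop-in replacement: the sampling rate, the 50 Hz and 120 Hz components and the constant offset are all made up for illustration. It uses the real-input FFT so only non-negative frequencies are returned, and subtracts the mean so the zero-frequency bin does not win the argmax.

```python
import numpy as np

fs = 1000.0                       # assumed sampling rate in Hz
n = 1000                          # number of samples (1 second of data)
t = np.arange(n) / fs

# Synthetic signal: strong 50 Hz component, weaker 120 Hz one, plus an offset.
signal = 2.0 + np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)

# Remove the mean so the DC bin does not dominate the spectrum.
spectrum = np.fft.rfft(signal - signal.mean())
frequencies = np.fft.rfftfreq(n, d=1.0 / fs)

# argmax gives the index of the largest magnitude; look up the frequency there.
dominant = frequencies[np.argmax(np.abs(spectrum))]
print(dominant)
```

Note that argmax returns an index, which has to be converted to a frequency through the frequencies array, as the answer above also does.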

How do I make a for loop to each individual array?

I have a function that I created and I want the function to be applied to these different values using a for loop or something.
How do I create a for loop that takes each value but stores them in different arrays?
I have this so far:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import xarray as xr
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import netCDF4 as s
import numpy.ma as ma
fwf_tot = fwf_ice + ds.runoff_tundra*ds.LSMGr #data input i am using
# function i want to apply to the data
def ob_annual(ob_monthly, id_number):
    ann_sum = ob_monthly.where(ds.ocean_basins == id_number).resample(TIME='1AS').sum().sum(dim=('X','Y'))
    return ann_sum
My problem is writing the for loop so that it saves the result for each of these values. I think the loop below only keeps the function applied to the last value (87) and not the others. How might I fix this? I expected an output of 7 arrays, each of size 59.
obs = np.array([26,28,29,30,76,84,87])
total_obs = []
for i in obs:
    total_obs = ob_annual(fwf_tot_grnl, i)
print(total_obs.shape)
(59)
You replace your list total_obs at each iteration, so only the last result survives. You must append each result to it instead:
for i in obs:
    total_obs.append(ob_annual(fwf_tot_grnl, i))
or use a list comprehension:
total_obs = [ob_annual(fwf_tot_grnl, i) for i in obs]
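A self-contained sketch of the same pattern, using a hypothetical stand-in (annual_sum_stub) for ob_annual because the dataset is not available here. It also shows one way to combine the 7 results into a single 7 x 59 array afterwards:

```python
import numpy as np

def annual_sum_stub(basin_id, n_years=59):
    # Hypothetical stand-in for ob_annual: one value per year for a basin.
    return np.full(n_years, float(basin_id))

obs = np.array([26, 28, 29, 30, 76, 84, 87])

# One result per basin id, collected in a list.
total_obs = [annual_sum_stub(i) for i in obs]

# Optionally stack the list into one array with a row per basin.
stacked = np.stack(total_obs)

print(len(total_obs), stacked.shape)
```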

Exponential fit returns an unreasonable amplitude but looks good when plotted

I'm trying to fit my exponential data, but I am unable to get a decent answer. I'm using scipy and the following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import glob
import scipy.optimize
import pylab
def exponential(x, a, k, b):
    return a*np.exp(-x/k) + b
def main():
    filename = 'tek0071ALL.csv'
    df = pd.read_csv(filename, skiprows=14)
    t = df['TIME']
    ch3 = df['CH3']
    idx1 = df.index[df['TIME']==-0.32]
    idx2 = df.index[df['TIME']==-0.18]
    t= t[idx1.values[0]:idx2.values[0]]
    data=ch3[idx1.values[0]:idx2.values[0]]
    popt_exponential, pcov_exponential = scipy.optimize.curve_fit(exponential, t, data, p0=[1,.1, 0])
    # print(popt_exponential,pcov_exponential)
    print(popt_exponential[0])
    print(popt_exponential[1])
    print(popt_exponential[2])
    plt.plot(t,data,'.')
    plt.plot(t,exponential(t,popt_exponential[0],popt_exponential[1],popt_exponential[2]))
    plt.legend(['Data','Fit'])  # must come before show(), otherwise the legend is never drawn
    plt.show()
main()
This is what the fit looks like:
and I think this means that it's actually a good fit. I think my time constant is correct, and that's what I'm trying to extract. However, the amplitude is really giving me trouble -- I expected the amplitude to be around 0.5 by inspection, but instead I get the following values for equation A*exp(-t/K)+C:
A:1.2424893552249658e-07
K:0.0207112474466181
C: 0.010623336832120528
I'm left wondering if this is correct, and that my amplitude really ought to be so tiny to account for the exponential's behavior.
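One way to read these numbers: in a*exp(-x/k), the parameter a is the value of the exponential term at x = 0, but the fitted window runs from t = -0.32 to t = -0.18, far from zero compared with K of about 0.02. The amplitude seen at the start of the window is A*exp(-t0/K) with t0 = -0.32, which is a much larger number than A itself. The sketch below only redoes that arithmetic with the fitted values quoted above; exponential_shifted is a hypothetical re-parametrisation, not part of the original script.

```python
import numpy as np

# Fitted values quoted in the question.
A = 1.2424893552249658e-07
K = 0.0207112474466181
C = 0.010623336832120528

t0 = -0.32  # start of the fitted window

# Amplitude of the exponential term at the start of the window.
amplitude_at_t0 = A * np.exp(-t0 / K)
print(amplitude_at_t0)

def exponential_shifted(t, a0, k, b, t_start=t0):
    # Same curve, but a0 is the amplitude at t_start instead of at t = 0.
    return a0 * np.exp(-(t - t_start) / k) + b

# Both parametrisations describe the same curve over the window.
t = np.linspace(-0.32, -0.18, 5)
original = A * np.exp(-t / K) + C
shifted = exponential_shifted(t, amplitude_at_t0, K, C)
print(np.allclose(original, shifted))
```

Fitting the shifted form directly would report an amplitude on the scale seen in the plot, while the time constant stays the same.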

ndimage script mis-behaving

I have a script that reads in image data and then iterates over the images with the median filter in scipy.ndimage. From the iteration I create new arrays.
However, when I attempt to run the script with
run filtering.py
the filtering does not seem to work. The new arrays (months_f) are the same as the old ones.
import matplotlib.pyplot as plt
import numpy as numpy
from scipy import ndimage
import Image as Image
# Get images
#Load images
jan1999 = Image.open('jan1999.tif')
mar1999 = Image.open('mar1999.tif')
may1999 = Image.open('may1999.tif')
sep1999 = Image.open('sep1999.tif')
dec1999 = Image.open('dec1999.tif')
jan2000 = Image.open('jan2000.tif')
feb2000 = Image.open('feb2000.tif')
#Compute numpy arrays
jan1999 = numpy.array(jan1999)
mar1999 = numpy.array(mar1999)
may1999 = numpy.array(may1999)
sep1999 = numpy.array(sep1999)
dec1999 = numpy.array(dec1999)
jan2000 = numpy.array(jan2000)
feb2000 = numpy.array(feb2000)
########### Put arrays into a list
months = [jan1999, mar1999, may1999, sep1999, dec1999, jan2000, feb2000]
############ Filtering = 3,3
months_f = []
for image in months:
    image = scipy.ndimage.median_filter(image, size=(5,5))
    months_f.append(image)
Any help would be much appreciated :)
This is rather a comment but due to reputation limits I'm not able to write one.
The way you import your modules is a bit strange, especially "import .. as" with the identical name. I think a more Pythonic way would be
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage
from PIL import Image
and then call
image = ndimage.median_filter(image, size=(...))
When I run your steps with a RGB test image it seems to work.
What does jan1999.shape return?
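A quick way to check that the filter itself does something is to run it on synthetic arrays with a known defect. In this sketch the images are made up (black frames with one bright pixel), and the (5,5,1) size for the RGB case is one possible choice that filters each channel separately; it is not taken from the original script.

```python
import numpy as np
from scipy import ndimage

# Synthetic grayscale image: all black except one bright "noise" pixel.
gray = np.zeros((32, 32))
gray[16, 16] = 255.0

# Synthetic RGB image with the same defect in every channel.
rgb = np.zeros((32, 32, 3))
rgb[16, 16, :] = 255.0

months = [gray, gray.copy()]

# Filter every 2-D image and collect the results in a new list.
months_f = [ndimage.median_filter(image, size=(5, 5)) for image in months]

# For a 3-D (rows, cols, channels) array, give one size per axis.
rgb_f = ndimage.median_filter(rgb, size=(5, 5, 1))

print(months[0].max(), months_f[0].max(), rgb_f.max())
```

If the real images come back unchanged, comparing shapes and dtypes of the inputs with this example may help narrow down where the difference lies.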
