I was trying to understand what exactly scipy.signal.resample does. Here is some simple code:
import numpy as np
from scipy.signal import resample
import matplotlib.pyplot as plt
t=np.linspace(1,4,10)
x=t
x_resampled=resample(x,10000)
plt.plot(x)
plt.figure()
plt.plot(x_resampled)
The input is a straight line (a linear ramp from 1 to 4), but the output of the code shows large oscillations (ringing) near the ends, whereas I expected a smooth, densely sampled ramp.
Can you please tell me how I can set up scipy.signal.resample to obtain such a result?
Look at the documentation for resample (emphasis mine):
Because a Fourier method is used, the signal is assumed to be periodic.
With the non-periodic data you're using, that assumption doesn't hold, so a valid result should not be expected.
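As a quick check (my own illustration, not part of the original post), resample behaves as expected when the data really is periodic over the sampled interval, e.g. one full period of a sine:

import numpy as np
from scipy.signal import resample
import matplotlib.pyplot as plt

# One full period of a sine, sampled without the endpoint: the periodicity
# assumption holds, so the Fourier-based resampling gives a smooth curve.
t = np.linspace(0, 2*np.pi, 10, endpoint=False)
x = np.sin(t)
x_resampled = resample(x, 10000)

plt.plot(t, x, 'o', label='original samples')
plt.plot(np.linspace(0, 2*np.pi, 10000, endpoint=False), x_resampled, label='resampled')
plt.legend()
plt.show()

For the non-periodic ramp in the question, plain interpolation (e.g. np.interp or scipy.interpolate) would give the straight line that was expected.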
I am unable to obtain the actual underlying values of the IRFs (impulse response functions). Here is the code for a simple VAR model:
import numpy as np
from statsmodels.tsa.api import VAR
model = VAR(df_differenced.astype(float))
results = model.fit()
irf = results.irf(10)
I can generate the resulting IRF plots just fine with this code:
irf.plot(orth=False)
But I can't obtain the underlying values, which I'd like to have for precise figures; visually interpreting IRFs is not that accurate. Using summary() did not provide this information.
I would really appreciate some help. Thanks in advance.
You need to use the irfs property or cum_effects (the cumulative IRFs). results.irf() returns an IRAnalysis object. The documentation is not up to the standard it should be.
import numpy as np
from statsmodels.tsa.api import VAR
import pandas as pd
df = pd.DataFrame(np.random.standard_normal((300,3)))
model = VAR(df)
results = model.fit()
irf = results.irf(10)
print(irf.irfs)
print(irf.cum_effects)
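If you want labeled values rather than a raw array: irf.irfs should be an array of shape (periods + 1, neqs, neqs), with irf.irfs[h, i, j] the response of variable i at horizon h to a shock in variable j (that indexing is my reading of the docs, so double-check it). A small sketch reusing the df above:

# responses of all variables to a shock in the first variable,
# one row per horizon 0..10
responses_to_shock0 = pd.DataFrame(irf.irfs[:, :, 0], columns=df.columns)
responses_to_shock0.index.name = 'horizon'
print(responses_to_shock0)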
You are close to the actual answer.
You can type
results.irf(10)
or try
results.impulse_responses(10)
It will give you a table with the actual point estimates from the VAR.
I am new to Python, so I am not sure if this problem is due to my inexperience or whether this is a glitch.
I am running this code multiple times on the same data (no random number generation) and getting different results. This has occurred with more than one variable so far, and obviously I cannot proceed with the analysis until I figure out which results are trustworthy. Here is a short sample of the results I have obtained after running the code four times. Why is there such a discrepancy between these outputs? I am puzzled and greatly appreciate your advice.
Linear Regression
from scipy.stats import linregress
import scipy.stats
from scipy.signal import welch
import matplotlib
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.signal as signal
part_022_o = pd.read_excel(r'C:\Users\Me\Desktop\Behavioral Data Processed\part_022_combined_other.xlsx')
distance_o = part_022_o["distance"]
fs = 200
f, Pwelch_spec = signal.welch(distance_o, fs=fs, window='hann', nperseg=400, noverlap=200, scaling='density', average='mean')
log_f = np.log(f, where=f>0)
log_pwelch = np.log(Pwelch_spec, where=Pwelch_spec>0)
idx = np.isfinite(log_f) & np.isfinite(log_pwelch)
polynomial_coefficients = np.polyfit(log_f[idx],log_pwelch[idx],1)
print(polynomial_coefficients)
scipy.stats.linregress(log_f[idx], log_pwelch[idx])
Results First Attempt
[ 0.00324568 -2.82962602]
Results Second Attempt
[-2.70137164 6.97117509]
Results Third Attempt
[-2.70137164 6.97117509]
Results Fourth Attempt
[-2.28028005 5.53839502]
The same thing happens when I use scipy.stats.linregress().
Thank you,
Confused
Edit: full code added.
Also, the issue appears to be related to np.log(), since only the values of the "log_f" array seem to change between outputs. It is hard to be certain that nothing else is changing (e.g. log_pwelch), but the differences in output clearly correspond to differences in the first value of the "log_f" array.
Edit: I have narrowed the issue down to np.log(f, where=f>0). The first value in the f array is zero. According to the numpy log documentation, "...Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized." Apparently this means that the value is unpredictable and can vary from run to run, which is exactly what I am observing. Given my inexperience with Python, I am not sure what the best solution is (e.g. specifying the out array in the log function, using a random seed, or just noting the regression coefficients whenever the value of zero is unchanged after the log, etc.).
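For reference, a minimal sketch of the first option mentioned above (specifying the out array), assuming f and Pwelch_spec as computed by the code above: initializing the output with NaN makes the masked positions deterministic instead of leaving them as whatever happened to be in memory.

# Positions where the condition is False keep the NaN they were initialized
# with, and are then dropped by the np.isfinite mask below.
log_f = np.log(f, out=np.full_like(f, np.nan), where=f > 0)
log_pwelch = np.log(Pwelch_spec, out=np.full_like(Pwelch_spec, np.nan), where=Pwelch_spec > 0)

idx = np.isfinite(log_f) & np.isfinite(log_pwelch)
polynomial_coefficients = np.polyfit(log_f[idx], log_pwelch[idx], 1)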
Try using a random seed to reproduce results. Do this with the following code at the top of your program:
import numpy as np
np.random.seed(123)  # or any number you want
see here for more info: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.seed.html
A random seed ensures you get repeatable results when some part of your program is generating numbers at random.
Also try finding out what the functions (np.polyfit(), np.log()) are actually doing by reading their documentation. Using a seed value is standard practice in scikit-learn and ML in general.
When I try to integrate a periodic array with the scipy function sp.fftpack.diff(x,order=-1), it sometimes works and sometimes doesn't.
For example, when integrating x = sin(alpha) to obtain an array of the values of the integral evaluated from 0 up to discrete values of alpha up to 2*pi, I get the expected result, -cos(alpha). However, when I use it to compute the values of the integrals of x = sin(alpha) + cos(alpha) + 1 over the same ranges, I do not get the right answer, even though the function is periodic.
I do not understand how this function works. Does anyone have an idea?
https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.diff.html
For example, with the code below I obtain the results shown in the plot. I am also comparing them with the results obtained by the trapezoidal rule, which does work once the offset is fixed.
import numpy as np
from scipy import fftpack as sp
from scipy import integrate as inte
import matplotlib.pyplot as plt
N=150
h=(2*np.pi)/N
x=np.arange(-np.pi,np.pi,h)
y=np.sin(x)+np.cos(x)+1
arrExact=-np.cos(x)+np.sin(x)+x
st=inte.cumtrapz(y,x,initial=0)-2.1
di=sp.diff(y, order=-1)-1
plt.plot(x,di,label='diff')
plt.plot(x,arrExact,label='Exact')
plt.plot(x,st,label='cumtrapz')
plt.legend()
plt.show()
Edit: Well, reading it again I realized that scipy assumes x[0] = 0; however, I need to spectrally integrate arrays that do not satisfy this condition. How can I proceed?
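For what it's worth, one workaround (my own, not something from the scipy docs): diff(..., order=-1) drops the zero-frequency (mean) component, so you can split the mean off, integrate the zero-mean part spectrally, and add the mean's contribution, which is a linear ramp, back by hand. A minimal sketch reusing the arrays above:

# The mean integrates to mean*x, which is not periodic and is therefore
# dropped by the spectral method; add it back explicitly.
mean = y.mean()
di_fixed = sp.diff(y - mean, order=-1) + mean*x
# (in general you may still need to add a constant of integration)

plt.plot(x, di_fixed, label='diff, mean handled')
plt.plot(x, arrExact, '--', label='Exact')
plt.legend()
plt.show()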
I'm fairly new to programming, but this problem happens in Python and in Excel as well.
I'm using the following formulas for the RC transfer function
s/(s+1) for High Pass
1/(s+1) for Low Pass
with s = jwRC
Below is the code I used in Python:
from pylab import *
from numpy import *
from cmath import *
"""
Generating a transfer function for RC filters.
Importing modules for complex math and plotting.
"""
f = arange(1, 5000, 1)
w = 2.0j*pi*f
R=100
C=1E-5
hp_tf = (w*R*C)/(w*R*C+1) # High Pass Transfer function
lp_tf = 1/(w*R*C+1) # Low Pass Transfer function
plot(f, hp_tf) # plot high pass transfer function
plot(f, lp_tf, '-r') # plot low pass transfer function
xscale('log')
I can't post images yet so I can't show the plot, but the issue here is that the cutoff frequency is different for each one. They should cross at y = 0.707, but they actually cross at about 0.5. I figure my formula or method is wrong somewhere, but I can't find the mistake. Can anyone help me out?
Also, on a related note, I tried to convert to dB scale and I get the following error:
TypeError: only length-1 arrays can be converted to Python scalars
I'm using the following
debl=20*log(hp_tf)
This is a classic example of why you should avoid pylab and, more generally, imports of the form
from module import *
unless you know exactly what they do, since they hopelessly clutter the namespace.
Using,
import matplotlib.pyplot as plt
import numpy as np
and then calling np.log and plt.plot etc. will solve your problem.
Further explanation
What's happening here is that,
from pylab import *
defines a log function from numpy that operates on arrays (the one you want).
However, the later import,
from cmath import *
overwrites it with a version that only accepts scalars, hence your error.
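For example, something along these lines (a sketch, not your exact script; taking the magnitude with np.abs and converting to dB with np.log10 are my additions) avoids the clash entirely:

import numpy as np
import matplotlib.pyplot as plt

f = np.arange(1, 5000, 1)
w = 2.0j * np.pi * f
R = 100
C = 1e-5

hp_tf = (w*R*C) / (w*R*C + 1)  # high-pass transfer function
lp_tf = 1 / (w*R*C + 1)        # low-pass transfer function

# np.log10 (like np.log) works elementwise on arrays, unlike cmath.log
hp_db = 20 * np.log10(np.abs(hp_tf))
lp_db = 20 * np.log10(np.abs(lp_tf))

plt.plot(f, hp_db, label='high pass')
plt.plot(f, lp_db, 'r', label='low pass')
plt.xscale('log')
plt.legend()
plt.show()

Plotted this way the curves cross at about -3 dB (|H| ≈ 0.707). Plotting the complex arrays directly only shows their real parts, which is why they appeared to cross near 0.5.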
I have been porting code for an isomap algorithm from MATLAB to Python. I am trying to visualize the sparsity pattern using the spy function.
MATLAB command:
spy(sparse(A));
drawnow;
Python command:
matplotlib.pyplot.spy(scipy.sparse.csr_matrix(A))
plt.show()
I am not able to reproduce the MATLAB result in Python using the above command. Using the command with A in non-sparse format gives quite a similar result to MATLAB, but it takes quite long (A being 2000-by-2000). What would be the scipy equivalent of MATLAB's sparse function?
Maybe it's your version of matplotlib that's causing trouble; for me, scipy.sparse and matplotlib.pylab work well together.
See sample code below that produces the 'spy' plot attached.
import matplotlib.pylab as plt
import scipy.sparse as sps
A = sps.rand(10000,10000, density=0.00001)
M = sps.csr_matrix(A)
plt.spy(M)
plt.show()
matplotlib.__version__  # returns '1.3.0' here
This gives this plot:
I just released betterspy, which arguably does a better job here. Install with
pip install betterspy
and run with
import betterspy
from scipy import sparse
A = sparse.rand(20, 20, density=0.1)
betterspy.show(A)
betterspy.write_png("out.png", A)
With smaller markers:
import matplotlib.pylab as pl
import scipy.sparse as sps
import scipy.io
import sys
A=scipy.io.mmread(sys.argv[1])
pl.spy(A,precision=0.01, markersize=1)
pl.show()