When I try to integrate a periodic array with the scipy function sp.fftpack.diff(x,order=-1), it sometimes works and sometimes doesn't.
For example, when integrating x = sin(alpha) to obtain an array of the values of the integral evaluated from 0 to discrete values up to 2*pi, I get the expected result, -cos(alpha). However, when I use it to calculate the values of the integrals of x = sin(alpha) + cos(alpha) + 1 over the same ranges, I do not get the right answer, even though the function is periodic.
I do not understand how this function works. Does someone have an idea?
https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.diff.html
For example, with this code I obtain the results shown in the image below. I am also comparing the results with those obtained by the trapezoidal rule, which does work once the offset is fixed.
import numpy as np
from scipy import fftpack as sp
from scipy import integrate as inte
import matplotlib.pyplot as plt
N = 150
h = (2*np.pi)/N
x = np.arange(-np.pi, np.pi, h)              # one period, [-pi, pi)
y = np.sin(x) + np.cos(x) + 1                # periodic integrand
arrExact = -np.cos(x) + np.sin(x) + x        # analytic antiderivative
st = inte.cumtrapz(y, x, initial=0) - 2.1    # trapezoidal rule, offset adjusted by hand
di = sp.diff(y, order=-1) - 1                # spectral integration, offset adjusted by hand
plt.plot(x,di,label='diff')
plt.plot(x,arrExact,label='Exact')
plt.plot(x,st,label='cumpTrapz')
plt.legend()
plt.show()
Edit: Well, reading again, I realized scipy assumes x[0] = 0; however, I need to spectrally integrate arrays that do not satisfy this condition. How can I proceed?
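A sketch of one possible workaround (an assumption based on the docs above, not something scipy provides directly): sp.diff(order=-1) ignores the zero-frequency (mean) component, so split the mean off yourself, integrate the purely periodic remainder spectrally, and add the mean's antiderivative (a linear ramp) back by hand. For the example above that would look roughly like this:
import numpy as np
from scipy import fftpack as sp
import matplotlib.pyplot as plt

N = 150
h = (2*np.pi)/N
x = np.arange(-np.pi, np.pi, h)
y = np.sin(x) + np.cos(x) + 1

mean = y.mean()                               # zero-frequency component, ignored by sp.diff
di = sp.diff(y - mean, order=-1) + mean*x     # periodic part integrated spectrally + linear ramp
# (an overall additive constant of integration may still need to be fixed separately)

arrExact = -np.cos(x) + np.sin(x) + x
plt.plot(x, di, label='diff + ramp')
plt.plot(x, arrExact, '--', label='Exact')
plt.legend()
plt.show()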
So, if the title is not clear, I am trying to take the function sin(x)/(x**3 - 2*x**2 + 4*x - 8) and find its integral. I know that the analytical solution is (-pi/(8*e**2)) + (pi/8)*cos(2), or roughly -0.2165...
The code that I cannot seem to create properly so far looks like this:
import numpy as np
from sympy import integrate
from sympy.abc import x
f = integrate(np.sin(x)/(x**3-2*x**2+4*x-8), (x, -5, 2))
print(f)
I would eventually like to plot the result, which I plan on doing myself; however, this code gives me the error
"TypeError: loop of ufunc does not support argument 0 of type Symbol which has no callable sin method"
How do I go about fixing this?
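For what it's worth, the TypeError comes from calling NumPy's sin on a SymPy Symbol; a minimal sketch of the usual fix is to keep the whole expression symbolic with sympy.sin. Note also that the denominator factors as (x - 2)*(x**2 + 4), so the integrand has a pole at the upper limit x = 2, and SymPy may be slow, report divergence, or leave this particular definite integral unevaluated:
from sympy import Symbol, sin, integrate

x = Symbol('x')
expr = sin(x)/(x**3 - 2*x**2 + 4*x - 8)   # denominator = (x - 2)*(x**2 + 4)
f = integrate(expr, (x, -5, 2))           # symbolic integration; may be slow or unevaluated
print(f)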
I am new to Python, so I am not sure if this problem is due to my inexperience or whether this is a glitch.
I am running this code multiple times on the same data (no random number generation) and getting different results. This has occurred with more than one variable so far, and obviously I cannot proceed with the analysis until I figure out which results are trustworthy. Here is a short sample of the results I have obtained after running the code four times. Why is there such a discrepancy between these outputs? I am puzzled and greatly appreciate your advice.
Linear Regression
from scipy.stats import linregress
import scipy.stats
from scipy.signal import welch
import matplotlib
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.signal as signal
part_022_o = pd.read_excel(r'C:\Users\Me\Desktop\Behavioral Data Processed\part_022_combined_other.xlsx')
distance_o = part_022_o["distance"]
fs = 200
f, Pwelch_spec = signal.welch(distance_o, fs=fs, window='hanning',nperseg=400, noverlap=200, scaling='density', average='mean')
log_f = np.log(f, where=f>0)
log_pwelch = np.log(Pwelch_spec, where=Pwelch_spec>0)
idx = np.isfinite(log_f) & np.isfinite(log_pwelch)
polynomial_coefficients = np.polyfit(log_f[idx],log_pwelch[idx],1)
print(polynomial_coefficients)
scipy.stats.linregress(log_f[idx], log_pwelch[idx])
Results First Attempt
[ 0.00324568 -2.82962602]
Results Second Attempt
[-2.70137164 6.97117509]
Results Third Attempt
[-2.70137164 6.97117509]
Results Fourth Attempt
[-2.28028005 5.53839502]
The same thing happens when I use scipy.stats.linregress().
Thank you,
Confused
Edit: full code added.
Also, the issue appears to be related to np.log(), since only the values of the "log_f" array seem to change between the different outputs. It is hard to be certain that nothing else is changing (e.g. log_pwelch), but the differences in output clearly correspond to differences in the first value of the "log_f" array.
Edit: I have narrowed the issue down to np.log(f, where=f>0). The first value in the f array is zero. According to the numpy log documentation, "...Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized." Apparently this means the value is unpredictable and can vary from run to run, which is exactly what I am observing. Given my inexperience with Python, I am not sure what the best solution is (e.g. specifying the out array in the log function, using a random seed, or just noting the regression coefficients whenever the value of zero is unchanged after log, etc.).
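For reference, one way to make those uninitialized entries deterministic (a sketch, assuming you want to keep the where= approach rather than dropping the zero bin up front) is to pass an explicitly initialized out array, so the skipped locations hold a known value such as -inf instead of leftover memory:
# Initialize the output to -inf so entries where the condition is False are well defined
log_f = np.log(f, out=np.full_like(f, -np.inf), where=f > 0)
log_pwelch = np.log(Pwelch_spec, out=np.full_like(Pwelch_spec, -np.inf), where=Pwelch_spec > 0)

# The -inf entries are then removed by the existing isfinite mask
idx = np.isfinite(log_f) & np.isfinite(log_pwelch)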
Try to use a random seed to reproduce results. Do this with the following code at the top of your program:
import numpy as np
np.random.seed(123)  # or any number you want
see here for more info: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.seed.html
A random seed ensures you get repeatable results when some part of your program is generating numbers at random.
Try reading the documentation to find out what the functions (np.polyfit(), np.log()) are actually doing.
Using a seed value is standard practice in scikit-learn and other ML work.
I was trying to understand what exactly scipy.signal.resample does. Here is a simple example:
import numpy as np
from scipy.signal import resample
import matplotlib.pyplot as plt
t=np.linspace(1,4,10)
x=t
x_resampled=resample(x,10000)
plt.plot(x)
plt.figure()
plt.plot(x_resampled)
The input is a straight line (the ramp t), but the output of the code is not: the resampled signal oscillates, especially near the endpoints. However, I expected the output to simply be the same straight line sampled with more points.
Can you please tell me how I can set up scipy.signal.resample to obtain such a result?
Look at the documentation for resample (emphasis mine):
Because a Fourier method is used, the signal is assumed to be periodic.
On the non-periodic data you're using, the above assumption doesn't hold, so a valid result should not be expected.
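If the goal is just to upsample a non-periodic signal like this ramp, a minimal sketch of an alternative (assuming plain linear interpolation is acceptable for your data) is to interpolate instead of using the Fourier-based resample:
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(1, 4, 10)
x = t                                   # the original ramp
t_new = np.linspace(1, 4, 10000)
x_resampled = np.interp(t_new, t, x)    # linear interpolation, no periodicity assumption

plt.plot(t, x, 'o', label='original')
plt.plot(t_new, x_resampled, label='interpolated')
plt.legend()
plt.show()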
I'm getting the inverse of a matrix even though its determinant is zero. I tried it with the code below:
import numpy as np
from scipy import linalg
matrix=np.array([[5,10],[2,4]])
print(linalg.det(matrix))
linalg.inv(matrix)
The problem here is a disconnect between how different routines calculate the matrix determinant. For example:
import numpy as np
matrix=np.array([[5.,10.],[2.,4.]])
print(np.linalg.det(matrix))
print(np.linalg.slogdet(matrix))
try:
    invmatrix = np.linalg.inv(matrix)
except np.linalg.LinAlgError:
    print("inversion failed")
produces no exception and prints this:
-1.1102230246251625e-15
(-1.0, -34.43421547668305)
i.e. not using a direct algebraic calculation of the determinant (which scipy.linalg.det does) yields a non-zero determinant because of accumulated floating-point rounding error. Thus the standard linear algebra routines treat the matrix as non-singular and produce an incorrect inverse from an extremely poorly conditioned problem.
(Tested with numpy version 1.15.4 and scipy version 1.1.0)
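As a practical consequence, one defensive pattern (a sketch, not the only option) is to check the condition number or the rank before inverting, rather than testing whether det() is exactly zero:
import numpy as np

matrix = np.array([[5., 10.], [2., 4.]])
print(np.linalg.cond(matrix))          # enormous for this (numerically) singular matrix
print(np.linalg.matrix_rank(matrix))   # 1, i.e. rank-deficient

# Refuse to invert when the matrix is singular to working precision
if np.linalg.cond(matrix) > 1/np.finfo(matrix.dtype).eps:
    print("matrix is singular or nearly singular; not inverting")
else:
    invmatrix = np.linalg.inv(matrix)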
I am implementing Andrew Ng's Machine Learning course in Python, but I am stuck because scipy's optimize functions keep giving me a hard time by not working or giving me dimension errors.
The goal is to find the minimum of the cost function (a scalar function that takes theta (dimension (1,401)), X (dimension (5000,401)), and y (dimension (5000,1)) as inputs). I have defined this cost function and its gradient with respect to the parameters. When running one of the optimize functions (I have tried fmin_tnc, minimize, Nelder-Mead and others, all not working), either they run for ages or they keep giving me errors saying that the array dimension is wrong, or that they find a division by zero... errors that I am not able to spot.
The weirdest thing is that this problem first popped up when I was doing exercise 2 on logistic regression, and then magically disappeared without me changing anything. Now, implementing multi-class logistic regression, it has appeared again, and it won't go away even though I have literally copied and pasted the code of exercise 2!
The code is the following:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.io import loadmat
import scipy.misc
import matplotlib.cm as cm
from scipy.optimize import minimize,fmin_tnc
import random
def sigmoid(z):
    return 1/(1+np.exp(-z))

def J(theta,X,y):
    theta_t=np.transpose(theta)
    prod=np.matmul(X,theta_t)
    sigm=sigmoid(prod)
    vec=y*np.log(sigm)+(1-y)*np.log(1-sigm)
    return -np.sum(vec)/len(y)

def grad(theta,X,y):
    theta_t=np.transpose(theta)
    prod=np.matmul(X,theta_t)
    sigm=sigmoid(prod)
    one=sigm-y
    return np.matmul(np.transpose(one),X)/len(y)
data=loadmat('/home/marco/Desktop/MLang/mlex3/ex3/ex3data1.mat')
X,y = data['X'],data['y']
X=np.column_stack((np.ones(len(X[:,0])),X))
initial_theta=np.zeros((1,len(X[0,:])))
res=fmin_tnc(func=J, x0=initial_theta.flatten(), args=(X,y.flatten()), fprime=grad)
theta_opt=res[0]
Instead of returning the value of theta that minimizes the function as theta_opt, it says:
/home/marco/anaconda3/lib/python3.6/site-packages ipykernel_launcher.py:8: RuntimeWarning: divide by zero encountered in log
I have no clue where this divide by zero occurs, given that there is literally no division in the whole code except for the division by len(y), which is 5000, and the division in the sigmoid function, 1/(1+exp(-z)), whose denominator can never be 0!
Any suggestions?
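One likely source of that RuntimeWarning (a guess based on the cost function shown, not on the traceback) is not a literal division at all: np.log(sigm) or np.log(1 - sigm) evaluated where the sigmoid has saturated to exactly 0.0 or 1.0 in float64 is reported by NumPy as "divide by zero encountered in log". A minimal sketch of a clipped cost function, assuming theta is passed in as a flat vector:
def J(theta, X, y):
    eps = 1e-12
    sigm = sigmoid(np.matmul(X, theta))       # theta flat, shape (401,)
    sigm = np.clip(sigm, eps, 1 - eps)        # keep log() away from exact 0 and 1
    vec = y*np.log(sigm) + (1 - y)*np.log(1 - sigm)
    return -np.sum(vec)/len(y)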