Plotting a decaying exponential in Pycharm from a CSV file - python

I am trying to plot this data as a decaying exponential. All of the data sets share the same x values; only the y values differ. The model is y = a*[(-1)*exp(-x/t)].
I am not getting the correct chart when the code runs. The image shows the type of curve I am looking for. I need to plot all of the data in the CSV file (preferably on the same plot) in PyCharm. I am relatively new to PyCharm, so I am starting from scratch (Excel just wouldn't behave for this data). I am willing to start fresh if there is a simpler way of writing the code; I pieced this together with some help from the internet.
import scipy.signal as scp
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import numpy.core.function_base
def decaying_exponential(x,a,t,c):
    return a *(-1)* np.exp(-1 * (x) / t) + c
import os
for f in os.listdir("/Users/flyar/My Python Stuff/"):
    print(f)
df = numpy.transpose(pd.read_csv("D:/Grad Lab/NMR/Data/T1 Data/mineral oil/F0009CH1.CSV", names= ['a','b','c','d']).to_numpy())
temp = scp.find_peaks(df[2], height = 0)
df_subset = [(df[1][n], df[2][n]) for n in temp[0]]
print(df_subset)
plt.scatter([df[2][n] for n in temp[0]], [df[1][n] for n in temp[0]])
y = np.linspace(min(df[2]), max(df[2]), 1000)
params, covs = curve_fit(decaying_exponential, [df[1][n] for n in temp[0][2::]],
[df[2][n] for n in temp[0][2::]], maxfev=10000)
print(params)
plt.plot(y, [decaying_exponential(l, 5, params[1], params[2]) for l in y])
plt.show()
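A minimal sketch of one way to fit and evaluate the model, assuming synthetic data in place of the CSV columns (the values 2.0, 1.5 and 1.0 and the noise level are made up for illustration). Note that in the code above the scatter call puts df[2] on the x-axis while curve_fit treats df[1] as x, and the plotted curve uses a fixed amplitude of 5 rather than params[0]; either may contribute to the chart looking wrong.

```python
import numpy as np
from scipy.optimize import curve_fit

def decaying_exponential(x, a, t, c):
    # Same model as in the question: y = -a * exp(-x / t) + c
    return -a * np.exp(-x / t) + c

# Synthetic stand-in for one x/y column pair (hypothetical values).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = decaying_exponential(x, 2.0, 1.5, 1.0) + rng.normal(0.0, 0.01, x.size)

# Rough starting values estimated from the data itself.
c0 = y[-1]                    # curve levels off near c
a0 = c0 - y[0]                # at x = 0 the model gives c - a
t0 = (x[-1] - x[0]) / 3.0     # crude guess for the time constant
params, cov = curve_fit(decaying_exponential, x, y, p0=(a0, t0, c0))

# Evaluate the fitted curve on a dense grid of x values, using all
# three fitted parameters.
x_smooth = np.linspace(x.min(), x.max(), 1000)
y_smooth = decaying_exponential(x_smooth, *params)
```

With matplotlib, plt.scatter(x, y) followed by plt.plot(x_smooth, y_smooth) draws the data and the fitted curve on the same axes. Repeating the fit in a loop over several y columns before a single plt.show() would keep everything on one plot.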

Related

How to count number of points above a least square fit?

I want to count the points above the least squares fits.
from sklearn.mixture import GaussianMixture
from sklearn import preprocessing
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats
from astropy.io import ascii
from scipy.stats import norm
from astropy.timeseries import LombScargle
from astropy import stats
data4= pd.read_csv('Standard Dev main pop.csv')
names4 = data4.columns
df3 = pd.DataFrame(data4, columns=names4)
df3.head()
#print(c)
data5= pd.read_csv('2 Sigma Main pop.csv')
names5 = data5.columns
df5 = pd.DataFrame(data5, columns=names5)
df5.head()
data6= pd.read_csv('3 Sigma main pop.csv')
names6 = data6.columns
df6 = pd.DataFrame(data6, columns=names6)
df6.head()
a=df5['Mean Mag']
b=df5['Std']
c=df6['Mean Mag']
d=df6['Std']
e=df3['Mean Mag']
f=df3['Std']
ax=plt.scatter(e,f, label=' All sources')
#ay=plt.scatter(c,d, label='3 Sigma from Median Std')
lstsq_coefs = np.polyfit(a, b, deg=2)
lstsq_preds = lstsq_coefs[0]*a**2 + lstsq_coefs[1]*a + lstsq_coefs[2]
plt.plot(a, lstsq_preds, linestyle="dashed", color="red", label="Least squares 2 sigma")
#ay=plt.scatter(c,d, label='3 Sigma from Median Std')
lstsq_coefs1 = np.polyfit(c ,d, deg=2)
lstsq_preds1 = lstsq_coefs1[0]*c**2 + lstsq_coefs1[1]*c + lstsq_coefs1[2]
plt.plot(c, lstsq_preds1, linestyle="dashed", color="black", label="Least squares 3 sigma")
plt.legend(loc='best',fontsize= 16)
plt.gcf().set_size_inches((12,10))
plt.ylim(0,0.1)
plt.show()
I want to count the number of points that lie above each least-squares fit. I have tried some extremely tedious methods, which are not feasible in the long run.
You can compare them using the numpy module:
import numpy as np
f = np.array(f)
lstsq_preds = np.array(lstsq_preds)
lstsq_preds1 = np.array(lstsq_preds1)
print("Number above least squares #1:", len(f[f > lstsq_preds]))
print("Number above least squares #2:", len(f[f > lstsq_preds1]))
Note that I only convert the arrays with np.array to make sure they are numpy arrays. These conversion lines may be unnecessary, since you are working with pandas data.
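If the points being counted are not the same rows that were used for the fit (in the question, the scatter uses df3 while the fits use df5 and df6), the fitted polynomial has to be evaluated at the x-values of the points being counted so that both arrays have the same length. A minimal sketch, assuming small made-up arrays in place of the CSV columns; count_above is a hypothetical helper name:

```python
import numpy as np

def count_above(coefs, x, y):
    """Count points (x, y) lying strictly above the polynomial given by coefs."""
    predicted = np.polyval(coefs, x)   # evaluate the fit at these x-values
    return int(np.count_nonzero(y > predicted))

# Made-up data: the fit is done on one set of points ...
x_fit = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_fit = x_fit ** 2
coefs = np.polyfit(x_fit, y_fit, deg=2)

# ... and the count is done on a different set of points.
x_all = np.array([0.0, 1.0, 2.0, 3.0])
y_all = np.array([1.0, 0.0, 5.0, 9.5])
n_above = count_above(coefs, x_all, y_all)
```

In the question's terms, this would correspond to something like count_above(lstsq_coefs, e, f) for the first fit and count_above(lstsq_coefs1, e, f) for the second.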

Problem with plotting/calculating exponential curve (python, matplotlib, pandas)

I have some data that forms an exponential curve, and I'm trying to fit a curve to it.
Unfortunately, nothing I have tried has worked (I will spare you the madness of the code).
Fitting does work when I use a*x**2 + b*x + c or a*x**3 + b*x**2 + c*x + d with what I found on the internet (implementations based on from scipy.optimize import curve_fit). Again, I will spare you my iterations of the exp function.
Here is the data:
x,y
0.48995590396864286,8.109516054921031e-09
0.48995590396864286,8.09818090049968e-09
0.48995590396864286,8.103734197035667e-09
0.48995590396864286,8.110736963480639e-09
0.48995590396864286,8.09118823654877e-09
0.48995590396864286,8.12135991705394e-09
0.48995590396864286,8.122079043957364e-09
0.48995590396864286,8.128376050930522e-09
0.48995590396864286,8.157919899241163e-09
0.48661800486618,8.198100087712926e-09
0.48426150121065376,8.22138382076506e-09
0.48192771084337344,8.281557310731435e-09
0.4793863854266539,8.27420119872003e-09
0.47709923664122134,8.321514715516415e-09
0.47483380816714155,8.3552316463302e-09
0.47483380816714155,8.378564235036926e-09
0.47192071731949026,8.401917724613532e-09
0.4703668861712136,8.425994519752875e-09
0.4681647940074906,8.45965504646707e-09
0.4659832246039143,8.496218480906607e-09
0.46382189239332094,8.551849768778838e-09
0.46168051708217916,8.54285497435508e-09
0.46168051708217916,8.583748312156053e-09
0.46168051708217916,8.646661429014719e-09
0.4568296025582458,8.733501981255873e-09
0.45475216007276037,8.765708849715661e-09
0.45004500450045004,8.8589473576661e-09
0.44385264092321347,8.991513675928626e-09
0.4397537379067722,9.130861147033911e-09
0.43308791684711995,9.301055589581911e-09
0.4269854824935952,9.533957982742729e-09
0.42052144659377627,9.741467401775447e-09
0.41476565740356697,9.942960683024683e-09
0.4088307440719542,1.0205883938061429e-08
0.40176777822418647,1.0447121052453653e-08
0.3947887879984209,1.0747232046538825e-08
0.3895597974289053,1.1089181777589068e-08
0.3829950210647261,1.1466586145307001e-08
0.37664783427495296,1.1898726912256124e-08
0.3707823507601038,1.2248924384552248e-08
0.362844702467344,1.2806614625543388e-08
0.35676061362825545,1.3206507000963428e-08
0.35385704175513094,1.3625333143433576e-08
0.3460207612456747,1.4205592733074004e-08
0.34002040122407345,1.4793868231688043e-08
0.3348961821835231,1.545475512236522e-08
0.3287310979618672,1.6141630273450685e-08
0.32185387833923396,1.698004473312357e-08
0.3162555344718533,1.7677811603552503e-08
0.3111387678904792,1.858017339865837e-08
0.3037667071688943,1.9505998651376402e-08
0.29886431560071725,2.022694254385094e-08
0.2910360884749709,2.1353523243307723e-08
0.28457598178713717,2.2277591448622187e-08
0.2770083102493075,2.302804705798657e-08
0.2727024815925825,2.299784512552745e-08
If you believe this is an exponential curve, I would do a linear fit to the log of the data.
# your data in a Dataframe
import pandas as pd
import numpy as np
df = pd.read_csv("data.csv", sep=",")
# get log of your data
log_y = np.log(df["y"])
# linear fit of the log (if y = exp(a*x + b), then ln(y) = a*x + b)
a, b = np.polyfit(df.x, log_y, 1)
# plot the fit
import matplotlib.pyplot as plt
plt.scatter(df.x, df.y, label="raw_data")
plt.plot(df.x, np.exp(a*df.x + b), label="fit")
plt.legend()
plt.show()
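As a sanity check of the log-linear approach, here is a small sketch on synthetic data (the amplitude 3.0 and rate -4.0 are made-up values, and fit_exponential is a hypothetical helper name). It only applies when all y values are positive, since it takes their logarithm.

```python
import numpy as np

def fit_exponential(x, y):
    """Fit y = A * exp(k * x) via a straight-line fit to ln(y)."""
    k, ln_a = np.polyfit(x, np.log(y), 1)   # slope is k, intercept is ln(A)
    return np.exp(ln_a), k

# Noise-free synthetic data over roughly the x-range of the question.
x = np.linspace(0.27, 0.49, 40)
y = 3.0 * np.exp(-4.0 * x)
amplitude, rate = fit_exponential(x, y)
```

Because the straight line is fitted to ln(y), the residuals are minimized in log space, so small y values carry relatively more weight than they would in a direct fit to y.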

Why is nothing showing up on my plot even with defined variables?

So I wrote this code to create a plot that should look like this one (the image was made in Mathematica), but for some reason nothing shows up on the plot I made. Does it have something to do with gam(x_2) or gam itself? I tried defining that as a range, but still nothing. Please teach me. From the plot made in Mathematica, it seems both the x and y ranges were set all the way up to 10,000.
import matplotlib.pyplot as plt
import numpy as np
import math
import pylab
%matplotlib inline
gam0 = 72.8
temp = 293.15
def gam(x_2):
    return gam0 - 0.0187 * temp * math.log10(1+628.14*55.556*x_2)
x = range(0, 10000)
x_2= x
plt.plot('gam(x_2), x_2')
plt.xlabel('Log_10x_2')
plt.ylabel('gamma (erg cm^2)')
A few fixes are needed. In the function definition there's an indent missing. Also, multiplying the whole range with ' * ' doesn't work (and math.log10 only accepts single numbers), so you can collect the values in a separate list with a for loop:
EDIT: Also, when plotting, you don't pass the variable names as a string; you pass the variables themselves.
import matplotlib.pyplot as plt
import numpy as np
import math
import pylab
%matplotlib inline
gam0 = 72.8
temp = 293.15
x = range(0, 10000)
x_2= x
def gam(x_2):
    returns = []
    for x_i in x_2:
        returns.append(gam0 - 0.0187 * temp * math.log10(1+628.14*55.556*x_i))
    return returns
plt.plot(gam(x_2), x_2)
plt.xlabel('Log_10x_2')
plt.ylabel('gamma (erg cm^2)')
plt.show()
Indent the body of your function:
def gam(x_2):
    return gam0 - 0.0187 * temp * math.log10(1+628.14*55.556*x_2)
Compute gam(x_2) for each item x_2 in x:
gam_x = [gam(x_2) for x_2 in x]
Finally, plot and show:
plt.plot(gam_x, x)
plt.xlabel('Log_10x_2')
plt.ylabel('gamma (erg cm^2)')
plt.show()
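Another option, sketched here under the assumption that NumPy is acceptable for this task, is to swap math.log10 for np.log10, which works element-wise on arrays so that no explicit loop is needed. The constants are the ones from the question; gam_values is a name introduced only for this example.

```python
import numpy as np

gam0 = 72.8
temp = 293.15

def gam(x_2):
    # np.log10 operates element-wise on arrays, unlike math.log10,
    # which only accepts a single number.
    return gam0 - 0.0187 * temp * np.log10(1 + 628.14 * 55.556 * x_2)

x_2 = np.arange(0, 10000)    # array counterpart of range(0, 10000)
gam_values = gam(x_2)        # one gamma value per entry of x_2
```

Passing the results as plt.plot(x_2, gam_values) puts x_2 on the horizontal axis and gamma on the vertical axis, which matches the axis labels used above.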

Plotting stick-breaking process in R based on Python code

I'd like to port Python code for the stick-breaking process, one of the construction schemes for the Dirichlet process, to R. However, the plot I drew in R is quite different: the DP sample distributions are not centered around the base distribution, H.
The reference Python code is from Austin Rochford's blog.
from matplotlib import pyplot as plt
import numpy as np
import pymc3 as pm
import scipy.stats as ss
import seaborn as sns
from statsmodels.datasets import get_rdataset
from theano import tensor as T
np.random.seed(433)
N=20
K=30
alpha=50
H = ss.norm # base dist
beta = ss.beta.rvs(1,alpha, size=(N,K))
pi = np.empty_like(beta)
pi[:, 0] = beta[:,0]
pi[:, 1:] = beta[:, 1:] * (1-beta[:, :-1]).cumprod(axis=1)
omega = H.rvs(size=(N,K))
x_plot = np.linspace(-3,3,200)
sample_cdfs = (pi[..., np.newaxis]* np.less.outer(omega, x_plot)).sum(axis=1)
fig, ax = plt.subplots(figsize=(8,6))
ax.plot (x_plot, sample_cdfs[0],c="gray", alpha=0.75, label = "DP sample CDFs")
ax.plot(x_plot, sample_cdfs[1:].T, c="gray", alpha=0.75)
ax.plot(x_plot, H.cdf(x_plot), c= "k", label = "Base CDF")
ax.set_title(r'$\alpha = {}$'.format(alpha))
ax.legend(loc=2)
The figure on the right side is the result of the Python code.
I tried to convert it to R code:
library(yarrr)
N=20;K=30;ngrid=200;alpha=50
xgrid = seq(-3,3,length.out=ngrid)
betas = matrix(rbeta(N*K, 1, alpha),nr=N, nc=K)
stick.to.right = c(1, cumprod(1 - betas))[1:K]
pis.temp = stick.to.right * betas
omega = matrix(rnorm(N*K),nr=N,nc=K)
dirac = array(numeric(N*K*ngrid),dim=c(N,K,ngrid))
for(i in 1:N){
  for(j in 1:K){
    for(k in 1:ngrid){
      dirac[i,j,k]=ifelse(omega[i,j]<xgrid[k],TRUE,FALSE)
    }
  }
}
pis = array(pis.temp,dim=c(N,K,200))
sample_cdfs = apply(pis* dirac,c(1,3),sum)
plot(xgrid,sample_cdfs[1,],col=piratepal("pony"),type="l",lwd=1,ylim=c(0,1))
for(i in 2:N) lines(xgrid,sample_cdfs[i,],col=piratepal("pony")[i])
lines(xgrid,pnorm(xgrid),lwd=2)
The plot I drew for the DP with alpha=50:
How can I modify the R code to give a result similar to the Python code?
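One thing that may be worth checking in the R port: the Python code takes the cumulative product along each row (axis=1), so each of the N samples gets its own set of K weights, whereas cumprod(1 - betas) in R runs over the whole matrix as a single vector. The sketch below spells out the per-row computation with explicit loops and checks it against the vectorized form from the reference code. stick_breaking_weights is a hypothetical helper name, and NumPy's random generator is used here in place of scipy.stats only for brevity.

```python
import numpy as np

def stick_breaking_weights(beta):
    """Row-wise weights: pi[i, k] = beta[i, k] * prod_{j < k} (1 - beta[i, j])."""
    n, k = beta.shape
    pi = np.empty_like(beta)
    for i in range(n):
        remaining = 1.0                     # length of stick left for row i
        for j in range(k):
            pi[i, j] = beta[i, j] * remaining
            remaining *= 1.0 - beta[i, j]
    return pi

rng = np.random.default_rng(433)
beta = rng.beta(1, 50, size=(20, 30))       # N = 20 samples, K = 30 sticks
pi_loop = stick_breaking_weights(beta)

# Vectorized form, as in the reference Python code.
pi_vec = np.empty_like(beta)
pi_vec[:, 0] = beta[:, 0]
pi_vec[:, 1:] = beta[:, 1:] * (1 - beta[:, :-1]).cumprod(axis=1)
```

Each row of weights sums to less than 1 because the stick is truncated after K breaks, which is why the sample CDFs in the Python plot level off below 1 at the right edge.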

How can I determine three best linear fits to a data with Python?

I have data of the form shown in the figure. The natural logarithm of the data will always have three distinct linear ranges, but the ranges are not always the same; they vary with the data. There will definitely be three regions where three different linear fits can be made.
I am trying to determine the best three linear fits to the natural logarithm of the data, marked as I, II and III. The figure shows the natural logarithm of the y-data. This has to be applied to at least a thousand datasets, so the code has to detect the best linear fits for the three regions automatically.
I am trying to do this with the code below, which applies a two-segment piecewise linear fit taken from existing code, but it does not work correctly. I need it extended to three linear fits. How can I determine the three best linear fits to the data with Python?
MWE
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.colors as colors
import matplotlib.cm as mplcm
import itertools
from scipy import optimize
def piecewise_linear(x, x0, y0, k1, k2):
    return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])
with open('./three_piecewise_linear.dat', "r") as data:
    while True:
        line = data.readline()
        if not line.startswith('#'):
            break
    data_header = [i for i in line.strip().split('\t') if i]
    _data_ = np.genfromtxt(data, names = data_header, dtype = None, delimiter = '\t')
_data_.dtype.names = [j.replace('_', ' ') for j in _data_.dtype.names]
data = np.array(_data_.tolist())
n_rf = data.shape[1] - 2
xd = np.linspace(1, 1.5, 100)
fit_data = np.empty(shape = (100, n_rf))
for i in range(n_rf):
    p , e = optimize.curve_fit(piecewise_linear, data[:, 1], np.log(data[:, i + 2]))
    fit_data[:, i] = piecewise_linear(xd, *p)
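One possible alternative to extending the curve_fit model, sketched here rather than taken from the code above, is an exhaustive search over the two breakpoints: for every admissible split of the sorted points into three consecutive groups, fit a straight line to each group with np.polyfit and keep the split with the smallest total squared error. The function names, the synthetic data and the min_pts parameter are assumptions made for illustration, and the three lines are not forced to join at the breakpoints.

```python
import numpy as np

def segment_sse(x, y):
    """Least-squares line through (x, y); returns (sum of squared residuals, coefs)."""
    coefs = np.polyfit(x, y, 1)
    resid = y - np.polyval(coefs, x)
    return float(resid @ resid), coefs

def three_line_fit(x, y, min_pts=3):
    """Search all index pairs (i, j) splitting the sorted data into
    x[:i], x[i:j], x[j:] and return the split with the lowest total error."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(x)
    best = (np.inf, None, None)
    for i in range(min_pts, n - 2 * min_pts + 1):
        sse1, c1 = segment_sse(x[:i], y[:i])
        for j in range(i + min_pts, n - min_pts + 1):
            sse2, c2 = segment_sse(x[i:j], y[i:j])
            sse3, c3 = segment_sse(x[j:], y[j:])
            total = sse1 + sse2 + sse3
            if total < best[0]:
                best = (total, (i, j), (c1, c2, c3))
    return best

# Synthetic log-data with made-up slopes -1, -5, -12 and breaks at 1.17 and 1.34.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 1.5, 60)
log_y = np.interp(x, [1.0, 1.17, 1.34, 1.5], [0.0, -0.17, -1.02, -2.94])
log_y = log_y + rng.normal(0.0, 1e-3, x.size)

sse, (i, j), lines = three_line_fit(x, log_y)
slopes = [c[0] for c in lines]    # fitted slope of regions I, II and III
```

The search costs on the order of n squared line fits per dataset, which is small for around a hundred points, so looping it over each column of np.log(data[:, i + 2]) may be workable for a thousand datasets.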
