How to stop numpy trendline from going below 0 on matplotlib graph - python

I am creating several scatter plot graphs in matplotlib. For these I want to plot trend lines for the scatter plots. I am using the numpy polyfit and poly1d methods to create the trendline.
My problem is as follows: There are only positive y values in my dataset (I have also removed all 0 values), but my trendlines are going below 0. The reason why I think it's going below 0 is that I have some very large outlier values that skew the trendline.
Is there a way I can prevent my graph trendlines from going below 0 without removing data points? Perhaps using a method or parameter for a method in the numpy or matplotlib libraries?
Removing outliers helps some trendlines, but not all of the graphs I'm making.
Graph example with scatter points: https://imgur.com/a/bwIFJw7
Graph example without scatter points (same data as above graph): https://imgur.com/a/k5TyNjt
Changing the degree of the trend line doesn't solve the issue
Code for reproducibility:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
import numpy as np
plt.figure(figsize=(20,150))
loc = mdates.AutoDateLocator()
dataset = {'time':['4/5/2014','4/10/2014','4/21/2014','5/3/2014','5/8/2014','5/19/2014','6/7/2014','6/12/2014','6/16/2014','12/6/2014','12/11/2014','12/15/2014','2/7/2015','2/12/2015','2/16/2015','7/20/2015','8/1/2015','8/13/2015','8/17/2015','9/5/2015','9/10/2015','9/21/2015','10/3/2015','12/10/2015','1/18/2016','8/6/2016','8/11/2016','8/15/2016','9/3/2016','9/8/2016','9/19/2016','10/1/2016','10/13/2016','10/17/2016','11/10/2016','11/5/2016','8/10/2017','9/14/2017','9/18/2017','10/7/2017','2/8/2018','2/19/2018','3/3/2018','3/8/2018','3/19/2018','4/12/2018','4/7/2018','4/16/2018','5/5/2018','5/10/2018','5/21/2018','11/3/2018','11/8/2018','11/19/2018','12/1/2018','12/13/2018','12/17/2018','1/5/2019','1/10/2019','1/21/2019','2/2/2019','2/14/2019','2/18/2019','3/2/2019','3/14/2019','3/18/2019','4/6/2019','4/11/2019','4/15/2019'],'yval':[1714.6,996.32,1638.4,1293.47,744.73,1843.2,1009.97,2168.47,819.2,2949.12,2730.67,2106.51,14745.6,3880.42,73728,792.77,538.16,585.14,571.53,580.54,933.27,460.8,646.74,4336.94,36864,190.51,206.89,199.02,197.54,219.84,210.27,223.75,201.96,212.23,223.6,211.48,1568.68,418.91,837.82,5671.38,217.18,189.74,192.59,192.04,196.74,197.8,196.47,200.69,193.69,210.79,349.42,222.5,209.17,191.37,192.91,197.57,207.23,192.48,189.7,199.44,187.57,186.85,187.99,189.19,196.34,196.11,192.61,196.39,190.05,]}
dataset['time'] = pd.to_datetime(dataset['time'])
dataset['yval'] = pd.to_numeric(dataset['yval'])
x = mdates.date2num(dataset['time'])
y = dataset['yval']
z = np.polyfit(x,y,3)
p = np.poly1d(z)
plt.plot(x,p(x),'#00FFFF', label = type)
plt.title(type)
plt.xlabel('Time')
plt.ylabel('Weight')
#comment out the next line to see plot without scatter points
plt.scatter(x,y)
plt.gca().xaxis.set_major_locator(loc)
plt.gca().xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
plt.grid(which='major',axis='both')
plt.show()
Graph with trendline not going below the horizontal 0 axis is the desired output
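One possible approach (not from the original post, so treat it as a sketch): fit the polynomial to log(y) instead of y and exponentiate the result. Because exp() of any real number is positive, the trendline cannot cross 0, and no data points have to be removed. The data below are hypothetical stand-ins for the question's dataset, and x is centred and scaled before fitting because raw date numbers make a cubic fit poorly conditioned.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in data: strictly positive values with two large outliers.
rng = np.random.default_rng(0)
x = np.linspace(16000, 18000, 60)      # date numbers, as mdates.date2num would give
y = 200 + 50 * rng.random(60)
y[[10, 20]] = [73728, 36864]           # outliers of the same size as in the question

# Centre and scale x so the cubic fit is numerically well conditioned.
xs = (x - x.mean()) / x.std()

# Fit the cubic to log(y); exponentiating the fit keeps the trend above 0.
coeffs = np.polyfit(xs, np.log(y), 3)
trend = np.exp(np.polyval(coeffs, xs))

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.plot(x, trend, color='#00FFFF', label='log-space trend')
# ax.set_yscale('log')  # optional: makes the small values readable next to the outliers
ax.legend()
# plt.show()  # uncomment when running interactively
```

A log-space fit is also pulled around less by the outliers than a fit to the raw values. Note that plt.ylim(bottom=0) or np.clip(p(x), 0, None) would only hide or flatten the negative part of the original fit rather than change it.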

Related

Contour plot of multiple lineplots in matplotlib

I have a set of 125 x and y data sets (X-ray absorption spectroscopy data, i.e. energy vs intensity) and I would like to reproduce a plot similar to this one: contour plot of XANES spectra (https://i.stack.imgur.com/0Kymp.png)
The spectra were taken as a function of time and my goal is to plot them in a 2D contour plot with the energy as x, and the time (or maybe just the index of the spectrum) as y. I would like the z axis to represent the intensity of the spectra with different colors so that changes in time are easily seen.
My data currently look like this when I plot them all in the same graph with a viridis color map: [image: line plot of the spectra]
I have tried to work with the contour function of matplotlib and got this result: [image: attempt at a contour plot]
I used the following code :
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_excel('data.xlsx')
energy = df['energy']
df.index = energy
df = df.iloc[:,2:]
df = df.transpose()
X = energy
Y = range(len(df.index))
fig, ax = plt.subplots()
ax.contourf(X,Y,df)
plt.show()
If you have any idea, I would be grateful. I am in fact not sure that the contour function is the most appropriate for what I want, and I am open to any suggestion.
Thanks,
Yoloco
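A minimal sketch of one alternative to contourf, assuming the spectra share a common energy grid: pcolormesh draws one coloured row per spectrum without the level banding that contourf introduces. The array below is synthetic and only stands in for the spreadsheet; with the real data, the transposed DataFrame (one row per spectrum) would take its place.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the spreadsheet: 125 spectra on a shared energy grid.
energy = np.linspace(7100, 7200, 300)    # x axis, assumed values
n_spectra = 125
index = np.arange(n_spectra)             # y axis: spectrum index (or time)

# Synthetic absorption edge whose position drifts from spectrum to spectrum.
edge = 7140 + 0.05 * index[:, None]
intensity = 1 / (1 + np.exp(-(energy[None, :] - edge) / 2))   # shape (125, 300)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(energy, index, intensity, shading='auto', cmap='viridis')
fig.colorbar(mesh, ax=ax, label='Intensity')
ax.set_xlabel('Energy')
ax.set_ylabel('Spectrum index')
# plt.show()  # uncomment when running interactively
```

If contour bands are preferred, passing a larger levels value to contourf (for example levels=50) on the same three arrays should give a smoother picture.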

How to plot histograms on a 3D plot?

I have collected data on an experiment, where I am looking at property A over time, and then making a histogram of property A at a given condition B. Now the deal is that A is collected over an array of B values.
So I have a histogram that corresponds to B=B1, B=B2, ..., B=Bn. What I want to do, is construct a 3D plot, with the z axis being for property B, and the x axis being property A, and y axis being counts.
As an example, I want the plot to look like this (B corresponds to Temperature, A corresponds to Rg):
How do I pull this off in Python?
The Python library joypy can plot graphs like this. But I'm not sure if you also want these molecules within your graph.
Here is an example:
import joypy
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import cm
%matplotlib inline
temp = pd.read_csv("data/daily_temp.csv",comment="%")
labels=[y if y%10==0 else None for y in list(temp.Year.unique())]
fig, axes = joypy.joyplot(temp, by="Year", column="Anomaly", labels=labels, range_style='own',
                          grid="y", linewidth=1, legend=False, figsize=(6,5),
                          title="Global daily temperature 1880-2014 \n(°C above 1950-80 average)",
                          colormap=cm.autumn_r)
Output:
See this thread as reference.
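If a true 3D axis is wanted rather than a ridge plot, a rough matplotlib-only sketch is to compute each histogram with np.histogram and draw it as a bar chart placed at a different depth. The B values and samples below are made up for illustration. In matplotlib's 3D axes z is the vertical direction, so here counts go on z and B on the depth axis, which differs from the axis naming in the question.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
b_values = [280, 300, 320, 340]            # hypothetical conditions B (e.g. temperatures)
bins = np.linspace(0, 10, 21)              # shared bin edges for property A
centers = 0.5 * (bins[:-1] + bins[1:])

fig = plt.figure()
ax = fig.add_subplot(projection='3d')

all_counts = []
for i, b in enumerate(b_values):
    # Synthetic samples of property A at condition b.
    samples = rng.normal(loc=4 + 0.5 * i, scale=1.0, size=500)
    counts, _ = np.histogram(samples, bins=bins)
    all_counts.append(counts)
    # zdir='y' places this 2D bar chart in the plane y = b.
    ax.bar(centers, counts, zs=b, zdir='y', width=np.diff(bins), alpha=0.7)

ax.set_xlabel('A')
ax.set_ylabel('B')
ax.set_zlabel('Counts')
# plt.show()  # uncomment when running interactively
```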

How to plot a line graph of density over a density colour map plot in Python

First time user so apologies for any mistakes.
I have some code (pasted below) which is used to analyse and gain values/graphs from a simulation I have run.
This results in the following image:
I would now like to plot a line graph on top of this, showing the colour map's value along r = 0 (on the y-axis) at every point on the x-axis. However, I'm completely lost on where to even begin. I've tried looking into KDE and similar things, but I realise I'm not sure how to get at the numerical values that were used to generate the colour map.
from openpmd_viewer import OpenPMDTimeSeries
from openpmd_viewer.addons import LpaDiagnostics
import numpy as np
from scipy.constants import c, e, m_e
import matplotlib.pyplot as plt
from matplotlib import gridspec
# Replace the string below, to point to your data
ts = OpenPMDTimeSeries(r"/Users/bentorrance/diags/hdf5/")
ts_2d = LpaDiagnostics(r"/Users/bentorrance/diags/hdf5/")
plt.figure(1)
Ez = ts.get_field(iteration=5750, field='E', coord='z', plot=True, cmap='inferno')
plt.title(r'Electric Field Density $E_{z}$')
plt.show()
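A sketch of the general idea, under the assumption that the field is available as a 2D NumPy array with matching coordinate arrays: pick the row closest to r = 0 and plot it on a second y-axis over the image. The arrays below are synthetic. As far as I know, get_field in openPMD-viewer returns the array together with a metadata object holding the axis coordinates, so its result would need unpacking first, but check that against the library's documentation.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the field data: field[r_index, z_index]
# with coordinate arrays r and z (assumed here, not read from the simulation).
z = np.linspace(0, 50e-6, 200)
r = np.linspace(-20e-6, 20e-6, 81)
field = np.exp(-(r[:, None] / 8e-6) ** 2) * np.sin(2 * np.pi * z[None, :] / 25e-6)

# Row of the array closest to r = 0.
i0 = np.argmin(np.abs(r))
lineout = field[i0, :]

fig, ax = plt.subplots()
ax.imshow(field, extent=[z[0], z[-1], r[0], r[-1]],
          origin='lower', aspect='auto', cmap='inferno')
ax.set_xlabel('z')
ax.set_ylabel('r')

# Second y-axis sharing the same x-axis, so the line overlays the colour map.
ax2 = ax.twinx()
ax2.plot(z, lineout, color='white')
ax2.set_ylabel('Field at r = 0')
# plt.show()  # uncomment when running interactively
```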

4D Density Plot in Python

I am looking to plot some density maps from some grid-like data:
X,Y,Z = np.mgrid[-5:5:50j, -5:5:50j, -5:5:50j]
rho = np.random.rand(50,50,50) #for the sake of argument
I am interested in producing an interpolated density plot as shown below, from Mathematica here, using Python.
Is there any solution in Matplotlib or another plotting suite for this sort of plot?
To be clear, I do not want a scatterplot of coloured points, which is not suitable for the plot I am trying to make. I would like a 3D interpolated density plot, as shown below.
Plotly
Plotly Approach from https://plotly.com/python/3d-volume-plots/ uses np.mgrid
import plotly.graph_objects as go
import numpy as np
X, Y, Z = np.mgrid[-8:8:40j, -8:8:40j, -8:8:40j]
values = np.sin(X*Y*Z) / (X*Y*Z)
fig = go.Figure(data=go.Volume(
    x=X.flatten(),
    y=Y.flatten(),
    z=Z.flatten(),
    value=values.flatten(),
    isomin=0.1,
    isomax=0.8,
    opacity=0.1, # needs to be small to see through all surfaces
    surface_count=17, # needs to be a large number for good volume rendering
    ))
fig.show()
Pyvista
Volume Rendering example:
https://docs.pyvista.org/examples/02-plot/volume.html#sphx-glr-examples-02-plot-volume-py
3D-interpolation code you might need with pyvista:
interpolate 3D volume with numpy and or scipy
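For the interpolation step, a minimal sketch using scipy.interpolate.RegularGridInterpolator, assuming the density sits on a regular grid as in the question. The density here is a synthetic Gaussian and the grid sizes are arbitrary; the result could then be handed to either of the volume renderers above.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Coarse regular grid, laid out like np.mgrid in the question but smaller.
axis = np.linspace(-5, 5, 20)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing='ij')
rho = np.exp(-(X**2 + Y**2 + Z**2) / 8)      # smooth synthetic density

# Linear interpolation on the regular grid (the default method).
interp = RegularGridInterpolator((axis, axis, axis), rho)

# Evaluate on a finer grid covering the same volume.
fine = np.linspace(-5, 5, 40)
Xf, Yf, Zf = np.meshgrid(fine, fine, fine, indexing='ij')
points = np.stack([Xf, Yf, Zf], axis=-1)     # shape (40, 40, 40, 3)
rho_fine = interp(points)                    # shape (40, 40, 40)
```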

How can I change the values on Y axis of Histogram plot in Python

I have data in the CSV file. I am trying to plot a histogram using matplotlib.
Here is the code that I am trying.
data.hist(bins=10)
plt.ylabel('Frequency')
plt.xlabel('Data')
plt.show()
This is the plot that I get.
Now, using the same code, I need to create a normalized histogram that shows the probability distribution of the data. On the y-axis, instead of the number of data points that fall in each bin, I want to plot the number of data points in that bin divided by the total number of data points.
How should I do it?
Pandas' histogram adds some functionality to the underlying pyplot.hist(). Many of the parameters are passed through. One of them is density=.
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
data = pd.DataFrame(np.random.uniform(258.1, 262.3, 20))
data.hist(bins=10, density=True)
plt.ylabel('Density')
plt.xlabel('Data')
plt.show()
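A related library, seaborn, can draw a density histogram together with a KDE curve as an approximation of the probability distribution. (sns.distplot, which used to do this, is deprecated in recent seaborn versions in favour of histplot.)
import seaborn as sns
sns.histplot(data, bins=10, stat='density', kde=True)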
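Note that density=True scales the bars so that their total area is 1, so a bar's height equals the fraction of points in its bin only when the bin width is 1. If the y-axis should literally show count divided by total, one option is to give every point a weight of 1/N. A small sketch with made-up data in the same range as the sample above:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
values = rng.uniform(258.1, 262.3, 20)   # made-up data, same range as the sample above

# Each point contributes 1/N, so each bar height is (count in bin) / N.
weights = np.full(values.size, 1 / values.size)

fig, ax = plt.subplots()
heights, edges, _ = ax.hist(values, bins=10, weights=weights)
ax.set_ylabel('Fraction of data points')
ax.set_xlabel('Data')
# plt.show()  # uncomment when running interactively
```

The bar heights then sum to 1 regardless of the bin width.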
