Contour plot of multiple line plots in matplotlib - python

I have a set of 125 x and y data series (X-ray absorption spectroscopy data, i.e. energy vs. intensity) and I would like to reproduce a plot similar to this one: [contour plot of XANES spectra](https://i.stack.imgur.com/0Kymp.png)
The spectra were taken as a function of time, and my goal is to plot them in a 2D contour plot with the energy as x and the time (or maybe just the index of the spectrum) as y. I would like the z axis to represent the intensity of the spectra with different colors so that changes over time are easily seen.
My data currently look like this when I plot them all on the same graph with a viridis colormap: [line plot of the spectra]
I have tried to work with the contour function of matplotlib and got this result: [attempt at a contour plot]
I used the following code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_excel('data.xlsx')
energy = df['energy']
df.index = energy
df = df.iloc[:,2:]
df = df.transpose()
X = energy
Y = range(len(df.index))
fig, ax = plt.subplots()
ax.contourf(X,Y,df)
plt.show()
If you have any ideas, I would be grateful. I am in fact not sure that the contour function is the most appropriate for what I want, and I am open to any suggestion.
Thanks,
Yoloco
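One possible direction, sketched here under assumptions: when the intensities form a 2D array with one row per spectrum and one column per energy point, matplotlib's pcolormesh draws one colored cell per value instead of a handful of contour levels. The data below are synthetic stand-ins (the energy range, the drifting edge, and all variable names are hypothetical), not the contents of data.xlsx.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the real data (an assumption): 125 spectra, 400 energy points.
n_spectra, n_energy = 125, 400
energy = np.linspace(7100.0, 7200.0, n_energy)   # x axis (hypothetical energy range)
index = np.arange(n_spectra)                     # y axis: spectrum index (or time)

# An absorption-edge-like step whose position drifts slowly with the spectrum index.
edge = 7140.0 + 0.05 * index[:, None]
intensity = 1.0 / (1.0 + np.exp(-(energy[None, :] - edge) / 2.0))  # shape (125, 400)

fig, ax = plt.subplots()
# One colored cell per (energy, index) pair, colored by intensity.
mesh = ax.pcolormesh(energy, index, intensity, shading="auto", cmap="viridis")
fig.colorbar(mesh, ax=ax, label="Intensity")
ax.set_xlabel("Energy")
ax.set_ylabel("Spectrum index")
```

Call plt.show() or fig.savefig(...) to view the result. If contours are preferred, passing a larger levels value to ax.contourf (for example levels=50) gives a smoother picture than the default.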

Related

How do I cluster values of y axis against x axis in scatterplot?

Let's say I have 2 arrays
x = [1,2,3,4,5,6,7]
y = [1,2,2,2,3,4,5]
Its scatter plot looks like this
What I want is for my x axis to look like this in the plot
0,4,8
so that the values of y in each piece of x come closer together.
I've seen similar behavior in bar plots, where it is called clustering. How do I do the same with a scatter plot, or is there another plot I should be using?
I hope my question is clear and understandable.
All help is appreciated.
With your plot, try this before you display the plot.
plt.xticks([0,4,8])
or
import numpy as np
plt.xticks(np.arange(0, 8+1, step=4))
Then, to change the scale, you can try something like this,
plt.xticks([0,4,8])
plt.rcParams["figure.figsize"] = (10,5)
I got this with my example,
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.xticks([0,4,8])
plt.rcParams["figure.figsize"] = (7,3)
plt.plot(x, y, 'o', color='black')
I think what you are looking for is close to swarmplots and stripplots in Seaborn. However, Seaborn's swarmplot and stripplot are purely categorical on one of the axes, which means that they wouldn't preserve the relative x-axis order of your elements inside each category.
One way to do what you want would be to increase the space in your x-axis between categories ([0,4,8]) and modify your xticks accordingly.
Below is an example of this where I assign the data to 3 different categories: [-2,2[, [2,6[, [6,10[. Each category is shifted dil_k away from its directly neighboring category.
import matplotlib.pyplot as plt
import numpy as np
#Generating data
x= np.random.choice(8,size=(100))
y= np.random.choice(8,size=(100))
dil_k=20
#Creating the spacing between categories
x[np.logical_and(x<6, x>=2)]+=dil_k
x[np.logical_and(x<10, x>=6)]+=2*dil_k
#Plotting
ax=plt.scatter(x,y)
#Modifying axes accordingly
plt.xticks([0,2,22,24,26,46,48,50],[0,2,2,4,6,6,8,10])
plt.show()
And the output gives:
Alternatively, if you don't care about keeping the order of your elements along the x-axis inside each category, then you can use swarmplot directly.
The code can be seen below:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
#Generating data
x= np.random.choice(8,size=(100))
y= np.random.choice(8,size=(100))
#Creating the spacing between categories
x[np.logical_and(x<2,x>=-2)]=0
x[np.logical_and(x<6, x>=2)]=4
x[np.logical_and(x<10, x>=6)]=8
#Plotting
sns.swarmplot(x=x,y=y)
plt.show()
And the output gives:

Filling a Matplotlib scatter plot with points using a loop

I tried this but got an error that they are not the same size
x = np.linspace(0,501,num=50)
y = np.linspace(0,501,num=50)
for i in range(10,510,10):
    plt.scatter(x,i,c='dimgrey')
ax = plt.gca()
ax.set_facecolor('darkgrey')
plt.xlim(0,501)
plt.ylim(0,501);
My overall goal is to have N points plotted in a grid orientation in the scatter plot. I was trying to plot 2500 points like this.
All I could come up with was one row or column would equal 50 points,
and I made this loop.
I want to fill the plot like this: a line of points at y= 10 as I have here, then at 20,30,40... so on. I realize I could do this manually but is there an easier way I could incorporate it into the loop? I am planning on putting it into an animation later.
Here is a simple example, starting from your code.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,501,num=50)
for i in range(10,40,10):
    y = i * np.ones(50)
    plt.scatter(x,y)
This gives the following plot :
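The loop can also be avoided entirely. A minimal sketch, assuming the goal is the full 50 x 50 grid (2500 points) from the question: np.meshgrid pairs every x position with every row value, and a single scatter call draws all points. The variable names rows, xx, yy and points are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 501, num=50)   # 50 columns, as in the question
rows = np.arange(10, 510, 10)     # 50 rows at y = 10, 20, ..., 500

# meshgrid pairs every x with every row value: two (50, 50) arrays.
xx, yy = np.meshgrid(x, rows)

fig, ax = plt.subplots()
# A single scatter call draws all 2500 points.
points = ax.scatter(xx.ravel(), yy.ravel(), c="dimgrey", s=5)
ax.set_facecolor("darkgrey")
ax.set_xlim(0, 501)
ax.set_ylim(0, 501)
```

Since all points live in one collection, a later animation could update their positions with points.set_offsets(...) rather than redrawing the scatter on every frame.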

How to stop numpy trendline from going below 0 on matplotlib graph

I am creating several scatter plot graphs in matplotlib. For these I want to plot trend lines for the scatter plots. I am using the numpy polyfit and poly1d methods to create the trendline.
My problem is as follows: There are only positive y values in my dataset (I have also removed all 0 values), but my trendlines are going below 0. The reason why I think it's going below 0 is that I have some very large outlier values that skew the trendline.
Is there a way I can prevent my graph trendlines from going below 0 without removing data points? Perhaps using a method or parameter for a method in the numpy or matplotlib libraries?
Removing outliers helps some trendlines, but not at all for the multiple graphs I'm making.
Graph example with scatter points: https://imgur.com/a/bwIFJw7
Graph example without scatter points (same data as above graph): https://imgur.com/a/k5TyNjt
Changing the degree of the trend line doesn't solve the issue
Code for reproducibility:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
import numpy as np
plt.figure(figsize=(20,150))
loc = mdates.AutoDateLocator()
dataset = {'time':['4/5/2014','4/10/2014','4/21/2014','5/3/2014','5/8/2014','5/19/2014','6/7/2014','6/12/2014','6/16/2014','12/6/2014','12/11/2014','12/15/2014','2/7/2015','2/12/2015','2/16/2015','7/20/2015','8/1/2015','8/13/2015','8/17/2015','9/5/2015','9/10/2015','9/21/2015','10/3/2015','12/10/2015','1/18/2016','8/6/2016','8/11/2016','8/15/2016','9/3/2016','9/8/2016','9/19/2016','10/1/2016','10/13/2016','10/17/2016','11/10/2016','11/5/2016','8/10/2017','9/14/2017','9/18/2017','10/7/2017','2/8/2018','2/19/2018','3/3/2018','3/8/2018','3/19/2018','4/12/2018','4/7/2018','4/16/2018','5/5/2018','5/10/2018','5/21/2018','11/3/2018','11/8/2018','11/19/2018','12/1/2018','12/13/2018','12/17/2018','1/5/2019','1/10/2019','1/21/2019','2/2/2019','2/14/2019','2/18/2019','3/2/2019','3/14/2019','3/18/2019','4/6/2019','4/11/2019','4/15/2019'],'yval':[1714.6,996.32,1638.4,1293.47,744.73,1843.2,1009.97,2168.47,819.2,2949.12,2730.67,2106.51,14745.6,3880.42,73728,792.77,538.16,585.14,571.53,580.54,933.27,460.8,646.74,4336.94,36864,190.51,206.89,199.02,197.54,219.84,210.27,223.75,201.96,212.23,223.6,211.48,1568.68,418.91,837.82,5671.38,217.18,189.74,192.59,192.04,196.74,197.8,196.47,200.69,193.69,210.79,349.42,222.5,209.17,191.37,192.91,197.57,207.23,192.48,189.7,199.44,187.57,186.85,187.99,189.19,196.34,196.11,192.61,196.39,190.05,]}
dataset['time'] = pd.to_datetime(dataset['time'])
dataset['yval'] = pd.to_numeric(dataset['yval'])
x = mdates.date2num(dataset['time'])
y = dataset['yval']
z = np.polyfit(x,y,3)
p = np.poly1d(z)
plt.plot(x,p(x),'#00FFFF', label = type)
plt.title(type)
plt.xlabel('Time')
plt.ylabel('Weight')
#comment out the next line to see plot without scatter points
plt.scatter(x,y)
plt.gca().xaxis.set_major_locator(loc)
plt.gca().xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
plt.grid(which='major',axis='both')
plt.show()
The desired output is a graph whose trendline does not go below the horizontal 0 axis.
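One technique that may help, offered as a sketch rather than a definitive answer: since all y values are strictly positive, the polynomial can be fitted to log(y) and mapped back with exp, which makes the trendline positive by construction. This is a different model from a plain polyfit on y (it follows the typical level of the data and is pulled less by large outliers). The data below are synthetic and the variable names are illustrative.

```python
import numpy as np

# Synthetic positive data with a few large outliers (an assumption, not the dataset above).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1800.0, 70)            # e.g. days since the first measurement
y = 200.0 + 50.0 * rng.random(70)
y[[12, 14, 24]] = [14745.6, 73728.0, 36864.0]

# Center and scale x so the cubic fit is well conditioned.
xs = (x - x.mean()) / x.std()

# Ordinary fit: nothing prevents this polynomial from dipping below zero.
linear_trend = np.poly1d(np.polyfit(xs, y, 3))(xs)

# Fit log(y) instead, then map back with exp: the result is always positive.
log_trend = np.exp(np.poly1d(np.polyfit(xs, np.log(y), 3))(xs))
```

With the code from the question, x would be the output of mdates.date2num and log_trend would be passed to plt.plot in place of p(x).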

Set the axis range in a boxplot

I'm working on the EDA of this Kaggle dataset.
I'm working with some boxplots in pandas with this code:
coupon_list[["CATALOG_PRICE","VALIDEND_MONTH"]].boxplot(by='VALIDEND_MONTH')
The problem I'm having here is that the y axis has a large scale and it is hard to read the plot. Is there any way to limit the size of this axis? Something similar to ylim?
EDIT:
The dataset has outliers; adding the argument:
showfliers=False
seems to solve the issue.
It's weird since by default the Y axis is autoscaled, see the example below. Maybe you have some outliers in your data. Could you share more code?
import pandas as pd
import numpy as np
np.random.seed(4)
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
ax = df.boxplot()
Here is the same plot with outliers
# Generating some outliers
df.loc[0] = df.loc[0] * 10
ax = df.boxplot()
Could you try the showfliers option to plot the box without outliers? In this example the Y scale is back to [0-100].
ax = df.boxplot(showfliers=False)
showfliers : bool, optional (True)
Show the outliers beyond the caps.
matplotlib.axes.Axes.boxplot
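If the outliers should stay in the plot but the view should be restricted, the y range can also be set explicitly on the returned Axes, which is the "something similar to ylim" the question asks about. A minimal sketch on the same kind of random data as above (the limits 0 and 100 are chosen for this example only):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list("ABCD"))
df.loc[0] = df.loc[0] * 10      # create some outliers, as in the answer above

ax = df.boxplot()               # returns a matplotlib Axes by default
ax.set_ylim(0, 100)             # outliers still exist, but fall outside the visible range
```

When boxplot is called with by=..., the return value may be an array of Axes rather than a single one, depending on the arguments, so set_ylim may need to be applied to each element.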

How to plot several curves with an offset on the same graph

I read a waveform from an oscilloscope. The waveform is divided into 10 segments as a function of time. I want to plot the complete waveform, one segment above (or under) another, 'with a vertical offset', so to speak. Additionally, a color map is necessary to show the signal intensity. I've only been able to get the following plot:
As you can see, all the curves are superimposed, which is unacceptable. One could add an offset to the y data but this is not how I would like to do it. Surely there is a much neater way of plotting my data? I've tried a few things to solve this issue using pylab but I am not even sure how to proceed and if this is the right way to go.
Any help will be appreciated.
import readTrc #helps read binary data from an oscilloscope
import matplotlib.pyplot as plt
fName = r"...trc"
datX, datY, m = readTrc.readTrc(fName)
segments = m['SUBARRAY_COUNT'] #number of segments
x, y = [], []
for i in range(segments+1):
    x.append(datX[segments*i:segments*(i+1)])
    y.append(datY[segments*i:segments*(i+1)])
plt.plot(x,y)
plt.show()
A plot with a vertical offset sounds like a frequency trail.
Here's one approach that does just adjust the y value.
Frequency Trail in MatPlotLib
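The y-offset idea can be sketched in plain matplotlib as follows. This assumes the trace can be split into equal-length segments; the signal is synthetic, and the names offset and colors are illustrative. Color encodes the segment index here, not the signal intensity.

```python
import numpy as np
import matplotlib.pyplot as plt

segments = 10
points_per_segment = 100

# Synthetic stand-in for the oscilloscope trace (an assumption): one long 1-D signal.
t = np.arange(segments * points_per_segment)
signal = np.sin(2 * np.pi * t / 50.0) * np.linspace(0.2, 1.0, t.size)

# Split into equal-length segments: one row per segment.
y = signal.reshape(segments, points_per_segment)
x = np.arange(points_per_segment)

offset = 2.5 * np.abs(y).max()               # vertical spacing between curves
colors = plt.cm.viridis(np.linspace(0, 1, segments))

fig, ax = plt.subplots()
for i in range(segments):
    # Shift each segment up by i * offset so the curves no longer overlap.
    ax.plot(x, y[i] + i * offset, color=colors[i])
ax.set_yticks(offset * np.arange(segments))
ax.set_yticklabels([f"segment {i}" for i in range(segments)])
```

Call plt.show() to display the figure.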
The same plot has also been coined a joyplot/ridgeline plot. Seaborn has an implementation that creates a series of plots (FacetGrid), and then adjusts the offset between them for a similar effect.
https://seaborn.pydata.org/examples/kde_joyplot.html
An example using a line plot might look like:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
segments = 10
points_per_segment = 100
#your data preparation will vary
x = np.tile(np.arange(points_per_segment), segments)
z = np.floor(np.arange(points_per_segment * segments)/points_per_segment)
y = np.sin(x * (1 + z))
df = pd.DataFrame({'x': x, 'y': y, 'z': z})
pal = sns.color_palette()
g = sns.FacetGrid(df, row="z", hue="z", aspect=15, height=.5, palette=pal)
g.map(plt.plot, 'x', 'y')
g.map(plt.axhline, y=0, lw=2, clip_on=False)
# Set the subplots to overlap
g.fig.subplots_adjust(hspace=-.00)
g.set_titles("")
g.set(yticks=[])
g.despine(bottom=True, left=True)
plt.show()
