Plot start-end time slots - matplotlib python - python

I'm trying to plot time slots. I have two ndarrays of 'start' and 'end' points.
I want to draw them as chunks on a figure. Keep in mind that the chunks are not consecutive and there are gaps between the slots.
So far I have tried to use patches:
for x_1, x_2 in zip(s_data['begin'].values, s_data['end'].values):
    ax1.add_patch(Rectangle((x_1, 0), x_2 - x_1, 0.5))
plt.show()
But it only gives me a half-blue figure.
While I want something like this

The approach is correct. You just need to scale the axes such that the complete plot is within its range.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({"begin": [1,4,6,9], "end" : [3,5,8,12]})
fig, ax = plt.subplots()
for x_1, x_2 in zip(df['begin'].values, df['end'].values):
    ax.add_patch(plt.Rectangle((x_1, 0), x_2 - x_1, 0.5))
ax.autoscale()
ax.set_ylim(-2,2)
plt.show()
It is worth noting that matplotlib has a function broken_barh, which simplifies the creation of such charts.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({"begin": [1,4,6,9], "end" : [3,5,8,12]})
fig, ax = plt.subplots()
ax.broken_barh(list(zip(df["begin"].values, (df["end"] - df["begin"]).values)), (0, 0.5))
ax.set_ylim(-2,2)
plt.show()
Giving the same diagram as the above.
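If the slots come as separate start and end sequences, the conversion to the (start, width) pairs that broken_barh expects can be pulled into a small helper. This is only a sketch: the names to_xranges and plot_slots are made up for illustration and are not part of matplotlib, and the y position and height defaults simply copy the values used above.

```python
def to_xranges(begin, end):
    """Convert paired start/end values into (start, width) tuples for broken_barh."""
    xranges = []
    for start, stop in zip(begin, end):
        if stop < start:
            raise ValueError(f"slot ends before it starts: {start} > {stop}")
        xranges.append((start, stop - start))
    return xranges


def plot_slots(begin, end, y=0.0, height=0.5):
    """Draw all slots as one row of bars (matplotlib is imported only when called)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.broken_barh(to_xranges(begin, end), (y, height))
    ax.set_ylim(-2, 2)
    return fig, ax
```

Calling plot_slots(df["begin"], df["end"]) followed by plt.show() should reproduce the figure above.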

Related

Superimposing plots in seaborn causes x-axis to misalign

I am having an issue trying to superimpose plots with seaborn. I am able to generate the two plots separately as
fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))
sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)
sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax2,width=0.4)
The output looks like this:
But when I try to superimpose them by assigning both to the same ax object, the x-axis no longer lines up:
fig, ax1 = plt.subplots(figsize=(30, 7))
sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)
sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax1,width=0.4)
I cannot work out why the x-axis of the line plot changes when superimposing both plots (both plots' x-axes go from 0 to 0.069).
My goal is for both plots to be superimposed, while keeping the same X axis range.
Seaborn's boxplot creates a categorical x-axis, with all boxes evenly spaced. Internally the boxes are positioned at 0, 1, 2, ..., while the tick labels show the values from 0 to 0.069.
To combine a line plot with a boxplot, matplotlib's boxplot can be called directly, so that positions and widths can be set explicitly. With patch_artist=True, a rectangle is created (instead of just lines), for which a facecolor can be given. manage_ticks=False prevents boxplot from changing the x ticks and their limits. Optionally, notch=True would accentuate the median a bit more, but depending on the data, the confidence interval might be too large and look weird.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
data1 = pd.DataFrame({'pct_gc': np.linspace(0, 0.069, 200), 'MSE': np.random.normal(0.02, 0.1, 200).cumsum()})
data1['pct_range'] = pd.cut(data1['pct_gc'], 10)
fig, ax1 = plt.subplots(ncols=1, figsize=(20, 7))
sns.lineplot(data=data1, y='MSE', x='pct_gc', ax=ax1)
for interval, color in zip(np.unique(data1['pct_range']), plt.cm.tab10.colors):
    ax1.boxplot(data1[data1['pct_range'] == interval]['MSE'],
                positions=[interval.mid], widths=0.4 * interval.length,
                patch_artist=True, boxprops={'facecolor': color},
                notch=False, medianprops={'color':'yellow', 'linewidth':2},
                manage_ticks=False)
plt.show()
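To see why the line plot got squeezed in the original attempt, it can help to look at which x positions a categorical axis assigns. The sketch below mimics that mapping in plain Python. categorical_positions is a hypothetical helper, the sample values are invented, and it assumes that numeric categories are placed in ascending order.

```python
def categorical_positions(values):
    """Map each distinct value to the integer slot a categorical axis would give it.

    Assumption: numeric categories are placed in ascending order.
    """
    return {value: slot for slot, value in enumerate(sorted(set(values)))}


# Invented sample of x values in the same range as the question's data.
pct_gc = [0.0, 0.023, 0.046, 0.069, 0.023]
positions = categorical_positions(pct_gc)

# The boxes sit at 0, 1, 2, 3 while the line plot uses the raw values,
# so the whole line (0 to 0.069) lands close to the first box.
for value, slot in positions.items():
    print(f"box for {value} is drawn at x={slot}; line point is drawn at x={value}")
```

Setting positions explicitly, as in the answer above, avoids this mismatch because both plots then share the same data coordinates.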

Adjust spacing on X-axis in python boxplots

I plot boxplots using sns.boxplot and pandas.DataFrame.boxplot in Python 3.x.
Is it possible to adjust the spacing between the boxes, so that the box of Group_b sits farther to the right of the box of Group_a than in the output figures? Thanks.
Code:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
dict_a = {'value':[1,2,3,7,8,9],'name':['Group_a']*3+['Group_b']*3}
dataframe = pd.DataFrame(dict_a)
sns.boxplot( y="value" , x="name" , data=dataframe )
Output figure:
dataframe.boxplot("value" ,by = "name" )
Output figure 2:
The distance between the two boxes is determined by the x-axis limits. The distance between the boxes is constant in data units, so what makes them appear closer together or farther apart is the fraction of the total data range shown on the axis that this distance takes up.
For example, in the seaborn case, the first box sits at x=0, the second at x=1. The difference is 1 unit. The maximal distance between the two boxplots is hence achieved by setting the x axis limits to those exact limits,
ax.set_xlim(0, 1)
Of course this will cut half of each box.
So a more useful value would be ax.set_xlim(0-val, 1+val) with val being somewhere in the range of the width of the boxes.
One needs to mention that pandas uses different units. The first box is at x=1, the second at x=2. Hence one would need something like ax.set_xlim(1-val, 2+val).
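The effect of val can be put into numbers. As a rough sketch (gap_fraction is a name made up here, not a matplotlib function), the share of the axis width taken up by the gap between the two box centers is the gap divided by the total visible x range:

```python
def gap_fraction(val, first=0.0, second=1.0):
    """Fraction of the visible x range that lies between the two box centers.

    first/second are the box positions: 0 and 1 for seaborn, 1 and 2 for pandas.
    """
    if val < 0:
        raise ValueError("val must be non-negative")
    gap = second - first
    visible_range = (second + val) - (first - val)
    return gap / visible_range


# Compare the seaborn positions (0, 1) with the pandas positions (1, 2).
for val in (0.0, 0.5, 3.0):
    print(val, gap_fraction(val), gap_fraction(val, first=1.0, second=2.0))
```

Both libraries give the same fraction for the same val, since only the offset of the positions differs.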
The following would add a slider to the plot to see the effect of different values.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
dict_a = {'value':[1,2,3,7,8,9],'name':['Group_a']*3+['Group_b']*3}
dataframe = pd.DataFrame(dict_a)
fig, (ax, ax2, ax3) = plt.subplots(nrows=3,
                                   gridspec_kw=dict(height_ratios=[4,4,1], hspace=1))
sns.boxplot( y="value" , x="name" , data=dataframe, width=0.1, ax=ax)
dataframe.boxplot("value", by = "name", ax=ax2)
from matplotlib.widgets import Slider
slider = Slider(ax3, "", valmin=0, valmax=3)
def update(val):
    ax.set_xlim(-val, 1+val)
    ax2.set_xlim(1-val, 2+val)
slider.on_changed(update)
plt.show()

How to Change Color of Line graphs in Statsmodel Decomposition Plots

The default color of the seasonal decomposition graphs is a light blue.
1. How can you change the colors so that each of the lines is a different color?
2. If each plot can not have a separate color, how would I change all of the colors to say red?
I've tried adding arguments to decomposition.plot(color = 'red')
and searching the documentation for clues.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
# I want 7 days of 24 hours with 60 minutes each
periods = 7 * 24 * 60
tidx = pd.date_range('2016-07-01', periods=periods, freq='D')
np.random.seed([3,1415])
# This will pick a number of normally distributed random numbers
# where the number is specified by periods
data = np.random.randn(periods)
ts = pd.Series(data=data, index=tidx, name='TimeSeries')
decomposition = sm.tsa.seasonal_decompose(ts, model ='additive')
fig = decomposition.plot()
plt.show()
A decomposition plot in which each graph is a different color.
The decomposition object in the code you posted uses pandas in its plotting method. I don't see a way of passing colors directly to the plot method, and it doesn't take **kwargs.
A workaround would be to call the pandas plotting code directly on the components:
fig, axes = plt.subplots(4, 1, sharex=True)
decomposition.observed.plot(ax=axes[0], legend=False, color='r')
axes[0].set_ylabel('Observed')
decomposition.trend.plot(ax=axes[1], legend=False, color='g')
axes[1].set_ylabel('Trend')
decomposition.seasonal.plot(ax=axes[2], legend=False)
axes[2].set_ylabel('Seasonal')
decomposition.resid.plot(ax=axes[3], legend=False, color='k')
axes[3].set_ylabel('Residual')
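Another option that might work is to keep decomposition.plot() and recolor the lines on the figure it returns. This is a sketch under the assumption that each panel is drawn with ordinary matplotlib line artists, which may differ between statsmodels versions. recolor_lines and demo are hypothetical helpers, not statsmodels functions.

```python
def recolor_lines(fig, colors):
    """Give every line in the i-th axes of fig the i-th color.

    Relies only on fig.axes, Axes.get_lines() and Line2D.set_color().
    """
    for ax, color in zip(fig.axes, colors):
        for line in ax.get_lines():
            line.set_color(color)
    return fig


def demo():
    """Standalone demo: four stacked panels stand in for the decomposition figure."""
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(4, 1, sharex=True)
    for i, ax in enumerate(axes):
        ax.plot([0, 1, 2, 3], [i, i + 1, i, i + 1])
    recolor_lines(fig, ["red", "green", "blue", "black"])
    plt.show()
```

With the question's code, this would be called as recolor_lines(decomposition.plot(), ['red', 'green', 'blue', 'black']) before plt.show().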

Extending the range of bins in seaborn histogram

I'm trying to create a histogram with seaborn, where the bins start at 0 and go to 1. However, there is only data in the range from 0.22 to 0.34. I want the empty space mainly as a visual effect, to better present the data.
I create my sheet with
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
%matplotlib inline
from IPython.display import set_matplotlib_formats
set_matplotlib_formats('svg', 'pdf')
df = pd.read_excel('test.xlsx', sheetname='IvT')
Here I create a variable for my list and one that I think should define the range of the bins of the histogram.
st = pd.Series(df['Short total'])
a = np.arange(0, 1, 15, dtype=None)
And the histogram itself looks like this
sns.set_style("white")
plt.figure(figsize=(12,10))
plt.xlabel('Ration short/total', fontsize=18)
plt.title ('CO3 In vitro transcription, Na+', fontsize=22)
ax = sns.distplot(st, bins=a, kde=False)
plt.savefig("hist.svg", format="svg")
plt.show()
Histogram
It creates a graph, but the range in x goes from 0 to 0.2050 and in y from -0.04 to 0.04, which is completely different from what I expect. I searched Google for quite some time but can't seem to find an answer to my specific problem.
Thanks in advance for your help.
There are a few approaches to achieve the desired results here. For example, you can change the xaxis limits after you have plotted the histogram, or adjust the range over which the bins are created.
import seaborn as sns
# Load sample data and create a column with values in the suitable range
iris = sns.load_dataset('iris')
iris['norm_sep_len'] = iris['sepal_length'] / (iris['sepal_length'].max()*2)
sns.distplot(iris['norm_sep_len'], bins=10, kde=False)
Change the xaxis limits (the bins are still created over the range of your data):
ax = sns.distplot(iris['norm_sep_len'], bins=10, kde=False)
ax.set_xlim(0,1)
Create the bins over the range 0 to 1:
sns.distplot(iris['norm_sep_len'], bins=10, kde=False, hist_kws={'range':(0,1)})
Since the range for the bins is larger, you now need to use more bins if you want to have the same bin width as when adjusting the xlim:
sns.distplot(iris['norm_sep_len'], bins=45, kde=False, hist_kws={'range':(0,1)})
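Explicit bin edges are a third possibility, and probably what the np.arange call in the question was aiming for. Note that np.arange(0, 1, 15) uses 15 as the step, so it returns only the single value 0 rather than 15 values between 0 and 1. The sketch below builds evenly spaced edges in plain Python. bin_edges is an illustrative name, and np.linspace(0, 1, n_bins + 1) should give the same edges.

```python
def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (both included)."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Four bins over [0, 1] need five edges.
edges = bin_edges(0, 1, 4)
print(edges)
```

The resulting list can then be passed as the bins argument, for example bins=bin_edges(0, 1, 15) in the question's distplot call.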

matplotlib - pandas - No xlabel and xticks for twinx axes in subploted figures

I had a similar question, which was answered previously. However, this one differs in that it uses the Pandas package.
Here is my previous question: matplotlib - No xlabel and xticks for twinx axes in subploted figures
So my question, like the last one, is why the xlabel and xticks are not shown for the first-row diagrams when using this Python code.
Two notes:
1. I also tried subplots instead of gridspec, with the same result.
2. If you uncomment any of the commented lines in this code, which plot with Pandas on the axes of each diagram, the xlabel and xticks disappear!
import matplotlib.pyplot as plt
import matplotlib.gridspec as gspec
import numpy as np
import pandas as pd
from math import sqrt
fig = plt.figure()
gs = gspec.GridSpec(2, 2)
gs.update(hspace=0.7, wspace=0.7)
ax1 = plt.subplot(gs[0, 0])
ax2 = plt.subplot(gs[0, 1])
ax3 = plt.subplot(gs[1, 0])
ax4 = plt.subplot(gs[1, 1])
x1 = np.linspace(1,10,10)
ax12 = ax1.twinx()
ax1.set_xlabel("Fig1")
ax12.set_xlabel("Fig1")
ax1.set_ylabel("Y1")
ax12.set_ylabel("Y2")
# pd.Series(range(10)).plot(ax=ax1)
ax12.plot(x1, x1**3)
ax22 = ax2.twinx()
ax2.set_xlabel("Fig2")
ax22.set_xlabel("Fig2")
ax2.set_ylabel("Y3")
ax22.set_ylabel("Y4")
# pd.Series(range(10)).plot(ax=ax2)
ax22.plot(x1, x1**0.5)
ax32 = ax3.twinx()
ax3.set_xlabel("Fig3")
ax32.set_xlabel("Fig3")
ax3.set_ylabel("Y5")
ax32.set_ylabel("Y6")
# pd.Series(range(200)).plot(ax=ax3)
ax42 = ax4.twinx()
ax4.set_xlabel("Fig4")
ax42.set_xlabel("Fig4")
ax4.set_ylabel("Y7")
ax42.set_ylabel("Y8")
# pd.Series(range(10)).plot(ax=ax42)
plt.subplots_adjust(wspace=0.8, hspace=0.8)
plt.show()
I just ran into the same issue because I was mixing plots made with matplotlib and plots made with Pandas.
You should not plot with Pandas. Here is how you could replace
pd.Series(range(10)).plot(ax=ax42)
with
ax42.plot(pd.Series(range(10)))
As Scimonster mentioned above, it worked for me when I plotted all the pandas plots before creating the twinx axes.
I had a few plots on twinx axes which also came from Pandas DataFrame objects (x,t plots). I created separate lists before starting the plot and then plotted them after the first pandas plots.
To summarize, my workflow was:
1. Create lists for the twinx plots.
2. Open the plot and draw all pandas plots on the normal axes.
3. Create the twinx axes.
4. Plot the lists on the twinx axes.
Fortunately, this flow works for me.
