Plotting Errorbars from different DataFrame into SubPlots with matplotlib - python

I just stumbled upon a problem I simply cannot solve. I have a dataset with raw data, which I have uploaded here: https://file.io/oJqkZjAGyqV1
It's an Excel file with the data inside.
I then wrote some code to open it, read it, and compute the mean and SEM of my data, as shown below.
# Import required packages
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from pylab import cm
df = pd.read_excel("Chlorophyll_data_mod.xlsx")
#----Calculation of meanvalues and sem from raw_data---------
meandf2 = df.set_index(["Group"])
sets = []
for x in ["A","B","AB","xc"]:
    meandf3 = meandf2.filter(like=f"Chl_{x}_").reset_index()
    sets.append(meandf3)
#---------Grouping DataFrame----------#
means = []
ster = []
for x in range(len(sets)):
    meandf = sets[x].groupby(["Group"]).mean()
    meandf = meandf.reset_index()
    means.append(meandf)
    sems = sets[x].groupby("Group").sem()
    sems = sems.reset_index()
    ster.append(sems)
#----Selecting Dataframe from List-----#
plotdf = means[0]
ploter = ster[0]
plotgroup = plotdf.iloc[:,[0,]]
plotdata = plotdf.iloc[:,[1,]]
grouparray = plotgroup.to_numpy()
dataarray = plotdata.to_numpy()
#-----CreatePlot------#
fig, ax = plt.subplots(nrows=3, ncols=1, sharex="all", figsize=(10,8))
plotdf.plot(ax=ax[0,],x="Group",y="Chl_A_0D", kind="bar", legend=False, color="black")
plt.errorbar(x=plotdf["Group"], y=plotdf["Chl_A_0D"],yerr=ploter["Chl_A_0D"])
plotdf.plot(ax=ax[1,],x="Group",y="Chl_A_10DaT", kind="bar", legend=False, color="blue")
plt.errorbar(x=plotdf["Group"], y=plotdf["Chl_A_10DaT"],yerr=ploter["Chl_A_10DaT"])
plotdf.plot(ax=ax[2,],x="Group",y="Chl_A_7DaR", kind="bar", legend=False, color="magenta")
plt.errorbar(x=plotdf["Group"], y=plotdf["Chl_A_7DaR"],yerr=ploter["Chl_A_7DaR"])
#----Legend of the Plot-----#
fig.legend(loc="lower center", bbox_to_anchor=(0.5,0), fancybox=True, ncol=6)
#----Layout------#
plt.tight_layout(rect=[0, 0.02, 1,1])
plt.show()
With this I manage to create a figure with subplots showing three of the data columns I am interested in. However, I struggle with the error bars.
My approach was to calculate the SEM and store it in a new DataFrame, and then just read it from there for yerr. However, this doesn't work.
plotdf.plot(ax=ax[2,],x="Group",y="Chl_A_7DaR", kind="bar", legend=False, color="magenta", yerr=ploter["Chl_A_7DaR"])
This results in an array error because of the structure.
My current approach, as in the main code above, only draws the error bars in the last subplot, not in each individual plot.
Maybe someone here could help me understand this function?
Best regards
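One possible way around this, offered as a sketch rather than a definitive fix: plt.errorbar draws on the current Axes (here the last subplot created), so calling the drawing method on each Axes object keeps the bars and their error bars together. The example below uses Axes.bar with its yerr argument. The two DataFrames are made-up stand-ins for means[0] and ster[0], since the Excel file is not reproduced here; the group names and numbers are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for means[0] and ster[0] from the question
plotdf = pd.DataFrame({
    "Group": ["ctrl", "low", "high"],
    "Chl_A_0D": [1.0, 1.2, 0.9],
    "Chl_A_10DaT": [0.8, 1.1, 0.7],
    "Chl_A_7DaR": [0.9, 1.3, 0.6],
})
ploter = pd.DataFrame({
    "Group": ["ctrl", "low", "high"],
    "Chl_A_0D": [0.10, 0.08, 0.12],
    "Chl_A_10DaT": [0.05, 0.09, 0.07],
    "Chl_A_7DaR": [0.06, 0.11, 0.04],
})

columns = ["Chl_A_0D", "Chl_A_10DaT", "Chl_A_7DaR"]
colors = ["black", "blue", "magenta"]

fig, axes = plt.subplots(nrows=3, ncols=1, sharex="all", figsize=(10, 8))
containers = []
for ax, col, color in zip(axes, columns, colors):
    # Draw bars and error bars on this specific Axes, not on the current one
    bars = ax.bar(plotdf["Group"], plotdf[col], yerr=ploter[col],
                  color=color, ecolor="gray", capsize=4, label=col)
    containers.append(bars)

fig.legend(loc="lower center", ncol=3)
fig.tight_layout(rect=[0, 0.05, 1, 1])
# Call plt.show() or fig.savefig(...) here to view the result
```

Calling axes[i].errorbar(...) instead of plt.errorbar(...) should work along the same lines if the bars are still drawn with pandas.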

Related

Moving Graph Titles in the Y axis of Subplots

This question is adapted from this answer; however, the solution provided there does not work, and the following is my result. I am interested in adding an individual title on the right side of each subplot.
(P.S. No matter how much y-axis offset I provide, the title seems to stay at the same y-value.)
from matplotlib import pyplot as plt
import numpy as np
fig, axes = plt.subplots(nrows=2)
ax0label = axes[0].set_ylabel('Axes 0')
ax1label = axes[1].set_ylabel('Axes 1')
title = axes[0].set_title('Title')
offset = np.array([-0.15, 0.0])
title.set_position(ax0label.get_position() + offset)
title.set_rotation(90)
fig.tight_layout()
plt.show()
Something like this? This is the only other way I can think of.
from matplotlib import pyplot as plt
import numpy as np
fig, axes = plt.subplots(nrows=2)
ax0label = axes[0].set_ylabel('Axes 0')
ax1label = axes[1].set_ylabel('Axes 1')
ax01 = axes[0].twinx()
ax02 = axes[1].twinx()
ax01.set_ylabel('title')
ax02.set_ylabel('title')
fig.tight_layout()
plt.show()
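Another option, sketched here under the assumption that a rotated text label is an acceptable substitute for a real title: place the text in Axes coordinates with Axes.text, just outside the right edge. The offset 1.02 and the label strings are illustrative choices, not values from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2)
side_titles = []
for i, ax in enumerate(axes):
    ax.set_ylabel(f"Axes {i}")
    # (1.02, 0.5) is in Axes coordinates: slightly right of the frame, vertically centered
    label = ax.text(1.02, 0.5, f"Title {i}", transform=ax.transAxes,
                    rotation=270, ha="left", va="center")
    side_titles.append(label)

fig.tight_layout()
# Call plt.show() or fig.savefig(...) here to view the result
```

Because the position is given in Axes coordinates, each label stays attached to its own subplot and no twin axis is needed.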

Plot GeoDataFrame with multiple column attributes

I'm using geopandas (python 3.8.2) to plot variables contained in a geodataframe.
I would like to plot on a single figure, all datasets with their own colormap.
The problem is that the plot shows only the last dataset, which corresponds to 'var_5' with colormap 'Reds'. Even if I set: ax = geodataframe.plot() it does not work.
Any ideas? Many thanks!
import geopandas as gpd
import matplotlib.pyplot as plt
filename = 'myfile.geojson'
geodataframe = gpd.read_file(filename)
cmaps = ['plasma', 'Greens', 'Blues', 'binary', 'Reds']
variables = ['var_1', 'var_2', 'var_3', 'var_4', 'var_5']
plt.rcParams['figure.figsize'] = (20, 10)
ax = plt.gca()
for i, var in enumerate(variables):
    geodataframe.plot(ax=ax, column=var, cmap=cmaps[i])
plt.show()

Need to save pandas correlation Highlighted table (cmap Matplotlib) as png image

I used this code to generate the correlation table:
df1.drop(['BC DataPlus', 'AC Glossary'], axis=1).corr(method='pearson').style.format("{:.2}").background_gradient(cmap=plt.get_cmap('coolwarm'), axis=1)
This is the table generated:
I can't find any way to save this table as an image. Thank you.
The question you pose is difficult to answer if taken literally.
The difficulty stems from the fact that df.style.render() generates HTML, which is then sent to a browser to be rendered as an image. The result may not be exactly the same across all browsers either.
Python is not directly involved in the generation of the image, so there is no straightforward Python-based solution.
Nevertheless, the issue of how to convert HTML to PNG was raised on the pandas developers' GitHub page, and the suggested answer was to use phantomjs. Other ways (that I haven't tested) might be to use webkit2png or GrabzIt.
We could avoid much of this difficulty, however, if we loosen the interpretation of the question. Instead of trying to produce the exact image generated by df.style (for a particular browser), we could generate a similar image very easily using seaborn:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.random((6, 4)), columns=list('ABCD'))
fig, ax = plt.subplots()
sns.heatmap(df.corr(method='pearson'), annot=True, fmt='.4f',
            cmap=plt.get_cmap('coolwarm'), cbar=False, ax=ax)
ax.set_yticklabels(ax.get_yticklabels(), rotation="horizontal")
plt.savefig('result.png', bbox_inches='tight', pad_inches=0.0)
If you don't want to add the seaborn dependency, you could use matplotlib directly though it takes a few more lines of code:
import colorsys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.random((6, 4)), columns=list('ABCD'))
corr = df.corr(method='pearson')
fig, ax = plt.subplots()
data = corr.values
heatmap = ax.pcolor(data, cmap=plt.get_cmap('coolwarm'),
                    vmin=np.nanmin(data), vmax=np.nanmax(data))
ax.set_xticks(np.arange(data.shape[1])+0.5, minor=False)
ax.set_yticks(np.arange(data.shape[0])+0.5, minor=False)
ax.invert_yaxis()
row_labels = corr.index
column_labels = corr.columns
# x ticks run across the columns, y ticks down the rows
ax.set_xticklabels(column_labels, minor=False)
ax.set_yticklabels(row_labels, minor=False)
def _annotate_heatmap(ax, mesh):
    """
    **Taken from seaborn/matrix.py**
    Add textual labels with the value in each cell.
    """
    mesh.update_scalarmappable()
    xpos, ypos = np.meshgrid(ax.get_xticks(), ax.get_yticks())
    # ravel() is a no-op on a 1-D array and flattens a 2-D one
    for x, y, val, color in zip(xpos.flat, ypos.flat,
                                mesh.get_array().ravel(), mesh.get_facecolors()):
        if val is not np.ma.masked:
            # Pick light or dark text depending on the cell's lightness
            _, l, _ = colorsys.rgb_to_hls(*color[:3])
            text_color = ".15" if l > .5 else "w"
            val = ("{:.3f}").format(val)
            text_kwargs = dict(color=text_color, ha="center", va="center")
            # text_kwargs.update(self.annot_kws)
            ax.text(x, y, val, **text_kwargs)
_annotate_heatmap(ax, heatmap)
plt.savefig('result.png', bbox_inches='tight', pad_inches=0.0)

Change Error Bar Markers (Caplines) in Pandas Bar Plot

I am plotting error bars from a pandas DataFrame. Right now each error bar has a weird arrow at the top, but what I want is a horizontal line. For example, a figure like this:
But my error bars end with an arrow instead of a horizontal line.
Here is the code I used to generate it:
plot = meansum.plot(
    kind="bar",
    yerr=stdsum,
    colormap="OrRd_r",
    edgecolor="black",
    grid=False,
    figsize=(8, 2),
    ax=ax,
    position=0.45,
    error_kw=dict(ecolor="black", elinewidth=0.5, lolims=True, marker="o"),
    width=0.8,
)
So what should I change to get the error bars I want? Thanks.
Using plt.errorbar from matplotlib makes this easier, as it returns several objects including the caplines, which contain the marker you want to change (the arrow that is automatically used when lolims is set to True, see docs).
Using pandas, you just need to dig out the correct line among the children of plot and change its marker:
import pandas as pd
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
df = pd.DataFrame({"val":[1,2,3,4],"error":[.4,.3,.6,.9]})
meansum = df["val"]
stdsum = df["error"]
plot = meansum.plot(kind='bar',yerr=stdsum,colormap='OrRd_r',edgecolor='black',grid=False,figsize=(8,2),ax=ax,position=0.45,error_kw=dict(ecolor='black',elinewidth=0.5, lolims=True),width=0.8)
for ch in plot.get_children():
    if str(ch).startswith('Line2D'): # this is silly, but it appears that the first Line in the children are the caplines...
        ch.set_marker('_')
        ch.set_markersize(10) # to change its size
        break
plt.show()
The result looks like:
Just don't set lolims=True and you are good to go. An example with sample data:
import pandas as pd
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
df = pd.DataFrame({"val":[1,2,3,4],"error":[.4,.3,.6,.9]})
meansum = df["val"]
stdsum = df["error"]
plot = meansum.plot(kind='bar',yerr=stdsum,colormap='OrRd_r',edgecolor='black',grid=False,figsize=(8,2),ax=ax,position=0.45,error_kw=dict(ecolor='black',elinewidth=0.5),width=0.8)
plt.show()
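The plt.errorbar route mentioned in the first answer can be sketched roughly as follows, drawing the bars with matplotlib directly instead of pandas. The sample values mirror the answers above; the bar color is an arbitrary stand-in for the colormap. With lolims=True and the default cap size, the first entry of the returned caplines should be the line holding the arrow markers, which is the assumption this sketch relies on.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

vals = [1, 2, 3, 4]
errs = [0.4, 0.3, 0.6, 0.9]
x = list(range(len(vals)))

fig, ax = plt.subplots(figsize=(8, 2))
ax.bar(x, vals, width=0.8, color="orange", edgecolor="black")

# errorbar returns a container that unpacks into (data line, caplines, bar lines)
plotline, caplines, barlinecols = ax.errorbar(
    x, vals, yerr=errs, fmt="none", ecolor="black",
    elinewidth=0.5, lolims=True)

# Swap the arrow marker for a horizontal dash and set its size
caplines[0].set_marker("_")
caplines[0].set_markersize(10)
# Call plt.show() or fig.savefig(...) here to view the result
```

This avoids searching through the children of the Axes, because the caplines are handed back directly.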

seaborn annotate overwrites previous

I am trying to loop through chunks of a pandas DataFrame and append a chart for each chunk to a PDF. Here is sample code:
import random
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib.backends import backend_pdf
df = pd.DataFrame({'a': [a + random.random() for a in range(12)],
                   'b': [b + random.random() for b in range(12, 24)]})
print(df)
chunk_size = 3 # number of rows in heatmap
n_chunks = len(df)//chunk_size # number of pages in heatmap pdf
with backend_pdf.PdfPages('chart.pdf') as pdf_pages:
    for e,(k,g) in enumerate(df.groupby(np.arange(len(df))//chunk_size)):
        #print(k,g.shape)
        snsplot = sns.heatmap(g, annot=True, cbar=False, linewidths=.5) #fmt="d",cmap="YlGnBu",
        pdf_pages.savefig(snsplot.figure)
This code adds pages all right, but all the annotations from previous pages seem to be overlaid (preserved) on all the pages that follow.
Every time you call sns.heatmap it uses plt.gca(), so all of your plotting goes to the same Axes object (each loop might be getting slower too, as all of the previous artists are still rendered, just occluded by the latest one).
I suggest something like
fig, ax = plt.subplots()
with backend_pdf.PdfPages('chart.pdf') as pdf_pages:
    for e,(k,g) in enumerate(df.groupby(np.arange(len(df))//chunk_size)):
        #print(k,g.shape)
        ax.cla()
        snsplot = sns.heatmap(g, annot=True, cbar=False, linewidths=.5, ax=ax)
        pdf_pages.savefig(snsplot.figure)
Which passes in an Axes object so seaborn knows where to draw and explicitly clears it in each loop.
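A variation on the same idea, as a sketch only: create a fresh figure for every chunk and close it after saving, so nothing can carry over between pages. To stay free of extra dependencies, this example swaps sns.heatmap for plain matplotlib (imshow plus text annotations), and it writes the PDF to an in-memory buffer; both choices are assumptions made for illustration, not part of the answer above.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

df = pd.DataFrame({"a": np.arange(12) + 0.5, "b": np.arange(12, 24) + 0.5})
chunk_size = 3  # number of rows per page

buffer = io.BytesIO()  # pass a filename such as 'chart.pdf' to write to disk instead
texts_per_page = []
with PdfPages(buffer) as pdf_pages:
    for _, g in df.groupby(np.arange(len(df)) // chunk_size):
        fig, ax = plt.subplots()  # new figure and Axes for every page
        values = g.to_numpy()
        ax.imshow(values, aspect="auto", cmap="viridis")
        for (row, col), val in np.ndenumerate(values):
            ax.text(col, row, f"{val:.2f}", ha="center", va="center", color="w")
        texts_per_page.append(len(ax.texts))
        pdf_pages.savefig(fig)
        plt.close(fig)  # release the figure so artists do not accumulate
    n_pages = pdf_pages.get_pagecount()
```

Each page then carries only its own annotations, at the cost of creating and closing one figure per chunk.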
