I have multiple CSV files that I am trying to plot in the same figure so that I can compare them. I have read that pandas does not keep the previous plot and creates a new figure on every call. People suggested using an ax variable, but I do not understand how that works...
For now I have:
def scatter_plot(csvfile,param,exp):
    for i in range (1,10):
        df = pd.read_csv('{}{}.csv'.format(csvfile,i))
        ax = df.plot(kind='scatter',x=param,y ='Adjusted')
        df.plot.line(x=param,y='Adjusted',ax=ax,style='b')
    plt.show()
    plt.savefig('plot/{}/{}'.format(exp,param),dpi=100)
But it shows me ten plots and only saves the last one.
Any idea?
The structure is:
1. Create an axes to plot to.
2. Run the loop to populate the axes.
3. Save and/or show (save before show).
In terms of code:
import matplotlib.pyplot as plt
import pandas as pd
ax = plt.gca()
for i in range (1,10):
    df = pd.read_csv(...)
    df.plot(..., ax=ax)
    df.plot.line(..., ax=ax)
plt.savefig(...)
plt.show()
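To make the skeleton above concrete, here is a minimal runnable sketch. The file names (run1.csv and so on), the column name param, the colours, and the temporary directory are all assumptions for illustration; only the 'Adjusted' column comes from the question. It writes three small CSV files first so that it runs on its own.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: three small CSV files in a temporary directory.
tmpdir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
for i in range(1, 4):
    sample = pd.DataFrame({"param": np.arange(10),
                           "Adjusted": rng.normal(i, 0.2, 10)})
    sample.to_csv(os.path.join(tmpdir, f"run{i}.csv"), index=False)

# 1. Create one axes up front.
fig, ax = plt.subplots()

# 2. Pass the same axes to every pandas plot call inside the loop.
colors = ["tab:blue", "tab:orange", "tab:green"]
for i, color in zip(range(1, 4), colors):
    df = pd.read_csv(os.path.join(tmpdir, f"run{i}.csv"))
    df.plot(kind="scatter", x="param", y="Adjusted", ax=ax, color=color)
    df.plot.line(x="param", y="Adjusted", ax=ax, color=color, label=f"run{i}")

# 3. Save once, after the loop (a plt.show() call would go after this line).
out_path = os.path.join(tmpdir, "comparison.png")
fig.savefig(out_path, dpi=100)
```

Because every call receives ax=ax, pandas draws onto the existing axes instead of opening a new figure, so the saved image contains all three files.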
Related
I am trying to create a figure which contains 9 subplots (3 x 3). X, and Y axis data is coming from the dataframe using groupby. Here is my code:
fig, axs = plt.subplots(3,3)
for index,cause in enumerate(cause_list):
    df[df['CAT']==cause].groupby('RYQ')['NO_CONSUMERS'].mean().axs[index].plot()
    axs[index].set_title(cause)
plt.show()
However, it does not produce the desired output; in fact, it raises an error. If I remove axs[index] before plot() and pass it inside the call instead, as plot(ax=axs[index]), it runs and produces nine subplots, but no data is displayed in them (as shown in the figure).
Could anyone point out where I am making the mistake?
You need to flatten axs; otherwise it is a 2D array. You can also pass the axes to the plot function through its ax argument (see the pandas plot documentation). For example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
cause_list = np.arange(9)
df = pd.DataFrame({'CAT':np.random.choice(cause_list,100),
                   'RYQ':np.random.choice(['A','B','C'],100),
                   'NO_CONSUMERS':np.random.normal(0,1,100)})
fig, axs = plt.subplots(3,3,figsize=(8,6))
axs = axs.flatten()
for index,cause in enumerate(cause_list):
    df[df['CAT']==cause].groupby('RYQ')['NO_CONSUMERS'].mean().plot(ax=axs[index])
    axs[index].set_title(cause)
plt.tight_layout()
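If it helps to see why the flatten step matters, this small sketch (variable names are my own) inspects the array that plt.subplots(3, 3) returns. Indexing the 2D array with a single integer gives a whole row of axes rather than one axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, axs = plt.subplots(3, 3)

print(axs.shape)     # (3, 3): a 2D grid of axes
print(axs[0].shape)  # (3,): a single integer index returns a row of three axes

flat = axs.flatten()
print(flat.shape)    # (9,): now each integer index is exactly one axes

# Flattening is row-major, so position 4 is the middle panel of the grid.
flat[4].set_title("middle panel")
```

After flattening, the enumerate index from the loop maps directly onto one subplot, which is what plot(ax=...) expects.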
I'm pretty new to Python. I'm trying to plot the mean values of some sample data as a bar chart, and I have that part of the code working.
I'm also trying to draw the standard error values on this chart using yerr.
My code:
data3=pd.read_csv('demo1.csv')
names=['brow', 'harr', 'hage', 'buch', 'mcre']
d=[data3['brow'].mean(),data3['harr'].mean(),data3['hage'].mean(),data3['buch'].mean(),data3['mcre'].mean()]
N=len(data3['co'])
l=math.sqrt(N)
k=[(data3['brow'].std())/l,(data3['harr'].std())/l,(data3['hage'].std())/l,(data3['buch'].std())/l,(data3['mcre'].std())/l,(data3['phil'].std())/l,(data3['moor'].std())/l]
fig, ax = plt.subplots()
plt.bar(names,d)
plt.bar(len(names),d,yerr=k,align='center',alpha=0.5,ecolor='black',capsize=10)
I'm getting an image such as this
But I want the black lines to sit on each bar, not to appear as a new bar in the plot with all of them together. How can I change this? Am I using plt the wrong way? Please help.
I don't understand what you were trying to do with your second call to plt.bar(). Pass yerr to a single plt.bar() call instead:
import math
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
names=['brow', 'harr', 'hage', 'buch', 'mcre']
# random sample data standing in for demo1.csv
data3 = pd.DataFrame({n: np.random.normal(loc=np.random.randint(5,10), scale=np.random.randint(1,10), size=(100,)) for n in names})
d=data3[names].mean()
N=len(data3)
l=math.sqrt(N)
# standard error of the mean for each column
k=data3[names].std()/l
fig, ax = plt.subplots()
plt.bar(names,d,yerr=k,align='center',alpha=0.5,ecolor='black',capsize=10)
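As a side note, pandas also has a built-in DataFrame.sem() method, which with its default settings should give the same values as std()/sqrt(N) when no data is missing. The sketch below uses made-up random data (the seed and distribution parameters are assumptions) and draws the bars with the error bars in one call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

names = ['brow', 'harr', 'hage', 'buch', 'mcre']

# Hypothetical sample data: 100 rows per column.
rng = np.random.default_rng(0)
data3 = pd.DataFrame({n: rng.normal(loc=5 + i, scale=2, size=100)
                      for i, n in enumerate(names)})

means = data3[names].mean()

# Two ways to get the standard error of the mean.
sem_manual = data3[names].std() / np.sqrt(len(data3))
sem_builtin = data3[names].sem()

# One bar call: the error bars are attached to the bars they belong to.
fig, ax = plt.subplots()
bars = ax.bar(names, means, yerr=sem_builtin, align='center',
              alpha=0.5, ecolor='black', capsize=10)
```

Using sem() saves the manual square root and keeps the error values aligned with the column names.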
I am using the integrated plot() function in pandas to generate a graph with two y-axes. This works well and the legend even points to the (right) y-axis for the second data set. But imho the legend's position is bad.
However, when I update the legend position I get two legends: the correct one ('A', 'B (right)') at an inconvenient location, and a wrong one ('A' only) at the chosen location.
So now I want to generate a legend on my own and was looking for the second <matplotlib.lines.Line2D>, but it is not contained in the ax environment.
import pandas as pd
df = pd.DataFrame({"A":[1,2,3],"B":[1/4,1/5,1/6]})
ax = df.plot(secondary_y=['B'])
len(ax.lines)
>>> 1
My ultimate objective is to be able to move the correct legend around, but I am confident I could manually place a legend, if only I had access to the second line container.
If I had, I was going to suppress the original legend by invoking df.plot(...,legend=None) and do something like plt.legend([ax.lines[0],ax.lines[1]],['A','B (right)'],loc='center left',bbox_to_anchor=(1.2, 0.5)). But ax only stores the first line "A", where is the second?
Also ax.get_legend_handles_labels() only contains ([<matplotlib.lines.Line2D at 0x2630e2193c8>], ['A']).
You create two axes. Each contains a line. So you need to loop over the axes and take the line(s) from each of them.
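import pandas as pd
df = pd.DataFrame({"A":[1,2,3],"B":[1/4,1/5,1/6]})
ax = df.plot(secondary_y=['B'])
# collect the lines from every axes in the figure
# (a plain list also works when the axes hold different numbers of lines)
lines = [line for axes in ax.figure.axes for line in axes.lines]
print(lines)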
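With both lines in hand, the manual legend described in the question can be built roughly as follows. This is only a sketch: the anchor position is an assumption, and the label text is read from the lines themselves because the exact wording (for example whether " (right)" is included) can differ between pandas versions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [1/4, 1/5, 1/6]})

# Suppress the automatic legend, then gather the lines from both axes.
ax = df.plot(secondary_y=['B'], legend=False)
lines = [line for axes in ax.figure.axes for line in axes.lines]
labels = [line.get_label() for line in lines]

# Place a single legend to the right of the plot area.
legend = ax.legend(lines, labels, loc='center left', bbox_to_anchor=(1.05, 0.5))
```

Since the legend sits outside the axes, saving with bbox_inches='tight' or widening the right margin may be needed to keep it inside the saved image.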
To create a single legend, however, you may just use a figure legend:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"A":[1,2,3],"B":[1/4,1/5,1/6]})
ax = df.plot(secondary_y=['B'], legend=False)
ax.figure.legend()
plt.show()
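Since the original goal was to move the legend around, here is a sketch of positioning the figure legend. The chosen corner is an assumption; bbox_transform=ax.transAxes makes the anchor coordinates relative to the axes rather than to the whole figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [1/4, 1/5, 1/6]})
ax = df.plot(secondary_y=['B'], legend=False)
fig = ax.figure

# The figure legend collects the labelled lines from all axes in the figure.
# Anchor its upper-left corner to the upper-left corner of the axes.
fig_legend = fig.legend(loc='upper left', bbox_to_anchor=(0, 1),
                        bbox_transform=ax.transAxes)
```

Changing loc and bbox_to_anchor moves the one combined legend without producing a second one.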
I have a conceptual problem in the basic structure of matplotlib.
I want to add a caption to a graph, and I do understand the advice given in "Is there a way of drawing a caption box in matplotlib".
However, I do not know how to combine this with the pandas data frame I have.
Without the structure given in the link above, my code looks like this (projects1 being my pandas data frame):
ax2=projects1.T.plot.bar(stacked=True)
ax2.set_xlabel('Year',size=20)
and it returns a barplot.
But if I want to apply the structure of above, I get stuck. I tried:
fig = plt.figure()
ax2 = fig.add_axes((.1,.4,.8,.5))
ax2.plot.bar(projects1.T,stacked=True)
And it results in various errors.
So the question is: how do I apply the structure from the link above to a pandas data frame, and to graphs more complex than a mere line? Thx
The pandas plot function has an optional argument, ax, which can be used to supply an externally created matplotlib axes instance to the pandas plot.
import matplotlib.pyplot as plt
import pandas as pd
projects1 = ...?
fig = plt.figure()
ax2 = fig.add_axes((.1,.4,.8,.5))
projects1.T.plot.bar(stacked=True, ax = ax2)
ax2.set_xlabel('Year',size=20)
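For completeness, here is a runnable sketch with the caption added. The contents of projects1 and the caption text are made up for illustration, and I am assuming the caption is placed with fig.text in the free space below the axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: rows are project types, columns are years.
projects1 = pd.DataFrame({2015: [3, 1], 2016: [4, 2], 2017: [2, 5]},
                         index=['internal', 'external'])

fig = plt.figure()
# Leave the lower part of the figure empty so there is room for a caption.
ax2 = fig.add_axes((.1, .4, .8, .5))
projects1.T.plot.bar(stacked=True, ax=ax2)
ax2.set_xlabel('Year', size=20)

# Caption text in figure coordinates (0 to 1), below the axes.
caption = fig.text(.1, .1, 'Hypothetical caption: projects per year, by type.')
```

The same pattern should work for other pandas plot kinds, since they all accept the ax argument.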
I'm trying to get my figures onto just one PDF page, but I don't know how to do this. I found out that it's possible to save multiple figures in a PDF file with 'matplotlib.backends.backend_pdf', but that doesn't put them on a single page.
Does anyone have any ideas? Should I convert the figures into just one figure?
You can use matplotlib's GridSpec to place multiple plots in one figure:
http://matplotlib.org/users/gridspec.html
from matplotlib.gridspec import GridSpec
import random
import numpy
from matplotlib import pyplot as pl
fig = pl.figure(figsize=(12, 16))
G = GridSpec(2,2)
axes_1 = pl.subplot(G[0, :])
x = [random.gauss(3,1) for _ in range(400)]
bins = numpy.linspace(-10, 10, 100)
axes_1.hist(x, bins, alpha=0.5, label='x')
axes_2 = pl.subplot(G[1, :])
axes_2.plot(x)
pl.tight_layout()
pl.show()
You can change the rows and column values and can subdivide the sections.
The PDF backend makes one page per figure. Use subplots to get multiple plots into one figure and they'll all show up together on one page of the PDF.
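As a rough illustration of that point, the sketch below puts four plots into one figure with plt.subplots and saves it as a PDF. The data and the output location (a temporary directory) are assumptions.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# One figure holding a 2 x 2 grid of plots.
fig, axs = plt.subplots(2, 2, figsize=(8, 6))
x = np.linspace(0, 2 * np.pi, 200)
for k, ax in enumerate(axs.flatten(), start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title('sin({}x)'.format(k))
fig.tight_layout()

# One figure means one page: all four plots end up together in the PDF.
pdf_path = os.path.join(tempfile.mkdtemp(), 'all_plots.pdf')
fig.savefig(pdf_path)
```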
Here is a solution provided by matplotlib:
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt
with PdfPages('foo.pdf') as pdf:
    # As many times as you like, create a figure fig and save it:
    fig = plt.figure()
    pdf.savefig(fig)
    # ....
    fig = plt.figure()
    pdf.savefig(fig)
Voilà
Find a full example here: multipage pdf matplotlib
And by the way, for a single figure you don't need matplotlib.backends.backend_pdf; just use a .pdf extension, like so:
plt.savefig("foo.pdf")
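A runnable version of the PdfPages pattern might look like the sketch below; the plotted data, the number of figures, and the temporary output path are assumptions. Each pdf.savefig call adds one page, which is why this approach gives several pages rather than one.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages

pdf_path = os.path.join(tempfile.mkdtemp(), 'multipage.pdf')
x = np.linspace(0, 1, 50)

with PdfPages(pdf_path) as pdf:
    for k in range(1, 4):
        fig, ax = plt.subplots()
        ax.plot(x, x ** k)
        ax.set_title('x^{}'.format(k))
        pdf.savefig(fig)  # each saved figure becomes its own page
        plt.close(fig)    # release the figure once it is written
    n_pages = pdf.get_pagecount()

print(n_pages)  # 3
```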