I have a conceptual problem with the basic structure of matplotlib.
I want to add a caption to a graph, and I understand the advice given in "Is there a way of drawing a caption box in matplotlib".
However, I do not know how to combine this with the pandas data frame I have.
Without the structure given in the link above, my code looks like this (projects1 being my pandas data frame):
ax2=projects1.T.plot.bar(stacked=True)
ax2.set_xlabel('Year',size=20)
and it returns a bar plot.
But if I want to apply the structure from the link above, I get stuck. I tried:
fig = plt.figure()
ax2 = fig.add_axes((.1,.4,.8,.5))
ax2.plot.bar(projects1.T,stacked=True)
This results in various errors.
So the question is: how do I apply the structure from the link above to a pandas data frame, and to graphs more complex than a mere line? Thanks.
The pandas plot function has an optional argument, ax, which can be used to supply an externally created matplotlib axes instance for pandas to draw into.
import matplotlib.pyplot as plt
import pandas as pd
projects1 = ...?
fig = plt.figure()
ax2 = fig.add_axes((.1,.4,.8,.5))
projects1.T.plot.bar(stacked=True, ax = ax2)
ax2.set_xlabel('Year',size=20)
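For reference, here is a minimal, self-contained sketch of the same idea with the caption included. The contents of projects1 below are made up purely for illustration, and placing the caption with fig.text is an assumption about what the linked answer suggests, not something confirmed in this thread.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for projects1: rows are categories, columns are years.
projects1 = pd.DataFrame(
    {"2015": [3, 1], "2016": [4, 2], "2017": [2, 5]},
    index=["internal", "external"],
)

fig = plt.figure()
# The axes starts at 40% of the figure height, leaving room below for a caption.
ax2 = fig.add_axes((0.1, 0.4, 0.8, 0.5))

# pandas draws into the supplied axes and returns that same axes object.
returned_ax = projects1.T.plot.bar(stacked=True, ax=ax2)
ax2.set_xlabel("Year", size=20)

# Caption positioned in figure coordinates, in the free space under the axes.
caption = fig.text(0.1, 0.05, "Caption: projects per year (made-up data).")
```

Call plt.show() or fig.savefig(...) afterwards to view the result. Because the axes is created first and then handed to pandas, any other matplotlib call (text, annotations, extra axes) can be combined with the pandas plot in the same figure.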
I tried to create two graphs side by side using matplotlib.
I don't get any errors when I run my code; instead, I just get a blank matplotlib window.
Here's the link I used for my CSV.
https://ca.finance.yahoo.com/quote/%5EGSPTSE/history?p=%5EGSPTSE
Previously, I created a graph that overlaid the two lines (which works as intended), but they are not displayed as separate graphs, which is what I am trying to do with my current code.
I used this video for information on creating these graphs, but I can't replicate the graph shown in the video even when I copy the code.
https://www.youtube.com/watch?v=-2AMr95nUDw
from matplotlib import pyplot as mpl
import pandas as pd
data_better = pd.read_csv('What.csv')
# print(data_better.head()) #I used this part to find out what the headers were for x values
# print(data_better.columns[::])
mpl.axes([15000, 17000, 20000, 23000])
mpl.title("Open Values")
mpl.plot(data_better["Date"], data_better["Open"])
mpl.ylabel("Money")
mpl.axes([15000, 17000, 20000, 23000])
mpl.title("Close Values")
mpl.plot(data_better["Date"], data_better["Close"])
mpl.ylabel("Money")
mpl.show()
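pyplot.axes accepts a 4-tuple of floats, (left, bottom, width, height), in normalized (0, 1) figure units to place the axes. Values such as 15000 or 23000 put the axes far outside the visible figure, which is why the window is blank. You can look at the examples in Make Room For Ylabel Using Axesgrid to learn how to use it.
If you want two plots in one figure, you need to use a different axes for each: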
from matplotlib import pyplot as plt
import pandas as pd
data_better = pd.read_csv('What.csv')
figure, (axes1, axes2) = plt.subplots(nrows=1, ncols=2)
axes1.set_title("Open Values")
axes1.plot(data_better["Date"], data_better["Open"])
axes1.set_ylabel("Money")
axes2.set_title("Close Values")
axes2.plot(data_better["Date"], data_better["Close"])
axes2.set_ylabel("Money")
plt.show()
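If you prefer to keep calling pyplot.axes directly, the normalized rectangle form mentioned above can be sketched as follows. The data frame here is a made-up stand-in for the CSV, and the rectangle values are just one plausible side-by-side layout.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the CSV used in the question.
data = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=5),
    "Open": [16000, 16100, 16050, 16200, 16300],
    "Close": [16080, 16040, 16180, 16310, 16250],
})

fig = plt.figure(figsize=(10, 4))

# Each rectangle is (left, bottom, width, height) as a fraction of the figure.
left_ax = plt.axes((0.08, 0.15, 0.38, 0.75))
left_ax.set_title("Open Values")
left_ax.plot(data["Date"], data["Open"])
left_ax.set_ylabel("Money")

right_ax = plt.axes((0.58, 0.15, 0.38, 0.75))
right_ax.set_title("Close Values")
right_ax.plot(data["Date"], data["Close"])
right_ax.set_ylabel("Money")
```

plt.subplots is usually less work because it computes these rectangles for you; the manual form is mainly useful when you need an unusual layout.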
I want to create an output widget with several dropdowns and a Seaborn plot, as follows.
The intention is to have different dropdowns for choosing variables and to output a different Seaborn plot according to the user's input.
Here is the code. It works, but not as desired.
dd = wd.Dropdown(
    options=["YlGnBu","Blues","BuPu","Greens"],
    value="Blues",
    description='choose cmap:',
    disabled=False,
)
myout = wd.Output()
def draw_plot(change):
    with myout:
        myout.clear_output()
        print('plotting out the following: sns.heatmap(df.corr(), annot=True, cmap = "YlGnBu")')
        plt.figure(figsize=(8, 3))
        sns.heatmap(df.corr(), annot=True, cmap = change.new)
        plt.title("Correlation matrix");
dd.observe(draw_plot, names='value')
display(dd)
display(myout)
The above code DOES NOT CLEAR THE OUTPUT WIDGET every time a new dropdown value is selected; the Seaborn plots keep being added.
I saw these solutions:
Stop seaborn plotting multiple figures on top of one another
which, in my opinion, are not satisfactory. I would like to display a new figure in the output widget every time, i.e. totally clear the content and then plot again.
I don't understand why clear_output() clears the text line but not the figure.
Secondly, as some answers in the linked question pointed out, when working with Seaborn I don't want to resort to the underlying library, i.e. Matplotlib. I consider that a workaround.
So what is the proper way to go?
Thanks
The following code works for me.
I'm not sure if that was a typo, but you wrote clear_output() instead of myout.clear_output().
Also, I believe you need to call plt.show() at the end of your callback.
import ipywidgets as wd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
a = np.random.random(size=(5,5))
dd = wd.Dropdown(
    options=["YlGnBu","Blues","BuPu","Greens"],
    value="Blues",
    description='choose cmap:',
    disabled=False,
)
myout = wd.Output()
def draw_plot(change):
    with myout:
        # clear first, so the message printed below is not wiped straight away
        myout.clear_output()
        print('plotting out the following: sns.heatmap(df.corr(), annot=True, cmap = "{}")'.format(change.new))
        sns.heatmap(a, annot=True, cmap = change.new)
        # render the figure inside the output widget
        plt.show()
dd.observe(draw_plot, names='value')
display(dd)
display(myout)
I'm an R programmer learning Python, and I find plotting in Python much more difficult than in R.
I'm trying to write the following function but haven't been successful. Could anyone help?
import pandas as pd
#example data
df1 = pd.DataFrame({
    'PC1':[-2.2,-2.0,2.04,0.97],
    'PC2':[0.5,-0.6,0.9,-0.5],
    'PC3':[-0.1,-0.2,0.2,0.8],
    'f1':['a','a','b','b'],
    'f2':['x','y','x','y'],
    'f3':['k','g','g','k']
})
def drawPCA(df,**kwargs):
    """Produce a 1x3 grid of scatterplot subplots; each subplot shows two PCs with
    no legend, e.g. subplot 1 is PC1 vs PC2. The legend is at the upper middle of
    the figure.

    Parameters
    ----------
    df: Pandas DataFrame
        The first 3 columns are the PCs, followed by sample characteristics.
    kwargs
        To specify hue, style, size, etc. if the plotting uses seaborn.scatterplot;
        or c, s, etc. if using pyplot scatter

    Example
    ----------
    drawPCA(df1, hue="f1")
    drawPCA(df1, c="f1", s="f2") #if plotting uses plt.scatter
    drawPCA(df1, hue="f1", size="f2",style="f3")
    or more variables passable to the actual plotting function
    """
This is what I came up with! Just two questions:
Is there a parameter to make the legend horizontal, instead of using ncol?
How do I prevent the figure from being displayed when running the function like this?
fig,ax=drawPCA(df1,hue="f1",style="f2",size="f3")
#may do more changing on the figure.
Here is the function:
def drawPCA2(df,**kwargs):
    import matplotlib.pyplot as plt
    import seaborn as sns
    from matplotlib.figure import figaspect
    nUniVals=sum([df[i].unique().size for i in kwargs.values()])
    nKeys=len(kwargs.keys())
    w, h = figaspect(1/3)
    fig1, axs = plt.subplots(ncols=3,figsize=(w,h))
    fig1.suptitle("All the PCs")
    sns.scatterplot(x="PC1",y="PC2",data=df,legend=False,ax=axs[0],**kwargs)
    sns.scatterplot(x="PC1",y="PC3",data=df,legend=False,ax=axs[1],**kwargs)
    sns.scatterplot(x="PC2",y="PC3",data=df,ax=axs[2],label="",**kwargs)
    handles, labels = axs[2].get_legend_handles_labels()
    fig1.legend(handles, labels, loc='lower center',bbox_to_anchor=(0.5, 0.85), ncol=nUniVals+nKeys)
    axs[2].get_legend().remove()
    fig1.tight_layout(rect=[0, 0.03, 1, 0.9])
    return fig1,axs
I have multiple CSV files that I am trying to plot in the same figure so I can compare them. I have already read that pandas does not keep the plot in memory and creates a new one every time. People were talking about using an ax variable, but I do not understand it...
For now I have:
def scatter_plot(csvfile,param,exp):
    for i in range (1,10):
        df = pd.read_csv('{}{}.csv'.format(csvfile,i))
        ax = df.plot(kind='scatter',x=param,y ='Adjusted')
        df.plot.line(x=param,y='Adjusted',ax=ax,style='b')
    plt.show()
    plt.savefig('plot/{}/{}'.format(exp,param),dpi=100)
But it shows me ten plots and only saves the last one.
Any ideas?
The structure is:
1. create an axes to plot to
2. run the loop to populate the axes
3. save and/or show (save before show)
In terms of code:
import matplotlib.pyplot as plt
import pandas as pd
ax = plt.gca()
for i in range (1,10):
    df = pd.read_csv(...)
    df.plot(..., ax=ax)
    df.plot.line(..., ax=ax)
plt.savefig(...)
plt.show()
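The same structure as a runnable sketch. Since the CSV files are not available here, each iteration builds a small synthetic frame in place of pd.read_csv; the column name "param", the output file name, and the three iterations are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# 1. Create one axes before the loop.
fig, ax = plt.subplots()

for i in range(1, 4):
    # Stand-in for pd.read_csv(...): synthetic data, one frame per "file".
    x = np.linspace(0, 1, 6)
    df = pd.DataFrame({"param": x, "Adjusted": i * x})
    # 2. Every call draws into the same axes instead of opening a new figure.
    df.plot(kind="scatter", x="param", y="Adjusted", ax=ax)
    df.plot.line(x="param", y="Adjusted", ax=ax, label="file {}".format(i))

# 3. Save first; call plt.show() only afterwards if a window is wanted.
fig.savefig("comparison.png", dpi=100)
```

Saving before showing matters because, with interactive backends, the figure may already be closed and empty once plt.show() returns.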
I have a data set that has two independent variables and one dependent variable. I thought the best way to represent the data set is a checkerboard-type plot in which the color of each cell represents a range of values, like this:
I can't seem to find code to do this automatically.
You need to use a plotting package to do this. For example, with matplotlib:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
X = 100*np.random.rand(6,6)
fig, ax = plt.subplots()
i = ax.imshow(X, cmap=cm.jet, interpolation='nearest')
fig.colorbar(i)
plt.show()
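The example above uses a random matrix. If the data set is instead stored as one row per pair of independent values, it has to be reshaped into a grid first. A minimal sketch, assuming hypothetical column names x, y (independent) and z (dependent) and made-up values:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical long-form data: two independent variables and one dependent one.
records = pd.DataFrame({
    "x": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "y": [10, 20, 30, 10, 20, 30, 10, 20, 30],
    "z": [5, 40, 75, 15, 50, 85, 25, 60, 95],
})

# Rows become y values, columns become x values, cells hold z.
grid = records.pivot(index="y", columns="x", values="z")

fig, ax = plt.subplots()
image = ax.imshow(grid.to_numpy(), cmap="viridis",
                  interpolation="nearest", origin="lower")

# Label the cells with the actual x and y values rather than array indices.
ax.set_xticks(range(grid.shape[1]))
ax.set_xticklabels([str(v) for v in grid.columns])
ax.set_yticks(range(grid.shape[0]))
ax.set_yticklabels([str(v) for v in grid.index])
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.colorbar(image, label="z")
```

The pivot step assumes every (x, y) pair occurs exactly once; duplicate pairs would need to be aggregated first.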
For those who, like me, come across this years later: what the original poster wants is a heatmap.
Matplotlib has documentation regarding the following example here.