How to add black lines to stacked pandas area plot? - python

I want to add separating black lines to a Python area plot created using pandas. In other words, I want the stacked areas to be separated by black lines.
My current code is the following:
figure1=mydataframe.plot(kind='area', stacked=True)
And I am looking for an additional argument to pass on to the function, such as:
figure1=mydataframe.plot(kind='area', stacked=True, blacklines=True)
Is there a way I can achieve this using pandas or additional matplotlib commands?

Use plt.stackplot(). You can control the line width and color with the linewidth and edgecolor arguments:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(10,3)
df = pd.DataFrame(abs(data))
# edgecolor and linewidth draw a black outline around each stacked area
plt.stackplot(np.arange(10), [df[0], df[1], df[2]], edgecolor='black', linewidth=1)
plt.show()
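To stay closer to the original DataFrame, here is a minimal sketch that passes all columns to Axes.stackplot at once and outlines each area in black. The frame `mydataframe` and its column names are made-up stand-ins with non-negative values. The extra keyword arguments are forwarded to fill_between, which is where edgecolor and linewidth take effect.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data standing in for the asker's DataFrame.
rng = np.random.default_rng(0)
mydataframe = pd.DataFrame(rng.random((10, 3)), columns=["a", "b", "c"])

fig, ax = plt.subplots()
# Transpose so that each column becomes one stacked layer.
areas = ax.stackplot(
    mydataframe.index,
    mydataframe.to_numpy().T,
    labels=mydataframe.columns,
    edgecolor="black",  # black separating lines
    linewidth=1,
)
ax.legend(loc="upper left")
```

Call plt.show() (or let the notebook render the figure) to see the result.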

Related

How to change the background color of df.plot() in Python Pandas?

I want to specify the color of the area surrounding a plot created using df.plot() in pandas.
Using .set_facecolor as in the code below only changes the area inside the axes (see image); I want to change the color outside too.
import pandas as pd
import numpy as np
df = pd.DataFrame(components, columns=['PC1','PC2'])
df.plot('PC1','PC2','scatter').set_facecolor('green')
Replacing the last line with these two lines produces the same graph.
ax = df.plot('PC1','PC2','scatter')
ax.set_facecolor('green')
IIUC, you can use fig.set_facecolor:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
df.plot('PC1','PC2','scatter', ax=ax).set_facecolor('green')
fig.set_facecolor('green')
plt.show()
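A self-contained sketch of the same idea, with random values standing in for the undefined `components` array (an assumption). The figure's facecolor colors the area around the axes, while the axes' facecolor colors the plotting area itself.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Random stand-in for the `components` array from the question (assumption).
components = np.random.default_rng(0).normal(size=(50, 2))
df = pd.DataFrame(components, columns=["PC1", "PC2"])

fig, ax = plt.subplots()
df.plot(x="PC1", y="PC2", kind="scatter", ax=ax)
ax.set_facecolor("green")   # area inside the axes
fig.set_facecolor("green")  # area surrounding the axes
```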

How to plot only one half of a scatter matrix using pandas

I am using pandas scatter_matrix (couldn't get PairGrid in seaborn to work) to plot all combinations of a set of columns in a pandas DataFrame. Each column has 1000 data points and there are nine columns.
I am using the following code:
pandas.plotting.scatter_matrix(df, alpha=0.2, figsize=(8,8))
I get the figure shown below:
This is nice. However, you'll notice that the plot is mirrored across the main diagonal. Is it possible to plot only the lower portion, as in the following fake plot I made using Paint?
This is probably not the cleanest way to do it, but it works:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# iris is assumed to be an existing DataFrame of numeric columns
axes = pd.plotting.scatter_matrix(iris, alpha=0.2, figsize=(8,8))
# hide every subplot above the main diagonal
for i in range(np.shape(axes)[0]):
    for j in range(np.shape(axes)[1]):
        if i < j:
            axes[i,j].set_visible(False)
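A self-contained variant, sketched with random stand-in data instead of the `iris` frame (an assumption): np.triu_indices_from picks out the cells above the diagonal, so they can be hidden without nested loops.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Random stand-in data (assumption): 100 rows, 4 numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("ABCD"))

axes = pd.plotting.scatter_matrix(df, alpha=0.2, figsize=(8, 8))

# k=1 skips the main diagonal, so only the upper triangle is hidden.
for i, j in zip(*np.triu_indices_from(axes, k=1)):
    axes[i, j].set_visible(False)
```

Using k=0 instead would hide the diagonal histograms as well.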

Random colors by default in matplotlib

I had a look at Kaggle's univariate-plotting-with-pandas tutorial. It contains this line, which generates a bar graph:
reviews['province'].value_counts().head(10).plot.bar()
I don't see any color scheme defined specifically.
I tried plotting it in a Jupyter notebook but got only one color instead of the multiple colors shown on Kaggle.
I tried reading the documentation and online help but couldn't find any way to generate these colors with just the line above.
How do we do that? Is there a config to set this randomness by default?
It seems like the multicoloured bars were the default behaviour in one of the former pandas versions and Kaggle must have used that one for their tutorial (you can read more here).
You can easily recreate the plot by defining a list of standard colours and then using it as an argument in bar.
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd',
          '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
reviews['province'].value_counts().head(10).plot.bar(color=colors)
Tested on pandas 0.24.1 and matplotlib 2.2.2.
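Rather than hard-coding the hex values, a sketch that reads them from matplotlib's active color cycle. The `counts` Series is made-up stand-in data for the value counts, and the approach assumes the default style, whose cycle holds the same ten colors as the list above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up stand-in for reviews['province'].value_counts().head(10).
counts = pd.Series([36, 25, 17, 12, 9], index=["A", "B", "C", "D", "E"])

# Colors of the current style's property cycle.
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]

fig, ax = plt.subplots()
# One color per bar, taken in cycle order.
counts.plot.bar(color=colors[: len(counts)], ax=ax)
```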
In seaborn this is not a problem:
import seaborn as sns
sns.countplot(x='province', data=reviews)
With pandas plotting it is also possible by converting the values to a one-row DataFrame, although the bars are then drawn without spaces between them:
reviews['province'].value_counts().head(10).to_frame(0).T.plot.bar()
Or use a qualitative colormap such as Paired or Pastel1:
import numpy as np
import matplotlib.pyplot as plt
N = 10
# two alternatives; use one or the other
reviews['province'].value_counts().head(N).plot.bar(color=plt.cm.Paired(np.arange(N)))
reviews['province'].value_counts().head(N).plot.bar(color=plt.cm.Pastel1(np.arange(N)))
The colorful plot was produced with an earlier version of pandas (<= 0.23). Since then, pandas has decided to make bar plots monochrome, because the color of the bars is pretty meaningless. If you still want to produce a bar chart with the default colors from the "tab10" colormap in pandas >= 0.24, and hence recreate the previous behaviour, it would look like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
N = 13
df = pd.Series(np.random.randint(10,50,N), index=np.arange(1,N+1))
cmap = plt.cm.tab10
colors = cmap(np.arange(len(df)) % cmap.N)
df.plot.bar(color=colors)
plt.show()

Python subplots with seaborn or pyplot

I'm an R programmer learning Python and finding plotting in Python much more difficult than in R.
I'm trying to write the following function but haven't been successful. Could anyone help?
import pandas as pd
# example data
df1 = pd.DataFrame({
    'PC1':[-2.2,-2.0,2.04,0.97],
    'PC2':[0.5,-0.6,0.9,-0.5],
    'PC3':[-0.1,-0.2,0.2,0.8],
    'f1':['a','a','b','b'],
    'f2':['x','y','x','y'],
    'f3':['k','g','g','k']
})
def drawPCA(df,**kwargs):
    """Produce 1x3 subplots of scatterplots; each subplot includes two PCs with
    no legend, e.g. subplot 1 is PC1 vs PC2. The legend is on the upper middle of
    the figure.
    Parameters
    ----------
    df: Pandas DataFrame
        The first 3 columns are the PCs, followed by sample characters.
    kwargs
        To specify hue,style,size, etc. if the plotting uses seaborn.scatterplot;
        or c,s,etc. if using pyplot scatter
    Example
    ----------
    drawPCA(df1, hue="f1")
    drawPCA(df1, c="f1", s="f2") #if plotting uses plt.scatter
    drawPCA(df1, hue="f1", size="f2",style="f3")
    or more variables passable to the actual plotting function
    """
This is what I came up with. Just two questions:
Is there a parameter to set the legend horizontal, instead of using ncol?
How can I prevent the figure from being displayed when running the function like this?
fig,ax=drawPCA(df1,hue="f1",style="f2",size="f3")
#may do more changing on the figure.
Here is the function:
def drawPCA2(df,**kwargs):
    import matplotlib.pyplot as plt
    import seaborn as sns
    from matplotlib.figure import figaspect
    nUniVals=sum([df[i].unique().size for i in kwargs.values()])
    nKeys=len(kwargs.keys())
    w, h = figaspect(1/3)
    fig1, axs = plt.subplots(ncols=3,figsize=(w,h))
    fig1.suptitle("All the PCs")
    sns.scatterplot(x="PC1",y="PC2",data=df,legend=False,ax=axs[0],**kwargs)
    sns.scatterplot(x="PC1",y="PC3",data=df,legend=False,ax=axs[1],**kwargs)
    sns.scatterplot(x="PC2",y="PC3",data=df,ax=axs[2],label="",**kwargs)
    handles, labels = axs[2].get_legend_handles_labels()
    fig1.legend(handles, labels, loc='lower center',bbox_to_anchor=(0.5, 0.85), ncol=nUniVals+nKeys)
    axs[2].get_legend().remove()
    fig1.tight_layout(rect=[0, 0.03, 1, 0.9])
    return fig1,axs
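On the two open questions, a hedged sketch using plain matplotlib rather than seaborn. As far as I know, Figure.legend has no dedicated orientation switch, so setting ncol to the number of entries is the usual way to get a single horizontal row. Closing the figure with plt.close(fig) before returning should stop a notebook from rendering it automatically, while the returned object stays usable. The name draw_pca_sketch and the grouping by a single column are illustrative assumptions, not the asker's final API.

```python
import pandas as pd
import matplotlib.pyplot as plt

df1 = pd.DataFrame({
    "PC1": [-2.2, -2.0, 2.04, 0.97],
    "PC2": [0.5, -0.6, 0.9, -0.5],
    "PC3": [-0.1, -0.2, 0.2, 0.8],
    "f1": ["a", "a", "b", "b"],
})

def draw_pca_sketch(df, group):
    """Hypothetical helper: 1x3 PC scatterplots with one shared legend on top."""
    fig, axs = plt.subplots(ncols=3, figsize=(12, 4))
    pairs = [("PC1", "PC2"), ("PC1", "PC3"), ("PC2", "PC3")]
    for ax, (x, y) in zip(axs, pairs):
        for name, sub in df.groupby(group):
            ax.scatter(sub[x], sub[y], label=str(name))
        ax.set_xlabel(x)
        ax.set_ylabel(y)
    handles, labels = axs[0].get_legend_handles_labels()
    # ncol equal to the number of entries puts the legend on one row.
    fig.legend(handles, labels, loc="upper center", ncol=len(labels))
    fig.tight_layout(rect=[0, 0, 1, 0.9])
    # Close the figure so pyplot no longer tracks it for display.
    plt.close(fig)
    return fig, axs

fig, axs = draw_pca_sketch(df1, "f1")
```

In a notebook, evaluating fig in a later cell or calling fig.savefig should still render the figure after further changes.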

Plot pandas dataframe with varying number of columns along imshow

I want to plot an image and a pandas bar plot side by side in an iPython notebook. This is part of a function so that the dataframe containing the values for the bar chart can vary with respect to number of columns.
The libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline
Dataframe
faces = pd.DataFrame(...) # return values for 8 characteristics
This returns the bar chart I'm looking for and works for a varying number of columns.
faces.plot(kind='bar').set_xticklabels(result[0]['scores'].keys())
But I didn't find a way to plot it in a pyplot figure also containing the image. This is what I tried:
fig, (ax_l, ax_r) = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
ax_l.imshow( img )
ax_r=faces.plot(kind='bar').set_xticklabels(result[0]['scores'].keys())
The output I get is the image on the left and an empty plot area, with the correct plot below. There is
ax_r.bar(...)
but I couldn't find a way around having to define the columns to be plotted.
You just need to specify your axes object in your DataFrame.plot calls.
In other words: faces.plot(kind='bar', ax=ax_r)
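A self-contained sketch of that fix, with a random array standing in for `img` and made-up scores standing in for `faces` (both are assumptions). Note that the attempt in the question also rebinds ax_r to the return value of set_xticklabels, which is a list of text objects rather than an Axes.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
img = rng.random((64, 64))  # stand-in image (assumption)
faces = pd.DataFrame(       # stand-in scores (assumption)
    rng.random((2, 8)), columns=[f"c{i}" for i in range(8)]
)

fig, (ax_l, ax_r) = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
ax_l.imshow(img)
# Passing ax= makes pandas draw into the existing right-hand axes.
returned = faces.plot(kind="bar", ax=ax_r)
# Relabel the ticks without rebinding ax_r.
ax_r.set_xticklabels(["face 0", "face 1"])
```

Because pandas iterates over whatever columns the frame has, the same call works for any number of columns.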
