I had a look at Kaggle's univariate-plotting-with-pandas tutorial. It contains this line, which generates a bar graph:
reviews['province'].value_counts().head(10).plot.bar()
I don't see any color scheme defined specifically.
I tried plotting it in a Jupyter notebook but got bars in a single color instead of the multiple colors shown on Kaggle.
I read the documentation and searched online but couldn't find a way to generate these colors using just the line above.
How do we do that? Is there a config option that enables these varied colors by default?
It seems like the multicoloured bars were the default behaviour in an earlier pandas version, and Kaggle must have used that version for their tutorial (you can read more here).
You can easily recreate the plot by defining a list of standard colours and passing it as the color argument to bar.
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd',
'#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
reviews['province'].value_counts().head(10).plot.bar(color=colors)
Tested on pandas 0.24.1 and matplotlib 2.2.2.
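Instead of hard-coding the hex values, the same list can probably be read from matplotlib's active property cycle. The following is only a minimal sketch: `counts` is a made-up stand-in for `reviews['province'].value_counts().head(10)`, and the Agg backend is selected just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for reviews['province'].value_counts().head(10)
counts = pd.Series([5, 4, 3, 2, 1], index=list("abcde"))

# Colours of the current style's property cycle (ten entries in the default style)
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]

# One colour per bar; slice so the list is no longer than the data
ax = counts.plot.bar(color=colors[:len(counts)])
```

This way the bars follow whatever style sheet is active rather than always using the "tab10" values.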
In seaborn this is not a problem:
import seaborn as sns
sns.countplot(x='province', data=reviews)
With pandas and matplotlib it is also possible by converting the values to a one-row DataFrame, although the bars are then drawn without spaces between them:
reviews['province'].value_counts().head(10).to_frame(0).T.plot.bar()
Or use some qualitative colormap:
import numpy as np
import matplotlib.pyplot as plt
N = 10
reviews['province'].value_counts().head(N).plot.bar(color=plt.cm.Paired(np.arange(N)))
reviews['province'].value_counts().head(N).plot.bar(color=plt.cm.Pastel1(np.arange(N)))
The colorful plot has been produced with an earlier version of pandas (<= 0.23). Since then, pandas has decided to make bar plots monochrome, because the color of the bars is pretty meaningless. If you still want to produce a bar chart with the default colors from the "tab10" colormap in pandas >= 0.24, and hence recreate the previous behaviour, it would look like
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
N = 13
df = pd.Series(np.random.randint(10,50,N), index=np.arange(1,N+1))
cmap = plt.cm.tab10
colors = cmap(np.arange(len(df)) % cmap.N)
df.plot.bar(color=colors)
plt.show()
I have made a colourmap from my chosen colours, however I'd like to convert it to a palette which can be used to 'hue' a seaborn plot. Is this possible, and if so how so?
I have used...
cmap = pl.colors.LinearSegmentedColormap.from_list("", ["red","white"],gamma=0.5,N=len(hue))
...in order to make my own colourmap, which I know works because it can be applied to a standard matplotlib.pyplot.scatter plot successfully, as shown below.
plt.scatter(x=[1,2,3,4,5],y=[1,2,3,4,5],c=[5,4,3,2,1],cmap=cmap)
However, I am trying to use seaborn's swarmplot function and pass in my colourmap as the hue parameter. Obviously this doesn't work as this parameter requires a palette - hence my question!
I'm not quite sure where to start! Any help would be appreciated!
A seaborn palette is a simple list of colors. You may obtain the colors via
cmap(np.linspace(0,1,cmap.N))
Complete example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import seaborn as sns
df =pd.DataFrame({"x" : np.random.randint(0,4, size=200),
"y" : np.random.randn(200),
"hue" : np.random.randint(0,4, size=200)})
u = np.unique(df["hue"].values)
cmap = mcolors.LinearSegmentedColormap.from_list("", ["indigo","gold"],gamma=0.5,N=len(u))
sns.swarmplot(x="x", y="y", hue="hue", data=df, palette=cmap(np.linspace(0,1,cmap.N)))
plt.show()
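To check what such a palette actually contains, the sampled colours can be converted to hex strings. This is a small sketch that needs only matplotlib and numpy; `n_levels` is an assumed number of hue levels, not something taken from the question.

```python
import numpy as np
import matplotlib.colors as mcolors

n_levels = 4  # assumed number of distinct hue values

# Same construction as above: a colormap with one entry per hue level
cmap = mcolors.LinearSegmentedColormap.from_list(
    "", ["indigo", "gold"], gamma=0.5, N=n_levels)

# Sample every entry of the colormap; the result is an (N, 4) RGBA array
rgba = cmap(np.linspace(0, 1, cmap.N))

# A plain list of hex strings is also accepted wherever seaborn takes a palette
palette = [mcolors.to_hex(c) for c in rgba]
print(palette)
```

The first and last entries should correspond to the two end colours handed to `from_list`.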
For diverging values, seaborn by default seems to show big numbers in a warm tone (orange) and small numbers in a cold tone (blue).
If I need to switch the colors around, to show big numbers in blue and small ones in orange, how do I do that?
I've searched but haven't found a way.
sns.heatmap(flights, center=flights.loc["January", 1955])
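You can reverse any of the matplotlib colormaps by appending _r to the name, i.e., plt.cm.coolwarm vs plt.cm.coolwarm_r.
I believe older seaborn versions used a cubehelix colormap by default (newer releases default to "rocket", or "icefire" when center is given).
So, for a cubehelix map, you'd do: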
from matplotlib import pyplot
import seaborn as sns
colormap = pyplot.cm.cubehelix_r
flights = sns.load_dataset('flights').pivot(index="month", columns="year", values="passengers")
sns.heatmap(flights, cmap=colormap)
There is no need to build a separate reverse function for heatmap.
Just use cmap = 'magma_r'. Appending '_r' to the name of a colormap such as magma selects its reversed version.
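A quick way to convince yourself that the `_r` variants really are mirror images is to compare the end colours. The sketch below uses only matplotlib, with "coolwarm" picked as an arbitrary example; the `reversed()` method on colormap objects should give the same result as the `_r` name.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt

cmap = plt.get_cmap("coolwarm")
cmap_r = plt.get_cmap("coolwarm_r")   # reversed by name
cmap_rev = cmap.reversed()            # reversed via the colormap method

# The low end of the reversed map is the high end of the original, and vice versa
low_matches = np.allclose(cmap_r(0.0), cmap(1.0))
high_matches = np.allclose(cmap_r(1.0), cmap(0.0))
method_matches = np.allclose(cmap_rev(0.0), cmap(1.0))
print(low_matches, high_matches, method_matches)
```

Either form can then be passed as the cmap argument of sns.heatmap.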
I am plotting several graphs with matplotlib for a publication and I need them all to have the same style. Some graphs have more than 6 categories and I have noticed that, by default, it does not plot more than 6 different colours. With 7 or more I start to get repeated colours.
e.g.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.style.use('seaborn-muted')
df2= pd.DataFrame(np.random.rand(10,8))
df2.plot(kind='bar',stacked=True)
plt.legend(fontsize=13,loc=1)
plt.show()
There is probably a cognitive reason not to include more than 6 different colours, but if I need to, how can I do it? I have tried different stylesheets (seaborn, ggplot, classic) and all seem to have the same "limitation".
Do I need to change the colormap/stylesheet? Ideally, I would like to use a qualitative colormap (there is no order in the categories that I am plotting) and use a pre-existing one... I am not very good at choosing colours.
thanks!
By default, matplotlib cycles through a fixed series of colors: six in the seaborn style sheets, ten in matplotlib's own default style since version 2.0. If you want to change the default colors (or number of colors), you can use cycler to loop through the colors you want instead of the defaults.
from cycler import cycler
import matplotlib.pyplot as plt
# Change the default cycle colors to be red, green, blue, and yellow
plt.rc('axes', prop_cycle=cycler('color', ['r', 'g', 'b', 'y']))
A better way is just to manually specify plot colors when you create your plot so that not every plot you make has to use the same colors.
plt.plot([1,2,3], 'r')
plt.plot([4,5,6], 'g')
plt.plot([7,8,9], 'b')
plt.plot([10,11,12], 'y')
Or you can change the color after creation
h, = plt.plot([1,2,3])  # plt.plot returns a list of lines; unpack the single line
h.set_color('r')
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
colors = plt.cm.jet(np.linspace(0, 1, 10))
df2= pd.DataFrame(np.random.rand(10,8))
df2.plot(kind='bar',color=colors, stacked=True)
plt.legend(fontsize=13,loc=1)
This is basically copied from "Plotting with more colors in matplotlib". Check the colormap documentation and its examples page.
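Since the question asks for a pre-existing qualitative colormap, one option is to feed the colours of a listed colormap such as "tab20" into the colour cycle. This is a sketch under the assumption that 20 colours are enough; it uses rc_context so the change stays local to one figure instead of altering the global defaults.

```python
from cycler import cycler
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np

# tab20 is a qualitative colormap that stores its 20 colours in .colors
qualitative = plt.cm.tab20.colors

# Apply the 20-colour cycle only inside this block
with plt.rc_context({"axes.prop_cycle": cycler(color=qualitative)}):
    fig, ax = plt.subplots()
    lines = [ax.plot(np.arange(3) + k)[0] for k in range(8)]

# Hex codes of the colours that were actually assigned to the 8 lines
line_colors = [mcolors.to_hex(line.get_color()) for line in lines]
print(line_colors)
```

With eight series, no colour should repeat. The same cycle could also be set once with plt.rc if every figure in the publication should share it.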
I want to add separating black lines to a Python area plot created using pandas. In other words, I want the stacked areas to be separated by black lines.
My current code is the following:
figure1=mydataframe.plot(kind='area', stacked=True)
And I am looking for an additional argument to pass on to the function, such as:
figure1=mydataframe.plot(kind='area', stacked=True, blacklines=TRUE)
Is there a way I can achieve this using pandas or additional matplotlib commands?
Use plt.stackplot(). You can control line width and color with linewidth and edgecolor arguments:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(10,3)
df = pd.DataFrame(abs(data))
plt.stackplot(np.arange(10), [df[0], df[1], df[2]], edgecolor='black', linewidth=1)
plt.show()
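If you would rather stay with the pandas call from the question, a possible alternative is to recolour the boundary lines afterwards. This sketch assumes that pandas draws one line per column on top of the filled areas, which is how the versions I have seen behave; the data here are random placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Placeholder data standing in for the question's mydataframe
rng = np.random.default_rng(0)
mydataframe = pd.DataFrame(rng.random((10, 3)), columns=["a", "b", "c"])

ax = mydataframe.plot(kind="area", stacked=True)

# Assumption: each stacked area has a boundary line stored in ax.lines
for line in ax.lines:
    line.set_color("black")
    line.set_linewidth(1)
```

The stackplot version above is the more explicit route, since edgecolor is set in the same call that draws the areas.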
I have a data set that has two independent variables and one dependent variable. I thought the best way to represent the dataset is by a checkerboard-type plot wherein the color of each cell represents a range of values, like this:
I can't seem to find code to do this automatically.
You need to use a plotting package to do this. For example, with matplotlib:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
X = 100*np.random.rand(6,6)
fig, ax = plt.subplots()
i = ax.imshow(X, cmap=cm.jet, interpolation='nearest')
fig.colorbar(i)
plt.show()
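When the data come as one row per observation rather than as a ready-made grid, they can first be pivoted so that one independent variable indexes the rows and the other the columns. The following is a sketch with made-up column names (`a`, `b`, `value`) and made-up numbers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical long-form data: two independent variables and one dependent variable
records = pd.DataFrame({
    "a": [1, 1, 1, 2, 2, 2],
    "b": ["x", "y", "z", "x", "y", "z"],
    "value": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Rows follow "a", columns follow "b", cells hold the dependent variable
grid = records.pivot(index="a", columns="b", values="value")

fig, ax = plt.subplots()
im = ax.imshow(grid.to_numpy(), cmap="viridis", interpolation="nearest")

# Label the cells with the actual values of the two independent variables
ax.set_xticks(list(range(grid.shape[1])))
ax.set_xticklabels(grid.columns)
ax.set_yticks(list(range(grid.shape[0])))
ax.set_yticklabels(grid.index)
fig.colorbar(im, ax=ax)
```

The colorbar then maps cell colours back to the dependent variable.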
For those who come across this years later, as I did: what the original poster wants is a heatmap.
The Matplotlib documentation includes an example of one here.