This question already has an answer here:
Why is pandas applying the same values on both sides of an asymmetric error bar?
(1 answer)
Closed 5 years ago.
I want to plot asymmetrical error bars with pandas. According to the official docs, this should work:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame([[1,0.2,0.7]])
fig, ax = plt.subplots()
df[0].plot.bar(yerr=[df[1], df[2]], ax=ax)
But pandas renders the error bar using df[1] for both the lower and upper limits (-0.2/+0.2 instead of -0.2/+0.7):
Where am I making a mistake?
I use pandas v0.20.3 with python v2.7.13 under Windows 7.
Your yerr has only one level of nesting:
yerr=[df[1], df[2]]
It needs one more level: one entry per plotted series, with each entry holding the negative and positive errors:
yerr=[[df[1], df[2]]]
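For comparison, here is a minimal sketch that draws the same asymmetric error bar with Matplotlib directly instead of going through pandas. This is a swapped-in alternative, not the fix from the answer above. Matplotlib's `yerr` accepts a `(2, N)` array whose first row holds the lower errors and whose second row holds the upper errors. The numbers come from the question; the `Agg` backend line is only an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 0.2, 0.7]])

# Row 0: lower errors, row 1: upper errors -> shape (2, N), here (2, 1)
yerr = np.array([df[1].to_numpy(), df[2].to_numpy()])

fig, ax = plt.subplots()
bars = ax.bar(df.index.to_numpy(), df[0].to_numpy(), yerr=yerr, capsize=4)
```

With these values the error bar should run from 0.8 to 1.7 around the bar height of 1.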
This question already has answers here:
How to plot multiple pandas columns
(3 answers)
Plot multiple columns of pandas DataFrame using Seaborn
(2 answers)
How do I create a multiline plot using seaborn?
(3 answers)
Closed 26 days ago.
I'm new to Python, so I'm unsure whether this can be done in one graph or not. I have one DataFrame containing Year, Number of Accidents and Number of Fatalities:
I am trying to generate a line plot with x axis = Year, y axis = number of instances per year, and 2 lines, one for each of the two columns. Using Seaborn, I can only see a way to map 2 columns and hue. Can anyone advise whether this is achievable in either Matplotlib or Seaborn?
Tried using Seaborn but cannot work out how to set up x and y axis as required and show 2 individual columns within that:
sns.lineplot(x=f1_safety['NumberOfFatalities'],y=f1_safety['NumberOfAccidents'].count(), hue = f1_safety['year'].count())
plt.show()
There are at least two ways to accomplish what you want to do here.
The simpler one uses pandas' built-in plotting API. You can plot dataframes directly when they are already in the correct form. In your case, you need to set the year as the index, and then you can plot right away:
f1_safety.set_index("year").plot()
If you want to use seaborn, you first need to transform the data into the correct format. seaborn takes x and y, and you cannot specify different y columns directly (like y1, y2 and so on). Instead, you need to transform the data into "long format". In such a table, you get one index or id column, one value column, and one column describing which original column each value came from. This works like this:
f1_long = pd.melt(f1_safety, id_vars="year", value_vars=["NumberOfAccidents", "NumberOfFatalities"])
sns.lineplot(data=f1_long, x="year", y="value", hue="variable")
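To make the "long format" concrete, here is a small sketch of what `pd.melt` produces. The numbers are made up purely for illustration; only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical values, only to show the reshaping
f1_safety = pd.DataFrame({
    "year": [2000, 2001, 2002],
    "NumberOfAccidents": [5, 3, 4],
    "NumberOfFatalities": [1, 0, 2],
})

# One row per (year, original column) pair
f1_long = pd.melt(
    f1_safety,
    id_vars="year",
    value_vars=["NumberOfAccidents", "NumberOfFatalities"],
)
print(f1_long)
```

The result has the columns `year`, `variable` and `value`, which is why the seaborn call uses `y="value"` and `hue="variable"`.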
The plot in both cases looks quite the same:
There are other ways. In particular, in Jupyter you can execute two plot statements in the same cell, and matplotlib will put the plots into the same figure, even cycling through the colors as necessary.
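A rough sketch of that last approach, written with an explicit axes object so it also works outside Jupyter. The data values are hypothetical, and the `Agg` backend line is only an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values; column names follow the question
f1_safety = pd.DataFrame({
    "year": [2000, 2001, 2002],
    "NumberOfAccidents": [5, 3, 4],
    "NumberOfFatalities": [1, 0, 2],
})

fig, ax = plt.subplots()
# Two plot calls on the same axes; Matplotlib cycles the colors itself
ax.plot(f1_safety["year"], f1_safety["NumberOfAccidents"], label="NumberOfAccidents")
ax.plot(f1_safety["year"], f1_safety["NumberOfFatalities"], label="NumberOfFatalities")
ax.set_xlabel("year")
ax.legend()
```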
This question already has answers here:
seaborn boxplot and stripplot points aren't aligned over the x-axis by hue
(1 answer)
Seaborn boxplot + stripplot: duplicate legend
(2 answers)
How to do a boxplot with individual data points using seaborn
(1 answer)
How to overlay data points on seaborn figure-level boxplots
(2 answers)
Closed 7 months ago.
I'm trying to make a box plot and a strip plot from the following dataframe, but I think I'm either passing the wrong arguments or the df needs to be rearranged, and I don't know how.
The dataframe is here
I'm hoping to have a strip plot and a box for each color (column) in the same plot.
I'd be grateful if you could help.
This question already has answers here:
How to plot in multiple subplots
(12 answers)
How to plot multiple dataframes in subplots
(10 answers)
Plotting Pandas into subplots
(1 answer)
Plot two pandas data frames side by side, each in subplot style
(1 answer)
Closed 1 year ago.
I'm trying to plot different dataframes in a plot (each dataframe is a subplot).
I'm doing:
plt.figure(figsize=(20,20))
for i,name in enumerate(names):
    df = pd.read_excel(myPath+name)
    df['Datetime'] = pd.to_datetime(df['Datetime'])
    df = df.set_index('Datetime')
    plt.subplot(4, 1, i+1) # I have 4 dataframes
    df.plot()
plt.show()
But I obtain this...
How can I do it correctly?
Thank you!
df.plot doesn't integrate too well into more complex plotting patterns. It's meant more as a shortcut to get a quick plot from your DataFrame while doing data exploration.
Thus, you end up creating 4 empty subplots and 4 separate df.plot() plots. I imagine that the first 3 df.plot results are overridden by the last one. This is probably due to the figure/axes handling of df.plot, which is outside the scope of this question.
Try something like this instead:
fig, axes = plt.subplots(4, 1)
for i,name in enumerate(names):
    df = pd.read_excel(myPath+name)
    df['Datetime'] = pd.to_datetime(df['Datetime'])
    df = df.set_index('Datetime')
    axes[i].plot(df)
plt.show()
For more information on proper use of subplots, have a look at these examples: https://matplotlib.org/3.1.0/gallery/subplots_axes_and_figures/subplots_demo.html
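An alternative sketch, if you would rather keep pandas' own plotting (legend, datetime tick formatting): `DataFrame.plot` accepts an `ax` argument, so each frame can be drawn into a chosen subplot. The four in-memory frames below are hypothetical stand-ins for the Excel files in the question, and the `Agg` backend line is only an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the four Excel files
idx = pd.date_range("2021-01-01", periods=5, freq="D")
frames = [
    pd.DataFrame({"value": np.arange(5) * (k + 1)}, index=idx)
    for k in range(4)
]

fig, axes = plt.subplots(4, 1, figsize=(8, 10))
for ax, df in zip(axes, frames):
    df.plot(ax=ax)  # draw into the given subplot instead of a new figure
fig.tight_layout()
```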
EDIT: I had misread your code and my previous comments about specifying the keys were probably erroneous. I have updated my answer now. Hope it works for you
This question already has answers here:
Detect and exclude outliers in a pandas DataFrame
(19 answers)
Closed 1 year ago.
See the violinplot:
Here I'm showing the points to demonstrate that the long tail of the violin is due to a single point. I would like to ignore these outlier points so that I have a more concise violin plot. Can I do that with seaborn when plotting the violin, or do I have to remove them from the distribution myself?
You can do it by filtering out the outlier data when passing it to the plot function,
e.g.
sns.violinplot(y = df[df["Column"]<x]["Column"])
where df is your dataframe, Column is the name of the column you want to plot, and x is the threshold above which values are treated as outliers and excluded.
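If you do not want to pick the threshold by eye, one possible way to derive it is the common 1.5 × IQR rule of thumb. This is only a sketch under that assumption: the data values are made up, and the rule is one convention among several, not something seaborn applies for you.

```python
import pandas as pd

# Hypothetical data with one extreme value
df = pd.DataFrame({"Column": [1.0, 1.2, 0.9, 1.1, 1.3, 1.0, 25.0]})

# Upper fence from the interquartile range (rule of thumb, not the only choice)
q1, q3 = df["Column"].quantile([0.25, 0.75])
upper = q3 + 1.5 * (q3 - q1)

# Keep only the values at or below the fence
filtered = df.loc[df["Column"] <= upper, "Column"]
```

The resulting `filtered` Series can then be passed to the plot in the same way as above, e.g. `sns.violinplot(y=filtered)`.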
This question already has answers here:
How to plot multiple pandas columns
(3 answers)
Use index in pandas to plot data
(6 answers)
Closed 1 year ago.
I am trying to make an area plot. However the x axis of the graph shows 1,2,3 rather than the year. How can I change this? I saw some related questions here, however the code there is a bit too complicated for me.
My code is:
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
areaplot=r'data.xlsx'
df_areaplot = pd.read_excel(areaplot)
df_areaplot.plot(kind='area')
plt.title('SDG Spending Trend')
plt.ylabel('Amount Spent')
plt.xlabel('Years')
plt.show()
As @tmdavison points out in the comments, you can use column names to choose the axes, including a list of columns for y, i.e.:
df_areaplot.plot(kind='area',
                 x='BDG',
                 y=['Citizen Initiatives',
                    'Education and Employability',
                    'Enviromental Sustainability',
                    'Woman Empowerment'])
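A self-contained sketch of the same idea on made-up data, since the original spreadsheet is not available. The column names `Year`, `Education` and `Health` and all values are hypothetical; the `Agg` backend line is only an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the spreadsheet
df_areaplot = pd.DataFrame({
    "Year": [2019, 2020, 2021],
    "Education": [10, 12, 15],
    "Health": [5, 7, 6],
})

# x names the column used for the horizontal axis, y the stacked columns
ax = df_areaplot.plot(kind="area", x="Year", y=["Education", "Health"])
ax.set_ylabel("Amount Spent")
```

Without `x="Year"`, pandas would plot against the default 0, 1, 2 row index, which matches the symptom described in the question.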