Holoviews - How to create side-by-side bars from dataframe columns?

I would like to create a Holoviews bar chart (using the Bokeh backend) in which Year is on the X-axis and columns A and B are on the Y-axis. For each Year, I want the bars for the values from columns A and B to appear next to each other, i.e. for year 2008 I have bars of heights 1 and 3, for year 2009 bars of heights 2 and 6, and so on. I have tried numerous different ways, including the example of grouped bars in the documentation, but can't get it to work. See the example below:
%%opts Bars [xrotation=90 width=600 show_legend=False tools=['hover']]
df=pd.DataFrame({'Year':[2008,2009,2010,2011,2012,2013],
'A': [1,2,3,4,5,6],'B':[3,6,9,12,15,18]})
print(df)
bars = hv.Bars(df, kdims=['Year'], vdims=['A'])
bars
Please help. I am losing my mind!

HoloViews generally works best when your data is in what's called a tidy format. However, to make it easier to work with data like yours, we have developed a companion library called hvPlot. To generate the plot you want, you can simply run:
import pandas as pd
import hvplot.pandas
df=pd.DataFrame({'Year':[2008,2009,2010,2011,2012,2013],
'A': [1,2,3,4,5,6],'B':[3,6,9,12,15,18]})
df.hvplot.bar('Year')
Alternatively you can learn about the pd.melt method, which can take your data in a wide format and convert it to a tidy dataset:
%%opts Bars [xrotation=90 width=600 show_legend=False tools=['hover']]
df=pd.DataFrame({'Year':[2008,2009,2010,2011,2012,2013],
'A': [1,2,3,4,5,6],'B':[3,6,9,12,15,18]})
tidy_df = df.melt(id_vars=['Year'], value_vars=['A', 'B'])
bars = hv.Bars(tidy_df, ['Year', 'variable'], ['value'])
bars
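To see what the melt step produces before it is handed to hv.Bars, here is a minimal sketch using only pandas and the same df as above. The column names 'variable' and 'value' are the defaults that melt assigns; nothing else is assumed.

```python
import pandas as pd

df = pd.DataFrame({'Year': [2008, 2009, 2010, 2011, 2012, 2013],
                   'A': [1, 2, 3, 4, 5, 6],
                   'B': [3, 6, 9, 12, 15, 18]})

# Wide -> tidy: each (Year, column) pair becomes its own row.
tidy_df = df.melt(id_vars=['Year'], value_vars=['A', 'B'])

# 'variable' holds the original column name, 'value' holds the number.
print(tidy_df.columns.tolist())
print(len(tidy_df))
```

The six wide rows become twelve tidy rows, which is why 'Year' and 'variable' can both serve as key dimensions for the grouped bars.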

To respond to @pongo30: you can suppress printing 'A' and 'B' on the x-axis by adding .opts(xlabel='') to the call to hvplot.bar() (e.g. df.hvplot.bar('Year').opts(xlabel=''))

Related

Python how to plot a frequency pie chart with one column using plotly.express

I have a data frame and I only want to plot the frequency of one column, for example, the count of cars of different brands. Cars: [FORD, FORD, BMW, GMC, GMC, GMC, GMC.....] I want to plot them in a pie chart or any other suitable graph, without using matplotlib.
I tried to create pivot tables, which look like Ford: 4, Chevy: 3, BMW: 5, GMC: 10, but I don't know how to access the column labels and I can't use them in plotly.

You can groupby cars and get the counts in a new column count like so
df = df.groupby(['cars'])['cars'].count().reset_index(name='count')
Then you may use Plotly to render a pie chart like so
import plotly.express as px
fig = px.pie(df, values='count', names='cars', title='Cars')
fig.show()
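The counting step can be checked on its own before plotting. The sketch below uses a made-up cars column modelled on the values listed in the question (the real data is not shown there, so this sample is an assumption).

```python
import pandas as pd

# Hypothetical sample data based on the brands mentioned in the question.
df = pd.DataFrame({'cars': ['FORD', 'FORD', 'BMW', 'GMC', 'GMC', 'GMC', 'GMC']})

# Same expression as in the answer: one row per brand plus a 'count' column.
counts = df.groupby(['cars'])['cars'].count().reset_index(name='count')
print(counts)
```

The resulting frame has exactly the two columns that px.pie expects through its names and values arguments.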

Plot a graph in matplotlib with two different scales on one axis

I'm trying to plot a graph with time data on X-Axis. My data has daily information, but I want to create something that has two different date scales on X-Axis.
I want the axis to start at 2005 and run to 2014 in years, and after 2014 I want the data to continue by months of 2015. Is this possible? If so, how can I create this kind of plot?
Thanks.
I provided an image below:

Yes, you can. Use the following pattern: plot both datasets on the same axes, and because their dates do not overlap, the second series is simply drawn to the right of the first.
For a dataframe:
import numpy
import pandas as pd
import matplotlib.pyplot
data = numpy.array([45,63,83,91,101])
df1 = pd.DataFrame(data, index=pd.date_range('2005-10-09', periods=5, freq='W'), columns=['events'])
df2 = pd.DataFrame(numpy.arange(10,21,2), index=pd.date_range('2015-01-09', periods=6, freq='M'), columns=['events'])
matplotlib.pyplot.plot(df1.index, df1.events)
matplotlib.pyplot.plot(df2.index, df2.events)
matplotlib.pyplot.show()
You can change the parameters according to your convenience.

Python horizontal bar plotly not showing the whole range of timestamp data [duplicate]

This question already has an answer here:
How to show timestamp x-axis in Python Plotly
(1 answer)
Closed 3 years ago.
I want to plot the data availability using Plotly. I got the code from @vestland. My monthly data is here.
In general, the data spans from January 2009 to January 2019. Each variable comes with its own time period.
Below is the code.
import pandas as pd
import plotly.express as px
path = r'C:\Users\....\availability3.txt'
df = pd.read_csv(path)
df = df.drop(['Unnamed: 0'], axis=1)
fig = px.bar(df, x="Timestamp", y="variable", color='value', orientation='h',
hover_data=["Timestamp"],
height=300,
color_continuous_scale=['firebrick', '#2ca02c'],
title='Data Availability Plot',
template='plotly_white',
)
fig.update_layout(yaxis=dict(title=''),
xaxis=dict(
title='',
showgrid=True,
gridcolor='white',
tickvals=[]
)
)
fig.show()
As you can see below, the plot shows only the first row of the data which is the first day.
What I want is to show the whole range of the data on the x-axis with the corresponding values and colors. The result should show data from January 2009 to January 2019, with variable values of 0 shown in red and 1 in green.
Perhaps this is an issue with timestamp, because when using the number index, the plot is just okay.
Edit
By removing duplicates in the dataset and setting the timestamp as the index, I got almost the expected result. This is the new code.
fig = px.bar(df, y="variable", color='value', orientation='h',
hover_data=[df.index],
height=300,
color_continuous_scale=['firebrick', '#2ca02c'],
title='Data Availability Plot',
template='plotly_white',
)
Now the whole time span is shown as expected, but the timestamp values on the x-axis are still not showing. I will ask about that in another post.

I checked the documentation for plotly.express.bar and briefly worked with your code. Your data may be stacked one on top of each other.
Setting orientation='v' shows all of the data, but not in any particularly intuitive way, although I believe it does answer the question you asked. Yes, the data for Alice, Thalia, Citra, and Pebaru are all present, but the y-axis needs modification to get the proper labels:
Alternatively, setting orientation='h' and barmode='overlay' shows all of the data when you hover, but not as individual bars. You can see the overlay blur on the right edge of the bars:
There are quite a few arguments for plotly.express.bar in the documentation: https://plot.ly/python-api-reference/generated/plotly.express.bar.html#plotly.express.bar. Experiment around and see what you can come up with.
EDIT:
1) Set the x-axis independently using the Timestamp column.
2) Use .groupby() with an averaging function on value.
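As a rough sketch of suggestion 2, the snippet below averages value per timestamp and variable. The column names come from the question's code; the rows themselves are invented for illustration, including one duplicated (Timestamp, variable) pair like the duplicates the asker mentions.

```python
import pandas as pd

# Hypothetical long-format data; only the column names come from the question.
df = pd.DataFrame({
    'Timestamp': pd.to_datetime(['2009-01-01', '2009-01-01', '2009-02-01',
                                 '2009-02-01', '2009-02-01']),
    'variable': ['Alice', 'Alice', 'Alice', 'Thalia', 'Thalia'],
    'value': [1, 1, 0, 1, 0],
})

# Collapse duplicates: one averaged value per (Timestamp, variable) pair.
averaged = df.groupby(['Timestamp', 'variable'], as_index=False)['value'].mean()
print(averaged)
```

After this step each bar segment corresponds to a single row, so duplicated rows no longer pile up on top of each other.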

pandas color scheme not working properly with my data (python) [duplicate]

This question already has answers here:
Pandas DataFrame Bar Plot - Plot Bars Different Colors From Specific Colormap
(3 answers)
Closed 4 years ago.
I would like to change the default color scheme of my pandas plot. I tried with different color schemes through cmap pandas parameter, but when I change it, all bars of my barplot get the same color.
The code I tried is the following one:
yearlySalesGenre = df1.groupby('Genre').Global_Sales.sum().sort_values()
fig = plt.figure()
ax2 = plt.subplot()
yearlySalesGenre.plot(kind='bar',ax=ax2, sort_columns=True, cmap='tab20')
plt.show(fig)
And the data that I plot (yearlySalesGenre) is a pandas Series type:
Genre
Strategy 174.50
Adventure 237.69
Puzzle 243.02
Simulation 390.42
Fighting 447.48
Racing 728.90
Misc 803.18
Platform 828.08
Role-Playing 934.40
Shooter 1052.94
Sports 1249.47
Action 1745.27
Using tab20 cmap I get the following plot:
All bars get the first color of the tab20 scheme. What am I doing wrong?
Note that if I use the default color scheme of pandas plot, it properly displays all bars with different colors, but the thing is that I want to use a particular color scheme.
As noted, this is a duplicate question. Just in case, the answer is that pandas assigns colors per column, not per row. So to get different colors you can transpose the data and make a few other adjustments (see the duplicate link), or use matplotlib.pyplot plotting directly, which allows more flexibility (what I did in my case):
plt.bar(range(len(df)), df, color=plt.cm.tab20(np.arange(len(df))))

Maybe this is what you want:
df2.T.plot( kind='bar', sort_columns=True, cmap='tab20')
I think the problem is that you only have one series. A pandas bar plot gives each series (column) its own color and places the bars along the index.
By using .T, the rows of your data become multiple columns under a single index entry, so each bar gets its own color. Here df2 is assumed to be a DataFrame holding your series as a single column. I am sure you can play with the legend to get a better display.
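The transposition can be sketched without any plotting. Note that calling .T on a plain Series returns the Series unchanged, so this sketch first converts it with to_frame(); the Series below reuses three of the values from the question, and the name wide is only illustrative.

```python
import pandas as pd

# Three of the genres from the question's Series, kept in the same order.
yearlySalesGenre = pd.Series(
    {'Strategy': 174.50, 'Adventure': 237.69, 'Puzzle': 243.02},
    name='Global_Sales',
)
yearlySalesGenre.index.name = 'Genre'

# One column per genre, one row in total: each genre is now its own series.
wide = yearlySalesGenre.to_frame().T
print(wide.shape)
print(wide.columns.tolist())
```

Plotting wide with kind='bar' and a cmap then lets pandas pick a distinct color for each column.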

how to draw a stacked bar

I have a dataframe - df as below :
df = pd.DataFrame({"Card_name":['AAA','AAA','AAA','BBB','BBB','BBB','CCC','CCC','CCC'],
"Amount":['900','800','700','600','500','400','400','300','200'],
"Category" :['Grocery','Bank','Gas','Bank','Grocery','Recreation',
'Bank','Grocery','Gas']})
I want to build a visualization that shows, for every "Card_name", the categories along with the amount. Maybe a stacked bar chart which shows all the categories for each "Card_name". The size of each area in the stacked bar chart depends on the Amount.
I tried many possible ways but I am not able to visualize it. Any help will be appreciated.

First pivot your df, then pass the option stacked=True to plot:
import pandas as pd
df = pd.DataFrame({"Card_name":['AAA','AAA','AAA','BBB','BBB','BBB','CCC','CCC','CCC'],
"Amount":['900','800','700','600','500','400','400','300','200'],
"Category" :['Grocery','Bank','Gas','Bank','Grocery','Recreation','Bank','Grocery','Gas']})
df['Amount'] = pd.to_numeric(df['Amount'])
df.pivot(index='Card_name', columns='Category', values='Amount').plot(kind='bar', stacked=True)
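To make the pivot step easier to follow, here is a small sketch that stops before plotting and inspects the pivoted frame. It uses the same data as above; the name pivoted is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Card_name": ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB', 'CCC', 'CCC', 'CCC'],
    "Amount": ['900', '800', '700', '600', '500', '400', '400', '300', '200'],
    "Category": ['Grocery', 'Bank', 'Gas', 'Bank', 'Grocery', 'Recreation',
                 'Bank', 'Grocery', 'Gas'],
})
# Amounts are stored as strings, so convert them before any arithmetic or plotting.
df['Amount'] = pd.to_numeric(df['Amount'])

# One row per card, one column per category; missing combinations become NaN.
pivoted = df.pivot(index='Card_name', columns='Category', values='Amount')
print(pivoted)
```

Each column of the pivoted frame becomes one layer of the stack, and NaN entries (such as Recreation for AAA) simply contribute no segment.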
