create box plot of subcolumns of pandas dataframe

I have the following pandas dataframe. I would like to create box (sub)plots of all 5 columns in one plot. How can I achieve this?
I am using the following Python statement, but I am not getting any output:
df.boxplot(column=['synonym']['score'])

Here is an example of a box plot via plotly.express:
import pandas as pd
import plotly.express as px
df = pd.DataFrame(dict(x1=[1, 2, 3], x2=[4, 8, 12], x3=[1, 5, 10]))
# Reshape from wide to long: one row per (variable, value) pair
df = df.melt(value_vars=['x1', 'x2', 'x3'])
fig = px.box(df, x='variable', y='value', color='variable')
fig.show()
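If the goal is only to draw several columns side by side, pandas' own DataFrame.boxplot may also be enough, since its column argument takes a list of column names. The expression in the question, ['synonym']['score'], indexes a list with a string and so raises a TypeError before boxplot is ever called. The sketch below reuses the x1, x2, x3 columns from the answer above; how the asker's real columns are named is not known, so the names here are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook or GUI session
import pandas as pd

# Same toy data as the plotly.express answer
df = pd.DataFrame(dict(x1=[1, 2, 3], x2=[4, 8, 12], x3=[1, 5, 10]))

# column takes a list of names; one box is drawn per listed column
ax = df.boxplot(column=["x1", "x2", "x3"])
ax.set_ylabel("value")
```

The call returns a Matplotlib axes object, so titles and labels can be adjusted afterwards in the usual way.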

Related

python plotly express multiple layer graph (boxchart + scatter)

I want to create a multi layer graph with the same data frame from pandas.
One should be a boxplot and the other a scatter to see where the company is located.
Is there a way to combine both plots?
(Images: boxplot, scatterplot)
import pandas as pd
import plotly.express as px
df = pd.read_csv("company_index.csv", sep=";", decimal=",")
print(df)
df_u9 = df.loc[df["company"].isin(["U9"])]
fig_1 = px.box(
    df,
    x="period",
    y="index"
)
fig_2 = px.scatter(
    df_u9,
    x="period",
    y="index"
)
fig_1.show()
fig_2.show()
company_index.csv
period;index;company
1;202,4;U1
1;226,69;U10
1;235,18;U9
1;236,49;U4
1;238,13;U2
1;244,05;U6
1;252,08;U3
1;256,68;U8
1;294,99;U5
1;299,391;U7
2;243,78;U1
2;264,26;U10
2;270,6;U2
2;272,89;U9
2;285,26;U5
2;289,29;U4
2;291,15;U6
2;291,19;U3
2;305,92;U7
2;314,65;U8
3;271,82;U1
3;278,65;U2
3;296,16;U10
3;297,21;U4
3;305,93;U6
3;308,96;U5
3;323,74;U9
3;335,93;U3
3;354,13;U8
3;381,2;U7
4;281,26;U5
4;308,5;U2
4;311,61;U1
4;334,03;U4
4;335,72;U9
4;344,32;U8
4;345,27;U6
4;355,44;U3
4;373,54;U7
4;381,68;U10
5;288,6;U1
5;305,66;U5
5;323,2;U2
5;358,46;U8
5;365,57;U3
5;366,96;U10
5;368,38;U7
5;371,23;U6
5;373,63;U4
5;422,93;U9
6;285,32;U5
6;291,65;U1
6;308,68;U2
6;372,04;U8
6;376,64;U3
6;403,55;U6
6;407,38;U4
6;420,65;U10
6;423,68;U9
6;453,09;U7

I found this solution, and it works rather well.
I'm still working out the ".data[0]" part, but as far as I can tell it refers to the first trace of the figure that px.scatter returns; a figure can hold several traces.
import pandas as pd
import plotly.express as px
df = pd.read_csv("company_index.csv", sep=";", decimal=",")
print(df)
df_u9 = df.loc[df["company"].isin(["U9"])].copy()
df_u9["size"] = 1
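fig = px.box(
    df,
    x="period",
    y="index"
)
# px.scatter returns a whole figure; .data[0] is its first (and only) trace
fig.add_trace(px.scatter(
    df_u9,
    x="period",
    y="index",
    size="size",
    size_max=15,
    # a list of CSS color strings; a bare (203, 153, 201) tuple is not read as one RGB color
    color_discrete_sequence=["rgb(203,153,201)"]
).data[0])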
fig.show()

How do you add two df to a plot map?

I want to combine df1.plot() and df0.plot() onto one plot graph. Currently, when running both, it will give me two plot maps, and I'm out of my expertise on how to join both of them.
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [25, 10]
df = df.sort_values('datetime', ascending=True)
df1.plot()
df0.plot()
plt.show()

Replace these two lines of your code:
df1.plot()
df0.plot()
with the following, where "X" and "Y" stand for your own column names:
plt.plot(df1["X"], df1["Y"])
plt.plot(df0["X"], df0["Y"])
plt.show()

You can either switch over entirely to Matplotlib code, as Mohit Mehlawat suggested, or keep using pandas.DataFrame.plot() by storing the returned axes in an ax variable and drawing both plots on it, like so:
import pandas as pd
df1 = pd.DataFrame({'Value':[1,2,3,4,5]})
df2 = pd.DataFrame({'Value':[5,4,3,2,1]})
# Relevant code
ax = df1.plot()     # the first call creates the axes and returns them
df2.plot(ax=ax)     # the second call draws onto the same axes
Output: [image: both lines drawn on the same axes]
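One wrinkle with the shared-axes approach: both frames use the column name "Value", so the legend shows the same label twice. A possible way around this, sketched here with the same toy frames, is to create the axes explicitly and set the legend labels by hand. The labels "df1" and "df2" are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook or GUI session
import matplotlib.pyplot as plt
import pandas as pd

df1 = pd.DataFrame({"Value": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"Value": [5, 4, 3, 2, 1]})

fig, ax = plt.subplots()
df1.plot(ax=ax)  # first line
df2.plot(ax=ax)  # second line, same axes

# Relabel the two lines in the order they were drawn
ax.legend(["df1", "df2"])
```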

Multiple consecutive bar plots with a time slider in Plotly, Python

I have a Pandas dataframe representing portfolio weights on multiple dates, such as the following contents in CSV format:
DATE,ASSET1,ASSET2,ASSET3,ASSET4,ASSET5,ASSET6,ASSET7
2010-01-04,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-02-03,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-03-05,0.217195,0.0,0.250000,0.032805,0.25,0.000000,0.250000
2010-04-06,0.139636,0.0,0.250000,0.110364,0.25,0.000000,0.250000
2010-05-05,0.179569,0.0,0.218951,0.101480,0.25,0.000000,0.250000
2010-06-04,0.207270,0.0,0.211974,0.080756,0.25,0.000000,0.250000
2010-07-06,0.132468,0.0,0.250000,0.117532,0.25,0.000000,0.250000
2010-08-04,0.116353,0.0,0.250000,0.133647,0.25,0.000000,0.250000
2010-09-02,0.081677,0.0,0.250000,0.168323,0.25,0.000000,0.250000
2010-10-04,0.000000,0.0,0.250000,0.250000,0.25,0.009955,0.240045
For each row in the Pandas dataframe resulting from this CSV, we can generate a bar chart with the portfolio composition on that day. I would like to have multiple bar charts with a time slider, so that we can choose one of the dates and see the portfolio composition on that day.
Can this be achieved with Plotly?

I could not find a way to do it straight in the dataframe above, but it is possible to do it by "melting" the dataframe. The following code achieves what I was looking for, together with some beautification of the chart:
import pandas as pd
from io import StringIO
import plotly.express as px
string = """
DATE,ASSET1,ASSET2,ASSET3,ASSET4,ASSET5,ASSET6,ASSET7
2010-01-04,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-02-03,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-03-05,0.217195,0.0,0.250000,0.032805,0.25,0.000000,0.250000
2010-04-06,0.139636,0.0,0.250000,0.110364,0.25,0.000000,0.250000
2010-05-05,0.179569,0.0,0.218951,0.101480,0.25,0.000000,0.250000
2010-06-04,0.207270,0.0,0.211974,0.080756,0.25,0.000000,0.250000
2010-07-06,0.132468,0.0,0.250000,0.117532,0.25,0.000000,0.250000
2010-08-04,0.116353,0.0,0.250000,0.133647,0.25,0.000000,0.250000
2010-09-02,0.081677,0.0,0.250000,0.168323,0.25,0.000000,0.250000
2010-10-04,0.000000,0.0,0.250000,0.250000,0.25,0.009955,0.240045
"""
df = pd.read_csv(StringIO(string))
df = df.melt(id_vars=['DATE']).sort_values(by = 'DATE')
fig = px.bar(df, x="variable", y="value", animation_frame="DATE")
fig.update_layout(legend_title_text = None)
fig.update_xaxes(title = "Asset")
fig.update_yaxes(title = "Proportion")
fig.update_layout(autosize = True, height = 600)
fig.update_layout(hovermode="x")
fig.update_layout(plot_bgcolor="#F8F8F8")
fig.update_traces(
    hovertemplate='<i></i> %{y:.2%}'
)
fig.show()
This produces the following: [image: bar chart of asset weights with a date slider]
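For readers unfamiliar with "melting", the short sketch below shows what the reshaping step does to two rows and two asset columns taken from the CSV above. After melt, every (date, asset) pair becomes its own row, which is the long layout that animation_frame expects. The variable names wide and long_df are made up for illustration.

```python
import pandas as pd

# Two dates and two assets copied from the CSV in the question
wide = pd.DataFrame({
    "DATE": ["2010-01-04", "2010-03-05"],
    "ASSET1": [0.250000, 0.217195],
    "ASSET4": [0.000000, 0.032805],
})

# id_vars stays as a column; every other column becomes (variable, value) rows
long_df = wide.melt(id_vars=["DATE"]).sort_values(by="DATE")
print(long_df)
```

The default column names "variable" and "value" are what the px.bar call above refers to.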

Can I make a pie chart based on indexes in Python?

Could you please help me make a pie chart in Python from this data?
This is a reproducible example of what the df looks like; the real one has many more rows.
import pandas as pd
data = [["70%"], ["20%"], ["10%"]]
example = pd.DataFrame(data, columns = ['percentage'])
example.index = ['Lasiogl', 'Centella', 'Osmia']
example

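You can use matplotlib to plot the pie chart, with the dataframe's index as the labels. The percentages are stored as strings, so they have to be converted to numbers first:
import matplotlib.pyplot as plt
import pandas as pd
data = [["70%"], ["20%"], ["10%"]]
example = pd.DataFrame(data, columns=['percentage'])
example.index = ['Lasiogl', 'Centella', 'Osmia']
# Turn strings such as "70%" into floats such as 70.0
values = example['percentage'].str.rstrip('%').astype(float)
plt.pie(values, labels=example.index, autopct='%1.1f%%')
plt.show()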
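pandas can also draw the pie itself through Series.plot.pie, which takes the wedge labels from the index without being told. The following is a sketch under the same assumption as above, namely that the percentages arrive as strings ending in "%"; the name shares is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook or GUI session
import pandas as pd

example = pd.DataFrame(
    [["70%"], ["20%"], ["10%"]],
    columns=["percentage"],
    index=["Lasiogl", "Centella", "Osmia"],
)

# Convert "70%" style strings to floats before plotting
shares = example["percentage"].str.rstrip("%").astype(float)

# The index values become the wedge labels
ax = shares.plot.pie(autopct="%1.1f%%")
ax.set_ylabel("")  # hide the series name that pandas puts on the y axis
```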

Plotly graph_objects add df column to hovertemplate

I am trying to generally recreate this graph and struggling with adding a column to the hovertemplate of a plotly Scatter. Here is a working example:
import pandas as pd
import chart_studio.plotly as py
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality'] # This is my closest guess, but isn't working
fig = go.Figure(data=go.Scatter(
    x=df['Confirmed'],
    y=df['Deaths'],
    mode='markers',
    hovertext=df['Country'],
    hoverlabel=dict(namelength=0),
    hovertemplate='%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{percent}',
))
fig.show()
I'd like the Case-Fatality column to show where {percent} is.
I've also tried adding text = [df['Case-Fatality']] to the Scatter() call and switching {percent} to {text}, as shown in this example, but this doesn't pull from the dataframe as hoped.
I've tried replotting it with px, following this example, but it throws the error "dictionary changed size during iteration". I think using go may be simpler than px, but I'm new to plotly.
Thanks in advance for any insight into how to add a column to the hover.

As the question asks for a solution with graph_objects, here are two that work.
Method (i)
Put %{text} in the template where you want the value to appear, and pass the values as the text argument of the go.Scatter() call. Pass the Series itself, not a list containing it. Like this:
percent = df['Case-Fatality']
hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{text}',
text = percent
Here is the complete code:
import pandas as pd
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality']
fig = go.Figure(data=go.Scatter(
    x=df['Confirmed'],
    y=df['Deaths'],
    mode='markers',
    hovertext=df['Country'],
    hoverlabel=dict(namelength=0),
    hovertemplate='%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{text}',
    text=percent))
fig.show()
Method (ii)
This solution relies on the combined hover label you get when hovermode is set to 'x unified'. All you need to do then is add an invisible trace with the same x values and the desired values as y. Passing mode='none' makes it invisible. Here is the complete code:
import pandas as pd
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality']
fig = go.Figure(data=go.Scatter(
    x=df['Confirmed'],
    y=df['Deaths'],
    mode='markers',
    hovertext=df['Country'],
    hoverlabel=dict(namelength=0)))
fig.add_scatter(x=df.Confirmed, y=percent, mode='none')
fig.update_layout(hovermode='x unified')
fig.show()

The link you shared is broken. Are you looking for something like this?
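import pandas as pd
import plotly.express as px
# df is loaded as in the question
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
px.scatter(df,
           x="Confirmed",
           y="Deaths",
           hover_name="Country",
           hover_data={"Case-Fatality": True})
Then, if you need to use bold or change your hover template, you can follow the last step in this answer.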

Drawing inspiration from another SO question/answer, I find that this is working as desired and permits adding multiple cols to the hover data:
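import pandas as pd
import plotly.express as px
# df is loaded as in the question
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
fig = px.scatter(df,
                 x="Confirmed",
                 y="Deaths",
                 hover_name="Country",
                 hover_data=[df['Case-Fatality'], df['Deaths/100K pop.']])
fig.show()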
