Using Altair, how can I create a chart of the value_counts() of multiple columns? This is easily done with matplotlib. How can the identical chart be created in Altair?
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'Col1':[0,1,2,3],
'Col2':[0,1,2,2],
'Col3':[2,3,3,3]})
pd.DataFrame({col:df[col].value_counts(normalize=True) for col in df}).plot(kind='bar')
You could do this:
import pandas as pd
import altair as alt
df = pd.DataFrame({
'Col1':[0,1,2,3],
'Col2':[0,1,2,2],
'Col3':[2,3,3,3]
}).melt(var_name='column')
alt.Chart(df).mark_bar().encode(
x='column',
y='count()',
column='value:O',
color='column'
)
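Note that the matplotlib version plots normalized frequencies (normalize=True), whereas count() gives raw counts. If proportions are wanted, one option is to precompute them in pandas before charting. The following is only a sketch; the names props and proportion are chosen here for illustration and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": [0, 1, 2, 3],
    "Col2": [0, 1, 2, 2],
    "Col3": [2, 3, 3, 3],
})

# Long form: one row per (column, value) pair
long_df = df.melt(var_name="column")

# Share of each value within its source column
props = (
    long_df.groupby("column")["value"]
    .value_counts(normalize=True)
    .rename("proportion")  # explicit name, independent of the pandas version
    .reset_index()
)
print(props)
```

The resulting frame can then be passed to alt.Chart with y='proportion:Q' in place of y='count()'.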
From Altair 5 onward, you can use the offset channels (such as xOffset) instead of faceting.
Could you please help me make a pie chart in Python from this data?
This is a reproducible example of what the df looks like. The real one has many more rows.
import pandas as pd
data = [["70%"], ["20%"], ["10%"]]
example = pd.DataFrame(data, columns = ['percentage'])
example.index = ['Lasiogl', 'Centella', 'Osmia']
example
You can use matplotlib to plot the pie chart from the dataframe, with the index names as the labels of the chart:
import matplotlib.pyplot as plt
import pandas as pd
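data = [["70%"], ["20%"], ["10%"]]
example = pd.DataFrame(data, columns=['percentage'])
my_labels = ['Lasiogl', 'Centella', 'Osmia']
# plt.pie needs numbers, so strip the "%" sign and convert to float
sizes = example['percentage'].str.rstrip('%').astype(float)
plt.pie(sizes, labels=my_labels, autopct='%1.1f%%')
plt.show()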
I have seen many questions on SO about changing the tick frequency, and they helped when I was building a line chart, but I have been struggling to do the same for a bar chart. My code is below:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame(np.random.randint(1,10,(90,1)),columns=['Values'])
df.plot(kind='bar')
plt.show()
and that's the output I see. How do I change the tick frequency?
(To be clearer: I want a tick every 5 units on the x axis.)
Using Pandas plot function you can do:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1,10,(90,1)),columns=['Values'])
df.plot(kind='bar', xticks=np.arange(0,90,5))
Or better:
df.plot(kind='bar', xticks=list(df.index[0::5]))
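Both variants above rely on the default integer index, where index values and bar positions coincide. With a non-integer index (dates, strings), the bars still sit at positions 0 to len(df)-1, so the ticks would need to be set by position and labelled separately. A sketch of that idea follows; plot_bar_with_tick_step is a hypothetical helper name, and the date-like index is made up for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_bar_with_tick_step(df, step):
    """Bar-plot df and keep only every `step`-th x tick (hypothetical helper)."""
    ax = df.plot(kind="bar")
    # Bars sit at integer positions 0..len(df)-1, whatever the index holds
    positions = np.arange(0, len(df), step)
    ax.set_xticks(positions)
    # Label each kept position with the matching index entry
    ax.set_xticklabels(df.index[positions])
    return ax


# Illustrative data with a string index built from dates
dates = pd.date_range("2021-01-01", periods=90, freq="D")
df = pd.DataFrame({"Values": np.random.randint(1, 10, 90)},
                  index=dates.strftime("%m-%d"))
ax = plot_bar_with_tick_step(df, 5)
```

Call plt.show() afterwards to display the figure.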
The following is my dataframe.
How can I create a line chart using plotly where the x-axis contains the years and the y-axis contains the production value for each year?
IIUC, this code should do the job:
import plotly.graph_objects as go
import pandas as pd
df = pd.DataFrame({'Minerals':['Nat. Gas'],
'2013':[5886],
'2014':[5258],
'2015':[5214],
'2016':[5073],
'2017':[5009],})
# Positional indexing: row 0, every column after 'Minerals'
fig = go.Figure(data=go.Scatter(x=df.columns[1:],
y=df.iloc[0, 1:]))
fig.show()
and you get:
I have a dataframe with several categorical columns. I know how to use countplot, which routinely plots ONE column.
Q: How can I plot the maximum count from ALL columns in one plot?
Here is an example dataframe to clarify the question:
import pandas as pd
import numpy as np
import seaborn as sns
testdf=pd.DataFrame(({ 'Ahome' : pd.Categorical(["home"]*10),
'Bsearch' : pd.Categorical(["search"]*8 + ["NO"]*2),
'Cbuy' : pd.Categorical(["buy"]*5 + ["NO"]*5),
'Dcheck' : pd.Categorical(["check"]*3 + ["NO"]*7),
} ))
testdf.head(10)
sns.countplot(data=testdf,x='Bsearch');
The last line just uses a normal countplot for one column. I'd like to have the column categories (home, search, buy and check) on the x-axis and their frequencies on the y-axis.
You need to use countplot as below:
df = pd.melt(testdf)
sns.countplot(data=df.loc[df['value']!="NO"], x='variable', hue='value')
Output:
As @HarvIpan points out, using melt creates a long-form dataframe with the column names as entries. Calling countplot on this dataframe produces the correct plot.
Unlike the existing solution, I would recommend not using the hue argument at all.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df=pd.DataFrame(({ 'Ahome' : pd.Categorical(["home"]*10),
'Bsearch' : pd.Categorical(["search"]*8 + ["NO"]*2),
'Cbuy' : pd.Categorical(["buy"]*5 + ["NO"]*5),
'Dcheck' : pd.Categorical(["check"]*3 + ["NO"]*7),
} ))
df2 = df.melt(value_vars=df.columns)
df2 = df2[df2["value"] != "NO"]
sns.countplot(data=df2, x="variable")
plt.show()
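To see what melt does here without plotting, the long-form frame and the counts that countplot would draw can be inspected directly. This is a sketch using plain lists instead of pd.Categorical for brevity; the names long_df, kept and counts are illustrative only:

```python
import pandas as pd

testdf = pd.DataFrame({
    "Ahome": ["home"] * 10,
    "Bsearch": ["search"] * 8 + ["NO"] * 2,
    "Cbuy": ["buy"] * 5 + ["NO"] * 5,
    "Dcheck": ["check"] * 3 + ["NO"] * 7,
})

# melt stacks all columns into two: "variable" (old column name) and "value"
long_df = testdf.melt()

# Drop the "NO" entries, then count the remaining rows per original column
kept = long_df[long_df["value"] != "NO"]
counts = kept["variable"].value_counts().sort_index()
print(counts)
```

The counts come out as 10, 8, 5 and 3 for Ahome, Bsearch, Cbuy and Dcheck, which are the bar heights in the countplot.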
How can I plot this data frame using seaborn to show the KPI per model?
allFrame = pd.DataFrame({'modelName':['first','second', 'third'],
'kpi_1':[1,2,3],
'kpi_2':[2,4,3]})
Not like sns.barplot(x="kpi2", y="kpi1", hue="modelName", data=allFrame)
But rather like this per KPI
Try melting the dataframe first, and then you can plot using seaborn:
import pandas as pd
import seaborn as sns
allFrame = pd.DataFrame({'modelName':['first','second', 'third'],
'kpi_1':[1,2,3],
'kpi_2':[2,4,3]})
allFrame2 = pd.melt(frame=allFrame,
id_vars=['modelName'],
value_vars=["kpi_1","kpi_2"],
value_name="Values", var_name="kpis")
sns.barplot(x="kpis", y="Values", hue="modelName", data=allFrame2)
Thanks!