Related
I'm trying to visualize the following .csv data:
Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9,Q10,Q11,Q12,Q13,Q14,Q15,Q16,Q17,Q18,Q19,Q20
4,4,2,2,4,2,3,5,3,4,2,5,2,1,4,4,2,1,5,2
2,2,4,4,4,2,2,2,4,4,2,4,2,2,3,2,2,4,5,2
4,5,4,1,4,2,2,4,4,3,2,2,2,1,2,4,4,2,5,4
3,4,2,4,4,2,2,2,4,3,2,4,4,3,3,4,2,4,5,1
4,4,3,2,4,3,4,5,4,3,1,5,3,2,4,2,2,3,4,2
4,5,2,3,5,1,3,4,3,3,1,2,4,4,5,4,1,4,5,4
5,5,5,2,4,3,2,4,4,2,2,4,4,2,4,2,2,4,4,5
4,4,3,1,5,3,2,4,2,2,1,4,4,2,4,1,2,5,5,3
1,3,5,2,4,4,3,1,4,4,2,3,1,4,3,4,3,3,4,1
3,3,5,2,4,2,4,4,3,4,1,5,4,2,1,2,2,4,5,2
Here's my code:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.csv')
cm = sns.clustermap(df, annot=True, linewidths=2, linecolor='yellow', metric="correlation", method="single")
plt.show()
Which returns:
I want to rearrange my heatmap and order it column-wise by the frequency of the most common response in each column. For example, column Q5 has the value 4 repeated 8 times (more than any value in any other column), so it should be the first column. Columns Q17 and Q19 each have a value that is repeated 7 times, so they should come second and third (their exact order doesn't matter). How can I do this?
You can compute the order and reindex before using the data in clustermap:
order = (df.apply(pd.Series.value_counts)
           .max()
           .sort_values(ascending=False)
           .index
        )
import seaborn as sns
cm = sns.clustermap(df[order], col_cluster=False, annot=True, linewidths=2, linecolor='yellow', metric="correlation", method="single")
Output:
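To make the `order` expression less opaque, here is a minimal sketch on a small made-up frame (the column names A to D and their values are assumptions for illustration, not the data from the question). `value_counts` is applied to each column, `max` keeps the highest count per column, and sorting those counts in descending order yields the column order.

```python
import pandas as pd

# Hypothetical toy data, chosen so that no two columns tie.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],  # every value appears once   -> top count 1
    "B": [4, 4, 4, 4],  # 4 appears four times       -> top count 4
    "C": [2, 2, 5, 1],  # 2 appears twice            -> top count 2
    "D": [3, 3, 3, 1],  # 3 appears three times      -> top count 3
})

# One row per distinct value, one column per original column;
# NaN where a value never occurs in that column.
counts = df.apply(pd.Series.value_counts)

# Highest count in each column (NaN entries are skipped by max).
top = counts.max()

# Column labels sorted from most to least repeated value.
order = top.sort_values(ascending=False).index

print(list(order))  # ['B', 'D', 'C', 'A']
```

`df[order]` then gives the reordered frame that is passed to `clustermap` with `col_cluster=False`, so that seaborn does not re-cluster the columns.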
I have a dataframe with 3 variables:
data= [["2019/oct",10,"Approved"],["2019/oct",20,"Approved"],["2019/oct",30,"Approved"],["2019/oct",40,"Approved"],["2019/nov",20,"Under evaluation"],["2019/dec",30,"Aproved"]]
df = pd.DataFrame(data, columns=['Period', 'Observations', 'Result'])
I want a barplot grouped by the Period column, showing all the values contained in the Observations column and colored by the Result column.
How can I do this?
I tried sns.barplot, but it merged the values of the Observations column for each period into a single bar (the mean of the values).
sns.barplot(x='Period',y='Observations',hue='Result',data=df,ci=None)
Plot output
Assuming that you want one bar for each row, you can do as follows:
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
result_cat = df["Result"].astype("category")
result_codes = result_cat.cat.codes.values
cmap = plt.cm.Dark2(range(df["Result"].unique().shape[0]))
patches = []
for code in result_cat.cat.codes.unique():
    cat = result_cat.cat.categories[code]
    patches.append(mpatches.Patch(color=cmap[code], label=cat))
df.plot.bar(x='Period',
            y='Observations',
            color=cmap[result_codes],
            legend=False)
plt.ylabel("Observations")
plt.legend(handles=patches)
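The same idea (one bar per row, colored by status) can also be sketched with a plain dictionary from status to color instead of category codes. This is only an alternative under assumptions: the `palette` and `colors` names are mine, and the `Agg` backend line is there so the script runs without a display. Note that the question's data contains the spelling "Aproved" in the last row; it is kept verbatim here, so it shows up as a third category.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import pandas as pd

data = [["2019/oct", 10, "Approved"], ["2019/oct", 20, "Approved"],
        ["2019/oct", 30, "Approved"], ["2019/oct", 40, "Approved"],
        ["2019/nov", 20, "Under evaluation"], ["2019/dec", 30, "Aproved"]]
df = pd.DataFrame(data, columns=["Period", "Observations", "Result"])

# Map each distinct status to one color of the Dark2 colormap.
statuses = df["Result"].unique()
palette = {s: tuple(c) for s, c in zip(statuses, plt.cm.Dark2(range(len(statuses))))}

# One color per row, looked up from the row's status.
colors = [palette[r] for r in df["Result"]]

fig, ax = plt.subplots()
positions = range(len(df))
ax.bar(positions, df["Observations"], color=colors)  # one bar per row
ax.set_xticks(list(positions))
ax.set_xticklabels(df["Period"], rotation=90)
ax.set_ylabel("Observations")
ax.legend(handles=[Patch(color=palette[s], label=s) for s in statuses])
fig.tight_layout()
```

If "Aproved" is a typo in the real data, fixing it before plotting (for example with `df["Result"].replace("Aproved", "Approved")`) would merge the two legend entries.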
If you would rather group the bars by month and stack the results, use the following. Note that I changed your data so that one month has more than one status. I'm not sure I fully understood your question, though:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
data= [["2019/oct",10,"Approved"],["2019/oct",20,"Approved"],["2019/oct",30,"Approved"],["2019/oct",40,"Under evaluation"],["2019/nov",20,"Under evaluation"],["2019/dec",30,"Aproved"]]
df = pd.DataFrame(data, columns=['Period', 'Observations', 'Result'])
df.groupby(['Period', 'Result'])['Observations'].sum().unstack('Result').plot(kind='bar', stacked=True)
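It may help to look at the table that the `groupby`/`unstack` chain builds before it is plotted. The sketch below uses the same data as this answer; the `table` and `ordered` names, and the reindexing step at the end, are my additions.

```python
import pandas as pd

data = [["2019/oct", 10, "Approved"], ["2019/oct", 20, "Approved"],
        ["2019/oct", 30, "Approved"], ["2019/oct", 40, "Under evaluation"],
        ["2019/nov", 20, "Under evaluation"], ["2019/dec", 30, "Aproved"]]
df = pd.DataFrame(data, columns=["Period", "Observations", "Result"])

# Sum the observations per (Period, Result) pair, then pivot Result into
# columns: one row per period, one column per status, NaN where a
# combination does not occur.
table = df.groupby(["Period", "Result"])["Observations"].sum().unstack("Result")
print(table)

# groupby sorts the periods as strings (dec, nov, oct), not chronologically.
# Reindexing restores the intended order before plotting.
ordered = table.reindex(["2019/oct", "2019/nov", "2019/dec"])
print(ordered)
```

Each row of the table becomes one stacked bar and each column one colored segment, so calling `ordered.plot(kind='bar', stacked=True)` draws the months in chronological order.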
I have 2 separate dataframes with the same structure but different numbers in them:
df = pd.DataFrame({'clip emotes':[79,223,435,291,188,99,153,50,55,78,83,48,43,73]}, index=['roohappy','rooblank','lul','omegalul','pog','pogchamp','roovv','roowut','roopog','pepehands','biblethumb','roocry','rooree','rooblind'])
df
and
df = pd.DataFrame({'vod emotes':[3963,7286,5560,4390,3386,3111,2639,2612,2422,1999,1948,1691,1654,1573,1308,1090,1024,1019,1019,974,945,912,893,856,790,771,731,677,658,652]}, index=['rood','roovv','pepega','lul','clap','rookek','roocult','rooblank','pog','rooree','rooaww','roohappy','omegaroll','rooduck','rooh','rareroo','roocry','pepehand','lulw','rooderp','roopog','hyperclap','roospy','rooayaya','omegalul','roolove','roowut','roonya','monkas','roo4'])
df
I then call df.plot(kind='bar') on each of them separately. I can't figure out how to put both datasets into one graph, so that bars with the same name are stacked one over the other in different colours.
You can do it by joining them:
import pandas as pd
import matplotlib.pyplot as plt
df1 = pd.DataFrame({'clip emotes':[79,223,435,291,188,99,153,50,55,78,83,48,43,73]}, index=['roohappy','rooblank','lul','omegalul','pog','pogchamp','roovv','roowut','roopog','pepehands','biblethumb','roocry','rooree','rooblind'])
df2 = pd.DataFrame({'vod emotes':[3963,7286,5560,4390,3386,3111,2639,2612,2422,1999,1948,1691,1654,1573,1308,1090,1024,1019,1019,974,945,912,893,856,790,771,731,677,658,652]}, index=['rood','roovv','pepega','lul','clap','rookek','roocult','rooblank','pog','rooree','rooaww','roohappy','omegaroll','rooduck','rooh','rareroo','roocry','pepehand','lulw','rooderp','roopog','hyperclap','roospy','rooayaya','omegalul','roolove','roowut','roonya','monkas','roo4'])
df3 = df2.join(df1)
df3.plot(kind='bar', stacked=True)
plt.tight_layout()
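One thing to be aware of: `df2.join(df1)` is a left join by default, so emotes that appear only in the clip data (for example `pogchamp`) are dropped from the result. If those should be kept, an outer join is one option. The sketch below uses a small subset of the numbers above to keep it short; the `clips`, `vods` and `joined` names are mine.

```python
import pandas as pd

# Subset of the question's data: 'pogchamp' exists only in the clip counts,
# 'rood' only in the VOD counts.
clips = pd.DataFrame({"clip emotes": [435, 188, 99]},
                     index=["lul", "pog", "pogchamp"])
vods = pd.DataFrame({"vod emotes": [4390, 2422, 3963]},
                    index=["lul", "pog", "rood"])

# Default (left) join keeps only the index labels of the calling frame.
left = vods.join(clips)

# Outer join keeps the union of both indexes; missing counts become NaN.
joined = vods.join(clips, how="outer")

print(left)
print(joined)
```

`joined.plot(kind='bar', stacked=True)` then works as in the answer; matplotlib simply draws nothing for the NaN segments, or you can call `joined.fillna(0)` first to make that explicit.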
Hello,
I'm trying to plot a box plot combining columns from two different data frames. Help please :)
This is the code:
import pandas as pd
from numpy import random
#Generating the data frame
df1 = pd.DataFrame(data = random.randn(5,2), columns = ['W','Y'])
df2 = pd.DataFrame(data = random.randn(5,2), columns = ['X','Y'])
print(df1.head())
print('\n')
print(df2.head())
This is the output:
This is what I want to get:
The following will give you what you desire:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1)
ax.boxplot([df1['Y'], df2['Y']], positions=[1, 2])
ax.set_xticklabels(['W', 'X'])
ax.set_ylabel('Y')
This gave me the plot below (which I think is what you were aiming for):
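As an alternative sketch, the two `Y` columns can first be combined into one frame with `pd.concat` and then plotted with pandas' own box plot. The labels `df1` and `df2`, the fixed random seed, and the `Agg` backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the example is repeatable
df1 = pd.DataFrame(rng.standard_normal((5, 2)), columns=["W", "Y"])
df2 = pd.DataFrame(rng.standard_normal((5, 2)), columns=["X", "Y"])

# Put the two Y columns side by side; `keys` supplies the new column labels.
combined = pd.concat([df1["Y"], df2["Y"]], axis=1, keys=["df1", "df2"])

# One box per column of the combined frame.
ax = combined.plot.box()
ax.set_ylabel("Y")
```

This keeps the tick labels tied to the column names of `combined`, so they cannot drift out of sync with the data being plotted.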
I have a DataFrame with 700 rows and 6 columns:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(700,6))
I can plot all columns in a single plot by calling:
df.plot()
And I can plot each column in its own subplot by calling:
df.plot(subplots=True)
How can I have two subplots with three columns each from my DataFrame?
Here's a general approach to plot a dataframe with n columns in each subplot:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(700,6))
col_per_plot = 3
cols = df.columns.tolist()
# Create groups of 3 columns
cols_splits = [cols[i:i+col_per_plot] for i in range(0, len(cols), col_per_plot)]
# Define plot grid.
# Here I assume it is always one row and many columns. You could get fancier...
fig, axarr = plt.subplots(1, len(cols_splits))
# Plot each "slice" of the dataframe in a different subplot
for cc, ax in zip(cols_splits, axarr):
    df.loc[:, cc].plot(ax=ax)
This gives the following picture:
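One caveat with the loop above: when there is only one group of columns, `plt.subplots(1, 1)` returns a single Axes rather than an array, and `zip` over it fails. A minimal sketch that sidesteps this by passing `squeeze=False`, so the result is always a 2-D array (the `groups` name and the `Agg` backend line are my assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(700, 6))

col_per_plot = 3
cols = df.columns.tolist()

# Consecutive groups of `col_per_plot` columns; the last may be shorter.
groups = [cols[i:i + col_per_plot] for i in range(0, len(cols), col_per_plot)]

# squeeze=False always returns a 2-D array of Axes, even for a single subplot.
fig, axarr = plt.subplots(1, len(groups), squeeze=False)

# Plot each group of columns in its own subplot.
for group, ax in zip(groups, axarr.ravel()):
    df[group].plot(ax=ax)

fig.tight_layout()
```

Setting `col_per_plot = 6` here produces a single subplot with all six lines instead of raising an error.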