Rearranging the columns of my heatmap using python's seaborn - python

I'm trying to visualize the following .csv data:
Here's my code:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.csv')
map = sns.clustermap(df, annot=True, linewidths=2, linecolor='yellow', metric="correlation", method="single")
Which returns:
I want to rearrange my heatmap and order it column-wise by the frequency of each response. For example, the column Q5 has the value 4 repeated 8 times (more than any other column), so it should be the first column. Columns 17 and 19 have a value that is repeated 7 times, so they should come second and third (the exact order doesn't matter). How can I do this?

You can compute the order and reindex before using the data in clustermap:
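import seaborn as sns
# per column: count each value, keep the highest count, then sort the columns by it (descending)
order = df.apply(pd.Series.value_counts).max().sort_values(ascending=False).index
# col_cluster=False keeps the columns in the order given instead of re-clustering them
cm = sns.clustermap(df[order], col_cluster=False, annot=True, linewidths=2, linecolor='yellow', metric="correlation", method="single")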
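
To see what the ordering step does on its own, here is a minimal sketch on a small made-up frame. The column names and values are hypothetical, not the asker's CSV. It counts how often each value occurs per column, keeps the highest count, and sorts the columns by that count.

```python
import pandas as pd

# Hypothetical survey-style data; not the asker's CSV
df = pd.DataFrame({
    'Q1': [1, 2, 3, 4],   # every value occurs once
    'Q2': [4, 4, 4, 1],   # the value 4 occurs three times
    'Q3': [2, 2, 3, 1],   # the value 2 occurs twice
})

# Rows of `counts` are the distinct values, columns are the questions
counts = df.apply(pd.Series.value_counts)
top_frequency = counts.max()   # highest count per column (NaN entries are skipped)
order = top_frequency.sort_values(ascending=False).index.tolist()
print(order)
```

Passing `df[order]` to the plotting call then puts the column with the most-repeated value first.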


Pandas, Seaborn, Plot boxplot with 2 columns and a 3rd as hue

In a pandas DataFrame with 3 variables, I want to plot 2 columns as 2 separate boxes and use the 3rd column as hue with seaborn.
I can get the first step working with pd.melt, but I can't add the hue and make it work.
This is what I have:
sb.boxplot(data=pd.melt(df2), x="variable", y="value",palette= 'Blues')
I want to do this in the first DF, setting variable 'A' as hue
Can you help me?
Thank you
IIUC, you can achieve this as follows:
Apply df.melt, using column A for id_vars, and ['B','C'] for value_vars.
Next, inside sns.boxplot, feed the melted df to the data parameter, and add hue='A'.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'A':['a','a','b','a','b'], 'B':[1,3,5,4,7], 'C':[2,3,4,1,3]})
sns.boxplot(data=df.melt(id_vars='A', value_vars=['B','C']),
x='variable', y='value', hue='A', palette='Blues')
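
As a quick check of what the melt step produces, the sketch below prints the long-form frame that gets passed to the boxplot. It reuses the example data from the answer; only the name `long_df` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'a', 'b', 'a', 'b'],
                   'B': [1, 3, 5, 4, 7],
                   'C': [2, 3, 4, 1, 3]})

# 'A' is kept as an identifier column; 'B' and 'C' are stacked into
# a 'variable' column (old column name) and a 'value' column (old cell value)
long_df = df.melt(id_vars='A', value_vars=['B', 'C'])
print(long_df)
```

Each original row appears twice in the result, once per melted column, which is why `x='variable'` yields two boxes and `hue='A'` can split each of them.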

Multi Index Seaborn Line Plot

I have a multi index dataframe, with the two indices being Sample and Lithology
Sample 20EC-P 20EC-8 20EC-10-1 ... 20EC-43 20EC-45 20EC-54
Lithology Pd Di-Grd Gb ... Hbl Plag Pd Di-Grd Gb
Rb 7.401575 39.055118 6.456693 ... 0.629921 56.535433 11.653543
Ba 24.610102 43.067678 10.716841 ... 1.073115 58.520532 56.946630
Th 3.176471 19.647059 3.647059 ... 0.823529 29.647059 5.294118
I am trying to put it into a seaborn lineplot as such.
spider = sns.lineplot(data = data, hue = data.columns.get_level_values("Lithology"),
style = data.columns.get_level_values("Sample"),
dashes = False, palette = "deep")
The lineplot comes out as
I have two issues. First, I want to format hues by lithology and style by sample. Outside of the lineplot function, I can successfully access sample and lithology using data.columns.get_level_values, but in the lineplot they don't seem to do anything and I haven't figured out another way to access these values. Also, the lineplot reorganizes the x-axis by alphabetical order. I want to force it to keep the same order as the dataframe, but I don't see any way to do this in the documentation.
To use hue= and style=, seaborn prefers its dataframes in long form. pd.melt() combines all columns into a single column of values and creates new columns holding the old column names. The index also needs to be converted to a regular column (with .reset_index()).
Most seaborn functions accept order= to set the order of the x-values, but with lineplot the only way is to make the column categorical with a fixed order.
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
column_tuples = [('20EC-P', 'Pd '), ('20EC-8', 'Di-Grd'), ('20EC-10-1 ', 'Gb'),
('20EC-43', 'Hbl Plag Pd'), ('20EC-45', 'Di-Grd'), ('20EC-54', 'Gb')]
col_index = pd.MultiIndex.from_tuples(column_tuples, names=["Sample", "Lithology"])
data = pd.DataFrame(np.random.uniform(0, 50, size=(3, len(col_index))), columns=col_index, index=['Rb', 'Ba', 'Th'])
data_long = data.melt(ignore_index=False).reset_index()
data_long['index'] = pd.Categorical(data_long['index'], data.index) # make categorical, use order of the original dataframe
ax = sns.lineplot(data=data_long, x='index', y='value',
hue="Lithology", style="Sample", dashes=False, markers=True, palette="deep")
ax.legend(loc='upper left', bbox_to_anchor=(1.01, 1.02))
plt.tight_layout() # fit legend and labels into the figure
The long dataframe looks like:
index Sample Lithology value
0 Rb 20EC-P Pd 6.135005
1 Ba 20EC-P Pd 6.924961
2 Th 20EC-P Pd 44.270570
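
The effect of the categorical conversion can be seen without any plotting. The sketch below is an illustration only: the shuffled values are made up, and it uses sorting to show that a categorical column follows its declared category order rather than alphabetical order, which is the property the answer relies on for the x-axis.

```python
import pandas as pd

elements = ['Rb', 'Ba', 'Th']                  # desired order, as in the original index
values = pd.Series(['Th', 'Rb', 'Ba', 'Rb'])   # hypothetical, shuffled on purpose

# Plain strings sort alphabetically
as_text = values.sort_values().tolist()

# A categorical with explicit categories sorts in the declared order
fixed_order = pd.CategoricalDtype(categories=elements, ordered=True)
as_categorical = values.astype(fixed_order).sort_values().tolist()

print(as_text)
print(as_categorical)
```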

Python Pandas - Plotting multiple Bar plots by category from dataframe

I have a dataframe which looks like this:
df = pd.DataFrame(data={'ID':[1,1,1,2,2,2], 'Value':[13, 12, 15, 4, 2, 3]})
Index ID Value
0 1 13
1 1 12
2 1 15
3 2 4
4 2 2
5 2 3
and I want to plot it by the IDs (categories) so that each category would have different bar plot,
so in this case I would have two figures,
one figure with bar plot of ID=1,
and second separate figure bar plot of ID=2.
Can I do it (preferably without loops) with something like df.plot(y='Value', kind='bar')?
Two options are possible: one using matplotlib, and the other using seaborn, which you should absolutely know, as it works well with pandas.
Pandas with matplotlib
You first create the subplots with the number of rows and columns you choose. plt.subplots returns an array of axes that is 1-D if either nrows or ncols is 1 (a single axes object if both are 1), and 2-D otherwise. You then pass each axes object to the pandas plot method.
If the number of categories is large or not known in advance, you need to use a loop.
import pandas as pd
import matplotlib.pyplot as plt
fig, axes = plt.subplots( nrows=1, ncols=2, sharey=True )
df.loc[df["ID"] == 1, 'Value'].plot.bar(ax=axes[0])  # bars for ID=1 on the left axes
df.loc[df["ID"] == 2, 'Value'].plot.bar(ax=axes[1])  # bars for ID=2 on the right axes
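
For the case where the number of categories is not known in advance, a loop over the groups could look like the sketch below. This is one possible arrangement, not the only one: it uses `groupby` to find the categories, and `squeeze=False` so that `axes` is always 2-D regardless of how many IDs there are. The `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 2], 'Value': [13, 12, 15, 4, 2, 3]})

# One (id, sub-frame) pair per distinct ID
groups = list(df.groupby('ID'))

# squeeze=False always returns a 2-D array of axes, here with shape (1, n_groups)
fig, axes = plt.subplots(nrows=1, ncols=len(groups), sharey=True, squeeze=False)

for ax, (id_value, group) in zip(axes[0], groups):
    group['Value'].plot.bar(ax=ax, title=f"ID={id_value}")

fig.savefig("bars_by_id.png")  # hypothetical output file name
```

To get one figure per ID instead of one panel per ID, call `plt.subplots()` inside the loop.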
Pandas with seaborn
Seaborn is the most amazing graphical tool that I know. The function catplot lets you plot a series of graphs according to the values of a column when you set the argument col. You can select the type of plot with kind.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df['index'] = [1,2,3] * 2
sns.catplot(kind='bar', data=df, x='index', y='Value', col='ID')
I added an index column so that the result is comparable with the matplotlib version above. If you don't want it, remove x='index' and each panel will display a single bar with an error bar.

Bar plot and coloured categorical variable

I have a dataframe with 3 variables:
data= [["2019/oct",10,"Approved"],["2019/oct",20,"Approved"],["2019/oct",30,"Approved"],["2019/oct",40,"Approved"],["2019/nov",20,"Under evaluation"],["2019/dec",30,"Aproved"]]
df = pd.DataFrame(data, columns=['Period', 'Observations', 'Result'])
I want a barplot grouped by the Period column, showing all the values contained in the Observations column and colored with the Result column.
How can I do this?
I tried the sns.barplot, but it joined the values in Observations column in just one bar(mean of the values).
Plot output
Assuming that you want one bar for each row, you can do as follows:
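import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
result_cat = df["Result"].astype("category")
result_codes = result_cat.cat.codes.values  # integer category code of each row
# one colour per category (the colormap name is an assumption; any qualitative map works)
cmap = plt.get_cmap("tab10")(range(df["Result"].unique().shape[0]))
patches = []  # legend handles, one per category
for code in result_cat.cat.codes.unique():
    cat = result_cat.cat.categories[code]
    patches.append(mpatches.Patch(color=cmap[code], label=cat))
# one bar per row, labelled by 'Period' and coloured by its Result category
plt.bar(range(len(df)), df["Observations"], color=cmap[result_codes])
plt.xticks(range(len(df)), df["Period"])
plt.legend(handles=patches)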
If you would like it grouped by month and then stacked, use the following (note that I updated your data so that one month has more than one status). I am not sure I understood your question completely:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
data= [["2019/oct",10,"Approved"],["2019/oct",20,"Approved"],["2019/oct",30,"Approved"],["2019/oct",40,"Under evaluation"],["2019/nov",20,"Under evaluation"],["2019/dec",30,"Aproved"]]
df = pd.DataFrame(data, columns=['Period', 'Observations', 'Result'])
df.groupby(['Period', 'Result'])['Observations'].sum().unstack('Result').plot(kind='bar', stacked=True)
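
It can help to look at the intermediate table before plotting it. The sketch below reuses the data from the answer and stops just before `.plot(...)`; the name `table` is introduced here for illustration.

```python
import pandas as pd

data = [["2019/oct", 10, "Approved"], ["2019/oct", 20, "Approved"],
        ["2019/oct", 30, "Approved"], ["2019/oct", 40, "Under evaluation"],
        ["2019/nov", 20, "Under evaluation"], ["2019/dec", 30, "Aproved"]]
df = pd.DataFrame(data, columns=['Period', 'Observations', 'Result'])

# Sum Observations per (Period, Result) pair, then move Result into the columns
table = df.groupby(['Period', 'Result'])['Observations'].sum().unstack('Result')
print(table)
```

Each row of `table` becomes one stacked bar and each column one colour. Two things are visible here: the periods are sorted alphabetically (dec, nov, oct), and "Aproved" gets its own column because it is spelled differently from "Approved" in the data.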

Can we create scatter plot with a single data line

I have sample data in dataframe as below
Date EmpCount DeptCount
0 2009-01-01 100 200
Can we generate a scatter plot (or any line chart, etc.) with only this one record?
I tried multiple approaches, but I am getting
TypeError: no numeric data to plot
In X Axis: Dates
In Y Axis: Two dots one for Emp Count , and other one is for dept count
Try this:
import pandas as pd
import matplotlib.pyplot as plt
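header = ['Date', 'EmpCount', 'DeptCount']  # column names taken from the question
df = pd.DataFrame([['2009-01-01',100,200]],columns=header)
b = df.set_index('Date')  # assumed definition of b: dates on the x-axis, one series per remaining column
ax = plt.plot(b, linewidth=3, markersize=10, marker='.')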
What are you using to plot the scatter plot?
Here's how to do it with pyplot.
import pandas as pd
import matplotlib.pyplot as plt
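header = ['Date', 'EmpCount', 'DeptCount']  # column names taken from the question
df = pd.DataFrame([['2009-01-01',100,200]],columns=header)
plt.scatter(*df.iloc[0][1:])  # a single point at x=EmpCount, y=DeptCount
plt.show()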
iloc[0] gets the first entry, [1:] takes all the columns except the first and the * operator unpacks the arguments.
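
If the goal is the layout described in the question (date on the x-axis, one dot for each count), a possible alternative is sketched below. It assumes the column names from the question, converts Date to a real datetime, and melts the two counts into long form; `Metric`, `Count`, and the output file name are names chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([['2009-01-01', 100, 200]],
                  columns=['Date', 'EmpCount', 'DeptCount'])
df['Date'] = pd.to_datetime(df['Date'])   # make the x-values real dates

# Long form: one row per (Date, Metric) pair
long_df = df.melt(id_vars='Date', var_name='Metric', value_name='Count')

fig, ax = plt.subplots()
for metric, group in long_df.groupby('Metric'):
    ax.scatter(group['Date'], group['Count'], label=metric)  # one dot per metric
ax.legend()
fig.savefig("single_record_scatter.png")  # hypothetical output file name
```

The same loop works unchanged once the frame has more than one row.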

