How to add individual data labels to countplot in seaborn? [duplicate] - python

This question already has answers here:
How to plot and annotate grouped bars in seaborn / matplotlib
(1 answer)
How to plot and annotate a grouped bar chart
(1 answer)
How to add value labels on a bar chart
(7 answers)
Closed 1 year ago.
I have been working on a campus recruitment dataset. The target variable in the dataset is "status", which indicates whether the student is placed or not. I am comparing each variable (e.g. gender) with the target variable (status of placement) to find out which variable affects the target the most. To compare two variables, I have been using countplots in seaborn. The plot for the variable "gender" looks like this.
Image showing the sns plot
The code for the sns plot is as follows:
ax = sns.countplot(x="cat_degree_t", hue="status", order=df["cat_degree_t"].value_counts().index, data=df)
abs_values = df["cat_degree_t"].value_counts().values
ax.bar_label(container=ax.containers[0], labels=abs_values)
Now I want to know how I could add values of individual bars in the countplot (not the total value like already written in the figure shown above, but on every individual bar). This would help me find out the percentage of placed and not placed for each category in the variable "gender".
Any help would be really appreciated.
Thanks
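A minimal sketch of one way to do this, assuming matplotlib 3.4 or newer (where Axes.bar_label exists). The small DataFrame below is a hypothetical stand-in for the recruitment data; only the column names come from the question. With hue set, the axes hold one bar container per hue level, so looping over ax.containers labels every individual bar. The crosstab at the end is one possible way to get the per-category percentages.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the recruitment data (values are made up)
df = pd.DataFrame({
    "cat_degree_t": ["Sci", "Sci", "Sci", "Comm", "Comm", "Comm", "Comm", "Other", "Other"],
    "status": ["Placed", "Placed", "Not Placed", "Placed", "Placed", "Placed",
               "Not Placed", "Placed", "Not Placed"],
})

ax = sns.countplot(data=df, x="cat_degree_t", hue="status",
                   order=df["cat_degree_t"].value_counts().index)

# One container per hue level: label each bar with its own height
for container in ax.containers:
    ax.bar_label(container)

# Text of every label that was drawn on the axes
labels = [t.get_text() for t in ax.texts]

# Share of each status within each category, in percent
pct = pd.crosstab(df["cat_degree_t"], df["status"], normalize="index") * 100
```

Calling plt.show() (or fig.savefig) afterwards displays the labelled plot; pct holds the placed / not placed percentages per category.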

Related

Python line plotting 2 columns in same DataFrame using Index and count [duplicate]

This question already has answers here:
How to plot multiple pandas columns
(3 answers)
Plot multiple columns of pandas DataFrame using Seaborn
(2 answers)
How do I create a multiline plot using seaborn?
(3 answers)
Closed 26 days ago.
Newbie to Python so am unsure whether this can be done in one graph or not. I have one DataFrame containing Year, Number of Accidents and Number of Fatalities:
I am trying to generate a line plot that shows x axis = Year, y axis = number of instances per year, and 2 lines showing number of each individual column. Using Seaborn, I can only see a way to map 2 columns and hue. Can anyone please provide any advice on whether this is achievable in either Matplotlib or Seaborn.
Tried using Seaborn but cannot work out how to set up x and y axis as required and show 2 individual columns within that:
sns.lineplot(x=f1_safety['NumberOfFatalities'],y=f1_safety['NumberOfAccidents'].count(), hue = f1_safety['year'].count())
plt.show()
There are at least two ways to accomplish what you want to do here.
The simpler one uses pandas' built-in plotting API. You can plot dataframes directly when they are already in the correct form. In your case, you need to set the year as the index, and then you can plot right away:
f1_safety.set_index("year").plot()
If you want to use seaborn, you first need to transform the data into the correct format. seaborn takes a single x and a single y, so you cannot specify several y columns directly (like y1, y2 and so on). Instead, you need to transform the data into "long format". In such a table, you get one index or id column (here year), one "variable" column that names the original column, and one "value" column that holds the number. This works like this:
f1_long = pd.melt(f1_safety, id_vars="year", value_vars=["NumberOfAccidents", "NumberOfFatalities"])
sns.lineplot(data=f1_long, x="year", y="value", hue="variable")
The plot in both cases looks quite the same:
There are other ways. In particular, in Jupyter you can execute two plot statements in the same cell, and matplotlib will put the plots into the same figure, even cycling through the colors as necessary.
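To make the long-format reshaping concrete, here is a small sketch with made-up numbers (the values are hypothetical; only the column names come from the question). It shows what pd.melt returns, which is the shape seaborn's hue argument expects.

```python
import pandas as pd

# Hypothetical numbers, only to illustrate the reshaping
f1_safety = pd.DataFrame({
    "year": [2000, 2001, 2002],
    "NumberOfAccidents": [10, 12, 9],
    "NumberOfFatalities": [1, 0, 2],
})

# Wide -> long: one row per (year, original column) pair
f1_long = pd.melt(
    f1_safety,
    id_vars="year",
    value_vars=["NumberOfAccidents", "NumberOfFatalities"],
)
print(f1_long)
```

The result has the columns year, variable and value, with one row per year and original column, so hue="variable" draws one line per original column.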

Facing Problem in counting multi-label for classification [duplicate]

This question already has answers here:
Plot key count per unique value count in pandas
(3 answers)
Closed 2 years ago.
I am trying to count the number of labels for my multilabel classification, but I fail to plot a bar graph for my label column. Can anybody help me out? I already used the code below to plot, but it shows
'DataFrame' object has no attribute 'arange'
As you can see, there are multiple labels in the Labels column, so I want to plot a bar graph for them. Please help me out.
i=data.arange(20)
tag_df_sorted.head(20).plot(kind='bar')
plt.title('Frequency of top 20 tags')
plt.xticks(i, tag_df_sorted['Labels'])
plt.xlabel('Tags')
plt.ylabel('Counts')
plt.show()
Seems like you want to have a histogram.
You can either go like this:
tag_df_sorted.groupby('Labels').size().plot(kind='bar')
or with Pandas's hist function:
# number of unique values in the column "Labels"
Num = len(tag_df_sorted['Labels'].unique())
# plot histogram
hist = tag_df_sorted['Labels'].hist(bins=Num )
There is a nice little tutorial on plotting histograms here.
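As a side note on the error in the question: arange is a NumPy function (np.arange), not a DataFrame method, which is why data.arange(20) fails. A minimal sketch of another way to get the bar chart, assuming the Labels column holds one label per row (the tiny frame below is a hypothetical stand-in for the real data), is to count with value_counts and let pandas place the ticks:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the real data
tag_df_sorted = pd.DataFrame({"Labels": ["cat", "dog", "cat", "bird", "cat", "dog"]})

# Count each label, most frequent first, and keep the top 20
counts = tag_df_sorted["Labels"].value_counts().head(20)

# pandas uses the label names as tick labels, so no manual xticks are needed
ax = counts.plot(kind="bar")
ax.set_title("Frequency of top 20 tags")
ax.set_xlabel("Tags")
ax.set_ylabel("Counts")
```

If a single cell holds several labels, it would have to be split into one label per row first; how to do that depends on how the labels are stored.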

How do I set column colors in a bar plot of a dataframe? [duplicate]

This question already has answers here:
Pandas bar plot -- specify bar color by column
(2 answers)
vary the color of each bar in bargraph using particular value
(3 answers)
Closed 2 years ago.
I was trying the tutorial posted on http://queirozf.com/entries/pandas-dataframe-plot-examples-with-matplotlib-pyplot
and was wondering whether it was possible to have a bar chart created that could have colored columns.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'name':['john','mary','peter','jeff','bill','lisa','jose'],
'age':[23,78,22,19,45,33,20],
'gender':['M','F','M','M','M','F','M'],
'state':['california','dc','california','dc','california','texas','texas'],
'num_children':[2,0,0,3,2,1,4],
'num_pets':[5,1,0,5,2,2,3]
})
df.plot(kind='bar',x='name',y='age')
The code above creates a bar chart.
However, I would like the final output to show each column in a different color.
(The column showing John would be red, Mary would be orange and so on)
Any and all help is appreciated! Thank you.
Specify the colors:
df.plot(kind='bar',x='name',y='age',
color=["red","blue","green","yellow","black","grey","purple"])
to get
You might want to remove the legend using
df.plot(kind='bar',x='name',y='age',
color=["red","blue","green","yellow","black","grey","purple"],
legend=False)
as it only displays one color.
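How DataFrame.plot treats a list of colors for a single column may differ between pandas versions, so if all bars come out in the same color, one fallback is to call matplotlib's bar directly, which accepts one color per bar. A minimal sketch, using the names and ages from the question; the color list itself is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import pandas as pd

df = pd.DataFrame({
    "name": ["john", "mary", "peter", "jeff", "bill", "lisa", "jose"],
    "age": [23, 78, 22, 19, 45, 33, 20],
})

# One color per bar, in the same order as the rows (arbitrary choice)
colors = ["red", "orange", "green", "yellow", "black", "grey", "purple"]

fig, ax = plt.subplots()
bars = ax.bar(df["name"], df["age"], color=colors)
ax.set_xlabel("name")
ax.set_ylabel("age")
```

No legend is drawn here, since nothing was given a label.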

How to reduce the number of values on x-axis in a matplotlib graph [duplicate]

This question already has an answer here:
pyplot, why isn't the x-axis showing?
(1 answer)
Closed 3 years ago.
I am trying to plot a graph using matplotlib library.
This is my code:
df = pd.DataFrame()
df = milo_data2.loc[milo_data2['id'] == device]
plt.figure()
plt.title(device)
plt.ylabel('Counter')
plt.plot(df['timestamp'],df['counter'])
The graph looks like
The values on the x-axis are crowded and not readable. (The bold black line is the group of values overlapping each other.) How do I reduce the number of values on the x-axis so that I can read some of them and get an estimate?
You can manually set the ticks to display. For instance, you can leave every tenth tick:
ticks = list(df['timestamp'])
plt.xticks(ticks[::10], rotation='vertical')
For more information see documentation
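A self-contained sketch of the same idea, with numeric stand-ins for the timestamp and counter columns (both arrays are hypothetical). Slicing with a step of 10 keeps every tenth x value as a tick:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for df['timestamp'] and df['counter']
x = np.arange(100)
y = np.cumsum(np.ones(100))

fig, ax = plt.subplots()
ax.plot(x, y)

# Keep every tenth x value as a tick and rotate the labels
ax.set_xticks(x[::10])
ax.tick_params(axis="x", labelrotation=90)
```

The step of 10 is arbitrary; a larger step leaves fewer ticks on the axis.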

pandas color scheme not working properly with my data (python) [duplicate]

This question already has answers here:
Pandas DataFrame Bar Plot - Plot Bars Different Colors From Specific Colormap
(3 answers)
Closed 4 years ago.
I would like to change the default color scheme of my pandas plot. I tried different color schemes through the cmap parameter of the pandas plot call, but when I change it, all bars of my barplot get the same color.
The code I tried is the following one:
yearlySalesGenre = df1.groupby('Genre').Global_Sales.sum().sort_values()
fig = plt.figure()
ax2 = plt.subplot()
yearlySalesGenre.plot(kind='bar',ax=ax2, sort_columns=True, cmap='tab20')
plt.show(fig)
And the data that I plot (yearlySalesGenre) is a pandas Series type:
Genre
Strategy 174.50
Adventure 237.69
Puzzle 243.02
Simulation 390.42
Fighting 447.48
Racing 728.90
Misc 803.18
Platform 828.08
Role-Playing 934.40
Shooter 1052.94
Sports 1249.47
Action 1745.27
Using tab20 cmap I get the following plot:
All bars get the first color of the tab20 scheme. What am I doing wrong?
Note that if I use the default color scheme of pandas plot, it properly displays all bars with different colors, but the thing is that I want to use a particular color scheme.
As noted, this is a duplicate question. Just in case, the answer is that pandas assigns colormap colors per column, not per row. So to use different colors you can transpose the data plus some other steps (see the duplicate link), or directly use matplotlib.pyplot plotting, which allows more flexibility (in my case):
plt.bar(range(len(df)), df, color=plt.cm.tab20(np.arange(len(df))))
Maybe this is what you want:
df2.T.plot( kind='bar', sort_columns=True, cmap='tab20')
I think the problem you have is that you only have one series. A pandas bar plot draws each series (column) in its own color and separates the bars based on the index.
By using .T, the series in your data become multiple columns but within only one index. I am sure you can play with the legend to get a better display.
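A runnable sketch of the matplotlib route, using the first four rows of the Series from the question. Indexing plt.cm.tab20 with integers returns one RGBA color per bar:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# First four rows of the Series shown in the question
sales = pd.Series(
    {"Strategy": 174.50, "Adventure": 237.69, "Puzzle": 243.02, "Simulation": 390.42},
    name="Global_Sales",
)

# Integer indices pick consecutive entries of the tab20 palette
colors = plt.cm.tab20(np.arange(len(sales)))

fig, ax = plt.subplots()
bars = ax.bar(list(sales.index), sales.values, color=colors)
ax.tick_params(axis="x", labelrotation=90)
```

tab20 has 20 entries, so this gives distinct colors for up to 20 bars.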
