How to count values in dataframe column? [duplicate] - python

This question already has answers here:
Get statistics for each group (such as count, mean, etc) using pandas GroupBy?
(9 answers)
Count the frequency that a value occurs in a dataframe column
(15 answers)
Closed 3 years ago.
For the dataframe as below
animal direction
0 monkey north
1 frog north
2 monkey east
3 zebra west
....
I would like to count how many times each animal appears in this dataframe, to get the dataframe below:
animal count
0 monkey 3
1 frog 9
2 zebra 4
3 elephant 11
....
How can I achieve this? I tried value_counts(), and groupby but with my knowledge I couldn't quite achieve what I wanted...
Thank you for help.

df.groupby(['animal']).size().reset_index(name='count')
This method worked beautifully. Thank you.
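For reference, here is a minimal sketch of that approach applied to the column names from the question. The sample rows are made up for illustration (the question only shows a few of them), and the value_counts variant is an alternative not mentioned in the original answer.

```python
import pandas as pd

# Hypothetical sample data modelled on the layout shown in the question
df = pd.DataFrame({
    "animal": ["monkey", "frog", "monkey", "zebra"],
    "direction": ["north", "north", "east", "west"],
})

# One row per animal, with the number of rows in which it appears
counts = df.groupby("animal").size().reset_index(name="count")
print(counts)

# Alternative: value_counts sorts by frequency instead of by animal name
counts_vc = (
    df["animal"].value_counts()
    .rename_axis("animal")
    .reset_index(name="count")
)
print(counts_vc)
```

Both results have the columns animal and count; they differ only in row order.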

Related

Splitting columns made with groupby [duplicate]

This question already has answers here:
Groupby and aggregate using lambda functions
(2 answers)
How can I pivot a dataframe?
(5 answers)
Python: Counting specific occurrences in dataframe by group
(2 answers)
Closed 1 year ago.
I can't seem to find a solution to my problem. I have used groupby and value_counts to get all the results for each team, but I cannot find a way to split the result into three new columns named Win, Draw and Loss (H = Win, D = Draw, A = Loss).
I don't know if I am just searching for the wrong thing or if the approach I have now will not work.
matchdata.groupby("HomeTeam")["FullTimeResult"].value_counts()
Results in:
HomeTeam FullTimeResult
Arsenal H 10
D 6
A 3
Aston Villa A 9
H 7
D 3
Wanted output
HomeTeam Win Draw Loss
Arsenal 10 6 3
Aston Villa 7 3 9
etc.
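One possible way to get from the grouped counts to the wanted layout is to unstack the FullTimeResult level into columns and then rename them. This is only a sketch: the sample rows below are invented, and only the names matchdata, HomeTeam and FullTimeResult come from the question.

```python
import pandas as pd

# Hypothetical match data; only the column names come from the question
matchdata = pd.DataFrame({
    "HomeTeam": ["Arsenal", "Arsenal", "Arsenal", "Aston Villa", "Aston Villa"],
    "FullTimeResult": ["H", "D", "A", "A", "H"],
})

table = (
    matchdata.groupby("HomeTeam")["FullTimeResult"]
    .value_counts()
    .unstack(fill_value=0)                            # result codes become columns
    .reindex(columns=["H", "D", "A"], fill_value=0)   # fix the column order
    .rename(columns={"H": "Win", "D": "Draw", "A": "Loss"})
)
print(table)
```

pd.crosstab(matchdata["HomeTeam"], matchdata["FullTimeResult"]) should produce the same counts in one call, before the renaming step.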

pandas select rows with no duplicate [duplicate]

This question already has answers here:
Remove pandas rows with duplicate indices
(7 answers)
Closed 2 years ago.
I have 1 dataframe and I want to select all rows that don't have duplicates
My df:
Name Age
Jp 4
Anna 15
Jp 4
John 10
My output should be :
Name Age
Anna 15
John 10
I am using a pandas dataframe.
Any suggestions?
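You want to drop every row that is duplicated across multiple columns. Pass keep=False so that no copy of a duplicated row is retained (the default keeps the first one):
df.drop_duplicates(['Name','Age'], keep=False)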
Please see the pandas documentation on basic methods of dataframes.
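Note that drop_duplicates keeps the first copy of each duplicated row by default, which would leave one Jp row in the result. To match the output in the question, where both Jp rows are gone, keep=False is needed. A minimal sketch using the data from the question:

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "Name": ["Jp", "Anna", "Jp", "John"],
    "Age": [4, 15, 4, 10],
})

# keep=False discards every row that has a duplicate, instead of keeping one copy
unique_rows = df.drop_duplicates(["Name", "Age"], keep=False)
print(unique_rows)

# Equivalent boolean-mask form
same_result = df[~df.duplicated(["Name", "Age"], keep=False)]
```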

Count the number of times a pair of values occurs in a DataFrame [duplicate]

This question already has answers here:
Pandas, groupby and count
(3 answers)
Get statistics for each group (such as count, mean, etc) using pandas GroupBy?
(9 answers)
Adding a 'count' column to the result of a groupby in pandas?
(2 answers)
Pandas create new column with count from groupby
(5 answers)
Closed 3 years ago.
I have the following dataframe:
print(df)
Product Store Quantity_Sold
A NORTH 10
A NORTH 5
A SOUTH 8
B SOUTH 8
B SOUTH 5
(...)
I would like to count the number of times the same pair of product and store is present; to illustrate:
print(final_df)
Product Store count
A NORTH 2
A SOUTH 1
B SOUTH 2
(...)
I tried with:
df["Product"].value_counts()
But it only works with single columns. How can I create final_df?
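One way to count pairs, in line with the linked groupby answers, is to group by both columns and take the size of each group. The sketch below uses the rows shown in the question; the elided rows are left out.

```python
import pandas as pd

# Rows shown in the question
df = pd.DataFrame({
    "Product": ["A", "A", "A", "B", "B"],
    "Store": ["NORTH", "NORTH", "SOUTH", "SOUTH", "SOUTH"],
    "Quantity_Sold": [10, 5, 8, 8, 5],
})

# Group by both columns; size() counts the rows in each (Product, Store) pair
final_df = df.groupby(["Product", "Store"]).size().reset_index(name="count")
print(final_df)
```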

How to calculate in python the sum of variables in each unique row in dataframe? [duplicate]

This question already has answers here:
How do I Pandas group-by to get sum?
(11 answers)
Closed 3 years ago.
I have a dataframe with information about different users (ID), with many duplicated categorical variables (photo) and their corresponding numbers of interactions (likes). How can I calculate the total number of likes for each photo type?
For example:
id photo_type likes
1 nature 2
2 art 4
3 art 1
4 fashion 3
5 fashion 2
I expect to get information like this:
total number of likes for nature: 2
total number of likes for art: 5
total number of likes for fashion: 5
Use pandas.DataFrame.groupby:
df.groupby('photo_type')['likes'].sum()
Output:
photo_type
art 5
fashion 5
nature 2
Name: likes, dtype: int64
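The groupby call returns a Series indexed by photo_type. If the sentence-style output from the question is wanted, one option is to loop over its items, as in this sketch built on the question's data:

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "photo_type": ["nature", "art", "art", "fashion", "fashion"],
    "likes": [2, 4, 1, 3, 2],
})

totals = df.groupby("photo_type")["likes"].sum()

# Print one line per photo type, in the style requested in the question
for photo_type, total in totals.items():
    print(f"total number of likes for {photo_type}: {total}")
```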

How do I clean phone numbers in pandas [duplicate]

This question already has answers here:
How to only do string manupilation on column of pandas that have 4 digits or less?
(3 answers)
Closed 3 years ago.
I have a pandas dataframe with a column for Phone; however, the data is a bit inconsistent. Here are some examples that I would like to focus on.
df["Phone"]
0 732009852
1 738073222
2 755920306
3 0755353288
Row 3 has the necessary leading 0 for an Australian number. How do I update rows like 0, 1 and 2?
Use pandas.Series.str.zfill:
import pandas as pd
s = pd.Series(['732009852', '0755353288'])
s.str.zfill(10)
Output:
0 0732009852
1 0755353288
Or pd.Series.str.rjust:
print(df["Phone"].str.rjust(10, '0'))
Output:
0 0732009852
1 0738073222
2 0755920306
3 0755353288
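Both methods rely on the .str accessor, so they assume the Phone column holds strings. If the column contains non-string values (an assumption; the leading 0 in row 3 suggests it is already text), converting with astype(str) first is a cautious option. A sketch applied to the whole column from the question:

```python
import pandas as pd

# Values from the question, stored as strings
df = pd.DataFrame({"Phone": ["732009852", "738073222", "755920306", "0755353288"]})

# astype(str) is a no-op for strings and guards against non-string values;
# zfill(10) left-pads with zeros and leaves 10-character values unchanged
df["Phone"] = df["Phone"].astype(str).str.zfill(10)
print(df["Phone"])
```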
