Count the number of occurrences from a column [duplicate] - python

This question already has answers here:
What is the most efficient way of counting occurrences in pandas?
(4 answers)
Closed 5 months ago.
Given a pandas DataFrame of this format:
ID  time   other
0   81219  blue
0   32323  green
1   423    red
1   4232   blue
1   42424  red
2   42422  blue
I simply want to create a DataFrame like the following by counting the number of times each ID appears in the previous DataFrame.
ID  number_appears
0   2
1   3
2   1

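Try this (size() counts the rows per ID, whereas count() would return a non-null count for every remaining column):
df.groupby('ID').size().reset_index(name='number_appears')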
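
For reference, here is a minimal self-contained sketch of the per-ID count, using the sample data from the question. It relies on groupby(...).size(), which counts rows per group; the variable name counts is just an illustrative choice, not something from the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [0, 0, 1, 1, 1, 2],
    "time": [81219, 32323, 423, 4232, 42424, 42422],
    "other": ["blue", "green", "red", "blue", "red", "blue"],
})

# size() counts the rows in each group; reset_index(name=...) turns the
# resulting Series into a two-column DataFrame with the requested header.
counts = df.groupby("ID").size().reset_index(name="number_appears")
print(counts)
```

df['ID'].value_counts() gives the same numbers as a Series, sorted by frequency rather than by ID.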

Related

How to count the sum of values of a column in a table output [duplicate]

This question already has answers here:
Get total of Pandas column
(5 answers)
How can I strip the whitespace from Pandas DataFrame headers?
(5 answers)
Closed 6 months ago.
This is the output of a code run, and I would like to know how I can compute the sum of the values in the 'age' column of this table:
        value  age  change
Car       110   10       1
Drum       46    3       0
Bottle     12  510       1
Shoes      80   29       1
Assuming the table is a DataFrame called df:
df['age'].sum()
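
A minimal sketch of the sum, under two assumptions that are not confirmed by the post: the first, unlabeled column of the table is the index, and the headers may carry stray whitespace (which is what the second linked duplicate suggests). The padded header " age " below is made up purely to illustrate that case.

```python
import pandas as pd

# Hypothetical reconstruction of the table; the padded header " age "
# simulates a column name with stray whitespace.
df = pd.DataFrame(
    {"value": [110, 46, 12, 80], " age ": [10, 3, 510, 29], "change": [1, 0, 1, 1]},
    index=["Car", "Drum", "Bottle", "Shoes"],
)

# Strip whitespace from the headers so that df["age"] can be found.
df.columns = df.columns.str.strip()

# Sum the column; int() converts the NumPy integer to a plain Python int.
total = int(df["age"].sum())
print(total)
```

If df['age'] raises a KeyError on the real data, printing df.columns.tolist() is a quick way to check for hidden whitespace in the names.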

Find Max Value of Duplicate Rows But Keep All Data [duplicate]

This question already has answers here:
Remove duplicates by columns A, keeping the row with the highest value in column B
(14 answers)
Get the row(s) which have the max value in groups using groupby
(15 answers)
Closed 1 year ago.
I am trying to merge duplicate rows whose columns are identical except for Value, keeping the highest Value, while also keeping all rows that are not duplicated.
I thought a groupby followed by resetting the index would achieve this, but that did not work.
I also tried Microsoft Visual Basic for Applications, but it omitted the non-duplicate data as well.
I am hoping for some pandas or Excel tips, or pointers to pandas/Excel documentation, that could help.
My Code:
grouped_df = result1.groupby(['ID','Name','Value'])
maximums = grouped_df.max('Price')
maximums = maximums.reset_index()
Dataset before:
ID  Name    Value
1   Apple   3
2   Banana  4
2   Banana  5
3   Orange  3
4   Pear    7
4   Pear    5
What I am getting with my code:
ID  Name    Value
2   Banana  5
4   Pear    7
What I wish to achieve:
ID  Name    Value
1   Apple   3
2   Banana  5
3   Orange  3
4   Pear    7
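
One possible way to get the desired table, sketched with the sample data above: group only by the columns that identify a duplicate (ID and Name) and take the maximum of Value. This is an illustration under the assumption that those two columns define a duplicate; it is not taken from an answer on the page.

```python
import pandas as pd

# Sample data copied from the "Dataset before" table.
result1 = pd.DataFrame({
    "ID": [1, 2, 2, 3, 4, 4],
    "Name": ["Apple", "Banana", "Banana", "Orange", "Pear", "Pear"],
    "Value": [3, 4, 5, 3, 7, 5],
})

# Group by the identifying columns only; Value is aggregated, not grouped.
# as_index=False keeps ID and Name as ordinary columns in the result.
maximums = result1.groupby(["ID", "Name"], as_index=False)["Value"].max()
print(maximums)
```

Groups with a single row (Apple, Orange) pass through unchanged, so non-duplicated data is kept.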

How to use groupby to create repeating index for each group in a Dataframe? [duplicate]

This question already has answers here:
Add a sequential counter column on groups to a pandas dataframe
(4 answers)
Closed 1 year ago.
When using groupby(), how can I create a DataFrame with a new column containing an increasing index within each group? For example, if I have
df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2]})
df
a
0 1
1 1
2 1
3 2
4 2
5 2
How can I get a DataFrame where the index resets for each new group in the column? The association between a and the index is not important; each row of a group just needs to receive a unique index starting from 1.
a idx
0 1 1
1 1 2
2 1 3
3 2 1
4 2 2
5 2 3
The answer from the comments:
df['idx'] = df.groupby('a').cumcount() + 1
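
The same one-liner as a self-contained sketch, using the example frame from the question. cumcount() numbers the rows of each group starting at 0, so adding 1 gives a counter that starts at 1.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"a": [1, 1, 1, 2, 2, 2]})

# cumcount() yields 0, 1, 2, ... within each group of "a";
# the + 1 shifts the counter so that it starts from 1.
df["idx"] = df.groupby("a").cumcount() + 1
print(df)
```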

how to replace variables in a column [duplicate]

This question already has answers here:
Replacing column values in a pandas DataFrame
(16 answers)
Closed 2 years ago.
I want to replace the Yes and No values in the No-show column with 0 and 1 values.
Here is a simple answer:
df = pd.DataFrame({'No-show':['Yes','No','No','Yes']})
df['No-show'] = df['No-show'].replace('Yes',1).replace('No',0)
df
output:
No-show
0 1
1 0
2 0
3 1
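
As an alternative sketch, a single map() call with a dictionary does both substitutions at once. Note the difference in behavior: any value missing from the dictionary becomes NaN with map(), whereas replace() would leave it untouched. The name mapping is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"No-show": ["Yes", "No", "No", "Yes"]})

# Hypothetical lookup table: each original label and its numeric code.
mapping = {"Yes": 1, "No": 0}

# map() looks every value up in the dictionary; labels that are not
# in the dictionary would come out as NaN.
df["No-show"] = df["No-show"].map(mapping)
print(df)
```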

Replace column values with their frequency [duplicate]

This question already has answers here:
Adding a new pandas column with mapped value from a dictionary [duplicate]
(1 answer)
Pandas create new column with count from groupby
(5 answers)
Closed 2 years ago.
I'm looking to replace values in a Pandas column with their respective frequencies in the column.
I'm aware I can use value_counts to retrieve the frequency distribution for each value in the column. What I'm not sure about is how to replace every occurrence of a value with its respective frequency.
An example dataframe:
a b c
0 tiger 2 3
1 tiger 5 6
2 lion 8 9
Example output of df['a'].value_counts():
tiger 2
lion 1
Name: a, dtype: int64
Expected result when applied to column 'a':
a b c
0 2 2 3
1 2 5 6
2 1 8 9
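
One way this could be done, sketched with the example data: value_counts() returns a Series indexed by the distinct values, and passing that Series to map() looks up the count for every row. This is an illustration in the spirit of the linked duplicates, not a quoted answer.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "a": ["tiger", "tiger", "lion"],
    "b": [2, 5, 8],
    "c": [3, 6, 9],
})

# value_counts() is indexed by the distinct values of "a", so map()
# can use it as a lookup table from value to frequency.
df["a"] = df["a"].map(df["a"].value_counts())
print(df)
```

df.groupby('a')['a'].transform('size') should produce the same column without building the lookup Series first.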
