This question already has answers here:
Counting duplicate values in Pandas DataFrame
(3 answers)
Count unique values using pandas groupby [duplicate]
(5 answers)
Closed 4 years ago.
I have a Pandas DataFrame like this:
df = pd.DataFrame({
'Date': ['2017-1-1', '2017-1-1', '2017-1-2', '2017-1-2', '2017-1-3'],
'Groups': ['one', 'one', 'one', 'two', 'two']})
Date Groups
0 2017-1-1 one
1 2017-1-1 one
2 2017-1-2 one
3 2017-1-2 two
4 2017-1-3 two
How can I generate a new DataFrame like this?
Date Groups_counts
0 2017-1-1 1
1 2017-1-2 2
2 2017-1-3 1
Thanks a lot!
To get the count of unique Groups values for each Date, use:
df.groupby('Date')['Groups'].nunique()
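Note that nunique() returns a Series indexed by Date rather than a DataFrame. A minimal sketch of getting the exact shape asked for in the question, assuming the column should be called Groups_counts as in the desired output:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2017-1-1', '2017-1-1', '2017-1-2', '2017-1-2', '2017-1-3'],
    'Groups': ['one', 'one', 'one', 'two', 'two']})

# nunique() counts distinct Groups values per Date and returns a Series;
# reset_index(name=...) turns it back into a DataFrame and names the count column.
result = df.groupby('Date')['Groups'].nunique().reset_index(name='Groups_counts')
print(result)
```

If every row should be counted, duplicates included, size() or count() would be the call to use instead of nunique().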
This question already has answers here:
Pandas, Pivot table from 2 columns with values being a count of one of those columns
(2 answers)
Most efficient way to melt dataframe with a ton of possible values pandas
(2 answers)
How to form a pivot table on two categorical columns and count for each index?
(2 answers)
Closed 2 years ago.
I am trying to turn the values into columns and count the occurrences of each value, grouped by id.
Dataframe:
id value
A cake
A cookie
B cookie
B cookie
C cake
C cake
C cookie
expected:
id cake cookie
A 1 1
B 0 2
C 2 1
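One possible way to build this table is pd.crosstab, which counts how often each (id, value) pair occurs and fills missing pairs with 0. This is only a sketch: the DataFrame name df and the string dtypes are assumptions reconstructed from the sample above.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
    'value': ['cake', 'cookie', 'cookie', 'cookie', 'cake', 'cake', 'cookie']})

# Rows come from 'id', columns from 'value', cells hold occurrence counts.
counts = pd.crosstab(df['id'], df['value'])
print(counts)
```

Calling counts.reset_index() afterwards would turn id back into a regular column if that is preferred.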
This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 2 years ago.
I am new to Python. I want to replace all the values in the 'Starting' column of df_2 with the matching 'Station' values from df_1. I did it with a for loop, but how can I perform this task in the simplest way?
df_1:
ID Station
0 1 Satose
1 2 Forlango
2 3 poterio
.
.
df_2:
Rail_Number Starting Ending
AABDD 3 44433
DLRAKA 1 45232
MiGOMu 2 18756
.
.
I have answered a similar question here:
Replace a value in a dataframe with a value from another dataframe
Step 1: Convert both columns from df_1 into a dictionary by using the following code:
d = dict(zip(df_1.ID,df_1.Station))
Step 2: Now we just need to map the Starting column of df_2 through this dictionary:
df_2.Starting = df_2.Starting.map(d)
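For reference, here is a self-contained sketch of the dictionary-and-map idea, using only the sample rows shown in the question (the elided rows are left out). The key point is that map is called on the column being translated, so each Starting id is looked up in the dictionary.

```python
import pandas as pd

df_1 = pd.DataFrame({'ID': [1, 2, 3],
                     'Station': ['Satose', 'Forlango', 'poterio']})
df_2 = pd.DataFrame({'Rail_Number': ['AABDD', 'DLRAKA', 'MiGOMu'],
                     'Starting': [3, 1, 2],
                     'Ending': [44433, 45232, 18756]})

# Build an ID -> Station lookup table.
d = dict(zip(df_1['ID'], df_1['Station']))

# Translate each Starting id into its station name.
# Ids that are missing from the dictionary would become NaN.
df_2['Starting'] = df_2['Starting'].map(d)
print(df_2)
```

The Ending column is left untouched here because none of its sample values appear among the ids shown for df_1.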
This question already has answers here:
Pandas groupby with delimiter join
(2 answers)
Concatenate strings from several rows using Pandas groupby
(8 answers)
Closed 3 years ago.
Given a Pandas Dataframe df, with column names 'Session', and 'List':
Can I group together the 'List' values for the same values of 'Session'?
My Approach
I've tried solving the problem by creating a new dataframe and iterating through the rows of the initial dataframe while maintaining a session counter that I increment whenever I see that the session has changed.
If it hasn't changed, I append that row's List value followed by a comma.
Whenever the session changes, I use strip to get rid of the extra trailing comma.
Initial DataFrame
Session List
0 1 a
1 1 b
2 1 c
3 2 d
4 2 e
5 3 f
Required DataFrame
Session List
0 1 a,b,c
1 2 d,e
2 3 f
Can someone suggest something more efficient or simple?
Thank you in advance.
Use groupby, agg and reset_index:
>>> df.groupby('Session')['List'].agg(','.join).reset_index()
Session List
0 1 a,b,c
1 2 d,e
2 3 f
>>>
This question already has answers here:
Dataframe set_index not setting
(2 answers)
Pandas set_index does not set the index
(1 answer)
Closed 3 years ago.
I have tried to set a column as the index of a data frame. The examples I've seen so far suggest using the set_index() method, but it doesn't work in my case. I use Python 3.7.0.
import pandas as pd
df = pd.DataFrame({'Fruit' : ['Apples','Oranges'],
'Amount': [ 1, 17 ]})
df.set_index('Fruit')
print(df)
The output that I get is
Fruit Amount
0 'Apples' 1
1 'Oranges' 17
The output I want would be something like
Amount
Fruit
'Apples' 1
'Oranges' 17
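A likely explanation is that set_index() returns a new DataFrame and leaves the original untouched, so its result has to be assigned to a name. A minimal sketch, where df1 is just an illustrative name for the re-indexed frame:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['Apples', 'Oranges'],
                   'Amount': [1, 17]})

# set_index does not modify df; it returns a re-indexed copy.
df1 = df.set_index('Fruit')
print(df1)

# df itself still has the default integer index and the 'Fruit' column.
print(df)
```

Reassigning to the same name (df = df.set_index('Fruit')) or passing inplace=True would be alternatives if a second variable is not wanted.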
This question already has answers here:
Use a list of values to select rows from a Pandas dataframe
(8 answers)
Filter dataframe rows if value in column is in a set list of values [duplicate]
(7 answers)
Closed 5 years ago.
This is a subset of a dataframe:
index drug_id values
1 le.1 f
2 le.7 h
3 le.10 9
4 le.11 10
5 le.15 S
I want to remove the rows whose drug_id value is one of: le.7, le.10, le.11.
This is my code:
df.drop(df.drug_id[['le.7', 'le.10', 'le.11']], inplace = True )
I also tried this:
df.drop(df.drug_id == ['le.7', 'le.10', 'le.11'], inplace = True )
But neither of them worked. Any suggestions?
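One common approach is to build a boolean mask with isin and keep only the rows that do not match. The sketch below assumes the "index" column in the sample is the DataFrame's index and that all values are strings; both are guesses from the printed subset.

```python
import pandas as pd

df = pd.DataFrame({
    'drug_id': ['le.1', 'le.7', 'le.10', 'le.11', 'le.15'],
    'values': ['f', 'h', '9', '10', 'S']},
    index=[1, 2, 3, 4, 5])

to_remove = ['le.7', 'le.10', 'le.11']

# isin marks rows whose drug_id is in the list; ~ inverts the mask.
filtered = df[~df['drug_id'].isin(to_remove)]
print(filtered)

# Equivalent with drop, which expects index labels rather than values.
dropped = df.drop(df[df['drug_id'].isin(to_remove)].index)
```

drop works on index labels, which would explain why passing the drug_id values or a comparison result directly does not remove the intended rows.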