This question already has answers here:
Replacing column values in a pandas DataFrame
(16 answers)
Closed 2 years ago.
I want to replace the Yes and No values in the No-show column with numeric 0/1 values.
Here is a simple answer:
import pandas as pd
df = pd.DataFrame({'No-show': ['Yes', 'No', 'No', 'Yes']})
# Replace 'Yes' with 1 and 'No' with 0
df['No-show'] = df['No-show'].replace('Yes', 1).replace('No', 0)
df
output:
No-show
0 1
1 0
2 0
3 1
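As an alternative to chaining replace calls, the same conversion can be sketched with a lookup dictionary and Series.map. This is only an illustration on the sample data above; the name `mapping` is introduced here and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'No-show': ['Yes', 'No', 'No', 'Yes']})

# Hypothetical lookup table; any value not listed becomes NaN,
# which makes unexpected entries easy to spot.
mapping = {'Yes': 1, 'No': 0}
df['No-show'] = df['No-show'].map(mapping)

print(df['No-show'].tolist())
```

Unlike replace, map leaves no unmapped strings behind, so it may be preferable when the column should end up purely numeric.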
This question already has answers here:
What is the most efficient way of counting occurrences in pandas?
(4 answers)
Closed 5 months ago.
Given a pandas DataFrame in this format:
ID  time   other
0   81219  blue
0   32323  green
1   423    red
1   4232   blue
1   42424  red
2   42422  blue
I simply want to create a DataFrame like the following by counting the number of times each ID appears in the previous DataFrame.
ID  number_appears
0   2
1   3
2   1
Try this:
df.groupby('ID').size().reset_index(name='number_appears')
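For reference, here is a minimal runnable sketch on the sample data from the question. It uses size(), which counts rows per group, whereas count() counts non-null values separately for every remaining column. The variable name `counts` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [0, 0, 1, 1, 1, 2],
    'time': [81219, 32323, 423, 4232, 42424, 42422],
    'other': ['blue', 'green', 'red', 'blue', 'red', 'blue'],
})

# size() returns one number per group; reset_index turns the
# group key back into a column and names the count column.
counts = df.groupby('ID').size().reset_index(name='number_appears')

print(counts)
```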
This question already has answers here:
How to replace the white space in a string in a pandas dataframe?
(4 answers)
Closed 1 year ago.
I have a Series like this:
0 stand and on the top of the m
1 be aware of the p
2 in the night o
3 tt
4 锉
Here is my code:
x1=x.str.split(pat='/').str[0].copy()
x2=x1.str.split(expand=True).copy()
x2['combined']=x2[x2.columns].apply(lambda row: '+'.join(row.values.astype(str)), axis=1)
x2['combined']
The result of x2['combined'] is:
0 stand+and+on+the+top+of+the+m
1 be+aware+of+the+p+None+None+None
2 in+the+night+o+None+None+None+None
3 tt+None+None+None+None+None+None+None
4 Nan+Nan+Nan+Nan+Nan+Nan+Nan+Nan
The outcome I want is:
0 stand+and+on+the+top+of+the+m
1 be+aware+of+the+p
2 in+the+night+o
3 tt
4
What should I do?
Just replace the whitespace separator:
x.str.replace(r'\s+', '+', regex=True)
output:
0 stand+and+on+the+top+of+the+m
1 be+aware+of+the+p
2 in+the+night+o
3 tt
4 锉
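Alternatively, split on whitespace and join the pieces (x is a Series, so assign the result to a new variable rather than to x['combined']):
combined = x.str.split(pat='/').str[0].str.split().str.join('+')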
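The split-and-join approach can be sketched end to end as follows. The sample values are copied from the question, except that the last entry is assumed to be a missing value (the question's own output shows it as NaN after splitting); `result` is an illustrative name.

```python
import pandas as pd

# Assumption: the last entry is missing, as the question's
# intermediate output suggests.
x = pd.Series([
    'stand and on the top of the m',
    'be aware of the p',
    'in the night o',
    'tt',
    None,
])

# Split each string on whitespace, then join the pieces with '+'.
# Rows of different lengths never get padded with None, because no
# wide DataFrame is created. Missing values stay missing, so fill
# them with an empty string at the end.
result = x.str.split().str.join('+').fillna('')

print(result.tolist())
```

Skipping expand=True is what avoids the trailing "+None" pieces seen in the original attempt.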
This question already has answers here:
Add a sequential counter column on groups to a pandas dataframe
(4 answers)
Closed 1 year ago.
When using groupby(), how can I create a DataFrame with a new column containing an increasing index within each group? For example, if I have
df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2]})
df
a
0 1
1 1
2 1
3 2
4 2
5 2
How can I get a DataFrame where the index resets for each new group in the column? The association between a and the index is not important; each row within a group of a just needs a unique index starting from 1.
a idx
0 1 1
1 1 2
2 1 3
3 2 1
4 2 2
5 2 3
The answer from the comments:
df['idx'] = df.groupby('a').cumcount() + 1
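A self-contained version of this, using the frame from the question, might look like the sketch below. cumcount() numbers the rows of each group starting at 0, so 1 is added to start at 1.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2]})

# cumcount() yields 0, 1, 2, ... within each group of 'a';
# adding 1 shifts the counter to start at 1.
df['idx'] = df.groupby('a').cumcount() + 1

print(df)
```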
This question already has answers here:
How to sort a dataFrame in python pandas by two or more columns?
(3 answers)
Closed 3 years ago.
I have a data frame:
df =
ID Num
a 3
b 4
b 2
a 1
I want to sort Num in ascending order within each unique value of the ID column.
My attempt:
df.sort_values(by=['Num'])
But this sorted by Num in ascending order and ignored the ID column.
Desired output:
df =
ID Num
a 1
a 3
b 2
b 4
Just do:
df.sort_values(['ID', 'Num'])
Output
ID Num
3 a 1
0 a 3
2 b 2
1 b 4
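The multi-column sort can be reproduced with the sketch below on the question's data. The ignore_index=True argument is an optional extra, not part of the answer above; it renumbers the rows 0..n-1 instead of keeping the original labels (3, 0, 2, 1). The name `sorted_df` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['a', 'b', 'b', 'a'], 'Num': [3, 4, 2, 1]})

# Sort by ID first, then by Num within each ID.
# ignore_index=True discards the old row labels.
sorted_df = df.sort_values(['ID', 'Num'], ignore_index=True)

print(sorted_df)
```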
This question already has answers here:
How to move pandas data from index to column after multiple groupby
(4 answers)
How to convert index of a pandas dataframe into a column
(9 answers)
Closed 4 years ago.
This is the original table:
A B C E
0 1 1 5 4
1 1 1 1 1
2 3 3 8 2
I wanted to apply some aggregate functions to this table which I did with:
df.sort_values('C').groupby(['A', 'B'], sort=False).agg({'C': 'sum', 'E': 'last'})
My new table looks like this:
A B C E
1 1 6 4
3 3 8 2
When I measure the number of columns of the original versus the modified table with len(df.columns), the results differ: the original table returns 4 columns and the modified table returns 2 columns.
My question: why did this happen, and how can I get the modified table to return 4 columns?
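A likely explanation, in line with the linked duplicates, is that groupby moves the grouping keys A and B into the index of the result, so only C and E remain as columns. The sketch below rebuilds the table from the question and shows one way to turn the index back into columns with reset_index(); the names `agg` and `flat` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 3],
    'B': [1, 1, 3],
    'C': [5, 1, 8],
    'E': [4, 1, 2],
})

agg = (
    df.sort_values('C')
      .groupby(['A', 'B'], sort=False)
      .agg({'C': 'sum', 'E': 'last'})
)
# A and B are now index levels, not columns.
print(len(agg.columns))

# reset_index() converts the index levels back into ordinary columns.
flat = agg.reset_index()
print(len(flat.columns))
```

Passing as_index=False to groupby is another option that keeps the keys as columns from the start.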