This question already has answers here:
How to flip a column of ratios, convert into a fraction and convert to a float
(3 answers)
Closed 2 years ago.
I have downloaded and created the dataframe below. I would like to create an additional column, in which I divide the second number from a cell by the first one. To give an example, the first cell of the column should be 0.8 (because it's 4/5 = 0.8). Does anyone know how to get the numbers from the string directly and divide them?
Thanks in advance, any help or tips appreciated
Use:
df['Ratio'] = (df['Ratio'].str.split(' for ', expand=True)
                          .astype(float)
                          .assign(Ratio=lambda x: x[0] / x[1])['Ratio'])
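As a minimal sketch of the same idea on toy data: the sample strings below are made up, and the `"4 for 5"` cell format is an assumption inferred from the `' for '` separator in the answer. The column name `Ratio_value` is also hypothetical; it is used so that the original strings are kept.

```python
import pandas as pd

# Hypothetical sample data; the real cells are assumed to look like "4 for 5"
df = pd.DataFrame({'Ratio': ['4 for 5', '1 for 2', '3 for 4']})

# Split each string into two columns (labelled 0 and 1) and convert both to float
parts = df['Ratio'].str.split(' for ', expand=True).astype(float)

# Divide the first number by the second, e.g. 4 / 5
df['Ratio_value'] = parts[0] / parts[1]
print(df)
```

Writing to a new column instead of overwriting `Ratio` makes it easy to compare the result with the source strings.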
This question already has answers here:
Converting string column from DataFrame to float for .sum()
(4 answers)
Change column type in pandas
(16 answers)
Closed 1 year ago.
I am trying to get the sum of two numbers using groupby and transform in the pandas library, but it is giving a garbage value. Can someone guide me on how to solve this?
My data looks like this:
SKU Fees
45241 6.91
45241 6.91
55732 119.05
55732 137.98
I have tried using this code:
df['total_fees'] = df.groupby(['sku'])['Fees'].transform('sum')
What I am getting is this:
SKU Fees total_fees
45241 6.91 6.91.6.91
45241 6.91 6.91.6.91
55732 119.05 119.05.137.98
55732 137.98 119.05.137.98
# Fees holds strings, so 'sum' joins them instead of adding; convert to float first
df['Fees'] = df['Fees'].astype(float)
# One total per sku
df.groupby(['sku'])['Fees'].sum()
# Same totals, but transform('sum') repeats the group total on every row
df.groupby(['sku'])['Fees'].transform('sum')
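A self-contained sketch of the fix, using the four rows from the question with `Fees` deliberately stored as strings (an assumption about the original data). It uses `pd.to_numeric` rather than `astype(float)`, which is a swapped-in alternative and not part of the answer above.

```python
import pandas as pd

# Rows from the question; Fees is assumed to have been read in as strings
df = pd.DataFrame({
    'sku': [45241, 45241, 55732, 55732],
    'Fees': ['6.91', '6.91', '119.05', '137.98'],
})

# Convert to numbers first; errors='coerce' turns unparseable strings into NaN
df['Fees'] = pd.to_numeric(df['Fees'], errors='coerce')

# transform('sum') returns one value per original row, so it fits a new column
df['total_fees'] = df.groupby('sku')['Fees'].transform('sum')
print(df)
```

`errors='coerce'` is a design choice: stray values become NaN instead of raising, so they should be checked for afterwards.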
This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 2 years ago.
Is there a way in Excel, Python, or R to convert data that is in the format of time and quantity per date into one long column? For instance:
Current format:
Instead I want this data to be one long column of 17 0s followed by 1 1 and 176 0s etc.
Thank you in advance for any help.
To elaborate the data looks like this:
Current data:
And I need this data to look like this:
Final result:
One option with uncount
library(tidyr)
uncount(dat, quantity)
Or with rep
with(dat, rep(time, quantity))
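Since the question also asks about Python, here is a rough pandas counterpart to `rep(time, quantity)`. The data frame below is hypothetical: the column names `time` and `quantity` follow the R answer, and the values are chosen to match the "17 0s followed by 1 1 and 176 0s" description.

```python
import pandas as pd

# Hypothetical data; column names follow the R answer
dat = pd.DataFrame({'time': [0, 1, 0], 'quantity': [17, 1, 176]})

# Repeat each row label `quantity` times, then select those rows
long_df = dat.loc[dat.index.repeat(dat['quantity'])].reset_index(drop=True)

# The single long column, comparable to rep(time, quantity) in R
long_column = long_df['time']
print(len(long_column))
```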
This question already has answers here:
The first three max value in a column in python
(1 answer)
Count and Sort with Pandas
(5 answers)
Closed 3 years ago.
I am doing an online course which has a problem like "Find the name of the state with the maximum number of counties". The problem dataframe is shown in the image below
Problem Dataframe
Now, I have given the dataframe two new index levels (hierarchical indexing), and after that the dataframe looks like the image below
Modified Dataframe
I have used this code to get the modified dataframe:
def answer_five():
    new_df = census_df[census_df['SUMLEV'] == 50]
    new_df = new_df.set_index(['STNAME', 'CTYNAME'])
    return new_df

answer_five()
What I want to do now is find the name of the state with the most counties, i.e. find the index value with the maximum number of rows. How can I do that?
I know this can be done with something like the groupby() method, but I'm not familiar with it yet and so don't want to use it. Can anyone help? I have searched for this but failed. Sorry if the problem is rudimentary. Thanks in advance.
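One possible approach without groupby() is to count how often each state name appears in the `STNAME` index level. This is only a sketch: the small `census_df` below is a made-up stand-in with just the columns used in the question, not the real census data.

```python
import pandas as pd

# Hypothetical stand-in for census_df; only the columns used in the question
census_df = pd.DataFrame({
    'SUMLEV': [40, 50, 50, 40, 50, 50, 50],
    'STNAME': ['Alabama', 'Alabama', 'Alabama',
               'Texas', 'Texas', 'Texas', 'Texas'],
    'CTYNAME': ['Alabama', 'Autauga County', 'Baldwin County',
                'Texas', 'Anderson County', 'Andrews County', 'Angelina County'],
})

new_df = census_df[census_df['SUMLEV'] == 50].set_index(['STNAME', 'CTYNAME'])

# Count the rows under each STNAME label, then pick the most frequent label
counts = new_df.index.get_level_values('STNAME').value_counts()
print(counts.idxmax())
```

`value_counts()` does the counting here, so no explicit grouping step is needed.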
This question already has answers here:
Pandas groupby: How to get a union of strings
(8 answers)
Closed 3 years ago.
I'm new to pandas. I was able to create a dataframe from a CSV file, and I was also able to sort it.
What I am struggling with now is the following. I give an image of a pandas data frame as an example.
First column is the index,
Second column is a group number
Third column is what happened.
Based on the second column, I want to extract the matching values of the third column from the same data frame.
To highlight a few examples: for the number 9, return the sequence
[60,61,70,51]
For the number 6 get back the sequence
[65,55,56]
For the number 8 get back the single element 8.
How can groupby be used to do this extraction?
Thanks a lot
Regards
Alex
Starting from the answers to the linked question, we can extract the following code to get the desired result.
import pandas as pd
dataframe = pd.DataFrame({'index':[0,1,2,3,4], 'groupNumber':[9,9,9,9,9], 'value':[12,13,14,15,16]})
# Collect the values of each group into a list, indexed by groupNumber
grouped = dataframe.groupby('groupNumber')['value'].apply(list)
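To see the result with more than one group, here is a small sketch using the numbers quoted in the question for groups 9 and 6. The interleaved row order is an assumption, since the original image is not available.

```python
import pandas as pd

# Values taken from the question's examples; the row order is assumed
dataframe = pd.DataFrame({
    'groupNumber': [9, 6, 9, 6, 9, 6, 9],
    'value': [60, 65, 61, 55, 70, 56, 51],
})

# One list per group; values keep their original row order within each group
grouped = dataframe.groupby('groupNumber')['value'].apply(list)

print(grouped.loc[9])
print(grouped.loc[6])
```

The result is a Series indexed by `groupNumber`, so a single group's list can be looked up with `.loc`.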
This question already has answers here:
pandas - how to convert all columns from object to float type
(3 answers)
Closed 4 years ago.
I am a beginner trying to analyze a dataset of Congressional campaign funding sources, but the values are all strings with '$' in them. How can I quickly change every value into a numerical value?
states_table[dollar_columns] = states_table[dollar_columns].replace(r'[\$,]', '', regex=True).astype(float)
Where dollar_columns is a list of the columns you want to convert. For instance:
dollar_columns = ['net_con', 'net_ope_exp']
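A runnable sketch of this conversion on toy data. The column names follow the answer; the `state` column and all of the values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample; column names follow the answer, values are invented
states_table = pd.DataFrame({
    'state': ['AL', 'AK'],
    'net_con': ['$1,234.50', '$987'],
    'net_ope_exp': ['$2,000', '$15.25'],
})
dollar_columns = ['net_con', 'net_ope_exp']

# Strip '$' and ',' from every cell in the chosen columns, then convert to float
states_table[dollar_columns] = (
    states_table[dollar_columns]
    .replace(r'[\$,]', '', regex=True)
    .astype(float)
)
print(states_table.dtypes)
```

The raw string `r'[\$,]'` keeps the backslash from being read as a Python string escape.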