Remove column without headers and data [duplicate] - python

This question already has answers here:
Remove Unnamed columns in pandas dataframe [duplicate]
(4 answers)
Closed 4 years ago.
I have a CSV file, and when I load it into Python as a dataframe, pandas creates an extra Unnamed: 1 column. How can I remove it or filter it out?
I only need the Title and Date columns in my dataframe, not column B of the CSV. The dataframe looks like this:
Title Unnamed: 1 Date
0 Đồng Nai Province makes it easier for people w... NaN 18/07/2018
1 Ex-NBA forward Washington gets six-year prison... NaN 10/07/2018
2 Helicobacter pylori NaN 10/07/2018
3 Paedophile gets prison term for sexual assault NaN 03/07/2018
4 Immunodeficiency burdens families NaN 28/06/2018

Drop that column from your dataframe. Pass columns= (or axis=1), because drop removes rows by default:
df.drop(columns=["Unnamed: 1"], inplace=True)
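If the position of the empty column varies, it may be easier to filter on the column name prefix, or to avoid loading the column at all. The following is a minimal sketch under assumptions: csv_text is a made-up stand-in for the real file, and the names cleaned and only_wanted are purely illustrative.

```python
import io

import pandas as pd

# Hypothetical CSV with an empty, header-less middle column
# (standing in for column B in the question).
csv_text = (
    "Title,,Date\n"
    "Helicobacter pylori,,10/07/2018\n"
    "Immunodeficiency burdens families,,28/06/2018\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Option 1: keep every column whose name does not start with "Unnamed".
cleaned = df.loc[:, ~df.columns.str.startswith("Unnamed")]

# Option 2: load only the wanted columns in the first place.
only_wanted = pd.read_csv(io.StringIO(csv_text), usecols=["Title", "Date"])

print(cleaned.columns.tolist())
print(only_wanted.columns.tolist())
```

Option 2 tends to be the simpler choice when the wanted column names are known in advance.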

Related

Merge rows based on same column value (float type) [duplicate]

This question already has answers here:
pandas join rows/groupby with categorical data and lots of nan values
(3 answers)
Closed 6 months ago.
I have a dataset that looks like the following:
id name phone diagnosis
0 1 archie 12345 healthy
1 2 betty 23456 dead
2 3 clara 34567 NaN
3 3 clara 34567 kidney
4 4 diana 45678 cancer
I want to merge duplicated rows and have a table that looks like this:
id name phone diagnosis
0 1 archie 12345 healthy
1 2 betty 23456 dead
2 3 clara 34567 NaN, kidney
3 4 diana 45678 cancer
In short I want the entries in the diagnosis column put together so I can have an overview. I have tried running the following but it throws out an error, stating that a string was expected but a float was found.
data = data.groupby(['id','name','phone'])['diagnosis'].apply(', '.join).reset_index()
Anyone have any ideas how I can merge the rows?
This is because of the NaN values: NaN is a float, so ', '.join cannot concatenate it with strings. One workaround is to replace the NaNs with the string 'NaN' first:
data.fillna('NaN', inplace=True)
data.groupby(['id', 'name', 'phone']).diagnosis.apply(', '.join).reset_index()
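If the missing values should be left out of the combined string instead, one possible variation is to drop them inside each group before joining. This is only a sketch: the frame below is rebuilt from the sample in the question, and note that it yields "kidney" for clara rather than the "NaN, kidney" shown in the desired output.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
data = pd.DataFrame({
    "id": [1, 2, 3, 3, 4],
    "name": ["archie", "betty", "clara", "clara", "diana"],
    "phone": [12345, 23456, 34567, 34567, 45678],
    "diagnosis": ["healthy", "dead", np.nan, "kidney", "cancer"],
})

# Drop NaN within each group, then join the remaining strings.
merged = (
    data.groupby(["id", "name", "phone"])["diagnosis"]
    .apply(lambda s: ", ".join(s.dropna()))
    .reset_index()
)

print(merged)
```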

Creating a DataFrame from 2 other dataframes based on conditions in Python [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 1 year ago.
I have 2 CSV files. They share one common column, ID. I want to extract the rows they have in common and build another dataframe from them. First I want to select the Job column, and then, using the common column, find the rows whose IDs are the same. The dataframes should look like this.
Let the first DataFrame be:
ID  Gender  Job           Shift  Wage
1   Male    Engineer      Night  8000
2   Male    Engineer      Night  7865
3   Female  Worker        Day    5870
4   Male    Accountant    Day    5870
5   Female  Architecture  Day    4900
Let the second one be:
ID  Department
1   IT
2   Quality Control
5   Construction
7   Construction
8   Human Resources
And the new DataFrame should be like:
ID  Department       Job           Wage
1   IT               Engineer      8000
2   Quality Control  Engineer      7865
5   Construction     Architecture  4900
You can use:
df_result = df1.merge(df2, on = 'ID', how = 'inner')
If you want to select only certain columns from a certain df use:
df_result = df1[['ID', 'Job', 'Wage']].merge(df2[['ID', 'Department']], on='ID', how='inner')
Use:
df = df2.merge(df1[['ID','Job', 'Wage']], on='ID')
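To check the merge against the sample tables from the question, the frames can be built by hand as in this sketch. In practice they would come from pd.read_csv on the two files, whose names are not given in the question.

```python
import pandas as pd

# Frames rebuilt by hand from the sample tables in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Gender": ["Male", "Male", "Female", "Male", "Female"],
    "Job": ["Engineer", "Engineer", "Worker", "Accountant", "Architecture"],
    "Shift": ["Night", "Night", "Day", "Day", "Day"],
    "Wage": [8000, 7865, 5870, 5870, 4900],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 5, 7, 8],
    "Department": ["IT", "Quality Control", "Construction",
                   "Construction", "Human Resources"],
})

# The inner join keeps only IDs present in both frames (1, 2 and 5).
df_result = df2.merge(df1[["ID", "Job", "Wage"]], on="ID", how="inner")

print(df_result)
```

Putting df2 on the left gives the column order ID, Department, Job, Wage requested in the question.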

How to delete multiple rows in pandas dataframe based on one column object? [duplicate]

This question already has answers here:
Deleting DataFrame row in Pandas based on column value
(18 answers)
Closed 2 years ago.
I have a dataframe that looks like this
Year Season
2000 Winter
2002 Winter
2002 Summer
2004 Summer
2006 Winter
and I want to remove all the rows with Winter so that it looks like this
Year Season
2002 Summer
2004 Summer
Like this:
df = df[df['Season']!='Winter']
Explanation:
df['Season']!='Winter' returns a boolean mask that you can use to index the original dataframe, thereby dropping all rows where season is winter.
See here: How to select rows from a DataFrame based on column values?
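As a rough illustration of the mask, here is the same filter on the sample data, plus an isin variant for excluding several values at once. The unwanted list, including "Autumn", is a hypothetical addition and not part of the question.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Year": [2000, 2002, 2002, 2004, 2006],
    "Season": ["Winter", "Winter", "Summer", "Summer", "Winter"],
})

# Boolean mask: True for the rows to keep.
mask = df["Season"] != "Winter"
summer_only = df[mask].reset_index(drop=True)

# Hypothetical extension: exclude several values at once with isin.
unwanted = ["Winter", "Autumn"]
kept = df[~df["Season"].isin(unwanted)]

print(summer_only)
```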

Pandas: Update values in one column based on the values from another pandas table [duplicate]

This question already has answers here:
Remap values in pandas column with a dict, preserve NaNs
(11 answers)
Pandas Merging 101
(8 answers)
Closed 3 years ago.
Imagine we have the following 2 pandas tables:
employee_id country
1 Sweden
2 Denmark
....
45 Germany
In another table I have the employee_id and want to create a new column where the country for the employee_ID is matched from the other table. Note that in this table the same employee_id can occur throughout multiple rows.
employee_id year salary
1 2017 45000
2 2017 50000
1 2018 46000
....
How do I create the extra column on the last table using the information from the first table?
I tried code like this:
df.loc[df.isin(df1.employee_id), 'country'] = df1.loc[df1.employee_id.isin(df.id), 'country']
However, that did not work.
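Following the two linked duplicates, one way this could be approached is to turn the first table into a lookup Series and map it onto employee_id, or equivalently to do a left merge. This is a sketch that assumes each employee_id appears only once in the country table. The frame names countries and salaries are made up here, and only the rows shown in the question are included.

```python
import pandas as pd

# Only the rows visible in the question; the names are illustrative.
countries = pd.DataFrame({
    "employee_id": [1, 2, 45],
    "country": ["Sweden", "Denmark", "Germany"],
})
salaries = pd.DataFrame({
    "employee_id": [1, 2, 1],
    "year": [2017, 2017, 2018],
    "salary": [45000, 50000, 46000],
})

# Option 1: build an employee_id -> country lookup and map it.
lookup = countries.set_index("employee_id")["country"]
salaries["country"] = salaries["employee_id"].map(lookup)

# Option 2: a left merge gives the same column and keeps every salary row.
merged = salaries.drop(columns="country").merge(
    countries, on="employee_id", how="left"
)

print(salaries)
```

With either option, an employee_id that is missing from the country table ends up with NaN in the new column.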

Add a value to a new column on Data frame that depends on the value on another Data frame [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 4 years ago.
I have two data frames, df1 and df2. df1 has entries of amounts spent by users, and each user can have several entries with different amounts.
The second data frame just holds information about every user (each user is unique in this data frame).
I want to create a new column in df1 that contains the Country value of each user from df2.
Any help will be appreciated.
df1
name_id Dept amt_spent
0 Alex-01 Engineering 5
1 Bob-01 Finance 5
2 Charles-01 HR 10
3 David-01 HR 6
4 Alex-01 Engineering 50
df2
name_id Country
0 Alex-01 UK
1 Bob-01 USA
2 Charles-01 GHANA
3 David-01 BRAZIL
Result
name_id Dept amt_spent Country
0 Alex-01 Engineering 5 UK
1 Bob-01 Finance 5 USA
2 Charles-01 HR 10 GHANA
3 David-01 HR 6 BRAZIL
4 Alex-01 Engineering 50 UK
This should work:
df = pd.merge(df1, df2)
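With no arguments, pd.merge joins on the columns the two frames share (here name_id) and performs an inner join, so a user missing from df2 would be dropped. A slightly more explicit sketch, assuming every row of df1 should be kept, might look like this. The validate argument is optional and only checks that name_id is unique in df2.

```python
import pandas as pd

# Frames rebuilt from the sample in the question.
df1 = pd.DataFrame({
    "name_id": ["Alex-01", "Bob-01", "Charles-01", "David-01", "Alex-01"],
    "Dept": ["Engineering", "Finance", "HR", "HR", "Engineering"],
    "amt_spent": [5, 5, 10, 6, 50],
})
df2 = pd.DataFrame({
    "name_id": ["Alex-01", "Bob-01", "Charles-01", "David-01"],
    "Country": ["UK", "USA", "GHANA", "BRAZIL"],
})

# The left join keeps every df1 row in its original order;
# validate raises MergeError if name_id is duplicated in df2.
result = df1.merge(df2, on="name_id", how="left", validate="many_to_one")

print(result)
```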
