I have 3 dataframes:
df1 :
ip name
df2 :
name country
df3:
country city
I have to match them by IP. What is the correct way to do this? We currently match df1 with df2, and then match that result with df3 after changing the index. I don't think that is the correct way.
It seems you need a double merge. The on parameter can be omitted if the only columns the DataFrames have in common are the join columns:
df = df1.merge(df2).merge(df3)
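As a minimal sketch of the chained merge, assuming made-up sample rows (the IPs, names, countries and cities below are invented for illustration), the join keys can also be spelled out explicitly so the result does not depend on which columns happen to share a name:

```python
import pandas as pd

# Hypothetical sample data matching the column layout in the question
df1 = pd.DataFrame({"ip": ["10.0.0.1", "10.0.0.2"], "name": ["alice", "bob"]})
df2 = pd.DataFrame({"name": ["alice", "bob"], "country": ["FR", "DE"]})
df3 = pd.DataFrame({"country": ["FR", "DE"], "city": ["Paris", "Berlin"]})

# df1 -> df2 on "name", then that result -> df3 on "country"
df = df1.merge(df2, on="name").merge(df3, on="country")
print(df)
```

Each ip ends up on the same row as its name, country and city, with no index changes needed.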
Related
I have df1 with country and sales and df2 with country and sales. Country spelling is not exactly the same in df1 and df2. How do I make the country spelling in df1 match that of df2 before merging the two dataframes together?
The merge below doesn't give me all matches due to different spellings:
pd.merge(df1, df2, on='country', how='left')
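One possible approach, sketched here under assumptions and not taken from an answer in this thread, is to map each df1 spelling to its closest df2 spelling with the standard library's difflib.get_close_matches before merging. The sample data, the helper name closest and the 0.8 cutoff are all illustrative; a real dataset would need its matches reviewed by hand.

```python
import difflib
import pandas as pd

# Hypothetical data with slightly different spellings in df1
df1 = pd.DataFrame({"country": ["United Stats", "Germany", "Brasil"], "sales": [10, 20, 30]})
df2 = pd.DataFrame({"country": ["United States", "Germany", "Brazil"], "sales": [1, 2, 3]})

def closest(name, candidates, cutoff=0.8):
    """Return the most similar candidate, or the name unchanged if none is close enough."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else name

candidates = df2["country"].tolist()
# Rewrite df1's spellings to df2's spellings, then merge as before
df1["country"] = df1["country"].map(lambda c: closest(c, candidates))
merged = pd.merge(df1, df2, on="country", how="left", suffixes=("_df1", "_df2"))
print(merged)
```

Names with no sufficiently close match are left as they were, so they still show up as unmatched rows after the left merge.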
I have two dataframes, df1 and df2.
df1 contains integers and df2 contains booleans.
df1 and df2 are exactly the same size (like both are 10x10).
I would like to create a df3 that takes the data from df1 only where the value in the same location in df2 is True. All False positions would be replaced by NaN in df3.
Thanks in advance!
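A minimal sketch of one way to do this, assuming df1 and df2 share the same index and column labels, is DataFrame.where, which keeps values where the condition is True and puts NaN elsewhere. The 2x2 frames below are invented for illustration.

```python
import pandas as pd

# Hypothetical small frames with identical shape and labels
df1 = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[True, False], [False, True]], columns=["a", "b"])

# Keep df1's value where df2 is True; everything else becomes NaN
df3 = df1.where(df2)
print(df3)
```

Because NaN is a float, the integer columns of df1 are converted to float in df3.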
I have two DataFrames as follow:
df1 = pd.DataFrame({'Group1': [0.5,5,3], 'Group2' : [2,.06,0.9]}, index=['Dan','Max','Joe'])
df2 = pd.DataFrame({'Name' : ['Joe','Max'], 'Team' : ['Group2','Group1']})
My goal is to get the right value for each person's Name, taking the column 'Team' into account.
So the result should look something like this:
I tried it with a merge but I failed because I don't know how to merge on these conditions.
What's the best way in Python to reach my goal?
You can unstack df1, reset its indices, rename columns and merge on Name and Team:
out = (df1.unstack()
.reset_index()
.rename({'level_0':'Team', 'level_1':'Name', 0:'Value'}, axis=1)
.merge(df2, on=['Name','Team']))
Output:
Team Name Value
0 Group1 Max 5.0
1 Group2 Joe 0.9
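As an alternative sketch, not the approach from the answer above, the same values can be looked up one pair at a time with DataFrame.at, which may be easier to read when df2 is small. The column name 'Value' is just an illustrative choice.

```python
import pandas as pd

df1 = pd.DataFrame({'Group1': [0.5, 5, 3], 'Group2': [2, .06, 0.9]}, index=['Dan', 'Max', 'Joe'])
df2 = pd.DataFrame({'Name': ['Joe', 'Max'], 'Team': ['Group2', 'Group1']})

result = df2.copy()
# For each (Name, Team) pair, read the cell at row Name, column Team of df1
result['Value'] = [df1.at[name, team] for name, team in zip(df2['Name'], df2['Team'])]
print(result)
```

This keeps the row order of df2, whereas the merge keeps the order of the unstacked df1.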
I have 2 data frames that look like this
Df1
City Code ColA Col..Z
LA LAA
LA LAB
LA LAC
Df2
Code ColA Col..Z
LA LAA
NY NYA
CH CH1
What I'm trying to do is get this result:
df3
Code ColA Col..Z
NY NYA
CH CH1
Normally I would loop through each row in df2 and say:
Df3 = If df2.row['Code'] in df1 then drop it.
But I want to find a pythonic pandas way to do it instead of looping through the dataframe. I was looking at examples using joins or merges, but I can't seem to work it out.
The pseudocode "Df3 = If df2.row['Code'] in df1 then drop it" translates to:
df3 = df2[~df2['Code'].isin(df1['City'])]
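Here is a self-contained sketch of that filter. The frames are an assumed reading of the flattened tables in the question (df1 with City and Code, df2 with Code and ColA), so the column layout is a guess and not something the question confirms.

```python
import pandas as pd

# Assumed layout: df1's City holds the values that df2's Code should be checked against
df1 = pd.DataFrame({"City": ["LA", "LA", "LA"], "Code": ["LAA", "LAB", "LAC"]})
df2 = pd.DataFrame({"Code": ["LA", "NY", "CH"], "ColA": ["LAA", "NYA", "CH1"]})

# isin builds a boolean mask; ~ inverts it so only rows NOT found in df1 are kept
df3 = df2[~df2["Code"].isin(df1["City"])].reset_index(drop=True)
print(df3)
```

The reset_index call is optional; it only renumbers the surviving rows from 0.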
To keep only the rows of df2 whose 'Code' value does not also appear in df1 (assuming 'Code' values are unique within each dataframe), you can do something like this, using drop_duplicates:
df2[df2['Code'].isin(
# values that occur exactly once across both 'Code' columns
pd.concat([df1['Code'], df2['Code']]).drop_duplicates(keep=False)
)]
There is also the pandas DataFrame.compare method, which might be relevant:
df1 = pd.read_clipboard()
df1
df2 = pd.read_clipboard()
df2
df1.compare(df2).drop('self', axis=1, level=1).droplevel(1, axis=1)
(I'm assuming there was a typo in your dataframes and that the City column is missing from df2.)
As someone who is super new to merge/append in Python, I am trying to merge two different DataFrames together.
DF1 has 2 columns with Text and ID columns and 100 rows
DF2 has 3 columns with Text, ID, and Match columns and has 20 rows
My goal is to combine the two DFs together so the "Match" column from DF2 can be merged into DF1.
Every value in the Match column is True, so after the merge the other 80 rows of DF1 can be NaN and I can fix that later.
Thank you to everyone for the help and support!
Try a left merge using .merge(), like this:
DF_out = DF1.merge(DF2, on=['Text', 'ID'], how='left')
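A small runnable sketch of that left merge, using invented three-row data in place of the real 100-row and 20-row frames:

```python
import pandas as pd

# Hypothetical stand-ins for the real data
DF1 = pd.DataFrame({"Text": ["a", "b", "c"], "ID": [1, 2, 3]})
DF2 = pd.DataFrame({"Text": ["b"], "ID": [2], "Match": [True]})

# how='left' keeps every row of DF1; rows with no partner in DF2 get NaN in Match
DF_out = DF1.merge(DF2, on=["Text", "ID"], how="left")
print(DF_out)
```

DF_out has the same number of rows as DF1, and the NaN entries in Match can be filled in afterwards, for example with fillna.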