Pandas : Two 'isin', one condition [duplicate] - python

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 2 years ago.
I've been using pandas for some months and today I found something weird.
Let's say I have these two dataframes :
df1 = pd.DataFrame(data={'C1' : [1,1,1,2],'C2' : ['A','B','C','D']})
df2 = pd.DataFrame(data={'C1':[2,2,2],'C2':['A','B','C']})
What I want is : from df2, every pair of {C1,C2} that exists in df1.
This is what I wrote : df2[df2.C1.isin(df1.C1) & df2.C2.isin(df1.C2)]
I would like the result to be an empty DataFrame, because in df1, 2 is not paired with 'A', 'B' or 'C'; instead I get all of df2. I also tried df2[df2[["C1","C2"]].isin(df1[["C1","C2"]])] but it does not work if df2 has more columns (even unused ones).

You can do it with inner merge:
df2.merge(df1, how='inner', on=['C1', 'C2'])
Empty DataFrame
Columns: [C1, C2]
Index: []
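
To see why the two isin calls pass every row while the merge does not, here is a minimal sketch using the frames from the question. The MultiIndex mask and the df2_extra frame (with its column X) are assumptions added for illustration, not part of the original answer; the mask is one way to keep df2's own index and extra columns.

```python
import pandas as pd

df1 = pd.DataFrame(data={'C1': [1, 1, 1, 2], 'C2': ['A', 'B', 'C', 'D']})
df2 = pd.DataFrame(data={'C1': [2, 2, 2], 'C2': ['A', 'B', 'C']})

# Each isin tests one column on its own, so the (C1, C2) pairing is lost:
# 2 is in df1.C1 and 'A', 'B', 'C' are in df1.C2, hence every row passes.
independent = df2[df2.C1.isin(df1.C1) & df2.C2.isin(df1.C2)]

# The inner merge compares the (C1, C2) pair as a whole.
pairs = df2.merge(df1, how='inner', on=['C1', 'C2'])

# Alternative (assumption): build a row mask from the pairs, which leaves
# df2's index and any other columns untouched.
keys = pd.MultiIndex.from_frame(df1[['C1', 'C2']])
mask = pd.MultiIndex.from_frame(df2[['C1', 'C2']]).isin(keys)
filtered = df2[mask]

# Hypothetical frame with an extra column X and one pair that does exist in df1.
df2_extra = pd.DataFrame({'C1': [2, 2], 'C2': ['A', 'D'], 'X': [10, 20]})
mask_extra = pd.MultiIndex.from_frame(df2_extra[['C1', 'C2']]).isin(keys)
filtered_extra = df2_extra[mask_extra]
```

Note that if df1 contained the same pair more than once, the merge would repeat the matching df2 rows, while the mask would not.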

Related

Merge dataframes with conditions with pandas python [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 1 year ago.
I want to merge two dataframes, but under certain conditions:
If the values of the columns from dataframe 2 (Source1, Target1, Source2, Target2) occur in dataframe 1, I want to replace them with the data from dataframe 2, while keeping all columns from dataframe 2 and all columns from dataframe 1.
My current problem is that when I concatenate, the data from DF2 is only appended to that of DF1, and I end up with invalid data.
In short: match DF1 with DF2 and, if there are intersections, overwrite the intersection in DF1, but merge all columns from DF2 with those from DF1.
Thanks for your help
[Images not preserved: DF1, DF2, What I get, What I need]
frames = [DF1,DF2]
result = pd.concat(frames)
print(result)
Use merge:
out = pd.merge(DF1, DF2, how='left',
on=['Source 1', 'Target 1', 'Source 2', 'Target 2'])
Take some time to read Pandas Merging 101.
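
Since the original frames were posted as images, the following is only a sketch of what the left merge does. All values, and the non-key columns Weight and Score, are hypothetical; only the four key column names come from the answer above.

```python
import pandas as pd

keys = ['Source 1', 'Target 1', 'Source 2', 'Target 2']

# Hypothetical data: DF1 has two rows, DF2 matches only the first one.
DF1 = pd.DataFrame({
    'Source 1': ['a', 'b'], 'Target 1': ['x', 'y'],
    'Source 2': ['c', 'd'], 'Target 2': ['z', 'w'],
    'Weight': [1, 2],   # hypothetical DF1-only column
})
DF2 = pd.DataFrame({
    'Source 1': ['a'], 'Target 1': ['x'],
    'Source 2': ['c'], 'Target 2': ['z'],
    'Score': [0.9],     # hypothetical DF2-only column
})

# how='left' keeps every row of DF1; DF2's columns are filled in where all
# four key columns match and are NaN elsewhere.
out = pd.merge(DF1, DF2, how='left', on=keys)
print(out)
```

Unlike pd.concat, which stacks the rows of both frames, the merge aligns rows on the key columns, so matching rows end up on a single line with the columns of both frames.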

Merging pandas dataframes to produce a dataframe with lists [duplicate]

This question already has answers here:
How do I combine two dataframes?
(8 answers)
Pandas Merging 101
(8 answers)
How to group dataframe rows into list in pandas groupby
(17 answers)
Closed 1 year ago.
I have two data frames with the same column names and the same indices. Each entry
in the data frames is an int or a float. I would like to combine the data frames into
a single data frame. I would like each entry of this data frame to be a list containing the individual elements from the separate data frames.
As an example, df1 and df2 are the original data frames:
         A  B
df1 = 0  0  1

         A  B
df2 = 0  2  3
I would like to produce the following dataframe:
              A       B
df3 = 0  [0, 2]  [1, 3]
I tried the following:
merger = lambda s1, s2: s1.append(s2)
df1.combine(df2, merger)
This gives me the error:
ValueError: cannot reindex from a duplicate axis
I can think of a few ways to do it with loops but I'd like to avoid that if possible. It seems like this is something that should be built into pandas.
Cheers
Try with
out = pd.concat([df1,df2]).groupby(level=0).agg(list)
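
A minimal sketch of that one-liner on the frames from the question: concat stacks the rows (so index 0 appears twice), and grouping on the index level collects each column's values into a list.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [0], 'B': [1]})
df2 = pd.DataFrame({'A': [2], 'B': [3]})

# After concat the index is [0, 0]; groupby(level=0) groups rows that share
# an index label, and agg(list) turns each group's values into a list.
out = pd.concat([df1, df2]).groupby(level=0).agg(list)
print(out)
```

The order inside each list follows the order of the frames passed to concat.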

How to merge two Pandas DataFrames when two columns are the same [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 3 years ago.
I have these two dataframes:
        orderItemId     orderId                  orderDate         latestDeliveryDate
0  BFC0000332253518  2648507110  2019-11-10T21:08:30+01:00  2019-11-11T00:00:00+01:00
0  BFC0000332123047  2647717360  2019-11-10T15:42:39+01:00  2019-11-11T00:00:00+01:00
0  BFC0000332291194  2648712140  2019-11-10T22:24:56+01:00  2019-11-11T00:00:00+01:00

        orderItemId     orderId  shipmentId   shipmentReference               shipmentDate
0  BFC0000332253518  2648507110   689508122  081234500926730318  2019-11-11T00:10:06+01:00
1  BFC0000332123047  2647717360   689505054  081234500926572451  2019-11-10T23:55:38+01:00
2  BFC0000332291194  2648712140   689505045  081234500926710549  2019-11-10T23:55:37+01:00
How can I merge these with pandas merge, given that they have two columns in common? Can I use multiple on= values?
Yes, you can use multiple on values. From your example above, I suppose you want to merge on orderItemId and orderId, right?
Just use:
final_df = pd.merge(df1, df2, how='inner', on=['orderItemId', 'orderId'])
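
As a rough sketch with the data from the question (only some of the columns are reproduced, to keep it short), the two-key merge can be tried like this. The assumption here is that orderItemId is stored as a string and orderId as an integer in both frames.

```python
import pandas as pd

df1 = pd.DataFrame({
    'orderItemId': ['BFC0000332253518', 'BFC0000332123047', 'BFC0000332291194'],
    'orderId': [2648507110, 2647717360, 2648712140],
    'orderDate': ['2019-11-10T21:08:30+01:00',
                  '2019-11-10T15:42:39+01:00',
                  '2019-11-10T22:24:56+01:00'],
})
df2 = pd.DataFrame({
    'orderItemId': ['BFC0000332253518', 'BFC0000332123047', 'BFC0000332291194'],
    'orderId': [2648507110, 2647717360, 2648712140],
    'shipmentId': [689508122, 689505054, 689505045],
})

# A row is matched only when both orderItemId and orderId are equal.
# When the key columns have the same names in both frames, on=[...] is
# equivalent to passing the same list to left_on and right_on.
final_df = pd.merge(df1, df2, how='inner', on=['orderItemId', 'orderId'])
print(final_df)
```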

Join two Dataframe table [duplicate]

This question already has answers here:
Merge two dataframes by index
(7 answers)
Pandas Merging 101
(8 answers)
Closed 5 years ago.
I have two dataframes:
df1
id A
1 wer
3 dfg
5 dfg
df2
id A
2 fgv
4 sdfsdf
I want to join these two dataframes into one that looks like this:
df3
id A
1 wer
2 fgv
3 dfg
...
df3 = df1.merge(df2,how='outer',sort=True)
There is a concat function in pandas that you can use.
df3 = pd.concat([df1, df2])
You can sort the index with:
df3 = df3.sort_index()
Or reset the index with:
df3 = df3.reset_index(drop=True)
I see you have an ellipsis (...) at the end of your df3 dataframe. If that means the dataframe continues, use the above; otherwise go for Jibril's answer.
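
Both approaches can be compared on the data from the question. This sketch assumes id is an ordinary column rather than the index; under that assumption the concat result is ordered with sort_values('id') instead of sort_index().

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 3, 5], 'A': ['wer', 'dfg', 'dfg']})
df2 = pd.DataFrame({'id': [2, 4], 'A': ['fgv', 'sdfsdf']})

# Outer merge on the shared columns (id and A) keeps the rows of both frames.
via_merge = df1.merge(df2, how='outer', sort=True)

# concat stacks the rows; sort by id and rebuild a clean 0..n-1 index.
via_concat = (
    pd.concat([df1, df2])
    .sort_values('id')
    .reset_index(drop=True)
)
print(via_concat)
```

If id is the index instead, pd.concat([df1, df2]).sort_index() as in the answer above gives the same ordering.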

Merge a list of dataframes to create one dataframe [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 4 years ago.
I have a list of 18 data frames:
dfList = [df1, df2, df3, df4, df5, df6.....df18]
All of the data frames have a common id column, so it's easy to join them together with pd.merge two at a time. Is there a way to join them all at once so that dfList comes back as a single dataframe?
I think you need concat, but first set the index of each DataFrame to the common column:
dfs = [df.set_index('id') for df in dfList]
print(pd.concat(dfs, axis=1))
If you need to join with merge:
from functools import reduce
df = reduce(lambda df1,df2: pd.merge(df1,df2,on='id'), dfList)
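
The difference between the two can be seen on a small example. The three frames below are hypothetical stand-ins for the 18 in dfList, and they use distinct value column names (a, b, c) so that merge does not need to add suffixes.

```python
from functools import reduce

import pandas as pd

# Hypothetical stand-ins for dfList; note that the third frame has
# a different set of ids.
dfList = [
    pd.DataFrame({'id': [1, 2, 3], 'a': [10, 20, 30]}),
    pd.DataFrame({'id': [1, 2, 3], 'b': ['x', 'y', 'z']}),
    pd.DataFrame({'id': [2, 3, 4], 'c': [0.2, 0.3, 0.4]}),
]

# reduce folds the list pairwise: merge(merge(df_0, df_1), df_2), and so on.
# pd.merge defaults to an inner join, so only ids present in every frame remain.
merged = reduce(lambda left, right: pd.merge(left, right, on='id'), dfList)

# concat along axis=1 aligns on the index and defaults to an outer join,
# so ids missing from a frame get NaN in that frame's columns.
wide = pd.concat([df.set_index('id') for df in dfList], axis=1)

print(merged)
print(wide)
```

Passing how='outer' to pd.merge inside the lambda would keep every id, similar to the concat result.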
