Pandas replace value with other column value [duplicate] - python

This question already has answers here:
Pandas conditional creation of a series/dataframe column
(13 answers)
Closed 25 days ago.
I have a table and I want to replace column values with other columns values based on a condition:
Table:
A  B  C     D     E
x  1  test  fool  bar
y  3  test  fool  bar
If column C contains the word test -> value should be replaced with content of column A
If column D contains the word fool -> value should be replaced with content of column B
Expected result:
A  B  C  D  E
x  1  x  1  bar
y  3  y  3  bar
How can I create this table?

We can use np.where here:
df["C"] = np.where(df["C"] == "test", df["A"], df["C"])
df["D"] = np.where(df["D"] == "fool", df["B"], df["D"])


Copy Row(s) from One DataFrame to Another with Regex [duplicate]

This question already has answers here:
How to test if a string contains one of the substrings in a list, in pandas?
(4 answers)
Closed 5 months ago.
I am trying to extract specific rows from a dataframe where values in a column contain a designated string. For example, the current dataframe looks like:
df1=
Location Value Name Type
Up 10 Test A X
Up 12 Test B Y
Down 11 Prod 1 Y
Left 8 Test C Y
Down 15 Prod 2 Y
Right 30 Prod 3 X
And I am trying to build a new dataframe with all rows that have "Test" in the 'Name' column.
df2=
Location Value Name Type
Up 10 Test A X
Up 12 Test B Y
Left 8 Test C Y
Is there a way to do this with regex or match?
Try:
df_out = df1[df1["Name"].str.contains("Test")]
print(df_out)
Prints:
Location Value Name Type
0 Up 10 Test A X
1 Up 12 Test B Y
3 Left 8 Test C Y
How about: df2 = df1.loc[['Test' in name for name in df1.Name]]
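For reference, a self-contained sketch that rebuilds the example frame by hand and applies the str.contains filter:

import pandas as pd

df1 = pd.DataFrame({
    "Location": ["Up", "Up", "Down", "Left", "Down", "Right"],
    "Value": [10, 12, 11, 8, 15, 30],
    "Name": ["Test A", "Test B", "Prod 1", "Test C", "Prod 2", "Prod 3"],
    "Type": ["X", "Y", "Y", "Y", "Y", "X"],
})

# Keep rows whose Name contains the substring "Test"
df2 = df1[df1["Name"].str.contains("Test")]
print(df2)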

fill a dataframe with value of another dataframe according to columns value [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 10 months ago.
I have two dataframes:
the first one, let's say dfrA
x,y,z
0,0,1
0,1,2
0,2,3
0,3,4
1,0,5
1,1,6
1,2,7
1,3,8
2,0,9
2,1,10
2,2,11
2,3,12
3,0,13
3,1,14
3,2,15
3,3,16
and another one, let's say dfrB
x,y
1,2
2,3
I would like to add a column to dfrB holding the z value from the row of dfrA that has the same x and y as dfrB.
In other words I expect:
x,y,z
1,2,7
2,3,12
I am able to add an empty column to dfrB:
df_support = pd.DataFrame(columns=['z'])
dfrB = dfrB.join(df_support, how="outer")
How can I now fill column z in dfrB? I would like to avoid a loop full of ifs.
You can try pandas.DataFrame.merge
dfrB['z'] = dfrB.merge(dfrA, on=['x', 'y'], how='left')['z']
print(dfrB)
x y z
0 1 2 7
1 2 3 12
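For completeness, a self-contained sketch that rebuilds both frames from the CSV text above and assigns the merge result back to dfrB:

import io
import pandas as pd

# Rebuild both frames from the CSV text shown above
dfrA = pd.read_csv(io.StringIO(
    "x,y,z\n0,0,1\n0,1,2\n0,2,3\n0,3,4\n"
    "1,0,5\n1,1,6\n1,2,7\n1,3,8\n"
    "2,0,9\n2,1,10\n2,2,11\n2,3,12\n"
    "3,0,13\n3,1,14\n3,2,15\n3,3,16\n"
))
dfrB = pd.read_csv(io.StringIO("x,y\n1,2\n2,3\n"))

# A left merge keeps every row of dfrB and pulls in the matching z
dfrB = dfrB.merge(dfrA, on=["x", "y"], how="left")
print(dfrB)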

Are the values of column xy of df1 also present in column zy of df2? [duplicate]

This question already has answers here:
Check if Pandas column contains value from another column
(3 answers)
Check if value from one dataframe exists in another dataframe
(4 answers)
Closed 11 months ago.
I have two dataframes and I want to check which values of df1's col1 also occur in df2's col1. If a value occurs, col2_new should be 1, otherwise 0. Is it best to do this with a list, i.e. convert df1's column to a list and then loop over the other dataframe's column, or is there a more elegant way?
df1 (before):
index  col1
1      a
2      b
3      c
df2:
index  col1
1      a
2      e
3      b
df1 (after):
index  col1  col2_new
1      a     1
2      b     1
3      c     0
Use Series.isin and convert the boolean mask to integers:
df1['col2_new'] = df1['col1'].isin(df2['col1']).astype(int)
Or:
import numpy as np

df1['col2_new'] = np.where(df1['col1'].isin(df2['col1']), 1, 0)
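A self-contained sketch with the example frames built by hand (index values taken from the tables above):

import pandas as pd

df1 = pd.DataFrame({"col1": ["a", "b", "c"]}, index=[1, 2, 3])
df2 = pd.DataFrame({"col1": ["a", "e", "b"]}, index=[1, 2, 3])

# True where the value also appears in df2's col1, then cast the mask to 0/1
df1["col2_new"] = df1["col1"].isin(df2["col1"]).astype(int)
print(df1)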

Two column DataFrame to transition table (pivot) [duplicate]

This question already has answers here:
Get statistics for each group (such as count, mean, etc) using pandas GroupBy?
(9 answers)
How can I pivot a dataframe?
(5 answers)
Closed 3 years ago.
I have a pandas dataframe with two columns. I want to measure the transition count, that is, the number of times each unique first-column value is paired with each unique second-column value. This should be a pivot or pivot_table, but I am stuck. In the code pasted, trial is the input dataframe and ans is the answer dataframe that I would like to obtain by manipulating trial.
I did not spot a similar question that has only two columns. The others used pivot with a third column and a mean or sum aggfunc. Here there are only two columns, and I want to count the transitions. The other questions also used numerical columns where aggregation is possible; I want to count occurrences of non-numeric values.
If there is a similar question, it would be very helpful if someone could point me to it.
trial=pd.DataFrame({'col1':list('AABCCCDDDD'),'col2':list('XYXXXYYXZZ')})
index col1 col2
0 A X
1 A Y
2 B X
3 C X
4 C X
5 C Y
6 D Y
7 D X
8 D Z
9 D Z
ans=pd.DataFrame({'col1':list('ABCD'),'X':[1,1,2,1],'Y':[1,0,1,1],'Z':[0,0,0,2]})
ans.set_index('col1')
col1 X Y Z
A 1 1 0
B 1 0 0
C 2 1 0
D 1 1 2
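One way to get this transition count, sketched here with pd.crosstab (the linked duplicates also cover groupby with size/unstack and pivot_table with aggfunc='size'):

import pandas as pd

trial = pd.DataFrame({'col1': list('AABCCCDDDD'),
                      'col2': list('XYXXXYYXZZ')})

# Count how often each col1 value pairs with each col2 value
ans = pd.crosstab(trial['col1'], trial['col2'])
print(ans)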

Getting a cell value from a column based on cell value from another column [duplicate]

This question already has an answer here:
Pandas select rows and columns based on boolean condition
(1 answer)
Closed 4 years ago.
I have this dataframe
d = {'Number': [1, 2,3,4,5,6,7], 'Letters': ["a", "d","z","f","u","p","g"]}
df = pd.DataFrame(data=d)
Number Letters
0 1 a
1 2 d
2 3 z
3 4 f
4 5 u
5 6 p
6 7 g
And I want to get a value from the Letters column based on the Number column.
Let's say I want to get the letter where the number is 3.
What I did was
letter = df.loc[df['Number'] == 3]
dfletter = pd.DataFrame(data=letter.values, columns = ['Number', 'Letter'])
dfletter = dfletter.drop(columns = 'Number')
which gives me what I want
Letter
0 z
But this seems like a dumb workaround, so I am looking for a better solution
output = df.loc[df['Number'] == 3, 'Letters']
>>> df[df.Number == 3].Letters
2 z
Name: Letters, dtype: object
Or, if you really need a scalar value:
>>> df[df.Number == 3].Letters.values[0]
'z'
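If a single scalar is needed, Series.item() on the filtered selection is another option; a sketch:

import pandas as pd

df = pd.DataFrame({'Number': [1, 2, 3, 4, 5, 6, 7],
                   'Letters': ["a", "d", "z", "f", "u", "p", "g"]})

# .item() raises if the selection does not hold exactly one value
letter = df.loc[df['Number'] == 3, 'Letters'].item()
print(letter)  # z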
