I want to combine multiple conditions into one variable so that I can determine the value I will insert into my column 'EmptyCol'. Please see below. Note: this works with one condition, but I believe I'm missing something with multiple conditions.
Condition = ((df['status']=='Live') and
             (df['name'].str.startswith('A')) and
             (df['true']==1))
df.loc[Condition, 'EmptyCol'] = 'True'
Use "&" instead of "and". Python's "and" tries to reduce the whole Series to a single True/False, which raises "ValueError: The truth value of a Series is ambiguous"; "&" compares element-wise.
Condition = ((df['status']=='Live') &
             (df['name'].str.startswith('A')) &
             (df['true']==1))
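A runnable sketch of the fix, using a small made-up DataFrame with the question's column names:

```python
import pandas as pd

# Toy data; the column names come from the question, the values are illustrative
df = pd.DataFrame({
    'status':   ['Live', 'Live', 'Closed'],
    'name':     ['Alpha', 'Beta', 'Apple'],
    'true':     [1, 1, 1],
    'EmptyCol': [None, None, None],
})

# Element-wise & works; Python's `and` would raise
# "ValueError: The truth value of a Series is ambiguous"
condition = ((df['status'] == 'Live') &
             (df['name'].str.startswith('A')) &
             (df['true'] == 1))

df.loc[condition, 'EmptyCol'] = 'True'
print(df['EmptyCol'].tolist())  # ['True', None, None]
```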
I also recommend using df.at; I've sometimes run into trouble with df.loc!
Condition = ((df['status']=='Live') &
             (df['name'].str.startswith('A')) &
             (df['true']==1))

def ChangeValueFunc(Record):
    df.at[Record['index'], 'EmptyCol'] = 'True'

df_2.loc[Condition, :].reset_index().apply(lambda x: ChangeValueFunc(x), axis=1)
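A self-contained sketch of that df.at pattern with toy data (here a single frame df stands in for both df and df_2 in the snippet above):

```python
import pandas as pd

df = pd.DataFrame({
    'status':   ['Live', 'Live', 'Closed'],
    'name':     ['Alpha', 'Beta', 'Apple'],
    'true':     [1, 1, 1],
    'EmptyCol': [None, None, None],
})

condition = ((df['status'] == 'Live') &
             (df['name'].str.startswith('A')) &
             (df['true'] == 1))

def change_value(record):
    # reset_index() moved the original row label into the 'index' column,
    # and df.at sets one scalar cell keyed by (index label, column name)
    df.at[record['index'], 'EmptyCol'] = 'True'

df.loc[condition, :].reset_index().apply(change_value, axis=1)
print(df['EmptyCol'].tolist())  # ['True', None, None]
```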
Related
I know that one can compare a whole column of a dataframe and make a list of all rows that contain a certain value with:
values = parsedData[parsedData['column'] == valueToCompare]
But is it possible to make a list of all rows by comparing two columns against values, like:
values = parsedData[parsedData['column01'] == valueToCompare01 and parsedData['column02'] == valueToCompare02]
Thank you!
It is completely possible, but 'and' won't work for masking a dataframe; '&' is what you want here. Note that you should wrap each comparison in ( ), both for clarity and because '&' binds more tightly than '==':
values = parsedData[(parsedData['column01'] == valueToCompare01) & (parsedData['column02'] == valueToCompare02)]
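For example, with made-up data standing in for parsedData and the two comparison values:

```python
import pandas as pd

# Hypothetical stand-in for parsedData
parsedData = pd.DataFrame({
    'column01': [1, 1, 2],
    'column02': ['a', 'b', 'a'],
})
valueToCompare01, valueToCompare02 = 1, 'a'

# Parenthesized comparisons combined element-wise with &
values = parsedData[(parsedData['column01'] == valueToCompare01) &
                    (parsedData['column02'] == valueToCompare02)]
print(values.index.tolist())  # [0]
```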
I have a table with several dummy variables
I would now like to create a subgroup listing the winpercent values of those rows where fruity=1 and hard=0. My first attempt was this one, but it was unsuccessful:
df6=full_data[full_data['fruity'&'hard']==['1'&'0'].iloc[:,-1]
Can anyone help, please?
Write the conditions one by one, separated by the '&' operator:
full_data.loc[(full_data['fruity'] == 1) &
(full_data['hard'] == 0), 'winpercent']
You can also query it:
full_data.query("fruity == 1 and hard == 0", inplace=False)['winpercent']
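Both forms give the same result; here is a sketch with a few made-up rows mimicking the dummy-variable table:

```python
import pandas as pd

# Toy stand-in for full_data
full_data = pd.DataFrame({
    'fruity':     [1, 1, 0],
    'hard':       [0, 1, 0],
    'winpercent': [55.0, 40.0, 60.0],
})

via_loc = full_data.loc[(full_data['fruity'] == 1) &
                        (full_data['hard'] == 0), 'winpercent']

# Inside query(), plain `and` is fine: pandas parses the expression
# string itself, so Python's Series truth-value restriction never applies
via_query = full_data.query("fruity == 1 and hard == 0")['winpercent']

print(via_loc.tolist())    # [55.0]
print(via_query.tolist())  # [55.0]
```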
I've got a Pandas dataframe (v0.25.3, Python 3.6) and I'm trying to do operations on rows that match certain conditions. I've done this hundreds of times, but now I'm getting weird behavior that I can't figure out. Specifically, I've got two conditions, and I want to capture only rows where both conditions are True, but I'm getting rows in my results where either or both conditions are False.
For example,
print(data.loc[1,"var1"] != None)
print(data.loc[1,"var2"] != None)
returns False and True, but when I run
thisData1 = data.loc[((data["var1"] != None) & (data["var2"] != None))]
print(thisData1.head())
row 1 is still in there... all the data is still in there! If I use the older styling without .loc I get the same results. Row 0 is still in there and they are both None. Furthermore, when I run just
print(len(data[data['var1'] != None]))
it again doesn't filter anything, even though print(data.loc[1,"var1"] != None) prints False.
Everything here SEEMS to conform to the correct Pandas way to do this (e.g., see this question), and it usually works, but I can't see what I'm doing wrong in this case. Can anybody spot my error or recommend a way a different/safer way to run these filters? If the problem is my dataset, what should I check?
Use notnull instead of != None. In pandas, comparing a column to None element-wise never identifies missing values (missing entries compare as "not equal" to everything), so the mask doesn't filter them out.
thisData1 = data[data["var1"].notnull() & data["var2"].notnull()]
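A small demonstration of the gotcha, with made-up data (a float column where the missing value is NaN):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'var1': [1.0, np.nan, 3.0]})

# Missing values follow NaN semantics: they compare as "not equal"
# to everything, including None, so this mask keeps every row
print((data['var1'] != None).tolist())   # [True, True, True]

# notnull() (a.k.a. notna()) is the reliable missing-value test
filtered = data[data['var1'].notnull()]
print(filtered['var1'].tolist())  # [1.0, 3.0]
```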
I'm trying to insert a new column in Python using Pandas based on an "or" condition, but I'm having trouble with the code. Here's what I'm trying to do:
If the column "Regulatory Body" says FDIC, Fed, or Treasury, then I want a new column "Filing" to say "Yes"; otherwise, say "No". This is what I've written. My dataframe is df200.
df200["Filing"] = np.where(df200["Regulatory Body"]=="FDIC","Yes","No")
Is there a way to have an "or" condition in this code to fit the other two variables?
Thanks!
Yes. Use pd.Series.isin:
bodies = {'FDIC', 'Fed', 'Treasury'}
df200['Filing'] = np.where(df200['Regulatory Body'].isin(bodies), 'Yes', 'No')
Alternatively, use pd.Series.map with the Boolean array you receive from pd.Series.isin:
df200['Filing'] = df200['Regulatory Body'].isin(bodies).map({True: 'Yes', False: 'No'})
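Both variants, runnable against a toy stand-in for df200 (the regulator names are from the question; the rows are invented):

```python
import numpy as np
import pandas as pd

df200 = pd.DataFrame({'Regulatory Body': ['FDIC', 'SEC', 'Treasury', 'Fed']})

bodies = {'FDIC', 'Fed', 'Treasury'}

# np.where on the Boolean membership mask
df200['Filing'] = np.where(df200['Regulatory Body'].isin(bodies), 'Yes', 'No')
print(df200['Filing'].tolist())  # ['Yes', 'No', 'Yes', 'Yes']

# Equivalent: map the Boolean mask directly to labels
filing2 = df200['Regulatory Body'].isin(bodies).map({True: 'Yes', False: 'No'})
print(filing2.tolist())  # ['Yes', 'No', 'Yes', 'Yes']
```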
I am trying to fill in the missing values of the county column based on its add_suburb value. I tried the following two snippets, neither of which works:
for index, row in fileco.iterrows():
df.loc[df['add_suburb'].str.contains(str(row['place'])) & ( df['county'].str=='') , 'county'] = str('County '+row['county']).title()
for index, row in fileco.iterrows():
df.loc[df['add_suburb'].str.contains(str(row['place'])) & ( df['county'].str is None) , 'county'] = str('County '+row['county']).title()
But the following code works if I do not check for None or '':
for index, row in fileco.iterrows():
df.loc[df['add_suburb'].str.contains(str(row['place'])) , 'county'] = str('County '+row['county']).title()
What's the correct way to fill in only the missing column values? How should I correct the condition after the & ?
I don't know exactly what you're trying to do in the loop (what are you iterating over?), but I think it should work if you enclose your conditions in parentheses like this:
df.loc[(condition1) & (condition2)] = "replacement"
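A working sketch with hypothetical stand-ins for df and fileco. Note that .str is an accessor, not a value, so df['county'].str == '' and "df['county'].str is None" can never match; test missingness on the Series itself with isna() and an equality check:

```python
import numpy as np
import pandas as pd

# Invented rows: county is missing (NaN) or empty for some suburbs
df = pd.DataFrame({
    'add_suburb': ['Salthill', 'Blackrock', 'Blackrock'],
    'county':     ['County Galway', np.nan, ''],
})
fileco = pd.DataFrame({'place': ['Blackrock'], 'county': ['dublin']})

for _, row in fileco.iterrows():
    # Treat both NaN and the empty string as "missing"
    missing = df['county'].isna() | (df['county'] == '')
    mask = df['add_suburb'].str.contains(str(row['place'])) & missing
    df.loc[mask, 'county'] = str('County ' + row['county']).title()

print(df['county'].tolist())  # ['County Galway', 'County Dublin', 'County Dublin']
```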