I have a DataFrame with multiple columns and I need to set the criteria to access specific values from two different columns. I'm able to do it successfully on one column as shown here:
status_filter = df[df['STATUS'] == 'Complete']
But I'm struggling to specify values from two columns. I've tried something like this but get errors:
status_filter = df[df['STATUS'] == 'Complete' and df['READY TO INVOICE'] == 'No']
It may be a simple answer, but any help is appreciated.
Your code has two very small errors: 1) need parentheses for two or more criteria and 2) you need to use the ampersand between your criteria:
status_filter = df[(df['STATUS'] == 'Complete') & (df['READY TO INVOICE'] == 'No')]
status_filter = df.ix[(df['STATUS'] == 'Complete') & (df['READY TO INVOICE'] == 'No'),]
ur welcome
you can use:
status_filter = df[(df['STATUS'] == 'Complete') & (df['READY TO INVOICE'] == 'No')]
Related
I know that one can compare a whole column of a dataframe and making a list out of all rows that contain a certain value with:
values = parsedData[parsedData['column'] == valueToCompare]
But is there a possibility to make a list out of all rows, by comparing two columns with values like:
values = parsedData[parsedData['column01'] == valueToCompare01 and parsedData['column02'] == valueToCompare02]
Thank you!
It is completely possible, but I have never tried using and in order to mask the dataframe, rather using & would be of interest in this case. Note that, if you want your code to be more clear, use ( ) in each statement:
values = parsedData[(parsedData['column01'] == valueToCompare01) & (parsedData['column02'] == valueToCompare02)]
Df is a loaded in csv file that contains different stats.
player_name,player_id,season,season_type,team
Giannis Antetokounmpo,antetgi01,2020,PO,MIL
I have tried this:
print(df.loc[(df["team"] == "LAL") & (df["team"] == "LAC") & (df["season_type"] == "
I am trying to access the "team" column and filter elements that also meet the "season_type" requirement, however there is no output.
What works currently:
print(df.loc[(df["team"] == "LAL") & (df["season_type"] == "PO")])
When I do this I am able to get the correct output but for only one specific team.
My question is how can I perform this on multiple names?
Good question, this should work for you:
team_list = ["LAL", "LAC"]
df = df[df.team.isin(team_list) & df.season_type == 'PO']
I have a table with several dummy variables
I would now like to create a subgroup where I list the winpercent values of those rows where fruity=1 and hard=0. My first attempt was this one but it was unsuccesful:
df6=full_data[full_data['fruity'&'hard']==['1'&'0'].iloc[:,-1]
Can anyone help, please?
Please write the conditions one by one separated by the '&' operator:
full_data.loc[(full_data['fruity'] == 1) &
(full_data['hard'] == 0), 'winpercent']
You can also query it:
full_data.query("fruity == 1 and hard == 0", inplace=False)['winpercent']
I'm wanting to put multiple conditions into one variable so that I can determine the value I will insert into my column 'EmptyCol'. Please see below. Note: This works with one condition but I believe I'm missing something with multiple conditions
Condition = ((df['status']=='Live') and
(df['name'].str.startswith('A') and
(df['true']==1))
df.loc[Condition, 'EmptyCol'] = 'True'
Use "&" instead of "and"
Condition = ((df['status']=='Live') &
(df['name'].str.startswith('A') &
(df['true']==1))
also I recomend to use df.at
I got some truble with df.loc sometime !
Condition = ((df['status']=='Live') &
(df['name'].str.startswith('A') &
(df['true']==1))
def ChangeValueFunc(Record):
df.at[Record['index'],'EmptyCol'] = 'True'
df_2.loc[Condition ,:].reset_index().apply(lambda x:ChangeValueFunc(x) , axis = 1)
I'm trying to set the mean value of group of products in my dataset (wants to iterate each category and fill the missing data eventually)
df.loc[df.iCode == 160610,'oPrice'].fillna(value=df[df.iCode == 160610].oPrice.mean(), inplace=True)
it's not working (maybe treating it like a copy)
Thanks
df.loc[(df.iCode == 160610) & (df.oPrice.isna()),'oPrice'] = df.loc[df.iCode == 160610].oPrice.mean()