Pandas - Selecting multiple dataframe criteria - python

I have a DataFrame with multiple columns and I need to set the criteria to access specific values from two different columns. I'm able to do it successfully on one column as shown here:
status_filter = df[df['STATUS'] == 'Complete']
But I'm struggling to specify values from two columns. I've tried something like this but get errors:
status_filter = df[df['STATUS'] == 'Complete' and df['READY TO INVOICE'] == 'No']
It may be a simple answer, but any help is appreciated.

Your code has two small errors: 1) each criterion needs its own parentheses, and 2) the criteria must be joined with the ampersand ('&') rather than 'and':
status_filter = df[(df['STATUS'] == 'Complete') & (df['READY TO INVOICE'] == 'No')]
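As a minimal sketch of the corrected filter, here it is on made-up data (the AMOUNT column and all row values are assumptions for illustration; only the two column names come from the question). Naming each mask first is one way to keep the expression readable, since a named mask needs no extra parentheses when combined:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "STATUS": ["Complete", "Complete", "Pending", "Complete"],
    "READY TO INVOICE": ["No", "Yes", "No", "No"],
    "AMOUNT": [100, 250, 75, 40],
})

# Each comparison yields a boolean Series; & combines them row by row.
is_complete = df["STATUS"] == "Complete"
not_ready = df["READY TO INVOICE"] == "No"

status_filter = df[is_complete & not_ready]
print(status_filter)
```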

status_filter = df.loc[(df['STATUS'] == 'Complete') & (df['READY TO INVOICE'] == 'No')]
You're welcome.

You can use:
status_filter = df[(df['STATUS'] == 'Complete') & (df['READY TO INVOICE'] == 'No')]

Related

comparing two columns of a row in python dataframe

I know that one can compare a whole column of a dataframe against a value and make a list of all rows that contain that value with:
values = parsedData[parsedData['column'] == valueToCompare]
But is there a possibility to make a list out of all rows, by comparing two columns with values like:
values = parsedData[parsedData['column01'] == valueToCompare01 and parsedData['column02'] == valueToCompare02]
Thank you!
It is possible, but not with 'and': Python's 'and' tries to reduce each whole Series to a single True/False and raises a ValueError. Use the element-wise '&' operator instead, and wrap each comparison in ( ), because '&' binds more tightly than '==':
values = parsedData[(parsedData['column01'] == valueToCompare01) & (parsedData['column02'] == valueToCompare02)]
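The difference between the two operators can be seen in a short sketch. The data below is hypothetical and merely stands in for parsedData:

```python
import pandas as pd

# Hypothetical data standing in for parsedData.
parsedData = pd.DataFrame({
    "column01": [1, 1, 2],
    "column02": ["a", "b", "a"],
})

mask01 = parsedData["column01"] == 1
mask02 = parsedData["column02"] == "a"

# `and` asks Python for one truth value of a whole Series,
# which pandas refuses to guess, so it raises ValueError.
try:
    parsedData[mask01 and mask02]
    raised = False
except ValueError as exc:
    raised = True
    print("and failed:", exc)

# `&` compares the two masks row by row instead.
values = parsedData[mask01 & mask02]
print(values)
```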

Access different values in one data frame column?

df is a DataFrame loaded from a CSV file that contains different stats.
player_name,player_id,season,season_type,team
Giannis Antetokounmpo,antetgi01,2020,PO,MIL
I have tried this:
print(df.loc[(df["team"] == "LAL") & (df["team"] == "LAC") & (df["season_type"] == "
I am trying to access the "team" column and filter elements that also meet the "season_type" requirement, however there is no output.
What works currently:
print(df.loc[(df["team"] == "LAL") & (df["season_type"] == "PO")])
When I do this I am able to get the correct output but for only one specific team.
My question is how can I perform this on multiple names?
Good question, this should work for you:
team_list = ["LAL", "LAC"]
df = df[df.team.isin(team_list) & (df.season_type == 'PO')]
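A small sketch of why the first attempt returned nothing and how isin changes that. The rows are invented, and "RS" is an assumed second season_type value used only to have something to filter out:

```python
import pandas as pd

# Hypothetical rows in the shape of the CSV from the question.
df = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C", "Player D"],
    "team": ["LAL", "LAC", "MIL", "LAL"],
    "season_type": ["PO", "PO", "PO", "RS"],
})

team_list = ["LAL", "LAC"]

# A single row can never be both LAL and LAC, so joining the two
# equality tests with & matches no rows at all.
both_teams = df[(df["team"] == "LAL") & (df["team"] == "LAC")]

# isin() asks "is the team any of these?"; the second comparison is
# parenthesised because & binds more tightly than ==.
playoffs = df[df["team"].isin(team_list) & (df["season_type"] == "PO")]
print(playoffs)
```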

Python: Define subgroup of data with multiple conditions

I have a table with several dummy variables
I would now like to create a subgroup listing the winpercent values of the rows where fruity=1 and hard=0. My first attempt was this one, but it was unsuccessful:
df6=full_data[full_data['fruity'&'hard']==['1'&'0'].iloc[:,-1]
Can anyone help, please?
Please write the conditions one by one separated by the '&' operator:
full_data.loc[(full_data['fruity'] == 1) &
(full_data['hard'] == 0), 'winpercent']
You can also query it:
full_data.query("fruity == 1 and hard == 0", inplace=False)['winpercent']
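The two forms should select the same rows. A quick sketch on invented numbers (only the column names are taken from the question) to check that:

```python
import pandas as pd

# Hypothetical data; fruity, hard and winpercent are the question's columns.
full_data = pd.DataFrame({
    "fruity": [1, 1, 0, 1],
    "hard": [0, 1, 0, 0],
    "winpercent": [52.3, 39.0, 66.9, 41.4],
})

# Boolean masks combined with &, then the column picked in the same .loc call.
via_loc = full_data.loc[
    (full_data["fruity"] == 1) & (full_data["hard"] == 0), "winpercent"
]

# The query string uses the word `and`; pandas parses it itself.
via_query = full_data.query("fruity == 1 and hard == 0")["winpercent"]

print(via_loc.tolist())
```

Note that inside a query string 'and' is fine, because pandas evaluates the expression rather than Python's boolean operator.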

Multiple Conditions in 1 Variable

I want to put multiple conditions into one variable so that I can determine the value I will insert into my column 'EmptyCol'. Please see below. Note: this works with one condition, but I believe I'm missing something with multiple conditions.
Condition = ((df['status']=='Live') and
(df['name'].str.startswith('A') and
(df['true']==1))
df.loc[Condition, 'EmptyCol'] = 'True'
Use "&" instead of "and":
Condition = ((df['status']=='Live') &
(df['name'].str.startswith('A')) &
(df['true']==1))
I also recommend using df.at; I have sometimes had trouble with df.loc.
Condition = ((df['status']=='Live') &
             (df['name'].str.startswith('A')) &
             (df['true']==1))

def ChangeValueFunc(Record):
    # After reset_index(), Record['index'] holds the original row label
    df.at[Record['index'], 'EmptyCol'] = 'True'

df.loc[Condition, :].reset_index().apply(ChangeValueFunc, axis=1)
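For comparison, a sketch of the plain df.loc assignment the question started from, once the mask is well-formed. The sample rows and the empty-string starting value of EmptyCol are assumptions:

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "status": ["Live", "Live", "Closed", "Live"],
    "name": ["Alpha", "Beta", "Avery", "Apex"],
    "true": [1, 1, 1, 0],
})
df["EmptyCol"] = ""  # assumed starting value

# str.startswith already returns a boolean Series, so it can be
# combined with & directly; the == comparisons need parentheses.
Condition = (
    (df["status"] == "Live")
    & df["name"].str.startswith("A")
    & (df["true"] == 1)
)

# Set the value only on rows where all three conditions hold.
df.loc[Condition, "EmptyCol"] = "True"
print(df)
```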

python pandas assignment of missing value as an copy

I'm trying to fill missing values with the mean of a group of products in my dataset (eventually I want to iterate over each category and fill in its missing data):
df.loc[df.iCode == 160610,'oPrice'].fillna(value=df[df.iCode == 160610].oPrice.mean(), inplace=True)
It's not working (maybe the selection is being treated as a copy).
Thanks
df.loc[(df.iCode == 160610) & (df.oPrice.isna()),'oPrice'] = df.loc[df.iCode == 160610].oPrice.mean()
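Since the question mentions eventually doing this for every category, one possible sketch avoids the loop by using groupby with transform. This is an alternative to the single-code assignment above, not something taken from the answer, and the prices and the second iCode value are made up:

```python
import numpy as np
import pandas as pd

# Hypothetical data; iCode and oPrice are the question's column names.
df = pd.DataFrame({
    "iCode": [160610, 160610, 160610, 200000, 200000],
    "oPrice": [10.0, np.nan, 20.0, 5.0, np.nan],
})

# Mean of each iCode group (NaN skipped), aligned to the original rows.
group_mean = df.groupby("iCode")["oPrice"].transform("mean")

# Assign the result back instead of calling fillna(inplace=True) on a
# selection, which only modifies a temporary object.
df["oPrice"] = df["oPrice"].fillna(group_mean)
print(df)
```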
