Comparison within a Dataframe

I have a list of data from a CSV file like this:
I wish to find a list of all members whose values lie within an interval. For example, from the attached dataset, I want to find all warriors whose powerlevels lie between 675000 and 750000.
In the following code, the operators 'and', 'or', '&' and '|' do not work and raise a ValueError.
strong = df[['name', 'attack', 'defense', 'HP','armour','powerlevel']][df.powerlevel > 675000 & df.powerlevel < 750000]
print(strong)
I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I get around this issue without creating a different dataframe each time?

You can use loc. The parentheses around each comparison are required, because & binds more tightly than > and <:
strong = df.loc[(df.powerlevel > 675000) & (df.powerlevel < 750000)]
strong = strong[['name', 'attack', 'defense', 'HP','armour','powerlevel']]
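To see why the parentheses matter: without them Python parses the condition as df.powerlevel > (675000 & df.powerlevel) < 750000, a chained comparison that implicitly calls bool() on a Series. Below is a minimal sketch with made-up data (the names and power levels are hypothetical, not taken from the asker's CSV). It also shows Series.between as an alternative spelling of the same filter; the inclusive="neither" argument assumes pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's CSV.
df = pd.DataFrame({
    "name": ["Ares", "Brienne", "Conan", "Diana"],
    "powerlevel": [600000, 700000, 749999, 800000],
})

# Parenthesize each comparison so that & combines two boolean Series.
mask = (df.powerlevel > 675000) & (df.powerlevel < 750000)
strong = df.loc[mask, ["name", "powerlevel"]]

# Same filter via Series.between, with both bounds excluded.
strong_between = df.loc[
    df.powerlevel.between(675000, 750000, inclusive="neither"),
    ["name", "powerlevel"],
]

print(strong)
```

Selecting rows and columns in a single loc call also avoids the intermediate dataframe.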

Related

Can you use Dataframe Values as Matplotlib Ylabel?

I want to use a value from a specific column in my Pandas dataframe as the Y-axis label. The reason for this is that the label could change depending on the Unit of Measure (UoM) - it could be kg, number of bags etc.
#create function using plant and material input to chart planned and actual manufactured quantities
def filter_df(df, plant: str = "", material: str = ""):
    output_df = df.loc[(df['Plant'] == plant) & (df['Material'].str.contains(material))].reset_index()
    return output_df['Planned_Qty_Cumsum'].plot.area (label = 'Planned Quantity'),\
        output_df['Goods_Receipted_Qty_Cumsum'].plot.line(label = 'Delivered Quantity'),\
        plt.title('Planned and Delivered Quantities'),\
        plt.legend(),\
        plt.xlabel('Number of Process Orders'),\
        plt.ylabel(output_df['UoM (of GR)']),\
        plt.show()
#run function
filter_df(df_yield_data_formatted,'*plant*','*material*')
When running the function I get the following error message:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Yes, you can. The problem is that output_df['UoM (of GR)'] passes the entire column (a Series) to plt.ylabel, whereas the label must be a single value. Indicate which row and column you want for the label, for instance with iloc, and it will work:
plt.ylabel(df.iloc[2,1])
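As a rough sketch of pulling one scalar out of the unit column, the helper below returns the unit only when the filtered frame contains exactly one distinct value. The helper name unit_label and the sample frame are hypothetical; only the column name 'UoM (of GR)' comes from the question.

```python
import pandas as pd

# Hypothetical filtered frame; the column names follow the question.
output_df = pd.DataFrame({
    "Planned_Qty_Cumsum": [10, 25, 40],
    "UoM (of GR)": ["kg", "kg", "kg"],
})


def unit_label(frame, column="UoM (of GR)"):
    """Return the single unit found in `column`, or raise if units are mixed."""
    units = frame[column].dropna().unique()
    if len(units) != 1:
        raise ValueError(f"expected exactly one unit, found {list(units)}")
    return str(units[0])


y_label = unit_label(output_df)
print(y_label)
# y_label is a plain str, so it can be passed on as plt.ylabel(y_label).
```

Checking for a single distinct unit guards against silently labelling an axis that mixes, say, kg and bags.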

Python - Panel data create indicator with if statement

I am trying to create an indicator equal to 1 if my meeting_date variable matches my date variable, and zero otherwise. I am getting an error in my code that consists of the following:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Please let me know what I am doing wrong! Here is my code:
if crsp_12['meeting_date'] == crsp_12['date']:
    crsp_12['i_meeting_date_dayof'] == 1
else:
    crsp_12['i_meeting_date_dayof'] == 0

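You should avoid classical if/for constructs with pandas whenever a vectorized operation exists. The if statement fails because the comparison produces a whole Series of booleans rather than a single one (and the branches use ==, a comparison, where an assignment = was intended). Use vectorized code: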
crsp_12['i_meeting_date_dayof'] = crsp_12['meeting_date'].eq(crsp_12['date']).astype(int)
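A small runnable illustration of that one-liner, using made-up dates (the sample values are assumptions; the column names are the ones from the question):

```python
import pandas as pd

# Hypothetical panel rows: one meeting date, three trading dates.
crsp_12 = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
    "meeting_date": pd.to_datetime(["2020-01-03", "2020-01-03", "2020-01-03"]),
})

# Element-wise equality gives a boolean Series; astype(int) turns it into 0/1.
crsp_12["i_meeting_date_dayof"] = (
    crsp_12["meeting_date"].eq(crsp_12["date"]).astype(int)
)

print(crsp_12)
```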

'The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'

I am using a simple function in Python:
def liste_bis(data):
    Country_Name = []
    Platform_Category = []
    Platform = []
    Country_Acronym = []
    for m,n in enumerate (np.arange(np.shape(data)[0])):
        if (data["domain1"][m]=='afe'):
            Country_Name.append('France')
            Platform_Category.append('App')
            Platform.append('BDDF-HB App')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afer'):
            Country_Name.append('France')
            Platform_Category.append('Site')
            Platform.append('BDDF-HB Site')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afert'):
            Country_Name.append('France')
            Platform_Category.append('App')
            Platform.append('BDDF-BNP App')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='aferty'):
            Country_Name.append('France')
            Platform_Category.append('Site')
            Platform.append('BDDF-BNP Site')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afertyu'):
            Country_Name.append('Luxembourg')
            Platform_Category.append('App')
            Platform.append('BGL-BNP App')
            Country_Acronym.append('LU')
    dictionnaire = {"Country_Name":Country_Name,"Platform_Category":Platform_Category,"Platform":Platform,"Country_Acronym":Country_Acronym}
    return(dictionnaire)
But I have run into trouble.
When I execute the program, it returns:
'The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'
When I use this function on a DataFrame with just 1 row, it works well.
But when I have more than 1 row, it doesn't work.
Here is an example of the dataframe that I use:
dataframe_example
Could you help me please?
Thank you

Why it works only with a one-element result: When you have a multi-element Series, its "truth value" might be a Series of truth values, or it might be the answer to "are all of these values True", etc. With one row, there is no such ambiguity. So choose one of the explicit methods recommended by the error message (depending on what you are really after), and move on.
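One way to sidestep the per-row if/elif chain entirely is a lookup table. The sketch below is a hypothetical variant (the name liste_ter is not from the question) that transcribes the branches of liste_bis into a dictionary and selects rows from it. It also demonstrates one plausible trigger for the error, stated here as an assumption because the example frame is not shown: if the index contains repeated labels, data["domain1"][m] returns a Series rather than a scalar, and comparing that inside if is ambiguous.

```python
import pandas as pd

COLUMNS = ["Country_Name", "Platform_Category", "Platform", "Country_Acronym"]

# Lookup table transcribed from the branches of liste_bis in the question.
PLATFORMS = {
    "afe": ("France", "App", "BDDF-HB App", "FR"),
    "afer": ("France", "Site", "BDDF-HB Site", "FR"),
    "afert": ("France", "App", "BDDF-BNP App", "FR"),
    "aferty": ("France", "Site", "BDDF-BNP Site", "FR"),
    "afertyu": ("Luxembourg", "App", "BGL-BNP App", "LU"),
}


def liste_ter(data):
    """Hypothetical loop-free variant of liste_bis; returns a dict of lists."""
    lookup = pd.DataFrame.from_dict(PLATFORMS, orient="index", columns=COLUMNS)
    domains = data["domain1"]
    # Unknown domains are skipped, as in the original if/elif chain.
    known = domains[domains.isin(lookup.index).to_numpy()]
    return lookup.loc[known.to_numpy()].to_dict(orient="list")


# Sample frame with a deliberately non-unique index (an assumed scenario).
data = pd.DataFrame(
    {"domain1": ["afe", "afertyu", "unknown", "afer"]},
    index=[0, 0, 1, 1],
)

# With repeated labels, a label lookup yields a Series, not a scalar.
print(type(data["domain1"][0]))

result = liste_ter(data)
print(result)
```

Because the lookup never indexes by row label, it behaves the same whether or not the index is unique.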

Python ternary operation on vectors

Could someone help me with the proper format of a Python ternary operation on a vector? I have two temperature dataframes: df_today and df_yesterday. I am trying to calculate a new column for df_today to determine whether the temperature is warmer than yesterday:
df["warmer_than_yesterday"] = 'yes, warmer' if df["temp_celsius"] > df_yesterday["temp_celsius"] and df["temp_celsius"] > 10 else 'nah, not warmer'
However, I keep getting the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Does anyone know what I might be doing wrong?
Thanks in advance!

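First, you can combine your two conditions into one using np.maximum: being warmer than both yesterday's value and 10 is the same as being warmer than the larger of the two. This is more concise and should also be faster.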
m = df["temp_celsius"] > np.maximum(10, df_yesterday["temp_celsius"])
Now, pass this mask to np.where,
df["warmer_than_yesterday"] = np.where(m, 'yes', 'no')
Or pass the mask to loc to set slices:
df["warmer_than_yesterday"] = 'no'
df.loc[m, "warmer_than_yesterday"] = 'yes'
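As a quick check that the condensed mask matches the two original conditions, here is a sketch on made-up temperatures (the sample values are assumptions, and both frames are assumed to share the same index):

```python
import numpy as np
import pandas as pd

# Hypothetical temperatures; both frames share the same default index.
df = pd.DataFrame({"temp_celsius": [12.0, 8.0, 15.0, 11.0]})
df_yesterday = pd.DataFrame({"temp_celsius": [11.0, 5.0, 16.0, 9.0]})

# The two original conditions, combined element-wise with & instead of `and`.
explicit = (df["temp_celsius"] > df_yesterday["temp_celsius"]) & (
    df["temp_celsius"] > 10
)

# The condensed form: warmer than the larger of 10 and yesterday's value.
m = df["temp_celsius"] > np.maximum(10, df_yesterday["temp_celsius"])

df["warmer_than_yesterday"] = np.where(m, "yes", "no")
print(df)
```

If the two frames are not aligned on the same index, the comparison would need an explicit alignment step first.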

Dataframe.isin() giving this error: The truth value of a DataFrame is ambiguous

Can you help with this error: what am I doing wrong with the df.isin function?
cursor = con.cursor()
cursor.execute("""SELECT distinct date FROM raw_finmis_online_activation_temp""")
existing_dates = [x[0] for x in cursor.fetchall()]
if df[df['date'].isin(existing_dates)]:
    print("Yes it's in there")
else:
    print("N")
It's giving me this error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

df[df['date'].isin(existing_dates)] returns a DataFrame. Unlike normal Python sequences, DataFrames follow the numpy.ndarray convention for truthiness and do not allow a truth check on the object as a whole (NumPy itself makes an exception for single-element arrays, which is weird).
The solution depends on what you want out of that expression. For example, if you want to check whether at least one row matches:
len(df[df['date'].isin(existing_dates)]) > 0
or, if you want to check whether every date is in the list:
df['date'].isin(existing_dates).all()
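A minimal sketch of both checks, reduced to plain booleans that are safe to use in an if statement. The sample dates and the contents of existing_dates are hypothetical stand-ins for the database query in the question.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2024-01-01", "2024-01-02", "2024-01-03"]})
existing_dates = ["2024-01-02", "2024-01-05"]  # hypothetical query result

# isin returns a boolean Series with one entry per row.
in_db = df["date"].isin(existing_dates)

any_present = bool(in_db.any())  # at least one date already exists
all_present = bool(in_db.all())  # every date already exists

if any_present:
    print("Yes it's in there")
else:
    print("N")
```

Reducing the boolean mask directly avoids building the filtered DataFrame just to test whether it is empty.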
