Value Error while applying conditions to a loop (pandas) - python

while (i< len(df)):
    if (df['ID'][i] == df['ID'][i+1]) & (df['Week_start'] == df['Week_end']):
        if (df['ship'][i] > df['ship'][i+1] ):
            df['radar'][i] =df['radar'][i+1] + df['parked'][i] - df['parked'][i+1]
        else:
            df['radar'][i] =df['radar'][i+1]
    else:
        df['radar'][i] = df['ship'][i]
    i = i+1
I tried to get this code running, but I keep getting an error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What do you recommend? Essentially, I want to fill the radar column based on conditions. I think everything except that part works.

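You are getting the error from this comparison:
df['Week_start'] == df['Week_end']
It compares the two columns in full, so the result is not a single True or False that the if statement can test. Compare individual values by index instead, for example:
df['Week_start'][i]== df['Week_end'][i+1]
Hope this helps!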
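A minimal sketch of the loop with scalar comparisons, for illustration only. It follows the comparison suggested above (Week_start at row i against Week_end at row i+1). The sample data, the function name fill_radar, the default 0..n-1 index, and the choice to treat the last row (which has no next row) like the outer else branch are all assumptions, not something the question specifies.

```python
import pandas as pd

# Made-up sample data; the column names come from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2],
    "Week_start": [5, 5, 7],
    "Week_end": [5, 5, 7],
    "ship": [10, 4, 3],
    "parked": [6, 2, 1],
    "radar": [0, 8, 0],
})

def fill_radar(frame):
    """Hypothetical helper: fill 'radar' row by row using scalar comparisons."""
    out = frame.copy()
    n = len(out)
    for i in range(n):
        has_next = i + 1 < n  # assumption: the last row falls through to the else branch
        if (has_next
                and out.loc[i, "ID"] == out.loc[i + 1, "ID"]
                and out.loc[i, "Week_start"] == out.loc[i + 1, "Week_end"]):
            if out.loc[i, "ship"] > out.loc[i + 1, "ship"]:
                out.loc[i, "radar"] = (out.loc[i + 1, "radar"]
                                       + out.loc[i, "parked"]
                                       - out.loc[i + 1, "parked"])
            else:
                out.loc[i, "radar"] = out.loc[i + 1, "radar"]
        else:
            out.loc[i, "radar"] = out.loc[i, "ship"]
    return out

result = fill_radar(df)
```

Each comparison here yields a single value, so plain `and` works, and writing through `.loc[i, "radar"]` avoids the chained assignment of `df['radar'][i] = ...`.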

Related

Python - Panel data create indicator with if statement

I am trying to create an indicator equal to 1 if my meeting_date variable matches my date variable, and zero otherwise. I am getting an error in my code that consists of the following:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Please let me know what I am doing wrong! Here is my code:
if crsp_12['meeting_date'] == crsp_12['date']:
    crsp_12['i_meeting_date_dayof'] == 1
else:
    crsp_12['i_meeting_date_dayof'] == 0
With pandas, avoid classical if/for constructs wherever you can and use vectorized code instead:
crsp_12['i_meeting_date_dayof'] = crsp_12['meeting_date'].eq(crsp_12['date']).astype(int)
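As a small illustration of what that one-liner does, here is a sketch on made-up data (the dates below are invented for the example; only the column names come from the question). `eq` compares the two columns element by element and `astype(int)` turns True/False into 1/0.

```python
import pandas as pd

# Invented sample rows; only the column names are taken from the question.
crsp_12 = pd.DataFrame({
    "meeting_date": pd.to_datetime(["2020-01-29", "2020-03-18", "2020-04-29"]),
    "date": pd.to_datetime(["2020-01-29", "2020-03-17", "2020-04-29"]),
})

# Element-wise comparison gives a boolean Series; casting to int gives the 1/0 indicator.
crsp_12["i_meeting_date_dayof"] = (
    crsp_12["meeting_date"].eq(crsp_12["date"]).astype(int)
)
```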

Multiple if conditions pandas

I'm looking to write an if statement that does a calculation when three conditions on other columns of a dataframe are all true. I tried the code below, which seems to have worked for others on Stack Overflow, but it raises an error for me. Note that the 'check', 'sqm' and 'sqft' columns are float64.
if ((merge['check'] == 1) & (merge['sqft'] > 0) & (merge['sqm'] == 0)):
    merge['checksqm'] == merge['sqft']/10.7639
#Error below:
ValueError                                Traceback (most recent call last)
<ipython-input-383-e84717fde2c0> in <module>
----> 1 if ((merge['check'] == 1) & (merge['sqft'] > 0) & (merge['sqm'] == 0)):
      2     merge['checksqm'] == merge['sqft']/10.7639
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py in __nonzero__(self)
   1327
   1328     def __nonzero__(self):
-> 1329         raise ValueError(
   1330             f"The truth value of a {type(self).__name__} is ambiguous. "
   1331             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Each condition you wrote evaluates to a Series of boolean values, and combining the three with & also yields a boolean Series. A Python if statement expects a single True or False; it does not evaluate each element of the Series and run its body row by row. Hence the error ValueError: The truth value of a Series is ambiguous.
To solve the problem, write it using pandas syntax, like the following:
mask = (merge['check'] == 1) & (merge['sqft'] > 0) & (merge['sqm'] == 0)
merge.loc[mask, 'checksqm'] = merge['sqft']/10.7639
or, combine in one statement, as follows:
merge.loc[(merge['check'] == 1) & (merge['sqft'] > 0) & (merge['sqm'] == 0), 'checksqm'] = merge['sqft']/10.7639
This way, pandas evaluates the boolean Series and applies the assignment only to the rows where the combined three conditions are True, taking each row's own sqft value. An ordinary Python statement such as if cannot perform this kind of vectorized operation under the hood.
You are using a pd.Series as the condition of the if clause. This Series is a mask of True/False values. If you really need a single boolean for the if, reduce the Series to one with series.any() or series.all().
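To make both answers concrete, here is a short sketch on invented numbers (only the column names and the 10.7639 factor come from the question). It shows the mask-plus-loc assignment, and separately how any() and all() collapse the same mask to a single boolean.

```python
import pandas as pd

# Invented sample rows; column names follow the question.
merge = pd.DataFrame({
    "check": [1.0, 1.0, 0.0],
    "sqft": [107.639, 0.0, 50.0],
    "sqm": [0.0, 0.0, 0.0],
})

# One boolean per row: True only where all three conditions hold.
mask = (merge["check"] == 1) & (merge["sqft"] > 0) & (merge["sqm"] == 0)

# Assign only on the masked rows; the other rows of the new column stay NaN.
merge.loc[mask, "checksqm"] = merge["sqft"] / 10.7639

# Reducing the mask gives a single boolean that an if statement can test.
any_match = bool(mask.any())   # does at least one row match?
all_match = bool(mask.all())   # do all rows match?
```

Note that any() and all() answer a question about the whole column, so they are the right tool only when one yes/no answer is what you want; for a per-row calculation the mask with loc is the better fit.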

'The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'

I am using a simple function in Python:
def liste_bis(data):
    Country_Name = []
    Platform_Category = []
    Platform = []
    Country_Acronym = []
    for m,n in enumerate (np.arange(np.shape(data)[0])):
        if (data["domain1"][m]=='afe'):
            Country_Name.append('France')
            Platform_Category.append('App')
            Platform.append('BDDF-HB App')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afer'):
            Country_Name.append('France')
            Platform_Category.append('Site')
            Platform.append('BDDF-HB Site')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afert'):
            Country_Name.append('France')
            Platform_Category.append('App')
            Platform.append('BDDF-BNP App')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='aferty'):
            Country_Name.append('France')
            Platform_Category.append('Site')
            Platform.append('BDDF-BNP Site')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afertyu'):
            Country_Name.append('Luxembourg')
            Platform_Category.append('App')
            Platform.append('BGL-BNP App')
            Country_Acronym.append('LU')
    dictionnaire = {"Country_Name":Country_Name,"Platform_Category":Platform_Category,"Platform":Platform,"Country_Acronym":Country_Acronym}
    return(dictionnaire)
But I am having some trouble.
When I execute the program, it returns:
'The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'
When I use this function on a DataFrame with just 1 row, it works well.
But when I have more than 1 row, it doesn't work...
Here is an example of the DataFrame I use:
dataframe_example
Could you help me please?
Thank you
Why it works only with a one-element result: When you have a multi-element Series, its "truth value" might be a Series of truth values, or it might be the answer to "are all of these values True", etc. With one row, there is no such ambiguity. So choose one of the explicit methods recommended by the error message (depending on what you are really after), and move on.
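If the goal is simply to translate each domain1 value into the four labels, one alternative is to skip the if/elif chain entirely and use a lookup table. The sketch below is a swapped-in technique, not the approach from the answer above; the names LOOKUP, COLUMNS and liste_ter are hypothetical, and the sample data is invented. It does not depend on the row labels, so it also works when the index contains duplicates (one common way for data["domain1"][m] to return a Series instead of a scalar, though the asker's data is not shown, so that is only a guess).

```python
import pandas as pd

# Lookup table built from the branches of the original function.
LOOKUP = {
    "afe": ("France", "App", "BDDF-HB App", "FR"),
    "afer": ("France", "Site", "BDDF-HB Site", "FR"),
    "afert": ("France", "App", "BDDF-BNP App", "FR"),
    "aferty": ("France", "Site", "BDDF-BNP Site", "FR"),
    "afertyu": ("Luxembourg", "App", "BGL-BNP App", "LU"),
}
COLUMNS = ["Country_Name", "Platform_Category", "Platform", "Country_Acronym"]

def liste_ter(data):
    """Hypothetical replacement: one output row per input row, in input order."""
    table = pd.DataFrame.from_dict(LOOKUP, orient="index", columns=COLUMNS)
    # Pick the table row for each domain value; unknown domains give NaN rows.
    return table.reindex(data["domain1"].tolist()).reset_index(drop=True)

# Invented sample with a duplicated index label.
data = pd.DataFrame({"domain1": ["afe", "afertyu", "afer"]}, index=[0, 0, 1])
out = liste_ter(data)
```

One difference from the original function: rows with an unrecognised domain1 value appear here as NaN rows, whereas the original silently skipped them.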

Python ternary operation on vectors

Could someone help me with the proper format of a Python ternary operation on a vector? I have two temperature dataframes: df_today and df_yesterday. I am trying to calculate a new column for df_today that indicates whether the temperature is warmer than yesterday:
df["warmer_than_yesterday"] = 'yes, warmer' if df["temp_celsius"] > df_yesterday["temp_celsius"] and df["temp_celsius"] > 10 else 'nah, not warmer'
However, I keep getting the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Does anyone know what I might be doing wrong?
Thanks in advance!
First, you can combine your two conditions into one using np.maximum (for conciseness): being greater than both 10 and yesterday's temperature is the same as being greater than the larger of the two. This should also be more performant.
m = df["temp_celsius"] > np.maximum(10, df_yesterday["temp_celsius"])
Now, pass this mask to np.where:
df["warmer_than_yesterday"] = np.where(m, 'yes', 'no')
Or use loc to set only the matching rows:
df["warmer_than_yesterday"] = 'no'
df.loc[m, "warmer_than_yesterday"] = 'yes'
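A runnable sketch of the same idea on invented temperatures (the numbers are made up; both frames are assumed to share the same index so the element-wise comparison lines up). It uses the label strings from the question rather than 'yes'/'no'.

```python
import numpy as np
import pandas as pd

# Invented sample data with matching default indexes.
df = pd.DataFrame({"temp_celsius": [12, 8, 15]})
df_yesterday = pd.DataFrame({"temp_celsius": [11, 5, 20]})

# Warmer than yesterday AND warmer than 10 == warmer than the larger of the two.
m = df["temp_celsius"] > np.maximum(10, df_yesterday["temp_celsius"])

# np.where picks the first label where the mask is True, the second elsewhere.
df["warmer_than_yesterday"] = np.where(m, "yes, warmer", "nah, not warmer")
```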

Dataframe.isin() giving this error: The truth value of a DataFrame is ambiguous

Can you help with this error: what am I doing wrong with the df.isin function?
cursor = con.cursor()
cursor.execute("""SELECT distinct date FROM raw_finmis_online_activation_temp""")
existing_dates = [x[0] for x in cursor.fetchall()]
if df[df['date'].isin(existing_dates)]:
    print "Yes it's in there"
else:
    print "N"
It's giving me this error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
df[df['date'].isin(existing_dates)] returns a DataFrame. Unlike normal Python sequences, a DataFrame does not allow a truth check; pandas raises this error instead, much as numpy.ndarray does for arrays with more than one element.
The solution depends on what you want out of that expression. For example, if you want to check whether there is at least one matching row:
len(df[df['date'].isin(existing_dates)])
or, if you want to check whether all the elements are "truthy" (note that DataFrame.all() reduces column by column and returns a Series, so a second .all() is needed to get a single value):
df[df['date'].isin(existing_dates)].all().all()
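Often the simplest route is to reduce the boolean mask from isin directly instead of building the filtered DataFrame first. A minimal sketch, with an invented list standing in for the dates fetched from the database:

```python
import pandas as pd

# Invented data; existing_dates stands in for the result of the SQL query.
df = pd.DataFrame({"date": ["2021-01-01", "2021-01-02", "2021-01-03"]})
existing_dates = ["2021-01-02", "2021-01-05"]

in_db = df["date"].isin(existing_dates)   # one boolean per row

any_present = bool(in_db.any())   # at least one date already exists
all_present = bool(in_db.all())   # every date already exists
matches = df[in_db]               # the matching rows themselves

if not matches.empty:             # .empty is a plain bool, safe in an if
    message = "Yes it's in there"
else:
    message = "N"
```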
