Python - Panel data: create indicator with if statement

I am trying to create an indicator equal to 1 if my meeting_date variable matches my date variable, and zero otherwise. I am getting an error in my code that consists of the following:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Please let me know what I am doing wrong! Here is my code:
if crsp_12['meeting_date'] == crsp_12['date']:
    crsp_12['i_meeting_date_dayof'] == 1
else:
    crsp_12['i_meeting_date_dayof'] == 0

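An if statement needs a single True/False, but comparing two Series with == gives a Series of booleans, hence the error. With pandas you should avoid classical if/for constructs whenever you can. Use vectorized code: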
crsp_12['i_meeting_date_dayof'] = crsp_12['meeting_date'].eq(crsp_12['date']).astype(int)
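To see the one-liner above in action, here is a minimal sketch on made-up data. The three dates are invented for illustration; only the column names come from the question.

```python
import pandas as pd

# Toy panel: the dates are hypothetical, the column names follow the question.
crsp_12 = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-28", "2020-01-29", "2020-01-30"]),
    "meeting_date": pd.to_datetime(["2020-01-29", "2020-01-29", "2020-01-29"]),
})

# Element-wise comparison gives a boolean Series; astype(int) turns it into 0/1.
crsp_12["i_meeting_date_dayof"] = (
    crsp_12["meeting_date"].eq(crsp_12["date"]).astype(int)
)

print(crsp_12)
```

Only the middle row, where both dates coincide, gets the indicator set to 1.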

Related

Value Error while applying conditions to a loop (pandas)

while (i< len(df)):
    if (df['ID'][i] == df['ID'][i+1]) & (df['Week_start'] == df['Week_end']):
        if (df['ship'][i] > df['ship'][i+1] ):
            df['radar'][i] =df['radar'][i+1] + df['parked'][i] - df['parked'][i+1]
        else:
            df['radar'][i] =df['radar'][i+1]
    else:
        df['radar'][i] = df['ship'][i]
    i = i+1
I tried to get this code running but I keep on getting an error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What do you recommend? Essentially I want to fill up the column radar based on conditions. I think everything but that part works.
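You are getting the error from this part of the condition, which compares two whole columns and so yields a Series rather than a single boolean:
df['Week_start'] == df['Week_end']
Specify an index so that single values are compared, like
df['Week_start'][i]== df['Week_end'][i+1]
Hope this helps!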
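The difference between the two comparisons can be checked directly. This is a small sketch on invented data (only the column names come from the question), and it uses .iloc for positional access, which is my choice rather than something from the answer.

```python
import pandas as pd

# Hypothetical values, just to show the types involved.
df = pd.DataFrame({"Week_start": [1, 8, 15], "Week_end": [7, 14, 21]})

# Comparing whole columns gives a boolean Series, one value per row.
column_cmp = df["Week_start"] == df["Week_end"]

# Comparing two single cells gives one boolean, which an if statement accepts.
i = 0
scalar_cmp = df["Week_start"].iloc[i] == df["Week_end"].iloc[i + 1]

# Using the Series as an if condition reproduces the error from the question.
message = ""
try:
    if column_cmp:
        pass
except ValueError as exc:
    message = str(exc)

print(type(column_cmp).__name__, bool(scalar_cmp), message)
```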

How to include a string being equal to itself shifted as a condition in a function definition?

I'm defining a simple if xxxx return y - else return NaN function. If the record, ['Product'], equals ['Product'] offset by 8 then the if condition is true.
I've tried calling the record and setting it equal to itself offset by 8 using == and .shift(8). ['Product'] is a string and ['Sales'] is an integer.
def Growth (X):
    if X['Product'] == X['Product'].shift(8):
        return (1+ X['Sales'].shift(4)) / (1+ X['Sales'].shift(8) - 1)
    else:
        return 'NaN'
I expect the output to be NaN for the first 8 records, and then to have numbers at record 9, but I receive the error instead.
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Firstly, a general comment from StackOverflow's Truth value of a Series is ambiguous...:
The or and and python statements require truth-values. For pandas these are considered ambiguous so you should use "bitwise" | (or) or & (and) operations.
Secondly, you use == on Series objects, which returns a Series of booleans. The if statement then tries to convert that Series to a single truth value and fails, because this is ambiguous.
Use X['Product'].equals(X['Product'].shift(8)) to get a single boolean. Note that .equals tests whether the two Series match as a whole, so it does not give a row-by-row result.
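If the goal is the row-by-row output described in the question (NaN for the first 8 records, numbers from record 9 on), an element-wise approach may fit better. The following is only a sketch of that idea using Series.where, not the method from the answer above. The function name growth and the sample data are made up, and the formula is copied from the question as written.

```python
import pandas as pd


def growth(df: pd.DataFrame) -> pd.Series:
    """Row-wise version of the question's function (hypothetical rewrite)."""
    # True where the product equals the product 8 rows earlier.
    same_product = df["Product"].eq(df["Product"].shift(8))
    # Formula taken from the question as written.
    ratio = (1 + df["Sales"].shift(4)) / (1 + df["Sales"].shift(8) - 1)
    # Keep the ratio where the condition holds, NaN everywhere else.
    return ratio.where(same_product)


# Invented sample: one product, ten records.
sample = pd.DataFrame({"Product": ["A"] * 10, "Sales": list(range(1, 11))})
sample["Growth"] = growth(sample)
print(sample)
```

Returning a real NaN (rather than the string 'NaN') keeps the column numeric.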

'The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'

I am using a simple function in Python:
def liste_bis(data):
    Country_Name = []
    Platform_Category = []
    Platform = []
    Country_Acronym = []
    for m,n in enumerate (np.arange(np.shape(data)[0])):
        if (data["domain1"][m]=='afe'):
            Country_Name.append('France')
            Platform_Category.append('App')
            Platform.append('BDDF-HB App')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afer'):
            Country_Name.append('France')
            Platform_Category.append('Site')
            Platform.append('BDDF-HB Site')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afert'):
            Country_Name.append('France')
            Platform_Category.append('App')
            Platform.append('BDDF-BNP App')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='aferty'):
            Country_Name.append('France')
            Platform_Category.append('Site')
            Platform.append('BDDF-BNP Site')
            Country_Acronym.append('FR')
        elif (data["domain1"][m]=='afertyu'):
            Country_Name.append('Luxembourg')
            Platform_Category.append('App')
            Platform.append('BGL-BNP App')
            Country_Acronym.append('LU')
    dictionnaire = {"Country_Name":Country_Name,"Platform_Category":Platform_Category,"Platform":Platform,"Country_Acronym":Country_Acronym}
    return(dictionnaire)
But I am having some trouble.
When I run the program, it returns:
'The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'
When I use this function on a DataFrame with just 1 row, it works well.
But when I have more than 1 row, it doesn't work...
Here is an example of the dataframe that I use:
dataframe_example
Could you help me, please?
Thank you
Why it works only with a one-element result: When you have a multi-element Series, its "truth value" might be a Series of truth values, or it might be the answer to "are all of these values True", etc. With one row, there is no such ambiguity. So choose one of the explicit methods recommended by the error message (depending on what you are really after), and move on.
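Another way around the problem is to drop the row-by-row loop, so no if statement ever sees a Series. The sketch below is an alternative, not what the answer above proposes: it puts the branches of the if/elif chain into a lookup table (the name LOOKUP is made up) and matches on the values of domain1. Rows with an unknown domain are skipped, as in the original loop.

```python
import pandas as pd

# Lookup table holding the same values as the question's if/elif branches.
LOOKUP = pd.DataFrame(
    {
        "Country_Name": ["France", "France", "France", "France", "Luxembourg"],
        "Platform_Category": ["App", "Site", "App", "Site", "App"],
        "Platform": [
            "BDDF-HB App",
            "BDDF-HB Site",
            "BDDF-BNP App",
            "BDDF-BNP Site",
            "BGL-BNP App",
        ],
        "Country_Acronym": ["FR", "FR", "FR", "FR", "LU"],
    },
    index=["afe", "afer", "afert", "aferty", "afertyu"],
)


def liste_bis(data):
    # One lookup row per value of domain1; unknown domains become all-NaN rows.
    matched = LOOKUP.reindex(data["domain1"].tolist())
    # Drop the unknown domains, mirroring the loop that appends nothing for them.
    matched = matched.dropna(how="all")
    return matched.to_dict(orient="list")


# Invented input, with a repeated index and one unknown domain.
example = pd.DataFrame({"domain1": ["afe", "afertyu", "zzz", "afe"]}, index=[0, 0, 1, 1])
result = liste_bis(example)
print(result)
```

Because the match is done on the column values, it does not depend on what the DataFrame's index looks like.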

Python ternary operation on vectors

Could someone help me with the proper format of a python ternary operation on a vector? I have two temperature dataframes: df_today and df_yesterday. I am trying to calculate a new column for df_today to determine whether the temperature is warmer than yesterday:
df["warmer_than_yesterday"] = 'yes, warmer' if df["temp_celsius"] > df_yesterday["temp_celsius"] and df["temp_celsius"] > 10 else 'nah, not warmer'
However, I keep getting the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Does anyone know what I might be doing wrong?
Thanks in advance!
First, you can combine your two conditions into one using np.maximum (for conciseness). It should also be more performant.
m = df["temp_celsius"] > np.maximum(10, df_yesterday["temp_celsius"])
Now, pass this mask to np.where:
df["warmer_than_yesterday"] = np.where(m, 'yes', 'no')
Or pass it to loc to set slices:
df["warmer_than_yesterday"] = 'no'
df.loc[m, "warmer_than_yesterday"] = 'yes'
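For reference, here is a self-contained sketch of the same idea on invented temperatures. It also builds the mask with & (the element-wise counterpart of the and in the question) so the two forms can be compared. The labels are the ones from the question.

```python
import numpy as np
import pandas as pd

# Made-up readings; only the column name comes from the question.
df = pd.DataFrame({"temp_celsius": [12.0, 8.0, 15.0, 11.0]})
df_yesterday = pd.DataFrame({"temp_celsius": [10.0, 5.0, 16.0, 9.0]})

# Element-wise "and": each condition in parentheses, joined with &.
mask_and = (df["temp_celsius"] > df_yesterday["temp_celsius"]) & (df["temp_celsius"] > 10)

# Same condition written with np.maximum, as in the answer.
mask_max = df["temp_celsius"] > np.maximum(10, df_yesterday["temp_celsius"])

# np.where picks the first label where the mask is True, the second otherwise.
df["warmer_than_yesterday"] = np.where(mask_and, "yes, warmer", "nah, not warmer")
print(df)
```

The parentheses around each comparison matter, because & binds more tightly than > in Python.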

Dataframe.isin() giving this error: The truth value of a DataFrame is ambiguous

Can you help with this error: what am I doing wrong with the df.isin function?
cursor = con.cursor()
cursor.execute("""SELECT distinct date FROM raw_finmis_online_activation_temp""")
existing_dates = [x[0] for x in cursor.fetchall()]
if df[df['date'].isin(existing_dates)]:
    print "Yes it's in there"
else:
    print "N"
It's giving me this error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
df[df['date'].isin(existing_dates)] returns a dataframe. Unlike normal sequences, DataFrames follow numpy.ndarray in refusing a plain truth check (NumPy allows one only for an array of length 1, which is weird).
The solution depends on what you want out of that expression ... e.g. if you want to check if there is at least one element:
len(df[df['date'].isin(existing_dates)])
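or if you want to check if all the elements are "truthy" (DataFrame.all() returns one boolean per column, so a second .all() is needed to get a single value):
df[df['date'].isin(existing_dates)].all().all()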
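If the question is simply whether any (or every) date already exists, the boolean mask from isin can be reduced directly, without building the filtered DataFrame first. A minimal sketch, with a hard-coded list standing in for the result of the SQL query:

```python
import pandas as pd

# Invented data; existing_dates stands in for the values fetched by the cursor.
df = pd.DataFrame({"date": ["2020-01-01", "2020-01-02", "2020-01-03"]})
existing_dates = ["2020-01-02", "2020-02-01"]

# Boolean Series: True for each row whose date is in existing_dates.
mask = df["date"].isin(existing_dates)

# Reduce the mask to one plain boolean before using it in an if statement.
any_present = bool(mask.any())
all_present = bool(mask.all())

if any_present:
    print("Yes it's in there")
else:
    print("N")
```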
