Pandas error - "ValueError: labels ['attributes'] not contained in axis" - python

I am extracting data from a Salesforce system and converting it to a DataFrame when I get this error:
ValueError: labels ['attributes'] not contained in axis.
Given below is my Python script:
raw = sf_data_cursor.bulk.Case.query('''SELECT Id, Status, AccountName__c, AccountId FROM Case''')
raw_df = pd.DataFrame(raw).drop('attributes', axis= 1,inplace=False)
Could anyone assist?

Generally, this error occurs when the column you're trying to drop (in this case attributes) doesn't exist in the DataFrame.
Check pd.DataFrame(raw).columns; the output should include the name of the column you're trying to drop.
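A minimal sketch of that check, assuming the query returns a list of dicts. The records below are hypothetical stand-ins for the Salesforce result, which may or may not carry an attributes key. Passing errors="ignore" to drop is one way to make the call tolerate a missing column.

```python
import pandas as pd

# Hypothetical records shaped like a Salesforce query result; the real
# payload may or may not include an 'attributes' key.
raw = [
    {"Id": "001", "Status": "New", "AccountId": "A1"},
    {"Id": "002", "Status": "Closed", "AccountId": "A2"},
]

raw_df = pd.DataFrame(raw)
print(list(raw_df.columns))  # inspect which columns actually exist

# errors="ignore" skips labels that are not present instead of raising.
clean_df = raw_df.drop(columns=["attributes"], errors="ignore")
```

If attributes does show up in the printed list, the original drop call should work unchanged.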

Related

How to resolve warning "Boolean Series key will be reindexed to match DataFrame index" while using a dataframe to create a new dataframe

I am writing code to analyze some data, where I remove rows with a null value in a certain column. It works perfectly fine and I get the desired results, but this warning keeps appearing.
Here's the part of the code that triggers the warning:
original_df = pd.read_csv('titanic.csv')
age_wrangled_df = original_df[pd.notnull(original_df['Age'])]
embark_wrangled_df = age_wrangled_df[pd.notnull(original_df['Embarked'])]
The warning I keep getting is
C:\Users\Dell\Downloads\Codes folium\titanic.py:14: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
embark_wrangled_df = age_wrangled_df[pd.notnull(original_df['Embarked'])]
I have read other answers regarding this warning, but none helped resolve it. What does the warning mean and how can I fix it?
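The warning appears because the boolean mask is built from original_df while it is applied to age_wrangled_df, which has fewer rows, so pandas has to realign the mask to the filtered frame's index. A minimal sketch of two ways to avoid it, using a small hypothetical frame in place of titanic.csv:

```python
import pandas as pd

# Small stand-in for titanic.csv (hypothetical values).
original_df = pd.DataFrame({
    "Age": [22.0, None, 30.0, 41.0],
    "Embarked": ["S", "C", None, "Q"],
})

age_wrangled_df = original_df[original_df["Age"].notna()]
# Build the mask from the frame being filtered so the indexes match.
embark_wrangled_df = age_wrangled_df[age_wrangled_df["Embarked"].notna()]

# Equivalent single step on the original frame.
both_df = original_df.dropna(subset=["Age", "Embarked"])
```

Either form keeps only the rows where both columns are present.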

Concat data frames : Adding a name to data frame column which does not have a name

I am making a data frame by concatenating several data frames. The code is given below.
summary_FR =pd.concat([Chip_Cur_Summary_funct_mode2,Noise_Summary_funct_mode2,VCM_Summary_funct_mode2,Sens_Summary_funct_mode2,Vbias_Summary_funct_mode2,vcm_delta_Summary_funct_mode2,THD_FUN_M2,F_LOW_FUNC_Summary_mode2,OSC_FUNC_Summary_mode2,FOSC_FUNC_Summary_mode2,VREF_CP_FUNC_Summary_mode2,Summary_PSRR_1KHz_funct_mode2,Summary_PSRR_20Hzto20KHz_funct_mode2])
The image of the table is given below. You can see that the first column doesn't have a name. I need to name it Parameter and make it a unique index column.
I tried the code below to set the name to 'Parameter', but it failed.
summary_FR.columns = ["Parameters", "SPEC_MIN", "SPEC_TYP", "SPEC_MAX","min","mean","max","std","Units","Remarks"]
# summary_FR.set_index()
May I know where I went wrong? Can someone please help me?
It would help to share the error message, but the unnamed column is probably the index rather than a regular column. You need to call reset_index on the concatenated dataframe, like
summary_FR =pd.concat([Chip_Cur_,...,Summar]).reset_index()
then you can change the column names.
You can give a name for the index in the following way:
your_dataframe.index.name = 'Parameter'
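Both suggestions can be sketched as follows. The two small frames and their values are hypothetical stand-ins for the summary frames in the question; only summary_FR and the name Parameter come from the original post.

```python
import pandas as pd

# Hypothetical stand-ins for the summary frames being concatenated.
chip_cur = pd.DataFrame({"SPEC_MIN": [1.0], "SPEC_MAX": [2.0]}, index=["Chip_Cur"])
noise = pd.DataFrame({"SPEC_MIN": [0.1], "SPEC_MAX": [0.5]}, index=["Noise"])

summary_FR = pd.concat([chip_cur, noise])

# Option 1: keep the labels as the index and just give it a name.
summary_FR.index.name = "Parameter"

# Option 2: turn the named index into an ordinary column.
as_column = summary_FR.reset_index()
```

Option 1 fits the stated goal of a named index column; option 2 is the route to take if the labels should be renamed together with the other columns.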

reshaping pandas data frame- Unique row error

I have a data frame (link to the data: https://pastebin.com/GzujhX3d).
I am trying to reshape it with pandas, and it keeps giving me the error
"the id variables need to uniquely identify each row".
This is my code to reshape:
GG_long=pd.wide_to_long(data_GG,stubnames='time_',i=['Customer', 'date'], j='Cons')
The combination of 'Customer' and 'Date' is a unique row within my data, so I don't understand why it throws me this error and how I can fix it. Any help is appreciated.
I identified the issue. The error was due to two things: first, the column names had ":" in them, and second, the format of the date. For some reason it doesn't work with dd-mm-yy, but it does work with dd/mm/yy.
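Since the error message concerns uniqueness of the id columns, one way to investigate is to list any rows that share the same (Customer, date) pair before reshaping. This is a sketch on hypothetical data, assuming value columns named time_1 and time_2; the real column names and dates are in the linked paste.

```python
import pandas as pd

# Hypothetical wide data standing in for the linked paste.
data_GG = pd.DataFrame({
    "Customer": ["A", "A", "B"],
    "date": ["01/01/17", "02/01/17", "01/01/17"],
    "time_1": [0.5, 0.7, 0.2],
    "time_2": [0.6, 0.8, 0.3],
})

# Rows whose (Customer, date) pair occurs more than once; should be empty.
dupes = data_GG[data_GG.duplicated(subset=["Customer", "date"], keep=False)]

GG_long = pd.wide_to_long(
    data_GG, stubnames="time_", i=["Customer", "date"], j="Cons"
)
```

If dupes is not empty, those rows are the ones that make the id columns non-unique.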

What's the easiest way to replace categorical columns of data with codes in Pandas?

I have a table of data in .dta format which I have read into python using Pandas. The data is mostly in the categorical data type and I want to replace the columns with numerical data that can be used with machine learning, such as boolean (1/0) or codes. The trouble is that I can't directly replace the data because it won't let me change the categories, unless I add them.
I have tried using pd.get_dummies(), but it keeps returning an error:
TypeError: 'columns' is an invalid keyword argument for this function
print(pd.get_dummies(feature).head(), columns=['smkevr', 'cignow', 'dnnow',
'dnever', 'complst'])
Is there a simple way to replace this data with numerical codes based on the value (for example 'Not applicable' = 0)?
I do it the following way:
df_dumm = pd.get_dummies(feature).head()
df_dumm.columns = ['smkevr', 'cignow', 'dnnow',
'dnever', 'complst']
print (df_dumm.head())
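The TypeError in the question arises because columns= was passed to print rather than to pd.get_dummies. A minimal sketch of both encodings the question asks about, assuming feature is a DataFrame of categorical columns; the values below are hypothetical and only two of the column names are reused.

```python
import pandas as pd

# Hypothetical categorical data; 'smkevr' and 'cignow' are names from the question.
feature = pd.DataFrame({
    "smkevr": pd.Categorical(["Yes", "No", "Not applicable"]),
    "cignow": pd.Categorical(["No", "No", "Yes"]),
})

# One-hot encoding: columns= is an argument of get_dummies, not of print.
dummies = pd.get_dummies(feature, columns=["smkevr", "cignow"])

# Integer codes: each category maps to its position in the category list.
codes = pd.DataFrame({name: feature[name].cat.codes for name in feature.columns})
```

The codes follow the order of each column's categories, so mapping a specific label such as 'Not applicable' to 0 would require putting it first in that order.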

Error occurred when using df.fillna(0)

Very simple code using spark + python:
df = spark.read.option("header","true").csv(file_name)
df = df_abnor_matrix.fillna(0)
but an error occurred:
pyspark.sql.utils.AnalysisException: u'Cannot resolve column name
"cp_com.game.shns.uc" among (ProductVersion, IMEI, FROMTIME, TOTIME,
STATISTICTIME, TimeStamp, label, MD5, cp_com.game.shns.uc,
cp_com.yunchang....
What's wrong with it? cp_com.game.shns.uc is among the list.
Spark does not handle the dot character in column names well here (there is an open issue about it), so you need to replace the dots with underscores before working on the CSV data.
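The renaming step can be sketched as below. sanitize_columns is a hypothetical helper, not part of PySpark; it works on a plain list of names so it can be checked without a Spark session. The comment shows how it would likely be applied to a DataFrame with toDF.

```python
def sanitize_columns(names):
    """Return the column names with dots replaced by underscores."""
    return [name.replace(".", "_") for name in names]


# A few of the column names from the error message above.
cols = ["IMEI", "label", "cp_com.game.shns.uc"]
safe_cols = sanitize_columns(cols)

# With a Spark DataFrame this would be applied before fillna, e.g.:
#   df = df.toDF(*sanitize_columns(df.columns))
#   df = df.fillna(0)
```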
