AttributeError: 'Series' object has no attribute 'Mean_μg_L' - python

Why am I getting this error if the column name exists? I have tried everything and I am out of ideas.

Since the AttributeError is raised at the first column whose name contains the special character µ, I would suggest two solutions:
Use str.replace right before the loop to get rid of this special character:
df.columns = df.columns.str.replace(r"_\wg_", "_ug_", regex=True)
# change df to Table_1_1a_Tarawa_Terrace_System_1975_to_1985
Then, inside the loop, use row.Mean_ug_L, etc. instead of row.Mean_µg_L.
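For example, after the rename the loop could read like this (a sketch only; cursor and SQL_insert_Tarawa are assumed to exist as in your code, and the column spellings are taken verbatim from the question):
for index, row in Table_1_1a_Tarawa_Terrace_System_1975_to_1985.iterrows():
    # plain-ASCII attribute access works once µ has been replaced
    SQL_VALUES_Tarawa = (row.Chemicals, row.Contamminant, row.Mean_ug_L,
                         row.Median_ug_L, row.Range_ug_L,
                         row.Num_Months_Greater_MCL,
                         row.Num_Months_Greater_100_ug_L)
    cursor.execute(SQL_insert_Tarawa, SQL_VALUES_Tarawa)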
Or use row["col_name"] (highly recommended) to refer to the column rather than row.col_name:
for index, row in Table_1_1a_Tarawa_Terrace_System_1975_to_1985.iterrows():
    SQL_VALUES_Tarawa = (row["Chemicals"], row["Contamminant"], row["Mean_µg_L"], row["Median_µg_L"], row["Range_µg_L"], row["Num_Months_Greater_MCL"], row["Num_Months_Greater_100_µg_L"])
    cursor.execute(SQL_insert_Tarawa, SQL_VALUES_Tarawa)
    counting = cursor.rowcount
    print(counting, "Record added")
conn.commit()

Related

If Else, column with several string values to match on

The function will have many more conditional statements, but to start off, and where I've troubleshot to, I get the error: 'str' object has no attribute 'isin'. I've tried several things to no avail.
def categorise(row):
    if (row['state'] == 'FL') & (row['city'].isin(['MIAMI', 'TALLAHASSEE', 'ORLANDO'])):
        return 1
    ...
df['colF'] = df.apply(lambda row: categorise(row), axis=1)
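For context: inside df.apply(..., axis=1) each row is a Series, so row['city'] is a plain Python string, and strings have no .isin method. A minimal sketch of one way around it, keeping the column names from the question and using the in operator instead:
def categorise(row):
    # row['city'] is a plain str here, so test membership with `in`
    if row['state'] == 'FL' and row['city'] in ('MIAMI', 'TALLAHASSEE', 'ORLANDO'):
        return 1
    return 0
df['colF'] = df.apply(categorise, axis=1)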

Can't change column name from 'NaN' to something else

I have tried converting the column names to strings and then renaming them - no success, it left them NaN:
data.rename(columns=str).rename(columns={'NaN': 'Tip Analiza', 'NaN': 'Limite'}, inplace=True)
I tried to use the in operator to replace NaN - no success - it gave an error:
TypeError: argument of type 'float' is not iterable
data.columns = pd.Series([np.nan if 'Unnamed:' in x else x for x in data.columns.values]).ffill().values.flatten()
What should I try?
Try:
data.columns = map(str, data.columns)
# in case of unique column names
data = data.rename(columns={"col1": "rnm1", "col2": "rnm2"})
# otherwise ignore the line above and just do
data.columns = ["rnm1", "rnm2"]
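A self-contained sketch of that idea with made-up data (note that str(np.nan) is the string 'nan', and because the converted labels are duplicates, a positional rename is the safe route):
import numpy as np
import pandas as pd
# hypothetical frame whose column labels are NaN
data = pd.DataFrame([[1, 2], [3, 4]], columns=[np.nan, np.nan])
data.columns = map(str, data.columns)      # labels become the string 'nan'
data.columns = ["Tip Analiza", "Limite"]   # positional rename
print(data.columns.tolist())               # ['Tip Analiza', 'Limite']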

How to change all columns in csv file to str?

I am working on a script that imports an Excel file, iterates through a column called "Title," and returns False if a certain keyword is present in "Title." The script runs until I get to the part where I want to export another csv file that gives me a separate column. My error is as follows: AttributeError: 'int' object has no attribute 'lower'
Based on this error, I changed df.Title to a string using df['Title'].astype(str), but I get the same error.
import pandas as pd
data = pd.read_excel(r'C:/Users/Downloads/61_MONDAY_PROCESS_9.16.19.xlsx')
df = pd.DataFrame(data, columns=['Date Added', 'Track Item', 'Retailer Item ID', 'UPC', 'Title',
                                 'Manufacturer', 'Brand', 'Client Product Group', 'Category',
                                 'Subcategory', 'Amazon Sub Category', 'Segment', 'Platform'])
df['Title'].astype(str)
df['Retailer Item ID'].astype(str)
excludes = ['chainsaw', 'pail', 'leaf blower', 'HYOUJIN', 'brush', 'dryer', 'genie',
            'Genuine Joe', 'backpack', 'curling iron', 'dog', 'cat', 'wig', 'animal', 'dryer',
            ':', 'tea', 'Adidas', 'Fila', 'Reebok', 'Puma', 'Nike', 'basket', 'extension',
            'extensions', 'batteries', 'battery', '[EXPLICIT]']
my_excludes = [set(x.lower().split()) for x in excludes]
match_titles = [e for e in df.Title.astype(str)
                if any(keywords.issubset(e.lower().split()) for keywords in my_excludes)]
def is_match(title, excludes=my_excludes):
    if any(keywords.issubset(title.lower().split()) for keywords in my_excludes):
        return True
    return False
This is the part that returns the error:
df['match_titles'] = df['Title'].apply(is_match)
result = df[df['match_titles']]['Retailer Item ID']
print(df)
df.to_csv('Asin_List(9.18.19).csv', index=False)
Use the following code to import your file:
data = pd.read_excel(r'C:/Users/Downloads/61_MONDAY_PROCESS_9.16.19.xlsx',
                     dtype='str')
For pandas.read_excel, you can pass the optional parameter dtype.
You can also use it to pass multiple data types for different columns, e.g.:
dtype={'Retailer Item ID': int, 'Title': str}
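It may also be worth noting why the astype calls in the question appeared to do nothing: astype returns a new Series rather than modifying the frame in place, so the result has to be assigned back, e.g.:
df['Title'] = df['Title'].astype(str)
df['Retailer Item ID'] = df['Retailer Item ID'].astype(str)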
At the line where you wrote
match_titles = [e for e in df.Title.astype(str) if any(keywords.issubset(e.lower().split()) for keywords in my_excludes)]
Python returns as the variable e an integer and not the string you expect. This happens because when you write df.Title.astype(str) you are getting the index of a new pandas object containing only the column Title, not the contents of the column. If you want to iterate through the column, you could try
match_titles = [e for e in df.iloc[:, 4] if any(keywords.issubset(e.lower().split()) for keywords in my_excludes)]
df.iloc[:, 4] returns the fifth column of the dataframe df, which is the Title column you want (the old df.ix indexer has been removed from pandas). If this doesn't work, try the items() function.
The main idea is that if you directly assign a df[column] to something else, you may be assigning its index, not its contents.

Getting the value of a previous excel cell in python

So, given a cell, I want to know the value that the cell right before it (same row, previous column) has.
Here is my code, and I thought it was working, but...:
def excel_test(col_num, sheet_object):
    for cell in sheet_object.columns[col_num]:
        prev_col = column_index_from_string(cell.column)
        row = cell.row
        prev_cell = sheet_object.cell(row, prev_col)
I keep getting this error:
coordinate = coordinate.upper().replace('$', '')
builtins.AttributeError: 'int' object has no attribute 'upper'
I have also tried this:
def excel_test(col_num, sheet_object):
    for cell in sheet_object.columns[col_num]:
        prev_col = column_index_from_string(cell.column)
        row = cell.row
        prev_cell = sheet_object.cell(row, get_column_letter(prev_col))
Can somebody tell me how I can access that? I've also imported everything that needs to be imported.
You should look at the cell.offset() method.
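A minimal sketch of what that might look like (the workbook name is hypothetical; openpyxl's Cell.offset takes relative row/column displacements, so column=-1 means same row, one column to the left):
from openpyxl import load_workbook
wb = load_workbook('example.xlsx')          # hypothetical file
sheet_object = wb.active
cell = sheet_object['B2']
prev_cell = cell.offset(row=0, column=-1)   # -> A2
print(prev_cell.value)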

Spark - length of element of row

I am trying to do a filter operation to get all the rows where the length of my variable country is less than 4, and I keep getting errors no matter what I do.
This is the current code (using the Python API):
uniqueRegions = sqlContext.sql("SELECT country, city FROM df")
uniqueRegions = uniqueRegions.rdd
uniqueRegions = uniqueRegions.distinct()
uniqueRegions = uniqueRegions.filter(lambda line: len(line.country) < 4)
This is the error
TypeError: object of type 'NoneType' has no len()
And the first row (obtained with rdd.first):
Row(country=u'xxxxxx', city=u'xxxxxx')
Any suggestion on how to solve this?
Thanks.
You have a database record where the country is NULL. The length of that doesn't make sense. What should it do when there's no country set?
Maybe you want to filter the records? SELECT country, city FROM df WHERE country IS NOT NULL? Or maybe lambda l: l.country is not None and len(l.country) < 4, or depending on your logic, lambda l: l.country is None or len(l.country) < 4.
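Putting that together, a sketch of the null-safe version using the names from the question (either the SQL WHERE clause or the guard in the lambda alone would be enough):
# drop NULL countries in SQL so len() never sees None
uniqueRegions = sqlContext.sql("SELECT country, city FROM df WHERE country IS NOT NULL")
uniqueRegions = uniqueRegions.rdd.distinct()
uniqueRegions = uniqueRegions.filter(lambda line: len(line.country) < 4)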
