I am trying to integrate my Jupyter Notebook with Google Sheets. When I execute the code below, it raises an AttributeError. I need help solving it.
# Set the sheet name you want to upload data to and the start cell where the upload data begins
wks_name = 'Sheet1'
cell_of_start_df = 'A2'
# upload the dataframe of the clients we want to delete
d2g.upload(rs,
           spreadsheet_key,
           wks_name,
           credentials=credentials,
           col_names=False,
           row_names=False,
           start_cell=cell_of_start_df,
           clean=False)
print ('The sheet is updated successfully')
I am getting the error below:
AttributeError Traceback (most recent call last)
<ipython-input-30-6aac4a76409f> in <module>
10 row_names=False,
11 start_cell = cell_of_start_df,
---> 12 clean=False)
13 print ('The sheet is updated successfully')
~\Anaconda3\lib\site-packages\df2gspread\df2gspread.py in upload(df, gfile, wks_name, col_names, row_names, clean, credentials, start_cell, df_size, new_sheet_dimensions)
99 last_idx = num_rows + last_idx_adjust
100
--> 101 num_cols = len(df.columns) + 1 if row_names else len(df.columns)
102 last_col_adjust = start_col_int - 1
103 last_col_int = num_cols + last_col_adjust
~\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5268 or name in self._accessors
5269 ):
-> 5270 return object.__getattribute__(self, name)
5271 else:
5272 if self._info_axis._can_hold_identifiers_and_holds_name(name):
AttributeError: 'Series' object has no attribute 'columns'
It would be a great help if this could be solved.
Additional Code:
cols = ['Cases']
mask = lk[cols].applymap(lambda x: isinstance(x, (int, float)))
lk[cols] = lk[cols].where(mask)
print (lk)
mn = lk.replace('NaN',0)
mn
rs=mn.groupby(['States and UTs','district','status'])[mn.columns[0]].sum()
rs
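A likely cause, judging from the traceback rather than anything verified against the sheet: selecting a single column in the groupby makes rs a Series, while d2g.upload reads df.columns and therefore expects a DataFrame. A minimal sketch of a possible fix, assuming the rest of the setup (credentials, spreadsheet_key, wks_name) stays the same:
```
# rs is a Series after groupby()[single column].sum(); d2g.upload expects a DataFrame.
# reset_index() turns the group keys back into columns (to_frame() would also work,
# but keeps the group keys in the index).
rs_df = rs.reset_index()

d2g.upload(rs_df,
           spreadsheet_key,
           wks_name,
           credentials=credentials,
           col_names=False,
           row_names=False,
           start_cell=cell_of_start_df,
           clean=False)
```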
I'm using Colab to run the code below, but I get this error and I can't fix it. Could you help me out?
I don't know what else to do to fix it; I have already tried changing the column names between upper case and lower case.
# Inativos: adjust the column names
dfInativos = dfInativos.rename(columns={'userId': 'id'})
dfInativos = dfInativos.rename(columns={'classId': 'ClasseId'})
dfInativos[['id','ClasseId','lastActivityDate','inactivityDaysCount','sevenDayInactiveStatus']] = dfInativos
#dfInativos['id'] = dfInativos['id'].astype(int, errors = 'ignore')
#dfInativos['ClasseId'] = dfInativos['ClasseId'].astype(int, errors = 'ignore')
dfInativos['id'] = pd.to_numeric(dfInativos['id'],errors = 'coerce')
dfInativos['ClasseId'] = pd.to_numeric(dfInativos['ClasseId'],errors = 'coerce')
#dfInativos.dropna(subset = ['lastActivityDate'], inplace=True)
dfInativos.drop_duplicates(subset = ['id','ClasseId'], inplace=True)
dfInativos['seven DayInactiveStatus'] = dfInativos['sevenDayInactiveStatus'].replace(0,'')
#Add Inactive data to main data frame
df = df.merge(dfInativos, on=['id','ClasseId'], how='left')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-79-10fe94c48d1f> in <module>()
2 dfInativos = dfInativos.rename(columns={'userId': 'id'})
3 dfInativos = dfInativos.rename(columns={'classId': 'ClasseId'})
----> 4 dfInativos[['id','ClasseId','lastActivityDate','inactivityDaysCount','sevenDayInactiveStatus']] = dfInativos
5
6
2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexers.py in check_key_length(columns, key, value)
426 if columns.is_unique:
427 if len(value.columns) != len(key):
--> 428 raise ValueError("Columns must be same length as key")
429 else:
430 # Missing keys in columns are represented as -1
ValueError: Columns must be same length as key
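A possible fix, assuming dfInativos holds more columns than the five being assigned and that those five exist under exactly these names: the value on the right-hand side must have the same number of columns as the keys on the left, so selecting just those columns avoids the length mismatch.
```
# Keep only the five columns of interest instead of assigning the whole frame
# to a five-column key; the length mismatch is what raises the ValueError.
dfInativos = dfInativos[['id', 'ClasseId', 'lastActivityDate',
                         'inactivityDaysCount', 'sevenDayInactiveStatus']]
```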
I am facing an issue with the pandas code below. The cell that defines the function runs without any issue, but when I run the next cell the error shows up:
def build_anonymized_dataset(df, partitions, feature_columns, sensitive_column, max_partitions=None):
    aggregations = {}
    for column in feature_columns:
        if column in categorical:
            aggregations[column] = agg_categorical_column
        else:
            aggregations[column] = agg_numerical_column
    rows = []
    for i, partition in enumerate(partitions):
        if i % 100 == 1:
            print("Finished {} partitions...".format(i))
        if max_partitions is not None and i > max_partitions:
            break
        grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
        sensitive_counts = df.loc[partition].groupby(sensitive_column).agg({sensitive_column : 'count'})
        values = grouped_columns.iloc[0].to_dict().items()
        for sensitive_value, count in sensitive_counts[sensitive_column].items():
            if count == 0:
                continue
            values.update({
                sensitive_column : sensitive_value,
                'count' : count,
            })
            rows.append(values.copy())
    return pd.DataFrame(rows)
But when I run the following cell, I get an error that points back to the previous cell. I do not understand what this error means or how I can solve it.
dfn = build_anonymized_dataset(df, finished_partitions, feature_columns, sensitive_column)
The following error appears:
14 grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
15 sensitive_counts = df.loc[partition].groupby(sensitive_column).agg({sensitive_column : 'count'})
---> 16 values = grouped_columns.iloc[0].to_dict().items()
17 for sensitive_value, count in sensitive_counts[sensitive_column].items():
18 if count == 0:
AttributeError: 'list' object has no attribute 'to_dict'
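A possible adjustment, offered as an assumption based on the traceback rather than a verified fix: in recent pandas versions df.loc[partition].agg(aggregations) returns one aggregated value per feature column, so grouped_columns.iloc[0] is just the first of those values (here apparently a list), not a row of a DataFrame. Building the dictionary directly from the aggregation result, and keeping values a dict rather than dict_items so that update() works, could look like this inside the partition loop:
```
# Hypothetical rewrite of the inner part of the loop; assumes agg() returns
# a Series with one aggregated value per feature column.
grouped_columns = df.loc[partition].agg(aggregations)
values = grouped_columns.to_dict()        # plain dict, so .update() below works
sensitive_counts = df.loc[partition].groupby(sensitive_column).agg({sensitive_column: 'count'})
for sensitive_value, count in sensitive_counts[sensitive_column].items():
    if count == 0:
        continue
    values.update({sensitive_column: sensitive_value, 'count': count})
    rows.append(values.copy())
```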
I have been using the script below to calculate the storage capacity of one of our server environments. It reads the values from a report I get every two weeks and then creates a file I can import into Power BI to create graphs. It ran without an issue two weeks ago, but when I tried to run it today I got a TypeError. Going by the error message, I assume "if float(df['Capacity(TB)']) >= 0.01:" is causing the issue.
The data I am importing is an xls sheet with a header name and values underneath it. I had a look to see if there are any blank fields but could not find any. Any help or suggestions would be greatly appreciated.
import pandas as pd
import numpy as np
from datetime import datetime
import os
from os import listdir
from os.path import isfile, join
#SCCM resource import as 'df'
pathres = r'C:\Capacity Reports\SCOM Reports'
onlyfiles = [f for f in listdir(pathres) if isfile(join(pathres, f))]
df = pd.DataFrame()
for i in onlyfiles:
    print(i)
    dfresimp = pd.read_excel(pathres+'\\'+i)
    df = pd.concat([df, dfresimp])
#CMDB import as 'df2'
df2 = pd.read_excel('C:\\Capacity Reports\\CMDB_Export.xlsx')
#Windows Lifecycle import as 'df3'
df3 = pd.read_excel('C:\\Capacity Reports\\Windows Server Lifecycle.xlsx')
#SCVMM clusters import as 'df4'
df4 = pd.read_excel('C:\\Capacity Reports\\HyperV Overview.xlsx')
#SCVMM Storage reports import as 'df5'
pathstor = r'C:\Capacity Reports\Hyper-V Storage'
Storfiles = [f for f in listdir(pathstor) if isfile(join(pathstor, f))]
df5 = pd.DataFrame()
for i in Storfiles:
    print(i)
    dfstorimp = pd.read_excel(pathstor+'\\'+i)
    df5 = pd.concat([df5, dfstorimp])
#CREATE MAIN TABLE
df['NAME'] = df['Computer Name'].str.upper()
df11 = pd.DataFrame()
df11['NAME'] = df2['NAME'].str.upper()
df11['Application Owner'] = df2['Application Owner'].str.title()
df11['HW EOSL'] = df2['HW EOSL'].str.title()
#print(df11['HW EOSL'])
Main_Table = df.merge(df11, on='NAME', how='left')
Main_Table = Main_Table.merge(df3, on='Operating System Edition', how='left')
df13 = pd.DataFrame()
df13['Hyper V Cluster name'] = df4['Hyper V Cluster name']
df13['Computer Name'] = df4['Server Name'].str.upper()
Main_Table = Main_Table.merge(df13, on='Computer Name', how='left')
Main_Table['OS_Support'] = pd.to_datetime(Main_Table['Extended_Support_End_Date'], format='"%Y-%m-%d %H:%S:%f')
Main_Table['OS_Support'] = Main_Table['OS_Support'].dt.strftime("%Y-%m-%d")
#print(Main_Table['OS_Support'])
def f(df):
    if df['Host/GuestVM'] == 'GuestVM':
        result = (df['Total Physical Memory GB']-(df['Total Physical Memory GB']*(df['Memory % Used Max Value']/100)))/2
        return result
    else:
        np.nan
Main_Table['Reclaimable Memory Calculated'] = Main_Table.apply(f, axis=1)
def f(df):
    if df['Host/GuestVM'] == 'GuestVM':
        result = (df['Total Logical Processors']-(df['Total Logical Processors']*(df['CPU % Used Max Value']/100)))/2
        return result
    else:
        np.nan
Main_Table['Reclaimable CPU Calculated'] = Main_Table.apply(f, axis=1)
Main_Table['Reclaimable Memory Calculated'] = round(Main_Table['Reclaimable Memory Calculated'])
Main_Table['Reclaimable CPU Calculated'] = round(Main_Table['Reclaimable CPU Calculated'])
Main_Table['Report Timestamp'] = Main_Table['Report Timestamp'].dt.strftime("%Y%m%d")
Main_Table = Main_Table.drop_duplicates()
Main_Table['Report Timestamp Number'] = Main_Table['Report Timestamp']
column = Main_Table["Report Timestamp Number"]
max_value = column.max()
Total_Memory_Latest = 0
def f(df):
    global Total_Memory_Latest
    if df['Report Timestamp Number'] == max_value and df['Host/GuestVM'] == 'Host':
        Total_Memory_Latest += df['Total Physical Memory GB']
        return 0
    else:
        np.nan
Main_Table['DummyField'] = Main_Table.apply(f, axis=1)
Main_Table.to_excel(r'C:\Users\storm_he\OneDrive - MTN Group\Documents\Testing\Main_Table.xlsx')
#CREATE STORAGE TABLE AND EXPORT
def f(df):
    #if df['Host/GuestVM'] == 'Host':
    #try:
    if float(df['Capacity(TB)']) >= 0.01:
        result = (df['Available(TB)']/df['Capacity(TB)'])*100
        return round(result)
    else:
        return ''
    #except:
    #return np.nan
df5['% Storage free'] = df5.apply(f, axis=1)
pattern = '|'.join(['.mtn.co.za', '.mtn.com'])
df5['VMHost'] = df5['VMHost'].str.replace(pattern,'')
df5['VMHost'] = df5['VMHost'].str.upper()
df5['Report Timestamp'] = df5['Report Timestamp'].dt.strftime("%Y%m%d")
#print(df5['Report Timestamp'])
df5.to_excel(r'C:\Users\storm_he\OneDrive - MTN Group\Documents\Testing\Main_Storage_table.xlsx')
print('Run Finished')
Stack trace:
TypeError Traceback (most recent call last)
<ipython-input-1-3c53bb32e311> in <module>
108 column = Main_Table["Report Timestamp Number"]
109
--> 110 max_value = column.max()
111 Total_Memory_Latest = 0
112
~\Anaconda3\lib\site-packages\pandas\core\generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
11212 if level is not None:
11213 return self._agg_by_level(name, axis=axis, level=level, skipna=skipna)
> 11214 return self._reduce(
11215 f, name, axis=axis, skipna=skipna, numeric_only=numeric_only
11216 )
~\Anaconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
3889 )
3890 with np.errstate(all="ignore"):
-> 3891 return op(delegate, skipna=skipna, **kwds)
3892
3893 # TODO(EA) dispatch to Index
~\Anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
123 result = alt(values, axis=axis, skipna=skipna, **kwds)
124 else:
--> 125 result = alt(values, axis=axis, skipna=skipna, **kwds)
126
127 return result
~\Anaconda3\lib\site-packages\pandas\core\nanops.py in reduction(values, axis, skipna, mask)
835 result = np.nan
836 else:
--> 837 result = getattr(values, meth)(axis)
838
839 result = _wrap_results(result, dtype, fill_value)
~\Anaconda3\lib\site-packages\numpy\core\_methods.py in _amax(a, axis, out, keepdims, initial, where)
28 def _amax(a, axis=None, out=None, keepdims=False,
29 initial=_NoValue, where=True):
---> 30 return umr_maximum(a, axis, None, out, keepdims, initial, where)
31
32 def _amin(a, axis=None, out=None, keepdims=False,
TypeError: '>=' not supported between instances of 'float' and 'str'
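One thing worth checking, as an assumption drawn from the traceback rather than a confirmed diagnosis: the exception is raised by column.max() on 'Report Timestamp Number', not by the Capacity(TB) comparison. After strftime that column holds strings, and any row whose timestamp is missing becomes NaN (a float), so max() ends up comparing a string with a float. Coercing the column to numbers before taking the maximum keeps the comparison consistent; a minimal sketch:
```
# Hypothetical adjustment: make the timestamp column purely numeric so that
# max() never compares str and float. The yyyymmdd strings produced by strftime
# become numbers; anything unparsable becomes NaN and is skipped by max().
Main_Table['Report Timestamp Number'] = pd.to_numeric(
    Main_Table['Report Timestamp'], errors='coerce')

column = Main_Table['Report Timestamp Number']
max_value = column.max()
```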
I'm trying to translate part of the SQuAD 1.1 dataset into Sinhalese. I don't know whether I can feed the JSON file straight into translation.
What I have tried so far is building a small dataframe from the SQuAD dataset and translating it as a demo for myself, but I got different errors. Below is the error I'm getting now. Can you help me fix this error, or suggest a better way to complete my task using Python?
```
import googletrans
from googletrans import Translator
import os
from google.cloud import translate_v2 as translate

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r"C:\Users\Sathsara\Documents\Python Learning\Translation test\translationAPI\flash-medley-278816-b2012b874797.json"

# create a translator object
translator = Translator()

# use translate method to translate a string - by default, the destination language is english
translated = translator.translate('I am Sathsara Rasantha', dest='si')
# the translate method returns an object
print(translated)
# obtain translated string by using attribute .text
translated.text

import pandas as pd
translate_example = pd.read_json("example2.json")
translate_example

contexts = []
questions = []
answers_text = []
answers_start = []
for i in range(translate_example.shape[0]):
    topic = translate_example.iloc[i, 0]['paragraphs']
    for sub_para in topic:
        for q_a in sub_para['qas']:
            questions.append(q_a['question'])
            answers_start.append(q_a['answers'][0]['answer_start'])
            answers_text.append(q_a['answers'][0]['text'])
            contexts.append(sub_para['context'])

df = pd.DataFrame({"context": contexts, "question": questions, "answer_start": answers_start, "text": answers_text})
df

df = df.loc[0:2, :]
df

# make a deep copy of the data frame
df_si = df.copy()

# translate columns' name using rename function
df_si.rename(columns=lambda x: translator.translate(x).text, inplace=True)
df_si.columns

translations = {}
for column in df_si.columns:
    # unique elements of the column
    unique_elements = df_si[column].unique()
    for element in unique_elements:
        # add translation to the dictionary
        translations[element] = translator.translate(element, dest='si').text

print(translations)

# modify all the terms of the data frame by using the previously created dictionary
df_si.replace(translations, inplace=True)

# check translation
df_si.head()
```
This is the error I get:
> --------------------------------------------------------------------------- TypeError Traceback (most recent call
> last) <ipython-input-24-f55a5ca59c36> in <module>
> 5 for element in unique_elements:
> 6 # add translation to the dictionary
> ----> 7 translations[element] = translator.translate(element,dest='si').text
> 8
> 9 print(translations)
>
> ~\Anaconda3\lib\site-packages\googletrans\client.py in translate(self,
> text, dest, src)
> 170
> 171 origin = text
> --> 172 data = self._translate(text, dest, src)
> 173
> 174 # this code will be updated when the format is changed.
>
> ~\Anaconda3\lib\site-packages\googletrans\client.py in
> _translate(self, text, dest, src)
> 73 text = text.decode('utf-8')
> 74
> ---> 75 token = self.token_acquirer.do(text)
> 76 params = utils.build_params(query=text, src=src, dest=dest,
> 77 token=token)
>
> ~\Anaconda3\lib\site-packages\googletrans\gtoken.py in do(self, text)
> 199 def do(self, text):
> 200 self._update()
> --> 201 tk = self.acquire(text)
> 202 return tk
>
> ~\Anaconda3\lib\site-packages\googletrans\gtoken.py in acquire(self,
> text)
> 144 a = []
> 145 # Convert text to ints
> --> 146 for i in text:
> 147 val = ord(i)
> 148 if val < 0x10000:
>
> TypeError: 'numpy.int64' object is not iterable
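A possible workaround, assuming the traceback is caused by the numeric answer_start values: googletrans iterates over the characters of the text it is given, so passing a numpy.int64 fails. Translating only string elements and leaving numbers untouched avoids that; a minimal sketch:
```
# Only feed strings to the translator; numeric values such as answer_start
# stay as they are, which sidesteps the "'numpy.int64' object is not iterable" error.
translations = {}
for column in df_si.columns:
    for element in df_si[column].unique():
        if isinstance(element, str):
            translations[element] = translator.translate(element, dest='si').text

df_si.replace(translations, inplace=True)
```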
I have a problem running this code:
resamp = pd.DataFrame()
station_ids = list(set(weather_data.station_id.tolist()))
for _id in station_ids:
    idx = weather_data.station_id == _id
    ti = time_index[idx]
    wdfi = weather_data[idx].set_index(ti)
    floating = wdfi[['visibility','temperature','wind_speed', "wind_dir", "Rain"]]
    binaries = wdfi[['visibility','temperature','wind_speed', "wind_dir", "Rain"]]
    b = binaries.resample('1h').rolling(24).apply(lambda x: x.any())
    f = floating.resample('1h').agg({
        'wind_speed': 'mean',
        'visibility': 'mean',
        'temperature': 'mean',
        'wind_dir': 'mean',
        'Rain': 'mean'
    })
    temp = pd.concat((f, b), axis=1)
    temp['station_id'] = _id
    resamp = resamp.append(temp)
and I get this error
AttributeError Traceback (most recent call last)
in ()
8 floating = wdfi[['visibility','temperature','wind_speed', "wind_dir", "Rain"]]
9 binaries = wdfi[['visibility','temperature','wind_speed', "wind_dir", "Rain"]]
---> 10 b = binaries.resample('1h').rolling(24).apply(lambda x: x.any())
11 f = floating.resample('1h').agg({
12 'wind_speed': 'mean',
~\Anaconda3\envs\arcpro\lib\site-packages\pandas\core\resample.py in __getattr__(self, attr)
95 return self[attr]
96
---> 97 return object.__getattribute__(self, attr)
98
99 @property
AttributeError: 'DatetimeIndexResampler' object has no attribute 'rolling'
My pandas version is 0.24.
Thank you.
The answer by SvenD could be what you're looking for: How to convert DatetimeIndexResampler to DataFrame?
> "resample no longer returns a dataframe: it's now "lazily evaluated" at the moment of the aggregation or interpolation. => depending on your use case, replacing .resample("1D") with .resample("1D").mean() (i.e. downscaling) or with .resample("1D").interpolate() (upscaling) could be what you're after, and they both return a dataframe. – Svend Sep 15 '16 at 8:57"
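Following that suggestion, a minimal sketch of the change, assuming an hourly mean is the intended downsampling (pick whichever aggregation fits the data): aggregate the resampler first, then apply the rolling window to the resulting DataFrame, since .rolling() is not available on the resampler itself.
```
# Aggregate first (resample is evaluated lazily), then roll over the resulting DataFrame.
b = (binaries.resample('1h').mean()
             .rolling(24)
             .apply(lambda x: x.any(), raw=True))
```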