'str' object has no attribute 'strftime' - Python

AttributeError: 'str' object has no attribute 'strftime'
import datetime

import pandas as pd

if __name__ == "__main__":
    df = pd.read_excel("abhi.xlsx")
    # print(df)
    today = datetime.datetime.now().strftime("%d-%m")
    yearNow = datetime.datetime.now().strftime("%Y")
    # print(type(today))
    writeInd = []
    for index, item in df.iterrows():
        print(index, item['Birthday'])
        pr_bday = item['Birthday'].strftime("%d-%m")
        print(pr_bday)

This happens because your 'Birthday' column is not a datetime type; it actually holds plain strings.
You can use df.dtypes to check the type of each column in df:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dtypes.html
Or you can use type(item['Birthday']) to check the type of a single value directly.
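A minimal sketch of one way around it, assuming the column holds parseable date strings: convert the column with pd.to_datetime before iterating, after which strftime works on each value.
import pandas as pd

df = pd.read_excel("abhi.xlsx")
# Parse the string column into real datetimes; unparseable values become NaT.
df['Birthday'] = pd.to_datetime(df['Birthday'], errors='coerce')

for index, item in df.iterrows():
    if pd.notna(item['Birthday']):
        print(item['Birthday'].strftime("%d-%m"))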

Related

Unable to solve the AttributeError: 'Series' object has no attribute 'split' while trying to apply a function to an entire column in the dataframe

def split_address(in_address):
    address = in_address
    address_arr = address.split()
    if len(address_arr):
        for i, w in enumerate(address_arr):
            if w.isnumeric():
                break
        locationName = ' '.join(address_arr[:i])
        locationAddress = ' '.join(address_arr[i:-3])
        locationCity = address_arr[-3]
        locationState = address_arr[-2]
        locationZip = address_arr[-1]
    else:
        # Empty address: fall back to empty strings for every field.
        locationName = locationAddress = locationCity = locationState = locationZip = ''
    return (locationName, locationAddress, locationCity, locationState, locationZip)
new_df = out_df.apply(split_address)
The above code throws AttributeError: 'Series' object has no attribute 'split'.
I want the function applied to every row of the address column, with the output split across 5 separate columns as mentioned.
It'd be great if you could help me with this.
Thank you.
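A possible sketch, assuming the full address lives in a column named 'address' (a hypothetical name): apply the function element-wise to that column, so each call receives a single string, then expand the returned tuples into five columns.
import pandas as pd

# 'address' is a hypothetical column name; adjust it to the real column in out_df.
parts = out_df['address'].apply(split_address)
new_df = pd.DataFrame(parts.tolist(),
                      columns=['locationName', 'locationAddress', 'locationCity',
                               'locationState', 'locationZip'],
                      index=out_df.index)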

Dask compute gives AttributeError: 'Series' object has no attribute 'encode'

I would like to apply a function to each row of a dask dataframe.
Executing the operation with ddf.compute() gives me an error:
AttributeError: 'Series' object has no attribute 'encode'
This is my code:
def polar(data):
    data = scale(sid.polarity_scores(data.tweet)['compound'])
    return data
t_data['sentiment'] = t_data.map_partitions(polar, meta=('sentiment', int))
Using t_data.head() also results in the same error.
I have found the answer: you have to apply the function within each partition.
t_data['sentiment'] = t_data.map_partitions(lambda df: df.apply(polar, axis=1))
You can use the following:
t_data.apply(polar, axis=1)
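A minimal sketch combining the two suggestions, assuming polar returns a single float per row; the meta argument tells Dask the name and dtype of the output without computing it.
# Sketch only: assumes `polar` returns one float per row.
t_data['sentiment'] = t_data.map_partitions(
    lambda df: df.apply(polar, axis=1),
    meta=('sentiment', 'f8'),  # name and dtype of the resulting Series
)
result = t_data['sentiment'].compute()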

Adding Hyperlink to Pandas DataFrame Makes it a Non Type Object instead of DataFrame

I'm trying to add a clickable hyperlink to my dataframe, which I've done successfully, but then when I try to use JoogleChart to create drop downs and make the data more manageable for users, I get this error: AttributeError: 'NoneType' object has no attribute 'columns'
def candidate_url(row):
    return """<a href="hirecentral.corp.indeed.com/candidates?application_id={}&from=candidate_search" target="_blank">{reqtitle}</a>""".format(
        row['application_id'], reqtitle=row.application_id)

final['candidate_link'] = final.apply(candidate_url, axis=1)
h = final.to_html(escape=True)
chart = JoogleChart(h, chart_type='Table', allow_nulls=True)
chart.show()

AttributeError: 'NoneType' object has no attribute 'columns'

AttributeError: 'DataFrame' object has no attribute 'to_datetime'

I want to convert all the items in the 'Time' column of my pandas dataframe from UTC to Eastern time. However, following the answer in this Stack Overflow post, some of the keywords are not recognized in pandas 0.20.3. Overall, how should I do this task?
tweets_df = pd.read_csv('valid_tweets.csv')
tweets_df['Time'] = tweets_df.to_datetime(tweets_df['Time'])
tweets_df.set_index('Time', drop=False, inplace=True)
The error is:
tweets_df['Time'] = tweets_df.to_datetime(tweets_df['Time'])
File "/scratch/sjn/anaconda/lib/python3.6/site-packages/pandas/core/generic.py", line 3081, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'to_datetime'
items from the Time column look like this:
2016-10-20 03:43:11+00:00
Update:
using
tweets_df['Time'] = pd.to_datetime(tweets_df['Time'])
tweets_df.set_index('Time', drop=False, inplace=True)
tweets_df.index = tweets_df.index.tz_localize('UTC').tz_convert('US/Eastern')
performed no time conversion. Any idea what needs to be fixed?
Update 2:
So the following code does not do the conversion in place: when I print row['Time'] using iterrows(), it still shows the original values. Do you know how to do the conversion in place?
tweets_df['Time'] = pd.to_datetime(tweets_df['Time'])
for index, row in tweets_df.iterrows():
    row['Time'].tz_localize('UTC').tz_convert('US/Eastern')
for index, row in tweets_df.iterrows():
    print(row['Time'])
to_datetime is a function defined in pandas, not a method on a DataFrame. Try:
tweets_df['Time'] = pd.to_datetime(tweets_df['Time'])
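For the timezone part, a minimal sketch, assuming the goal is to store Eastern times back in the column (the iterrows() loop above converts each value but never assigns the result anywhere):
# Parse as UTC-aware datetimes, then convert the whole column and assign it back.
tweets_df['Time'] = pd.to_datetime(tweets_df['Time'], utc=True)
tweets_df['Time'] = tweets_df['Time'].dt.tz_convert('US/Eastern')
tweets_df.set_index('Time', drop=False, inplace=True)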

AttributeError: 'DataFrame' object has no attribute 'timestamp'

I want to select only those rows that have a timestamp that belongs to last 36 hours. My PySpark DataFrame df has a column unix_timestamp that is a timestamp in seconds.
This is my current code, but it fails with the error AttributeError: 'DataFrame' object has no attribute 'timestamp'. I tried to change it to unix_timestamp, but it fails all the time.
import datetime
hours_36 = (datetime.datetime.now() - datetime.timedelta(hours = 36)).strftime("%Y-%m-%d %H:%M:%S")
df = df.withColumn("unix_timestamp", df.unix_timestamp.cast("timestamp")).filter(df.timestamp > hours_36)
The column named timestamp doesn't exist on df when you refer to it: withColumn returns a new DataFrame, and the column you created is called unix_timestamp. You can either use pyspark.sql.functions.col to refer to the column dynamically, without specifying which DataFrame object it belongs to:
import pyspark.sql.functions as F
df = df.withColumn("unix_timestamp", df.unix_timestamp.cast("timestamp")).filter(F.col("unix_timestamp") > hours_36)
Or without creating the intermediate column:
df.filter(df.unix_timestamp.cast("timestamp") > hours_36)
The API docs show that you can also use a string expression for filtering:
https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.DataFrame.filter
import pyspark.sql.functions as F
df = (df.withColumn("unix_timestamp", df.unix_timestamp.cast("timestamp"))
        .filter("unix_timestamp > '%s'" % hours_36))
Maybe it's not as efficient, though.
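A minimal end-to-end sketch, assuming unix_timestamp holds epoch seconds; comparing against a timestamp literal built with F.lit from a Python datetime avoids formatting the cutoff as a string inside the SQL expression.
import datetime

import pyspark.sql.functions as F

cutoff = datetime.datetime.now() - datetime.timedelta(hours=36)

# Cast the epoch-seconds column to a timestamp and keep only the last 36 hours.
recent = df.filter(df.unix_timestamp.cast("timestamp") > F.lit(cutoff))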
