Python Out of Index Error - python

I'm working on converting a Twitter created_at date into a datetime object using the following code, but I'm getting an index out of range error:
creation_date = [time.strftime('%Y-%m-%d %H:%M:%S', time.strptime(status['created_at'],
                 '%a %b %d %H:%M:%S +0000 %Y')) for status in statuses]
for x in range(len(creation_date)):
    year = int(creation_date[x][0:4])
    month = int(creation_date[x][5:7])
    day = int(creation_date[x][8:10])
    newCreationDate = []
    newCreationDate[x] = datetime(year,month,day)

Try this:
newCreationDate = []
for x in range(len(creation_date)):
    year = int(creation_date[x][0:4])
    month = int(creation_date[x][5:7])
    day = int(creation_date[x][8:10])
    newCreationDate.append(datetime(year,month,day))

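You're emptying the newCreationDate list each time through the loop, and assigning to newCreationDate[x] on an empty list raises IndexError. Create the list once before the loop and append to it:
newCreationDate = []
for d in creation_date:
    year = int(d[0:4])
    month = int(d[5:7])
    day = int(d[8:10])
    newCreationDate.append(datetime(year,month,day))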
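As a possible shortcut, strptime already yields the year, month and day, so the string round-trip and the slicing can be skipped. This is only a sketch: the statuses list below is made-up sample data standing in for the real API response, and TWITTER_FORMAT is a name introduced here for illustration.

```python
from datetime import datetime

# Hypothetical sample data in the Twitter created_at layout
statuses = [
    {"created_at": "Wed Aug 27 13:08:45 +0000 2008"},
    {"created_at": "Mon Sep 01 09:30:00 +0000 2008"},
]

TWITTER_FORMAT = "%a %b %d %H:%M:%S +0000 %Y"

# Parse each timestamp once, then zero out the time so only the date remains
newCreationDate = [
    datetime.strptime(status["created_at"], TWITTER_FORMAT).replace(
        hour=0, minute=0, second=0
    )
    for status in statuses
]
```

Building the list with a comprehension also sidesteps the index assignment that caused the original error.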


Python Obtaining Week Number

I am trying to pass the week number (week_num) into a DataFrame to further analyze my KMeans output. Here is where I am struggling:
from datetime import date, timedelta, datetime
d1 = date(2013, 1, 1) # start date
d2 = date(2014, 12, 31) # end date
delta = d2 - d1 # timedelta
daysyear = []
day = []
month = []
year = []
#week_num = []
D = {0:'mon', 1:'tue', 2:'wed', 3:'thu', 4:'fri', 5:'sat', 6:'sun'}
for i in range(delta.days + 1):
    daysyear.extend([D[(d1 + timedelta(days=i)).weekday()]+"-"+str(d1 + timedelta(days=i))])
    day.extend([D[(d1 + timedelta(days=i)).weekday()]])
    month.extend([(d1 + timedelta(days=i)).month])
    year.extend([(d1 + timedelta(days=i)).year])
    #week.extend([(d1 + timedelta(days=i))])
    #week.extend(datetime.strptime([(d1 + timedelta(days=i)).year], "%Y-%W-%w").isocalendar()[1])
    #week.pd.to_datetime([(d1 + timedelta(days=i))], errors ='coerce')
    #yearnum, month_num, day_of_week = (d1 + timedelta(days=i))#.isocalendar()
Then passing into:
labels_2 = pd.DataFrame({'Date': daysyear, 'Cluster_ID': kmeans_2.labels_, 'Power_Usage': np.array(X).mean(axis=1), 'Day_of_Wk': day, 'Month_Num': month, 'Year': year})
Thank you for your help.
I have tried several methods. I expect the week number, as a number, to be passed in here:
labels_2 = pd.DataFrame({'Date': daysyear, 'Cluster_ID': kmeans_2.labels_, 'Power_Usage': np.array(X).mean(axis=1), 'Day_of_Wk': day, 'Month_Num': month, 'Year': year, 'Week_Num': week_num})
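One way this could be done, as a sketch rather than a definitive answer: date.isocalendar() returns the ISO year, ISO week number and ISO weekday, so its second element can be collected in the same loop. The list name week_num follows the question; the variable current is introduced here for readability.

```python
from datetime import date, timedelta

d1 = date(2013, 1, 1)    # start date
d2 = date(2014, 12, 31)  # end date

week_num = []
for i in range((d2 - d1).days + 1):
    current = d1 + timedelta(days=i)
    # isocalendar() returns (ISO year, ISO week number, ISO weekday)
    week_num.append(current.isocalendar()[1])
```

Note that ISO weeks can cross year boundaries: 2014-12-31 belongs to ISO week 1 of 2015. If weeks counted within the calendar year are preferred, int(current.strftime('%W')) is one alternative.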

How to correctly compare today's date with the publication date parsed from a site (day only)?

I need to parse exactly those publications where the publication date coincides with today's date. I only need to match the day. For example, 20 = 20. Here's what I did, but this is not good code:
today = date.today()
d2 = today.strftime("%B %d, %Y")
today_day = d2[8] + d2[9]
for el in items:
    title = el.select('.card-stats > div')
    p = title[1].text
    space = p.replace(" ","")
    day = space[1] + space[2]
    if day == today_day:
        data_id = el.get('data-id')
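I think the fragile part is today_day rather than the comparison itself. With the format "%B %d, %Y", the characters at d2[8] and d2[9] are the day only when the month name has seven letters (January, October); for any other month the slice lands on the wrong characters. Ask strftime for the day directly instead. This assumes the day on the site is also zero-padded to two digits.
Try:
today = date.today()
today_day = today.strftime("%d")
for el in items:
    title = el.select('.card-stats > div')
    p = title[1].text
    space = p.replace(" ","")
    day = space[1] + space[2]
    if day == today_day:
        data_id = el.get('data-id')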
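Comparing the day as an integer rather than as a string may be more robust, since it does not depend on zero padding or stray whitespace. The helper below is a minimal sketch: same_day_of_month is a hypothetical name, and how the day text is cut out of the page is left as an assumption because the site's markup is not shown.

```python
from datetime import date


def same_day_of_month(day_text, today):
    """Return True if day_text (e.g. '20' or ' 7') names today's day of month."""
    return int(day_text.strip()) == today.day


# Fixed date for illustration; use date.today() in practice
today = date(2024, 5, 15)
matches = same_day_of_month("15", today)
```

In the loop from the question, the check would then read `if same_day_of_month(day, date.today()):`.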

Compare QDateTime

self.start_date = ui.new_scheduled_transmition_day_start_date_2.date().toString()
self.time_start = ui.new_scheduled_transmition_day_start_time_2.time().toString()
self.end_date = ui.new_scheduled_transmition_day_end_date_2.date().toString()
self.end_time = ui.new_scheduled_transmition_day_end_time_2.time().toString()
I want to check if start_datetime<end_datetime.
Any advice would be useful.
I tried this:
date_1 = self.start_date+" "+self.time_start
date_2 = self.end_date+" "+self.end_time
date_1_time_obj = datetime.datetime.strptime(date_1, '%a %b %d %Y %H:%M:%S')
date_2_time_obj = datetime.datetime.strptime(date_2, '%a %b %d %Y %H:%M:%S')
print(date_1_time_obj<date_2_time_obj)
Error:
ValueError: time data '╙άέ ╔άΊ 1 2000 00:00:00' does not match format '%a %b %d %Y %H:%M:%S'
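The error happens because .toString() returns the day of week and month names in the local format (here, Greek characters), which strptime's %a and %b cannot match. Convert to Python objects instead (toPyDate() and toPyTime() in PyQt; PySide uses toPython()) and combine them: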
self.start_date = ui.new_scheduled_transmition_day_start_date_2.date().toPyDate()
self.time_start = ui.new_scheduled_transmition_day_start_time_2.time().toPyTime()
self.end_date = ui.new_scheduled_transmition_day_end_date_2.date().toPyDate()
self.end_time = ui.new_scheduled_transmition_day_end_time_2.time().toPyTime()
date_1_time_obj = datetime.datetime.combine(self.start_date, self.time_start)
date_2_time_obj = datetime.datetime.combine(self.end_date, self.end_time)
print(date_1_time_obj<date_2_time_obj)
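The combine step can be tried without Qt. In this sketch the four values are hard-coded stand-ins for what the date and time widgets would return after conversion to Python objects; the names start, end and is_valid_range are introduced here for illustration.

```python
import datetime

# Stand-ins for the Python date/time objects obtained from the widgets
start_date = datetime.date(2000, 1, 1)
start_time = datetime.time(0, 0, 0)
end_date = datetime.date(2000, 1, 1)
end_time = datetime.time(8, 30, 0)

# combine() merges a date and a time into one datetime, no string parsing needed
start = datetime.datetime.combine(start_date, start_time)
end = datetime.datetime.combine(end_date, end_time)

is_valid_range = start < end
```

Because no strings are involved, the result does not depend on the system locale.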
If you want to do just what your title says, compare objects of type QDateTime, that already works directly:
date1 = QDateTime.currentDateTime()
date2 = QDateTime.currentDateTime().addMonths(1)
date1 < date2
=> True
And if you want to combine the date and the time first, it should go roughly like this:
start_date = ui.new_scheduled_transmition_day_start_date_2.date()
start_time = ui.new_scheduled_transmition_day_start_time_2.time()
end_date = ui.new_scheduled_transmition_day_end_date_2.date()
end_time = ui.new_scheduled_transmition_day_end_time_2.time()
date_1 = QDateTime(start_date).addMSecs(start_time.msecsSinceStartOfDay())
date_2 = QDateTime(end_date).addMSecs(end_time.msecsSinceStartOfDay())
date_1 < date_2
See https://doc.qt.io/qtforpython/PySide2/QtCore/QDateTime.html#PySide2.QtCore.PySide2.QtCore.QDateTime.__lt__ for more information.

ValueError: time data does not match

I got this error:
ValueError: time data '8/16/2016 9:55' does not match format '%m/&d/%Y %H:%M'.
I know that %m is the format for a zero-padded, two-digit month, and the '8' (August) here is not zero-padded. Is that the cause of the error, and how do I fix it?
import datetime as dt
result_list = []
for a in ask_posts:
    result_list.append([a[6], int(a[4])])
counts_by_hour = {}
comments_by_hour = {}
date_format = '%m/&d/%Y %H:%M'
for row in result_list:
    date = row[0]
    comment = row[1]
    time = dt.datetime.strptime(date, date_format).strftime("%H")
    # I want to extract the hour only
    if time not in counts_by_hour:
        counts_by_hour[time] = 1
        comments_by_hour[time] = comment
    else:
        counts_by_hour[time] += 1
        comments_by_hours[time] += comment
You have a typo in your date format: it contains &d where it should be %d. The missing zero padding is not the problem, since strptime's %m accepts '8' as well as '08'.
import datetime as dt
result_list = []
for a in ask_posts:
    result_list.append([a[6], int(a[4])])
counts_by_hour = {}
comments_by_hour = {}
date_format = '%m/%d/%Y %H:%M' # & replaced with %
for row in result_list:
    date = row[0]
    comment = row[1]
    # extract the hour only, as a zero-padded string
    time = dt.datetime.strptime(date, date_format).strftime("%H")
    if time not in counts_by_hour:
        counts_by_hour[time] = 1
        comments_by_hour[time] = comment
    else:
        counts_by_hour[time] += 1
        comments_by_hour[time] += comment # was comments_by_hours, a NameError
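A quick check of both claims, as a small sketch using the timestamp from the error message: the corrected format parses the unpadded month, while the format containing &d fails. The name typo_format_fails is introduced here for illustration.

```python
import datetime as dt

date_format = '%m/%d/%Y %H:%M'

# The unpadded month '8' and hour '9' are accepted by %m and %H
parsed = dt.datetime.strptime('8/16/2016 9:55', date_format)
hour = parsed.strftime('%H')  # zero-padded hour as a string

# The original format treats '&d' as literal text, so parsing fails
try:
    dt.datetime.strptime('8/16/2016 9:55', '%m/&d/%Y %H:%M')
    typo_format_fails = False
except ValueError:
    typo_format_fails = True
```

If an integer hour is more convenient as a dictionary key, parsed.hour gives it directly.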

How to parse 01/Jul/1995:00:00:01-0400 with strptime

I have tried a wealth of options and got it working with some hacked-together parsing, but I am curious how to do this with strptime.
item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item,"%Y-%m-%dT:%H:%M%S%z")
checkdate = datetime.strptime(item,"%Y/%m/%dT:%H:%M:%S%z")
checkdate = datetime.strptime(item,"%Y-%b-%d:%H:%M%S%z")
checkdate = datetime.strptime(item,"%Y-%b-%dT:%H:%M:%S%z")
checkdate = datetime.strptime(item,"%Y/%m/%d:%H:%M:%S%z")
What I get for each attempt is:
ValueError: time data '01/Jul/1995:00:00:01-0400' does not match format '%Y/%m/%d:%H:%M:%S%z'
What is the correct strptime format for this?
EDIT:
You were correct, and I did a small test:
def split_date(stringdate):
    monthDict = {'Jan':'01','Feb':'02','Mar':'03','Apr':'04','May':'05',
                 'Jun':'06','Jul':'07','Aug':'08','Sep':'09','Oct':'10','Nov':'11','Dec':'12'}
    split1 = [part for part in stringdate.split('/')]
    day = split1[0]
    month = split1[1]
    month = monthDict.get(month)
    split2 = [part for part in split1[2].split(":")]
    year = split2[0]
    hour = split2[1]
    minute = split2[2]
    split3 = [part for part in split2[3].split('-')]
    second = split3[0]
    timezone = split3[1]
    # note: the offset digits are passed as the microsecond argument here
    return datetime(int(year), int(month), int(day), int(hour), int(minute), int(second), int(timezone))
datetime_received_split = []
datetime_received_strp = []
split_fail = []
strp_fail = []
s = time.time()
for date in data.time_received:
    try:
        datetime_received_split.append(split_date(date))
    except ValueError:
        split_fail.append(date)
e = time.time()
print ('split took {} s '.format(e-s))
s = time.time()
for date in data.time_received:
    try:
        # parse the current row (not the fixed sample string), no space before %f
        datetime_received_strp.append(datetime.strptime(date,"%d/%b/%Y:%H:%M:%S-%f"))
    except ValueError:
        strp_fail.append(date)
e = time.time()
print ('strp took {} s'.format(e-s))
I found that the manual split was actually faster by a large margin. Is that expected?
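I fixed your date conversion. %f is supported in both Python 2.7 and 3.x, but note that it reads the trailing 0400 as fractional seconds (microseconds), not as a UTC offset, so the result is a naive datetime.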
from datetime import datetime
item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item,"%d/%b/%Y:%H:%M:%S-%f")
print(checkdate)
%z is supported by strptime in Python 3.2+.
So for Python 2.x, have a look at "How to parse dates with -0400 timezone string in python?"
If you're using Python 3.x, you can try this:
from datetime import datetime
item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item,"%d/%b/%Y:%H:%M:%S%z")
print(checkdate)
Result:
1995-07-01 00:00:01-04:00
For more details, see "strftime() and strptime() Behavior" in the Python datetime documentation.
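To confirm that %z really captures the offset, the parsed value can be inspected and converted to UTC. This is a small sketch for Python 3; the names offset and as_utc are introduced here for illustration.

```python
from datetime import datetime, timedelta, timezone

item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item, "%d/%b/%Y:%H:%M:%S%z")

# The result is timezone-aware: utcoffset() reports the parsed -04:00
offset = checkdate.utcoffset()

# Converting to UTC shifts the wall-clock time forward by four hours
as_utc = checkdate.astimezone(timezone.utc)
```

An aware datetime like this can be compared safely with timestamps from other offsets, which the %f workaround does not allow.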
