I got this error:
ValueError: time data '8/16/2016 9:55' does not match format '%m/&d/%Y %H:%M'
I know that %m is the format for a zero-padded, two-digit month, and the '8' (August) in my data is not zero-padded. Is that what causes this error, and how do I fix it?
import datetime as dt

result_list = []
for a in ask_posts:
    result_list.append([a[6], int(a[4])])

counts_by_hour = {}
comments_by_hour = {}
date_format = '%m/&d/%Y %H:%M'
for row in result_list:
    date = row[0]
    comment = row[1]
    # I want to extract the hour only
    time = dt.datetime.strptime(date, date_format).strftime("%H")
    if time not in counts_by_hour:
        counts_by_hour[time] = 1
        comments_by_hour[time] = comment
    else:
        counts_by_hour[time] += 1
        comments_by_hours[time] += comment
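You have an error in your date format: the day directive must be %d, not &d. The missing zero padding is not the problem, since strptime accepts '8' for %m.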
import datetime as dt

result_list = []
for a in ask_posts:
    result_list.append([a[6], int(a[4])])

counts_by_hour = {}
comments_by_hour = {}
date_format = '%m/%d/%Y %H:%M'  # changed & to %
for row in result_list:
    date = row[0]
    comment = row[1]
    # Extract the hour only
    time = dt.datetime.strptime(date, date_format).strftime("%H")
    if time not in counts_by_hour:
        counts_by_hour[time] = 1
        comments_by_hour[time] = comment
    else:
        counts_by_hour[time] += 1
        comments_by_hour[time] += comment  # was comments_by_hours, which raises NameError
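As a quick check of the corrected format, here is a minimal sketch that parses the exact string from the error message. The variable names hour_label and hour_number are only illustrative; they are not part of the original code.

```python
import datetime as dt

# Corrected format: %d instead of &d.
date_format = "%m/%d/%Y %H:%M"

# The month "8" and the hour "9" are not zero-padded, and strptime still accepts them.
parsed = dt.datetime.strptime("8/16/2016 9:55", date_format)

hour_label = parsed.strftime("%H")  # zero-padded string, as used for the dictionary keys
hour_number = parsed.hour           # the same hour as an int

print(parsed, hour_label, hour_number)
```

If the hour is only needed as a key, parsed.hour avoids the extra strftime call, at the cost of int keys instead of strings.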
I need to parse exactly those publications where the publication date coincides with today's date. I only need to match the day. For example, 20 = 20. Here's what I did, but this is not good code:
today = date.today()
d2 = today.strftime("%B %d, %Y")
today_day = d2[8] + d2[9]
for el in items:
    title = el.select('.card-stats > div')
    p = title[1].text
    space = p.replace(" ","")
    day = space[1] + space[2]
    if day == today_day:
        data_id = el.get('data-id')
I think the mistake is in how today_day is built. d2[8] + d2[9] only lands on the day digits when the month name has exactly seven letters (January, October); for any other month it picks up the wrong characters. Ask strftime for the day directly and compare it with day.
Try:
today = date.today()
today_day = today.strftime("%d")  # zero-padded day of month, e.g. "20"
for el in items:
    title = el.select('.card-stats > div')
    p = title[1].text
    space = p.replace(" ","")
    day = space[1] + space[2]
    if day == today_day:
        data_id = el.get('data-id')
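To see why fixed character positions are fragile here, a small sketch that compares the two ways of getting the day. The helper names day_by_position and day_by_directive are made up for this illustration, and the month names assume the default English (C) locale.

```python
from datetime import date


def day_by_position(d):
    """Slice fixed positions out of 'Month DD, YYYY', as in the question."""
    text = d.strftime("%B %d, %Y")
    return text[8] + text[9]


def day_by_directive(d):
    """Ask strftime for the zero-padded day of the month directly."""
    return d.strftime("%d")


january = date(2024, 1, 20)  # seven-letter month name
march = date(2024, 3, 20)    # five-letter month name

for d in (january, march):
    print(d, repr(day_by_position(d)), repr(day_by_directive(d)))
```

The slice happens to work for January but not for March, while "%d" gives the day in both cases.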
I have this code in Python:
import datetime
import re
import pymongo
from datetime import timedelta, date
def daterange(d, d1):
    for n in range(int((d1 - d).days)):
        yield d + timedelta(n)
# connect to db
uri = "mongodb://127.0.0.1:27017"
client = pymongo.MongoClient(uri)
database = client['db']
collection = database['currency']
d = input('Insert beginning date (yyyy-mm-dd): ')
d1 = input('Insert end date (yyyy-mm-dd): ')
#search db
item = collection.find_one({"date" : d})
item1 = collection.find_one({"date" : d1})
datas = item['date']
datas1 = item1['date']
#convert string to object
dataObject = datetime.datetime.strptime(datas, "%Y-%m-%d")
dataObject1 = datetime.datetime.strptime(datas1, "%Y-%m-%d")
#range
mylist = []
for single_date in daterange(dataObject, dataObject1):
    mylist.append(single_date.strftime("%Y-%m-%d"))
    print(single_date.strftime("%Y-%m-%d"))
print(mylist)
item = collection.find_one({"date" : mylist[0]})
print(item)
If a user inserts a beginning date like 2018-05-07 and an end date like 2018-05-11 it will print:
2018-05-07
2018-05-08
2018-05-09
2018-05-10
In this case it only prints up to the 10th. What should I change so that it also prints the end date (2018-05-11)?
There are many solutions to your question, but I believe the easiest is to adjust the daterange(d, d1) function by adding 1 to the argument of range:
def daterange(d, d1):
    for n in range(int((d1 - d).days) + 1):
        yield d + timedelta(n)
The reason is that, as per the documentation, range(stop) does not produce the stop value itself, only the values before it.
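The effect of the + 1 can be checked without the database. This is a minimal sketch using the dates from the question; daterange_inclusive is just an illustrative name for the adjusted function.

```python
from datetime import date, timedelta


def daterange_inclusive(start, end):
    """Yield every date from start to end, including end itself."""
    # range(stop) stops one short of stop, hence the + 1.
    for n in range((end - start).days + 1):
        yield start + timedelta(days=n)


dates = [
    d.strftime("%Y-%m-%d")
    for d in daterange_inclusive(date(2018, 5, 7), date(2018, 5, 11))
]
print(dates)
```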
I have tried a wealth of options and got it working with some hacked-together parsing, but I am curious how to do this with strptime.
item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item,"%Y-%m-%dT:%H:%M%S%z")
checkdate = datetime.strptime(item,"%Y/%m/%dT:%H:%M:%S%z")
checkdate = datetime.strptime(item,"%Y-%b-%d:%H:%M%S%z")
checkdate = datetime.strptime(item,"%Y-%b-%dT:%H:%M:%S%z")
checkdate = datetime.strptime(item,"%Y/%m/%d:%H:%M:%S%z")
What I get for each attempt is:
ValueError: time data '01/Jul/1995:00:00:01-0400' does not match format '%Y/%m/%d:%H:%M:%S%z'
What is the correct strptime format for this?
EDIT:
So you were correct, and I ran a small test:
def split_date(stringdate):
    datepart = []
    monthDict = {'Jan':'01','Feb':'02','Mar':'03','Apr':'04','May':'05',
                 'Jun':'06','Jul':'07','Aug':'08','Sep':'09','Oct':'10','Nov':'11','Dec':'12'}
    split1 = [part for part in stringdate.split('/')]
    day = split1[0]
    month = split1[1]
    month = monthDict.get(month)
    split2 = [part for part in split1[2].split(":")]
    year = split2[0]
    hour = split2[1]
    minute = split2[2]
    split3 = [part for part in split2[3].split('-')]
    second = split3[0]
    timezone = split3[1]
    return datetime(int(year), int(month), int(day), int(hour), int(minute), int(second), int(timezone))

datetime_received_split = []
datetime_received_strp = []
s = time.time()
for date in data.time_received:
    try:
        datetime_received_split.append(split_date(date))
    except:
        split_fail.append(date)
e = time.time()
print('split took {} s '.format(e-s))
s = time.time()
for date in data.time_received:
    try:
        datetime_received_strp.append(datetime.strptime(item,"%d/%b/%Y:%H:%M:%S- %f"))
    except:
        strp_fail.append(date)
e = time.time()
print('strp took {} s'.format(e-s))
I found that the manual split was actually faster by a large margin?
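I fixed your date conversion. What's great is that %f is supported in both Python 2.7 and 3.x. Note that this format matches the "-" literally and reads 0400 as fractional seconds, so the UTC offset is not interpreted as a time zone.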
from datetime import datetime
item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item,"%d/%b/%Y:%H:%M:%S-%f")
print(checkdate)
%z is supported in Python 3.2+.
So for Python 2.x, have a look at "How to parse dates with -0400 timezone string in python?"
If you're using Python 3.x, you can try this:
from datetime import datetime
item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item,"%d/%b/%Y:%H:%M:%S%z")
print(checkdate)
Result:
1995-07-01 00:00:01-04:00
For more details, see "strftime() and strptime() Behavior" in the Python documentation.
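With %z the result is a timezone-aware datetime, so the offset can be inspected or converted. The following is a small sketch for Python 3 only; the names offset and as_utc are illustrative.

```python
from datetime import datetime, timedelta, timezone

item = "01/Jul/1995:00:00:01-0400"
checkdate = datetime.strptime(item, "%d/%b/%Y:%H:%M:%S%z")

# The parsed offset is kept on the datetime itself.
offset = checkdate.utcoffset()

# Convert to UTC for comparing timestamps recorded with different offsets.
as_utc = checkdate.astimezone(timezone.utc)

print(offset, as_utc.isoformat())
```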
My function should read data from a file containing the date and time each tweet was written and the sentiment (good, bad or neutral) it was classified as; select the entries between a start and an end date; and finally create three dictionaries (positive, negative and neutral) that use the date as the key and the number of positive, negative or neutral tweets made on that day as the value.
The problems I have are:
a) How do I get only the date to display, and not the date and time?
b) How do I get my program to include both the start and the end date?
c) How do I separate a key and value with a semi-colon in a dictionary?
def get_sentiment_dates(start_date, end_date):
    positive_dict = {}
    negative_dict = {}
    neutral_dict = {}
    f = open("BAC2_answer.csv", "r")
    tweets = f.readlines()
    bin_use =[]
    bin_trash =[]
    bin_use_senti = []
    bin_trash_senti = []
    start_date_obj = datetime.strptime(start_date, '%Y-%m-%d')
    end_date_obj = datetime.strptime(end_date, '%Y-%m-%d')
    for i in tweets:
        specs = i.split(',')
        t_and_d = specs[0]
        dt_obj = datetime.strptime(t_and_d, "%Y-%m-%d %H:%M:%S")
        chars_body = specs[1].strip()
        if ((dt_obj >= start_date_obj) and dt_obj <= (end_date_obj)):
            bin_use.append(dt_obj)
            bin_use_senti.append(chars_body)
        else:
            bin_trash.append(dt_obj)
            bin_trash_senti.append(chars_body)
    num_of_pos = 0
    num_of_neg = 0
    num_of_neut = 0
    for i,j in zip(bin_use, bin_use_senti):
        if j == 'Bullish':
            num_of_pos +=1
            positive_dict = (i, num_of_pos)
        elif j == 'Bearish':
            num_of_neg+=1
            negative_dict = (i, num_of_neg)
        else:
            num_of_neut+=1
            neutral_dict = (i, num_of_neut)
    # print str(positive_dict) + "," + str(negative_dict) + "," + str(neutral_dict)
    f.close()
    return [positive_dict,negative_dict,neutral_dict]
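One possible way to approach the three points, as a hedged sketch and not a drop-in replacement: it works on an in-memory list of (timestamp, sentiment) pairs instead of the CSV file, and the names count_sentiment_by_day, format_counts and sample_rows are invented for this example. Comparing .date() values drops the time part (a) and makes the end date inclusive (b), because in the code above end_date_obj is midnight of the end date and any later tweet on that day fails the <= test. A dictionary has no separator of its own, so (c) is handled when formatting the output.

```python
from collections import Counter
from datetime import datetime


def count_sentiment_by_day(rows, start_date, end_date):
    """Count Bullish/Bearish/other rows per calendar day, both ends inclusive."""
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()
    counts = {"Bullish": Counter(), "Bearish": Counter(), "Neutral": Counter()}
    for stamp, sentiment in rows:
        # .date() discards the time, so all tweets of one day share a key.
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        if start <= day <= end:
            key = sentiment if sentiment in ("Bullish", "Bearish") else "Neutral"
            counts[key][day] += 1
    return counts["Bullish"], counts["Bearish"], counts["Neutral"]


def format_counts(counter):
    """Render each entry as 'YYYY-MM-DD;count', sorted by date."""
    return [f"{day};{n}" for day, n in sorted(counter.items())]


# Hypothetical sample data in the same shape as the CSV columns.
sample_rows = [
    ("2013-01-01 09:30:00", "Bullish"),
    ("2013-01-01 18:00:00", "Bullish"),
    ("2013-01-02 10:00:00", ""),
    ("2013-01-02 23:59:59", "Bearish"),
    ("2013-01-03 00:00:00", "Bullish"),
]

positive, negative, neutral = count_sentiment_by_day(
    sample_rows, "2013-01-01", "2013-01-02"
)
print(format_counts(positive), format_counts(negative), format_counts(neutral))
```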
I'm working on converting a Twitter created_at date into an integer using the following code, but I'm getting an index out of range error:
creation_date = [time.strftime('%Y-%m-%d %H:%M:%S', time.strptime(status['created_at'],
                 '%a %b %d %H:%M:%S +0000 %Y')) for status in statuses]
for x in range(len(creation_date)):
    year = int(creation_date[x][0:4])
    month = int(creation_date[x][5:7])
    day = int(creation_date[x][8:10])
    newCreationDate = []
    newCreationDate[x] = datetime(year,month,day)
Try this:
newCreationDate = []
for x in range(len(creation_date)):
    year = int(creation_date[x][0:4])
    month = int(creation_date[x][5:7])
    day = int(creation_date[x][8:10])
    newCreationDate.append(datetime(year,month,day))
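You're emptying the newCreationDate array each time through the loop, and assigning to index x of an empty list raises the IndexError. Create the list once before the loop and append to it:
newCreationDate = []
for d in creation_date:
    year = int(d[0:4])
    month = int(d[5:7])
    day = int(d[8:10])
    newCreationDate.append(datetime(year,month,day))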
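The round trip through a formatted string and slicing can also be skipped by parsing created_at straight into datetime objects. This is a sketch under the assumption that every timestamp carries the +0000 offset, as in the format string from the question; the statuses list and the name TWITTER_FORMAT are made-up sample values.

```python
from datetime import datetime

# Same format string as in the question; "+0000" is matched literally.
TWITTER_FORMAT = "%a %b %d %H:%M:%S +0000 %Y"

# Hypothetical sample data shaped like the statuses in the question.
statuses = [
    {"created_at": "Wed Aug 27 13:08:45 +0000 2008"},
    {"created_at": "Mon Jul 02 08:15:00 +0000 2012"},
]

creation_dates = [
    datetime.strptime(status["created_at"], TWITTER_FORMAT) for status in statuses
]

# Keep only year, month and day, as the original loop does.
creation_days = [d.date() for d in creation_dates]
print(creation_days)
```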