I have a large text file on the web that I am using requests to obtain and parse data from. The text file begins each line with a format like [Mon Oct 10 08:58:26 2022]. How can I get the latest 7 days or convert only the datetime to an object or string for storing and parsing later? I simply want to extract the timestamps from the log and print them
You can use TimedRotatingFileHandler to rotate logs daily or every 7 days.
read more about timed rotating file handler here
and
read more about extracting timestamps from files
Can you tell me if this snippet solves your problem?
from datetime import datetime
log_line = "[Sun Oct 09 06:14:26 2022] Wiladoc is browsing your wares."
_datetime = log_line[1:25]
_datetime_strp = datetime.strptime(_datetime, '%a %b %d %H:%M:%S %Y')
print(_datetime)
print(_datetime_strp)
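To also cover the "latest 7 days" part of the question, here is a minimal sketch that pulls the bracketed stamp off each line with a regular expression and keeps only the recent ones. The names STAMP_RE, extract_timestamps and last_n_days are made up for this example, and the sample lines (other than the Wiladoc one) are invented; it assumes every stamp looks exactly like the one in the question.

```python
import re
from datetime import datetime, timedelta

# Matches a leading "[Mon Oct 10 08:58:26 2022]" stamp and captures the text
# inside the brackets. Assumes the stamp is always at the start of the line.
STAMP_RE = re.compile(r"^\[(\w{3} \w{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4})\]")
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def extract_timestamps(lines):
    """Yield a datetime for every line that starts with a bracketed stamp."""
    for line in lines:
        match = STAMP_RE.match(line)
        if match:
            yield datetime.strptime(match.group(1), STAMP_FORMAT)


def last_n_days(stamps, now, days=7):
    """Keep only the stamps that fall within `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [stamp for stamp in stamps if cutoff <= stamp <= now]


# Invented sample data; in practice this would be response.text.splitlines().
sample_log = [
    "[Sun Oct 02 23:59:59 2022] An older entry.",
    "[Sun Oct 09 06:14:26 2022] Wiladoc is browsing your wares.",
    "[Mon Oct 10 08:58:26 2022] A newer entry.",
    "a line without a timestamp",
]
stamps = list(extract_timestamps(sample_log))
recent = last_n_days(stamps, now=datetime(2022, 10, 10, 9, 0, 0))
for stamp in recent:
    print(stamp.isoformat())
```

Passing `now` explicitly keeps the filter easy to test; for a live log you would probably pass datetime.now() instead.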
Related
I went through the Python datetime documentation and other related pages, but I am unable to get this to work.
I have the following string that I want to convert to python date object.
May 29, 2018 10:40:06 CDT AM:
I use the following to parse it, but Python 2.7 raises a "does not match format" error.
datetime_object = datetime.strptime(line, '%B %d, %Y %I:%M:%S %Z %p:')
It seems that CDT is not a timezone name that strptime recognizes for %Z, since the same call works with GMT.
>>> str(datetime.strptime('May 29, 2018 10:40:06 GMT AM:', '%B %d, %Y %I:%M:%S %Z %p:'))
'2018-05-29 10:40:06'
from dateutil import parser
print (parser.parse("May 29 13:40:06 CDT 2018"))
Output
2018-05-29 13:40:06
Reference: python-dateutil
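If you would rather stay with the standard library, one option is to pull the abbreviation out yourself and attach a fixed offset. This is only a sketch: the TZ_OFFSETS table and the parse_with_abbreviation helper are assumptions made for this example, and timezone abbreviations are ambiguous in general, so the mapping has to match your data.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed offsets for illustration only; extend or correct for your own data.
TZ_OFFSETS = {
    "CDT": timezone(timedelta(hours=-5), "CDT"),
    "CST": timezone(timedelta(hours=-6), "CST"),
    "GMT": timezone.utc,
}


def parse_with_abbreviation(text):
    """Parse a string like 'May 29, 2018 10:40:06 CDT AM:' into an aware datetime."""
    match = re.fullmatch(r"(.+) ([A-Z]{2,5}) (AM|PM):", text.strip())
    if match is None:
        raise ValueError("unexpected format: %r" % text)
    stamp, abbreviation, meridiem = match.groups()
    # Parse everything except the abbreviation, which strptime may not recognize.
    naive = datetime.strptime(stamp + " " + meridiem, "%B %d, %Y %I:%M:%S %p")
    # Raises KeyError for an abbreviation that is not in the table.
    return naive.replace(tzinfo=TZ_OFFSETS[abbreviation])


parsed = parse_with_abbreviation("May 29, 2018 10:40:06 CDT AM:")
print(parsed.isoformat())
```

Using replace(tzinfo=...) is safe here because the offsets are fixed; it would not be the right call for a zone with daylight saving rules.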
Building a Twitter scraper I'm stuck at converting tweet creation datetime (I get it as local timezone) into UTC.
Creation date from data-original-title -attribute's format is 12:17 AM - 8 Apr 2018. How can I convert it to UTC?
First, parse your string into a Python datetime object, then use the pytz module to attach a timezone. Note that replace(tzinfo=pytz.UTC) only labels the parsed time as UTC and does not shift it, so this example is correct only if the source time is already in UTC:
import datetime
import pytz
a = '12:17 AM - 8 Apr 2018'
final = datetime.datetime.strptime(a, '%I:%M %p - %d %b %Y').replace(tzinfo=pytz.UTC)
print(final)
# 2018-04-08 00:17:00+00:00
Also, if you want the result as a formatted string, you can do:
str_time = final.strftime('%d/%m/%Y %H:%M:%S')
print(str_time)
# 08/04/2018 00:17:00
P.S. If the pytz module is not installed, you can install it with:
pip install pytz
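Since the question says the time arrives in a local timezone, the value has to be shifted, not just labelled. Below is a minimal standard-library sketch of that conversion. The UTC+03:00 source offset is purely an assumption for the example; substitute whatever zone the page was actually rendered in.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical source offset: assume the tweet time was rendered in UTC+03:00.
SOURCE_TZ = timezone(timedelta(hours=3))

raw = "12:17 AM - 8 Apr 2018"
naive = datetime.strptime(raw, "%I:%M %p - %d %b %Y")

# Label the naive value with the zone it was written in...
local = naive.replace(tzinfo=SOURCE_TZ)
# ...then convert, which actually moves the clock time.
as_utc = local.astimezone(timezone.utc)
print(as_utc.isoformat())
```

A fixed offset ignores daylight saving time. For a named zone with pytz, the usual pattern is tz.localize(naive) followed by .astimezone(pytz.UTC) rather than replace(tzinfo=tz).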
Try Below:
import pandas as pd
datestr = '12:17 AM - 8 Apr 2018'
# %I (12-hour clock) is needed for %p to take effect; with %H, "12:17 AM" would be read as 12:17
utcDate = pd.to_datetime(datestr, format='%I:%M %p - %d %b %Y', utc=True)
I am currently working on a programme in Django that consumes a JSON API provided by a third party. There is a field in that API that I want, but the string it provides contains more information than I need.
The data I want is the created_at tag from the twitter api using tweepy. This created_at contains data in the following format:
"created_at": "Mon Aug 27 17:21:03 +0000 2012"
This is all fine, but it returns the date AND time, whereas I only want the time part of the above example, i.e. 17:21:03.
Is there any way I can just take this part of the created_at response string and store it in a separate variable?
You can use the dateutil module
from dateutil import parser
created_at = "Mon Aug 27 17:21:03 +0000 2012"
created_at = parser.parse(created_at)
print(created_at.time())
Output:
17:21:03
Try below code.
my_datetime = response_from_twitter['created_at']
my_time = my_datetime.split(' ')[3]
# my_time will now contain time part.
You could just split the string into a list and take the 4th element:
time = source['created_at'].split(' ')[3]
What about a regular expression with re.search():
>>> import re
>>> d = {"created_at": "Mon Aug 27 17:21:03 +0000 2012"}
>>> re.search(r'\d{2}:\d{2}:\d{2}', d['created_at']).group(0)
'17:21:03'
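On Python 3 the standard library can also parse the whole string, offset included, so the time comes back as a real time object instead of a substring. A small sketch, assuming created_at always has the shape shown in the question:

```python
from datetime import datetime

created_at = "Mon Aug 27 17:21:03 +0000 2012"

# %z consumes the "+0000" offset and produces a timezone-aware datetime.
parsed = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")

# .time() drops the date (and the offset), leaving a datetime.time object.
time_part = parsed.time()
print(time_part)
print(time_part.strftime("%H:%M"))
```

Having a time object rather than a string makes later comparisons and formatting easier than with the split or regex approaches.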
I'm trying to get a date for an event from a user.
The input is just a simple html text input.
My main problem is that I don't know how to parse the date.
If I try to pass the raw string, I get a TypeError, as expected.
Does Django have any date-parsing modules?
If you are using django.forms, look at DateField.input_formats. This argument lets you define several date formats, and DateField tries to parse the raw data against those formats in order.
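To illustrate the "try each format in order" idea outside of Django, here is a plain-Python sketch. It is not Django's implementation; INPUT_FORMATS and parse_date are hypothetical names chosen for this example, and the format list is just a guess at what users might type.

```python
from datetime import datetime

# Hypothetical list of accepted formats, tried from first to last.
INPUT_FORMATS = ["%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y"]


def parse_date(raw, formats=INPUT_FORMATS):
    """Return a date for the first format that matches, else raise ValueError."""
    for fmt in formats:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError("no format matched: %r" % raw)


print(parse_date("2010-04-21"))
print(parse_date("21.04.2010"))
print(parse_date(" 04/21/2010 "))
```

The order matters for ambiguous input: a string such as 01/02/2010 is read according to whichever matching format appears first in the list.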
Django doesn't, so to speak, but Python does. It seems I'm wrong here, as uptimebox's answer shows.
Say you're parsing this string: 'Wed Apr 21 19:29:07 +0000 2010' (This is from Twitter's JSON API)
You'd parse it into a datetime object like this:
import datetime
JSON_time = 'Wed Apr 21 19:29:07 +0000 2010'
my_time = datetime.datetime.strptime(JSON_time, '%a %b %d %H:%M:%S +0000 %Y')
print(type(my_time))
You'd get this, confirming it is a datetime object (Python 2 prints <type 'datetime.datetime'> instead):
<class 'datetime.datetime'>
More information on strptime() can be found here.
As of 2017, you can also use django.utils.dateparse.
The DateField can be used outside of Django forms.
For example, when {{ value|date:"SHORT_DATE_FORMAT" }} is used in the template:
from django.forms import DateField
from django.utils import formats
# need '%d.%m.%Y' instead of 'd.m.Y' from get_format()
dformat = ('.' + formats.get_format("SHORT_DATE_FORMAT", lang=request.LANGUAGE_CODE)).replace('.', '.%').replace('-', '-%').replace('/', '/%')[1:]
dfield = DateField(input_formats=(dformat,))
# date_string is the raw text received from the user
date_value = dfield.to_python(date_string)
How do I determine the most recently modified file from an FTP directory listing? I used the max function on the Unix timestamp locally, but the FTP listing is harder to parse. The fields on each line are separated only by spaces.
from ftplib import FTP
ftp = FTP('ftp.cwi.nl')
ftp.login()
data = []
ftp.dir(data.append)
ftp.quit()
for line in data:
    print(line)
output:
drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 .
dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 ..
-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX
Just to make some corrections:
import time
date_str = ' '.join(line.split()[5:8])
time.strptime(date_str, '%b %d %H:%M')
And to find the most recent file
import time

file_dict = {}
for line in data:
    col_list = line.split()
    date_str = ' '.join(col_list[5:8])
    # datePattern is a compiled regex for the wanted file names; its definition is not shown here
    if datePattern.search(col_list[8]):
        file_dict[time.strptime(date_str, '%b %d %H:%M')] = col_list[8]
# the largest parsed timestamp belongs to the most recent file
s = file_dict[max(file_dict)]
print(s)
If the FTP server supports the MLSD command (and quite possibly it does), you can use the FTPDirectory class from that answer in a related question.
Create an ftplib.FTP instance (e.g. aftp) and an FTPDirectory instance (e.g. aftpdir), connect to the server, .cwd to the directory you want, and read the files using aftpdir.getdata(aftp). After that, you can get the name of the most recently modified file with:
import operator
max(aftpdir, key=operator.attrgetter('mtime')).name
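On Python 3.3 and later, ftplib.FTP has an mlsd() method of its own, which yields (name, facts) pairs. The sketch below shows one way to pick the newest file from such pairs. The newest_file helper and the sample entries are invented for this example, and it assumes the server reports the "type" and "modify" facts.

```python
def newest_file(entries):
    """Return the name of the most recently modified regular file, or None.

    `entries` is an iterable of (name, facts) pairs, the shape that
    ftplib.FTP.mlsd() yields.
    """
    files = [
        (facts["modify"], name)
        for name, facts in entries
        if facts.get("type") == "file" and "modify" in facts
    ]
    if not files:
        return None
    # "modify" is a YYYYMMDDHHMMSS string, so string order is chronological order.
    return max(files)[1]


# Invented sample data standing in for a live server response.
sample_entries = [
    (".", {"type": "cdir", "modify": "20120320094800"}),
    ("INDEX", {"type": "file", "modify": "20120320094800"}),
    ("README", {"type": "file", "modify": "20120321070500"}),
]
print(newest_file(sample_entries))
```

With a live connection this would be called as newest_file(ftp.mlsd()), provided the server supports MLSD.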
To parse the date, you can use (from version 2.5 onwards):
datetime.datetime.strptime('Mar 21 14:32', '%b %d %H:%M')
You can split each line and get the date:
date_str = ' '.join(line.split(' ')[5:8])
Then parse the date (check out egenix mxDateTime package, specifically the DateTimeFromString function) to get comparable objects.
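Putting the split and parse steps together with only the standard library might look like the sketch below. It is a rough illustration under several assumptions: the listing is in the Unix style shown in the question, every entry shows a clock time (older entries often show a year instead, which this does not handle), and the year is supplied by the caller because the listing omits it. The README line is invented so that there is more than one file to compare.

```python
from datetime import datetime

listing = [
    "drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 .",
    "dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 ..",
    "-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX",
    "-rw-r--r-- 1 ftp-usr pdmaint 812 Mar 21 07:05 README",  # hypothetical extra entry
]


def parse_listing_line(line, assumed_year):
    """Return (modified, name) for a regular file, or None for anything else."""
    # Split on runs of whitespace, at most 8 times, so names with spaces survive.
    fields = line.split(None, 8)
    if len(fields) < 9 or not fields[0].startswith("-"):
        return None  # directories, links and malformed lines are skipped
    month, day, clock, name = fields[5], fields[6], fields[7], fields[8]
    stamp = datetime.strptime(
        "%d %s %s %s" % (assumed_year, month, day, clock), "%Y %b %d %H:%M"
    )
    return stamp, name


def most_recent(lines, assumed_year):
    """Return the name of the newest regular file in the listing, or None."""
    parsed = [p for p in (parse_listing_line(l, assumed_year) for l in lines) if p]
    return max(parsed)[1] if parsed else None


print(most_recent(listing, assumed_year=2012))
```

Supplying the year avoids strptime's default of 1900, which would otherwise reject a Feb 29 entry.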