I am trying to parse only the timestamp from a specific line in a log file using python. This is the line from the file:
Mar 29 06:12:42 10.11.100.22 [info.events] [WARNING] 10.11.100.22: event, 1234
How do I only get the timestamp from this? This is the code I am using at the minute, which finds the line from the file which has the word 'WARNING' in it, and then gets the timestamp.
def is_Warning(self,line):
    if line.find("WARNING") >= 0:
        ts = time.strptime(line, "%b %d %H:%M:%S")
        print "==================== %s" % ts
When I run this I get a 'ValueError: unconverted data remains: 10.11.100.22 [info.events] [WARNING] 10.11.100.22: event, 1234'
Can anyone help?
Use a regular expression to pull out the timestamp before parsing it.
import re
import time
...
def is_warning(self, line):
    if line.find("WARNING") >= 0:
        # Match only the leading "Mar 29 06:12:42" part of the line
        date = re.match(r"[A-Za-z]{3} \d{1,2} \d{2}:\d{2}:\d{2}", line).group()
        ts = time.strptime(date, "%b %d %H:%M:%S")
        print("===================== %s" % ts)
Note that time is a really old module. You should use datetime.datetime.strptime(date, format).time() if you need to get JUST a time.
strptime must match the entire string, not just its beginning. Since the timestamp has a fixed width, you can slice it off the front of the line:
ts = time.strptime(line[:15].strip(), "%b %d %H:%M:%S")
The slice [:15] returns only the first 15 characters of the string, which are the only characters you need.
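The two approaches above can be combined into one helper. This is a minimal sketch, not a definitive implementation: the function name parse_syslog_timestamp and its year parameter are assumptions made for illustration. The log line carries no year, so the caller has to supply one; passing it explicitly also keeps February 29 parseable.

```python
import re
from datetime import datetime

# Leading "Mon DD HH:MM:SS" stamp, e.g. "Mar 29 06:12:42"
STAMP_RE = re.compile(r"[A-Za-z]{3} +\d{1,2} \d{2}:\d{2}:\d{2}")

def parse_syslog_timestamp(line, year):
    """Return a datetime for the leading timestamp, or None if absent.

    `year` is an assumption supplied by the caller, because the log
    line itself does not contain one.
    """
    match = STAMP_RE.match(line)
    if match is None:
        return None
    # Normalise repeated spaces (single-digit days are often padded)
    stamp = " ".join(match.group().split())
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

if __name__ == "__main__":
    sample = "Mar 29 06:12:42 10.11.100.22 [info.events] [WARNING] 10.11.100.22: event, 1234"
    if "WARNING" in sample:
        print(parse_syslog_timestamp(sample, year=2021))
```

Returning None instead of raising lets the caller skip lines that do not start with a timestamp.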
My input file contains the text below:
"Tested on: wed Mar 31 09::34:00 CST 2021 "
I want to extract the date that follows the text "Tested on:" and convert it into a date value in "dd/mm/yyyy" format, e.g. 31/Mar/2021
I used the regex below, but I get an error when running it:
import re
import os
import dateparser

basepath = (r"C:\Users\xyz\test")
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_file():
            fn = entry.name

def f(fn):
    with open(fn) as f:
        for s in f:
            m = re.search(r'Tested on: \S+ (\S+) (\d+) \d+::\d+:\d+ AET (\d+)', s)
            subst = "\\2 \\1 \\3"
            result = re.sub(m, subst, s, 0, re.MULTILINE)
            if result:
                #Using strptime
                dt = dataetime.strptime(result.group(1), '%B %d %Y')
                dt_out = dt.strftime('%d/%m/%Y')
                return (fn, dt_out)
            else:
                return None

if __name__=="__main__":
    folder = (r"C:\Users\xyz\test/")
    _, _, filenames = next(os.walk(basepath))
    dates = []
    #dateparser.parse(dates = [])
    for i in filenames:
        print(f(folder + i), dates)
The code above raises AttributeError: 'str' object has no attribute 'group'.
The regex doesn't work here. It seems I need to map each month name to its number, like March --> 3.
re.sub returns a string, which is why you can't call .group() on it.
Normally result should already be the string you want, so you can remove the .group(1) and change the format string to match result.
The line would then be dt = datetime.strptime(result, '%d %b %Y')
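As an alternative to re.sub, the capture groups can be read straight from the match object and handed to strptime, which already knows how to map "Mar" to month 3. The following is a hedged sketch: the function name extract_tested_on and the pattern are assumptions for illustration, and the pattern deliberately accepts any time-zone token (the sample line says CST) and the doubled colon seen in the sample.

```python
import re
from datetime import datetime

# Groups: month abbreviation, day of month, four-digit year.
TESTED_ON_RE = re.compile(
    r"Tested on:\s+\S+\s+([A-Za-z]{3})\s+(\d{1,2})\s+[\d:]+\s+\S+\s+(\d{4})"
)

def extract_tested_on(line):
    """Return the date after 'Tested on:' as dd/mm/yyyy, or None."""
    match = TESTED_ON_RE.search(line)
    if match is None:
        return None
    month, day, year = match.groups()
    # %b parses the abbreviated month name (locale-dependent)
    parsed = datetime.strptime(f"{day} {month} {year}", "%d %b %Y")
    return parsed.strftime("%d/%m/%Y")

if __name__ == "__main__":
    print(extract_tested_on('"Tested on: wed Mar 31 09::34:00 CST 2021 "'))
```

If the 31/Mar/2021 style is wanted instead, the output format would be "%d/%b/%Y".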
I am trying to create a message that changes over time in Python. I know I will have to use a carriage return to write over the message. I want to print the following message:
It has been (n) minutes since you ran this
Time started: (date when program started)
Time: (date now)
But when I try to print it, I get this:
It has been 1 minute(s) since the past iteration. Since then, some pretty amazing stuff happened, like:
TIME STARTED: Tue Oct 27 2020, 11:14:41 PM
It has been 2 minute(s) since the past iteration. Since then, some pretty amazing stuff happened, like:
TIME STARTED: Tue Oct 27 2020, 11:14:41 PM
Here is my code:
import psutil #psutil - https://github.com/giampaolo/psutil
import time
import sys
import datetime

execute_shell_or_command = sys.executable
found_executable = False
executable_to_find = 'DragonManiaLegends.exe'
iteration = 1
then = time.time()
# Get a list of all running processes
while not found_executable:
    list = psutil.pids()
    # Go through list and check each process's executable name for executable_to_find
    for i in range(0, len(list)):
        try:
            p = psutil.Process(list[i])
            if p.cmdline()[0].find(executable_to_find) != -1:
                # DML found. Kill it
                now = time.time()
                print(f"Found {executable_to_find}. Killing execution...")
                p.kill()
                datetime_str_now = datetime.datetime.fromtimestamp(now).strftime('%a %b %d %Y, %I:%M:%S %p')
                datetime_str_then = datetime.datetime.fromtimestamp(then).strftime('%a %b %d %Y, %I:%M:%S %p')
                print(f"On {datetime_str_then}, you have started your quest to t e r m i n a t e {executable_to_find}"
                      f"\nOn {datetime_str_now}, you have completed that quest, and it has been terminated."
                      f"\nIt only took you {round(now - then)} seconds to eliminate {executable_to_find}!")
                # found_executable = True # Comment this out to loop FOREVER
                # break
        except:
            pass
    if 'python.exe' in execute_shell_or_command:
        now = time.time()
        datetime_str_now = datetime.datetime.fromtimestamp(now).strftime('%a %b %d %Y, %I:%M:%S %p')
        datetime_str_then = datetime.datetime.fromtimestamp(then).strftime('%a %b %d %Y, %I:%M:%S %p')
        sys.stdout.write(f"\rIt has been {iteration} minute(s) since the past iteration. Since then, some pretty amazing stuff happened, like:\nTIME STARTED: {datetime_str_then}.\nTIME: {datetime_str_now}")
        sys.stdout.flush()
        # print(f"\rIt has been {iteration} minute(s) since the past iteration. Since then, some pretty amazing stuff happened, like: TIME STARTED: {datetime_str_then}. TIME: {datetime_str_now}", end="", flush=True)
        iteration += 1
        time.sleep(60)
P.S. I am using the Windows Command Prompt to run the code.
Try two \n characters, because the first one only ends the current line and the second one shows a blank line:
sys.stdout.write(f"\rIt has been {iteration} minute(s) since the past iteration. Since then, some pretty amazing stuff happened, like:\n\nTIME STARTED: {datetime_str_then}.\n\nTIME: {datetime_str_now}")
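Note that \r only moves the cursor back to the start of the current line, so it cannot overwrite lines printed above it. One possible way to redraw a multi-line message in place is to use ANSI escape sequences (cursor up, erase line). The sketch below assumes the terminal interprets those sequences, which older Windows consoles may not do; the names build_frame, CURSOR_UP and ERASE_LINE are made up for this example.

```python
import sys
import time

CURSOR_UP = "\x1b[{n}A"   # ANSI: move the cursor up n lines
ERASE_LINE = "\x1b[2K"    # ANSI: erase the whole current line

def build_frame(lines, previous_line_count=0):
    """Build the text that redraws `lines` over the previous frame.

    Every line ends with a newline, so after a frame of N lines the
    cursor sits N lines below the first one; moving up N returns to it.
    """
    prefix = ""
    if previous_line_count:
        prefix = "\r" + CURSOR_UP.format(n=previous_line_count)
    body = "".join(ERASE_LINE + line + "\n" for line in lines)
    return prefix + body

if __name__ == "__main__":
    started = time.strftime("%a %b %d %Y, %I:%M:%S %p")
    shown = 0
    for minute in range(3):
        lines = [
            f"It has been {minute} minutes since you ran this",
            f"Time started: {started}",
            f"Time: {time.strftime('%a %b %d %Y, %I:%M:%S %p')}",
        ]
        sys.stdout.write(build_frame(lines, shown))
        sys.stdout.flush()
        shown = len(lines)
        time.sleep(1)  # shortened from 60 seconds for the demo
```

If a later frame has fewer lines than the previous one, the leftover lines are not cleared by this sketch.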
I'm trying to write a function that takes two datetime values and reads the log lines between them, for example:
start_point = "2019-04-25 09:30:46.781"
stop_point = "2019-04-25 10:15:49.109"
I'm thinking of an algorithm that checks:
if the dates are equal:
check if the start hour 0 char (09 -> 0) is higher or less than stop hour 0 char (10 -> 1);
same check with the hour 1 char ((start) 09 -> 9, (stop) 10 -> 0);
same check with the minute 0 char;
same check with the minute 1 char;
if the dates differ:
some other checks...
I don't know if I'm reinventing the wheel, but I'm really lost. Here are the things I tried:
1.
...
cmd = subprocess.Popen(['egrep "2019-04-19 ([0-1][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9].[0-9]{3}" file.log'], shell=True, stdout=subprocess.PIPE)
cmd_result = cmd.communicate()[0]
for i in str(cmd_result).split("\n"):
    print(i)
...
The problem with this one: when I plugged in the values from the example it didn't work, because a per-character comparison produces invalid ranges, such as [9-0] for the second hour digit and [3-1] for the first minute digit.
2.
I tried the solutions from "The best way to filter a log by a dates range in python".
Any help is appreciated.
EDIT
the log line structure:
...
2019-04-25 09:30:46.781 text text text ...
2019-04-25 09:30:46.853 text text text ...
...
EDIT 2
So I tried the code:
from datetime import datetime as dt

s1 = "2019-04-25 09:34:11.057"
s2 = "2019-04-25 09:59:43.534"
start = dt.strptime('2019-04-25 09:34:11.057','%Y-%m-%d %H:%M:%S.%f')
stop = dt.strptime('2019-04-25 09:59:43.534', '%Y-%m-%d %H:%M:%S.%f')
start_1 = dt.strptime('09:34:11.057','%H:%M:%S.%f')
stop_1 = dt.strptime('09:59:43.534','%H:%M:%S.%f')
with open('file.out','r') as file:
    for line in file:
        ts = dt.strptime(line.split()[1],'%H:%M:%S.%f')
        if (ts > start_1) and (ts < stop_1):
            print line
and I got the error
ValueError: time data 'Platform' does not match format '%H:%M:%S.%f'
So it seems I found another problem: some lines start with something other than a datetime. Is there a way to provide a regex that matches the datetime format?
EDIT 3
I fixed the ValueError raised when a line starts with a plain string, and the IndexError raised when a line has too few fields:
try:
    ts = dt.strptime(line.split()[1],'%H:%M:%S.%f')
    if (ts > start_1) and (ts < stop_1):
        print line
except IndexError as err:
    continue
except ValueError as err:
    continue
But now the output is not limited to the range I provide: it lists log lines from 2019-02-27 09:38:46.229 to 2019-02-28 09:57:11.028. Any thoughts?
Your edit 2 had the right idea. You need to put exception handling in to catch lines which are not formatted correctly and skip them, for example blank lines, or lines that do not have the timestamp. This can be done as follows:
from datetime import datetime

s1 = "2019-04-25 09:24:11.057"
s2 = "2019-04-25 09:59:43.534"
fmt = '%Y-%m-%d %H:%M:%S.%f'
start = datetime.strptime(s1, fmt)
stop = datetime.strptime(s2, fmt)

with open('file.out', 'r') as file:
    for line in file:
        line = line.strip()
        try:
            # Rebuild "date time" from the first two space-separated fields
            ts = datetime.strptime(' '.join(line.split(' ', maxsplit=2)[:2]), fmt)
            if start <= ts <= stop:
                print(line)
        except ValueError:
            # Blank line or no leading timestamp: skip it
            pass
The whole timestamp, date included, is used to create ts so that it can be compared correctly with start and stop.
Each line first has the trailing newline removed. It is then split on spaces up to twice. The first two splits are then joined back together and converted into a datetime object. If this fails, it implies that you do not have a correctly formatted line.
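The same steps can be packaged as a reusable generator that works on any iterable of lines, which makes the filter easy to try without a log file. This is only a sketch under the assumption that every wanted line starts with a "YYYY-MM-DD HH:MM:SS.mmm" stamp; the name lines_between is hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

def lines_between(lines, start, stop, fmt=FMT):
    """Yield the lines whose leading timestamp lies in [start, stop]."""
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split(" ", maxsplit=2)
        if len(fields) < 2:
            continue  # blank or too short to hold "date time"
        try:
            ts = datetime.strptime(" ".join(fields[:2]), fmt)
        except ValueError:
            continue  # line does not start with a timestamp
        if start <= ts <= stop:
            yield line

if __name__ == "__main__":
    sample = [
        "Platform started\n",
        "2019-04-25 09:30:46.781 text text text\n",
        "2019-04-25 09:40:00.000 text text text\n",
        "\n",
        "2019-04-25 10:15:49.109 text text text\n",
    ]
    start = datetime.strptime("2019-04-25 09:34:11.057", FMT)
    stop = datetime.strptime("2019-04-25 09:59:43.534", FMT)
    for hit in lines_between(sample, start, stop):
        print(hit)
```

Because an open file is itself an iterable of lines, it can be passed in directly in place of the sample list.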
In Python, you can parse a string representing a time with the strptime method. There are numerous working examples on Stack Overflow:
Converting string into datetime
However, what if your string represented a time range, as opposed to a specific time; how could you parse the string using the strptime method?
For example, let’s say you have a user input a start and finish time.
studyTime = input("Please enter your study period (start time – finish time)")
You could prompt, or even force, the user to enter the time in a specific format.
studyTime = input("Please enter your study period (hh:mm - hh:mm): ")
Let’s say the user enters 03:00 PM – 05:00 PM. How can we then parse this string using strptime?
formatTime = datetime.datetime.strptime(studyTime, "%I:%M %p")
The formatTime above would only work on a single time, e.g. 03:00 PM, not on a start – finish range such as 03:00 PM – 05:00 PM. And the following would mean excess format data, so a ValueError would be raised.
formatTime = datetime.datetime.strptime(studyTime, "%I:%M %p - %I:%M %p")
Of course there are alternatives, such as having the start and finish times as separate strings. However, my question is specifically, is there a means to parse one single string, that contains more than one time representation, using something akin to the below.
formatTime = datetime.datetime.strptime(studyTime, "%I:%M %p - %I:%M %p")
strptime() can only parse a single datetime string representation.
You have to split the input string on " - " and parse each item with strptime():
>>> from datetime import datetime
>>>
>>> s = "03:00 PM - 05:00 PM"
>>> [datetime.strptime(item, "%I:%M %p") for item in s.split(" - ")]
[datetime.datetime(1900, 1, 1, 15, 0), datetime.datetime(1900, 1, 1, 17, 0)]
I also checked the popular third-party libraries dateutil, delorean and arrow, and I don't think they provide datetime range parsing. dateutil's fuzzy_with_tokens option looked promising, but it raises an error:
>>> from dateutil.parser import parse
>>> s = "03:00 PM - 05:00 PM"
>>> parse(s, fuzzy_with_tokens=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/dateutil/parser.py", line 1008, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/dateutil/parser.py", line 390, in parse
res, skipped_tokens = self._parse(timestr, **kwargs)
TypeError: 'NoneType' object is not iterable
which probably means it is not supposed to parse multiple datetimes either.
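The split-and-parse approach can be wrapped in a small helper that also tolerates the en dash used in the question's example input. This is a sketch only; parse_time_range is a hypothetical name, and the accepted separators are an assumption about what users might type.

```python
import re
from datetime import datetime

def parse_time_range(text, fmt="%I:%M %p"):
    """Parse 'start - finish' into a (start, finish) pair of datetimes.

    Accepts a hyphen or an en dash as the separator, with optional
    surrounding whitespace.
    """
    parts = re.split(r"\s*[-\u2013]\s*", text.strip())
    if len(parts) != 2:
        raise ValueError(f"expected 'start - finish', got {text!r}")
    start, finish = (datetime.strptime(part, fmt) for part in parts)
    return start, finish

if __name__ == "__main__":
    begin, end = parse_time_range("03:00 PM - 05:00 PM")
    print(begin.time(), end.time(), end - begin)
```

Since neither value carries a date, both fall on strptime's default of 1900-01-01, so subtracting them only makes sense when the range does not cross midnight.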
I've been searching for a solution to this problem all over the web with no luck. I am trying to concatenate a string with a datetime object to build a JSON-formatted string, but for some reason doing so raises an error.
This is the code:
data = '{"gpio":"00000000","timestamp":'+str(int(time()))+',"formatted_time":"'+str(datetime.datetime.now().strftime("%A %b %d %X"))+'""","time_zone":"'+str(read_tz_file())+'","firmware":"0.0"}'
Even more strangely, adding any key after the method call seems to work.
If you're writing or reading JSON, use the json library:
import json
print(json.dumps(
    dict(gpio='00000000',
         timestamp=time(),
         formatted_time=datetime.datetime.now().strftime("%A %b %d %X"),
         time_zone=read_tz_file(),
         firmware='0.0')))
For starters, it might help to put the code in a readable form,
preferably multi-line (e.g. one JSON element per line).
This makes it easy to spot quoting errors.
data = ('{' +
'"gpio":"00000000",'+
'"timestamp":'+str(int(time()))+','+
'"formatted_time":"'+ str(datetime.datetime.now().strftime("%A %b %d %X")) +','+
'""",'+
'"time_zone":"'+str(read_tz_file())+'",'+
'"firmware":"0.0"'+
'}')
Then try to debug the errors one by one.
E.g. str(int(time())) fails for me with:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
That's because time is a module, not a function; the proper function would be time.time():
data = ('' +
'{"gpio":"00000000",'+
'"timestamp":'+str(int(time.time()))+','+
'"formatted_time":"'+ str(datetime.datetime.now().strftime("%A %b %d %X")) +','+
'""",'+
'"time_zone":"'+str(read_tz_file())+'",'+
'"firmware":"0.0"'+
'}')
This gives me a valid string (after providing a dummy implementation of read_tz_file()), but it is invalid JSON (what is that """ supposed to do?).
A better way would be to construct a dictionary first, and convert that to JSON:
import datetime
import json
import time

d = {
    "gpio": 0,
    "timestamp": int(time.time()),
    "formatted_time": datetime.datetime.now().strftime("%A %b %d %X"),
    "time-zone": read_tz_file(),
    "firmware": "0.0",
}
s = json.dumps(d)
print(s)
Use the json module to generate JSON text, and use the same Unix time for both timestamp and formatted_time:
import json
import time
ts = int(time.time())
json_text = json.dumps(dict(
    gpio="00000000",
    timestamp=ts,
    formatted_time=time.strftime("%A %b %d %X", time.localtime(ts)),
    time_zone=read_tz_file(),
    firmware="0.0"))
Note: in general, time.localtime(ts) may provide more info than datetime.now() e.g. in Python 2:
>>> import time
>>> from datetime import datetime
>>> ts = time.time()
>>> time.strftime('%Z%z')
'CEST+0200'
>>> time.strftime('%Z%z', time.localtime(ts))
'CEST+0000'
>>> datetime.now().strftime('%Z%z')
''
>>> datetime.fromtimestamp(ts).strftime('%Z%z')
''
Notice: only time.strftime('%Z%z') provides complete info for the local timezone on my machine, see python time.strftime %z is always zero instead of timezone offset.
On Python 3, datetime.now() does not provide info about the local timezone either:
>>> import time
>>> from datetime import datetime
>>> ts = time.time()
>>> time.strftime('%Z%z')
'CEST+0200'
>>> time.strftime('%Z%z', time.localtime(ts))
'CEST+0200'
>>> datetime.now().strftime('%Z%z')
''
>>> datetime.fromtimestamp(ts).strftime('%Z%z')
''
You could work around it:
>>> from datetime import timezone
>>> datetime.now(timezone.utc).astimezone().strftime('%Z%z')
'CEST+0200'
>>> datetime.fromtimestamp(ts, timezone.utc).astimezone().strftime('%Z%z')
'CEST+0200'
If you want to work with datetime in Python 3, your code could look like:
#!/usr/bin/env python3
import json
from datetime import datetime, timedelta, timezone
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
local_time = datetime.now(timezone.utc).astimezone()
json_text = json.dumps(dict(
    gpio="00000000",
    timestamp=(local_time - epoch) // timedelta(seconds=1),
    formatted_time=local_time.strftime("%A %b %d %X"),
    time_zone=read_tz_file(),
    firmware="0.0"))
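For experimenting without the original read_tz_file(), the same idea can be written as a function that takes the time and the zone name as arguments. This is a sketch with assumptions: build_payload and its tz_name parameter are hypothetical stand-ins, and it uses datetime.timestamp() (available since Python 3.3) as a shorter alternative to the epoch subtraction shown above.

```python
import json
from datetime import datetime, timezone

def build_payload(local_time, tz_name):
    """Serialize one record to JSON text.

    `local_time` must be timezone-aware so that timestamp() is
    unambiguous; `tz_name` stands in for the value that the
    question's read_tz_file() would return.
    """
    if local_time.tzinfo is None:
        raise ValueError("local_time must be timezone-aware")
    return json.dumps({
        "gpio": "00000000",
        "timestamp": int(local_time.timestamp()),
        "formatted_time": local_time.strftime("%A %b %d %X"),
        "time_zone": tz_name,
        "firmware": "0.0",
    })

if __name__ == "__main__":
    now = datetime.now(timezone.utc).astimezone()
    print(build_payload(now, "UTC"))  # "UTC" is a placeholder zone name
```

Because json.dumps handles all quoting, the stray-quote problem from the string concatenation cannot occur here.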