Regular expression replace function includes too much text - python

I'm a Python newbie. My script (below) contains a function named
"fn_regex_raw_date_string" that is intended to convert
a "raw" date string like this: Mon, Oct 31, 2011 at 8:15 PM
into a date string like this: _2011-Oct-31_PM_8-15_
Question No. 1: When the "raw" date string contains extraneous
characters, e.g. (xxxxxMon, Oct 31, 2011 at 8:15 PMyyyyyy), how should
I modify my regular expression routine to exclude the extraneous characters?
I was tempted to remove my comments from the script below to make it
simpler to read, but I thought it might be more helpful for me to leave
them in the script.
Question No. 2: I suspect that I should code another function that will
replace the "Oct" in "_2011-Oct-31_PM_8-15_" with "10". But I can't
help wondering if there is some way to include that functionality in
my fn_regex_raw_date_string function.
Any help would be much appreciated.
Thank you,
Marceepoo
import sys
import re, pdb
#pdb.set_trace()
def fn_get_datestring_sysarg():
    this_scriptz_FULLName = sys.argv[0]
    try:
        date_string_raw = sys.argv[1]
    #except Exception, e:
    except Exception:
        date_string_raw_error = this_scriptz_FULLName + ': sys.argv[1] error: No command line argument supplied'
        print date_string_raw_error
    #returnval = this_scriptz_FULLName + '\n' + date_string_raw
    returnval = date_string_raw
    return returnval
def fn_regex_raw_date_string(date_string_raw):
    # Do re replacements
    # p:\Data\VB\Python_MarcsPrgs\Python_ItWorks\FixCodeFromLegislaturezCalifCode_MikezCode.py
    # see also (fnmatch) p:\Data\VB\Python_MarcsPrgs\Python_ItWorks\bookmarkPDFs.aab.py
    #srchstring = r"(.?+)(Sun|Mon|Tue|Wed|Thu|Fri|Sat)(, )(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)( )([\d]{1,2})(, )([\d]{4})( at )([\d]{1,2})(\:)([\d]{1,2})( )(A|P)(M)(.?+)"
    srchstring = r"(Sun|Mon|Tue|Wed|Thu|Fri|Sat)(, )(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)( )([\d]{1,2})(, )([\d]{4})( at )([\d]{1,2})(\:)([\d]{1,2})( )(A|P)(M)"
    srchstring = re.compile(srchstring)
    replacement = r"_\7-\3-\5_\13M_\9-\11_"
    #replacement = r"_\8-\4-\6_\14M_\10-\12_"
    regex_raw_date_string = srchstring.sub(replacement, date_string_raw)
    return regex_raw_date_string
# Mon, Oct 31, 2011 at 8:15 PM
if __name__ == '__main__':
    try:
        this_scriptz_FULLName = sys.argv[0]
        date_string_raw = fn_get_datestring_sysarg()
        date_string_mbh = fn_regex_raw_date_string(date_string_raw)
        print date_string_mbh
    except:
        print 'error occurred - fn_get_datestring_sysarg()'

You probably want to use Python's standard date and time parsing:
http://docs.python.org/library/time.html#time.strptime
http://mail.python.org/pipermail/tutor/2006-March/045729.html
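As a minimal sketch of that suggestion (not part of the original answer), time.strptime can parse the clean form of the date, and time.strftime can write it back out with a numeric month. The format string RAW_FORMAT is an assumption based on the sample date in the question, and %a, %b, and %p depend on the current locale:

```python
import time

# Assumed layout, taken from the sample date in the question.
RAW_FORMAT = "%a, %b %d, %Y at %I:%M %p"

# Parse the clean date string into a struct_time.
parsed = time.strptime("Mon, Oct 31, 2011 at 8:15 PM", RAW_FORMAT)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday, parsed.tm_hour, parsed.tm_min)

# Write it back out with a numeric month and a 24-hour clock.
numeric = time.strftime("%Y-%m-%d_%H-%M", parsed)
print(numeric)
```

This only handles the clean string. Extra characters before or after the date make strptime raise ValueError, so they have to be removed first.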

This code uses a regular expression to strip everything before the abbreviated weekday at the start of the string, and everything after AM or PM at the end of the string.
It then calls datetime.strptime(date_str, date_format), which does the hard work of parsing and returns a datetime instance:
from datetime import datetime
import calendar
import re
# -------------------------------------
# _months = "|".join(calendar.month_abbr[1:])
_weekdays = "|".join(calendar.day_abbr)
_clean_regex = re.compile(r"""
    ^
    .*?
    (?=""" + _weekdays + """)
    |
    (?<=AM|PM)
    .*?
    $
    """, re.X)
# -------------------------------------
def parseRawDateString(raw_date_str):
    try:
        date_str = _clean_regex.sub("", raw_date_str)
        return datetime.strptime(date_str, "%a, %b %d, %Y at %I:%M %p")
    except ValueError as ex:
        print("Error parsing date from '{}'!".format(raw_date_str))
        raise ex
# -------------------------------------
if __name__ == "__main__":
    from sys import argv
    s = argv[1] if len(argv) > 1 else "xxxxxMon, Oct 31, 2011 at 8:15 PMyyyyyy"
    print("Raw date: '{}'".format(s))
    d = parseRawDateString(s)
    print("datetime object:")
    print(d)
    print("Formatted date: '{}'".format(d.strftime("%A, %d %B %Y # %I:%M %p")))
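If you would rather stay with a pure regular expression approach, the extra text survives because sub() only replaces the part of the string that matched and leaves everything around it in place. A minimal sketch of one alternative, assuming the same date layout as the question, is to use search() and build the result from the match alone. The names MONTHS, DATE_RE, stamp_from_raw, and numeric_month are illustrative and do not appear in the original script:

```python
import re

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Same shape as the pattern in the question, but with named groups.
DATE_RE = re.compile(
    r"(?:Sun|Mon|Tue|Wed|Thu|Fri|Sat), "
    r"(?P<month>" + "|".join(MONTHS) + r") (?P<day>\d{1,2}), (?P<year>\d{4})"
    r" at (?P<hour>\d{1,2}):(?P<minute>\d{1,2}) (?P<ampm>[AP]M)"
)

def stamp_from_raw(raw, numeric_month=False):
    """Return a stamp built only from the matched date, or None if no date is found."""
    match = DATE_RE.search(raw)  # search() ignores any text around the match
    if match is None:
        return None
    parts = match.groupdict()
    if numeric_month:
        # Map the month abbreviation to its two-digit number (Oct becomes 10).
        parts["month"] = "{:02d}".format(MONTHS.index(parts["month"]) + 1)
    return "_{year}-{month}-{day}_{ampm}_{hour}-{minute}_".format(**parts)

print(stamp_from_raw("xxxxxMon, Oct 31, 2011 at 8:15 PMyyyyyy"))
print(stamp_from_raw("xxxxxMon, Oct 31, 2011 at 8:15 PMyyyyyy", numeric_month=True))
```

The numeric_month flag covers Question No. 2 inside the same function, so a second pass over the string is not needed.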

Related

How do I convert a string to a datetime without many nested try...excepts?

I'm trying to check user input of a date/time against several allowable formats. (I know about the dateutil library. It's not what I'm looking for in this case.)
If the input matches one of the formats, the function must return a datetime object.
If every "try...except" fails, the function must return None. But I have 30-50 different date/time formats that I need to check.
The deep indentation in my code is confusing. How do I organize this format checking in a good style with good performance?
# Test format check program
import datetime
def datetime_format_check(str):
    try:
        dt = datetime.datetime.strptime(str, "%y-%m-%d %H:%M")
        return dt
    except:
        try:
            dt = datetime.datetime.strptime(str, "%Y-%m-%d %H:%M")
            return dt
        except:
            try:
                dt = datetime.datetime.strptime(str, "%y-%m-%d")
                return dt
            except:
                try:
                    dt = datetime.datetime.strptime(str, "%Y-%m-%d")
                    return dt
                except:
                    try:
                        dt = datetime.datetime.strptime(str, "%H:%M")
                        return dt
                    except:
                        try:
                            # . . .
                            # many many try...except blocks )))
                            # . . .
                            return None # last except far far away from a screen border. ))))
while True:
    str = input("Input date: ")
    print("Result: ", datetime_format_check(str))
Repetitive code? Well, that just begs to be replaced with a loop.
Put all of the formats in a list and iterate over it, checking each format:
def datetime_format_check(s):
    formats = ["%y-%m-%d %H:%M", "%Y-%m-%d %H:%M", "%y-%m-%d"] # etc
    for fmt in formats:
        try:
            dt = datetime.datetime.strptime(s, fmt)
            return dt
        except ValueError:
            pass
    return None
Some minor corrections I made to your code:
Don't name your argument str; it shadows the builtin.
Don't use a bare except:, always catch the specific exception.

Where is argument read by Python parse_args?

I want to call a Python module from the command line to convert a time in my timezone to UTC time like this:
$ dt-la-utc.py "2017-10-14 12:10:00"
When I execute the module shown below, the convert_la_utc function works correctly if I hard-code the date and time. However, I want to feed it the date and time as input on the command line. But the parse_args function isn't working. If I run the Python debugger and examine the "args" variable, there's nothing in it. What am I doing wrong?
#!/usr/bin/env python
import argparse
import datetime
from pdb import set_trace as debug
import pytz
import sys
def parse_args():
    """Parse arguments."""
    parser = argparse.ArgumentParser(description="Convert LA time to UTC time.")
    parser.add_argument("dt", help="LA date and time in format: YYYY-MM-DD HH:MM:SS")
    args = parser.parse_args()
    debug()
    return args
def convert_la_utc():
    """Convert time in Los Angeles to UTC time."""
    date = '2017-10-12'
    time = '20:45:00'
    date_time = date + ' ' + time
    datetime_format = '%Y-%m-%d %H:%M:%S'
    local = pytz.timezone("America/Los_Angeles")
    naive = datetime.datetime.strptime(date_time, datetime_format)
    local_dt = local.localize(naive, is_dst=None)
    utc_dt = local_dt.astimezone(pytz.utc)
    print "Datetime in Los Angeles: {0}".format(date_time)
    print "UTC equivalent datetime: {0}".format(utc_dt.strftime("%Y-%m-%d %H:%M:%S"))
def main():
    args = parse_args()
    convert_la_utc()
if __name__ == '__main__':
    sys.exit(main())
You need to retrieve the argument from the parsed result, for example:
def main():
    args = parse_args()
    dt = args.dt
parser.parse_args() returns an argparse.Namespace object; you can verify this by adding print type(args) to main(). Each declared argument is available as an attribute of that object, so the value passed on the command line is in args.dt.
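A minimal sketch of the whole flow, from the parsed Namespace to a datetime, might look like the following. The argv parameter and the parse_la_datetime helper are hypothetical additions for illustration; passing an explicit list to parse_args() stands in for the real command line, and the time zone conversion is left out:

```python
import argparse
import datetime

def parse_args(argv=None):
    """Parse arguments; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Convert LA time to UTC time.")
    parser.add_argument("dt", help="LA date and time in format: YYYY-MM-DD HH:MM:SS")
    return parser.parse_args(argv)

def parse_la_datetime(dt_text):
    """Turn the command-line string into a naive datetime."""
    return datetime.datetime.strptime(dt_text, "%Y-%m-%d %H:%M:%S")

# Simulate: dt-la-utc.py "2017-10-14 12:10:00"
args = parse_args(["2017-10-14 12:10:00"])
naive = parse_la_datetime(args.dt)
print(type(args).__name__, naive)
```

In the original script, the equivalent change would be to give convert_la_utc a parameter and call it with args.dt instead of the hard-coded date and time.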

How to insert a string on a specific line in a Sublime Text Plugin?

I have the following plugin that puts a time stamp at the top of the document on line 1 but I'd like it to insert the string on a different line, like line 6. At first I thought the insert method was 0 indexed but that doesn't seem to be the case. How would I tell the insert method which line to insert the signature string at?
import sublime, sublime_plugin
import datetime, getpass
class SignatureCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        signature = "[%s]\n" % (datetime.datetime.now().strftime("%A, %B %d %I:%M %p"))
        self.view.insert(edit, 0, signature)
Thanks for your help :)
Update: thanks to Enteleform for the wonderful answer, I added a line_num variable for added clarity :)
import sublime, sublime_plugin
import datetime, getpass
class SignatureOnSpecificLineCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        line_num = 6 # line number that signature will go on
        signature = "[%s]\n" % (datetime.datetime.now().strftime("%A, %B %d %I:%M %p"))
        line6_column0 = self.view.text_point(line_num - 1, 0)
        self.view.insert(edit, line6_column0, signature)
view.insert() takes a point as its location argument.
Points are essentially sequential character positions within a document.
For example, in the following document:
Hello
World
a caret at the end of World would be at point 11
5 characters in Hello
1 NewLine character after Hello
5 characters in World
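The arithmetic above can be sketched in plain Python. This is only an illustration of how a point relates to a row and column, not Sublime Text's implementation, and point_for is a hypothetical helper name:

```python
def point_for(text, row, column):
    """Return the character offset of (row, column), with both counted from 0."""
    lines = text.split("\n")
    # Every earlier line contributes its characters plus one newline character.
    return sum(len(line) + 1 for line in lines[:row]) + column

document = "Hello\nWorld"
print(point_for(document, 1, 5))  # end of "World": 5 + 1 + 5 = 11
print(point_for(document, 0, 0))  # start of the document
```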
In order to calculate the point of a particular row & column, use:
view.text_point(row, column)
Example:
import sublime, sublime_plugin
import datetime, getpass
class SignatureCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        signature = "[%s]\n" % (datetime.datetime.now().strftime("%A, %B %d %I:%M %p"))
        line = 6
        point = self.view.text_point(line - 1, 0)
        self.view.insert(edit, point, signature)
Note:
Rows start at 0, so they are offset by -1 from the line numbers displayed in Sublime Text, which is why I pass line - 1 to view.text_point().

Proper way to use python's datetime's strftime method

I am using FUSE (a virtual file system) to try and implement a read call that will give me the current date/time as a string.
import os
import sys
import errno
import datetime
from fuse import FUSE, FuseOSError, Operations
class FileSys(Operations):
    def __init__(self, root):
        self.root = root
    def _full_path(self, partial):
        if partial.startswith("/"):
            partial = partial[1:]
        path = os.path.join(self.root, partial)
        return path
    # allows us to set attributes
    def getattr(self, path, fh= None):
        full_path = self._full_path(path)
        st = os.lstat(full_path)
        return dict((key, getattr(st, key)) for key in ('st_atime', 'st_ctime',
            'st_gid', 'st_mode', 'st_mtime', 'st_nlink', 'st_size', 'st_uid'))
    # allows us to see files
    def readdir(self, path, fh):
        #logging.info("Enter readdir")
        full_path = self._full_path(path)
        dirents = ['.', '..']
        if(os.path.isdir(full_path)):
            dirents.extend(os.listdir(full_path))
        for r in dirents:
            yield r
    def read(self, path, length, offset, fh= None):
        date = datetime.datetime.today()
        date = date.strftime("%a %b %d %H:%M:%S %Z %Y")
        return date
def main(root, mountpoint):
    FUSE(FileSys(root), mountpoint, foreground= True)
if __name__ == '__main__':
    main('/home/user/mydir', '/mnt/dummy')
However, my output prints like this:
Tue May 2
when I really want something like this:
Tue May 27 14:43:06 CDT 2014
So I am only getting output up to the first digit of the day. Does anyone see what I am doing wrong? I looked at the strftime documentation, and I am sure all of my format codes correspond to the correct pieces of the formatted string.
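One possible explanation, which the post itself does not confirm: getattr reports the st_size of the real file under root, and a reader typically stops after that many bytes, no matter how long the string returned by read is. The desired output cut to 9 characters is exactly the output shown. Separately, datetime.datetime.today() returns a naive datetime, for which %Z formats as an empty string. The sketch below illustrates both points in plain Python without FUSE; reported_size is a hypothetical value chosen to reproduce the symptom:

```python
import datetime

wanted = "Tue May 27 14:43:06 CDT 2014"

# Hypothetical st_size of the underlying file, as reported by getattr.
reported_size = 9
print(wanted[:reported_size])  # Tue May 2

# A naive datetime has no time zone, so %Z produces an empty string.
naive = datetime.datetime(2014, 5, 27, 14, 43, 6)
print(repr(naive.strftime("%Z")))  # ''
```

If this is the cause, the size reported by getattr would need to cover the length of the generated string, and the time zone name would have to come from an aware datetime or from the time module.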

Linking Python to AppleScript for datetime

Basically, my goal is to use a start and end date from Python as arguments in this AppleScript:
import commands
cmd = """osascript -e 'tell application "Calendar"
    set all_calendars to title of every calendar
    if "SimpleCal" is in all_calendars then
        set primary_calendar to "SimpleCal"
    else
        create calendar with name "SimpleCal"
        set primary_calendar to "SimpleCal"
    end if
    set start_date_mod to date %s
    set end_date_mod to date "Wednesday, August 14, 2013 at 8:10:00 PM"
    tell calendar primary_calendar
        set new_event to make new event at end with properties {description:"Imported with App", summary:"event_type", location:"", start date: start_date_mod, end date:end_date_mod}
        tell new_event
            make new display alarm at end with properties {trigger interval:-5}
        end tell
    end tell
end tell
'""" % ("Wednesday, August 14, 2013 at 8:00:00 PM")
status = commands.getoutput(cmd)
print status
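Use strftime to convert the Python date into a string that AppleScript can coerce into a date object:
>>> import datetime
>>> AS_DATE_FORMAT = "%A, %B %d, %Y %I:%M:%S %p"
>>> right_now = datetime.datetime.now()
>>> date_string = right_now.strftime(AS_DATE_FORMAT)
>>> date_string
'Wednesday, August 14, 2013 11:01:10 AM'
Then, in your AppleScript section, put a placeholder inside the quotes of each date literal:
set start_date_mod to date "%s"
set end_date_mod to date "%s"
and fill both placeholders with strings produced this way, for example % (date_string, date_string) at the end of the Python string. Note that the quotes around %s are required; the script in the question leaves them out on the start date line.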
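As a minimal sketch, the two date lines can be built from datetime objects before they go into the osascript command. The helper name applescript_date_lines is hypothetical, the format is the one from the answer, and the output of %A, %B, and %p depends on the current locale. Whether AppleScript accepts the resulting string also depends on the date settings of the Mac, which this sketch does not check:

```python
import datetime

# Format taken from the answer above.
AS_DATE_FORMAT = "%A, %B %d, %Y %I:%M:%S %p"

def applescript_date_lines(start, end):
    """Return the two AppleScript 'set ... to date "..."' lines for an event."""
    template = (
        'set start_date_mod to date "{start}"\n'
        'set end_date_mod to date "{end}"'
    )
    return template.format(
        start=start.strftime(AS_DATE_FORMAT),
        end=end.strftime(AS_DATE_FORMAT),
    )

start = datetime.datetime(2013, 8, 14, 20, 0, 0)
end = start + datetime.timedelta(minutes=10)
lines = applescript_date_lines(start, end)
print(lines)
```

The returned text can then be substituted into the larger script string in place of the two hard-coded date lines.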
