retrieve different values from a single string with regular expressions and python - python

Suppose that I have this:
valuestringdate = "24/6/2010"
and I want to get something like this from the variable
day = 24
month = 6
year = 2010

Use the .split() method.
In this case,
dateList = valuestringdate.split("/")
Which would produce the list: dateList = ["24", "6", "2010"]
Using indexes:
day = dateList[0] would set day = "24"
From there you can use day = int(day) to convert the day from a string to an integer.
You should be able to figure it out from there.

You could just split the string:
day, month, year = valuestringdate.split('/')
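Note that the unpacked values are still strings. A minimal sketch, assuming every field in the string is numeric, that converts them to integers in the same step:

```python
valuestringdate = "24/6/2010"

# Split on "/" and convert each piece to an int while unpacking.
day, month, year = map(int, valuestringdate.split("/"))

print(day, month, year)  # 24 6 2010
```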

Related

Split a list of 'dd.mm.yyyy hh' strings into four separate lists of integers

I'm very new to programming. I've tried to search the website for similar problems, but can't find the information I need.
I have a list that contains multiple strings showing year, month, day and hour. I need to split this list into four lists of years, months, days and hours. The values have to be integers in the four lists.
The time format is: 'dd.mm.yyyy hh', example: '01.11.2020 02'
I'm able to split the string '01.11.2020 02' using this code:
timeStamp = '01.11.2020 02'
def getYear(timeStampStr):
    yearStr = timeStampStr[6:10]
    year = int(yearStr)
    return year
def getMonth(timeStampStr):
    monthStr = timeStampStr[3:5]
    month = int(monthStr)
    return month
def getDay(timeStampStr):
    dayStr = timeStampStr[0:2]
    day = int(dayStr)
    return day
def getHour(timeStampStr):
    hourStr = timeStampStr[11:13]
    hour = int(hourStr)
    return hour
I can then get the wanted result with:
print(getMonth(timeStamp))
However, this doesn't work when timeStamp is a list:
timeStamp = ['01.11.2020 00:00', '01.11.2020 01:00', '01.11.2020 02:00', etc].
What can I do to split it into four?
For a list, you can use map like this:
timeStamp = ['01.11.2020 00:00', '01.11.2020 01:00', '01.11.2020 02:00']
print(list(map(getMonth, timeStamp)))
You can also use the split function on each string in the list (a list itself has no split method) to divide it into parts and then label them:
timeStamp[0].split('.')
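One way to build the four integer lists with split is sketched below. The variable names (time_stamps, years, months, days, hours) are illustrative, and the code assumes every entry looks like 'dd.mm.yyyy hh' or 'dd.mm.yyyy hh:mm':

```python
time_stamps = ['01.11.2020 00:00', '01.11.2020 01:00', '01.11.2020 02:00']

years, months, days, hours = [], [], [], []
for stamp in time_stamps:
    date_part, time_part = stamp.split(' ')   # e.g. '01.11.2020' and '00:00'
    day, month, year = date_part.split('.')
    hour = time_part.split(':')[0]            # handles both 'hh' and 'hh:mm'
    years.append(int(year))
    months.append(int(month))
    days.append(int(day))
    hours.append(int(hour))

print(years, months, days, hours)
```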

Changing an 8 character string in the form "yyyymmdd" into year, month, and day integer values

I am trying to write a function that converts an 8 character string of the form "yyyymmdd" into integer values of year, month, and day based on the string. The function parses a given string and returns integer year, month and day. I wrote the code for it, but I am having difficulties with returning the correct integer values in the right format. For example: y, m, d = parseDate("19700218") should return the integer values 1970 for y, 2 for m, and 18 for d.
My code is not correct, but I think that the start of it is correct:
def parseDate(datestr):
    datestr.split()
    datestr.strip()
    return datestr
y,m,d = parseDate("19700218")
I hope that this is just an easy fix.
I suggest using strptime and getting the values from the parsed date/datetime object.
This is an out-of-the-box, flexible, and extensible solution.
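A minimal sketch of the strptime approach, assuming the input is always exactly in 'yyyymmdd' form. The function name parse_date is illustrative, not taken from the question:

```python
from datetime import datetime

def parse_date(datestr):
    """Return (year, month, day) as integers from a 'yyyymmdd' string."""
    # strptime validates the date and raises ValueError for impossible ones.
    parsed = datetime.strptime(datestr, "%Y%m%d")
    return parsed.year, parsed.month, parsed.day

y, m, d = parse_date("19700218")
print(y, m, d)  # 1970 2 18
```

Unlike plain slicing, this rejects strings that are not real calendar dates.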
Code
s = "19700218"
y = s[0:4]
m = s[4:6]
d = s[6:8]
Output
y=1970
m=02
d=18
Hope it helps.
Try something like this:
def parse_date(string_date):
    year = int(string_date[0:4])
    month = int(string_date[4:6])
    day = int(string_date[6:8])
    return year, month, day
year, month, day = parse_date(date)
Test Code
date = '19901131'
def parse_date(string_date):
    year = int(string_date[0:4])
    month = int(string_date[4:6])
    day = int(string_date[6:8])
    return year, month, day
year, month, day = parse_date(date)
print(year)
print(month)
print(day)
Output
1990
11
31
Since we are converting to int, leading 0s will be dropped. Let me know if you want a format that keeps the leading 0s.
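If the leading zeros are needed again for display, one option is to pad the integers when formatting. A small sketch with example values (the numbers here are made up for illustration):

```python
year, month, day = 1990, 11, 3   # example integers

# :04d and :02d pad with zeros to a fixed width.
formatted = f"{year:04d}{month:02d}{day:02d}"
print(formatted)  # 19901103
```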

Python Function running on a random date

#first and last day of every month
s_january, e_january = ("1/1/2017"), ("1/31/2017")
s_february, e_february = ("2/1/2017"), ("2/28/2017")
s_march, e_march = ("3/1/2017"), ("3/31/2017")
s_april, e_april = ("4/1/2017"), ("4/30/2017")
s_may, e_may = ("5/1/2017"), ("5/31/2017")
s_june, e_june = ("6/1/2017"), ("6/30/2017")
s_july, e_july = ("7/1/2017"), ("7/31/2017")
s_august, e_august = ("8/1/2017"), ("8/31/2017")
s_september, e_september = ("9/1/2017"), ("9/30/2017")
s_october, e_october = ("10/1/2017"), ("10/31/2017")
s_november, e_november = ("11/1/2017"), ("11/30/2017")
s_december, e_december = ("12/1/2017"), ("12/31/2017")
def foo(s_date, e_date):
    pass  # does stuff
foo(s_january, e_january)
foo(s_february, e_february)
foo(s_march, e_march)
foo(s_april, e_april)
foo(s_may, e_may)
foo(s_june, e_june)
foo(s_july, e_july)
foo(s_august, e_august)
foo(s_september, e_september)
foo(s_october, e_october)
foo(s_november, e_november)
foo(s_december, e_december)
I have a function that does stuff on a random date, but I have to call the function once for every month. If I pass the range for the whole year, I don't get the result that I want.
Is there any better way to avoid running it 12 times?
Set up your dates in a dictionary rather than 24 variables, and make life easier for yourself by computing the first and last day of each month. It would be useful also to represent your dates as datetimes not strings, since it's clear from your question header that you want to do computation on them.
import datetime
from dateutil import relativedelta
year = 2017
dates = {}
for month in range(1,13):
    dates[(year,month)] = (
        datetime.date(year,month,1),
        datetime.date(year,month,1)
        + relativedelta.relativedelta(months=1)
        - relativedelta.relativedelta(days=1))
The first element in each tuple is computed straightforwardly as the first day of the month. The second date is the same date, but with one month added (first day of the next month) and then one day subtracted, to get the last day of the month.
Then you can do:
for (year,month),(start,end) in dates.items():
    print(year, month, foo(start, end))
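The same "first day of next month minus one day" idea can also be sketched with only the standard library, in case dateutil is not installed. The helper name month_bounds is hypothetical:

```python
import datetime

def month_bounds(year, month):
    """Return (first_day, last_day) of the given month as date objects."""
    first = datetime.date(year, month, 1)
    # First day of the following month, rolling over to January after December.
    if month == 12:
        next_first = datetime.date(year + 1, 1, 1)
    else:
        next_first = datetime.date(year, month + 1, 1)
    # One day before that is the last day of this month.
    return first, next_first - datetime.timedelta(days=1)

dates = {(2017, m): month_bounds(2017, m) for m in range(1, 13)}
print(dates[(2017, 2)])
```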
You could use a dictionary to keep all start and end dates:
import calendar
import datetime as dt
def foo(s_date, e_date):
    print ("Doing something between {} and {}".format(s_date.strftime('%d/%m/%Y'), e_date.strftime('%d/%m/%Y')))
def getMonths(year):
    result = {}
    for month in range(1, 13):
        lastDayOfMonth = calendar.monthrange(year, month)[1]
        result[month] = (dt.datetime(year, month, 1), dt.datetime(year, month, lastDayOfMonth))
    return result
for month, start_end_dates in getMonths(2018).items():
    foo(*start_end_dates)
Prints:
Doing something between 01/01/2018 and 31/01/2018
Doing something between 01/02/2018 and 28/02/2018
Doing something between 01/03/2018 and 31/03/2018
...
What do you mean by putting the range for year?
You could consider putting your dates to a dictionary or nested lists.

How to add variables together into a new variable where you control the separation

Let's say I've declared three variables which together make up a date. How can I combine them into a new variable so that I can print them in the correct 1/2/03 format by simply printing the new variable name?
month = 1
day = 2
year = 03
date = month, day, year <<<<< What would this code have to be?
print(date)
I know I could set the sep='/' argument in the print statement if I pass all three variables individually, but this means I can't add additional text to the print statement without it also being separated by a /. Therefore I need a single variable I can use.
The .join() method does what you want (assuming the input is strings):
>>> '/'.join((month, day, year))
'1/2/03'
As do all of Python's formatting options, e.g.:
>>> '%s/%s/%s' % (month, day, year)
'1/2/03'
But date formatting (and working with dates in general) is tricky, and there are existing tools to do it "right", namely the datetime module, see date.strftime().
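>>> import datetime
>>> date = datetime.date(2003, 1, 2)
>>> date.strftime('%m/%d/%y')
'01/02/03'
>>> date.strftime('%-m/%-d/%y')
'1/2/03'
Note the - before the m and the d to suppress leading zeros on the month and day. This flag is a platform extension: it works on Linux and macOS, but not on Windows, where # is used instead.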
You can use the join method. You can also use a generator expression to format the numbers so they are each 2 digits wide.
>>> '/'.join('%02d' % i for i in [month, day, year])
'01/02/03'
You want to read about the str.format() method:
https://docs.python.org/3/library/stdtypes.html#str.format
Or if you're using Python 2:
https://docs.python.org/2/library/stdtypes.html#str.format
The join() function will also work in this case, but learning about str.format() will be more useful to you in the long run.
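A short sketch of the str.format() approach, assuming the three values are integers and that only the year should be zero-padded to two digits:

```python
month, day, year = 1, 2, 3   # year 3 stands for "03" once zero-padded

# {} inserts the value as-is; {:02d} pads the integer to two digits.
date = "{}/{}/{:02d}".format(month, day, year)

print(date)                      # 1/2/03
print("Today is " + date + ".")  # extra text is not affected by the separator
```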
The correct answer is: use the datetime module:
import datetime
month = 1
day = 2
year = 2003
date = datetime.date(year, month, day)
print(date)
print(date.strftime("%m/%d/%Y"))
# etc
Trying to handle dates as tuples is just a PITA, so don't waste your time.

Parsing date string in Python

How can I rewrite the following clause:
if u'января' in date_category['title']:
    month = 1
elif u'февраля' in date_category['title']:
    month = 2
elif u'марта' in date_category['title']:
    month = 3
elif u'апреля' in date_category['title']:
    month = 4
elif u'мая' in date_category['title']:
    month = 5
elif u'июня' in date_category['title']:
    month = 6
elif u'июля' in date_category['title']:
    month = 7
elif u'августа' in date_category['title']:
    month = 8
elif u'сентября' in date_category['title']:
    month = 9
elif u'октября' in date_category['title']:
    month = 10
elif u'ноября' in date_category['title']:
    month = 11
elif u'декабря' in date_category['title']:
    month = 12
It just looks ugly.
To solve this, parse your date information using the Python datetime module. It has support for locales, and will sort this out. If genitive forms are really the issue, then just map those forms to the nominative, then map back on output.
To answer your actual question -
Consider that you are pairing up data with integers. If you can transform your data into the format (month, integer), you can drive your code with data.
# the parameter 1 tells enumerate to start counting from 1. h/t #san4ez
for (n, month) in enumerate(u'января, feb, mar, apri, may, jun, jul, aug, sep, oct, nov, декабря'.split(', '), 1):
    if month in date_category['title']:
        return n  # assumes this loop sits inside a function
Get system support for Russian internationalization: OS locale support for use in Python
Use the locale module to set the locale: How to get unicode month name in Python?
Use strptime: Handling international dates in python
Demonstration: Locale troubles
Alternatively, just do:
months = ['Jan', 'Feb', 'Mar', ...]
monthToNumber = dict((name,i+1) for i,name in enumerate(months))
monthToNumber[date_category['title']]
months = (u'января', u'февраля', u'марта', u'апреля', u'мая', u'июня', u'июля', u'августа', u'сентября', u'октября', u'ноября', u'декабря')
month = next(i for i,name in enumerate(months,1) if name in date_category['title'])
You may also use the index method of a tuple or list to map month names to month numbers (this works only when the title is exactly the month name):
months = (u'января', u'февраля', u'марта', u'апреля', u'мая', u'июня', u'июля', u'августа', u'сентября', u'октября', u'ноября', u'декабря')
month = months.index(date_category['title'])+1
Create a list of the months, so that each month's position gives its number:
months = [u'января', u'февраля', u'марта', u'апреля', u'мая', u'июня', u'июля', u'августа', u'сентября', u'октября', u'ноября', u'декабря']
Then you can simply do:
month = max(enumerate((date_category['title'].find(month) for month in months), start = 1), key = lambda x: x[1])[0]
Simply create a hash array which maps your string to corresponding month number, then you can simply iterate through it, check where the key is in date_category['title'], and in all cases, set month to the corresponding value. Good luck.
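A minimal sketch of that dictionary-driven lookup. The names MONTHS and find_month and the sample title are hypothetical, and the code assumes the title contains at most one month name:

```python
# Genitive month names mapped to their month numbers.
MONTHS = {
    'января': 1, 'февраля': 2, 'марта': 3, 'апреля': 4,
    'мая': 5, 'июня': 6, 'июля': 7, 'августа': 8,
    'сентября': 9, 'октября': 10, 'ноября': 11, 'декабря': 12,
}

def find_month(title):
    """Return the number of the first month name found in title, or None."""
    for name, number in MONTHS.items():
        if name in title:
            return number
    return None

print(find_month('12 мая 2012'))  # 5
```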
Here's something a bit more perverse, because I enjoy proving the futility of saying "there should be only one way to do it":
import re
pattern = re.compile(u'(января)|(февраля)|(марта)|(апреля)|(мая)|(июня)|(июля)|(августа)|(сентября)|(октября)|(ноября)|(декабря)')
month = list(map(bool, pattern.search(date_category['title']).groups())).index(True) + 1
