How can I rewrite the following clause:
if u'января' in date_category['title']:
    month = 1
elif u'февраля' in date_category['title']:
    month = 2
elif u'марта' in date_category['title']:
    month = 3
elif u'апреля' in date_category['title']:
    month = 4
elif u'мая' in date_category['title']:
    month = 5
elif u'июня' in date_category['title']:
    month = 6
elif u'июля' in date_category['title']:
    month = 7
elif u'августа' in date_category['title']:
    month = 8
elif u'сентября' in date_category['title']:
    month = 9
elif u'октября' in date_category['title']:
    month = 10
elif u'ноября' in date_category['title']:
    month = 11
elif u'декабря' in date_category['title']:
    month = 12
It just looks ugly.
To solve this, parse your date information with Python's datetime module. It supports locales and will sort this out. If the genitive forms are really the issue, just map those forms to the nominative, then map back on output.
To answer your actual question -
Consider that you are pairing up data with integers. If you can transform your data into the format (month, integer), you can drive your code with data.
for (n, month) in enumerate(u'января, feb, mar, apri, may, jun, jul, aug, sep, oct, nov, декабря'.split(', '), 1):  # the parameter 1 tells enumerate to start counting from 1. h/t #san4ez
    if month in date_category['title']: return n
Get system support for Russian internationalization: OS locale support for use in Python
Use the locale module to set the locale: How to get unicode month name in Python?
Use strptime: Handling international dates in python
Demonstration: Locale troubles
Alternatively, just do:
months = ['Jan', 'Feb', 'Mar', ...]
monthToNumber = dict((name,i+1) for i,name in enumerate(months))
monthToNumber[date_category['title']]
months = (u'января', u'февраля', u'марта', u'апреля', u'мая', u'июня', u'июля', u'августа', u'сентября', u'октября', u'ноября', u'декабря')
month = next(i for i,name in enumerate(months,1) if name in date_category['title'])
You can also use the sequence's index method to map a month name to its number. This works only when date_category['title'] is exactly the month name:
months = (u'января', u'февраля', u'марта', u'апреля', u'мая', u'июня', u'июля', u'августа', u'сентября', u'октября', u'ноября', u'декабря')
month = months.index(date_category['title'])+1
Create a list of the months, so that each month's position gives its number:
months = [u'января', u'февраля', u'марта', u'апреля', u'мая', u'июня', u'июля', u'августа', u'сентября', u'октября', u'ноября', u'декабря']
Then you can simply do:
month = max(enumerate((date_category['title'].find(month) for month in months), start = 1), key = lambda x: x[1])[0]
Simply create a dict (hash map) that maps each month string to its month number. Then iterate over it, check whether each key occurs in date_category['title'], and set month to the corresponding value when it does. Good luck.
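A minimal sketch of that idea, assuming the title contains at most one genitive month name. The names MONTHS and month_from_title are made up for illustration, and the strings are written without the u prefix because Python 3 strings are already Unicode:

```python
# Map each genitive Russian month name to its number (1-12).
MONTHS = {
    'января': 1, 'февраля': 2, 'марта': 3, 'апреля': 4,
    'мая': 5, 'июня': 6, 'июля': 7, 'августа': 8,
    'сентября': 9, 'октября': 10, 'ноября': 11, 'декабря': 12,
}

def month_from_title(title):
    """Return the number of the first month name found in title, or None."""
    for name, number in MONTHS.items():
        if name in title:
            return number
    return None

print(month_from_title('5 марта 2012'))  # 3
```

Returning None when nothing matches leaves it to the caller to decide how to handle a title without a month.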
Here's something a bit more perverse, because I enjoy proving the futility of saying "there should be only one way to do it":
import re
pattern = re.compile(u'(января)|(февраля)|(марта)|(апреля)|(мая)|(июня)|(июля)|(августа)|(сентября)|(октября)|(ноября)|(декабря)')
month = list(map(bool, pattern.search(date_category['title']).groups())).index(True) + 1
Related
I am a novice working on a short program that detects dates and prints whether each one is valid. Here's what it looks like:
dateRegex = re.compile(r'''(
    (0[1-9]|[12]\d|30|31)
    [.\\ /]
    (0[1-9]|1[0-2])
    [.\\ /]
    ([1-2][0-9]{3})
    )''', re.VERBOSE)
def dateValidation(date):
    text = str(pyperclip.paste())
    mo = date.findall(text)
    for groups in mo:
        day = groups[1]
        month = groups[2]
        year = groups[3]
        leapyear = ''
        if ( month == '04' or month == '06' or month == '09' or month == '11' ) and ( int(day) > 30 ):
            print(f'The {groups[0]} string is not a date.')
            continue
        if int(year) % 4 == 0:
            leapyear += year
        if int(year) % 100 == 0:
            leapyear = ''
        if ( int(year) % 100 == 0 ) and ( int(year) % 400 == 0 ):
            leapyear += year
        if month == '02' and leapyear == year:
            if int(day) > 29:
                print(f'The {groups[0]} string is not a date.')
                continue
        elif month == '02' and leapyear != year:
            if int(day) > 28:
                print(f'The {groups[0]} string is not a date.')
                continue
        print(f'The {groups[0]} string is a date.')
dateValidation(dateRegex)
I know a lot of the code isn't clean or practical, so I'm open to suggestions about optimizing it, of course (I'm fairly new to this after all, and apparently doing horribly), but the question is mainly about the output of the program.
I copied 01.02.2016 21.6.2003 26.7.1999 to the clipboard and expected to get a result for all three dates. Instead, the only output was "The 01.02.2016 string is a date." Did I overlook something? What could have gone wrong?
If it isn't obvious from the code, here is a detailed description of what the program is supposed to do:
Write a regular expression that can detect dates in the DD/MM/YYYY format. Assume that the days range from 01 to 31, the months range from 01 to 12, and the years range from 1000 to 2999. Note that if the day or month is a single digit, it’ll have a leading zero.
The regular expression doesn’t have to detect correct days for each month or for leap years; it will accept nonexistent dates like 31/02/2020 or 31/04/2021. Then store these strings into variables named month, day, and year, and write additional code that can detect if it is a valid date. April, June, September, and November have 30 days, February has 28 days, and the rest of the months have 31 days. February has 29 days in leap years. Leap years are every year evenly divisible by 4, except for years evenly divisible by 100, unless the year is also evenly divisible by 400. Note how this calculation makes it impossible to make a reasonably sized regular expression that can detect a valid date.
Thanks in advance.
I think the problem with the regular expression follows from the format of the dates in the text. Since some of the dates are given as 21.6.2003 and not 21.06.2003, your regex misses that.
For the dates you can use the following one:
r'(0*[0-9]|1[0-9]|2[0-9]|3[0-1])\.(0*[0-9]|1[0-2])\.([1-2][0-9]{3})'
Here,
(0*[0-9]|1[0-9]|2[0-9]|3[0-1]) matches days in the range 0-31. In the first alternative, 0* tells the regex to match zero or more of the preceding token, so a day written as 06 or as 6 is caught either way.
The same approach is used in (0*[0-9]|1[0-2]), which matches a month in the range 0-12.
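As a rough sketch of how the two steps could fit together: let a loose regex find the candidates and let datetime.date decide whether each calendar date exists. This assumes one- or two-digit days and months separated by "." or "/"; the names DATE_RE and find_valid_dates are illustrative, not from the question.

```python
import re
from datetime import date

# Day and month may be 1 or 2 digits; the year is restricted to 1000-2999.
DATE_RE = re.compile(r'\b(\d{1,2})[./](\d{1,2})[./]([12]\d{3})\b')

def find_valid_dates(text):
    """Return (matched_string, is_valid) for every date-like string in text."""
    results = []
    for match in DATE_RE.finditer(text):
        day, month, year = (int(g) for g in match.groups())
        try:
            date(year, month, day)  # raises ValueError for e.g. 31.04 or 29.02.2021
            valid = True
        except ValueError:
            valid = False
        results.append((match.group(0), valid))
    return results

print(find_valid_dates('01.02.2016 21.6.2003 26.7.1999 31.04.2021'))
```

Delegating the check to date() covers the month lengths and the leap-year rule without any hand-written arithmetic.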
I am trying to write a function that converts an 8 character string of the form "yyyymmdd" into integer values of year, month, and day based on the string. The function parses a given string and returns integer year, month and day. I wrote the code for it, but I am having difficulties with returning the correct integer values in the right format. For example: y, m, d = parseDate("19700218") should return the integer values 1970 for y, 2 for m, and 18 for d.
My code is not correct, but I think that the start of it is correct:
def parseDate(datestr):
    datestr.split()
    datestr.strip()
    return datestr
y,m,d = parseDate("19700218")
I hope that this is just an easy fix.
I suggest using strptime and reading the values from the parsed date/datetime object.
This is an out-of-the-box, flexible and extensible solution.
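A minimal sketch of the strptime approach, assuming the input is always an 8-character "yyyymmdd" string; the function name parse_date is chosen here for illustration:

```python
from datetime import datetime

def parse_date(datestr):
    """Parse a 'yyyymmdd' string and return (year, month, day) as integers."""
    parsed = datetime.strptime(datestr, '%Y%m%d')  # raises ValueError on bad input
    return parsed.year, parsed.month, parsed.day

y, m, d = parse_date('19700218')
print(y, m, d)  # 1970 2 18
```

Unlike plain slicing, this also rejects strings that are not real calendar dates, such as "19901131".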
Code
s = "19700218"
y = s[0:4]
m = s[4:6]
d = s[6:8]
Output
y=1970
m=02
d=18
Hope it helps.
Try something like this:
def parse_date(string_date):
    year = int(string_date[0:4])
    month = int(string_date[4:6])
    day = int(string_date[6:8])
    return year, month, day
year, month, day = parse_date(date)
Test Code
date = '19901131'
def parse_date(string_date):
    year = int(string_date[0:4])
    month = int(string_date[4:6])
    day = int(string_date[6:8])
    return year, month, day
year, month, day = parse_date(date)
print(year)
print(month)
print(day)
Output
1990
11
31
Since we are converting to int, leading 0s will be dropped. Let me know if you want a format that keeps the leading 0s.
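If the leading zeros are wanted back for display, one option is to keep the integers and pad only when formatting. A small sketch, with the output layout (yyyy-mm-dd) picked arbitrarily for the example:

```python
def parse_date(string_date):
    """Split a 'yyyymmdd' string into integer year, month and day."""
    return int(string_date[0:4]), int(string_date[4:6]), int(string_date[6:8])

year, month, day = parse_date('19700218')
# Format specifiers pad with zeros only when the values are printed.
formatted = f'{year:04d}-{month:02d}-{day:02d}'
print(formatted)  # 1970-02-18
```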
I can not figure out how to take the year, day and week to return the month. Right now I am just trying to develop a Python Script that will do this. The goal after finishing this script is to use it for a Spark SQL Query to find the month since in my data I am given a day, year and week in each row.
As of now my Python code looks like this. It only works for the call I have in print(getmonth(2, 30, 2018)), which returns 7. For other dates the output is only "None". I have tried variables too, with no success.
import datetime
def getmonth(day, week, year):
    for month in range(1,13):
        try:
            date = datetime.datetime(year, month, day)
        except ValueError:
            # skip months in which this day does not exist
            continue
        iso_year, iso_weeknum, iso_weekday = date.isocalendar()
        if iso_weeknum == week:
            return date.month
print(getmonth(2, 30, 2018))
#isocalendar() returns a tuple (ISO year, ISO week number, ISO weekday)
#The ISO year is 1-9999, the week number is 1-53, and the weekday is 1-7 (Monday is 1)
I don't really understand your question, but I think datetime will work. Source: Get date from ISO week number in Python:
>>> from datetime import datetime
>>> day = 28
>>> week = 30
>>> year = 2018
>>> t = datetime.strptime('{}_{}_{}{}'.format(day,week,year,-0), '%d_%W_%Y%w')
>>> t.strftime('%W')
'30'
>>> t.strftime('%m')
'07'
>>>
A simpler solution can be created using the pendulum library. As in your code, loop through the month numbers, create a date for each, and compare its week number against the desired week. If they match, halt the loop; if no date matches, exit the loop with, say, a -1. (pendulum.create is the pendulum 1.x constructor; in pendulum 2.x the equivalent is pendulum.datetime.)
>>> import pendulum
>>> for month in range(1,13):
...     date = pendulum.create(2018, month, 28)
...     if date.week_of_year == 30:
...         break
... else:
...     month = -1
...
>>> month
7
>>> date
<Pendulum [2018-07-28T00:00:00+00:00]>
Here is a brute force method that loops through the days of the year (It expects the day as Monday being 0 and Sunday being 6, it also returns the Month 0 indexed, January being 0 and December being 11):
import datetime
def month(day, week, year):
    #Generate list of No of days of the month
    months = [31,28,31,30,31,30,31,31,30,31,30,31]
    if((year % 4 == 0 and year % 100 != 0) or year % 400 == 0): months[1] += 1
    #The day that the chosen year started on
    currentDay = datetime.datetime(year, 1, 1).weekday()
    #ISO wk1 of the yr is the first wk with a thursday, otherwise it's wk53 of the previous yr
    #(test the weekday of 1 January, not the requested day)
    currentWeek = 1 if currentDay < 4 else 0
    #Loop over every day of the year
    for i in range(sum(months)):
        #If the week is correct and day is correct you're done
        if day == currentDay and week == currentWeek:
            return months.index(next(filter(lambda x: x!=0, months)))
        #Otherwise, go to next day of wk/next wk of yr
        currentDay = (currentDay + 1) % 7
        if currentDay == 0:
            currentWeek += 1
        #And decrement counter for current month
        months[months.index(next(filter(lambda x: x!=0, months)))]-=1
print(month(2, 30, 2018)) # 6 i.e. July
months.index(next(filter(lambda x: x!=0, months))) gets the first month whose days have not all been used up, i.e. the month you're currently in.
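On Python 3.8 or newer, the day-by-day loop can probably be avoided altogether with date.fromisocalendar, which builds a date directly from an ISO year, week and weekday. A short sketch, assuming the inputs follow the ISO convention (Monday is 1, Sunday is 7, unlike the 0-based weekday above) and that a 1-based month is wanted; month_from_iso is a name invented for this example:

```python
from datetime import date

def month_from_iso(year, week, weekday):
    """Return the month (1-12) of the given ISO year, ISO week and ISO weekday.

    weekday follows the ISO convention: Monday is 1 and Sunday is 7.
    """
    return date.fromisocalendar(year, week, weekday).month

print(month_from_iso(2018, 30, 2))  # 7 (Tuesday 24 July 2018)
```

Note that an ISO week near the year boundary can belong to a different calendar year than the ISO year passed in, so the returned month may be January of the following year.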
Suppose that I have this:
valuestringdate = "24/6/2010"
and I want to get something like this from the variable
day = 24
month = 6
year = 2010
Use the .split() method.
In this case,
dateList = valuestringdate.split("/")
Which would produce the list: dateList = ["24","6","2010"]
Using indexes:
day = dateList[0] would set day = "24"
From there you can use day = int(day) to convert the day from a string to an integer.
You should be able to figure it out from there.
You could just split the string:
day, month, year = valuestringdate.split('/')
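Since split returns strings, the pieces still need converting if integers are wanted. One compact way to do both steps, as a sketch:

```python
valuestringdate = "24/6/2010"

# split() returns strings; map(int, ...) converts each piece to an integer.
day, month, year = map(int, valuestringdate.split('/'))
print(day, month, year)  # 24 6 2010
```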
I need to find total objects created in
1. current year
2. current month
3. last month
4. last year
I am thinking like this
this_year = datetime.now().year
last_year = datetime.now().year -1
this_month = datetime.now().month
last_month = (datetime.today() - timedelta(days=30)).month
Use like
Order.objects.filter(created_at__month=this_month)
The problem is
the last_month I want is the previous calendar month, not 30 days back
I am not sure whether created_at__month=this_month will match only the current month or also the same month in previous years
Is it possible to get all the counts in a single query?
today = datetime.datetime.now()
1 Current year
Order.objects.filter(created_at__year=today.year)
2 Current month
Order.objects.filter(created_at__year=today.year, created_at__month=today.month)
3 Last month
last_month = today.month - 1 if today.month>1 else 12
last_month_year = today.year if today.month > last_month else today.year - 1
Order.objects.filter(created_at__year=last_month_year, created_at__month=last_month)
4 Last year
last_year = today.year - 1
Order.objects.filter(created_at__year=last_year)
5 Single Query
As last year + current year includes last month and current month, and all orders>= last_year includes current year, the query is super simple:
Order.objects.filter(created_at__year__gte=last_year)
I don't think you'll be able to just match the "month" or "year" part of a date field without some significant fiddling or annotating. Most likely, your simplest solution is to define the start and end of the range you want and search against that. And that might involve a little bit of work.
For example, last calendar month would be:
today = datetime.date.today()
if today.month == 1:
    last_month_start = datetime.date(today.year-1, 12, 1)
    last_month_end = datetime.date(today.year-1, 12, 31)
else:
    last_month_start = datetime.date(today.year, today.month -1, 1)
    last_month_end = datetime.date(today.year, today.month, 1) - datetime.timedelta(days=1)
Order.objects.filter(created_at__gte=last_month_start, created_at__lte=last_month_end)
GTE and LTE are "greater than or equal" and "less than or equal". Also worth noting, we use timedelta to figure out what the day before the first of this month is rather than go through all the different cases of whether the previous month had 28, 29, 30 or 31 days.
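The same timedelta trick can be folded into a small helper that needs no special case for January. This is only a sketch: previous_month_range is a made-up name, and it returns a half-open range (start inclusive, end exclusive), which is an assumption about how the result will be used.

```python
import datetime

def previous_month_range(today):
    """Return (start, end) of the calendar month before today.

    start is the first day of the previous month and end is the first day of
    the current month, so the range is half-open: start <= d < end.
    """
    end = today.replace(day=1)
    # The day before the first of this month always lies in the previous month.
    start = (end - datetime.timedelta(days=1)).replace(day=1)
    return start, end

print(previous_month_range(datetime.date(2024, 1, 15)))
```

With the Order model from the question, the pair could then be used as Order.objects.filter(created_at__gte=start, created_at__lt=end), which sidesteps any question about the time of day on the last day of the month.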
If you want it in separate queries, do something like this.
from_this_year = Order.objects.filter(created_at__year=this_year)
from_last_year = Order.objects.filter(created_at__year=last_year)
from_june = Order.objects.filter(created_at__month='06',created_at__year=this_year)
from_this_month = Order.objects.filter(created_at__month=this_month,created_at__year=this_year)
note: in my example, I put '06' that is June, but you can change it.