Does a module exist that can fix a faulty time format? - python

I have data in the format as follows:
12/07/2018 23:00
12/07/2018 24:00
13/07/2018 1:00
and wanted to know if there exists a module in python that can change the 12/07/2018 24:00 to 13/07/2018 0:00

This should cover all cases for you assuming your dateformat string is static:
from datetime import datetime, timedelta
def fix_time(time_string):
    if '24:00' in time_string:  # Does time_string contain silly format
        _date, _ = time_string.split()  # Ignore time part since it will default to 00:00
        calendar_date = datetime.strptime(_date, '%d/%m/%Y')
        corrected_time = calendar_date + timedelta(days=1)  # Add one day to get correct date
        time_string = corrected_time.strftime('%d/%m/%Y %H:%M')  # Convert back to str
    return time_string
Sample output:
>>> fix_time('31/12/2018 24:00')
'01/01/2019 00:00'
The code could be made more concise, but this should be a good starting point.
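If a datetime object is more useful than a corrected string, the same idea can be sketched as below. This is only an illustration under the same assumption of a fixed 'dd/mm/YYYY HH:MM' layout; the name parse_with_24 is made up for this example and does not come from any library.

```python
from datetime import datetime, timedelta

FMT = '%d/%m/%Y %H:%M'  # assumed fixed input layout


def parse_with_24(time_string):
    """Parse 'dd/mm/YYYY HH:MM', treating 24:00 as 00:00 of the next day."""
    date_part, time_part = time_string.split()
    if time_part == '24:00':
        # Parse only the date (time defaults to 00:00) and roll over one day.
        return datetime.strptime(date_part, '%d/%m/%Y') + timedelta(days=1)
    return datetime.strptime(time_string, FMT)


for raw in ('12/07/2018 23:00', '12/07/2018 24:00', '13/07/2018 1:00'):
    print(parse_with_24(raw).strftime(FMT))
```

Comparing the time part exactly, rather than searching the whole string for '24:00', keeps the check from firing on unrelated text.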

Related

Split URL at - With Python

Does anyone know how I can extract the last 6 characters of an absolute URL, e.g.
/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104
This is not a typical URL; sometimes it ends in -221104.
Also, is there an easy way to turn 221104 into the date 04 11 2022?
Thanks in advance
Mark
You should use the datetime module for parsing strings into datetimes, like so.
from datetime import datetime
url = 'https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104'
datetime_string = url.split('--')[1]
date = datetime.strptime(datetime_string, '%y%m%d')
print(f"{date.day} {date.month} {date.year}")
The '%y%m%d' format string tells the strptime method that in the string '221104' the first two digits are the year, the next two are the month, and the final two are the day.
Here is a link to the documentation on using this method:
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
If the url always has this structure (that is it has the date at the end after a -- and only has -- once), you can get the date with:
str_date = str(url).split("--")[1]
If we relax the assumption that -- appears only once, the code still works if we take the last element of the split list (again assuming the date is always at the end):
str_date = str(url).split("--")[-1]
(Thanks to @The Myth for pointing that out)
To convert the extracted string into a datetime object and print it in the format you want:
from datetime import datetime
datetime_date = datetime.strptime(str_date, "%y%m%d")
formatted_date = datetime_date.strftime("%d %m %Y")
print(formatted_date) # 04 11 2022
Docs:
strftime
strptime
behaviour of the above two functions and format codes
Assuming the date always has the form yymmdd and sits at the end of the URL, you can slice it off:
url = "https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104"
time = url[-6:] # Gets the last 6 characters
To convert yymmdd into dd mm yyyy we will use the datetime module:
import datetime as dt
new_time = dt.datetime.strptime(time, '%y%m%d') # Converts your date into datetime using the format
format_time = dt.datetime.strftime(new_time, '%d %m %Y') # Format
print(format_time)
The whole code looks like this:
import datetime as dt
url = "https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104"
time = url[-6:] # Gets the last 6 characters
new_time = dt.datetime.strptime(time, '%y%m%d') # Converts your date into datetime using the format
format_time = dt.datetime.strftime(new_time, '%d %m %Y') # Format
print(format_time)
Learn more about datetime
You can use Python's built-in str.split method.
date = url.split("--")[1]
This gives us 221104.
Then you can rearrange the string by slicing it:
date_string = f"{date[4:6]} {date[2:4]} {date[0:2]}"
This gives us 04 11 22.
Assuming that -- appears only once, as it does in the URL you posted, you can do the following.
Split the URL at -- and take the second element:
a = 'https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104'
desired_value = a.split('--')[1]
And to convert:
from datetime import datetime
converted_date = datetime.strptime(desired_value , "%y%m%d")
formatted_date = datetime.strftime(converted_date, "%d %m %Y")
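Since the question mentions that the URL sometimes ends in a single dash (-221104) rather than --, a regular expression anchored at the end of the string may be a more forgiving way to pull out the digits. The following is a rough sketch, not a definitive solution; the function name date_from_url and the choice to return None on failure are assumptions made for this example.

```python
import re
from datetime import datetime

# One dash, six digits, optional trailing slash, then end of string.
TRAILING_DATE = re.compile(r'-(\d{6})/?$')


def date_from_url(url):
    """Return the trailing yymmdd stamp of a URL as a date, or None."""
    match = TRAILING_DATE.search(url)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), '%y%m%d').date()
    except ValueError:
        # Six digits were present but they do not form a valid date.
        return None


url = ('https://www.ig.com/es/ideas-de-trading-y-noticias/'
       'el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104')
found = date_from_url(url)
print(found.strftime('%d %m %Y'))  # 04 11 2022
```

Because the pattern only looks at the end of the string, earlier numbers such as 7900 in the slug are ignored.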

extract date, month and year from string in python

I have this column where the string has date, month, year and also time information. I need to take the date, month and year only.
There is no space in the string.
The string is on this format:
date
Tuesday,August22022-03:30PMWIB
Monday,July252022-09:33PMWIB
Friday,January82022-09:33PMWIB
and I expect to get:
date
2022-08-02
2022-07-25
2022-01-08
How can I get the date, month and year only and change the format into yyyy-mm-dd in python?
thanks in advance
Use strptime from the datetime library:
from datetime import datetime
var = "Tuesday,August22022-03:30PMWIB"
date = var.split('-')[0]  # keep only the part before the time
formatted_date = datetime.strptime(date, "%A,%B%d%Y")
print(formatted_date.date())  # this will get your output
Output:
2022-08-02
You can use the standard datetime library
from datetime import datetime
dates = [
"Tuesday,August22022-03:30PMWIB",
"Monday,July252022-09:33PMWIB",
"Friday,January82022-09:33PMWIB"
]
for text in dates:
    text = text.split(",")[1].split("-")[0]
    dt = datetime.strptime(text, '%B%d%Y')
    print(dt.strftime("%Y-%m-%d"))
An alternative/shorter way would be like this (if you want the other date parts):
for text in dates:
    dt = datetime.strptime(text[:-3], '%A,%B%d%Y-%I:%M%p')
    print(dt.strftime("%Y-%m-%d"))
The timezone part is tricky: %Z only works for UTC, GMT and the local timezone names, so the trailing WIB is sliced off with text[:-3] before parsing.
You can read more about the format codes here.
strptime() only accepts certain values for %Z:
any value in time.tzname for your machine’s locale
the hard-coded values UTC and GMT
You can convert to datetime object then get string back.
from datetime import datetime
datetime_object = datetime.strptime('Tuesday,August22022-03:30PM', '%A,%B%d%Y-%I:%M%p')
s = datetime_object.strftime("%Y-%m-%d")
print(s)
You can use the datetime library to parse the date and print it in your format. In your examples the day might not be zero padded so I added that and then parsed the date.
import datetime
date = 'Tuesday,August22022-03:30PMWIB'
date = date.split('-')[0]
if not date[-6].isnumeric():
    date = date[:-5] + "0" + date[-5:]
newdate = datetime.datetime.strptime(date, '%A,%B%d%Y').strftime('%Y-%m-%d')
print(newdate)
# prints 2022-08-02
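The answers above rely on strptime working out where the unpadded day ends and the four-digit year begins. As an alternative sketch, a regular expression can make that split explicit before handing the pieces to strptime. The helper name to_iso_date is invented for this example, and it assumes English month names, a four-digit year, and a dash before the time part, as in the sample data.

```python
import re
from datetime import datetime

# weekday , month name , 1-2 digit day , 4 digit year , dash
PATTERN = re.compile(r'^[A-Za-z]+,([A-Za-z]+)(\d{1,2})(\d{4})-')


def to_iso_date(text):
    """Turn e.g. 'Tuesday,August22022-03:30PMWIB' into '2022-08-02'."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f'unrecognised date string: {text!r}')
    month_name, day, year = match.groups()
    parsed = datetime.strptime(f'{day} {month_name} {year}', '%d %B %Y')
    return parsed.strftime('%Y-%m-%d')


samples = [
    'Tuesday,August22022-03:30PMWIB',
    'Monday,July252022-09:33PMWIB',
    'Friday,January82022-09:33PMWIB',
]
for sample in samples:
    print(to_iso_date(sample))
```

For a pandas column, the same function could be applied element by element, for example with Series.apply.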

Conversion of timezone of datetime.time

I am writing Python code using the arrow library to convert a time string without a date (UTC by default) to another timezone.
A sample string value is: 18:30+0000
The code snippet is:
import arrow
start = '18:30+0000'
start_time = arrow.get(start,'hh:mmZ')
print(start_time.format('hh:mm A ZZ'))
print(start_time.to('Australia/Melbourne').format('hh:mm A ZZ'))
# This should print 05:30 AM +11:00 - but showing 04:09 AM +09:39
I have also tried to convert to other timezones as well.
print(start_time.to('Europe/London').format('hh:mm A ZZ'))
# This should print 06:30 PM +00:00 - but showing 06:28 PM -00:01
Getting the current UTC time and then converting it to a different timezone works perfectly fine:
print(arrow.utcnow().to('Australia/Melbourne').format('hh:mm A ZZ'))
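SOLUTION:
We have to add a temporary date value for the conversion if we don't have one. Without a date, arrow falls back to 0001-01-01, and for a date that far back the timezone database only has local mean time offsets, which is where values like +09:39 and -00:01 come from.
def convert_timezone(time_to_convert, timezone):
    time_to_convert = '20111111 ' + time_to_convert
    return arrow.get(time_to_convert).to(timezone)
print(convert_timezone('18:30+0000', 'Australia/Melbourne').format('hh:mm A ZZ'))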
Add an appropriate date to the input string and this works as you expect:
import arrow
start = '2021-02-01 18:30+0000'
start_time = arrow.get(start)
print(start_time.to('Australia/Melbourne').format('hh:mm A ZZ'))
# 05:30 AM +11:00
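The choice of date matters because the UTC offset of a zone changes with daylight saving time. The sketch below illustrates this with the standard library's zoneinfo module (Python 3.9+) instead of arrow; it is only meant to show the idea, it assumes the system has timezone data installed, and the names convert_time and on_date are made up for this example.

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo


def convert_time(time_string, tz_name, on_date):
    """Attach on_date to an 'HH:MM+zzzz' string and convert to tz_name."""
    parsed = datetime.strptime(time_string, '%H:%M%z')
    # Keep the parsed clock time and offset, but swap in a real date.
    anchored = datetime.combine(on_date, parsed.time(), tzinfo=parsed.tzinfo)
    return anchored.astimezone(ZoneInfo(tz_name))


summer = convert_time('18:30+0000', 'Australia/Melbourne', date(2021, 2, 1))
winter = convert_time('18:30+0000', 'Australia/Melbourne', date(2021, 7, 1))
print(summer.strftime('%H:%M %z'))  # 05:30 +1100
print(winter.strftime('%H:%M %z'))  # 04:30 +1000
```

The same input time lands on a different local hour depending on the date, so the placeholder date should be one that is meaningful for the data.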

ValueError: time data '10/11/2006 24:00' does not match format '%d/%m/%Y %H:%M'

I tried:
df["datetime_obj"] = df["datetime"].apply(lambda dt: datetime.strptime(dt, "%d/%m/%Y %H:%M"))
but got this error:
ValueError: time data '10/11/2006 24:00' does not match format '%d/%m/%Y %H:%M'
How can I solve this correctly?
This does not work because the %H directive only accepts values in the range 00 to 23 (both inclusive). As the error says, 24:00 is therefore not a valid time string.
So we do not have many options other than converting the string to a valid format. We can do this by first replacing 24:00 with 0:00, and then incrementing the day for these timestamps.
Like:
from datetime import timedelta
import pandas as pd
df['datetime_zero'] = df['datetime'].str.replace('24:00', '0:00')
df['datetime_er'] = pd.to_datetime(df['datetime_zero'], format='%d/%m/%Y %H:%M')
selrow = df['datetime'].str.contains('24:00')
df['datetime_obj'] = df['datetime_er'] + selrow * timedelta(days=1)
The last line thus adds one day to the rows that contain 24:00, such that '10/11/2006 24:00' gets converted to '11/11/2006 00:00'. Note however that the above is rather unsafe, since whether it works depends on the format of the timestamp. For the above it will (probably) work, since there is only one colon. But if, for example, the datetimes have seconds as well, the filter could get triggered for 00:24:00, so it might require some extra work to get it working.
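The false positive described above is easy to reproduce, and one possible way around it is to anchor the check to the time field itself. The following is a small sketch under the assumption that the time is the last field and is preceded by a space; the name has_hour_24 is hypothetical.

```python
import re

# A space, then 24:00, optionally followed by :00 seconds, at the very end.
HOUR_24 = re.compile(r' 24:00(:00)?$')


def has_hour_24(timestamp):
    """True only when the hour field itself is 24."""
    return HOUR_24.search(timestamp) is not None


with_seconds = '10/11/2006 00:24:00'   # 24 minutes past midnight
print('24:00' in with_seconds)         # True: the naive substring test misfires
print(has_hour_24(with_seconds))       # False
print(has_hour_24('10/11/2006 24:00')) # True
```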
Your data doesn't follow the conventions used by Python / Pandas datetime objects. There should be only one way of storing a particular datetime, i.e. '10/11/2006 24:00' should be rewritten as '11/11/2006 00:00'.
Here's one way to approach the problem:
# find datetimes which have '24:00' and rewrite
twenty_fours = df['strings'].str[-5:] == '24:00'
df.loc[twenty_fours, 'strings'] = df['strings'].str[:-5] + '00:00'
# construct datetime series
df['datetime'] = pd.to_datetime(df['strings'], format='%d/%m/%Y %H:%M')
# add one day where applicable
df.loc[twenty_fours, 'datetime'] += pd.DateOffset(1)
Here's some data to test:
import pandas as pd
dateList = ['10/11/2006 24:00', '11/11/2006 00:00', '12/11/2006 15:00']
df = pd.DataFrame({'strings': dateList})
Result after transformations described above:
print(df['datetime'])
0   2006-11-11 00:00:00
1   2006-11-11 00:00:00
2   2006-11-12 15:00:00
Name: datetime, dtype: datetime64[ns]
As indicated in the documentation (https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior), hours go from 00 to 23. 24:00 is then an error.

Python - Time delta from string and now()

I have spent some time trying to figure out how to get a time delta between two time values. The only issue is that one of the times was stored in a file. So I have one string, which is in essence str(datetime.datetime.now()), and one datetime object from datetime.datetime.now().
Specifically, I am having issues getting a delta because one of the objects is a datetime object and the other is a string.
I think the answer is that I need to get the string back in a datetime object for the delta to work.
I have looked at some of the other Stack Overflow questions relating to this including the following:
Python - Date & Time Comparison using timestamps, timedelta
Comparing a time delta in python
Convert string into datetime.time object
Converting string into datetime
Example code is as follows:
f = open('date.txt', 'r+')
line = f.readline()
date = line[:26]
now = datetime.datetime.now()
then = time.strptime(date)
delta = now - then # This does not work
Can anyone tell me where I am going wrong?
For reference, the first 26 characters are acquired from the first line of the file because this is how I am storing time e.g.
f.write(str(datetime.datetime.now()))
Which would write the following:
2014-01-05 13:09:42.348000
time.strptime returns a struct_time.
datetime.datetime.now() returns a datetime object.
The two cannot be subtracted directly.
Instead of time.strptime you could use datetime.datetime.strptime, which returns a datetime object. Then you could subtract now and then.
For example,
import datetime as DT
now = DT.datetime.now()
then = DT.datetime.strptime('2014-1-2', '%Y-%m-%d')
delta = now - then
print(delta)
# 3 days, 8:17:14.428035
By the way, you need to supply a format string that matches your data to time.strptime or DT.datetime.strptime. Without one,
time.strptime(date)
falls back to the default format "%a %b %d %H:%M:%S %Y" and should have raised a ValueError.
It looks like your date string is 26 characters long. That might mean you have a date string like 'Fri, 10 Jun 2011 11:04:17 '.
If that is true, you may want to parse it like this:
then = DT.datetime.strptime('Fri, 10 Jun 2011 11:04:17 '.strip(), "%a, %d %b %Y %H:%M:%S")
print(then)
# 2011-06-10 11:04:17
There is a table describing the available directives (like %Y, %m, etc.) here.
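For the exact string shown in the question, which is the output of str(datetime.datetime.now()), one possible approach is datetime.fromisoformat (Python 3.7+), which reads that layout back directly. The sketch below is only an illustration: the helper name elapsed_since and its optional now parameter are assumptions made for this example.

```python
from datetime import datetime


def elapsed_since(stored, now=None):
    """Return now minus a timestamp that was saved with str(datetime.now())."""
    # strip() removes the trailing newline that readline() leaves in place.
    then = datetime.fromisoformat(stored.strip())
    if now is None:
        now = datetime.now()
    return now - then


line = '2014-01-05 13:09:42.348000\n'
delta = elapsed_since(line, now=datetime(2014, 1, 5, 13, 10, 42, 348000))
print(delta)                  # 0:01:00
print(delta.total_seconds())  # 60.0
```

An equivalent explicit format string would be '%Y-%m-%d %H:%M:%S.%f', but str() omits the fractional part when the microseconds are zero, which fromisoformat tolerates and that format string does not.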
Try this:
import time
import datetime
d = datetime.datetime.now()
now = time.mktime(d.timetuple())
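And then apply the delta: now is a Unix timestamp in seconds, so it can be subtracted from another timestamp directly.
If you have the year, month and day of 'then', you may use: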
year = 2013
month = 1
day = 1
now_date = datetime.datetime.now()
then_date = now_date.replace(year = year, month = month, day = day)
delta = now_date - then_date
