Split URL at - With Python - python

Does anyone know how I can extract the last 6 characters of an absolute URL, e.g.
/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104
This is not a typical URL; sometimes it ends with -221104.
Also, is there a way to turn 221104 into the date 04 11 2022 easily?
Thanks in advance
Mark

You should use the datetime module for parsing strings into datetimes, like so.
from datetime import datetime
url = 'https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104'
datetime_string = url.split('--')[1]
date = datetime.strptime(datetime_string, '%y%m%d')
print(f"{date.day} {date.month} {date.year}")
The '%y%m%d' format string tells the strptime method that in '221104' the first two digits are the year, the next two are the month, and the final two are the day.
Here is a link to the documentation on using this method:
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
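Since the question says the URL only sometimes ends with the date, a more defensive variant might check for the trailing digits before parsing. This is a minimal sketch, assuming the date is always exactly six digits after a hyphen at the very end of the URL; the helper name extract_trailing_date is hypothetical and not part of any library.

```python
import re
from datetime import datetime


def extract_trailing_date(url):
    """Return the trailing yymmdd date as 'dd mm yyyy', or None if absent or invalid."""
    # Assumption: the date is six digits preceded by a hyphen at the end of the URL.
    match = re.search(r"-(\d{6})$", url)
    if match is None:
        return None
    try:
        parsed = datetime.strptime(match.group(1), "%y%m%d")
    except ValueError:
        # Six digits that do not form a real date, e.g. month 99.
        return None
    return parsed.strftime("%d %m %Y")


url = "https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104"
print(extract_trailing_date(url))  # 04 11 2022
print(extract_trailing_date("https://www.ig.com/es/ideas-de-trading-y-noticias"))  # None
```

Returning None rather than raising lets the caller skip URLs that carry no date.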

If the url always has this structure (that is it has the date at the end after a -- and only has -- once), you can get the date with:
str_date = str(url).split("--")[1]
Relaxing the assumption that -- appears only once, the code still works if we take the last element of the split list (again assuming the date is always at the end):
str_date = str(url).split("--")[-1]
(Thanks to #The Myth for pointing that out)
To convert the obtained string into a datetime object and get it in the format you want:
from datetime import datetime
datetime_date = datetime.strptime(str_date, "%y%m%d")
formatted_date = datetime_date.strftime("%d %m %Y")
print(formatted_date) # 04 11 2022
Docs:
strftime
strptime
behaviour of the above two functions and format codes

Assuming the date always comes last and is in the format yymmdd, you can take it from the URL by slicing:
url = "https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104"
time = url[-6:] # Gets last 6 values
To convert yymmdd into dd mm yyyy we will use the datetime module:
import datetime as dt
new_time = dt.datetime.strptime(time, '%y%m%d') # Converts your date into datetime using the format
format_time = dt.datetime.strftime(new_time, '%d %m %Y') # Format
print(format_time)
The whole code looks like this:
url = "https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104"
time = url[-6:] # Gets last 6 values
import datetime as dt
new_time = dt.datetime.strptime(time, '%y%m%d') # Converts your date into datetime using the format
format_time = dt.datetime.strftime(new_time, '%d %m %Y') # Format
print(format_time)
Learn more about datetime

You can use Python's built-in split method.
date = url.split("--")[1]
It gives us 221104.
Then you can rearrange the string by slicing it:
date_string = f"{date[4:6]} {date[2:4]} {date[0:2]}"
This gives us 04 11 22.
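The question asks for a four-digit year (04 11 2022), which slicing alone cannot infer. A small sketch of one way to get there without datetime, under the loudly stated assumption that every date falls in the years 2000 to 2099:

```python
url = "https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104"

# Take whatever follows the last "--" (assumed to be the yymmdd date).
date = url.split("--")[-1]

# Assumption: the century is always 20xx, so "20" is prepended to the year.
date_string = f"{date[4:6]} {date[2:4]} 20{date[0:2]}"
print(date_string)  # 04 11 2022
```

Unlike strptime, this does not validate that the digits form a real date.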

Assuming that -- appears only where it does in the URL you posted, you can do something as follows.
You can split the URL at -- and extract the second element:
a = 'https://www.ig.com/es/ideas-de-trading-y-noticias/el-ibex-35-insiste-en-buscar-los-7900-puntos-a-la-espera-de-las--221104'
desired_value = a.split('--')[1]
And to convert:
from datetime import datetime
converted_date = datetime.strptime(desired_value, "%y%m%d")
formatted_date = datetime.strftime(converted_date, "%d %m %Y")

Related

extract date, month and year from string in python

I have this column where the string has date, month, year and also time information. I need to take the date, month and year only.
There is no space in the string.
The string is on this format:
date
Tuesday,August22022-03:30PMWIB
Monday,July252022-09:33PMWIB
Friday,January82022-09:33PMWIB
and I expect to get:
date
2022-08-02
2022-07-25
2022-01-08
How can I get the date, month and year only and change the format into yyyy-mm-dd in python?
thanks in advance
Use strptime from the datetime library:
from datetime import datetime
var = "Tuesday,August22022-03:30PMWIB"
date = var.split('-')[0] # keep only the part before the time
formatted_date = datetime.strptime(date, "%A,%B%d%Y")
print(formatted_date.date()) # this will get your output
Output:
2022-08-02
You can use the standard datetime library
from datetime import datetime
dates = [
"Tuesday,August22022-03:30PMWIB",
"Monday,July252022-09:33PMWIB",
"Friday,January82022-09:33PMWIB"
]
for text in dates:
    text = text.split(",")[1].split("-")[0]
    dt = datetime.strptime(text, '%B%d%Y')
    print(dt.strftime("%Y-%m-%d"))
An alternative/shorter way would be like this (if you want the other date parts):
for text in dates:
    dt = datetime.strptime(text[:-3], '%A,%B%d%Y-%I:%M%p')
    print(dt.strftime("%Y-%m-%d"))
The timezone part is tricky and works only for UTC, GMT and local.
You can read more about the format codes here.
strptime() only accepts certain values for %Z:
any value in time.tzname for your machine’s locale
the hard-coded values UTC and GMT
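Because %Z is this restrictive, one workaround is to drop the zone abbreviation before parsing rather than slicing off a fixed three characters. The sketch below assumes the zone is a run of capital letters directly after AM or PM and that the default English locale is in use; parse_without_zone is a hypothetical helper name.

```python
import re
from datetime import datetime


def parse_without_zone(text):
    """Drop the letters that follow AM/PM, then parse the remaining timestamp."""
    # Assumption: the zone is a run of capital letters right after AM or PM.
    trimmed = re.sub(r"(?<=[AP]M)[A-Z]+$", "", text)
    return datetime.strptime(trimmed, "%A,%B%d%Y-%I:%M%p")


for text in ["Tuesday,August22022-03:30PMWIB", "Monday,July252022-09:33PMWIB"]:
    print(parse_without_zone(text).strftime("%Y-%m-%d"))
# 2022-08-02
# 2022-07-25
```

The resulting datetime is naive: the zone is discarded, not converted.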
You can convert to datetime object then get string back.
from datetime import datetime
datetime_object = datetime.strptime('Tuesday,August22022-03:30PM', '%A,%B%d%Y-%I:%M%p')
s = datetime_object.strftime("%Y-%m-%d")
print(s)
You can use the datetime library to parse the date and print it in your format. In your examples the day might not be zero padded so I added that and then parsed the date.
import datetime
date = 'Tuesday,August22022-03:30PMWIB'
date = date.split('-')[0]
if not date[-6].isnumeric():
    date = date[:-5] + "0" + date[-5:]
newdate = datetime.datetime.strptime(date, '%A,%B%d%Y').strftime('%Y-%m-%d')
print(newdate)
# prints 2022-08-02

How to search for string between whitespace and marker? Python

My problem is the following:
I have the string:
datetime = "2021/04/07 08:30:00"
I want to save in the variable hour, 08 and
I want to save in the variable minutes, 30
What I've done is the following:
import re
pat = re.compile(' (.*):')
hour = re.search(pat, datetime)
minutes = re.search(pat, datetime)
print(hour.group(1))
print(minutes.group(1))
What I obtain from the prints is
08:30 and 30, so the minutes are correct but for some reason that I'm not understanding, in the hours the first : is skipped and takes everything from the whitespace to the second :.
What am I doing wrong? Thank you.
Please use strptime from the datetime module, which is the recommended way to handle date strings in Python.
strptime returns a datetime object from the date string, and this object comes with all sorts of goodies such as date, time, hour, isoformat and timestamp, which make working with datetimes a breeze.
>>> import datetime
>>> datetime.datetime.strptime("2021/04/07 08:30:00", "%Y/%m/%d %H:%M:%S")
datetime.datetime(2021, 4, 7, 8, 30)
>>> datetime.datetime.strptime("2021/04/07 08:30:00", "%Y/%m/%d %H:%M:%S").hour
8
>>> datetime.datetime.strptime("2021/04/07 08:30:00", "%Y/%m/%d %H:%M:%S").second
0
Ah, no no, python has a much better approach with datetime.strptime
https://www.programiz.com/python-programming/datetime/strptime
So for you:
from datetime import datetime
dt_string = "2021/04/07 08:30:00"
# The date is in yyyy/mm/dd format
dt_object1 = datetime.strptime(dt_string, "%Y/%m/%d %H:%M:%S")
You want hours?
hours = dt_object1.hour
or minutes?
mins = dt_object1.minute
Now, if what you have presented is just an example of where you need to work around whitespace, then you could split the string up. Again with dt_string:
date_part, time_part = dt_string.split(" ") # ['2021/04/07', '08:30:00']
dateString = date_part.split("/") # A list in [years, months, days]
timeString = time_part.split(":") # A list in [hours, minutes, seconds]
The wildcard . matches any single character, even the :, and * is greedy, so .* matches 08:30.
Use:
hour = re.search(r' ([0-9]*):', datetime)
Output:
>>> hour.group(1)
'08'
You can try the regex below, which makes the match non-greedy so that it stops at the first colon:
hour = re.search(' (.*?):', datetime)
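To see the greedy and non-greedy behaviour side by side, here is a small comparison. The variable names are illustrative only, and the last pattern is one possible way to capture hour and minutes in a single search, not the only one.

```python
import re

stamp = "2021/04/07 08:30:00"

# Greedy: .* runs as far as it can, so the group ends at the LAST colon.
greedy = re.search(r" (.*):", stamp).group(1)

# Non-greedy: .*? stops as early as it can, so the group ends at the FIRST colon.
lazy = re.search(r" (.*?):", stamp).group(1)

# Two digit groups capture hour and minutes in one pass.
hour, minutes = re.search(r" (\d{2}):(\d{2}):", stamp).groups()

print(greedy)         # 08:30
print(lazy)           # 08
print(hour, minutes)  # 08 30
```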
An alternative to what you are doing is to split the original datetime string on the space into a variable such as dates, which gives you ['2021/04/07', '08:30:00']. You can then take the second element of dates and split it again on ':' into a variable such as time, whose elements are the hours, minutes, and seconds.
datetime = "2021/04/07 08:30:00"
dates = datetime.split(" ")
print(dates)
time = dates[1].split(":")
print(time)
Printing the code will give you
print(dates) --> ['2021/04/07', '08:30:00']
print(time) --> ['08', '30', '00']
You can access individual parts of time with time[0] for '08', time[1] for '30' etc.
I would use re.compile with named capture groups and iterate:
import re
inp = "Hello World 2021/04/07 08:30:00 Goodbye"
r = re.compile(r'\b\d{4}/\d{2}/\d{2} (?P<hour>\d{2}):(?P<minute>\d{2}):\d{2}\b')
output = [m.groupdict() for m in r.finditer(inp)]
print(output[0]['hour']) # 08
print(output[0]['minute']) # 30
This is a simple datetime question. Python already has the ability to do exactly what you need. 3 steps:
use strptime to generate your date time from your string.
you can get the formatting options here
return just the hour or minute from the datetime object
from datetime import datetime
dt_string = "2021/04/07 08:30:00"
dt_object = datetime.strptime(dt_string, "%Y/%m/%d %H:%M:%S")
print(dt_object)
print(dt_object.hour, dt_object.minute)
# 2021-04-07 08:30:00
# 8 30
You can do this with strptime. If you need to work on the string directly, you can narrow the match with a character class followed by +, written [...]+, where the brackets list the characters allowed, for example 0-9, a-z, A-Z or symbols.
import re
datetime = "2021/04/07 08:30:00"
# Here you need to make a change: [0-9]+ consumes the minutes, so the group stops at the first colon
hour = re.search(' (.*):+[0-9]+:', datetime)
minutes = re.search(':(.*):', datetime)
print(hour.group(1))
print(minutes.group(1))

Remove time in date format in Python

I am using:
s = "20200113"
final = datetime.datetime.strptime(s, '%Y%m%d')
I need to convert a number into date format (2020-01-13),
but when I print final I get:
2020-01-13 00:00:00
I tried datetime.date(s, '%Y%m%d') but it returns an error:
an integer is required (got type str)
Is there any command to get only the date without the time?
Once you have a datetime object just use strftime
import datetime
d = datetime.datetime.now() # Some datetime object.
print(d.strftime('%Y-%m-%d'))
which gives
2020-02-20
You can use strftime to convert back in the format you need :
import datetime
s = "20200113"
temp = datetime.datetime.strptime(s, '%Y%m%d')
# 2020-01-13 00:00:00
final = temp.strftime('%Y-%m-%d')
print(final)
# 2020-01-13
Use datetime.date(year, month, day). Slice your string and convert to integers to get the year, month and day. Now it is a datetime.date object, you can use it for other things. Here, however, we use .strftime to convert it back to text in your desired format.
import datetime
s = "20200113"
year = int(s[:4]) # 2020
month = int(s[4:6]) # 1
day = int(s[6:8]) # 13
>>> datetime.date(year, month, day).strftime('%Y-%m-%d')
'2020-01-13'
You can also convert directly via strings.
>>> f'{s[:4]}-{s[4:6]}-{s[6:8]}'
'2020-01-13'
You can use .date() on datetime objects to 'remove' the time.
my_time_str = str(final.date())
will give you the wanted result
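Putting the pieces together: datetime.date is a constructor that expects integers (year, month, day), which is why passing it a string and a format fails. A short sketch that parses with strptime and then drops the time part, using the same s as in the question:

```python
from datetime import datetime

s = "20200113"

# strptime always yields a datetime; .date() keeps only year, month and day.
final = datetime.strptime(s, "%Y%m%d").date()

print(final)              # 2020-01-13
print(final.isoformat())  # 2020-01-13
```

Here final stays a date object, so it can still be compared or used in arithmetic, whereas strftime would hand back plain text.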

how to subtract date from date from sql in python

I run a sql query that returns a date in the format '2015-03-01T17:09:00.000+0000' I want to subtract this from today's date.
I am getting today's date with the following:
import datetime
now = datetime.datetime.now()
The formats don't seem to line up and I can't figure out a standardized format.
You can use strptime from the datetime module to turn your query result into a Python datetime using a format string. (You might have to adjust the format string a bit to suit your case.) A timestamp such as ts below corresponds to the format string f:
import datetime
ts = '2015-03-01T17:09:00.000+0000'
f = '%Y-%m-%dT%H:%M:%S.%f%z'
date_from_sql = datetime.datetime.strptime(ts, f)
# %z makes date_from_sql timezone-aware, so now must be timezone-aware as well
now = datetime.datetime.now(datetime.timezone.utc)
delta = now - date_from_sql # today's date minus the date from SQL
The .000 is probably microseconds (denoted by %f in the format string) and the +0000 is the utc offset (denoted by %z in the format string). Check this out for more formatting options: https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
Check out this thread for an example: what is the proper way to convert between mysql datetime and python timestamp?
Checkout this for more on strptime https://docs.python.org/2/library/datetime.html#datetime.datetime.strptime
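One pitfall worth spelling out: a string parsed with %z yields a timezone-aware datetime, and subtracting it from a naive datetime.now() raises a TypeError. The sketch below wraps the parsing in a hypothetical helper, elapsed_since, and accepts an optional reference time so the result is reproducible; it assumes Python 3 and the timestamp layout shown in the question.

```python
from datetime import datetime, timezone

# Layout of the timestamp in the question, e.g. '2015-03-01T17:09:00.000+0000'.
SQL_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def elapsed_since(ts, now=None):
    """Return how much time has passed between ts and now as a timedelta."""
    then = datetime.strptime(ts, SQL_FORMAT)  # timezone-aware because of %z
    if now is None:
        now = datetime.now(timezone.utc)      # aware as well, so subtraction works
    return now - then


fixed_now = datetime(2015, 3, 2, 17, 9, tzinfo=timezone.utc)
print(elapsed_since("2015-03-01T17:09:00.000+0000", now=fixed_now))  # 1 day, 0:00:00
```

Calling elapsed_since(ts) without the second argument measures against the current UTC time.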
Getting the delta between two datetime objects in Python is really simple, you simply subtract them.
import datetime
d1 = datetime.datetime.now()
d2 = datetime.datetime.now()
delta = d2 - d1
print(delta.total_seconds())
d2 - d1 returns a datetime.timedelta object, from which you can get the total second difference between the two dates.
As for formatting the dates, you can read about formatting strings into datetime objects, and datetime objects into string here
You'll read about the strftime() and strptime() functions, and with them you can get yourself two datetime objects which you can subtract from each other.

Python - Time delta from string and now()

I have spent some time trying to figure out how to get a time delta between time values. The only issue is that one of the times was stored in a file. So I have one string which is in essence str(datetime.datetime.now()) and datetime.datetime.now().
Specifically, I am having issues getting a delta because one of the objects is a datetime object and the other is a string.
I think the answer is that I need to get the string back in a datetime object for the delta to work.
I have looked at some of the other Stack Overflow questions relating to this including the following:
Python - Date & Time Comparison using timestamps, timedelta
Comparing a time delta in python
Convert string into datetime.time object
Converting string into datetime
Example code is as follows:
f = open('date.txt', 'r+')
line = f.readline()
date = line[:26]
now = datetime.datetime.now()
then = time.strptime(date)
delta = now - then # This does not work
Can anyone tell me where I am going wrong?
For reference, the first 26 characters are acquired from the first line of the file because this is how I am storing time e.g.
f.write(str(datetime.datetime.now()))
Which would write the following:
2014-01-05 13:09:42.348000
time.strptime returns a struct_time.
datetime.datetime.now() returns a datetime object.
The two can not be subtracted directly.
Instead of time.strptime you could use datetime.datetime.strptime, which returns a datetime object. Then you could subtract now and then.
For example,
import datetime as DT
now = DT.datetime.now()
then = DT.datetime.strptime('2014-1-2', '%Y-%m-%d')
delta = now - then
print(delta)
# 3 days, 8:17:14.428035
By the way, you need to supply a date format string to time.strptime or DT.datetime.strptime.
time.strptime(date)
should have raised a ValueError.
It looks like your date string is 26 characters long. That might mean you have a date string like 'Fri, 10 Jun 2011 11:04:17 '.
If that is true, you may want to parse it like this:
then = DT.datetime.strptime('Fri, 10 Jun 2011 11:04:17 '.strip(), "%a, %d %b %Y %H:%M:%S")
print(then)
# 2011-06-10 11:04:17
There is a table describing the available directives (like %Y, %m, etc.) here.
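If the file really holds the output of str(datetime.datetime.now()), as the question describes, the 26 characters look like 2014-01-05 13:09:42.348000 and can be parsed as sketched below. The fixed reference time stands in for now() only so that the result is reproducible; fromisoformat is shown as an alternative and assumes Python 3.7 or later.

```python
from datetime import datetime

stored = "2014-01-05 13:09:42.348000"  # what str(datetime.now()) wrote to the file

# Explicit format: date, space, time, then microseconds via %f.
then = datetime.strptime(stored, "%Y-%m-%d %H:%M:%S.%f")

# Alternative (Python 3.7+): also copes with strings that lack microseconds.
assert datetime.fromisoformat(stored) == then

# A fixed reference time in place of datetime.now(), for a repeatable result.
reference = datetime(2014, 1, 5, 13, 10, 0)
delta = reference - then
print(delta.total_seconds())  # 17.652
```

Note that str() omits the fractional part when the microseconds happen to be zero, in which case the %f format above would raise a ValueError.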
Try this:
import time
import datetime
d = datetime.datetime.now()
now = time.mktime(d.timetuple())
And then apply the delta
If you have the year, month and day of 'then', you may use:
year = 2013
month = 1
day = 1
now_date = datetime.datetime.now()
then_date = now_date.replace(year = year, month = month, day = day)
delta = now_date - then_date
