Create a date range in Julian date in Python

I need to create a range of dates between two endpoints, spaced a few minutes apart, expressed as Julian dates. I wrote the code below, but it takes a long time (about 15 minutes for this example).
My code is:
from astropy.time import Time
import pandas as pd
timedelta = "600s"
start = "2018-01-01"
end = "2018-06-30"
dateslist = pd.date_range(start,end, freq =timedelta ).tolist()
dates = pd.DataFrame({'col':dateslist})
dates["col2"] =""
for i in range(len(dateslist)):
    #print(i," / ", len(dateslist))
    dates["col2"][i] = (Time(str(dateslist[i]).replace(" ", "T"), format="fits").jd)
I tried calling Time on the whole list without the for loop, but I get an error:
time = str(list(dates['col'])).replace("[Timestamp('","").replace(" Timestamp('","").replace("')","").replace(" ","T").split(",")
time
Time(time, format="fits")
ValueError: Input values did not match the format class fits
Is there a faster way to do this?
Thanks in advance.

Use DatetimeIndex.to_julian_date:
dates["col2"] = pd.date_range(start,end, freq = timedelta).to_julian_date()

The equivalent way in astropy would be:
import numpy as np
from astropy.time import Time
import astropy.units as u
timedelta = 600 * u.s
start = "2018-01-01"
end = "2018-06-30"
dates["col2"] = np.arange(Time(start).jd, Time(end).jd, timedelta.to_value('day'))
An alternative, perhaps more idiomatic, way in astropy is:
start = Time("2018-01-01")
end = Time("2018-06-30")
timedelta = 600 * u.s
dates = start + timedelta * np.arange((end - start) / timedelta)
This gives you a vector Time object, which you could convert to JD via the jd attribute.
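If neither pandas nor astropy is available, the same conversion can be sketched with the standard library alone. The helper names julian_date and julian_range are made up for this illustration, and the sketch assumes naive datetimes that are already in UTC on the proleptic Gregorian calendar; it ignores the time scales and leap seconds that astropy handles.

```python
from datetime import datetime, timedelta

# The Julian date of 0001-01-01T00:00 is 1721425.5 and that day has ordinal 1,
# so JD = ordinal + 1721424.5 + fraction of the day.
JD_OFFSET = 1721424.5


def julian_date(dt):
    """Convert a naive datetime (assumed to be UTC) to a Julian date."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    day_fraction = (dt - midnight) / timedelta(days=1)
    return dt.toordinal() + JD_OFFSET + day_fraction


def julian_range(start, end, step):
    """Yield Julian dates from start to end (inclusive) every `step`."""
    count = (end - start) // step + 1
    for i in range(count):
        yield julian_date(start + i * step)


jds = list(julian_range(datetime(2018, 1, 1), datetime(2018, 6, 30),
                        timedelta(seconds=600)))
```

Like pd.date_range, this includes the end point when it falls exactly on a step.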

Related

Creating datetime series in python

I have a recording with the start time (t0) and length of the recording (N). How do I create a time series vector (ts) in python for every 10s increments?
For example:
t0 = 2017-06-12T11:05:10.00
N=1000
So there should be an array of 100 (N/10) values such that:
ts = [2017-06-12T11:05:10.00, 2017-06-12T11:05:20.00,2017-06-12T11:05:30.00 and so on...]
You can use the datetime module of Python.
First you need to convert your string into a datetime with datetime.strptime:
t0 = datetime.datetime.strptime("2017-06-12T11:05:10.00", "%Y-%m-%dT%H:%M:%S.00")
where the "%Y-%m-%dT%H:%M:%S.00" part is the description of your string format (see documentation)
Then you can increment a datetime object by adding a timedelta to it. Build a sequence like this:
delta = datetime.timedelta(seconds=10)
ts = [t0 + i*delta for i in range(N)]
You can also recover dates as strings by using datetime.strftime with a similar syntax to strptime.
The whole thing would look like
from datetime import datetime, timedelta
date_format = "%Y-%m-%dT%H:%M:%S.00"
t0 = datetime.strptime("2017-06-12T11:05:10.00", date_format)
delta = timedelta(seconds=10)
ts = [datetime.strftime(t0 + i * delta, date_format) for i in range(100)]
import datetime
import numpy as np
t0=datetime.datetime(2017, 6, 12, 11, 5, 10)
dt=datetime.timedelta(seconds=10)
ts=np.arange(100)*dt+t0
There might be some easier way. I've never tried to find one. But this is how I do it.
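The two answers above can be combined into one small helper. This is only a sketch: the name time_series and its parameters are invented here, and it assumes N is the recording length in seconds, so that N=1000 with a 10 s step gives 100 values as the question expects.

```python
from datetime import datetime, timedelta


def time_series(t0, length_s, step_s=10):
    """Return datetimes starting at t0, one every step_s seconds, covering length_s seconds."""
    step = timedelta(seconds=step_s)
    return [t0 + i * step for i in range(length_s // step_s)]


# %f accepts the two-digit fractional part used in the question
t0 = datetime.strptime("2017-06-12T11:05:10.00", "%Y-%m-%dT%H:%M:%S.%f")
ts = time_series(t0, 1000)
```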

How to correctly generate list of UTC timestamps, by hour, between two datetimes Python?

I'm new to Python. After a couple days researching and trying things out, I've landed on a decent solution for creating a list of timestamps, for each hour, between two dates.
Example:
import datetime
from datetime import datetime, timedelta
timestamp_format = '%Y-%m-%dT%H:%M:%S%z'
earliest_ts_str = '2020-10-01T00:00:00Z'
earliest_ts_obj = datetime.strptime(earliest_ts_str, timestamp_format)
latest_ts_str = '2020-10-02T00:00:00Z'
latest_ts_obj = datetime.strptime(latest_ts_str, timestamp_format)
num_days = latest_ts_obj - earliest_ts_obj
num_hours = int(round(num_days.total_seconds() / 3600,0))
ts_raw = []
for ts in range(num_hours):
    ts_raw.append(latest_ts_obj - timedelta(hours = ts + 1))
dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ') for d in ts_raw]
# Need timestamps in ascending order
dates_formatted.reverse()
dates_formatted
Which results in:
['2020-10-01T00:00:00Z',
'2020-10-01T01:00:00Z',
'2020-10-01T02:00:00Z',
'2020-10-01T03:00:00Z',
'2020-10-01T04:00:00Z',
'2020-10-01T05:00:00Z',
'2020-10-01T06:00:00Z',
'2020-10-01T07:00:00Z',
'2020-10-01T08:00:00Z',
'2020-10-01T09:00:00Z',
'2020-10-01T10:00:00Z',
'2020-10-01T11:00:00Z',
'2020-10-01T12:00:00Z',
'2020-10-01T13:00:00Z',
'2020-10-01T14:00:00Z',
'2020-10-01T15:00:00Z',
'2020-10-01T16:00:00Z',
'2020-10-01T17:00:00Z',
'2020-10-01T18:00:00Z',
'2020-10-01T19:00:00Z',
'2020-10-01T20:00:00Z',
'2020-10-01T21:00:00Z',
'2020-10-01T22:00:00Z',
'2020-10-01T23:00:00Z']
Problem:
If I change earliest_ts_str to include minutes, say earliest_ts_str = '2020-10-01T19:45:00Z', the resulting list does not increment the minute intervals accordingly.
Results:
['2020-10-01T20:00:00Z',
'2020-10-01T21:00:00Z',
'2020-10-01T22:00:00Z',
'2020-10-01T23:00:00Z']
I need it to be:
['2020-10-01T20:45:00Z',
'2020-10-01T21:45:00Z',
'2020-10-01T22:45:00Z',
'2020-10-01T23:45:00Z']
Feels like the problem is in the num_days and num_hours calculation, but I can't see how to fix it.
Ideas?
If you don't mind using a third-party package, have a look at pandas.date_range:
import pandas as pd
earliest, latest = '2020-10-01T15:45:00Z', '2020-10-02T00:00:00Z'
dti = pd.date_range(earliest, latest, freq='H') # just specify hourly frequency...
l = dti.strftime('%Y-%m-%dT%H:%M:%SZ').to_list()
print(l)
# ['2020-10-01T15:45:00Z', '2020-10-01T16:45:00Z', '2020-10-01T17:45:00Z', '2020-10-01T18:45:00Z', '2020-10-01T19:45:00Z', '2020-10-01T20:45:00Z', '2020-10-01T21:45:00Z', '2020-10-01T22:45:00Z', '2020-10-01T23:45:00Z']
import datetime
from datetime import datetime, timedelta
timestamp_format = '%Y-%m-%dT%H:%M:%S%z'
earliest_ts_str = '2020-10-01T00:00:00Z'
ts_obj = datetime.strptime(earliest_ts_str, timestamp_format)
latest_ts_str = '2020-10-02T00:00:00Z'
latest_ts_obj = datetime.strptime(latest_ts_str, timestamp_format)
ts_raw = []
while ts_obj <= latest_ts_obj:
    ts_raw.append(ts_obj)
    ts_obj += timedelta(hours=1)
dates_formatted = [d.strftime('%Y-%m-%dT%H:%M:%SZ') for d in ts_raw]
print(dates_formatted)
EDIT:
Here is an example with Maya:
import maya
earliest_ts_str = '2020-10-01T00:00:00Z'
latest_ts_str = '2020-10-02T00:00:00Z'
start = maya.MayaDT.from_iso8601(earliest_ts_str)
end = maya.MayaDT.from_iso8601(latest_ts_str)
# end is not included, so we add 1 second
my_range = maya.intervals(start=start, end=end.add(seconds=1), interval=60*60)
dates_formatted = [d.iso8601() for d in my_range]
print(dates_formatted)
Both output
['2020-10-01T00:00:00Z',
'2020-10-01T01:00:00Z',
... some left out ...
'2020-10-01T23:00:00Z',
'2020-10-02T00:00:00Z']
Just change
num_hours = num_days.days*24 + num_days.seconds//3600
The problem is that the days attribute of a timedelta only holds whole days, so for a difference that is not a multiple of 24 hours it gives the floor value (0 in your example). To compute the number of hours you therefore need both days and seconds.
Also, you can build the list directly in ascending order by adding to the earliest timestamp instead of subtracting from the latest, unless you have a reason for doing it your way:
ts_raw.append(earliest_ts_obj + timedelta(hours = ts + 1))
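Putting the two suggestions together (whole hours from the timedelta, counting forward from the earliest timestamp) might look like the sketch below. The function name hourly_timestamps and the include_start flag are assumptions made for illustration; the flag exists because the desired output in the question starts one hour after the earliest timestamp.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%SZ"
ONE_HOUR = timedelta(hours=1)


def hourly_timestamps(earliest, latest, include_start=True):
    """List hourly timestamps counted forward from `earliest`, never past `latest`."""
    first = datetime.strptime(earliest, FMT)
    last = datetime.strptime(latest, FMT)
    # Floor division of two timedeltas gives the number of whole hours
    whole_hours = (last - first) // ONE_HOUR
    begin = 0 if include_start else 1
    return [(first + i * ONE_HOUR).strftime(FMT)
            for i in range(begin, whole_hours + 1)]


with_start = hourly_timestamps("2020-10-01T19:45:00Z", "2020-10-02T00:00:00Z")
without_start = hourly_timestamps("2020-10-01T19:45:00Z", "2020-10-02T00:00:00Z",
                                  include_start=False)
```

Because every value is derived from the earliest timestamp, the 45-minute offset is kept in each entry.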

How to find the time interval for the dataset with Python?

I need to find the time interval in Python.
So far, I did:
from datetime import datetime
timestamp1 = 737029.3541666665
earliest_date = datetime.fromtimestamp(timestamp1)
timestamp2 = 737036.3527777778
most_recent_date = datetime.fromtimestamp(timestamp2)
But I don't know how to go further.
Thank you.
If you have two timestamps like these (based on your example):
from datetime import datetime
timestamp1 = 737029.3541666665
timestamp2 = 737036.3527777778
earliest_date = datetime.fromtimestamp(timestamp1)
most_recent_date = datetime.fromtimestamp(timestamp2)
The simplest way to get the time interval between these two dates is:
interval = most_recent_date - earliest_date
# this is a datetime.timedelta object but you can use str() and get some human readable string
str(interval)
# returns: '0:00:06.998611'
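One caveat: datetime.fromtimestamp treats its argument as seconds since the Unix epoch, which is why the interval above comes out as about 7 seconds. Values of this size look more like fractional day counts (for example date ordinals or a MATLAB-style datenum), but the question does not say so; that reading is purely an assumption. Under that assumption, a sketch of the interval would be:

```python
from datetime import timedelta

timestamp1 = 737029.3541666665
timestamp2 = 737036.3527777778

# Assumption: the values are fractional day counts, not Unix seconds.
interval = timedelta(days=timestamp2 - timestamp1)
print(interval)  # roughly 6 days, 23 hours and 58 minutes
```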

Python convert GTFS time to datetime

It is common for a GTFS time to exceed 23:59:59 due to the timetable cycle. For example, the last time may be 25:20:00 (01:20:00 the next day), so converting these times to datetime raises an error.
Is there a way to convert GTFS time values into standard datetime format without splitting out the hour, fixing it, and rebuilding a string in the correct format before converting it to a datetime?
t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
'0'+str(pd.to_numeric(t[0].split(':')[0])%24)+':'+':'.join(t[0].split(':')[1:])
For the above examples, I would expect to just see
['00:22:00', '00:30:00', '01:40:00', '02:27:00']
from datetime import datetime, timedelta
def gtfs_time_to_datetime(gtfs_date, gtfs_time):
    hours, minutes, seconds = tuple(
        int(token) for token in gtfs_time.split(":")
    )
    return (
        datetime.strptime(gtfs_date, "%Y%m%d") + timedelta(
            hours=hours, minutes=minutes, seconds=seconds
        )
    )
gives the following result
>>> gtfs_time_to_datetime("20191031", "24:22:00")
datetime.datetime(2019, 11, 1, 0, 22)
>>> gtfs_time_to_datetime("20191031", "24:22:00").time().isoformat()
'00:22:00'
>>> t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
>>> [ gtfs_time_to_datetime("20191031", tt).time().isoformat() for tt in t]
['00:22:00', '00:30:00', '01:40:00', '02:27:00']
I didn't find an easy way, so I just wrote a function to do it.
If anyone else wants the solution, here is mine:
from datetime import timedelta
import pandas as pd
def list_to_real_datetime(time_list, date_exists=False):
    '''
    Convert a list of GTFS times to real datetime list
    :param time_list: GTFS times
    :param date_exists: Flag indicating if the date exists in the list elements
    :return: An adjusted list of time to conform with real date times
    '''
    # new list of times to be returned
    new_time = []
    for time in time_list:
        plus_day = False
        hour = int(time[0:2])
        if hour >= 24:
            hour -= 24
            plus_day = True
        # reset the time to a real format
        time = '{:02d}'.format(hour)+time[2:]
        # Convert the time to a datetime
        if not date_exists:
            # the format must describe the full 'date time' string
            time = pd.to_datetime('1970-01-01 '+time, format='%Y-%m-%d %H:%M:%S')
            if plus_day:
                time = time + timedelta(days=1)
        new_time.append(time)
    return new_time
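When no service date is at hand, another option is to keep the day rollover as a separate number instead of attaching a placeholder date. The sketch below uses only the standard library; split_gtfs_time is a hypothetical helper name, not part of any GTFS package.

```python
from datetime import timedelta


def split_gtfs_time(gtfs_time):
    """Return (day_offset, 'HH:MM:SS') for a GTFS time such as '25:40:00'."""
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(":"))
    elapsed = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    # divmod on timedeltas gives whole days and the remainder within the day
    day_offset, within_day = divmod(elapsed, timedelta(days=1))
    total = int(within_day.total_seconds())
    clock = "{:02d}:{:02d}:{:02d}".format(total // 3600, total % 3600 // 60, total % 60)
    return day_offset, clock


t = ['24:22:00', '24:30:00', '25:40:00', '26:27:00']
converted = [split_gtfs_time(tt) for tt in t]
```

This also copes with hours of 48 or more, since the day offset is computed rather than assumed to be one.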

Generate a random date between two other dates

How would I generate a random date that has to be between two other given dates?
The function's signature should be something like this:
random_date("1/1/2008 1:30 PM", "1/1/2009 4:50 AM", 0.34)
                    ^                   ^            ^
           date generated has  date generated has  a random number
           to be after this    to be before this
and would return a date such as: 2/4/2008 7:20 PM
Convert both strings to timestamps (in your chosen resolution, e.g. milliseconds, seconds, hours, days, whatever), subtract the earlier from the later, multiply your random number (assuming it is distributed in the range [0, 1]) with that difference, and add again to the earlier one. Convert the timestamp back to date string and you have a random time in that range.
Python example (output is almost in the format you specified, other than 0 padding - blame the American time format conventions):
import random
import time
def str_time_prop(start, end, time_format, prop):
    """Get a time at a proportion of a range of two formatted times.
    start and end should be strings specifying times formatted in the
    given format (strftime-style), giving an interval [start, end].
    prop specifies what proportion of the interval is to be taken after
    start. The returned time will be in the specified format.
    """
    stime = time.mktime(time.strptime(start, time_format))
    etime = time.mktime(time.strptime(end, time_format))
    ptime = stime + prop * (etime - stime)
    return time.strftime(time_format, time.localtime(ptime))
def random_date(start, end, prop):
    return str_time_prop(start, end, '%m/%d/%Y %I:%M %p', prop)
print(random_date("1/1/2008 1:30 PM", "1/1/2009 4:50 AM", random.random()))
from random import randrange
from datetime import timedelta
def random_date(start, end):
    """
    This function will return a random datetime between two datetime
    objects.
    """
    delta = end - start
    int_delta = (delta.days * 24 * 60 * 60) + delta.seconds
    random_second = randrange(int_delta)
    return start + timedelta(seconds=random_second)
The precision is seconds. You can increase precision up to microseconds, or decrease to, say, half-hours, if you want. For that just change the last line's calculation.
example run:
from datetime import datetime
d1 = datetime.strptime('1/1/2008 1:30 PM', '%m/%d/%Y %I:%M %p')
d2 = datetime.strptime('1/1/2009 4:50 AM', '%m/%d/%Y %I:%M %p')
print(random_date(d1, d2))
output:
2008-12-04 01:50:17
Updated answer
It's even simpler using Faker.
Installation
pip install faker
Usage:
from faker import Faker
fake = Faker()
fake.date_between(start_date='today', end_date='+30y')
# datetime.date(2025, 3, 12)
fake.date_time_between(start_date='-30y', end_date='now')
# datetime.datetime(2007, 2, 28, 11, 28, 16)
# Or if you need a more specific date boundaries, provide the start
# and end dates explicitly.
import datetime
start_date = datetime.date(year=2015, month=1, day=1)
fake.date_between(start_date=start_date, end_date='+30y')
Old answer
It's very simple using radar
Installation
pip install radar
Usage
import datetime
import radar
# Generate random datetime (parsing dates from str values)
radar.random_datetime(start='2000-05-24', stop='2013-05-24T23:59:59')
# Generate random datetime from datetime.datetime values
radar.random_datetime(
start = datetime.datetime(year=2000, month=5, day=24),
stop = datetime.datetime(year=2013, month=5, day=24)
)
# Just render some random datetime. If no range is given, start defaults to
# 1970-01-01 and stop defaults to datetime.datetime.now()
radar.random_datetime()
A tiny version.
import datetime
import random
def random_date(start, end):
    """Generate a random datetime between `start` and `end`"""
    return start + datetime.timedelta(
        # Get a random amount of seconds between `start` and `end`
        seconds=random.randint(0, int((end - start).total_seconds())),
    )
Note that both start and end arguments should be datetime objects. If
you've got strings instead, it's fairly easy to convert. The other answers point
to some ways to do so.
This is a different approach - that sort of works..
from random import randint
import datetime
date=datetime.date(randint(2005,2025), randint(1,12),randint(1,28))
BETTER APPROACH
startdate=datetime.date(YYYY,MM,DD)
date=startdate+datetime.timedelta(randint(1,365))
Since Python 3, timedelta supports multiplication by floats, so now you can do:
import random
random_date = start + (end - start) * random.random()
given that start and end are of the type datetime.datetime. For example, to generate a random datetime within the next day:
import random
from datetime import datetime, timedelta
start = datetime.now()
end = start + timedelta(days=1)
random_date = start + (end - start) * random.random()
To chip in a pandas-based solution I use:
import pandas as pd
import numpy as np
def random_date(start, end, position=None):
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    delta = (end - start).total_seconds()
    if position is None:
        offset = np.random.uniform(0., delta)
    else:
        offset = position * delta
    # pd.offsets.Second needs a whole number of seconds
    offset = pd.offsets.Second(int(offset))
    t = start + offset
    return t
I like it, because of the nice pd.Timestamp features that allow me to throw different stuff and formats at it. Consider the following few examples...
Your signature.
>>> random_date(start="1/1/2008 1:30 PM", end="1/1/2009 4:50 AM", position=0.34)
Timestamp('2008-05-04 21:06:48', tz=None)
Random position.
>>> random_date(start="1/1/2008 1:30 PM", end="1/1/2009 4:50 AM")
Timestamp('2008-10-21 05:30:10', tz=None)
Different format.
>>> random_date('2008-01-01 13:30', '2009-01-01 4:50')
Timestamp('2008-11-18 17:20:19', tz=None)
Passing pandas/datetime objects directly.
>>> random_date(pd.Timestamp.now(), pd.Timestamp.now() + pd.offsets.Hour(3))
Timestamp('2014-03-06 14:51:16.035965', tz=None)
Convert your dates into timestamps and call random.randint with the timestamps, then convert the randomly generated timestamp back into a date:
from datetime import datetime
import random
def random_date(first_date, second_date):
    first_timestamp = int(first_date.timestamp())
    second_timestamp = int(second_date.timestamp())
    random_timestamp = random.randint(first_timestamp, second_timestamp)
    return datetime.fromtimestamp(random_timestamp)
Then you can use it like this
from datetime import datetime
d1 = datetime.strptime("1/1/2018 1:30 PM", "%m/%d/%Y %I:%M %p")
d2 = datetime.strptime("1/1/2019 4:50 AM", "%m/%d/%Y %I:%M %p")
random_date(d1, d2)
random_date(d2, d1) # ValueError because the first date comes after the second date
If you care about timezones you should just use date_time_between_dates from the Faker library, where I stole this code from, as a different answer already suggests.
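For timezone-aware inputs without an extra dependency, a standard-library variant could look like the sketch below. It is not Faker's implementation; the name random_aware_datetime and the optional rng parameter are invented for this example, and the result has one-second precision.

```python
import random
from datetime import datetime, timezone


def random_aware_datetime(first, second, rng=random):
    """Pick a random timezone-aware datetime in [first, second], to the second."""
    if first.tzinfo is None or second.tzinfo is None:
        raise ValueError("both datetimes must be timezone-aware")
    lo, hi = int(first.timestamp()), int(second.timestamp())
    # An explicit tz avoids depending on the machine's local time zone
    picked = datetime.fromtimestamp(rng.randint(lo, hi), tz=timezone.utc)
    # Express the result in the same zone as the first argument
    return picked.astimezone(first.tzinfo)


d1 = datetime(2018, 1, 1, 13, 30, tzinfo=timezone.utc)
d2 = datetime(2019, 1, 1, 4, 50, tzinfo=timezone.utc)
sample = random_aware_datetime(d1, d2)
```

As with the version above, passing the dates in the wrong order raises ValueError from random.randint.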
Here is an answer to the literal meaning of the title rather than the body of this question:
import time
import datetime
import random
def date_to_timestamp(d):
    return int(time.mktime(d.timetuple()))
def randomDate(start, end):
    """Get a random date between two dates"""
    stime = date_to_timestamp(start)
    etime = date_to_timestamp(end)
    ptime = stime + random.random() * (etime - stime)
    return datetime.date.fromtimestamp(ptime)
This code is based loosely on the accepted answer.
You can use Mixer,
pip install mixer
and,
from mixer import generators as gen
print(gen.get_datetime(min_datetime=(1900, 1, 1, 0, 0, 0), max_datetime=(2020, 12, 31, 23, 59, 59)))
Just to add another one:
datestring = datetime.datetime.strftime(datetime.datetime( \
random.randint(2000, 2015), \
random.randint(1, 12), \
random.randint(1, 28), \
random.randrange(23), \
random.randrange(59), \
random.randrange(59), \
random.randrange(1000000)), '%Y-%m-%d %H:%M:%S')
The day handling needs some care. With 28 you are on the safe side.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Create random datetime object."""
from datetime import datetime
import random
def create_random_datetime(from_date, to_date, rand_type='uniform'):
    """
    Create random date within timeframe.
    Parameters
    ----------
    from_date : datetime object
    to_date : datetime object
    rand_type : {'uniform'}
    Examples
    --------
    >>> random.seed(28041990)
    >>> create_random_datetime(datetime(1990, 4, 28), datetime(2000, 12, 31))
    datetime.datetime(1998, 12, 13, 23, 38, 0, 121628)
    >>> create_random_datetime(datetime(1990, 4, 28), datetime(2000, 12, 31))
    datetime.datetime(2000, 3, 19, 19, 24, 31, 193940)
    """
    delta = to_date - from_date
    if rand_type == 'uniform':
        rand = random.random()
    else:
        raise NotImplementedError('Unknown random mode \'{}\''
                                  .format(rand_type))
    return from_date + rand * delta
if __name__ == '__main__':
    import doctest
    doctest.testmod()
# needed to create data for 1000 fictitious employees for testing code
# code relating to randomly assigning forenames, surnames, and genders
# has been removed as not germane to the question asked above but FYI
# genders were randomly assigned, forenames/surnames were web scraped,
# there is no accounting for leap years, and the data stored in mySQL
import random
from datetime import datetime
from datetime import timedelta
for employee in range(1000):
    # assign a random date of birth (employees are aged between sixteen and sixty five)
    dlt = random.randint(365*16, 365*65)
    dob = datetime.today() - timedelta(days=dlt)
    # assign a random date of hire sometime between sixteenth birthday and today
    doh = datetime.today() - timedelta(days=random.randint(0, dlt-365*16))
    print("born {} hired {}".format(dob.strftime("%d-%m-%y"), doh.strftime("%d-%m-%y")))
1. Convert your input dates to numbers (int, float, whatever is best for your usage).
2. Choose a number between your two date numbers.
3. Convert this number back to a date.
Many algorithms for converting date to and from numbers are already available in many operating systems.
What do you need the random number for? Usually (depending on the language) you can get the number of seconds/milliseconds since the Epoch from a date. So for a random date between startDate and endDate you could do:
1. Compute the time in ms between startDate and endDate (endDate.toMilliseconds() - startDate.toMilliseconds()).
2. Generate a number between 0 and the number you obtained in 1.
3. Generate a new Date with time offset = startDate.toMilliseconds() + number obtained in 2.
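In Python, the three steps above can be sketched with timedelta arithmetic, so no explicit epoch conversion is needed. The name random_datetime_ms and the rng parameter are illustrative choices, and the sketch assumes start and end are datetime objects of the same kind (both naive or both aware).

```python
import random
from datetime import datetime, timedelta

ONE_MS = timedelta(milliseconds=1)


def random_datetime_ms(start, end, rng=random):
    """Return a random datetime in [start, end] with millisecond resolution."""
    span_ms = (end - start) // ONE_MS      # step 1: span in whole milliseconds
    offset_ms = rng.randint(0, span_ms)    # step 2: random offset within the span
    return start + offset_ms * ONE_MS      # step 3: shift the start by that offset


start_date = datetime(2008, 1, 1, 13, 30)
end_date = datetime(2009, 1, 1, 4, 50)
picked = random_datetime_ms(start_date, end_date)
```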
The easiest way of doing this is to convert both dates to timestamps, then set these as the minimum and maximum bounds on a random number generator.
A quick PHP example would be:
// Find a randomDate between $start_date and $end_date
function randomDate($start_date, $end_date)
{
    // Convert to timestamps
    $min = strtotime($start_date);
    $max = strtotime($end_date);
    // Generate random number using above bounds
    $val = rand($min, $max);
    // Convert back to desired date format
    return date('Y-m-d H:i:s', $val);
}
This function makes use of strtotime() to convert a datetime description into a Unix timestamp, and date() to make a valid date out of the random timestamp which has been generated.
This is a modified version of @Tom Alsberg's method. I changed it to return the date with milliseconds.
import random
import time
import datetime
def random_date(start_time_string, end_time_string, format_string, random_number):
    """
    Get a time at a proportion of a range of two formatted times.
    start_time_string and end_time_string should be strings specifying times
    formatted in the given format (strftime-style), giving an interval [start, end].
    random_number specifies what proportion of the interval is to be taken after
    start. The returned time will be in the specified format.
    """
    dt_start = datetime.datetime.strptime(start_time_string, format_string)
    dt_end = datetime.datetime.strptime(end_time_string, format_string)
    start_time = time.mktime(dt_start.timetuple()) + dt_start.microsecond / 1000000.0
    end_time = time.mktime(dt_end.timetuple()) + dt_end.microsecond / 1000000.0
    random_time = start_time + random_number * (end_time - start_time)
    return datetime.datetime.fromtimestamp(random_time).strftime(format_string)
Example:
print(TestData.TestData.random_date("2000/01/01 00:00:00.000000", "2049/12/31 23:59:59.999999", '%Y/%m/%d %H:%M:%S.%f', random.random()))
Output: 2028/07/08 12:34:49.977963
Here's a solution modified from emyller's approach which returns an array of random dates at any resolution
import numpy as np
def random_dates(start, end, size=1, resolution='s'):
    """
    Returns an array of random dates in the interval [start, end]. Valid
    resolution arguments are numpy date/time units, as documented at:
    https://docs.scipy.org/doc/numpy-dev/reference/arrays.datetime.html
    """
    start, end = np.datetime64(start), np.datetime64(end)
    delta = (end-start).astype('timedelta64[{}]'.format(resolution))
    delta_mat = np.random.randint(0, delta.astype('int'), size)
    return start + delta_mat.astype('timedelta64[{}]'.format(resolution))
Part of what's nice about this approach is that np.datetime64 is really good at coercing things to dates, so you can specify your start/end dates as strings, datetimes, pandas timestamps... pretty much anything will work.
Alternative way to create random dates between two dates using np.random.randint(), pd.Timestamp().value and pd.to_datetime() with for loop:
# Import libraries
import numpy as np
import pandas as pd
# Initialize
start = '2020-01-01' # Specify start date
end = '2020-03-10' # Specify end date
n = 10 # Specify number of dates needed
# Get random dates
x = np.random.randint(pd.Timestamp(start).value, pd.Timestamp(end).value,n)
random_dates = [pd.to_datetime((i/10**9)/(60*60)/24, unit='D').strftime('%Y-%m-%d') for i in x]
print(random_dates)
Output
['2020-01-06',
'2020-03-08',
'2020-01-23',
'2020-02-03',
'2020-01-30',
'2020-01-05',
'2020-02-16',
'2020-03-08',
'2020-02-09',
'2020-01-04']
Get random date between start_date and end_date.
If any of them is None, then get random date between
today and past 100 years.
import datetime
import random
class GetRandomDateMixin:
    def get_random_date(self, start_date=None, end_date=None):
        """
        get random date between start_date and end_date.
        If any of them is None, then get random date between
        today and past 100 years.
        :param start_date: datetime obj.
        eg: datetime.datetime(1940, 1, 1).date()
        :param end_date: datetime obj
        :return: random date
        """
        if start_date is None or end_date is None:
            end_date = datetime.datetime.today().date()
            start_date = end_date - datetime.timedelta(
                days=(100 * 365)
            )
        delta = end_date - start_date
        random_days = random.randint(1, delta.days)
        new_date = start_date + datetime.timedelta(
            days=random_days
        )
        return new_date
Building off of @Pieter Bos's answer:
import random
import datetime
start = datetime.date(1980, 1, 1)
end = datetime.date(2000, 1, 1)
random_date = start + (end - start) * random.random()
random_date = datetime.datetime.combine(random_date, datetime.datetime.min.time())
Use my randomtimestamp module. It has 3 functions, randomtimestamp, random_time, and random_date.
Below is the signature of randomtimestamp function. It can generate a random timestamp between two years, or two datetime objects (if you like precision).
There's option to get the timestamp as a datetime object or string. Custom patterns are also supported (like strftime)
randomtimestamp(
    start_year: int = 1950,
    end_year: int = None,
    text: bool = False,
    start: datetime.datetime = None,
    end: datetime.datetime = None,
    pattern: str = "%d-%m-%Y %H:%M:%S"
) -> Union[datetime, str]:
Example:
>>> randomtimestamp(start_year=2020, end_year=2021)
datetime.datetime(2021, 1, 10, 5, 6, 19)
>>> start = datetime.datetime(2020, 1, 1, 0, 0, 0)
>>> end = datetime.datetime(2021, 12, 31, 0, 0, 0)
>>> randomtimestamp(start=start, end=end)
datetime.datetime(2020, 7, 14, 14, 12, 32)
Why not faker?
Because randomtimestamp is lightweight and fast. As long as random timestamps are the only thing you need, faker is overkill and also heavy (being feature rich).
Conceptually it's quite simple. Depending on which language you're using you will be able to convert those dates into some reference 32 or 64 bit integer, typically representing seconds since epoch (1 January 1970) otherwise known as "Unix time" or milliseconds since some other arbitrary date. Simply generate a random 32 or 64 bit integer between those two values. This should be a one liner in any language.
On some platforms you can generate a time as a double (date is the integer part, time is the fractional part is one implementation). The same principle applies except you're dealing with single or double precision floating point numbers ("floats" or "doubles" in C, Java and other languages). Subtract the difference, multiply by random number (0 <= r <= 1), add to start time and done.
In python:
>>> from dateutil.rrule import rrule, DAILY
>>> import datetime, random
>>> random.choice(
list(
rrule(DAILY,
dtstart=datetime.date(2009,8,21),
until=datetime.date(2010,10,12))
)
)
datetime.datetime(2010, 2, 1, 0, 0)
(need python dateutil library – pip install python-dateutil)
I made this for another project using random and time. I used a general format from time; the documentation for the first argument of strftime() lists the available directives. The second part uses random.randrange, which returns an integer between its arguments. Change the ranges to match the strings you would like. The tuple passed as the second argument must contain valid values.
import time
import random
def get_random_date():
    return time.strftime("%Y-%m-%d %H:%M:%S",(random.randrange(2000,2016),random.randrange(1,12),
        random.randrange(1,28),random.randrange(1,24),random.randrange(1,60),random.randrange(1,60),random.randrange(1,7),random.randrange(0,366),1))
Pandas + numpy solution
import pandas as pd
import numpy as np
def RandomTimestamp(start, end):
    dts = (end - start).total_seconds()
    return start + pd.Timedelta(np.random.uniform(0, dts), 's')
dts is the difference between timestamps in seconds (float). It is then used to create a pandas timedelta between 0 and dts, that is added to the start timestamp.
Based on the answer by mouviciel, here is a vectorized solution using numpy. Convert the start and end dates to ints, generate an array of random numbers between them, and convert the whole array back to dates.
import time
import datetime
import numpy as np
n_rows = 10
start_time = "01/12/2011"
end_time = "05/08/2017"
date2int = lambda s: int(time.mktime(datetime.datetime.strptime(s,"%d/%m/%Y").timetuple()))
int2date = lambda s: datetime.datetime.fromtimestamp(s).strftime('%Y-%m-%d %H:%M:%S')
start_time = date2int(start_time)
end_time = date2int(end_time)
random_ints = np.random.randint(low=start_time, high=end_time, size=(n_rows,1))
random_dates = np.apply_along_axis(int2date, 1, random_ints).reshape(n_rows,1)
print(random_dates)
import time
from random import randrange
start_timestamp = int(time.mktime(time.strptime('Jun 1 2010 01:33:00', '%b %d %Y %I:%M:%S')))
end_timestamp = int(time.mktime(time.strptime('Jun 1 2017 12:33:00', '%b %d %Y %I:%M:%S')))
time.strftime('%b %d %Y %I:%M:%S',time.localtime(randrange(start_timestamp,end_timestamp)))
What about
import datetime
import random
def random_date(begin: datetime.datetime, end: datetime.datetime):
    epoch = datetime.datetime(1970, 1, 1)
    begin_seconds = int((begin - epoch).total_seconds())
    end_seconds = int((end - epoch).total_seconds())
    dt_seconds = random.randint(begin_seconds, end_seconds)
    return datetime.datetime.fromtimestamp(dt_seconds)
Haven't tried it with "epoch" years different than 1970 but it does the job
Generates random dates between 50 and 20 years ago, as dates only (no time component).
import random
from datetime import date, timedelta
from dateutil.relativedelta import relativedelta
start_date = date.today() - relativedelta(years=50)
end_date = date.today() - relativedelta(years=20)
delta = end_date - start_date
print(delta.days)
random_number = random.randint(1, delta.days)
new_date = start_date + timedelta(days=random_number)
print (new_date)
