Convert days data to years data in a list - python

I want to plot a time series of temperature data from 1850 to 2014. My problem is that the time axis of the plot starts at 0, which corresponds to 1 January 1850, and stops at day 60 230, which corresponds to 31 December 2014.
I tried to write a loop that builds a new list with the time in months and years, so that I could plot this new list against my initial temperature list, but it did not work.
This is the kind of loop that I tested:
days = list(range(1,365+1))
years = []
y = 1850
years.append(y)
while y<2015:
    for i in days:
        years.append(y+i)
    y = y+1
del years[-1]
dsetyears = Dataset(years)
I also tried the datetime module, but that did not work either (maybe this tool is better because it takes leap years into account).
day_number = "0"
year = "1850"
res = datetime.strptime(year + "-" + day_number, "%Y-%j").strftime("%m-%d-%Y")
If anyone has a clue or a lead I can look into, I'm interested.
Thanks in advance!

You can achieve that using the datetime module. Let's declare the starting and ending dates.
import datetime
dates = []
starting_date = datetime.datetime(1850, 1, 1)
ending_date = datetime.datetime(2014, 12, 31)  # the data ends on 31 December 2014
Then we can write a while loop that runs as long as the starting date is less than or equal to the ending date. On each iteration we first append the formatted date as a string to the dates list, then add one day using timedelta.
while starting_date <= ending_date:
    dates.append(starting_date.strftime("%m-%d-%Y"))
    starting_date += datetime.timedelta(days=1)
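If the goal is only a date axis for the plot, a list comprehension can do the same job. This is a minimal sketch, assuming one temperature value per day on the standard Gregorian calendar; the names `start`, `end`, `n_days` and `years` are made up for the example.

```python
import datetime

# Assumed span of the data: 1 January 1850 to 31 December 2014, one value per day
start = datetime.date(1850, 1, 1)
end = datetime.date(2014, 12, 31)

# Number of days in the span, counting both endpoints
n_days = (end - start).days + 1

# Day offset 0 maps to the start date, offset 1 to the next day, and so on
dates = [start + datetime.timedelta(days=i) for i in range(n_days)]

# The year of each day, if only the year is needed for labels
years = [d.year for d in dates]
```

Real date objects can be passed directly as the x values of a plot. If your temperature list has a different length than `n_days`, it is probably safer to build the axis from the length of that list and to check which calendar the dataset uses.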

Related

update Pandas DataFrame time column based on a date range

I have loaded a big file and created a DataFrame from it.
Now I want to update some of the columns containing timestamps and, if possible, update the date columns based on them.
The reason is that I want to adjust for daylight saving time: the data I am working with is in GMT, so I need to adjust its timestamps.
Example that works:
df_winter2['Confirmation_Time'] = pd.to_datetime(df_winter2['Confirmation_Time'].astype(str)) + pd.DateOffset(hours=7)
df_summer['Confirmation_Time'] = pd.to_datetime(df_summer['Confirmation_Time'].astype(str)) + pd.DateOffset(hours=6)
I want to write a function that first adds 6 or 7 hours to the DataFrame, depending on whether it is summertime or wintertime.
If possible, I also want to add one day to the date column when the timestamp is later than 16:00.
The date column is called df['Creation_Date'].
This function should work for checking whether it is wintertime.
def wintertime(date_time):
    # Unpack year, month and day from the datetime that was passed in
    year, month, day = date_time.timetuple()[0:3]
    if (month < 3) or (month == 12 and day < 21):
        return True
    else:
        return False
I am guessing you also want to loop through your df and update each time accordingly, which you could do with the following:
for i in df.index:
    # Parse the stored value once, then shift it by 7 hours in winter and 6 otherwise
    date_time = pd.to_datetime(df.loc[i, 'Confirmation_Time'])
    if wintertime(date_time):
        df.loc[i, 'Confirmation_Time'] = date_time + pd.DateOffset(hours=7)
    else:
        df.loc[i, 'Confirmation_Time'] = date_time + pd.DateOffset(hours=6)
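Looping row by row can get slow on a big file. Here is a sketch of a vectorized variant, assuming the same winter rule as in wintertime above and a made-up two-row frame; the names `times`, `is_winter` and `hours` are only for illustration.

```python
import pandas as pd

# Made-up sample data standing in for the real file
df = pd.DataFrame(
    {"Confirmation_Time": ["2019-01-15 10:00:00", "2019-07-15 18:30:00"]}
)

# Parse the whole column at once
times = pd.to_datetime(df["Confirmation_Time"])

# Same rule as wintertime(): month before March, or December before the 21st
is_winter = (times.dt.month < 3) | ((times.dt.month == 12) & (times.dt.day < 21))

# 7 hours for winter rows, 6 hours for all other rows
hours = is_winter.map({True: 7, False: 6})

# Add the per-row offset in one step
df["Confirmation_Time"] = times + pd.to_timedelta(hours, unit="h")
```

The mask `is_winter` can also be reused to split the frame into the winter and summer parts that the question started from.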

Python Pandas: Loop through dates and add them as new rows to the dataframe?

I have a basic dataframe that is read into pandas, with a few rows of existing data that don't matter much.
df = pd.read_csv('myfile.csv')
df['Date'] = pd.to_datetime(df['Date'])
I need a method that loops between two dates and adds each date as a new row. The dates follow a cycle: 21 days on out of every 28 days. So if the start date is 4/1/13 and the end date is 6/1/19, I want to add a row for each date, 21 days on and then off for a week.
Desired output:
A, Date
x, 4/1/13
x, 4/2/13
x, 4/3/13
x, 4/4/13
x, 4/5/13
... cont'd
x, 4/21/13
y, 4/29/13
y, 4/30/13
... cont'd
You can see that between x and y there was a new cycle.
I think I am supposed to use Datetime for this but please correct me if I am wrong. I am not sure where to start.
EDIT
I started with this:
import datetime
# The size of each step in days
day_delta = datetime.timedelta(days=1)
start_date = datetime.date(2013, 4, 1)
end_date = start_date + 21*day_delta
for i in range((end_date - start_date).days):
    print(start_date + i*day_delta)
And got this:
2013-04-01
2013-04-02
2013-04-03
2013-04-04
2013-04-05
2013-04-06
2013-04-07
2013-04-08
2013-04-09
2013-04-10
2013-04-11
2013-04-12
2013-04-13
2013-04-14
2013-04-15
2013-04-16
2013-04-17
2013-04-18
2013-04-19
2013-04-20
2013-04-21
But I am not sure how to implement the cycle in here.
TYIA!
Interesting question, I spent almost half an hour on this.
Yes, you will need the datetime module for this.
base = datetime.datetime.today()
date_list = [base - datetime.timedelta(days=x) for x in range(100)]
I made a list of dates as you did. This is a list of datetime.datetime objects. I recommend converting all your dates into this format to make calculations easier. We set a base date (the first day) to compare with the rest later on in a loop.
date_list_filtered = []
for each in enumerate(date_list):
    date_list_filtered.append(each[1].strftime('%d/%m/%y'))
strftime() changes the datetime.datetime object into a readable date, my own preference is using the dd/mm/yy format. You can look up different formats online.
df = pd.DataFrame({'Raw':date_list,'Date':date_list_filtered})
Here I made a loop to count the difference in days between each date in the loop and the base date, changing the base date every time it hits -21.
Edit: Oops I did 21 days instead of 28, but I'm sure you can tweak it
import string
import numpy as np
base = df['Raw'][0]
unique_list = []
no21 = 0
for date in df['Raw'].values:
    try:
        res = (date-base).days
    except AttributeError:
        # A numpy timedelta64 has no .days attribute, so convert it explicitly
        res = (date-base).astype('timedelta64[D]')/np.timedelta64(1, 'D')
    if res==-21.0:
        base = date
        #print(res)
        unique_list.append(string.ascii_letters[no21])
        no21+=1
    else:
        unique_list.append(string.ascii_letters[no21])
I used the string library to get the unique letters I wanted.
Lastly, put it in the data frame.
df['Unique'] = unique_list
Thanks for asking this question, it was really fun.
You can floor divide the difference in days from the start date by 28 to get the number of cycles.
date_start = datetime.datetime(2013, 4, 1)
date1 = datetime.datetime(2013, 5, 26)
And to check the difference
diff_days = (date1-date_start).days
diff_days
55
cycle = (date1-date_start).days//28
cycle
1
Then you can sum over the dates within the same cycle.
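The floor division can also drive the row generation itself. Below is a sketch, assuming 21 days on followed by 7 days off within each 28-day cycle; `cycle_rows`, `on_days` and `cycle_days` are hypothetical names.

```python
import datetime

def cycle_rows(start, end, on_days=21, cycle_days=28):
    """Return (cycle_number, date) pairs for the 'on' days between start and end."""
    rows = []
    day = start
    while day <= end:
        offset = (day - start).days
        # cycle: which 28-day block we are in; position: day index inside that block
        cycle, position = divmod(offset, cycle_days)
        if position < on_days:
            rows.append((cycle, day))
        day += datetime.timedelta(days=1)
    return rows

rows = cycle_rows(datetime.date(2013, 4, 1), datetime.date(2013, 5, 5))
```

Here the first cycle covers 2013-04-01 to 2013-04-21 and the second one starts on 2013-04-29, as in the desired output. The pairs can then be passed to pd.DataFrame(rows, columns=[...]) with whatever column names you need.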

Python Function running on a random date

#first and last day of every month
s_january, e_january = ("1/1/2017"), ("1/31/2017")
s_february, e_february = ("2/1/2017"), ("2/28/2017")
s_march, e_march = ("3/1/2017"), ("3/31/2017")
s_april, e_april = ("4/1/2017"), ("4/30/2017")
s_may, e_may = ("5/1/2017"), ("5/31/2017")
s_june, e_june = ("6/1/2017"), ("6/30/2017")
s_july, e_july = ("7/1/2017"), ("7/31/2017")
s_august, e_august = ("8/1/2017"), ("8/31/2017")
s_september, e_september = ("9/1/2017"), ("9/30/2017")
s_october, e_october = ("10/1/2017"), ("10/31/2017")
s_november, e_november = ("11/1/2017"), ("11/30/2017")
s_december, e_december = ("12/1/2017"), ("12/31/2017")
def foo(s_date, e_date):
    pass  # does stuff
foo(s_january, e_january)
foo(s_february, e_february)
foo(s_march, e_march)
foo(s_april, e_april)
foo(s_may, e_may)
foo(s_june, e_june)
foo(s_july, e_july)
foo(s_august, e_august)
foo(s_september, e_september)
foo(s_october, e_october)
foo(s_november, e_november)
foo(s_december, e_december)
I have a function that does stuff for a given date range, but I have to call it once for every month; if I pass the range for the whole year, I don't get the result that I want.
Is there a better way that avoids calling it 12 times?
Set up your dates in a dictionary rather than 24 variables, and make life easier for yourself by computing the first and last day of each month. It would be useful also to represent your dates as datetimes not strings, since it's clear from your question header that you want to do computation on them.
import datetime
from dateutil import relativedelta
year = 2017
dates = {}
for month in range(1,13):
    dates[(year,month)] = (
        datetime.date(year,month,1),
        datetime.date(year,month,1)
        + relativedelta.relativedelta(months=1)
        - relativedelta.relativedelta(days=1))
The first element in each tuple is computed straightforwardly as the first day of the month. The second date is the same date, but with one month added (first day of the next month) and then one day subtracted, to get the last day of the month.
Then you can do:
for (year,month),(start,end) in dates.items():
    print(year, month, foo (start,end))
You could use a dictionary to keep all start and end dates:
import calendar
import datetime as dt
def foo(s_date, e_date):
    print ("Doing something between {} and {}".format(s_date.strftime('%d/%m/%Y'), e_date.strftime('%d/%m/%Y')))
def getMonths(year):
    result = {}
    for month in range(1, 13):
        lastDayOfMonth = calendar.monthrange(year, month)[1]
        result[month] = (dt.datetime(year, month, 1), dt.datetime(year, month, lastDayOfMonth))
    return result
for month, start_end_dates in getMonths(2018).items():
    foo(*start_end_dates)
Prints:
Doing something between 01/01/2018 and 31/01/2018
Doing something between 01/02/2018 and 28/02/2018
Doing something between 01/03/2018 and 31/03/2018
...
What do you mean by putting the range for the year?
You could consider putting your dates in a dictionary or in nested lists.
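For the nested-list variant, a sketch could look like this. It assumes the dates are wanted as datetime.date objects; `month_ranges` is a hypothetical helper, and `foo` here is only a placeholder that counts the days in the range.

```python
import calendar
import datetime

def month_ranges(year):
    """Return a list of [first_day, last_day] pairs, one per month of the year."""
    ranges = []
    for month in range(1, 13):
        # monthrange returns (weekday of the first day, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        ranges.append([datetime.date(year, month, 1),
                       datetime.date(year, month, last_day)])
    return ranges

def foo(s_date, e_date):
    # Placeholder for the real work: number of days in the range, inclusive
    return (e_date - s_date).days + 1

# One call per month, written once
results = [foo(start, end) for start, end in month_ranges(2017)]
```

The loop replaces the twelve separate calls, and changing the year is a single argument.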

python get day if its in a dates range

I'm trying to check whether the first or the last day of the month lies in a range of dates (a 7-day window starting from the current date). Below is an example of what I'm trying to achieve:
import datetime, calendar
today = datetime.date.today()
date_list = [today + datetime.timedelta(days=x) for x in range(0, 7)]
lastDayOfMonth = today.replace(day=calendar.monthrange(today.year,today.month)[-1])
if 1 in [ date_list[i].day for i in range(0, len(date_list))]:
    print("we have first day of month in range")
# lastDayOfMonth is a date, so compare it against the dates themselves
elif lastDayOfMonth in date_list:
    print("we have last date of month in the range")
I'm wondering if there is a cleaner way of doing that. I also want to print the exact date if I find it in the list, but I don't know how to do that without expanding the for loop in the if statement and saving date_list[i] when it matches my condition. So instead of printing a message when I find the first day in the range, I should print the actual date, and the same for the last day.
Thanks in advance!
The only thing I can come up with that does not need iteration is:
import datetime, calendar
today = datetime.date.today()
week_from_today = today + datetime.timedelta(days=6)
last_day_of_month = today.replace(day=calendar.monthrange(today.year,today.month)[-1])
if today.month != week_from_today.month:
    print(datetime.date(week_from_today.year, week_from_today.month, 1))
elif today <= last_day_of_month <= week_from_today:
    print(last_day_of_month)
Since today is 2016-06-02, it's hard to test the code as is.
Try changing the variable today to another day. I used the dates 2016-05-25 and 2016-05-26 to test the code.
To set a custom date: today = datetime.date(yyyy, m, d)
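If iterating is acceptable, a sketch that returns the actual dates could look like this. It assumes a 7-day window that includes today; `month_boundaries_in_window` and `window_days` are hypothetical names.

```python
import calendar
import datetime

def month_boundaries_in_window(today, window_days=7):
    """Return (first_days, last_days): month-start and month-end dates in the window."""
    window = [today + datetime.timedelta(days=x) for x in range(window_days)]
    # A date is a month start when its day number is 1
    first_days = [d for d in window if d.day == 1]
    # A date is a month end when its day number equals the length of its own month
    last_days = [d for d in window
                 if d.day == calendar.monthrange(d.year, d.month)[1]]
    return first_days, last_days

first_days, last_days = month_boundaries_in_window(datetime.date(2016, 5, 26))
```

For 2016-05-26 this gives 2016-06-01 as the only month start and 2016-05-31 as the only month end. Both lists are empty when the window does not touch a month boundary, so they can be printed or tested directly.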

How to list next 24 months' start dates with python?

Please tell me how I can list next 24 months' start dates with python,
such as:
01May2014
01June2014
.
.
.
01Aug2015
and so on
I tried:
import datetime
this_month_start = datetime.datetime.now().replace(day=1)
for i in range(24):
    print((this_month_start + i*datetime.timedelta(40)).replace(day=1))
But it skips some months.
Just increment the month value; I used datetime.date() types here as that's more than enough:
current = datetime.date.today().replace(day=1)
for i in range(24):
    new_month = current.month % 12 + 1
    new_year = current.year + current.month // 12
    current = current.replace(month=new_month, year=new_year)
    print(current)
The new month calculation picks the next month based on the last calculated month, and the year is incremented every time the previous month reached December.
By manipulating a current object, you simplify the calculations; you can do it with i as an offset as well, but the calculation gets a little more complicated.
It'll work with datetime.datetime() too.
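For reference, a sketch of the offset-based variant mentioned above, which counts months since year 0 and splits the total back into year and month with divmod; `month_starts` and `count` are hypothetical names.

```python
import datetime

def month_starts(start, count=24):
    """Return the first day of each of the next `count` months after `start`."""
    # Zero-based month index counted from year 0
    base = start.year * 12 + (start.month - 1)
    result = []
    for i in range(1, count + 1):
        # Split the running month index back into a year and a zero-based month
        year, month0 = divmod(base + i, 12)
        result.append(datetime.date(year, month0 + 1, 1))
    return result

starts = month_starts(datetime.date(2014, 4, 15))
```

Starting from a date in April 2014, the list runs from 2014-05-01 to 2016-04-01. Each entry can be formatted with strftime('%d%b%Y') to get strings like the ones in the question.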
To simplify the arithmetic, try/except can be used:
from datetime import date
current = date.today().replace(day=1)
for _ in range(24):
    try:
        current = current.replace(month=current.month + 1)
    except ValueError: # new year
        current = current.replace(month=1, year=current.year + 1)
    print(current.strftime('%d%b%Y'))
