Adding up datetime.datetime that are in single list - python

I've looked everywhere and I can't seem to find what I need. I have a list of datetime objects that I need to add together to get a total. The list is parsed from a file and can have any number of datetime items in it. For example, it looks like this:
[datetime.datetime(1900, 1, 1, 1, 19, 42, 89000), datetime.datetime(1900, 1, 1, 2, 8, 4, 396000), datetime.datetime(1900, 1, 1, 0, 43, 54, 883000), datetime.datetime(1900, 1, 1, 0, 9, 13, 343000)]
This is the code I'm using to build it:
time = [i[8] for i in smaller_list]
try:
    times = [datetime.datetime.strptime(x, "%H:%M:%S.%f") for x in time]
except ValueError:
    times = [datetime.datetime.strptime(x, "%M:%S.%f") for x in time]
time gets its values from a larger nested list that I created to separate lines of data. I've tried datetime.datetime.combine(), but I'm not really sure how to use it for items in a single list. Do I need to create a nested list of datetimes and sum them up? How do I iterate through this list and combine all the times for a sum? If I do have to create a nested list, how do I iterate through it to add up the times? Trying to wrap my head around this.
When you print time this is what is returned so the example list directly helps me.
[datetime.datetime(1900, 1, 1, 1, 19, 42, 89000), datetime.datetime(1900, 1, 1, 2, 8, 4, 396000), datetime.datetime(1900, 1, 1, 0, 43, 54, 883000), datetime.datetime(1900, 1, 1, 0, 9, 13, 343000)]
This is what the original times look like. I need to add up times such as these to get a total time. They are usually in minutes with fractional seconds included, and only rarely include hours.
25:21.442
09:52.149
28:03.604
27:12.113

If anyone else runs into this problem here is the code I used.
time = [i[8] for i in smaller_list]
sumtime = datetime.timedelta()
for i in time:
    try:
        (h, m, s) = i.split(':')
        d = datetime.timedelta(hours=int(h), minutes=int(m), seconds=float(s))
    except ValueError:
        (m, s) = i.split(':')
        d = datetime.timedelta(minutes=int(m), seconds=float(s))
    sumtime += d
print(str(sumtime))
If you're learning Python, it's pretty confusing trying to wrap your mind around datetime and timedelta. For durations you need to use timedelta. You have to split the values up and pass the correct pieces to timedelta, and then you can sum the results to find the total duration. Hopefully this helps someone out there.
If you need to round the fractional seconds to whole seconds, you can use this line in place of the one that assigns d:
d = datetime.timedelta(minutes=int(m), seconds=round(float(s)))
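If the values have already been parsed with strptime, as in the example list from the question, one possible alternative is to subtract the placeholder date that strptime fills in (1900-01-01) from each datetime and add up the resulting timedeltas. This is only a sketch; the names EPOCH and total_duration are made up for illustration.

```python
import datetime

# strptime fills in 1900-01-01 when the format string has no date fields.
EPOCH = datetime.datetime(1900, 1, 1)

def total_duration(parsed_times):
    """Sum datetimes that only carry a time of day, treating each as a duration."""
    # sum() needs a timedelta start value; the default start of 0 would raise TypeError.
    return sum((t - EPOCH for t in parsed_times), datetime.timedelta())

times = [
    datetime.datetime(1900, 1, 1, 1, 19, 42, 89000),
    datetime.datetime(1900, 1, 1, 2, 8, 4, 396000),
    datetime.datetime(1900, 1, 1, 0, 43, 54, 883000),
    datetime.datetime(1900, 1, 1, 0, 9, 13, 343000),
]
total = total_duration(times)
print(total)  # 4:20:54.711000
```

Note that strptime rejects minute values above 59, so the split-based code above remains the safer choice when a single entry can exceed an hour without an hours field.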

Related

Inserting a string with time into a fixed-length sorted list of times

I have a piece of code, as below, where if a time is less than a value in a list of 5 times I want to insert the new value in the list and delete 5th value in the list, essentially updating the list with the new set of top five times.
hours=str(input('enter hours:'))
minutes=int(input('enter minutes:'))
seconds=int(input('enter seconds:'))
if minutes <10:
    minutes='0'+str(minutes)
if seconds <10:
    seconds='0'+str(seconds)
currentTime=hours+':'+str(minutes)+':'+str(seconds)
print(currentTime)
hours=int(currentTime[0])
minutes=int(currentTime[2]+currentTime[3])
seconds=int(currentTime[5]+currentTime[6])
print(hours, minutes, seconds)
for i in range(1,6):
    aTime=str(Times[0][i])
    print(Times[0][i])
    if hours==int(aTime[1]):
        if minutes==int(aTime[3]+aTime[4]):
            if seconds <=int(aTime[6]+aTime[7]):
                #equal values deemed quicker
                print('quicker time than time',str(i))
                #insert value in Times[0][i]
                break
        elif minutes<int(aTime[3]+aTime[4]):
            print('quicker time than time',str(i))
            #insert value in Times[0][i]
            break
    elif hours<int(aTime[1]):
        print('quicker time than time',str(i))
        #insert value in Times[0][i]
        break
As an example of what I want this to achieve, if
Times=['1:30:00','2:00:00','2:00:00','2:00:00','2:00:00']
and the current time is currentTime='1:45:00',
then Times should become:
Times=['1:30:00','1:45:00','2:00:00','2:00:00','2:00:00']
Feel like this type of question has been asked before but couldn't find one.
Thanks In Advance.
Side note: I imagine this is a very inefficient way of comparing the times, but the method is correct. I just need to know how to insert a value into a list.
This looks like a problem that would be best solved by coming up with a data structure.
We'll call it a MaxList since:
It only retains the minimum elements
It has a maximum number of elements
class MaxList:
    def __init__(self, size, remaining=None):
        self._size = size
        self._contents = []
        if remaining:
            self._contents = remaining
    def append(self, item):
        for i, entry in enumerate(self._contents):
            if item < entry:
                self._contents.insert(i, item)
                break
        else:
            # item is not smaller than any entry: add it at the end
            self._contents.append(item)
        # keep only the `size` smallest items
        self._contents = self._contents[:self._size]
    def __repr__(self):
        return self._contents.__repr__()
This implementation assumes the list starts sorted, and works as follows:
>>> ls = MaxList(5, [1, 3, 5, 7, 9]) # 5 is the total length, rest is the list
>>> ls
[1, 3, 5, 7, 9]
>>> ls.append(8)
>>> ls
[1, 3, 5, 7, 8] # `9` was pushed out
>>> ls.append(10)
>>> ls
[1, 3, 5, 7, 8] # Notice `10` is not in the list
>>> ls.append(-1)
>>> ls
[-1, 1, 3, 5, 7]
>>> ls.append(-5)
>>> ls
[-5, -1, 1, 3, 5]
I'd recommend using a non-string representation for times: it will make it easier to compare and decide if one value is less than or greater than another. The datetime objects in Python's standard library could be a good place to start.
For example:
from datetime import time
times = MaxList(5, [
    time(1, 30, 0),
    time(2, 0, 0),
    time(2, 0, 0),
    time(2, 0, 0),
    time(2, 0, 0),
])
print(times)
times.append(time(1, 45, 0))
print(times)
Output:
[datetime.time(1, 30), datetime.time(2, 0), datetime.time(2, 0), datetime.time(2, 0), datetime.time(2, 0)]
[datetime.time(1, 30), datetime.time(1, 45), datetime.time(2, 0), datetime.time(2, 0), datetime.time(2, 0)]
timeint = int(newtime.replace(':', ''))
for n, t in enumerate(times):
    if int(t.replace(':', '')) > timeint:
        times.insert(n, newtime)  # shift the slower times down by one
        times.pop()               # drop the slowest to keep the length fixed
        break
print(times)
where newtime is the new time you are trying to compare and times is the list of previous times
You can use the bisect module for this.
import bisect
Times=['1:30:00','2:00:00','2:00:00','2:00:00','2:00:00']
Time='1:45:00'
bisect.insort(Times, Time)  # insert at the sorted position
Times.pop()                 # drop the slowest time to keep five entries
>>> Times
['1:30:00', '1:45:00', '2:00:00', '2:00:00', '2:00:00']
This only works, in general, for a list that is already sorted (yours is) and an element that compares to the existing elements in the desired way. Times can be written as strings so that lexical ordering matches chronological ordering, provided every field has a fixed width (for example, zero-padded hours).
See ISO 8601. Time-only strings are a subset that can be treated the same way.
While strings will work, a better way to store and compare times and dates in Python is with the datetime module. You can use the bisect module with datetime objects to insert into a sorted list the same way.
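As a rough sketch of that last suggestion, the following keeps a sorted list of datetime.time objects at a fixed length using bisect. The function name insert_top_time and the size parameter are hypothetical, chosen only for this example.

```python
import bisect
from datetime import time

def insert_top_time(times, new_time, size=5):
    """Insert new_time into the sorted list in place, keeping only the `size` fastest."""
    # insort_left places new_time before any equal entries.
    bisect.insort_left(times, new_time)
    # Drop whatever now falls beyond the size limit.
    del times[size:]
    return times

best = [time(1, 30), time(2, 0), time(2, 0), time(2, 0), time(2, 0)]
insert_top_time(best, time(1, 45))
print(best)
```

A time slower than every entry is inserted at the end and removed again straight away, so the list is left unchanged in that case.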

Check if an element in a series is increasing with respect to the previous values in a series pandas, fast solution

I want to check if elements of my series are continuously increasing.
For example if I have the following numbers:
[7, 15, 23, 0, 32, 18]
my output should be
[0, 1, 2, 0, 1, 0]
If any value is greater than the previous value, then the output value will be output of previous value + 1, otherwise it resets to zero.
I have implemented a naive for loop solution in python, which is as follows:
import numpy as np
def const_increasing(tmp):
    inc_ser = np.zeros(len(tmp))
    for i in range(1, len(tmp)):
        if tmp[i] > tmp[i-1]:
            inc_ser[i] = 1 + inc_ser[i-1]
    return inc_ser
But this solution is quite slow, as I am working with large pandas Series. Is there an efficient way of implementing it? Maybe using the expanding() function, or something better in pandas or NumPy.
Any help in this regard would be really appreciated.
Since you tagged pandas:
import pandas as pd
s = pd.Series([7, 15, 23, 0, 32, 18]).diff().gt(0)
s.groupby((~s).cumsum()).cumcount().to_list()
Output:
[0, 1, 2, 0, 1, 0]
Here is an answer that does not use a cumulative sum:
import numpy as np
a = np.array([7, 15, 23, 0, 32, 18])
c = np.append(np.array([False]), (a[1:] > a[:-1]))
result = np.concatenate([np.arange(x.size)
                         for x in np.split(c, np.where(c == False)[0][1:])])
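Another possible vectorized variant, shown here only as a sketch and not taken from either answer above, records the index of the most recent reset with np.maximum.accumulate and subtracts it from each position. The function name increasing_run_lengths is hypothetical.

```python
import numpy as np

def increasing_run_lengths(values):
    """Count, for each element, how many consecutive increases lead up to it."""
    a = np.asarray(values)
    if a.size == 0:
        return np.zeros(0, dtype=int)
    idx = np.arange(a.size)
    # True where an element is greater than its predecessor; the first is never rising.
    rising = np.concatenate(([False], a[1:] > a[:-1]))
    # At non-rising positions store the position itself, then carry it forward.
    last_reset = np.maximum.accumulate(np.where(rising, 0, idx))
    # Distance from the most recent reset is the run length.
    return idx - last_reset

result = increasing_run_lengths([7, 15, 23, 0, 32, 18])
print(result.tolist())  # [0, 1, 2, 0, 1, 0]
```

This avoids splitting the array into pieces, which may matter when there are many short runs, though that has not been measured here.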

datetime compare time python

I have 2 events A and B. I have already created a function get_schedule(event) to get the date and time of A and B occurring in datetime.
For example:
get_schedule(a) -> datetime.datetime(2016, 1, 1, 2, 25)
How can I create a function to check whether my event occurs before or after a certain time of day, regardless of the date? I thought about slicing out the time and comparing, but I don't know how to go about it.
You can get just the time object by calling the datetime.time() method. Compare that to another time object:
# scheduled after 12 noon.
get_schedule(a).time() > datetime.time(12, 0)
Demo:
>>> import datetime
>>> datetime.datetime(2016, 1, 1, 2, 25).time()
datetime.time(2, 25)
>>> datetime.datetime(2016, 1, 1, 2, 25).time() > datetime.time(12, 0)
False
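Wrapped up as a function, the comparison could look like the sketch below. The name occurs_before is made up for illustration, and the example assumes naive datetimes (no time zone information).

```python
import datetime

def occurs_before(moment, cutoff):
    """Return True if moment's time of day is earlier than cutoff, ignoring the date."""
    return moment.time() < cutoff

event = datetime.datetime(2016, 1, 1, 2, 25)
print(occurs_before(event, datetime.time(12, 0)))  # True
print(occurs_before(event, datetime.time(2, 0)))   # False
```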

Speed up code that doesn't use groupby()?

I have two pieces of code (doing the same job) which take in an array of datetimes and produce clusters of datetimes that are 1 hour apart.
First piece is:
def findClustersOfRuns(data):
    runClusters = []
    for k, g in groupby(itertools.izip(data[0:-1], data[1:]),
                        lambda (i, x): (i - x).total_seconds() / 3600):
        runClusters.append(map(itemgetter(1), g))
Second piece is:
def findClustersOfRuns(data):
    if len(data) <= 1:
        return []
    current_group = [data[0]]
    delta = 3600
    results = []
    for current, next in itertools.izip(data, data[1:]):
        if abs((next - current).total_seconds()) > delta:
            # Here, `current` is the last item of the previous subsequence
            # and `next` is the first item of the next subsequence.
            if len(current_group) >= 2:
                results.append(current_group)
            current_group = [next]
            continue
        current_group.append(next)
    return results
The first piece of code takes 5 minutes to execute, while the second takes a few seconds. I am trying to understand why.
The data over which I am running the code has size:
data.shape
(13989L,)
The data contents is as:
data
array([datetime.datetime(2016, 10, 1, 8, 0),
       datetime.datetime(2016, 10, 1, 9, 0),
       datetime.datetime(2016, 10, 1, 10, 0), ...,
       datetime.datetime(2019, 1, 3, 9, 0),
       datetime.datetime(2019, 1, 3, 10, 0),
       datetime.datetime(2019, 1, 3, 11, 0)], dtype=object)
How do I improve the first piece of code to make it run as fast?
Based on the size, it looks like you have a huge list of elements. Your second version has just one for loop, whereas your first approach has several. You only see one, right? The others are hidden inside map() and groupby(). Iterating over a huge list multiple times adds a lot to the running time. These are not just additional iterations; they are also slower than a plain for loop.
I made a comparison for another post which you might find useful: Comparing list comprehensions and explicit loops.
Also, the use of a lambda function adds extra time.
However, you may further improve the execution time of the second version by storing results.append in a separate variable, say my_func, and calling it as my_func(current_group).
A few more comparisons:
Python 3: Loops, list comprehension and map slower compared to Python 2
Speed/efficiency comparison for loop vs list comprehension vs other methods
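For reference, a single-pass version of the grouping in Python 3 might look like the sketch below. It is an illustration rather than a benchmarked fix: the name find_clusters_of_runs and the max_gap parameter are hypothetical, the input is assumed to be in chronological order, and unlike the loop in the question it also keeps the final group once the data runs out.

```python
from datetime import datetime, timedelta

def find_clusters_of_runs(data, max_gap=timedelta(hours=1)):
    """Group consecutive datetimes whose gap is at most max_gap; keep groups of 2+."""
    clusters = []
    current = []
    for stamp in data:
        # A gap larger than max_gap closes the current group.
        if current and abs(stamp - current[-1]) > max_gap:
            if len(current) >= 2:
                clusters.append(current)
            current = []
        current.append(stamp)
    # Flush the last group, which no later gap would otherwise close.
    if len(current) >= 2:
        clusters.append(current)
    return clusters

stamps = [
    datetime(2016, 10, 1, 8), datetime(2016, 10, 1, 9), datetime(2016, 10, 1, 10),
    datetime(2016, 10, 3, 9),
    datetime(2019, 1, 3, 9), datetime(2019, 1, 3, 10),
]
clusters = find_clusters_of_runs(stamps)
print(len(clusters))  # 2
```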

Multiple loop control variables in Python for loop

I came across a situation where I need to implement a for loop with more than one loop control variable. Basically this is what I am trying to do
Java:
for (int i=0,j=n; i<n,j>=0; i++, j--)
    do my stuff
How do I do this in Python?
for i in range(0,n), j in range(n-1,-1,-1):
    do my stuff
But this doesn't work. What would be the right syntax here? Also, is there a more elegant(pythonic) construct for the use-case?
For your specific example, it doesn't look like you really need separate variables in the loop definition itself. You can just iterate over the i values and construct the j values by doing n-i:
for i in range(0, n):
    j = n-i
    # do stuff
In general, you can't specify multiple values in the for statement if they depend on each other. Instead, use the for to iterate over some "base" value or values from which the others can be derived.
You can specify multiple values in the for loop, but they must be drawn from a single iterable:
for i, j in zip(range(0, n), range(n, 0, -1)):
    # do stuff
This takes i from the first range (0 to n-1) and j from the second (n to 1). zip creates a new iterable by componentwise pairing the elements of the iterables you give it.
The thing to remember is that Python for loops are not like loops in Java/C, which have an initialize/check end condition/update structure that repeatedly modifies some persistent loop index variable. Python for loops iterate over a "stream" of values provided by a "source" iterable, grabbing one value at a time from the source. (This is similar to foreach-type constructs in some other languages.) Whatever you want to iterate over, you need to get it into an iterable before you begin the loop. In other words, every Python for loop can be thought of as roughly analogous to something like:
for (i=iterable.getNextValue(); iterable.isNotEmpty(); i=iterable.getNextValue())
You can't have the loop initialization be different from the loop update, and you can't have those be any operation other than "get the next value", and you can't have the end condition be anything other than "is the iterable exhausted". If you want to do anything like that, you have to either do it yourself inside the loop (e.g., by assigning a secondary "loop variable" as in my example, or by checking for a custom loop exit condition and breaking out), or build it into the "source" that you're iterating over.
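The analogy above can be written out in Python itself. The following is a rough sketch of what a for loop does behind the scenes, using the built-in iter() and next(); the helper name manual_for is made up for illustration, and real for loops are handled by the interpreter rather than by code like this.

```python
def manual_for(iterable):
    """Collect every value a for loop would visit, using iter() and next() directly."""
    seen = []
    iterator = iter(iterable)          # ask the source for an iterator once
    while True:
        try:
            value = next(iterator)     # the only "update" step available
        except StopIteration:          # the only end condition: source exhausted
            break
        seen.append(value)             # loop body
    return seen

n = 3
pairs = manual_for(zip(range(0, n), range(n, 0, -1)))
print(pairs)  # [(0, 3), (1, 2), (2, 1)]
```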
A lot depends on iterators you want. Here's a couple of options. What they have in common is that for... in... will traverse over lists, tuples, and anything else which supports iteration. So you could loop over a list of known values or a generator which produces an arbitrary series of values. The for... in... is the same regardless.
Here are some standard Python tricks:
nested loops
for i in range(10):
    for j in range(10):
        print(i, j)
This is simple and clear, but for complex iterations it may get very deeply nested. Here range produces its values lazily in Python 3 (in Python 2 it builds a list), giving an iteration over a series (in this case 10 numbers, but it could be any arbitrary stream of values).
zip
For multiple iterators you can use zip(), which creates an iterable that produces one value from each of several iterables at the same time. You can use multiple assignment in the for loop to unpack each item that zip produces:
a = [1,2,3,4]
b = ['a','b','c','d']
for number, letter in zip(a, b):
    print(letter, ":", number)
# a : 1
# b : 2
# c : 3
# d : 4
zip will stop when the shortest input is exhausted:
a = [1,2]
b = ['a','b','c','d']
for number, letter in zip(a, b):
    print(letter, ":", number)
# a : 1
# b : 2
zip also works with lazy iterables such as range:
test = zip(range(10), range(10,20))
for item in test:
    print(item)
#(0, 10)
#(1, 11)
#(2, 12)
#(3, 13)
#(4, 14)
#(5, 15)
#(6, 16)
#(7, 17)
#(8, 18)
#(9, 19)
itertools
For more complex iterations there are a lot of great tools in the itertools module. This is particularly nice for things like getting all of the products or permutations of multiple iterators. It's worth checking out, but I think it's more than you need.
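As a small illustration of the itertools suggestion, itertools.product can stand in for a pair of nested loops. This is a minimal sketch; the variable names are arbitrary.

```python
from itertools import product

n = 3
# Same pairs, in the same order, as two nested loops over range(n).
pairs = list(product(range(n), range(n)))
for i, j in pairs:
    print(i, j)
```

This yields every combination of i and j, so it replaces nested loops, not the lockstep pairing that zip provides.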
You can create a list comprehension using more than one loop control variable:
>>> n = 10
>>> stuff = [i*j for i in range(n) for j in range(n-1,-1,-1)]
>>> stuff
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0, 27, 24, 21, 18, 15, 12, 9, 6, 3, 0, 36, 32, 28, 24, 20, 16, 12, 8, 4, 0, 45, 40, 35, 30, 25, 20, 15, 10, 5, 0, 54, 48, 42, 36, 30, 24, 18, 12, 6, 0, 63, 56, 49, 42, 35, 28, 21, 14, 7, 0, 72, 64, 56, 48, 40, 32, 24, 16, 8, 0, 81, 72, 63, 54, 45, 36, 27, 18, 9, 0]
