Sort tuples by time interval? Python - python

How can I sort these tuples by time interval, say every hour?
[('172.18.74.146', datetime.time(11, 28, 58)), ('10.227.211.244',
datetime.time(11, 54, 19)), ('10.227.215.68', datetime.time(11, 54, 34)),
('10.227.209.139', datetime.time(12, 14, 47)), ('10.227.147.98',
datetime.time(14, 47, 25))]
The result should be:
[["172.18.74.146, 10.227.211.244, 10.227.215.68", "11-12"], etc...]
I tried to use groupby, but it doesn't give me what I want:
for dd in data[1:]:
    ips = dd[1].split(",")
    dates = dd[2].split(",")
    i = 0
    while i < len(dates):
        ips[i] = ips[i].strip()
        hour, mins, second = dates[i].strip().split(":")
        dates[i] = datetime.time(int(hour), int(mins), int(second))
        i += 1
    order = [(k, ', '.join(str(s[0]) for s in v)) for k, v in groupby(sorted(zip(ips, dates), key=operator.itemgetter(1)), lambda x: x[1].hour)]

In [17]: a = [('172.18.74.146', datetime.time(11, 28, 58)), ('10.227.211.244',
datetime.time(11, 54, 19)), ('10.227.215.68', datetime.time(11, 54, 34)),
('10.227.209.139', datetime.time(12, 14, 47)), ('10.227.147.98',
datetime.time(14, 47, 25))]
In [18]: [(k, ', '.join(str(s[0]) for s in v)) for k, v in groupby(a, lambda x: x[1].hour)]
Out[18]:
[(11, '172.18.74.146, 10.227.211.244, 10.227.215.68'),
(12, '10.227.209.139'),
(14, '10.227.147.98')]
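To get the exact shape asked for in the question (a joined IP string plus an "11-12" style label), the same groupby call could be wrapped roughly as follows. This is a minimal sketch: the function name group_by_hour is made up for illustration, and the input is sorted by time first because groupby only merges adjacent items.

```python
import datetime
from itertools import groupby


def group_by_hour(records):
    """Group (ip, time) pairs into [joined_ips, 'H-H+1'] lists, one per hour."""
    # groupby only merges adjacent items, so sort by time first.
    ordered = sorted(records, key=lambda rec: rec[1])
    result = []
    for hour, group in groupby(ordered, key=lambda rec: rec[1].hour):
        ips = ", ".join(ip for ip, _ in group)
        # Label the bucket by its start and end hour; 23 wraps around to 0.
        result.append([ips, "{}-{}".format(hour, (hour + 1) % 24)])
    return result


a = [
    ("172.18.74.146", datetime.time(11, 28, 58)),
    ("10.227.211.244", datetime.time(11, 54, 19)),
    ("10.227.215.68", datetime.time(11, 54, 34)),
    ("10.227.209.139", datetime.time(12, 14, 47)),
    ("10.227.147.98", datetime.time(14, 47, 25)),
]
print(group_by_hour(a))
```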

This should work for you:
from __future__ import print_function
import datetime
import itertools
def iter_len(iterable):
    return sum(1 for __ in iterable)
def by_hour(item): # Hour key
    timestamp = item[1]
    return '{}-{}'.format(timestamp.hour, (timestamp.hour+1) % 24)
def by_half_hour(item): # Half-hour key
    timestamp = item[1]
    half_hour = timestamp.hour + (0.5 * (timestamp.minute // 30))
    return '{:.1f}-{:.1f}'.format(half_hour, (half_hour+0.5) % 24)
def get_results(data, key): # Name this more appropriately
    data = sorted(data, key=key)
    for key, grouper in itertools.groupby(data, key):
        yield (key, iter_len(grouper))
data = [
    ('172.18.74.146', datetime.time(11, 28, 58)),
    ('10.227.211.244', datetime.time(11, 54, 19)),
    ('10.227.215.68', datetime.time(11, 54, 34)),
    ('10.227.209.139', datetime.time(12, 14, 47)),
    ('10.227.147.98', datetime.time(14, 47, 25)),
]
print('By Hour')
print(list(get_results(data, by_hour)))
print()
print("By Half Hour")
print(list(get_results(data, by_half_hour)))
Output:
$ ./SO_32081251.py
By Hour
[('11-12', 3), ('12-13', 1), ('14-15', 1)]
By Half Hour
[('11.0-11.5', 1), ('11.5-12.0', 2), ('12.0-12.5', 1), ('14.5-15.0', 1)]
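One thing to watch in the code above: the keys are strings, so sorted() orders the buckets lexicographically ('10-11' sorts before '9-10'). A possible variant, sketched below, groups on an integer bucket index and only formats the label at the end. The name count_by_interval and its minutes parameter are assumptions for illustration, not part of the answer above, and the sketch assumes the interval divides 24 hours evenly.

```python
import datetime
import itertools


def count_by_interval(data, minutes=60):
    """Yield (label, count) for each time bucket of the given width in minutes."""
    def bucket(item):
        # Integer index of the bucket, counted from midnight.
        t = item[1]
        return (t.hour * 60 + t.minute) // minutes

    def label(index):
        start = index * minutes
        end = (start + minutes) % (24 * 60)
        return "{:02d}:{:02d}-{:02d}:{:02d}".format(
            start // 60, start % 60, end // 60, end % 60)

    # Sorting on the integer index keeps the buckets in chronological order.
    ordered = sorted(data, key=bucket)
    for index, group in itertools.groupby(ordered, key=bucket):
        yield (label(index), sum(1 for _ in group))


data = [
    ("172.18.74.146", datetime.time(11, 28, 58)),
    ("10.227.211.244", datetime.time(11, 54, 19)),
    ("10.227.215.68", datetime.time(11, 54, 34)),
    ("10.227.209.139", datetime.time(12, 14, 47)),
    ("10.227.147.98", datetime.time(14, 47, 25)),
]
print(list(count_by_interval(data, minutes=60)))
print(list(count_by_interval(data, minutes=30)))
```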

This is almost what you want. Use the hour to group by:
for k, g in itertools.groupby(order, lambda x: x[1].hour):
    print(k, list(g))
Results in:
11 [('172.18.74.146', datetime.time(11, 28, 58)), ('10.227.211.244', datetime.time(11, 54, 19)), ('10.227.215.68', datetime.time(11, 54, 34))]
12 [('10.227.209.139', datetime.time(12, 14, 47))]
14 [('10.227.147.98', datetime.time(14, 47, 25))]

Related

Changing keys in dictionary

How can I change the keys of a dictionary of lists to display dates (outright) rather than datetime.date(2017, 1, 1) which is what they currently are formatted as?
Example of section of list:
{datetime.date(2017, 9, 7): [162.3, 163.24, 162.22, 163.18], datetime.date(2017, 7, 10): [160.44, 161.13, 160.44, 160.94],
I am rather new to python so any help would be much appreciated. Thanks.
Call the strftime method on each date object:
import datetime
s = {datetime.date(2017, 9, 7): [162.3, 163.24, 162.22, 163.18], datetime.date(2017, 7, 10): [160.44, 161.13, 160.44, 160.94]}
new_s = {a.strftime("%Y-%m-%d"):b for a, b in s.items()}
Output:
{'2017-07-10': [160.44, 161.13, 160.44, 160.94], '2017-09-07': [162.3, 163.24, 162.22, 163.18]}
Try this:
import datetime
my_dict = {datetime.date(2017, 9, 7): [162.3, 163.24, 162.22, 163.18], datetime.date(2017, 7, 10): [160.44, 161.13, 160.44, 160.94]}
new_dict = {str(k): v for k,v in my_dict.items()}
Output:
{'2017-07-10': [160.44, 161.13, 160.44, 160.94],
'2017-09-07': [162.3, 163.24, 162.22, 163.18]}
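Both answers give the same keys here because str() on a datetime.date returns the ISO 8601 form, the same string as isoformat(). strftime is only needed when a different layout is wanted. A short sketch (the variable names and the day/month/year layout are chosen purely for illustration):

```python
import datetime

prices = {
    datetime.date(2017, 9, 7): [162.3, 163.24, 162.22, 163.18],
    datetime.date(2017, 7, 10): [160.44, 161.13, 160.44, 160.94],
}

# ISO 8601 keys, identical to what str(date) produces.
iso_keys = {d.isoformat(): v for d, v in prices.items()}

# A custom layout needs an explicit format string.
dmy_keys = {d.strftime("%d/%m/%Y"): v for d, v in prices.items()}

print(iso_keys)
print(dmy_keys)
```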

Python: What is the best method to replace a string in a dictionary key

I have a python dictionary like this:
{('Live', '2017-Jan', '103400000', 'Amount'): 30,
('Live', '2017-Feb', '103400000', 'Amount'): 31,
('Live', '2017-Mar', '103400000', 'Amount'): 32,
('Live', '2017-Jan', '103401000', 'Amount'): 34
}
What is the best way to replace the 'Live' string with 'Live1' for all keys in the dictionary?
I already tried the following, but it throws an error:
# cellset is the dictionary
for x in cellset:
    x.replace('Live','Live1')
AttributeError: 'tuple' object has no attribute 'replace'
d = {
    ('Live', '2017-Jan', '103400000', 'Amount'): 30,
    ('Live', '2017-Feb', '103400000', 'Amount'): 31,
    ('Live', '2017-Mar', '103400000', 'Amount'): 32,
    ('Live', '2017-Jan', '103401000', 'Amount'): 34
}
new_d = {}
for k, v in d.items():
    new_key = tuple('Live1' if el == 'Live' else el for el in k)
    new_d[new_key] = v
print(new_d)
# Output:
# {('Live1', '2017-Jan', '103400000', 'Amount'): 30, ('Live1', '2017-Feb', '103400000', 'Amount'): 31, ('Live1', '2017-Mar', '103400000', 'Amount'): 32, ('Live1', '2017-Jan', '103401000', 'Amount'): 34}
Others have shown you how to create a new dictionary with 'Live' replaced by 'Live1'. If you want the replacements to happen in the original dictionary, a possible solution looks something like this:
for (head, *rest), v in tuple(d.items()):
    if head == "Live":
        d[("Live1", *rest)] = v
        del d[(head, *rest)]
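The loop above iterates over a snapshot (tuple(d.items())) because adding and deleting keys while iterating over the dictionary itself raises a RuntimeError. A sketch of the same idea wrapped in a function, using dict.pop to move each value in one step; rename_first is a hypothetical helper name, not something from the answers:

```python
def rename_first(d, old, new):
    """Replace `old` with `new` in the first position of every tuple key, in place."""
    # Iterate over a snapshot of the keys so the dict can be modified safely.
    for key in list(d):
        if key[0] == old:
            # pop removes the old entry and returns its value in one step.
            d[(new,) + key[1:]] = d.pop(key)


cellset = {
    ("Live", "2017-Jan", "103400000", "Amount"): 30,
    ("Live", "2017-Feb", "103400000", "Amount"): 31,
    ("Live", "2017-Mar", "103400000", "Amount"): 32,
    ("Live", "2017-Jan", "103401000", "Amount"): 34,
}
rename_first(cellset, "Live", "Live1")
print(cellset)
```

Note that if a renamed key collides with a key that already exists, the existing value is overwritten.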

Add a timestamp to data simulator

I am simulating time series data using Python TestData and trying to add a new key value (event_time) that includes a time stamp when the record is generated. The issue is that the field is not incrementing as the script runs, just at first execution. Is there a simple way to do this?
import testdata
import datetime
EVENT_TYPES = ["USER_DISCONNECT", "USER_CONNECTED", "USER_LOGIN", "USER_LOGOUT"]
class EventsFactory(testdata.DictFactory):
    event_time = testdata.DateIntervalFactory(datetime.datetime.now(), datetime.timedelta(minutes=0))
    start_time = testdata.DateIntervalFactory(datetime.datetime.now(), datetime.timedelta(minutes=12))
    end_time = testdata.RelativeToDatetimeField("start_time", datetime.timedelta(minutes=20))
    event_code = testdata.RandomSelection(EVENT_TYPES)
for event in EventsFactory().generate(100):
    print(event)
Outputs:
{'start_time': datetime.datetime(2016, 6, 21, 17, 47, 50, 422020), 'event_code': 'USER_CONNECTED', 'event_time': datetime.datetime(2016, 6, 21, 17, 47, 50, 422006), 'end_time': datetime.datetime(2016, 6, 21, 18, 7, 50, 422020)}
{'start_time': datetime.datetime(2016, 6, 21, 17, 59, 50, 422020), 'event_code': 'USER_CONNECTED', 'event_time': datetime.datetime(2016, 6, 21, 17, 47, 50, 422006), 'end_time': datetime.datetime(2016, 6, 21, 18, 19, 50, 422020)}
{'start_time': datetime.datetime(2016, 6, 21, 18, 11, 50, 422020), 'event_code': 'USER_LOGOUT', 'event_time': datetime.datetime(2016, 6, 21, 17, 47, 50, 422006), 'end_time': datetime.datetime(2016, 6, 21, 18, 31, 50, 422020)}
The timedelta() argument is the step between consecutive generated values. With timedelta(minutes=12), each start_time is 12 minutes after the one generated in the previous iteration of the for-loop, counting from the datetime.datetime.now() value captured when the script started (not from the current time at each iteration). Similarly, end_time is defined relative to start_time with timedelta(minutes=20), so it is always 20 minutes after start_time. Your event_time does not increment because its step is timedelta(minutes=0): every record gets the same datetime.datetime.now() value, evaluated once when the script is run.
If it is test data, I think you are looking for something like:
import testdata
import datetime
EVENT_TYPES = ["USER_DISCONNECT", "USER_CONNECTED", "USER_LOGIN", "USER_LOGOUT"]
class EventsFactory(testdata.DictFactory):
    start_time = testdata.DateIntervalFactory(datetime.datetime.now(), datetime.timedelta(minutes=12))
    event_time = testdata.RelativeToDatetimeField("start_time", datetime.timedelta(minutes=10))
    end_time = testdata.RelativeToDatetimeField("start_time", datetime.timedelta(minutes=20))
    event_code = testdata.RandomSelection(EVENT_TYPES)
for event in EventsFactory().generate(100):
    print(event)
Edit: if event_time is not meant to be related to the generated data:
As far as I can see, the testdata.DictFactory you are subclassing just builds a dictionary from the class attributes you define.
You want an event_time field that holds the current time for every iteration of the for-loop. To do that, it would look like:
import testdata
import datetime
EVENT_TYPES = ["USER_DISCONNECT", "USER_CONNECTED", "USER_LOGIN", "USER_LOGOUT"]
class EventsFactory(testdata.DictFactory):
    start_time = testdata.DateIntervalFactory(datetime.datetime.now(), datetime.timedelta(minutes=12))
    end_time = testdata.RelativeToDatetimeField("start_time", datetime.timedelta(minutes=20))
    event_time = datetime.datetime.now()
    event_code = testdata.RandomSelection(EVENT_TYPES)
for event in EventsFactory().generate(100):
    print(event)
If I understand correctly what you want, this should show it in the output.
Edit 2:
After looking at this again, this may not achieve what you want, because EventsFactory().generate(100) seems to instantiate all 100 records at the same time. To get an event_time key from the factory itself you would have to use testdata.RelativeToDatetimeField() to shift the time. Otherwise, you can set the key yourself inside the loop:
for event in EventsFactory().generate(10):
    event["event_time"] = datetime.datetime.now()
    print(event)
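The underlying reason is plain Python: a datetime.datetime.now() call in a class body is evaluated once, when the class is defined, so every record sees the same value. The sketch below shows the per-record alternative without the testdata library at all. generate_events and its parameters are made up for illustration and are not part of testdata's API.

```python
import datetime
import random

EVENT_TYPES = ["USER_DISCONNECT", "USER_CONNECTED", "USER_LOGIN", "USER_LOGOUT"]


def generate_events(count, step=datetime.timedelta(minutes=12)):
    """Yield event dicts; event_time is read from the clock for each record."""
    first_start = datetime.datetime.now()
    for i in range(count):
        start_time = first_start + i * step
        yield {
            "start_time": start_time,
            "end_time": start_time + datetime.timedelta(minutes=20),
            "event_code": random.choice(EVENT_TYPES),
            # Evaluated inside the loop, so it reflects when this record was produced.
            "event_time": datetime.datetime.now(),
        }


for event in generate_events(3):
    print(event)
```

Because this is a generator, each record (and its event_time) is only created when the consuming loop asks for it.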

How can I sort an object in a list?

I have the following code
class Tupla:
    def __init__(self, xid, prob):
        self.xid = xid
        self.prob = prob
and I append some objects of this class to a list:
print(myList1)
>> [('X10', 11, ''), ('X9', 11, ''), ('X11', 11, ''), ('X2', 11, '')]
I try to sort it:
myList1 = sorted(myList1, key=lambda Tupla: Tupla.xid, reverse=True)
print(myList1)
>> [('X9', 11, ''), ('X2', 11, ''), ('X11', 11, ''), ('X10', 11, '')]
I am looking for human (natural) sorting. I need to order the list like this:
[('X2', 11, ''), ('X9', 11, ''), ('X10', 11, ''), ('X11', 11, '')]
How can I do that?
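def Tupla(xid, prob):
    # Factory function: the returned object reads xid and prob from the enclosing scope.
    class _Tupla:
        def __init__(self):
            pass
        def get_id(self):
            return xid
        def get_prob(self):
            return prob
    return _Tupla()
# Build a real list (map() returns an iterator in Python 3, which has no sort method).
myList = [Tupla(*arg) for arg in (('X10', 11), ('X9', 11), ('X11', 11), ('X2', 11))]
# Sort by the numeric part of the id, skipping the leading letter.
myList.sort(key=lambda obj: int(obj.get_id()[1:]))
print([(x.get_id(), x.get_prob()) for x in myList])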
Output:
[('X2', 11), ('X9', 11), ('X10', 11), ('X11', 11)]
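The int(...[1:]) key above assumes every id is a single letter followed by digits. A more general natural-sort key, sketched here, splits the string into digit and non-digit runs and compares the digit runs as integers. natural_key is a hypothetical helper name; the Tupla class is the one from the question.

```python
import re


class Tupla:
    def __init__(self, xid, prob):
        self.xid = xid
        self.prob = prob


def natural_key(text):
    """Split text into chunks and turn the digit runs into ints."""
    # The capture group makes re.split keep the digit runs in the result.
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", text)]


my_list = [Tupla(xid, 11) for xid in ("X10", "X9", "X11", "X2")]
my_list.sort(key=lambda obj: natural_key(obj.xid))
print([(obj.xid, obj.prob) for obj in my_list])
```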

replace element from a tuple of tuples with empty

This is my schema for the tuple:
(name, age, weight)
UserList = (('steve', 17, 178), ('Mike', 19, 178),('Pull', 24, 200),('Adam', 15, 154))
I want to check whether the age is less than 18 and, if so, replace the tuple for that user with ( , , )
so that the final result will look like
(('', , ), ('Mike', 19, 178),('Pull', 24, 200),('', , ))
I tried
UserList = list(UserList)
for i,e in enumerate(UserList):
    if e[1] < 18:
        temp=list(UserList[i])
        for f, tmp in enumerate(temp):
            del temp[:]
But it didn't work, any thoughts or suggestions will be highly appreciated.
Thanks!
In [13]: UserList = tuple((n, a, w) if a >= 18 else ('', None, None) for (n, a, w) in UserList)
In [14]: UserList
Out[14]: (('', None, None), ('Mike', 19, 178), ('Pull', 24, 200), ('', None, None))
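The attempt in the question has no visible effect because temp is a separate list copied from the tuple, so clearing it never touches UserList. A sketch of a loop-based version that writes the replacement back into the list follows. blank_minors and the min_age parameter are names invented here, and None stands in for the empty fields, as in the answer above.

```python
def blank_minors(users, min_age=18):
    """Return a tuple where users younger than min_age become ('', None, None)."""
    result = list(users)
    for i, (name, age, weight) in enumerate(result):
        if age < min_age:
            # Assign back into the list; tuples themselves cannot be modified.
            result[i] = ("", None, None)
    return tuple(result)


UserList = (("steve", 17, 178), ("Mike", 19, 178), ("Pull", 24, 200), ("Adam", 15, 154))
print(blank_minors(UserList))
```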
