MongoDB Update N first elements in array - python

I have yesterday's price history in a MongoDB collection:
{
    'ticker': 'AAPL',
    'hist': [{'Time': datetime.datetime(2020, 7, 15, 0, 0),
              'Close': 10.58,
              'Volume': 71055},
             {'Time': datetime.datetime(2020, 7, 14, 0, 0),
              'Close': 10.28,
              'Volume': 89012},
             {'Time': datetime.datetime(2020, 7, 13, 0, 0),
              'Close': 10.26,
              'Volume': 12198}]
}
and today I have updated prices:
{
'AAPL': [{'Time': datetime.datetime(2020, 7, 16, 0, 0),
'Close': 9.78,
'Volume': 8214},
{'Time': datetime.datetime(2020, 7, 15, 0, 0),
'Close': 11.03,
'Volume': 71033},
{'Time': datetime.datetime(2020, 7, 14, 0, 0),
'Close': 10.25,
'Volume': 89026}]
}
The expected result:
{
    'ticker': 'AAPL',
    'hist': [{'Time': datetime.datetime(2020, 7, 16, 0, 0),
              'Close': 9.78,
              'Volume': 8214},
             {'Time': datetime.datetime(2020, 7, 15, 0, 0),
              'Close': 11.03,
              'Volume': 71033},
             {'Time': datetime.datetime(2020, 7, 14, 0, 0),
              'Close': 10.25,
              'Volume': 89026},
             {'Time': datetime.datetime(2020, 7, 13, 0, 0),
              'Close': 10.26,
              'Volume': 12198}]
}
How can I update the entries for dates that already exist and insert the entries for new dates, without first deleting the first n elements of the array?
As another example: I have an array holding one year of data, and I need to update the last 7 days in the array and also add today's data.
Any help would be greatly appreciated. Thank you.
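One possible approach, sketched here under assumptions rather than as a definitive answer, is to merge the two lists in Python keyed by 'Time' and then write the merged array back in a single update. The helper name merge_history is made up for this illustration; the write-back shown in the comment assumes a pymongo collection object and is not executed here.

```python
import datetime

def merge_history(existing, updates):
    """Merge two lists of price rows keyed by 'Time'; rows in `updates` win."""
    by_time = {row['Time']: row for row in existing}
    by_time.update({row['Time']: row for row in updates})
    # Newest first, matching the order used in the question.
    return sorted(by_time.values(), key=lambda row: row['Time'], reverse=True)

stored = [
    {'Time': datetime.datetime(2020, 7, 15), 'Close': 10.58, 'Volume': 71055},
    {'Time': datetime.datetime(2020, 7, 14), 'Close': 10.28, 'Volume': 89012},
    {'Time': datetime.datetime(2020, 7, 13), 'Close': 10.26, 'Volume': 12198},
]
today = [
    {'Time': datetime.datetime(2020, 7, 16), 'Close': 9.78, 'Volume': 8214},
    {'Time': datetime.datetime(2020, 7, 15), 'Close': 11.03, 'Volume': 71033},
    {'Time': datetime.datetime(2020, 7, 14), 'Close': 10.25, 'Volume': 89026},
]

merged = merge_history(stored, today)

# Assumption: `collection` is a pymongo Collection and `ticker` is the symbol.
# The merged array could then be written back in one call, for example:
# collection.update_one({'ticker': ticker}, {'$set': {'hist': merged}})
```

This reads the document, merges on the client, and replaces the whole array, so it trades an in-place array update for simplicity.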

Related

Not able to parse JSON data from text file using python script

I have a '.txt' file that contains JSON data like below.
[{'tradable': True, 'mode': 'full', 'instrument_token': 4708097, 'last_price': 178.65, 'last_traded_quantity': 5, 'average_traded_price': 180.1, 'volume_traded': 4581928, 'total_buy_quantity': 1282853, 'total_sell_quantity': 1673842, 'ohlc': {'open': 181.95, 'high': 181.95, 'low': 177.8, 'close': 181.0}, 'change': -1.2983425414364609, 'last_trade_time': datetime.datetime(2023, 1, 12, 13, 4, 58), 'oi': 0, 'oi_day_high': 0, 'oi_day_low': 0, 'exchange_timestamp': datetime.datetime(2023, 1, 12, 13, 5, 1), 'depth': {'buy': [{'quantity': 653, 'price': 178.6, 'orders': 8}, {'quantity': 2408, 'price': 178.55, 'orders': 15}, {'quantity': 6329, 'price': 178.5, 'orders': 22}, {'quantity': 9161, 'price': 178.45, 'orders': 24}, {'quantity': 7775, 'price': 178.4, 'orders': 17}], 'sell': [{'quantity': 5726, 'price': 178.7, 'orders': 8}, {'quantity': 4099, 'price': 178.75, 'orders': 11}, {'quantity': 23951, 'price': 178.8, 'orders': 25}, {'quantity': 7446, 'price': 178.85, 'orders': 21}, {'quantity': 11379, 'price': 178.9, 'orders': 21}]}}, {'tradable': True, 'mode': 'full', 'instrument_token': 871681, 'last_price': 972.55, 'last_traded_quantity': 1, 'average_traded_price': 973.85, 'volume_traded': 411290, 'total_buy_quantity': 152925, 'total_sell_quantity': 214765, 'ohlc': {'open': 971.75, 'high': 978.6, 'low': 969.0, 'close': 967.75}, 'change': 0.4959958667011061, 'last_trade_time': datetime.datetime(2023, 1, 12, 13, 4, 53), 'oi': 0, 'oi_day_high': 0, 'oi_day_low': 0, 'exchange_timestamp': datetime.datetime(2023, 1, 12, 13, 5, 4), 'depth': {'buy': [{'quantity': 6, 'price': 972.15, 'orders': 2}, {'quantity': 3, 'price': 972.1, 'orders': 2}, {'quantity': 15, 'price': 972.05, 'orders': 3}, {'quantity': 455, 'price': 972.0, 'orders': 16}, {'quantity': 14, 'price': 971.95, 'orders': 2}], 'sell': [{'quantity': 6, 'price': 972.5, 'orders': 3}, {'quantity': 49, 'price': 972.55, 'orders': 2}, {'quantity': 10, 'price': 972.6, 'orders': 1}, {'quantity': 27, 'price': 972.65, 'orders': 2}, 
{'quantity': 10, 'price': 972.7, 'orders': 1}]}}]
This data was written to a .txt file after it was received from the Zerodha websocket. Now I want to read the data from the .txt file in my Python script and load it as JSON. But the json.loads() method throws the error below.
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 3 (char 2)
I have tried eval and ast.literal_eval methods in python as well but it didn't solve my problem. All I want is to be able to read the above data as a JSON to my python script. Any leads would be of great help.
Let me start off with THIS IS A BAD IDEA.
The comments on the question call out that this is not a JSON object in a file, rather the result of some other Python process printing the result and putting that in a file. The correct solution would be to modify the producer to use json.dumps() instead.
That aside, here's a DANGEROUS way to read that source file into a Python object.
import datetime  # must be in scope so eval() can resolve datetime.datetime(...)
from pprint import pprint

with open('textfile.txt', 'r') as f:
    data = eval(f.read())  # DANGEROUS: executes whatever the file contains

pprint(data)
This will produce the following output from that input:
[{'average_traded_price': 180.1,
'change': -1.2983425414364609,
'depth': {'buy': [{'orders': 8, 'price': 178.6, 'quantity': 653},
{'orders': 15, 'price': 178.55, 'quantity': 2408},
{'orders': 22, 'price': 178.5, 'quantity': 6329},
{'orders': 24, 'price': 178.45, 'quantity': 9161},
{'orders': 17, 'price': 178.4, 'quantity': 7775}],
'sell': [{'orders': 8, 'price': 178.7, 'quantity': 5726},
{'orders': 11, 'price': 178.75, 'quantity': 4099},
{'orders': 25, 'price': 178.8, 'quantity': 23951},
{'orders': 21, 'price': 178.85, 'quantity': 7446},
{'orders': 21, 'price': 178.9, 'quantity': 11379}]},
'exchange_timestamp': datetime.datetime(2023, 1, 12, 13, 5, 1),
'instrument_token': 4708097,
'last_price': 178.65,
'last_trade_time': datetime.datetime(2023, 1, 12, 13, 4, 58),
'last_traded_quantity': 5,
'mode': 'full',
'ohlc': {'close': 181.0, 'high': 181.95, 'low': 177.8, 'open': 181.95},
'oi': 0,
'oi_day_high': 0,
'oi_day_low': 0,
'total_buy_quantity': 1282853,
'total_sell_quantity': 1673842,
'tradable': True,
'volume_traded': 4581928},
{'average_traded_price': 973.85,
'change': 0.4959958667011061,
'depth': {'buy': [{'orders': 2, 'price': 972.15, 'quantity': 6},
{'orders': 2, 'price': 972.1, 'quantity': 3},
{'orders': 3, 'price': 972.05, 'quantity': 15},
{'orders': 16, 'price': 972.0, 'quantity': 455},
{'orders': 2, 'price': 971.95, 'quantity': 14}],
'sell': [{'orders': 3, 'price': 972.5, 'quantity': 6},
{'orders': 2, 'price': 972.55, 'quantity': 49},
{'orders': 1, 'price': 972.6, 'quantity': 10},
{'orders': 2, 'price': 972.65, 'quantity': 27},
{'orders': 1, 'price': 972.7, 'quantity': 10}]},
'exchange_timestamp': datetime.datetime(2023, 1, 12, 13, 5, 4),
'instrument_token': 871681,
'last_price': 972.55,
'last_trade_time': datetime.datetime(2023, 1, 12, 13, 4, 53),
'last_traded_quantity': 1,
'mode': 'full',
'ohlc': {'close': 967.75, 'high': 978.6, 'low': 969.0, 'open': 971.75},
'oi': 0,
'oi_day_high': 0,
'oi_day_low': 0,
'total_buy_quantity': 152925,
'total_sell_quantity': 214765,
'tradable': True,
'volume_traded': 411290}]
Again, I will restate, THIS IS A BAD IDEA.
Read more on the specifics of eval here: https://realpython.com/python-eval-function/
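For the producer-side fix recommended above, here is a minimal sketch of what writing real JSON could look like. The helper name encode_datetime and the shortened sample record are assumptions made for illustration; datetime.fromisoformat requires Python 3.7 or later.

```python
import datetime
import json

def encode_datetime(value):
    """Fallback for json.dumps: turn datetime objects into ISO 8601 strings."""
    if isinstance(value, datetime.datetime):
        return value.isoformat()
    raise TypeError("Cannot serialize %r" % (value,))

# Shortened, hypothetical version of one tick from the question.
ticks = [{'tradable': True, 'last_price': 178.65,
          'last_trade_time': datetime.datetime(2023, 1, 12, 13, 4, 58)}]

# What the producer would write to the file instead of str(ticks).
text = json.dumps(ticks, default=encode_datetime)

# What the consumer reads back; timestamps arrive as strings.
loaded = json.loads(text)
restored = datetime.datetime.fromisoformat(loaded[0]['last_trade_time'])
```

The default hook is only called for objects json cannot encode on its own, so True becomes true and the quotes become double quotes without any extra work.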
Your JSON is not valid. You can check that by using an online JSON validator like this one: https://jsonformatter.org/
After entering the “JSON”, you can see it is not in a valid format.
You could either export it right, which I would highly recommend, or you could replace the wrong chars.
Currently, you have three issues:
You are using single quotes instead of double quotes
You are not serializing your datetime objects. It looks like the objects were inserted as-is; they have to be serialized first.
You are writing true values as True, but that is the python way and not the JSON way. You either have to write it as true, or you have to pass it as a string. I would recommend the first one.
It could look like this (I did not convert the datetime values properly, I just stringified them):
[{"tradable": true, "mode": "full", "instrument_token": 4708097, "last_price": 178.65, "last_traded_quantity": 5, "average_traded_price": 180.1, "volume_traded": 4581928, "total_buy_quantity": 1282853, "total_sell_quantity": 1673842, "ohlc": {"open": 181.95, "high": 181.95, "low": 177.8, "close": 181.0}, "change": -1.2983425414364609, "last_trade_time": "datetime.datetime(2023, 1, 12, 13, 4, 58)", "oi": 0, "oi_day_high": 0, "oi_day_low": 0, "exchange_timestamp": "datetime.datetime(2023, 1, 12, 13, 5, 1)", "depth": {"buy": [{"quantity": 653, "price": 178.6, "orders": 8}, {"quantity": 2408, "price": 178.55, "orders": 15}, {"quantity": 6329, "price": 178.5, "orders": 22}, {"quantity": 9161, "price": 178.45, "orders": 24}, {"quantity": 7775, "price": 178.4, "orders": 17}], "sell": [{"quantity": 5726, "price": 178.7, "orders": 8}, {"quantity": 4099, "price": 178.75, "orders": 11}, {"quantity": 23951, "price": 178.8, "orders": 25}, {"quantity": 7446, "price": 178.85, "orders": 21}, {"quantity": 11379, "price": 178.9, "orders": 21}]}}, {"tradable": true, "mode": "full", "instrument_token": 871681, "last_price": 972.55, "last_traded_quantity": 1, "average_traded_price": 973.85, "volume_traded": 411290, "total_buy_quantity": 152925, "total_sell_quantity": 214765, "ohlc": {"open": 971.75, "high": 978.6, "low": 969.0, "close": 967.75}, "change": 0.4959958667011061, "last_trade_time": "datetime.datetime(2023, 1, 12, 13, 4, 53)", "oi": 0, "oi_day_high": 0, "oi_day_low": 0, "exchange_timestamp": "datetime.datetime(2023, 1, 12, 13, 5, 4)", "depth": {"buy": [{"quantity": 6, "price": 972.15, "orders": 2}, {"quantity": 3, "price": 972.1, "orders": 2}, {"quantity": 15, "price": 972.05, "orders": 3}, {"quantity": 455, "price": 972.0, "orders": 16}, {"quantity": 14, "price": 971.95, "orders": 2}], "sell": [{"quantity": 6, "price": 972.5, "orders": 3}, {"quantity": 49, "price": 972.55, "orders": 2}, {"quantity": 10, "price": 972.6, "orders": 1}, {"quantity": 27, "price": 972.65, 
"orders": 2}, {"quantity": 10, "price": 972.7, "orders": 1}]}}]

delete dict in list from dict in list

I have two lists (with dicts in it):
old_device_data_list = [{'_id': ObjectId('5f48c8e34545fac49fbff5'), 'device_id': 5, 'time': datetime.datetime(2020, 8, 26, 9, 5, 39, 827000), 'values': {'count': 100, 'late': 0, 'max': 0, 'min': 0, 'on_time': 100, 'sum': 100}}]
result = [{'_id': ObjectId('5f48c8e3997640fac49fbff5'), 'device_id': 5, 'time': datetime.datetime(2020, 8, 26, 9, 5, 39, 827000), 'values': {'count': 100, 'late': 0, 'max': 0, 'min': 0, 'on_time': 100, 'sum': 100}}, {'_id': ObjectId('5f48c8e3997640fac49fbff6'), 'device_id': 4, 'time': datetime.datetime(2020, 8, 26, 9, 5, 39, 827000), 'values': {'count': 180, 'late': 0, 'max': 0, 'min': 0, 'on_time': 180, 'sum': 180}}, {'_id': ObjectId('5f48c8e3997640fac49fbff8'), 'device_id': 3, 'time': datetime.datetime(2020, 8, 27, 9, 5, 39, 827000), 'values': {'count': 50, 'late': 0, 'max': 0, 'min': 0, 'on_time': 50, 'sum': 50}}, {'_id': ObjectId('5f48c8e3997640fac49fbff7'), 'device_id': 4, 'time': datetime.datetime(2020, 8, 27, 9, 5, 39, 827000), 'values': {'count': 120, 'late': 0, 'max': 0, 'min': 0, 'on_time': 120, 'sum': 120}}, {'_id': ObjectId('5f48c8e3997640fac49fbff9'), 'device_id': 3, 'time': datetime.datetime(2020, 8, 28, 9, 5, 39, 827000), 'values': {'count': 210, 'late': 0, 'max': 0, 'min': 0, 'on_time': 210, 'sum': 210}}]
I want to remove the dicts in old_device_data_list from the result list. I tried it with numpy:
numpy.setdiff1d(result, old_device_data_list)
Then I got error:
TypeError: '<' not supported between instances of 'dict' and 'dict'
The description of numpy.setdiff1d says:
Return the sorted, unique values in ar1 that are not in ar2.
In order to sort the values, it needs to compare them using the < operator. But dictionaries cannot be compared like this. The relation "smaller than" is not defined for dictionaries.
NumPy is designed for working with numeric values, not for arbitrary Python data structures.
You could use a simple list comprehension to create a list of those dictionaries that are in result but not in old_device_data_list:
result = [d for d in result if d not in old_device_data_list]
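One caveat worth illustrating: `d not in old_device_data_list` compares whole dicts, so two entries that differ only in '_id' (as the first entries of the two sample lists above appear to) are not treated as equal. The sketch below uses shortened, made-up rows with string ids standing in for ObjectId, and shows a variant that matches on ('device_id', 'time') instead. Which fields identify a row is an assumption you would need to adapt.

```python
import datetime

# Hypothetical, shortened rows; string ids stand in for ObjectId values.
old_rows = [
    {'_id': 'old-1', 'device_id': 5, 'time': datetime.datetime(2020, 8, 26, 9, 5, 39)},
]
new_rows = [
    {'_id': 'new-1', 'device_id': 5, 'time': datetime.datetime(2020, 8, 26, 9, 5, 39)},
    {'_id': 'new-2', 'device_id': 4, 'time': datetime.datetime(2020, 8, 26, 9, 5, 39)},
]

# Whole-dict comparison: nothing is removed because the '_id' values differ.
remaining_whole = [d for d in new_rows if d not in old_rows]

# Comparison on chosen fields: build a set of keys once, then filter.
old_keys = {(d['device_id'], d['time']) for d in old_rows}
remaining_by_key = [d for d in new_rows if (d['device_id'], d['time']) not in old_keys]
```

Building the set first also avoids rescanning the old list for every element of the new one.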

Merging overlapping interval objects with dependencies

I need to merge interval objects to get distinct interval ranges based on extra parameters. What is the best way to do that?
The goal is an unambiguous statement of whether the state is true at any given time. The returned list must not contain overlapping or duplicated intervals.
Interval object description:
{
    'startDate': datetime.datetime,  # start of interval
    'endDate': datetime.datetime,    # end of interval
    'prioritized': bool,             # if True, it is always important and overrides non-prioritized intervals
    'state': bool                    # result of interval
}
In the examples below I changed startDate/endDate to strings to make them easier to read.
The interval list looks like this:
interval_list = [
{'startDate': '10:00:00', 'endDate': '12:00:00', 'prioritized': False, 'state': False},
{'startDate': '11:00:00', 'endDate': '18:00:00', 'prioritized': True, 'state': True},
{'startDate': '13:00:00', 'endDate': '17:00:00', 'prioritized': False, 'state': False},
{'startDate': '17:00:00', 'endDate': '20:00:00', 'prioritized': False, 'state': True},
{'startDate': '19:30:00', 'endDate': '19:45:00', 'prioritized': True, 'state': False}
]
I am trying to achieve the following:
merge(interval_list) should return:
[
{'startDate': '10:00:00', 'endDate': '11:00:00', 'state': False},
{'startDate': '11:00:00', 'endDate': '19:30:00', 'state': True},
{'startDate': '19:30:00', 'endDate': '19:45:00', 'state': False},
{'startDate': '19:45:00', 'endDate': '20:00:00', 'state': True},
]
I have the following incomplete code right now:
def merge_range(ranges: list):
    ranges = sorted(ranges, key=lambda x: x['startDate'])
    last_interval = dict(ranges[0])
    for current_interval in sorted(ranges, key=lambda x: x['startDate']):
        if current_interval['startDate'] > last_interval['endDate']:
            yield dict(last_interval)
            last_interval['startDate'] = current_interval['startDate']
            last_interval['endDate'] = current_interval['endDate']
            last_interval['prioritized'] = current_interval['prioritized']
            last_interval['state'] = current_interval['state']
        else:
            if current_interval['state'] == last_interval['state']:
                last_interval['endDate'] = max(last_interval['endDate'], current_interval['endDate'])
            else:
                pass  # I stopped here
    yield dict(last_interval)
I use it with merged_interval_list = list(merge_range(interval_list)).
Is this a good approach?
I found an answer to this question:
First, I separate the events into prioritized and non-prioritized lists.
Based on the prioritized list, I build its complement for the given day, that is, the intervals not covered by any prioritized event.
Next, I take the prioritized list as the main list and iterate over the non-prioritized list.
import datetime
from pprint import pprint
df = "%Y-%m-%d %H:%M:%S"
ds = "%Y-%m-%d"
events = {}
prioritized_events = {}
events["2019-05-10"] = [{
'startDate': datetime.datetime.strptime("2019-05-10 01:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 02:00:00", df),
'state': True
}, {
'startDate': datetime.datetime.strptime("2019-05-10 10:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 12:00:00", df),
'state': False
}, {
'startDate': datetime.datetime.strptime("2019-05-10 13:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 17:00:00", df),
'state': False
}, {
'startDate': datetime.datetime.strptime("2019-05-10 17:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 20:00:00", df),
'state': True
}]
prioritized_events["2019-05-10"] = [{
'startDate': datetime.datetime.strptime("2019-05-10 11:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 18:00:00", df),
'state': True
}, {
'startDate': datetime.datetime.strptime("2019-05-10 19:30:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 20:00:00", df),
'state': False
}]
allowed_intervals = []
for event_date in prioritized_events:
    minimal_time = datetime.datetime.combine(datetime.datetime.strptime(event_date, ds), datetime.time.min)
    maximum_time = datetime.datetime.combine(datetime.datetime.strptime(event_date, ds), datetime.time.max)
    for ev in prioritized_events[event_date]:
        if ev['startDate'] != minimal_time:
            allowed_intervals.append({
                'startDate': minimal_time,
                'endDate': ev['startDate']
            })
        # the next gap starts where this prioritized event ends
        minimal_time = ev['endDate']
    if prioritized_events[event_date][len(prioritized_events[event_date]) - 1]['endDate'] != maximum_time:
        allowed_intervals.append({
            'startDate': prioritized_events[event_date][len(prioritized_events[event_date]) - 1]['endDate'],
            'endDate': maximum_time
        })
for event_date in events:
    if event_date not in prioritized_events:
        prioritized_events[event_date] = events[event_date]
    else:
        for ev in events[event_date]:
            start = ev['startDate']
            end = ev['endDate']
            state = ev['state']
            done = False
            for allowed_interval in allowed_intervals:
                if start >= allowed_interval['startDate'] and end <= allowed_interval['endDate']:
                    # event lies entirely inside a gap
                    prioritized_events[event_date].append({
                        'startDate': start,
                        'endDate': end,
                        'state': state
                    })
                    done = True
                    break
                elif allowed_interval['startDate'] <= start < allowed_interval['endDate'] < end:
                    # event starts inside the gap and runs past its end
                    prioritized_events[event_date].append({
                        'startDate': start,
                        'endDate': allowed_interval['endDate'],
                        'state': state
                    })
                    start = allowed_interval['endDate']
                elif start < allowed_interval['startDate'] and start < allowed_interval['endDate'] < end:
                    # event covers the whole gap
                    prioritized_events[event_date].append({
                        'startDate': allowed_interval['startDate'],
                        'endDate': allowed_interval['endDate'],
                        'state': state
                    })
                    start = allowed_interval['endDate']
                elif start < allowed_interval['startDate'] and start < allowed_interval['endDate'] and allowed_interval['startDate'] < end <= allowed_interval['endDate']:
                    # event starts before the gap and ends inside it
                    prioritized_events[event_date].append({
                        'startDate': allowed_interval['startDate'],
                        'endDate': end,
                        'state': state
                    })
                    start = end
            if done:
                continue
    prioritized_events[event_date] = sorted(prioritized_events[event_date], key=lambda k: k['startDate'])
And now sorted list:
pprint(prioritized_events["2019-05-10"])
returns:
[
{'startDate': datetime.datetime(2019, 5, 10, 1, 0),
'endDate': datetime.datetime(2019, 5, 10, 2, 0),
'state': True
},
{'startDate': datetime.datetime(2019, 5, 10, 10, 0),
'endDate': datetime.datetime(2019, 5, 10, 11, 0),
'state': False
},
{'startDate': datetime.datetime(2019, 5, 10, 11, 0),
'endDate': datetime.datetime(2019, 5, 10, 18, 0),
'state': True
},
{'startDate': datetime.datetime(2019, 5, 10, 18, 0),
'endDate': datetime.datetime(2019, 5, 10, 19, 30),
'state': True
},
{'startDate': datetime.datetime(2019, 5, 10, 19, 30),
'endDate': datetime.datetime(2019, 5, 10, 20, 0),
'state': False
}
]
When we deal with time intervals, the main idea is to sort the dates (start and end) along with their kind: start or end. Here, we also need access to the original interval, to handle priorities and states.
Let's try with this list:
interval_list = [
{'startDate': '10:00:00', 'endDate': '12:00:00', 'prioritized': False, 'state': False},
{'startDate': '11:00:00', 'endDate': '18:00:00', 'prioritized': True, 'state': True},
{'startDate': '13:00:00', 'endDate': '17:00:00', 'prioritized': False, 'state': False},
{'startDate': '17:00:00', 'endDate': '20:00:00', 'prioritized': False, 'state': True},
{'startDate': '19:30:00', 'endDate': '19:45:00', 'prioritized': True, 'state': False}
]
First, we convert datestrings to dates (as you did):
import datetime

day = '2019-05-10'

def get_datetime(d, t):
    return datetime.datetime.strptime(d+" "+t, "%Y-%m-%d %H:%M:%S")

for interval in interval_list:
    interval['startDate'] = get_datetime(day, interval['startDate'])
    interval['endDate'] = get_datetime(day, interval['endDate'])
Now, we build a new list with the needed information:
L = sorted(
[(interval['startDate'], 1, i) for i, interval in enumerate(interval_list)]
+[(interval['endDate'], -1, i) for i, interval in enumerate(interval_list)]
)
L is the following list of tuples (date, dir, index) (dir: 1 means it's a start date, -1 means it's an end date):
[(datetime.datetime(2019, 5, 10, 10, 0), 1, 0), (datetime.datetime(2019, 5, 10, 11, 0), 1, 1), (datetime.datetime(2019, 5, 10, 12, 0), -1, 0), (datetime.datetime(2019, 5, 10, 13, 0), 1, 2), (datetime.datetime(2019, 5, 10, 17, 0), -1, 2), (datetime.datetime(2019, 5, 10, 17, 0), 1, 3), (datetime.datetime(2019, 5, 10, 18, 0), -1, 1), (datetime.datetime(2019, 5, 10, 19, 30), 1, 4), (datetime.datetime(2019, 5, 10, 19, 45), -1, 4), (datetime.datetime(2019, 5, 10, 20, 0), -1, 3)]
Now we can iterate over L and keep track of the current state and priority, recording a date whenever the state changes according to the given priority:
def interval_info(i):
    interval = interval_list[i]
    return interval['state'], interval['prioritized']

T = []
stack = []
for boundary_date, direction, i in L:
    state, prioritized = interval_info(i)  # state and priority of the current date
    if direction == 1:  # start date
        if stack:
            prev_state, prev_prioritized = interval_info(stack[-1])  # previous infos
            if state != prev_state and prioritized >= prev_prioritized:  # enter a new state with a greater or equal priority
                T.append((boundary_date, state))  # enter in new state
        else:  # begin of covered area
            T.append((boundary_date, state))  # enter in new state
        stack.append(i)  # add the opened interval
    elif direction == -1:  # end date
        stack.remove(i)  # remove the closed interval (i is a *value* in stack)
        if stack:
            prev_state, prev_prioritized = interval_info(stack[-1])
            if state != prev_state and not prev_prioritized:  # leave a non priority state
                T.append((boundary_date, prev_state))  # re-enter in prev state
        else:  # end of covered area
            T.append((boundary_date, None))  # enter in None state
The value of T is:
[(datetime.datetime(2019, 5, 10, 10, 0), False), (datetime.datetime(2019, 5, 10, 11, 0), True), (datetime.datetime(2019, 5, 10, 19, 30), False), (datetime.datetime(2019, 5, 10, 19, 45), True), (datetime.datetime(2019, 5, 10, 20, 0), None)]
You can then easily produce the output you wanted. Hope it helps!
EDIT: Bonus: how to convert start dates to time intervals:
>>> import datetime
>>> T = [(datetime.datetime(2019, 5, 10, 10, 0), False), (datetime.datetime(2019, 5, 10, 11, 0), True), (datetime.datetime(2019, 5, 10, 19, 30), False), (datetime.datetime(2019, 5, 10, 19, 45), True), (datetime.datetime(2019, 5, 10, 20, 0), None)]
>>> [{'startDate': s[0], 'endDate': e[0], 'state': s[1]} for s,e in zip(T, T[1:])]
[{'startDate': datetime.datetime(2019, 5, 10, 10, 0), 'endDate': datetime.datetime(2019, 5, 10, 11, 0), 'state': False}, {'startDate': datetime.datetime(2019, 5, 10, 11, 0), 'endDate': datetime.datetime(2019, 5, 10, 19, 30), 'state': True}, {'startDate': datetime.datetime(2019, 5, 10, 19, 30), 'endDate': datetime.datetime(2019, 5, 10, 19, 45), 'state': False}, {'startDate': datetime.datetime(2019, 5, 10, 19, 45), 'endDate': datetime.datetime(2019, 5, 10, 20, 0), 'state': True}]
You just have to zip every start date with the next one to get intervals.
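Putting the steps of this answer together, a compact sketch might look like the following. It is only a restatement of the sweep described above wrapped in a function named merge (the name the question uses), and it works directly on the zero-padded 'HH:MM:SS' strings from the question, which sort correctly as text. Skipping uncovered gaps via the `is not None` filter is an addition made here. It has only been checked against the sample data and is not a general solution for arbitrary priority nesting.

```python
def merge(intervals):
    """Sweep over sorted start/end boundaries and emit non-overlapping intervals."""
    boundaries = sorted(
        [(iv['startDate'], 1, i) for i, iv in enumerate(intervals)]
        + [(iv['endDate'], -1, i) for i, iv in enumerate(intervals)]
    )
    changes = []   # (time, state) pairs marking where the effective state changes
    open_ids = []  # indices of intervals that are currently open
    for when, direction, i in boundaries:
        state, prioritized = intervals[i]['state'], intervals[i]['prioritized']
        if direction == 1:  # an interval starts
            if not open_ids:
                changes.append((when, state))
            else:
                prev = intervals[open_ids[-1]]
                if state != prev['state'] and prioritized >= prev['prioritized']:
                    changes.append((when, state))
            open_ids.append(i)
        else:  # an interval ends
            open_ids.remove(i)
            if not open_ids:
                changes.append((when, None))
            else:
                prev = intervals[open_ids[-1]]
                if state != prev['state'] and not prev['prioritized']:
                    changes.append((when, prev['state']))
    # Pair each change with the next one; skip uncovered gaps (state None).
    return [
        {'startDate': s, 'endDate': e, 'state': st}
        for (s, st), (e, _) in zip(changes, changes[1:])
        if st is not None
    ]

interval_list = [
    {'startDate': '10:00:00', 'endDate': '12:00:00', 'prioritized': False, 'state': False},
    {'startDate': '11:00:00', 'endDate': '18:00:00', 'prioritized': True, 'state': True},
    {'startDate': '13:00:00', 'endDate': '17:00:00', 'prioritized': False, 'state': False},
    {'startDate': '17:00:00', 'endDate': '20:00:00', 'prioritized': False, 'state': True},
    {'startDate': '19:30:00', 'endDate': '19:45:00', 'prioritized': True, 'state': False},
]
merged = merge(interval_list)
```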

Sort list of Dict By Multiple Keys, Including List

I would like to sort this list of dicts by a list key and then by date.
I am trying to sort the dicts by 'label' according to label_order and then by descending 'date'.
label_order = [3, 4, 2, 1]
data = [
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
]
After sorting would look like this:
data = [
{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
]
I've tried lambda expressions and itemgetter, but I am having difficulty combining the right strategies for the sort key. Maybe it is just trying to do too much at one time.
Any help or direction would be appreciated.
A more efficient approach is to build a dict that maps items in label_order to indices, so that you can use the indices as keys when performing the sort:
keys = {n: i for i, n in enumerate(label_order)}
sorted(data, key=lambda d: (-keys[d['label']], d['date']), reverse=True)
This returns:
[{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)}]
It's a little tricky to sort dates in reverse order. Instead, let's use the negative of the label's index so they're sorted in descending order. Then we can reverse the sorting and get the results in the order we actually want!
from datetime import datetime
label_order = [3, 4, 2, 1]
data = [
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
]
def descending_sort_key(item):
    return -label_order.index(item['label']), item['date']

data.sort(key=descending_sort_key, reverse=True)
Voila - no date math or other trickery.
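Another way to get mixed sort directions, shown here only as an alternative sketch, relies on the fact that Python's sort is stable: sort by the secondary key first (date, descending), then by the primary key (position in label_order). The names rank and ordered are introduced just for this example.

```python
from datetime import datetime

label_order = [3, 4, 2, 1]
data = [
    {'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
    {'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
    {'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
    {'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
    {'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
]

# Map each label to its position in label_order.
rank = {label: i for i, label in enumerate(label_order)}

# Pass 1: secondary key, newest date first.
ordered = sorted(data, key=lambda d: d['date'], reverse=True)
# Pass 2: primary key; stability keeps the date order within each label.
ordered.sort(key=lambda d: rank[d['label']])
```

Each pass has a single direction, so no key needs to be negated.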

Python : get the next working time slot

I am handling staff planning (Python 2.7). For a given person, I have a list of working days (and times) keyed by ISO weekday.
Given a date, I'm trying to determine the next date on which that person will be available.
working_days = {
1:
{
'start': datetime.time(8, 0),
'end': datetime.time(17, 15)
},
2:
{
'start': datetime.time(7, 45),
'end': datetime.time(17, 0)
},
3:
{
'start': datetime.time(8, 0),
'end': datetime.time(16, 45)
},
4:
{
'start': datetime.time(10, 0),
'end': datetime.time(15, 30)
},
5:
{
'start': datetime.time(8, 30),
'end': datetime.time(17, 15)
}
}
searched_date = datetime.datetime(2018, 6, 13, 20, 57, 00)  # Wednesday
searched_date_week_day = searched_date.isoweekday()
Expected result: datetime.datetime(2018, 6, 14, 10, 00, 00)  # Thursday at 10:00
I would have liked to just use the weekdays to achieve this result but I have my doubts on it.
What I wrote already fails on the simplest example (my searched time is later than the staff member's end of working time on that day):
next_working_day = next((x for x in working_days if x >= searched_date_week_day), None)
gives: 3 , Wednesday
Considering there will be trickier cases (imagine my searched date is a Sunday, ISO weekday 7: it doesn't work), is my only way out to transform my working_days into some list of datetime.datetime values?
Fiddle
Using your Fiddle as a starting point:
import datetime

working_days = [{'start': datetime.time(8, 0), 'end': datetime.time(17, 0), 'day': 1}, {'start': datetime.time(8, 0), 'end': datetime.time(17, 0), 'day': 2}, {'start': datetime.time(8, 0), 'end': datetime.time(17, 0), 'day': 3}, {'start': datetime.time(8, 0), 'end': datetime.time(17, 0), 'day': 4}]
searched_date = datetime.datetime(2018, 6, 13, 20, 57, 00)
searched_date_week_day = searched_date.isoweekday()
i = 0
while i < len(working_days) and working_days[i]['day'] < searched_date_week_day:
    i = i + 1
# if reached the end then next working day must be first day of next week
if i == len(working_days):
    next_working_day = working_days[0]
else:
    next_working_day = working_days[(i+1) % len(working_days)]
print(next_working_day)
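It assumes that the entries in your list are in day order. Because it always steps to the entry after the first match, it also assumes that the searched day is itself a working day and that the searched time is already past that day's working hours.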
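If a full datetime is wanted rather than a weekday entry, one possible sketch is to walk forward day by day from the searched date and return the first start time that lies in the future. The function name next_working_start is made up for this illustration, and returning the searched time itself when it falls inside working hours is an assumption about what "available" should mean. It uses the working_days dict from the question.

```python
import datetime

working_days = {
    1: {'start': datetime.time(8, 0), 'end': datetime.time(17, 15)},
    2: {'start': datetime.time(7, 45), 'end': datetime.time(17, 0)},
    3: {'start': datetime.time(8, 0), 'end': datetime.time(16, 45)},
    4: {'start': datetime.time(10, 0), 'end': datetime.time(15, 30)},
    5: {'start': datetime.time(8, 30), 'end': datetime.time(17, 15)},
}

def next_working_start(searched, schedule):
    """Return the next datetime at which the person is available, or None."""
    for offset in range(8):  # today plus the following seven days
        day = searched.date() + datetime.timedelta(days=offset)
        slot = schedule.get(day.isoweekday())
        if slot is None:
            continue  # not a working day
        start = datetime.datetime.combine(day, slot['start'])
        end = datetime.datetime.combine(day, slot['end'])
        if searched < start:
            return start
        if offset == 0 and searched < end:
            return searched  # already inside today's working hours
    return None  # empty schedule

wednesday_evening = datetime.datetime(2018, 6, 13, 20, 57, 0)
sunday_noon = datetime.datetime(2018, 6, 17, 12, 0, 0)
result_wed = next_working_start(wednesday_evening, working_days)
result_sun = next_working_start(sunday_noon, working_days)
```

Looking up each candidate day with isoweekday() means the Sunday case needs no special handling, and the schedule can stay a plain dict.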
