In MySQL, if I have a list of date ranges (range-start and range-end), e.g.
10/06/1983 to 14/06/1983
15/07/1983 to 16/07/1983
18/07/1983 to 18/07/1983
And I want to check if another date range contains ANY of the ranges already in the list, how would I do that?
e.g.
06/06/1983 to 18/06/1983 = IN LIST
10/06/1983 to 11/06/1983 = IN LIST
14/07/1983 to 14/07/1983 = NOT IN LIST
This is a classical problem, and it's actually easier if you reverse the logic.
Let me give you an example.
I'll post one period of time here, and all the different variations of other periods that overlap in some way.
          |-------------------|         compare to this one
              |---------|               contained within
          |----------|                  contained within, equal start
                  |-----------|         contained within, equal end
          |-------------------|         contained within, equal start+end
    |------------|                      not fully contained, overlaps start
                  |---------------|     not fully contained, overlaps end
    |-------------------------|         overlaps start, bigger
          |-----------------------|     overlaps end, bigger
    |------------------------------|    overlaps entire period
On the other hand, let me post all those that don't overlap:
          |-------------------|         compare to this one
    |---|                               ends before
                                |---|   starts after
So if you simply reduce the comparison to:
starts after end
ends before start
then you'll find all the periods that don't overlap, i.e. all the non-matching periods.
For your final NOT IN LIST example, you can see that it matches those two rules.
You will need to decide whether the following periods are IN or OUTSIDE your ranges:
          |-------------|
  |-------|                             equal end with start of comparison period
                        |-----|         equal start with end of comparison period
If your table has columns called range_end and range_start, here's some simple SQL to retrieve all the matching rows:
SELECT *
FROM periods
WHERE NOT (range_start > @check_period_end
           OR range_end < @check_period_start)
Note the NOT in there. Since the two simple rules find all the non-matching rows, a simple NOT will reverse it to say: if it's not one of the non-matching rows, it has to be one of the matching ones.
Apply simple reversal logic (De Morgan's laws) to get rid of the NOT and you'll end up with:
SELECT *
FROM periods
WHERE range_start <= @check_period_end
  AND range_end >= @check_period_start
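The same two-comparison test can be tried outside the database. The following is a minimal Python sketch, assuming inclusive end dates (touching endpoints count as overlapping); the names overlaps, periods and in_list are illustrative and do not come from the answer above.

```python
from datetime import date


def overlaps(range_start, range_end, check_start, check_end):
    """Inclusive overlap test, mirroring the WHERE clause without the NOT."""
    return range_start <= check_end and range_end >= check_start


# The three ranges from the question, as (range_start, range_end).
periods = [
    (date(1983, 6, 10), date(1983, 6, 14)),
    (date(1983, 7, 15), date(1983, 7, 16)),
    (date(1983, 7, 18), date(1983, 7, 18)),
]


def in_list(check_start, check_end):
    """True if the check period overlaps at least one stored period."""
    return any(overlaps(s, e, check_start, check_end) for s, e in periods)


print(in_list(date(1983, 6, 6), date(1983, 6, 18)))    # True
print(in_list(date(1983, 6, 10), date(1983, 6, 11)))   # True
print(in_list(date(1983, 7, 14), date(1983, 7, 14)))   # False
```

If touching endpoints should not count as an overlap, replacing <= and >= with < and > is one way to express that choice.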
Taking your example range of 06/06/1983 to 18/06/1983 and assuming you have columns called start and end for your ranges, you could use a clause like this
where ('1983-06-06' <= end) and ('1983-06-18' >= start)
i.e. check that the start of your test range is on or before the end of the database range, and that the end of your test range is on or after the start of the database range.
If your RDBMS supports the standard SQL OVERLAPS predicate then this becomes trivial, with no need for homegrown solutions. (In Oracle it apparently works but is undocumented.)
In your expected results you say
06/06/1983 to 18/06/1983 = IN LIST
However, this period does not contain nor is contained by any of the periods in your table (not list!) of periods. It does, however, overlap the period 10/06/1983 to 14/06/1983.
You may find the Snodgrass book (http://www.cs.arizona.edu/people/rts/tdbbook.pdf) useful: it pre-dates MySQL but the concept of time hasn't changed ;-)
I created a function to deal with this problem in MySQL. Just convert the dates to seconds before use (e.g. with UNIX_TIMESTAMP()).
DELIMITER ;;
CREATE FUNCTION overlap_interval(x INT, y INT, a INT, b INT)
RETURNS INTEGER DETERMINISTIC
BEGIN
    DECLARE overlap_amount INTEGER;
    IF (((x <= a) AND (a < y)) OR ((x < b) AND (b <= y)) OR (a < x AND y < b)) THEN
        IF (x < a) THEN
            IF (y < b) THEN
                SET overlap_amount = y - a;
            ELSE
                SET overlap_amount = b - a;
            END IF;
        ELSE
            IF (y < b) THEN
                SET overlap_amount = y - x;
            ELSE
                SET overlap_amount = b - x;
            END IF;
        END IF;
    ELSE
        SET overlap_amount = 0;
    END IF;
    RETURN overlap_amount;
END ;;
DELIMITER ;
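For comparison, the nested IF logic above appears to reduce to a single expression: the overlap length is the smaller end minus the larger start, clamped at zero. Here is a hedged Python sketch of that idea, assuming well-formed intervals (x <= y and a <= b) given in seconds; overlap_amount is an illustrative name.

```python
def overlap_amount(x, y, a, b):
    """Length of the overlap between [x, y] and [a, b], or 0 if they are disjoint."""
    return max(0, min(y, b) - max(x, a))


print(overlap_amount(0, 10, 5, 20))   # 5  (overlaps the end of the first interval)
print(overlap_amount(0, 10, 2, 4))    # 2  (second interval contained in the first)
print(overlap_amount(0, 10, 10, 20))  # 0  (the intervals only touch)
```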
Look at the following example; it may be helpful.
SELECT DISTINCT RelatedTo,
       CAST(NotificationContent AS nvarchar(max)) AS NotificationContent,
       ID,
       Url,
       NotificationPrefix,
       NotificationDate
FROM NotificationMaster AS nfm
INNER JOIN NotificationSettingsSubscriptionLog AS nfl
        ON nfm.NotificationDate BETWEEN nfl.LastSubscribedDate
                                    AND ISNULL(nfl.LastUnSubscribedDate, GETDATE())
WHERE ID NOT IN (SELECT NotificationID FROM removednotificationsmaster WHERE Userid = @userid)
  AND nfl.UserId = @userid
  AND nfl.RelatedSettingColumn = RelatedTo
Try this on MS SQL:
WITH date_range (calc_date) AS (
SELECT DATEADD(DAY, DATEDIFF(DAY, 0, [ending date]) - DATEDIFF(DAY, [start date], [ending date]), 0)
UNION ALL SELECT DATEADD(DAY, 1, calc_date)
FROM date_range
WHERE DATEADD(DAY, 1, calc_date) <= [ending date])
SELECT P.[fieldstartdate], P.[fieldenddate]
FROM date_range R JOIN [yourBaseTable] P on Convert(date, R.calc_date) BETWEEN convert(date, P.[fieldstartdate]) and convert(date, P.[fieldenddate])
GROUP BY P.[fieldstartdate], P.[fieldenddate];
CREATE FUNCTION overlap_date(s DATE, e DATE, a DATE, b DATE)
RETURNS BOOLEAN DETERMINISTIC
RETURN s BETWEEN a AND b or e BETWEEN a and b or a BETWEEN s and e;
Another method uses the SQL BETWEEN operator.
Periods that fully contain the check period:
SELECT *
FROM periods
WHERE @check_period_start BETWEEN range_start AND range_end
  AND @check_period_end BETWEEN range_start AND range_end
Periods that do not fully contain the check period:
SELECT *
FROM periods
WHERE (@check_period_start NOT BETWEEN range_start AND range_end
       OR @check_period_end NOT BETWEEN range_start AND range_end)
SELECT *
FROM tabla a
WHERE (@Fini <= a.dFechaFin AND @Ffin >= a.dFechaIni)
  AND (   (@Fini >= a.dFechaIni AND @Ffin <= a.dFechaFin)
       OR (@Fini >= a.dFechaIni AND @Ffin >= a.dFechaFin)
       OR (a.dFechaIni >= @Fini AND a.dFechaFin <= @Ffin)
       OR (a.dFechaIni >= @Fini AND a.dFechaFin >= @Ffin))
I need to use a string to determine which calculation to run. I am trying to use a dispatch table instead of an elif ladder. I need to run some one-liners and some multi-line functions, and I need to run a function based on a portion of an incoming state.
This code is simplified to explain. The first 4 functions work but the last 3 do not.
Fun = functions.get(reference, lambda : print('Invalid Ref'))
fun(my_df, start, stop)

def Ripple(df, start, stop):  # Some multi-line function
    temp = df.trc3_s12_db[df.index >= start, df.index <= stop]
    return temp.values.max() - temp.values.min()

def RAve(df, start, stop, ave, spacing=100):  # Changing function
    return df.trc3_s12_db.rolling(ave*spacing).[df.index >= start, df.index <= stop].min()

functions = {  # Dispatch Table
    'MinA': lambda df, start, stop: df[df.index >= start, df.index <= stop].tA.min() * (-1),
    'MaxA': lambda df, start, stop: df[df.index >= start, df.index <= stop].tA.max() * (-1),
    'MinB': lambda df, start, stop: df[df.index >= start, df.index <= stop].tB.min() * (-1),
    'Ripple': Ripple,
    '5MHz Ave': RAve(ave=5),
    '2.2MHz Ave': RAve(ave=2.2),
    '%dMHz Ave': RAve(ave=%d)  # Is this possible?
}
I know I can pass functions and arguments using a tuple but then the whole table needs to be a tuple.
Can I pass variables through the formatting of a string into a dispatch table?
What is the best way to sort through these possibilities?
For the first part, it looks like you're trying to create a partial function: a function made from another function with some of its arguments 'burned in'.
For the second part, '%dMHz Ave': RAve(ave=%d) #Is this possible? - no, this isn't possible. You'd need some other logic to detect this case, and then use something other than the dispatch table (dict) for it. E.g. use a regexp to check whether the expression matches "xxxMHz Ave", and in that case use RAve(ave=xxx).
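Both suggestions can be combined in a small lookup helper. The sketch below is only an illustration under assumptions: r_ave is a hypothetical stand-in for the question's RAve that works on a plain list and merely reports the window it would use, and lookup and AVE_PATTERN are made-up names. functools.partial and re are standard-library modules.

```python
import re
from functools import partial


def r_ave(data, start, stop, ave, spacing=100):
    """Hypothetical stand-in for RAve: reports the rolling window it would use."""
    return f"window={ave * spacing:g} over [{start}, {stop}]"


# Dispatch table: exact names map to callables taking (data, start, stop).
functions = {
    "Min": lambda data, start, stop: min(data[start:stop + 1]),
    "5MHz Ave": partial(r_ave, ave=5),      # ave is 'burned in'
    "2.2MHz Ave": partial(r_ave, ave=2.2),
}

# Fallback for the parameterised form, e.g. "7.5MHz Ave".
AVE_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)MHz Ave$")


def lookup(reference):
    """Try an exact dictionary hit first, then the 'xxxMHz Ave' pattern."""
    if reference in functions:
        return functions[reference]
    match = AVE_PATTERN.match(reference)
    if match:
        return partial(r_ave, ave=float(match.group(1)))
    raise KeyError(f"Invalid Ref: {reference}")


data = [3, 1, 4, 1, 5, 9, 2, 6]
print(lookup("Min")(data, 2, 5))
print(lookup("7.5MHz Ave")(data, 2, 5))
```

Keeping the regular expression outside the dict means the table itself stays a plain mapping from names to callables, and only the open-ended family of names needs special handling.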
I am trying to implement an attendance system. I have two lists, both already converted to Unix timestamps. One list contains a fixed timetable and the other is the log of when students clock in.
For example the timetable list might include
timetable[1519650000, 1519740000, 1519743600]
timetableEnd[1519653600, 1519743600, 1519747200]
log[1519739987, 1519744087]
In human-readable form (not how it appears in the code):
timetable[2018-02-26 13:00:00, 2018-02-27 14:00:00, 2018-02-27 15:00:00]
timetableEnd[2018-02-26 14:00:00, 2018-02-27 15:00:00, 2018-02-27 16:00:00]
log[2018-02-27 13:59:47, 2018-02-27 15:08:07]
Is there a way to loop over every log element and check it against the timetable elements for this condition:
a <= x <= b
Where a = timetable begin time - 5mins
x = log time
b = end time for timetable time
Example: (1519650000 - 300) <= x <= 1519653600
This returns false, since the log doesn't contain a value that satisfies it.
Can I get some advice or guidance on how I should proceed with this?
You can build pairs of neighbors of a list using the pattern zip(a[:-1], a[1:]) and you can pair up the elements of two equally long lists using zip(a, b). Using these two things you can try this:
if all(start-300 <= log_element <= end
       for ((start, end), log_element) in zip(
           zip(timetable[:-1], timetable[1:]), log)):
    print("All logs are in their boundaries.")
In your case this will not succeed because log[1] is not between timetable[1]-300 and timetable[2].
In case you are not familiar with Python's elegant functional style, you might find it easier to understand it phrased this (less elegant) way:
def all_logs_in_boundaries(timetable, log):
    for ((start, end), log_element) in zip(
            zip(timetable[:-1], timetable[1:]), log):
        if not (start-300 <= log_element <= end):
            return False
    return True  # or: print("All logs are in their boundaries.")
Why don't you use datetime?
Use something like this
import datetime
logtime = datetime.datetime.fromtimestamp(x)
begintime = datetime.datetime.fromtimestamp(a)
arrivaltime = begintime - datetime.timedelta(minutes=5)
return (arrivaltime <= logtime and logtime <= begintime)
logtime is the datetime from x
begintime is the datetime from a
and arrivaltime is 5 minutes earlier than begintime
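Neither snippet above uses the timetableEnd list from the question. As a complementary sketch, assuming the goal is to mark each timetable slot as attended when at least one log entry falls between five minutes before its start and its end, the check could look like this (attended, timetable_end and GRACE_SECONDS are illustrative names):

```python
GRACE_SECONDS = 300  # five minutes before the slot starts

# Data from the question, as Unix timestamps.
timetable = [1519650000, 1519740000, 1519743600]
timetable_end = [1519653600, 1519743600, 1519747200]
log = [1519739987, 1519744087]


def attended(starts, ends, clock_ins, grace=GRACE_SECONDS):
    """For each slot, True if any clock-in x satisfies start - grace <= x <= end."""
    return [
        any(start - grace <= x <= end for x in clock_ins)
        for start, end in zip(starts, ends)
    ]


print(attended(timetable, timetable_end, log))  # [False, True, True]
```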
I have a problem with a nested list whose elements are times:
time=[(2017-01-01T00:00:00.000000Z,2017-01-01T00:00:39.820000Z),
(2017-01-01T00:00:38.840000Z,2017-01-01T01:36:33.260000Z),
(2017-01-01T01:36:45.960000Z,2017-01-01T03:06:15.340000Z),
(2017-01-01T03:06:24.320000Z,2017-01-01T03:31:00.420000Z),
(2017-01-01T03:31:22.880000Z,2017-01-01T03:48:43.500000Z),
(2017-01-01T03:48:53.280000Z,2017-01-01T04:14:53.660000Z),
(2017-01-01T04:15:15.160000Z,2017-01-01T04:34:44.060000Z),
(2017-01-01T04:34:57.440000Z,2017-01-01T04:46:31.100000Z),
(2017-01-01T04:46:53.320000Z,2017-01-01T05:22:20.340000Z),
(2017-01-01T05:22:24.920000Z,2017-01-01T06:17:30.900000Z),
(2017-01-01T06:18:02.280000Z,2017-01-01T07:01:45.740000Z),
(2017-01-01T07:02:04.640000Z,2017-01-01T07:39:48.780000Z),
(2017-01-01T07:40:12.400000Z,2017-01-01T08:19:46.140000Z),
(2017-01-01T08:20:13.520000Z,2017-01-01T10:17:45.380000Z),
(2017-01-01T10:17:59.880000Z,2017-01-01T15:01:29.100000Z),
(2017-01-01T15:01:55.840000Z,2017-01-01T15:08:45.460000Z),
(2017-01-01T15:09:04.000000Z,2017-01-01T15:42:13.180000Z),
(2017-01-01T15:42:30.360000Z,2017-01-01T16:14:07.340000Z),
(2017-01-01T16:14:24.560000Z,2017-01-01T17:11:28.420000Z),
(2017-01-01T17:11:32.960000Z,2017-01-01T17:46:07.660000Z),
(2017-01-01T17:46:30.280000Z,2017-01-01T18:02:17.860000Z),
(2017-01-01T18:02:35.240000Z,2017-01-01T18:16:17.740000Z),
(2017-01-01T18:16:26.720000Z,2017-01-01T18:39:10.540000Z),
(2017-01-01T18:39:19.360000Z,2017-01-01T19:45:25.860000Z),
(2017-01-01T19:45:34.720000Z,2017-01-01T20:41:00.220000Z),
(2017-01-01T20:41:21.520000Z,2017-01-01T21:13:51.660000Z),
(2017-01-01T21:14:13.360000Z,2017-01-01T21:41:16.220000Z),
(2017-01-01T21:41:28.640000Z,2017-01-01T22:03:03.820000Z),
(2017-01-01T22:03:29.400000Z,2017-01-01T23:14:13.500000Z),
(2017-01-01T23:14:35.200000Z,2017-01-01T23:59:59.980000Z)]
As you can see, all the elements belong to the same day, 2017-01-01. What I want is the difference (in seconds or ms) between the entire day (86400 s) and the total time covered by the intervals in the list. There are some overlaps, so I think I first have to do some kind of "intersection check" and, once the overlaps are resolved, subtract the covered time from 86400. How can I do that intersection check? Any suggestion would be highly appreciated. Thanks in advance!
Desired Output:
86400(day) - 85000(possible time in seconds after time intersection of list) = 1400
The problem is twofold:
to replace overlapping intervals with their unions;
to sum the resulting non-overlapping intervals.
The first part can be done like this:
time.sort()
new_time = [list(time[0])]
for t in time[1:]:
    if t[0] <= new_time[-1][1]:
        if t[1] > new_time[-1][1]:
            new_time[-1][1] = t[1]
    else:
        new_time.append(list(t))
while the second part is best done using the datetime module:
import datetime
total = sum([(datetime.datetime.strptime(t[1], '%Y-%m-%dT%H:%M:%S.%fZ') -
              datetime.datetime.strptime(t[0], '%Y-%m-%dT%H:%M:%S.%fZ')).total_seconds()
             for t in new_time])
print(86400 - total)
After converting strings to numbers, you could use the top answer from Python find continuous interesctions of intervals
You can sort and then merge any overlaps:
time.sort()
noOverlapList = []
start = time[0][0]  # start of first interval
end = time[0][1]  # end of first interval
for interval in time[1:]:
    if interval[0] <= end:
        # interval overlaps (or touches) the current merged interval
        if interval[1] > end:
            end = interval[1]
    else:
        noOverlapList.append((start, end))  # merged non-overlapping interval
        start = interval[0]
        end = interval[1]
noOverlapList.append((start, end))  # don't forget the last merged interval
Then just sum the intervals contained in noOverlapList and take the difference from 86400.
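Putting the merge and the summation together, here is a self-contained sketch that parses the timestamps first and then merges in one pass. It assumes the interval endpoints are strings in the format shown in the question and that the list is not empty; covered_seconds and FMT are illustrative names, and the sample uses only the first three intervals from the question.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def covered_seconds(intervals):
    """Total seconds covered by the intervals, counting overlapping time once."""
    parsed = sorted(
        (datetime.strptime(s, FMT), datetime.strptime(e, FMT)) for s, e in intervals
    )
    total = 0.0
    cur_start, cur_end = parsed[0]
    for start, end in parsed[1:]:
        if start <= cur_end:
            # Overlaps or touches the current merged interval: extend it.
            cur_end = max(cur_end, end)
        else:
            # Gap found: close the current merged interval and start a new one.
            total += (cur_end - cur_start).total_seconds()
            cur_start, cur_end = start, end
    total += (cur_end - cur_start).total_seconds()
    return total


sample = [
    ("2017-01-01T00:00:00.000000Z", "2017-01-01T00:00:39.820000Z"),
    ("2017-01-01T00:00:38.840000Z", "2017-01-01T01:36:33.260000Z"),
    ("2017-01-01T01:36:45.960000Z", "2017-01-01T03:06:15.340000Z"),
]
covered = covered_seconds(sample)
print(86400 - covered)  # seconds of the day not covered by the sample intervals
```

Parsing before sorting avoids relying on the lexicographic order of the strings, although for this fixed-width format both orders agree.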
I have a long list of data containing 'timestamp' versus 'quantity' values. However, the timestamps in the list are not all in order (for example, timestamp[x] can be 140056 while timestamp[x+1] can be 560). I do not want to sort them; instead, when this happens, I want to add the value of timestamp[x] to timestamp[x+1].
PS: The quantities need to stay in the same order as in the list when plotting.
I have been working on this using the following code, where timestamp is the name of the list that contains all the timestamp values:
for t in timestamp:
previous = timestamp[t-1]
increment = 0
if previous > timestamp[t]:
increment = previous
t += increment
delta = datetime.timedelta(0, (t - startTimeStamp) / 1000);
timeAtT = fileStartDate + (delta + startTime)
print("time at t=" + str(t) + " is: " + str(timeAtT));
previous = t
However, it fails with TypeError: list indices must be integers, not tuples. How can I solve this, or is there another way of doing this task? Thanks!
The problem is that you're treating t as if it is an index of the list. In your case, t holds the actual values of the list, so constructions like timestamp[t] are not valid. You either want:
for t in range(len(timestamp)):
Or if you want both an index and the value:
for (t, value) in enumerate(timestamp):
When you write for t in timestamp, t takes on the value of each item in timestamp. But then you try to use t as an index to get previous. To do this, try:
for i, t in enumerate(timestamp):
    previous = timestamp[i - 1]  # note: for i == 0 this wraps around to the last element
    current = t
Also when you get TypeErrors like this make sure you try printing out the intermediate steps, so you can see exactly what is going wrong.
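Building on the enumerate suggestion, one possible reading of the task is to keep a running offset that grows by the previous raw value each time the timestamps drop. This is only a sketch of that interpretation, not necessarily what the asker intended; make_monotonic is an illustrative name.

```python
def make_monotonic(timestamps):
    """Return adjusted timestamps; each drop adds the previous raw value to a running offset."""
    adjusted = []
    offset = 0
    previous = None
    for raw in timestamps:
        if previous is not None and raw < previous:
            # The raw value went backwards: carry the previous raw value forward.
            offset += previous
        adjusted.append(raw + offset)
        previous = raw
    return adjusted


print(make_monotonic([100, 140056, 560, 900, 300]))
# [100, 140056, 140616, 140956, 141256]
```

The order of the input is preserved, so a matching quantity list can still be plotted against the adjusted values position by position.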
def recursive(start, end, datelist):
    results = ga.GAnalytics().create_query(profile_id,
                                           metrics,
                                           start,
                                           end,
                                           dimensions).execute()
    if results.get("containsSampledData") is True:
        x = len(datelist) / 2
        recursive(datelist[0],datelist[:x][-1],datelist[:x])
        recursive(datelist[x:][0],datelist[-1],datelist[x:])
    else:
        unsampled_date_ranges = []
        for x, y in start, end:
            unsampled_date_ranges.append((x, y))

recursive(start_date, end_date, date_list)
The function above takes a start date, an end date and an inclusive list of the dates between them. It first checks whether the data returned for the initial date range is sampled; if it is, the date range is split in half and each half is checked, and so on.
My issue is with the else statement. To make sure the function worked I tried print start + " - " + end, which returned the expected date ranges. Ideally, I would like the data to be returned as a list of tuples, so I tried the above, but unfortunately I get the error ValueError: too many values to unpack at the line for x, y in start, end:
What is the issue with my code in my else statement and how can I get it to return a list of tuples?
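For context, for x, y in start, end: iterates over the two-element tuple (start, end) and tries to unpack each date string into x and y, which is where the ValueError comes from. The splitting described above can be sketched so that each call returns its ranges and the caller concatenates them. This is a hedged illustration only: is_sampled is a hypothetical callback standing in for the Google Analytics query, and the demo uses plain integers as day numbers.

```python
def unsampled_ranges(dates, is_sampled):
    """Split the date list in half until each (start, end) range is unsampled."""
    start, end = dates[0], dates[-1]
    # A single date cannot be split further, so it is returned as-is.
    if len(dates) == 1 or not is_sampled(start, end):
        return [(start, end)]
    mid = len(dates) // 2
    return (unsampled_ranges(dates[:mid], is_sampled)
            + unsampled_ranges(dates[mid:], is_sampled))


def looks_sampled(start, end):
    """Demo stand-in: pretend any range spanning more than two days is sampled."""
    return end - start >= 2


print(unsampled_ranges(list(range(1, 9)), looks_sampled))
# [(1, 2), (3, 4), (5, 6), (7, 8)]
```

Returning the list from every branch, instead of building a fresh local list in the else branch, is what lets the results from the recursive calls reach the original caller.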