Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


common table expression - Determine contiguous dates in SQL gaps and islands

I have a situation where a single patient can receive multiple services. These services can have overlapping dates, and there can be gaps and islands between them. I am trying to write a query that shows each contiguous stretch of time during which the patient was receiving some kind of service.

Table is as follows:

CREATE TABLE #tt
(Patient    VARCHAR(10), StartDate DATETIME, EndDate DATETIME)
INSERT INTO #tt
VALUES
('Smith',   '2014-04-13',   '2014-06-04'),
('Smith',   '2014-05-07',   '2014-05-08'),
('Smith',   '2014-06-21',   '2014-09-19'),
('Smith',   '2014-08-27',   '2014-08-27'),
('Smith',   '2014-08-28',   '2014-09-19'),
('Smith',   '2014-10-30',   '2014-12-16'),
('Smith',   '2015-05-21',   '2015-07-03'),
('Smith',   '2015-05-22',   '2015-07-03'),
('Smith',   '2015-05-26',   '2015-11-30'),
('Smith',   '2015-06-25',   '2016-06-08'),
('Smith',   '2015-07-22',   '2015-10-22'),
('Smith',   '2016-08-11',   '2016-09-02'),
('Smith',   '2017-06-02',   '2050-01-01'),
('Smith',   '2017-12-22',   '2017-12-22'),
('Smith',   '2018-03-25',   '2018-06-30')

As you can see, many of the dates overlap. Ultimately what I want to see is the following results, which will show the dates where the patient was receiving at least one service, like so:

Patient     |StartDate        |EndDate
--------------------------------------
Smith       |2014-04-13       |2014-06-04
Smith       |2014-06-21       |2014-09-19
Smith       |2014-10-30       |2014-12-16
Smith       |2015-05-21       |2016-06-08
Smith       |2016-08-11       |2016-09-02
Smith       |2017-06-02       |2050-01-01

I've gotten bleary-eyed from looking at the various gaps-and-islands SQL code. I started out with this CTE, but obviously it isn't working, and if I wanted this result I could simply have used SELECT PHN, MIN(StartDate), MAX(EndDate).

WITH HCC_PAT 
AS 
(
    SELECT DISTINCT
    PHN,
    StartDate,
    EndDate,
    MIN (StartDate) OVER (  PARTITION BY  PHN ORDER BY StartDate
                                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PreviousStartDate,
    MAX (EndDate) OVER (    PARTITION BY  PHN ORDER BY EndDate
                                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PreviousEndDate 

FROM    #tt)

SELECT  DISTINCT --hcc_Pat.HCCClientKey,
        hcc_pat.PHN,
        hcc_pat.StartDate,
        ISNULL (LEAD (PreviousEndDate) OVER (PARTITION BY PHN ORDER BY ENDDATE), 'January 1, 2050') AS EndDate
FROM    HCC_PAT
WHERE   PreviousEndDate > StartDate 
AND     (StartDate < PreviousStartDate OR PreviousStartDate IS NULL)

Any help at this point would be greatly appreciated.

1 Reply

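One method spreads the dates out into one row per event, with an indicator of whether a service is starting (+1) or ending (-1). A cumulative sum of the indicator, taken per patient in date order, then counts how many services are active on each date. The rows where the cumulative sum drops to zero are where a contiguous period ends, and those rows define the different groups.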

The final step is aggregation:

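with d as (
      -- one row per event: +1 when a service starts, -1 when it ends
      select patient, startdate as dte, 1 as inc from #tt
      union all
      select patient, enddate as dte, -1 as inc from #tt
     ),
     dd as (
       -- net change per date, then a running count of active services per patient
       select patient, dte,
              sum(sum(inc)) over (partition by patient order by dte) as cume_inc
       from d
       group by patient, dte
      ),
     ddd as (
       -- counting the zeros from the latest date backwards gives every row of an island the same grp
       select dd.*, sum(case when cume_inc = 0 then 1 else 0 end) over (partition by patient order by dte desc) as grp
       from dd
      )
select patient, min(dte) as startdate, max(dte) as enddate
from ddd
group by patient, grp
order by patient, startdate;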

Here is a SQL Fiddle.
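
To make the running-sum idea easier to follow outside the database, here is a minimal Python sketch of the same technique. It is not part of the original answer: the function name merge_islands and the tuple layout are assumptions made for illustration, and dates are kept as ISO strings so that plain string sorting matches date order.

```python
from collections import defaultdict
from itertools import groupby


def merge_islands(rows):
    """Merge overlapping (patient, start, end) rows into contiguous islands.

    Hypothetical helper for illustration; dates are ISO 'YYYY-MM-DD' strings.
    """
    # Net change in active services per (patient, date):
    # +1 for every start, -1 for every end (the "group by patient, dte" step).
    delta = defaultdict(int)
    for patient, start, end in rows:
        delta[(patient, start)] += 1
        delta[(patient, end)] -= 1

    islands = []
    # Sorting by (patient, date) plays the role of
    # "partition by patient order by dte".
    events = sorted(delta.items())
    for patient, group in groupby(events, key=lambda kv: kv[0][0]):
        running = 0          # number of services currently active
        island_start = None  # first date of the island being built
        for (_, dte), inc in group:
            if island_start is None:
                island_start = dte
            running += inc
            if running == 0:  # no active service left: the island ends here
                islands.append((patient, island_start, dte))
                island_start = None
    return islands


# Sample data copied from the question.
rows = [
    ("Smith", "2014-04-13", "2014-06-04"),
    ("Smith", "2014-05-07", "2014-05-08"),
    ("Smith", "2014-06-21", "2014-09-19"),
    ("Smith", "2014-08-27", "2014-08-27"),
    ("Smith", "2014-08-28", "2014-09-19"),
    ("Smith", "2014-10-30", "2014-12-16"),
    ("Smith", "2015-05-21", "2015-07-03"),
    ("Smith", "2015-05-22", "2015-07-03"),
    ("Smith", "2015-05-26", "2015-11-30"),
    ("Smith", "2015-06-25", "2016-06-08"),
    ("Smith", "2015-07-22", "2015-10-22"),
    ("Smith", "2016-08-11", "2016-09-02"),
    ("Smith", "2017-06-02", "2050-01-01"),
    ("Smith", "2017-12-22", "2017-12-22"),
    ("Smith", "2018-03-25", "2018-06-30"),
]

for island in merge_islands(rows):
    print(*island, sep=" | ")
```

On the sample data this yields six islands, the first running from 2014-04-13 to 2014-06-04 and the last from 2017-06-02 to 2050-01-01. As in the SQL, a service that starts on the same date another one ends is merged into the same island, while a service that starts the following day begins a new one.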

