Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
456 views
in Technique[技术] by (71.8m points)

sql - Window functions to count distinct records

The query below is based on a complicated view and the view works as I want it to (I'm not going to include the view because I don't think it will help with the question at hand). What I can't get right is the drugCountsinFamilies column. I need it to show me the number of distinct drugNames for each drug family. You can see from the first screencap that there are three different H3A rows. The drugCountsInFamilies for H3A should be 3 (there are three different H3A drugs. )

enter image description here

You can see from the second screen cap that what's happening is the drugCountsInFamilies in the first screen cap is catching the number of rows that the drug name is listed on.
enter image description here

Below is my question, with comments on the part that is incorrect

select distinct
     rx.patid
    ,d2.fillDate
    ,d2.scriptEndDate
    ,rx.drugName
    ,rx.drugClass
    --the line directly below is the one that I can't figure out why it's wrong
    ,COUNT(rx.drugClass) over(partition by rx.patid,rx.drugclass,rx.drugname) as drugCountsInFamilies
from 
(
select 
    ROW_NUMBER() over(partition by d.patid order by d.patid,d.uniquedrugsintimeframe desc) as rn
    ,d.patid
    ,d.fillDate
    ,d.scriptEndDate
    ,d.uniqueDrugsInTimeFrame
    from DrugsPerTimeFrame as d
)d2
inner join rx on rx.patid = d2.patid
inner join DrugTable as dt on dt.drugClass=rx.drugClass
where d2.rn=1 and rx.fillDate between d2.fillDate and d2.scriptEndDate
and dt.drugClass in ('h3a','h6h','h4b','h2f','h2s','j7c','h2e')
order by rx.patid

SSMS gets mad if I try to add a distinct to the count(rx.drugClass) clause. Can it be done using window functions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I came across this question in search for a solution to my problem of counting distinct values. In searching for an answer I came across this post. See last comment. I've tested it and used the SQL. It works really well for me and I figured that I would provide another solution here.

In summary, using DENSE_RANK(), with PARTITION BY the grouped columns, and ORDER BY both ASC and DESC on the columns to count:

DENSE_RANK() OVER (PARTITION BY drugClass ORDER BY drugName ASC) +
DENSE_RANK() OVER (PARTITION BY drugClass ORDER BY drugName DESC) - 1 AS drugCountsInFamilies

I use this as a template for myself.

DENSE_RANK() OVER (PARTITION BY PartitionByFields ORDER BY OrderByFields ASC ) +
DENSE_RANK() OVER (PARTITION BY PartitionByFields ORDER BY OrderByFields DESC) - 1 AS DistinctCount

I hope this helps!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...