Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
270 views
in Technique[技术] by (71.8m points)

sql - How to select more than 1 record per day?

This is a postgresql problem.

PostgreSQL 8.3.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9).

The table looks like:

date_time           other_column
2012-11-01 00:00:00 ...
2012-11-02 01:00:00 ...
2012-11-02 02:00:00 ...
2012-11-02 03:00:00 ...
2012-11-02 04:00:00 ...
2012-11-03 05:00:00 ...
2012-11-03 06:00:00 ...
2012-11-05 00:00:00 ...
2012-11-07 00:00:00 ...
2012-11-07 00:00:00 ...
...

I want to select at most 3 records per day from a specific date range.

For example, I want to select at most 3 records from 2012-11-02 to 2012-11-05. The expected result would be:

date_time           other_column
2012-11-02 01:00:00 ...
2012-11-02 02:00:00 ...
2012-11-02 03:00:00 ...
2012-11-03 05:00:00 ...
2012-11-03 06:00:00 ...
2012-11-05 00:00:00 ...

I have spent a few hours on this and still cannot figure it out. Please help me. :(

UPDATE: The current sql I tried could only select one record per day:

SELECT DISTINCT ON (TO_DATE(SUBSTRING((date_time || '') FROM 1 FOR 10), 'YYYY-MM-DD')) *
FROM myTable
WHERE  date_time >=  '20121101 00:00:00'  
AND  date_time <= '20121130 23:59:59'
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The following answers all use date_trunc('day',date_time) or just cast to date to truncate a timestamp to a date. There's no need to jump through hoops with date formatting and strings. See Date/time functions in the manual.

This SQLFiddle shows three possible answers: http://sqlfiddle.com/#!12/0fd51/14, all of which produce the same result for the input data (but not necessarily the same result if date_time can have duplicates in it).

To solve your problem you could use a correlated subquery with a limit to generate an IN-list to filter on:

SELECT a.date_time, a.other_column
FROM table1 a
WHERE a.date_time IN (
  SELECT b.date_time
  FROM table1 b
  WHERE b.date_time IS NOT NULL
    AND a.date_time::date = b.date_time::date
  ORDER BY b.date_time
  LIMIT 3
)
AND a.date_time::date BETWEEN '2012-11-02' AND '2012-11-05';

This should be the most portable approach - though it won't work with MySQL (at least as of 5.5) because MySQL doesn't support LIMIT in a subquery used in an IN clause. It works in SQLite3 and PostgreSQL, though, and should work in most other DBs.

Another option would be to select the range of dates you wanted, annotate the rows within the range with a row number using a window function, then filter the output to exclude excess rows:

SELECT date_time, other_column
FROM (
  SELECT 
    date_time, 
    other_column, 
    rank() OVER (PARTITION BY date_trunc('day',date_time) ORDER BY date_time) AS n
  FROM Table1
  WHERE date_trunc('day',date_time) BETWEEN '2012-11-02' AND '2012-11-05'
  ORDER BY date_time
) numbered_rows
WHERE n < 4;

If ties are a possibility, ie if date_time is not unique, then consider using either the rank or dense_rank window functions instead of row_number to get deterministic results, or add an additional clause to the ORDER BY in row_number to break the tie.

If you use rank then it'll include none of the rows if it can't fit all of them in; if you use dense_rank it'll include all of them even if it has to go over the 3-row-per-day limit to do so.

All sorts of other processing are possible this way too, using the window specification.


Here's yet another formulation that uses array aggregation and slicing, which is completely PostgreSQL specific but fun.

SELECT b.date_time, b.other_column 
FROM (
  SELECT array_agg(a.date_time ORDER BY a.date_time)
  FROM table1 a
  WHERE a.date_time::date BETWEEN '2012-11-02' 
    AND '2012-11-05'
  GROUP BY a.date_time::date
) x(arr) 
INNER JOIN table1 b ON (b.date_time = ANY (arr[1:3]));

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...