Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


data.table - R: How to determine whether dates are in the same week?

I want to create a new column that identifies which dates fall in the same week.

A data.table DATA_SET contains date information, like:

DATA_SET <- data.table(transday = seq(from = Sys.Date() - 64, to = Sys.Date(), by = 1))

For example, '2017-03-01' and '2017-03-02' are in the same week. '2017-03-01' and '2017-03-08' are both Wednesdays, but they are not in the same week.

If "2016-01-01" is in the first week of 2016 and "2017-01-01" is in the first week of 2017, both get the value 1, but they are not in the same week. So I want a unique value that specifies "the same week".



1 Reply


The answer to this question depends strongly on

  • the definition of the first day of the week (usually Sunday or Monday) and
  • the numbering of the weeks within the year (starting with the first Sunday, Monday, or Thursday of the year, or on 1st January, etc.).
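
To illustrate the first point, here is a minimal sketch in base R. It takes a Sunday and the Monday that follows it (the two dates are my own choice for illustration) and compares the %U format (weeks start on Sunday) with the %W format (weeks start on Monday). Whether the two days land in the same week depends only on that convention.

```r
# A Sunday and the following Monday (dates chosen for illustration only)
d <- as.Date(c("2017-03-05", "2017-03-06"))

iso_weekday  <- format(d, "%u")  # ISO weekday as text: "1" = Monday ... "7" = Sunday
sunday_start <- format(d, "%U")  # week of year, weeks starting on Sunday
monday_start <- format(d, "%W")  # week of year, weeks starting on Monday

# TRUE if both days share a week under the given convention
same_week_sunday <- sunday_start[1] == sunday_start[2]
same_week_monday <- monday_start[1] == monday_start[2]
```

With Sunday as the first day, both dates fall in the same week. With Monday as the first day, the Sunday still belongs to the previous week.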

A selection of different options can be seen from the example below:

      dates  isoweek day week_iso week_US week_UK DT_week DT_iso lub_week lub_iso   cut.Date
 2015-12-25 2015-W52   5 2015-W52      51      51      52     52       52      52 2015-12-21
 2015-12-26 2015-W52   6 2015-W52      51      51      52     52       52      52 2015-12-21
 2015-12-27 2015-W52   7 2015-W52      52      51      52     52       52      52 2015-12-21
 2015-12-28 2015-W53   1 2015-W53      52      52      52     53       52      53 2015-12-28
 2015-12-29 2015-W53   2 2015-W53      52      52      52     53       52      53 2015-12-28
 2015-12-30 2015-W53   3 2015-W53      52      52      53     53       52      53 2015-12-28
 2015-12-31 2015-W53   4 2015-W53      52      52      53     53       53      53 2015-12-28
 2016-01-01 2015-W53   5 2015-W53      00      00       1     53        1      53 2015-12-28
 2016-01-02 2015-W53   6 2015-W53      00      00       1     53        1      53 2015-12-28
 2016-01-03 2015-W53   7 2015-W53      01      00       1     53        1      53 2015-12-28
 2016-01-04 2016-W01   1 2016-W01      01      01       1      1        1       1 2016-01-04
 2016-01-05 2016-W01   2 2016-W01      01      01       1      1        1       1 2016-01-04
 2016-01-06 2016-W01   3 2016-W01      01      01       1      1        1       1 2016-01-04
 2016-01-07 2016-W01   4 2016-W01      01      01       2      1        1       1 2016-01-04
 2016-01-08 2016-W01   5 2016-W01      01      01       2      1        2       1 2016-01-04

which is created by this code:

library(data.table)

# 15 consecutive days centred on New Year's Day 2016
dates <- as.Date("2016-01-01") + (-7:7)
print(data.table(
  dates,
  isoweek   = ISOweek::ISOweek(dates),        # ISO 8601 year and week
  day       = ISOweek::ISOweekday(dates),     # ISO weekday, 1 = Monday
  week_iso  = format(dates, "%G-W%V"),        # same as isoweek, via base R
  week_US   = format(dates, "%U"),            # weeks start on Sunday
  week_UK   = format(dates, "%W"),            # weeks start on Monday
  DT_week   = data.table::week(dates),
  DT_iso    = data.table::isoweek(dates),
  lub_week  = lubridate::week(dates),
  lub_iso   = lubridate::isoweek(dates),
  cut.Date  = cut.Date(dates, "week")         # first day (Monday) of the week
), row.names = FALSE)

The format YYYY-Www used in some of the columns is one of the ISO 8601 week formats. It includes the year, which is required to distinguish weeks in different years as requested by the OP.
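
A small sketch of why the year matters, using base R only. The three dates are picked for illustration, and the sketch assumes the platform's format() supports %G and %V for output. Two dates one year apart can share the same bare week number, while the YYYY-Www key keeps them apart.

```r
# Dates chosen for illustration: two Wednesdays one year apart, plus the next day
d <- as.Date(c("2016-03-02", "2017-03-01", "2017-03-02"))

bare_week <- format(d, "%V")      # ISO week number only, no year
week_key  <- format(d, "%G-W%V")  # ISO week-numbering year plus week number

# The bare number collides across years, the year-week key does not
collides_without_year <- bare_week[1] == bare_week[2]
distinct_with_year    <- week_key[1] != week_key[2]
same_week_2017        <- week_key[2] == week_key[3]
```

The key is a plain character string, so it can be used directly as a grouping column.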

The ISO week definition is the only format which ensures that each week always consists of 7 days, also across New Year. The other definitions may start or end the year with "weeks" of fewer than 7 days. Due to the seamless partitioning of the year, the ISO week-numbering year is slightly different from the traditional Gregorian calendar year, e.g., 2016-01-01 belongs to the last ISO week 53 of 2015 (2015-W53).
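
The 7-day property can be checked with a short sketch in base R. The date range below (three full Monday-to-Sunday weeks around New Year 2016) is my own choice, and the "%Y-%U" key is just one example of a non-ISO definition.

```r
# Three full Monday-to-Sunday weeks around New Year 2016
days <- seq(as.Date("2015-12-21"), as.Date("2016-01-10"), by = "day")

# Number of days per week under each definition
iso_counts <- table(format(days, "%G-W%V"))  # ISO year and ISO week
us_counts  <- table(format(days, "%Y-%U"))   # calendar year and Sunday-based week

all_iso_weeks_full <- all(iso_counts == 7L)
some_us_week_short <- any(us_counts < 7L)
```

Under the ISO key every group has exactly 7 days. Under the calendar-year key the week that spans New Year is split into two shorter groups.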

As suggested here, cut.Date() might be the best option for the OP.
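
Applied to the OP's case, this could look like the following sketch. It uses a small data.table with fixed dates taken from the question instead of Sys.Date(), so that the result is reproducible. The column name week_start is my own choice. cut() on a Date vector with "week" labels each date with the Monday that starts its week.

```r
library(data.table)

# Fixed dates from the question (instead of Sys.Date()) for reproducibility
DATA_SET <- data.table(
  transday = as.Date(c("2016-01-01", "2017-01-01",
                       "2017-03-01", "2017-03-02", "2017-03-08"))
)

# Label each date with the first day (Monday) of its week.
# cut() returns a factor, so convert the labels back to Date.
DATA_SET[, week_start := as.Date(as.character(cut(transday, "week")))]

# Dates in the same week share the same week_start
DATA_SET[, n_in_week := .N, by = week_start]
```

Because week_start is a full date, it is unique across years and can serve as the "same week" identifier. If weeks should start on Sunday, cut() accepts start.on.monday = FALSE.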

Disclosure: I'm the maintainer of the ISOweek package, which was published at a time when strptime() did not recognize the %G and %V format specifications for output in the Windows versions of R. (Even today, they are not recognized on input.)

