R - Generalizing dplyr aggregation for hierarchical data


posted Oct 6, 2021 in Technique by 深蓝 (71.8m points)

I am working with a hierarchical dataset and I need to solve one issue:


library(tidyverse)
library(matrixStats)


df <- 
  tibble(
  LV = c('0.1', '0.1.1', '0.1.2', '0.1.2.1'),
  A = c(0.5, 1.2, 20000, 100),
  B = c(192, 18, 18, 5)
)

df_step1 <-
  df %>%
  mutate(SUB = str_count(LV, "[.]"),
         MAX_DEPTH = max(SUB)) %>%
  fastDummies::dummy_cols(select_columns = 'SUB') %>%
  # funs() is deprecated in dplyr, so a formula-style lambda is used instead
  mutate_at(vars(starts_with('SUB')), ~ . * A / B) %>%
  rename(HEAD = SUB_1) %>%
  mutate(HEAD = HEAD[1]) %>%
  mutate_at(vars(starts_with('SUB')), ~ifelse(.== 0, 1,.))


# Here is the issue: the script writes 1 to this position
df_step1 %>% 
  .[3, 8]

There are 3 sub-hierarchies in the data.

The second and third rows are children of level 0.1, and row 4 (0.1.2.1) is a child of 0.1.2, which makes it a grandchild of 0.1.

In row 3, column 8, the current script calculates the metric as 1, but it should really be 20000/18, because row 3 (0.1.2) is at depth 2 and has its own sub-hierarchy (a child at depth 3).
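
One possible reading of that requirement, sketched below under stated assumptions: a row's A/B ratio goes into the column for its own depth and into the columns for every depth down to its deepest descendant, and every other cell gets the neutral value 1. The helper columns RATIO and DEEPEST are hypothetical names introduced only for this illustration, and the HEAD column from the original script is left out. If only direct children should count, the condition would need to be narrowed.

```r
library(dplyr)
library(stringr)
library(tibble)

df <- tibble(
  LV = c("0.1", "0.1.1", "0.1.2", "0.1.2.1"),
  A  = c(0.5, 1.2, 20000, 100),
  B  = c(192, 18, 18, 5)
)

out <- df %>%
  mutate(
    SUB   = str_count(LV, "[.]"),   # depth of each row
    RATIO = A / B,                  # hypothetical helper column
    # DEEPEST (hypothetical): deepest level found at or below each row.
    # Appending "." to the prefix keeps "0.1.1" from matching "0.1.10".
    DEEPEST = vapply(
      LV,
      function(p) max(SUB[LV == p | startsWith(LV, paste0(p, "."))]),
      integer(1),
      USE.NAMES = FALSE
    )
  )

# One column per depth: the row's ratio where the row sits at that depth
# or has a descendant there, and 1 everywhere else (assumed rule)
for (d in sort(unique(out$SUB))) {
  out[[paste0("SUB_", d)]] <- if_else(out$SUB <= d & d <= out$DEEPEST, out$RATIO, 1)
}

# Row 3 (0.1.2) now carries 20000 / 18 in the depth-3 column
out$SUB_3[3]
```

Under this rule the top-level row 0.1 also fills every depth column, since all other rows sit below it.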

Is there any way to automate this in dplyr?

I was initially thinking about making some string and substring checks using:

str = '12.1.1'
substring(str, 1, str_length(str)-2)
str_detect(str, substring(str, 1, str_length(str)-2))
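
Note that trimming a fixed two characters only works while the last segment is a single character (for '12.1.10' it would leave '12.1.'). A minimal sketch of a more general variant is below. It assumes levels are always separated by dots, and the names hier, PARENT, DEPTH and is_descendant are hypothetical, introduced only for this example.

```r
library(dplyr)
library(stringr)
library(tibble)

df <- tibble(
  LV = c("0.1", "0.1.1", "0.1.2", "0.1.2.1"),
  A  = c(0.5, 1.2, 20000, 100),
  B  = c(192, 18, 18, 5)
)

hier <- df %>%
  mutate(
    DEPTH  = str_count(LV, "[.]"),
    # Drop the last ".segment", whatever its length, to get the parent label
    PARENT = str_remove(LV, "[.][^.]*$"),
    # Rows whose parent label is not present in the data have no parent
    PARENT = if_else(PARENT %in% LV, PARENT, NA_character_)
  )

# Hypothetical helper: TRUE when `child` lies anywhere below `parent`.
# startsWith() compares literal text, so "." is not treated as a regex wildcard.
is_descendant <- function(child, parent) startsWith(child, paste0(parent, "."))

is_descendant("0.1.2.1", "0.1.2")   # grandchild of 0.1, child of 0.1.2
is_descendant("0.1.10", "0.1.1")    # a sibling with a longer label, not a child
```

With PARENT in place, a parent's A and B could be brought onto each child row with a self-join on PARENT = LV, if that is the aggregation needed.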

Question from: https://stackoverflow.com/questions/66049359/generalizing-dplyr-aggregation-for-hierarchical-data

