My dummy data:
| adv_id | dpa_id | json |
| -------| -------|----------------------------------------------|
| 1 | 21 |{'adv_name':'Tesla', 'product_name':'Model X'}|
| 1 | 21 |{'adv_name':'Tesla', 'product_name':'Model Y'}|
| 2 | 22 |{'adv_name':'BMW', 'product_name':'X5'} |
The problem is our hive system can only retrive 1 million rows of data at most. I need to bind the json together into a bigger one for every adv_id, dpa_id combo. Below is the output that I want:
adv_id |
dpa_id |
json |
1 |
21 |
[{'adv_name':'Tesla', 'product_name':'Model X'}, {'adv_name':'Tesla', 'product_name':'Model Y'}] |
2 |
22 |
[{'adv_name':'BMW', 'product_name':'X5'}] |
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…