Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
263 views
in Technique[技术] by (71.8m points)

sql - bind column values by group using hiveql

My dummy data:

| adv_id | dpa_id | json                                         |
| -------| -------|----------------------------------------------|
| 1      | 21     |{'adv_name':'Tesla', 'product_name':'Model X'}|
| 1      | 21     |{'adv_name':'Tesla', 'product_name':'Model Y'}|
| 2      | 22     |{'adv_name':'BMW', 'product_name':'X5'}       |

The problem is our hive system can only retrive 1 million rows of data at most. I need to bind the json together into a bigger one for every adv_id, dpa_id combo. Below is the output that I want:

adv_id dpa_id json
1 21 [{'adv_name':'Tesla', 'product_name':'Model X'}, {'adv_name':'Tesla', 'product_name':'Model Y'}]
2 22 [{'adv_name':'BMW', 'product_name':'X5'}]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can do a group by and collect_list:

select adv_id, dpa_id, collect_list(json) as json
from table
group by adv_id, dpa_id;

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...