Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
568 views
in Technique[技术] by (71.8m points)

hiveql - How to use Hive Query results(multiple) in a variable for other query

I have two tables one is schools and one is students.I want to find all the students of a particular school. The schema of schools is: id, name, location and of students is :id, name, schoolId. I wrote the following script:

schoolId=$(hive -e "set hive.cli.print.header=false;select id from school;")
 hive -hiveconf "schoolId"="$schoolId" 

hive>select id,name from student where schoolId like  '${hiveconf:schoolId}%'

I dont get any result as schoolId stores all the id together.For example there are 3 schools with id: 123, 256,346 schoolId variable stores as 123 256 346 and the result is null.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Use collect_set() with concat_ws to get comma delimited string, IDs should be cast to string:

schoolId=$(hive -e "set hive.cli.print.header=false;select concat_ws('\',\'',collect_set(cast(id as string))) from school;");

hive -hiveconf "schoolId"="$schoolId" 

Then use IN operator:

select id,name from student where schoolId in ('${hiveconf:schoolId}');

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...