Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

Apache Spark - PySpark: append to an executor environment variable

Is it possible to append a value to the PYTHONPATH of a worker in Spark?

I know I could log in to each worker node and set it in the spark-env.sh file, but I want a more flexible approach.

I am trying to use the setExecutorEnv method, but without success:

from pyspark import SparkConf

conf = (SparkConf().setMaster("spark://192.168.10.11:7077")
        .setAppName('myname')
        .set("spark.cassandra.connection.host", "192.168.10.11")
        .setExecutorEnv('PYTHONPATH', '$PYTHONPATH:/custom_dir_that_I_want_to_append/'))

This creates a pythonpath environment variable on each executor: the name is forced to lowercase, and $PYTHONPATH is not expanded, so nothing is appended to the existing value.

I end up with two different environment variables:

pythonpath  :  $PYTHONPATH:/custom_dir_that_I_want_to_append
PYTHONPATH  :  /old/path/to_python

The first one is dynamically created and the second one already existed before.

Does anyone know how to do it?


1 Reply


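I figured it out myself.

The problem is not in Spark but in ConfigParser, which lowercases option names by default. That is where the lowercase pythonpath came from.

Following another answer, I changed the ConfigParser setup so that it always preserves case.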
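A minimal sketch of that kind of fix, assuming Python 3's configparser module (the original post may have used the Python 2 ConfigParser module, and its config file is not shown). The [executor_env] section name and the sample content are hypothetical, chosen only to illustrate the case handling:

```python
import configparser

# Hypothetical config content; the real file from the post is not shown.
SAMPLE = """
[executor_env]
PYTHONPATH = /custom_dir_that_I_want_to_append/
"""

# Default parser: option names go through optionxform, which lowercases them.
default_parser = configparser.ConfigParser()
default_parser.read_string(SAMPLE)
default_keys = list(default_parser["executor_env"])

# Case-preserving parser: replace optionxform with str before reading.
preserving_parser = configparser.ConfigParser()
preserving_parser.optionxform = str
preserving_parser.read_string(SAMPLE)
preserved_keys = list(preserving_parser["executor_env"])

print(default_keys)    # ['pythonpath']
print(preserved_keys)  # ['PYTHONPATH']
```

The override has to be set before the file is read, because option names are transformed at parse time.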

After this, I found that Spark's default behavior is to append the value to the existing worker environment variable when one with the same name already exists.

So it is not necessary to reference $PYTHONPATH with the dollar sign; passing only the directory to append is enough:

.setExecutorEnv('PYTHONPATH', '/custom_dir_that_I_want_to_append/')
