
ipython notebook - Spark context 'sc' not defined

I am new to Spark and I am trying to set up PySpark by following the site below.

http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-pyspark/

I tried both the prebuilt package and building the Spark package myself through SBT.

When I try to run Python code in the IPython Notebook I get the error below.

    NameError                                 Traceback (most recent call last)
    <ipython-input-1-f7aa330f6984> in <module>()
          1 # Check that Spark is working
    ----> 2 largeRange = sc.parallelize(xrange(100000))
          3 reduceTest = largeRange.reduce(lambda a, b: a + b)
          4 filterReduceTest = largeRange.filter(lambda x: x % 7 == 0).sum()
          5

    NameError: name 'sc' is not defined

In the command window I see the error below.

    Failed to find Spark assembly JAR.
    You need to build Spark before running this program.

Note that I do get a Scala prompt when I execute the spark-shell command.

Update:

With the help of a friend I was able to fix the Spark assembly JAR issue by correcting the contents of the .ipython/profile_pyspark/startup/00-pyspark-setup.py file.

Now the only remaining problem is the Spark context variable, so I have changed the title to reflect my current issue.
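For reference, the startup script from that tutorial looks roughly like the sketch below; the exact paths and the py4j zip name are assumptions and vary with the Spark version:

    # 00-pyspark-setup.py -- rough sketch of a typical IPython startup script
    import os
    import sys

    # Assumes SPARK_HOME points at the Spark installation directory
    spark_home = os.environ.get('SPARK_HOME')
    if not spark_home:
        raise ValueError('SPARK_HOME environment variable is not set')

    # Make pyspark and its bundled py4j importable (zip name varies by version)
    sys.path.insert(0, os.path.join(spark_home, 'python'))
    sys.path.insert(0, os.path.join(spark_home, 'python', 'lib', 'py4j-0.8.2.1-src.zip'))

    # shell.py builds the SparkContext and exposes it as `sc`
    exec(open(os.path.join(spark_home, 'python', 'pyspark', 'shell.py')).read())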

1 Reply

You need to do the following after you have pyspark on your path:

    from pyspark import SparkContext

    # Create the Spark context yourself when the shell has not defined one
    sc = SparkContext()
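As a quick sanity check (mirroring the test from the question), something like this should run once sc exists:

    # Sum of 0..99999; should print 4999950000
    largeRange = sc.parallelize(range(100000))
    print(largeRange.reduce(lambda a, b: a + b))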
