Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Updating pandas to version 0.19 in Azure ML Studio

I would really like to get access to some of the updated functions in pandas 0.19, but Azure ML Studio uses pandas 0.18 as part of the Anaconda 4.0 bundle. Is there a way to update the version that is used within the "Execute Python Script" modules?


1 Reply

The steps below show how to update the version of the pandas library used in Execute Python Script.

Step 1 : Use the virtualenv tool to create an independent Python runtime environment on your system. If you don't have it, install it first with the command pip install virtualenv.

If the installation succeeded, you will see it in the Scripts folder of your Python installation.

Step 2 : Run the command to create an independent Python runtime environment.

Step 3 : Go into the Scripts folder of the created directory and activate the environment (this step is important, don't skip it).

Keep this command window open and run pip install pandas==0.19 in it, so that the library and its dependencies are downloaded into the new environment.
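Steps 2 and 3 can also be scripted. The sketch below is only an illustration, not the commands from the original screenshots: the environment name pandas-env and the helper names build_commands and run_commands are hypothetical, and it assumes a Windows layout with a Scripts folder, as described above. It calls the environment's own pip directly, which has the same effect as activating the environment and then running pip install.

```python
import subprocess
import sys
from pathlib import PureWindowsPath


def build_commands(env_dir="pandas-env", pandas_version="0.19"):
    """Return the commands for Steps 2 and 3 as argument lists.

    env_dir is a hypothetical environment name chosen for illustration.
    """
    # Step 2: create the isolated environment with virtualenv.
    create_env = [sys.executable, "-m", "virtualenv", env_dir]
    # Step 3: use the pip that lives inside the environment, so the
    # packages land in that environment's Lib/site-packages folder.
    env_pip = str(PureWindowsPath(env_dir) / "Scripts" / "pip.exe")
    install = [env_pip, "install", "pandas==" + pandas_version]
    return [create_env, install]


def run_commands(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_commands(build_commands()) from the folder where the environment should live would then perform both steps; it needs virtualenv installed (Step 1) and network access.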

Step 4 : Compress all of the files in the environment's Lib/site-packages folder into a zip package (I'm calling it pandas - package here). The packages should sit at the top level of the zip, not inside a nested folder.
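If you prefer to build the zip from a script rather than by hand, something like the following should work. It is a minimal sketch using only the standard library; the function name zip_site_packages and both path arguments are assumptions for illustration. It stores each file relative to site-packages, so that folders such as pandas end up at the root of the archive.

```python
import os
import zipfile


def zip_site_packages(site_packages_dir, zip_path):
    """Zip the contents of site_packages_dir with packages at the archive root."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subfolders, filenames in os.walk(site_packages_dir):
            for filename in filenames:
                full_path = os.path.join(folder, filename)
                # Store paths relative to site-packages so "pandas/" is a
                # top-level entry rather than "Lib/site-packages/pandas/".
                relative = os.path.relpath(full_path, site_packages_dir)
                archive.write(full_path, relative)
    return zip_path
```

Write the zip file to a location outside the site-packages folder, otherwise the archive would try to include itself.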

Step 5 : Upload the zip package to the Azure Machine Learning workspace as a dataset.

For the specific steps, please refer to the Technical Notes.

After the upload succeeds, you will see the package in the dataset list.

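Step 6 : Connect the uploaded dataset to the Script Bundle input port of the Execute Python Script module, so that its contents are unpacked into the .\Script Bundle directory. Then, before the definition of the method azureml_main, remove the old pandas modules and their dependencies from sys.modules and import pandas again, as in the code below.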

import sys
import pandas as pd

# Version bundled with the platform, before the swap.
print(pd.__version__)

# Put the unzipped bundle first on the import path
# (raw string, so the backslash is not read as an escape).
sys.path.insert(0, r'.\Script Bundle')

# Drop the cached pandas modules and their dependencies, submodules included,
# so that the next import is resolved from the bundle instead of the cache.
stale = ('pandas', 'numpy', 'pytz', 'six', 'dateutil')
for name in [m for m in sys.modules if m.split('.')[0] in stale]:
    del sys.modules[name]

import pandas as pd
print(pd.__version__)

# The entry point function can contain up to two input arguments:
#   Param<dataframe1>: a pandas.DataFrame
#   Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):
    # Return value must be a sequence of pandas.DataFrame
    return dataframe1,

You can then see the result in the logs, as below: the script first prints the old version (0.14.0 in this run), then the new version 0.19.0 loaded from the uploaded zip file.

[Information]         0.14.0
[Information]         0.19.0
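The reason the deletion is needed is that Python caches every imported module in sys.modules, so a plain import pandas would keep returning the already loaded old version even after sys.path has changed. The purge can be wrapped in a small helper, sketched below; the name purge_modules is hypothetical and not part of Azure ML or pandas.

```python
import sys


def purge_modules(top_level_names):
    """Remove the named packages and all their submodules from sys.modules.

    Returns the sorted list of module names that were removed.
    """
    targets = set(top_level_names)
    # Match "pandas" itself as well as "pandas.core", "pandas.io", and so on,
    # but not unrelated packages whose names merely start with "pandas".
    stale = [name for name in sys.modules if name.split(".")[0] in targets]
    for name in stale:
        del sys.modules[name]
    return sorted(stale)
```

With this helper, the purge in the script would become purge_modules(['pandas', 'numpy', 'pytz', 'six', 'dateutil']), called before the second import pandas.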

You could also refer to these threads: "Access blob file using time stamp in Azure" and "reload with reset".

Hope it helps you.
