Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - fetch_mldata: how to manually set up MNIST dataset when source server is down?

I need to run code that contains these lines:

from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')

Executing it fails with the following error:

TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

Since the code tries to download something and my internet connection works fine, I assume that the server it wants to access is down.

How can I set it up manually?


1 Reply


By default, fetch_mldata checks '~/scikit_learn_data/mldata' to see whether the dataset has already been downloaded.

According to the source code:

    # if the file does not exist, download it
    if not exists(filename):
        urlname = MLDATA_BASE_URL % quote(dataname)

So in your case, it will check the location

~/scikit_learn_data/mldata/mnist-original.mat

and if not found, it will download from

http://mldata.org/repository/data/download/matlab/mnist-original.mat

which is currently down, as you suspected.
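The cache check described above can be sketched in a few lines. This is only an approximation of the logic, not the library's exact code: `local_mldata_path` is a hypothetical helper, and the name normalisation (lowercase, spaces turned into hyphens, parentheses and dots dropped) is an assumption inferred from how 'MNIST original' maps to mnist-original.mat.

```python
import os
import re


def mldata_filename(dataname):
    """Assumed normalisation: 'MNIST original' -> 'mnist-original'."""
    dataname = dataname.lower().replace(" ", "-")
    return re.sub(r"[().]", "", dataname)


def local_mldata_path(dataname, data_home="~/scikit_learn_data"):
    """Hypothetical helper: path where the cached .mat file is expected."""
    folder = os.path.join(os.path.expanduser(data_home), "mldata")
    return os.path.join(folder, mldata_filename(dataname) + ".mat")


path = local_mldata_path("MNIST original")
print(path)
# If the file is missing, fetch_mldata would try to download it.
print("already cached" if os.path.exists(path) else "not cached yet")
```

If the second line printed is "not cached yet", the file still has to be put at the printed path by hand.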

So you can download the dataset from another location instead, for example:

https://github.com/amplab/datascience-sp14/blob/master/lab7/mldata/mnist-original.mat

and place it in the folder above (~/scikit_learn_data/mldata/).

After that, when you run fetch_mldata(), it should pick up the downloaded dataset without connecting to mldata.org.

Update:

Here ~ refers to the user's home folder. You can use the following code to find the default location of the scikit_learn_data folder on your system:

from sklearn.datasets import get_data_home
print(get_data_home())
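Once the file has been downloaded by hand, copying it into place can also be scripted. The following is a minimal sketch, not part of scikit-learn: `install_mldata_file` and `default_data_home` are hypothetical helpers, and the fallback location (the SCIKIT_LEARN_DATA environment variable, otherwise ~/scikit_learn_data) is an assumption about what get_data_home() returns by default.

```python
import os
import shutil


def default_data_home():
    """Use scikit-learn's get_data_home() when available, else an assumed default."""
    try:
        from sklearn.datasets import get_data_home
        return get_data_home()
    except ImportError:
        # Assumption: mirrors the usual default of get_data_home().
        fallback = os.path.join("~", "scikit_learn_data")
        return os.path.expanduser(os.environ.get("SCIKIT_LEARN_DATA", fallback))


def install_mldata_file(downloaded_file, data_home=None, name="mnist-original.mat"):
    """Hypothetical helper: copy a downloaded .mat file into <data_home>/mldata."""
    if data_home is None:
        data_home = default_data_home()
    target_dir = os.path.join(data_home, "mldata")
    os.makedirs(target_dir, exist_ok=True)  # create the cache folder if needed
    target = os.path.join(target_dir, name)
    shutil.copyfile(downloaded_file, target)
    return target
```

Calling install_mldata_file with the path of your downloaded copy returns the location it was copied to; a later fetch_mldata('MNIST original') should then find the file there.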
