Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - sklearn doesn't have attribute 'datasets'

I have started using scikit-learn for my work, and I was going through the tutorial, which gives the standard procedure for loading some datasets:

$ python
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()

However, for my convenience, I tried loading the data in the following way:

In [1]: import sklearn

In [2]: iris = sklearn.datasets.load_iris()

However, this throws the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-db77d2036db5> in <module>()
----> 1 iris = sklearn.datasets.load_iris()

AttributeError: 'module' object has no attribute 'datasets'

However, if I use the apparently similar method:

In [3]: from sklearn import datasets

In [4]: iris = datasets.load_iris()

It works without a problem. In fact, after that import the following also works:

In [5]: iris = sklearn.datasets.load_iris()

I am completely confused about this. Am I missing something very trivial? What is the difference between the two approaches?


1 Reply


sklearn is a package. This answer said it very succinctly:

when you import a package, only variables/functions/classes in the __init__.py file of that package are directly visible, not sub-packages or modules.

datasets is a sub-package of sklearn. This is why this happens:

In [1]: import sklearn

In [2]: sklearn.datasets
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-325a2bfc35d0> in <module>()
----> 1 sklearn.datasets

AttributeError: module 'sklearn' has no attribute 'datasets'

However, the reason why this works:

In [3]: from sklearn import datasets

In [4]: sklearn.datasets
Out[4]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>

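is that when you load the sub-package datasets by running from sklearn import datasets, the import system automatically adds it to the namespace of the package sklearn, so sklearn.datasets resolves from then on. Running import sklearn.datasets has the same effect. This is one of the lesser-known "traps" of the Python import system.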
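The binding behaviour can be reproduced without scikit-learn installed. The sketch below builds a throwaway package on disk; the names demo_pkg and sub are hypothetical and exist only for this illustration. It assumes the package's __init__.py imports nothing, which mirrors the situation shown in the traceback above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create demo_pkg/__init__.py and demo_pkg/sub.py in a temporary directory.
# Both names are made up for this example.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")       # empty: imports no submodules
(pkg_dir / "sub.py").write_text("VALUE = 42\n")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

before = hasattr(demo_pkg, "sub")   # submodule has not been imported yet
from demo_pkg import sub            # imports demo_pkg.sub and binds it on demo_pkg
after = hasattr(demo_pkg, "sub")

print(before, after)                # False True
print(demo_pkg.sub.VALUE)           # 42
```

For the original problem, the practical fix is to import the sub-package explicitly (import sklearn.datasets, or from sklearn import datasets) before touching sklearn.datasets, rather than relying on some other import having loaded it first.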

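Also, note that if you look at the __init__.py for sklearn, you will see 'datasets' as a member of __all__. Being listed there does not import the sub-package on a plain import sklearn; it only controls what a star import brings in, which allows you to do: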

In [1]: from sklearn import *
In [2]: datasets
Out[2]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>
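The effect of __all__ can be checked the same way. This is a minimal sketch using a hypothetical throwaway package named demo_all_pkg with one submodule, sub. The star import is run through exec only so that the example also works when pasted inside a function, since from ... import * is allowed only at module level.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create demo_all_pkg/__init__.py (listing 'sub' in __all__) and demo_all_pkg/sub.py.
# Both names are made up for this example.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_all_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("__all__ = ['sub']\n")
(pkg_dir / "sub.py").write_text("VALUE = 42\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_all_pkg

# Listing a submodule in __all__ does not import it on a plain package import.
listed_but_not_loaded = not hasattr(demo_all_pkg, "sub")

# A star import, however, imports every submodule named in __all__.
namespace = {}
exec("from demo_all_pkg import *", namespace)
star_imported = "sub" in namespace

print(listed_but_not_loaded, star_imported)   # True True
```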

One last point to note is that if you inspect either sklearn or datasets you will see that, although they are packages, their type is module. This is because all packages are considered modules - however, not all modules are packages.
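One way to see the distinction is to check for the __path__ attribute, which packages have and plain modules do not. The sketch below uses two standard-library examples, json (a package) and os (a plain module); the helper name is_package is introduced here for illustration only.

```python
import json    # a package: implemented as json/__init__.py
import os      # a plain module: implemented as os.py
import types


def is_package(mod):
    """Return True if the module object is a package (it has __path__)."""
    return hasattr(mod, "__path__")


# Both are instances of the same type...
print(type(json) is types.ModuleType, type(os) is types.ModuleType)  # True True
# ...but only the package carries a __path__ for locating submodules.
print(is_package(json), is_package(os))                              # True False
```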

