Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - GridSearch over MultiOutputRegressor?

Consider a multivariate (multi-output) regression problem with two response variables: latitude and longitude. Some machine learning model implementations, such as support vector regression (sklearn.svm.SVR), do not natively support multiple outputs. For these models, sklearn.multioutput.MultiOutputRegressor can be used as a wrapper.

Example:

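from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Wrap SVR so that one regressor is fitted per response variable
svr_multi = MultiOutputRegressor(SVR(), n_jobs=-1)

# Fit the model on the training data, then predict on the test set
svr_multi.fit(X_train, y_train)
y_pred = svr_multi.predict(X_test)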
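
To make the wrapper's behaviour concrete, here is a minimal sketch on synthetic data. The arrays X_demo and y_demo are made-up stand-ins for the real training data, not part of the original question. It shows that the fitted wrapper holds one SVR per response column and predicts both columns at once.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Hypothetical stand-in data: 100 samples, 3 features, 2 response columns.
rng = np.random.RandomState(0)
X_demo = rng.rand(100, 3)
y_demo = np.column_stack([
    X_demo[:, 0] + X_demo[:, 1],   # plays the role of "latitude"
    X_demo[:, 1] - X_demo[:, 2],   # plays the role of "longitude"
])

svr_multi_demo = MultiOutputRegressor(SVR()).fit(X_demo, y_demo)

# estimators_ holds one independently fitted SVR per response column.
n_fitted = len(svr_multi_demo.estimators_)

# predict() stacks the per-target predictions into one 2-D array.
pred_shape = svr_multi_demo.predict(X_demo[:5]).shape

print(n_fitted, pred_shape)
```

Because each target gets its own clone of the same SVR template, any hyperparameter set on that template applies to every target.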

My goal is to tune the parameters of SVR with sklearn.model_selection.GridSearchCV. If the response were a single variable rather than several, I would do the following:

from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe_svr = Pipeline([('scl', StandardScaler()),
                     ('reg', SVR())])

# Every key needs the 'reg__' prefix so it is routed to the SVR step
grid_param_svr = {
    'reg__C': [0.01, 0.1, 1, 10],
    'reg__epsilon': [0.1, 0.2, 0.3],
    'reg__degree': [2, 3, 4]  # only has an effect with kernel='poly'
}

gs_svr = GridSearchCV(estimator=pipe_svr,
                      param_grid=grid_param_svr,
                      cv=10,
                      scoring='neg_mean_squared_error',
                      n_jobs=-1)

gs_svr = gs_svr.fit(X_train, y_train)

However, as my response y_train is 2-dimensional, I need to use MultiOutputRegressor on top of SVR. How can I modify the above code to make the GridSearchCV operation work? If that is not possible, is there a better alternative?

1 Reply

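I just found a working solution. For nested estimators, the parameters of the inner estimator can be accessed as estimator__<parameter>. Here the SVR sits inside MultiOutputRegressor, which is itself the pipeline step named reg, so its C parameter is addressed as reg__estimator__C.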

from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe_svr = Pipeline([('scl', StandardScaler()),
                     ('reg', MultiOutputRegressor(SVR()))])

# step name 'reg' -> wrapper attribute 'estimator' -> SVR parameter 'C'
grid_param_svr = {
    'reg__estimator__C': [0.1, 1, 10]
}

gs_svr = GridSearchCV(estimator=pipe_svr,
                      param_grid=grid_param_svr,
                      cv=2,
                      scoring='neg_mean_squared_error',
                      n_jobs=-1)

gs_svr = gs_svr.fit(X_train, y_train)
gs_svr.best_estimator_

Pipeline(steps=[('scl', StandardScaler(copy=True, with_mean=True, with_std=True)), 
('reg', MultiOutputRegressor(estimator=SVR(C=10, cache_size=200,
 coef0=0.0, degree=3, epsilon=0.1, gamma='auto', kernel='rbf', max_iter=-1,    
 shrinking=True, tol=0.001, verbose=False), n_jobs=1))])
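
If it is unclear which nested names are valid, get_params() on the pipeline lists them. The following is a self-contained sketch of the same approach on synthetic data. X_demo, y_demo and the extra epsilon grid are illustrative assumptions, not taken from the answer above.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical stand-in data: 60 samples, 3 features, 2 response columns.
rng = np.random.RandomState(0)
X_demo = rng.rand(60, 3)
y_demo = np.column_stack([
    X_demo[:, 0] + X_demo[:, 1],
    X_demo[:, 1] - X_demo[:, 2],
])

pipe = Pipeline([("scl", StandardScaler()),
                 ("reg", MultiOutputRegressor(SVR()))])

# get_params() exposes every tunable name, including the nested SVR ones.
svr_param_names = sorted(
    name for name in pipe.get_params() if name.startswith("reg__estimator__")
)

param_grid = {
    "reg__estimator__C": [0.1, 1, 10],
    "reg__estimator__epsilon": [0.1, 0.2],
}
search = GridSearchCV(pipe, param_grid, cv=2,
                      scoring="neg_mean_squared_error")
search.fit(X_demo, y_demo)

best_params = search.best_params_
pred_shape = search.predict(X_demo[:4]).shape
print(best_params, pred_shape)
```

Note that this search selects a single parameter combination shared by both targets, since the grid is applied to the SVR template that MultiOutputRegressor clones for each response column.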


...