python - Confusing probabilities of the predict_proba of scikit-learn's svm


posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

My goal is to draw a precision-recall (PR) curve for a specific class by sorting the samples by their predicted probability for that class. However, the probabilities returned by SVC's predict_proba() behave in two different ways on two standard datasets, iris and digits.

The first case uses the iris dataset with the Python code below. It behaves as expected: the predicted class is the one with the highest probability.

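from sklearn import datasets
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

D = datasets.load_iris()
# probability=True is required for predict_proba to be available
clf = SVC(kernel=chi2_kernel, probability=True).fit(D.data, D.target)
output_predict = clf.predict(D.data)
output_proba = clf.predict_proba(D.data)
output_decision_function = clf.decision_function(D.data)
# proba_to_class is my own helper (definition not shown); its result is not used below
# output_my = proba_to_class(output_proba, clf.classes_)

print(D.data.shape, D.target.shape)
print("target:", D.target[:2])
print("class:", clf.classes_)
print("output_predict:", output_predict[:2])
print("output_proba:", output_proba[:2])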

This produces the output below. For each sample, the highest probability matches the label returned by predict(): 0.97181088 for sample #1 and 0.96961523 for sample #2, both in the column for class 0.

(150, 4) (150,)
target: [0 0]
class: [0 1 2]
output_predict: [0 0]
output_proba: [[ 0.97181088  0.01558693  0.01260218]
[ 0.96961523  0.01702481  0.01335995]]

However, when I change only the dataset line to load "digits" and leave the rest of the code unchanged, the probabilities are inverted: the label returned by predict() is the class with the lowest probability, 0.00190932 for sample #1 (class 0) and 0.00220549 for sample #2 (class 1).

D = datasets.load_digits()

Outputs:

(1797, 64) (1797,)
target: [0 1]
class: [0 1 2 3 4 5 6 7 8 9]
output_predict: [0 1]
output_proba: [[ 0.00190932  0.11212957  0.1092459   0.11262532  0.11150733  0.11208733
   0.11156622  0.11043403  0.10747514  0.11101985]
 [ 0.10991574  0.00220549  0.10944998  0.11288081  0.11178518  0.11234661
   0.11182221  0.11065663  0.10770783  0.11122952]]

I've read this post, which suggests using a linear SVM with decision_function() instead. However, my task requires the chi-squared kernel, so I cannot switch to a linear SVM.

Is there a way to get consistent scores in this setting?


1 Reply

深蓝 · Answer 1 · 2021-10-23T19:32:06+0000

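As the documentation states, there is no guarantee that predict_proba and predict will give consistent results on SVC. The probabilities come from Platt scaling fitted with an internal cross-validation, separately from the decision values that predict relies on, so the argmax of the probabilities may differ from the predicted label. You can simply rank the samples with decision_function instead. That works for both linear and kernel SVMs, including one with a chi-squared kernel.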
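A minimal sketch of that suggestion, reusing the digits data and chi2_kernel from the question. The helper name pr_curve_for_class is hypothetical (it is not part of scikit-learn), and the sketch assumes the one-vs-rest decision function shape so that each column of decision_function lines up with an entry of clf.classes_. Like the question, it scores the training data, which is fine for checking behavior but not for estimating real performance.

```python
from sklearn import datasets
from sklearn.metrics import precision_recall_curve
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC


def pr_curve_for_class(clf, X, y, cls):
    """Hypothetical helper: PR curve for one class, ranked by decision scores."""
    scores = clf.decision_function(X)       # shape (n_samples, n_classes) with "ovr"
    col = list(clf.classes_).index(cls)     # column that belongs to this class
    y_true = (y == cls).astype(int)         # binarize: 1 for this class, 0 otherwise
    return precision_recall_curve(y_true, scores[:, col])


D = datasets.load_digits()
# probability=True is not needed because predict_proba is never called
clf = SVC(kernel=chi2_kernel, decision_function_shape="ovr").fit(D.data, D.target)

scores = clf.decision_function(D.data)
precision, recall, thresholds = pr_curve_for_class(clf, D.data, D.target, cls=0)
print(scores.shape)
print(len(precision), len(recall), len(thresholds))
```

The returned precision and recall arrays can be passed straight to a plotting call. Sorting by the decision score gives the same kind of ranking the question wanted from the probabilities, without depending on the separate calibration step.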
