Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to specify the prior probability for scikit-learn's Naive Bayes

I'm using the scikit-learn library (Python) for a machine learning project. One of the algorithms I'm using is the Gaussian Naive Bayes implementation. One of the attributes of the GaussianNB class is the following:

class_prior_ : array, shape (n_classes,)

I want to alter the class prior manually, since the data I use is very skewed and the recall of one of the classes is very important. Assigning a high prior probability to that class should increase its recall.

However, I can't figure out how to set the attribute correctly. I've already read the topics below, but their answers don't work for me.

How can the prior probabilities manually set for the Naive Bayes clf in scikit-learn?

How do I know what prior's I'm giving to sci-kit learn? (Naive-bayes classifiers.)

This is my code:

gnb = GaussianNB()
gnb.class_prior_ = [0.1, 0.9]
gnb.fit(data.XTrain, yTrain)
yPredicted = gnb.predict(data.XTest)

I figured this was the correct syntax and that I could find out which class belongs to which position in the array by playing with the values, but the results remain unchanged. No errors were raised either.

What is the correct way of setting the attributes of the GaussianNB algorithm from the scikit-learn library?
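
A likely explanation for the unchanged results: `fit` recomputes the fitted attributes (the ones with a trailing underscore), so a `class_prior_` assigned beforehand is simply overwritten. A minimal sketch of that behaviour, using a made-up four-sample dataset that is an assumption for illustration and not the asker's data:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: three samples of class 0, one of class 1.
X = np.array([[0.0], [0.1], [0.2], [1.0]])
y = np.array([0, 0, 0, 1])

gnb = GaussianNB()
gnb.class_prior_ = np.array([0.1, 0.9])  # assigned before fit, as in the question
gnb.fit(X, y)

# fit() replaced the manual value with the empirical class frequencies.
fitted_prior = gnb.class_prior_
print(fitted_prior)
```

Here the printed prior reflects the 3:1 class ratio in `y` rather than the manually assigned values, and no error is raised because assigning an attribute on the estimator is ordinary Python.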

Link to the scikit documentation of GaussianNB

1 Reply

@Jianxun Li: there is in fact a way to set prior probabilities in GaussianNB. It's called 'priors' and it's available as a parameter. See the documentation: "Parameters: priors : array-like, shape (n_classes,) Prior probabilities of the classes. If specified the priors are not adjusted according to the data." So let me give you an example:

from sklearn.naive_bayes import GaussianNB
# minimal dataset: two samples of class 0, one of class 1
X = [[1, 0], [1, 0], [0, 1]]
y = [0, 0, 1]
# use the empirical prior, learned from y
mn = GaussianNB()
# predict() expects a 2D array-like: one row per sample
print(mn.fit(X, y).predict([[1, 1]]))
print(mn.class_prior_)

>>>[0]
>>>[0.66666667 0.33333333]

But if you change the prior probabilities, it gives a different answer, which I believe is what you are looking for.

# use a custom prior to make class 1 more likely
mn = GaussianNB(priors=[0.1, 0.9])
print(mn.fit(X, y).predict([[1, 1]]))
>>>[1]
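
To tie this back to the skewed-data case in the question, here is a rough sketch of how a custom prior shifts recall for the rare class. The synthetic data, the random seed, and the variable names are assumptions made up for illustration. The entries of `priors` are matched by position to `classes_`, which holds the sorted unique labels, so `[0.1, 0.9]` gives class 1 the 0.9.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Hypothetical skewed data: 950 samples of class 0 and 50 of class 1,
# with one overlapping Gaussian feature.
X = np.concatenate(
    [rng.normal(0.0, 1.0, 950), rng.normal(1.5, 1.0, 50)]
).reshape(-1, 1)
y = np.array([0] * 950 + [1] * 50)

empirical = GaussianNB().fit(X, y)              # prior learned from y
custom = GaussianNB(priors=[0.1, 0.9]).fit(X, y)  # prior fixed by hand

# The i-th prior belongs to the i-th entry of classes_.
print(custom.classes_)
print(custom.class_prior_)

# Recall of the rare class, measured on the training data for simplicity.
recall_empirical = recall_score(y, empirical.predict(X), pos_label=1)
recall_custom = recall_score(y, custom.predict(X), pos_label=1)
print(recall_empirical, recall_custom)
```

Because the prior only adds a per-class constant to the log-likelihood, raising the prior of class 1 can only move samples towards class 1, so its recall does not go down. The price is usually more false positives for that class, so it is worth checking precision as well.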
