Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


tensorflow - My CNN classifier gives wrong prediction on random images

I trained a CNN classifier (using TensorFlow) on 3 categories of data: ID cards, passports, and bills.
When I test it with images that belong to one of the 3 categories, it gives the right prediction. However, when I test it with an unrelated image (a car, for example), it still returns a prediction (i.e. it predicts that the car belongs to the ID card category).

Is there a way to make it display an error message instead of giving a wrong prediction?


1 Reply


This should be tackled differently. It is known as the open set recognition problem. You can search for the term to find out more, but basically it's this: you cannot train your classifier on every class imaginable. It will always run into some class that it is not familiar with and has not seen before.

There are a few solutions, of which I will single out three:

  1. Separate binary classifier. You can build a separate binary classifier that sorts images into two categories, depending on whether a bill, passport or ID is in the image or not. If one is, the image is passed on to the classifier you have already built, which assigns it to one of the 3 categories. If the first classifier says that some other object is in the image, you can immediately discard the image because it is not an image of a bill/passport/ID.

  2. Thresholding. When an ID is in the image, the probability for ID is high and the probabilities for bill and passport are fairly low. When the image is something else (e.g. a car), the probabilities are most likely about the same for all 3 classes. In other words, none of the class probabilities really stands out. In that situation the classifier still picks the highest of the generated probabilities and outputs the class that goes with it, even if the value is only 0.4 or so. To resolve this, you can set a threshold at, let's say, 0.7, and say that if none of the probabilities is over that threshold, there is something else in the picture (not an ID, passport or bill).

  3. Create a fourth class: Unknown. If you pick this option, you should add a few other images to the dataset and label them unknown. Then train the classifier and see what the result is.
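
To make option 2 concrete, here is a minimal sketch of the thresholding step. It assumes you already have the softmax output of your model as a vector of 3 probabilities. The names `CLASS_NAMES` and `predict_or_reject`, and the label order, are made up for illustration and are not from the question.

```python
import numpy as np

# Assumed label order; it must match the order your model was trained with.
CLASS_NAMES = ["id_card", "passport", "bill"]


def predict_or_reject(probs, threshold=0.7):
    """Return the predicted class name, or None if no class reaches the threshold.

    `probs` is one softmax output vector, e.g. a single row of the array
    your model returns for a batch of images.
    """
    probs = np.asarray(probs, dtype=float)
    best = int(np.argmax(probs))      # index of the most likely class
    if probs[best] < threshold:       # nothing stands out: treat as "something else"
        return None
    return CLASS_NAMES[best]


confident = predict_or_reject([0.92, 0.05, 0.03])   # one class clearly stands out
uncertain = predict_or_reject([0.40, 0.35, 0.25])   # probabilities are close together

if uncertain is None:
    print("Error: image is not an ID card, passport or bill")
```

The 0.7 value is just the example from above. In practice you would probably tune it on held-out images, because a softmax output can still be high on images the network has never seen.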

I would recommend 1 or 2. Hope it helps :)
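
For option 1, the control flow could look roughly like the sketch below. The two models are passed in as plain functions so the example runs on its own. `classify_document`, `fake_gate` and `fake_classifier` are hypothetical stand-ins, and the 0.5 gate threshold is an assumption; in your case both functions would wrap calls to your trained models.

```python
def classify_document(image, document_probability, classify, gate_threshold=0.5):
    """Two-stage prediction: binary gate first, 3-class classifier second.

    document_probability(image) -> estimated probability that the image shows
                                   a bill, passport or ID at all
    classify(image)             -> one of the 3 category names
    Returns None when the gate rejects the image.
    """
    if document_probability(image) < gate_threshold:
        return None               # discard: not a bill/passport/ID
    return classify(image)        # gate passed: let the existing classifier decide


# Stand-in models, only to show the flow (a real image would be a pixel array).
def fake_gate(image):
    return 0.05 if image == "car.jpg" else 0.95


def fake_classifier(image):
    return "passport"


for name in ["passport_scan.jpg", "car.jpg"]:
    result = classify_document(name, fake_gate, fake_classifier)
    print(name, "->", result if result is not None else "error: unsupported image")
```

The nice part of this design is that the 3-class model does not have to be retrained. Only the gate needs examples of "other" images.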

