Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - Confusion Matrix giving Bad Results but Validation Accuracy ~95%

This is my code. I have around 5,000 images in the training set and 532 in the test set. My val_accuracy shows 95%, but when I create a confusion matrix and classification report, the results on the validation/test set are very poor: out of 532 images, only 314 are counted as correct (TP). I think the problem lies in how I set batch_size and the other hyperparameters. Please help, this is for my research paper and I'm badly stuck!

import os
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.applications import xception
from keras.layers import Dense, Flatten
from keras.models import Model
from keras.preprocessing import image

# Load Xception pretrained on ImageNet, without its classification head.
base_model = xception.Xception(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the pretrained layers so that only the new head is trained.
for layer in base_model.layers:
    layer.trainable = False

# New binary classification head ending in a single sigmoid unit.
flat1 = Flatten()(base_model.layers[-1].output)
class1 = Dense(256, activation='relu')(flat1)
output = Dense(1, activation='sigmoid')(class1)

model = Model(inputs=base_model.inputs, outputs=output)

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])


# Augment the training images; pixel values are rescaled to [0, 1].
train_datagen = image.ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

# Validation images are only rescaled.
test_datagen = image.ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    '/Users/xd_anshul/Desktop/Research/Major/CovidDataset/Train',
    target_size=(224, 224),
    batch_size=10,
    class_mode='binary')

# shuffle is not set here, so it keeps its default value (True).
validation_generator = test_datagen.flow_from_directory(
    '/Users/xd_anshul/Desktop/Research/Major/CovidDataset/Test',
    target_size=(224, 224),
    batch_size=10,
    class_mode='binary')

# Each epoch trains on 9 batches of 10 images and validates on 2 batches of 10.
hist = model.fit(
    train_generator,
    steps_per_epoch=9,
    epochs=5,
    validation_data=validation_generator,
    validation_steps=2)

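from sklearn.metrics import classification_report, confusion_matrix

# Predicted probabilities for the validation set, in the order the generator yields them.
Y_pred = model.predict(validation_generator)
# Threshold the sigmoid output at 0.5 to get 0/1 class labels.
y_pred = [1 * (x[0] >= 0.5) for x in Y_pred]
print('Confusion Matrix')
# validation_generator.classes holds the true labels in directory (file) order.
print(confusion_matrix(validation_generator.classes, y_pred))
print('Classification Report')
target_names = ['Covid', 'Normal']
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))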

OUTPUT:

Epoch 1/5
9/9 [==============================] - 21s 2s/step - loss: 0.2481 - accuracy: 0.9377 - val_loss: 4.1552 - val_accuracy: 0.9500
Epoch 2/5
9/9 [==============================] - 16s 2s/step - loss: 1.9680 - accuracy: 0.9767 - val_loss: 15.5336 - val_accuracy: 0.8500
Epoch 3/5
9/9 [==============================] - 16s 2s/step - loss: 0.2898 - accuracy: 0.9867 - val_loss: 0.0000e+00 - val_accuracy: 1.0000
Epoch 4/5
9/9 [==============================] - 16s 2s/step - loss: 1.4597 - accuracy: 0.9640 - val_loss: 2.3671 - val_accuracy: 0.9500
Epoch 5/5
9/9 [==============================] - 16s 2s/step - loss: 3.3822 - accuracy: 0.9365 - val_loss: 3.5101e-22 - val_accuracy: 1.0000


  
Confusion Matrix
[[314  96]
 [ 93  29]]
Classification Report
              precision    recall  f1-score   support

       Covid       0.77      0.77      0.77       410
      Normal       0.23      0.24      0.23       122

    accuracy                           0.64       532
   macro avg       0.50      0.50      0.50       532
weighted avg       0.65      0.64      0.65       532

1 Reply

Let's say your predictions array is something like:

preds_sigmoid = np.array([[0.8451], [0.454], [0.5111]])

These values lie in the range [0, 1] because the sigmoid squashes its input into that range. When you apply argmax as you did, you get index 0 every time: argmax returns the index of the maximum along the specified axis, and each row holds only one value.

pred = np.argmax(preds_sigmoid, axis=1)  # pred is all zeros

Instead, compare each prediction with a threshold, say 0.5: values at or above it belong to the second class (label 1), and the rest belong to the first class (label 0). You can use a list comprehension for this:

pred = [1 * (x[0]>=0.5) for x in preds_sigmoid]
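
To see the difference concretely, here is a minimal sketch that runs both approaches on the example array above. It uses NumPy only; the names pred_argmax and pred_threshold are made up for this illustration.

```python
import numpy as np

# Example sigmoid outputs, shape (3, 1): one probability per sample.
preds_sigmoid = np.array([[0.8451], [0.454], [0.5111]])

# argmax along axis 1 looks at a single column, so the index is always 0.
pred_argmax = np.argmax(preds_sigmoid, axis=1)

# Thresholding at 0.5 turns each probability into a 0/1 class label.
pred_threshold = (preds_sigmoid[:, 0] >= 0.5).astype(int)

print(pred_argmax)     # [0 0 0]
print(pred_threshold)  # [1 0 1]
```

The vectorized comparison gives the same labels as the list comprehension, just as a NumPy array instead of a Python list.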

That way the predictions are converted to class labels correctly.
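
One more thing that may be worth checking, separate from the thresholding: flow_from_directory shuffles by default (shuffle=True), so model.predict(validation_generator) returns predictions in shuffled order, while validation_generator.classes stays in directory order. Comparing the two can score even a good model at roughly chance level. The sketch below only simulates this effect with NumPy. The perfect "model" and the confusion helper are hypothetical, made up for illustration, and the class counts (410 and 122) are taken from the classification report in the question.

```python
import numpy as np

def confusion(y_true, y_pred):
    """2x2 confusion matrix: rows are true classes, columns are predicted classes."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

rng = np.random.default_rng(0)

# True labels in directory order, as generator.classes reports them:
# 410 images of class 0 followed by 122 images of class 1.
classes = np.array([0] * 410 + [1] * 122)

# Assumption: a hypothetical model that is right on every image.
# A shuffling generator yields the images in a permuted order,
# so the predictions come back in that permuted order too.
order = rng.permutation(len(classes))
preds_shuffled = classes[order]

# Misaligned: shuffled predictions scored against directory-order labels.
cm_misaligned = confusion(classes, preds_shuffled)
# Aligned: the labels are put in the same order as the predictions.
cm_aligned = confusion(classes[order], preds_shuffled)

accuracy_misaligned = np.trace(cm_misaligned) / len(classes)
accuracy_aligned = np.trace(cm_aligned) / len(classes)

print(cm_misaligned, accuracy_misaligned)
print(cm_aligned, accuracy_aligned)
```

If this turns out to be the cause, passing shuffle=False to test_datagen.flow_from_directory(...) keeps the predictions in the same order as validation_generator.classes.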

