I have a multi-label classification problem, where each data point has exactly 3 labels (out of many labels, say 1000). In my model, I pick the top 5 predicted labels.
Here is a snippet of model code:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

def top_labels(true_label, pred_label):
    # custom metric: top-5 variant of top_k_categorical_accuracy
    return tf.keras.metrics.top_k_categorical_accuracy(true_label, pred_label, k=5)

model = Sequential()
model.add(Embedding(10000, 128, input_length=250))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(len(classes), activation='sigmoid'))  # len(classes) = number of labels, e.g. 1000
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy', top_labels])
My question is: what is the expected top_k_categorical_accuracy outcome?
If my training data is the following:
data_idx    features    true_labels
1           blabla      2, 3, 4
2           blabla      1, 2, 3
And the prediction result is:
data_idx    top_5_predicted_labels
1           1, 4, 5, 8, 9
2           4, 5, 6, 7, 8
I have two guesses (a small calculation sketch follows the list):
1) 0.5: for the 1st data point there is one label match (label 4), and for the 2nd data point no label is matched, so 1 out of 2 samples counts as correct.
2) 1/6: for the 1st data point one label is matched out of 3 true labels, and for the 2nd none out of 3, so 1 matched label out of 6 true labels.
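To make the two numbers concrete, here is a tiny plain-Python sketch (the variable names and the sets just re-encode the example tables above, they are only for illustration) that computes both candidate values:

# example data from the tables above
true_labels = [{2, 3, 4}, {1, 2, 3}]
top5_preds  = [{1, 4, 5, 8, 9}, {4, 5, 6, 7, 8}]

# Guess 1: a sample counts as correct if at least one true label is in its top-5
any_hit = [len(t & p) > 0 for t, p in zip(true_labels, top5_preds)]
print(sum(any_hit) / len(any_hit))    # 0.5

# Guess 2: matched true labels over all true labels
matched = sum(len(t & p) for t, p in zip(true_labels, top5_preds))
total = sum(len(t) for t in true_labels)
print(matched / total)                # 1/6 ≈ 0.1667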
I feel like the answer is 1), but I'm confused after testing the following code:
import tensorflow as tf

y_true = [[1, 1, 0, 0]]              # one data point with 2 true labels (multi-hot)
y_pred = [[0, 0.9, 0.05, 0.95]]
m = tf.keras.metrics.top_k_categorical_accuracy(y_true, y_pred, k=3)
m.numpy()                            # result: array([0.], dtype=float32)
So it looks like top_k_categorical_accuracy only handles a single true label rather than multiple true labels (it seems to take the first true label and ignore the rest).
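My current reading (just my interpretation from the result above, not something I can confirm from the docs) is that the metric collapses y_true to a single index via argmax, which for a multi-hot vector picks the first 1, and then checks whether that index is among the top-k entries of y_pred. A quick sketch of that reading on the same y_true/y_pred:

import tensorflow as tf

y_true = tf.constant([[1., 1., 0., 0.]])
y_pred = tf.constant([[0., 0.9, 0.05, 0.95]])

# keep only argmax(y_true) (index 0 here, the first "1") and
# check whether that single index is in the top-3 of y_pred
true_idx = tf.math.argmax(y_true, axis=-1)                     # [0]
in_top3 = tf.math.in_top_k(tf.cast(true_idx, tf.int32), y_pred, k=3)
print(in_top3.numpy())                                         # [False] -> matches the 0.0 above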
However, I'm not sure whether setting the activation of the last layer to sigmoid changes the evaluation behavior.
Can someone clarify a bit please? Thank you.