One important point before you add the distance layer: a Siamese network contains only one convolutional neural network.
"Shared weights" means exactly that. The same network, with the same weights, processes every image in the input pair (or triplet, depending on the loss function used) to compute the features, and then the embedding, of each input image.
Because there is only one network, the model-building logic should look like this:
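    from tensorflow.keras import backend as K
    from tensorflow.keras.layers import Input, Lambda, Dense
    from tensorflow.keras.models import Model

    def euclidean_distance(vectors):
        # Unpack the two embedding batches produced by the shared network.
        (features_A, features_B) = vectors
        sum_squared = K.sum(K.square(features_A - features_B), axis=1, keepdims=True)
        # Clamp before the square root so a zero distance does not produce NaN gradients.
        return K.sqrt(K.maximum(sum_squared, K.epsilon()))

    image_A = Input(shape=...)
    image_B = Input(shape=...)
    # One feature extractor instance; calling it twice reuses the same weights.
    feature_extractor = get_feature_extractor_model(shape=...)
    features_A = feature_extractor(image_A)
    features_B = feature_extractor(image_B)
    distance = Lambda(euclidean_distance)([features_A, features_B])
    outputs = Dense(1, activation="sigmoid")(distance)
    siamese_model = Model(inputs=[image_A, image_B], outputs=outputs)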
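One way to confirm that both branches really share a single set of weights is to build a small runnable version and count the trainable weights. The sketch below assumes a toy `Flatten` + `Dense` extractor, a 28x28x1 input shape, and a 16-dimensional embedding; `build_toy_siamese` and all of these values are hypothetical stand-ins chosen for illustration, not part of the original code.

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Flatten, Input, Lambda
from tensorflow.keras.models import Model, Sequential


def euclidean_distance(vectors):
    features_A, features_B = vectors
    sum_squared = K.sum(K.square(features_A - features_B), axis=1, keepdims=True)
    return K.sqrt(K.maximum(sum_squared, K.epsilon()))


def build_toy_siamese(shape=(28, 28, 1), embedding_dim=16):
    # Hypothetical toy extractor standing in for get_feature_extractor_model.
    feature_extractor = Sequential(
        [Input(shape=shape), Flatten(), Dense(embedding_dim, activation="relu")]
    )
    image_A = Input(shape=shape)
    image_B = Input(shape=shape)
    # The same model object is called on both inputs, so no weights are copied.
    features_A = feature_extractor(image_A)
    features_B = feature_extractor(image_B)
    distance = Lambda(euclidean_distance)([features_A, features_B])
    outputs = Dense(1, activation="sigmoid")(distance)
    return feature_extractor, Model(inputs=[image_A, image_B], outputs=outputs)


feature_extractor, siamese_model = build_toy_siamese()
# Extractor: one kernel + one bias. Full model: those two plus the head's kernel + bias.
print(len(feature_extractor.trainable_weights))
print(len(siamese_model.trainable_weights))
```

If the two branches were separate networks, the full model would report the extractor's weights twice; with a shared extractor they are counted once.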
The feature extractor model can also be a pretrained network from Keras/TensorFlow, with its output classification layer removed or replaced.
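As a rough illustration of the pretrained option, `get_feature_extractor_model` could be sketched as follows. The choice of `MobileNetV2`, the frozen backbone, the `weights` and `embedding_dim` parameters, and the 128-dimensional embedding are all assumptions made for this example; any Keras application model could take its place.

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Input
from tensorflow.keras.models import Model


def get_feature_extractor_model(shape, weights="imagenet", embedding_dim=128):
    # include_top=False drops the ImageNet classification layer.
    backbone = MobileNetV2(input_shape=shape, include_top=False, weights=weights)
    # Assumption: keep the pretrained backbone frozen, at least initially.
    backbone.trainable = False

    inputs = Input(shape=shape)
    x = backbone(inputs, training=False)
    x = GlobalAveragePooling2D()(x)
    # A new, trainable embedding layer takes the place of the classifier.
    outputs = Dense(embedding_dim)(x)
    return Model(inputs, outputs, name="feature_extractor")
```

Calling `get_feature_extractor_model(shape=(224, 224, 3))` downloads the ImageNet weights on first use. `MobileNetV2` expects pixel values scaled to [-1, 1], which `tensorflow.keras.applications.mobilenet_v2.preprocess_input` handles.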
The main logic should follow the code above. If you want to use triplet loss, you would need three inputs (anchor, positive, negative), but to begin with I would recommend sticking to the basics.
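For reference only, a triplet loss on the three embeddings can be sketched like this. It uses squared Euclidean distances and a margin of 0.2; both the margin value and the function name `triplet_loss` are assumptions for illustration, and the example vectors are made up.

```python
import tensorflow as tf


def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared Euclidean distance from the anchor to each of the other embeddings.
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    # Penalize triplets where the negative is not at least `margin` farther away.
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))


anchor = tf.constant([[0.0, 0.0]])
positive = tf.constant([[0.0, 1.0]])
far_negative = tf.constant([[3.0, 0.0]])
near_negative = tf.constant([[0.5, 0.0]])

easy_loss = float(triplet_loss(anchor, positive, far_negative))
hard_loss = float(triplet_loss(anchor, positive, near_negative))
print(easy_loss, hard_loss)
```

The far negative already satisfies the margin, so it contributes no loss, while the near negative is closer to the anchor than the positive and is penalized. All three embeddings would still come from the same shared feature extractor.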
It would also be a good idea to consult these tutorials:
- https://www.pyimagesearch.com/2020/11/30/siamese-networks-with-keras-tensorflow-and-deep-learning/
- https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d