dataset: {'omniglot', 'miniImageNet'}. Whether to use the Omniglot or miniImageNet dataset
distance: {'l2', 'cosine'}. Which distance metric to use
n-train: Support samples per class for training tasks
n-test: Support samples per class for validation tasks
k-train: Number of classes in training tasks
k-test: Number of classes in validation tasks
q-train: Query samples per class for training tasks
q-test: Query samples per class for validation tasks
fce: Whether (True) or not (False) to use full context embeddings (FCE)
lstm-layers: Number of LSTM layers to use in the support set FCE
unrolling-steps: Number of unrolling steps to use when calculating the FCE of the query sample
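For illustration, here is a minimal PyTorch-style sketch of how `lstm-layers` and `unrolling-steps` are typically used for full context embeddings as described in the Matching Networks paper: a bidirectional LSTM over the support set and an attentional LSTM read over each query sample. The class and function names are illustrative rather than this repo's actual API, and the concatenation used in the paper is replaced by a sum to keep the example short.

```python
import torch
import torch.nn as nn


class SupportSetFCE(nn.Module):
    """Bidirectional LSTM over the support set embeddings (g in the paper).

    `lstm_layers` plays the role of the `lstm-layers` argument.
    """
    def __init__(self, embedding_dim: int, lstm_layers: int):
        super().__init__()
        self.lstm = nn.LSTM(embedding_dim, embedding_dim,
                            num_layers=lstm_layers, bidirectional=True)

    def forward(self, support: torch.Tensor) -> torch.Tensor:
        # support: (k * n, embedding_dim), treated as a sequence of length k * n
        out, _ = self.lstm(support.unsqueeze(1))           # (k * n, 1, 2 * dim)
        fwd, bwd = out.squeeze(1).chunk(2, dim=-1)
        return support + fwd + bwd                         # skip connection


def query_fce(queries: torch.Tensor, support: torch.Tensor,
              lstm_cell: nn.LSTMCell, unrolling_steps: int) -> torch.Tensor:
    """Attentional LSTM read over the support set (f in the paper).

    `unrolling_steps` plays the role of the `unrolling-steps` argument.
    queries: (q_total, embedding_dim), support: (k * n, embedding_dim).
    """
    h = queries.clone()
    c = torch.zeros_like(queries)
    for _ in range(unrolling_steps):
        attention = torch.softmax(h @ support.t(), dim=1)  # (q_total, k * n)
        readout = attention @ support                      # (q_total, dim)
        # Simplification: the paper concatenates [h, readout]; summing keeps the
        # hidden size equal to the embedding size so one LSTMCell(dim, dim) suffices.
        h_hat, c = lstm_cell(queries, (h + readout, c))
        h = h_hat + queries                                # skip connection to the raw query embedding
    return h
```

Here `lstm_cell` would be an `nn.LSTMCell(embedding_dim, embedding_dim)` shared across all unrolling steps.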
I had trouble reproducing the results of this paper using the cosine distance metric, as I found convergence to be slow and final performance dependent on the random initialisation. However, I was able to reproduce (and slightly exceed) the results of this paper using the l2 distance metric.
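As a point of reference, here is a minimal sketch of the two `distance` options; the function name is illustrative and this is not necessarily the repo's exact implementation. The resulting distance matrix is negated and softmaxed to give the attention over the support set.

```python
import torch


def pairwise_distances(queries: torch.Tensor, support: torch.Tensor,
                       distance: str) -> torch.Tensor:
    """queries: (q, dim), support: (k * n, dim) -> (q, k * n) distance matrix."""
    if distance == 'l2':
        # Squared Euclidean distance between every query/support pair
        return (queries.unsqueeze(1) - support.unsqueeze(0)).pow(2).sum(dim=2)
    elif distance == 'cosine':
        eps = 1e-8  # guard against zero-norm embeddings
        q = queries / (queries.norm(p=2, dim=1, keepdim=True) + eps)
        s = support / (support.norm(p=2, dim=1, keepdim=True) + eps)
        return 1 - q @ s.t()  # cosine distance = 1 - cosine similarity
    else:
        raise ValueError(f"Unsupported distance metric: {distance}")
```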
|                    | Omniglot |      |      |      |
|--------------------|----------|------|------|------|
| k-way              | 5        | 5    | 20   | 20   |
| n-shot             | 1        | 5    | 1    | 5    |
| Published (cosine) | 98.1     | 98.9 | 93.8 | 98.5 |
| This Repo (cosine) | 92.0     | 93.2 | 75.6 | 77.8 |
| This Repo (l2)     | 98.3     | 99.8 | 92.8 | 97.8 |
|                         | miniImageNet |      |
|-------------------------|--------------|------|
| k-way                   | 5            | 5    |
| n-shot                  | 1            | 5    |
| Published (cosine, FCE) | 44.2         | 57.0 |
| This Repo (cosine, FCE) | 42.8         | 53.6 |
| This Repo (l2)          | 46.0         | 58.4 |
Model-Agnostic Meta-Learning (MAML)
I used max pooling instead of strided convolutions in order to be
consistent with the other papers. The miniImageNet experiments using
2nd order MAML took me over a day to run.
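For concreteness, here is a sketch of the kind of embedding network this refers to, with max pooling doing the downsampling rather than strided convolutions. The layer sizes follow the standard 4-block architecture used across these papers and may not match this repo exactly.

```python
import torch.nn as nn


def conv_block(in_channels: int, out_channels: int) -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),  # stride=1
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2),  # downsampling happens here, not in the conv
    )


# Standard 4-block embedding network
embedding = nn.Sequential(
    conv_block(1, 64),   # 1 input channel for Omniglot; 3 for miniImageNet
    conv_block(64, 64),
    conv_block(64, 64),
    conv_block(64, 64),
)
```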
dataset: {'omniglot', 'miniImageNet'}. Whether to use the Omniglot or miniImageNet dataset
distance: {'l2', 'cosine'}. Which distance metric to use
n: Support samples per class for few-shot tasks
k: Number of classes in training tasks
q: Query samples per class for training tasks
inner-train-steps: Number of inner-loop updates to perform on training tasks
inner-val-steps: Number of inner-loop updates to perform on validation tasks
inner-lr: Learning rate to use for inner-loop updates
meta-lr: Learning rate to use when updating the meta-learner weights
meta-batch-size: Number of tasks per meta-batch
order: Whether to use 1st or 2nd order MAML
epochs: Number of training epochs
epoch-len: Meta-batches per epoch
eval-batches: Number of meta-batches to use when evaluating the model after each epoch
NB: For MAML, n, k and q are fixed between train and test. You may need to adjust meta-batch-size to fit your GPU; 2nd order MAML uses a lot more memory.
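Here is a hedged sketch of how these arguments typically fit together in the MAML inner/outer loop. The `functional_forward` method (a forward pass that takes an explicit dictionary of weights) is an assumption for illustration rather than a documented API of this repo, and `meta-lr` is simply the learning rate of the outer optimiser (e.g. Adam over `model.parameters()`).

```python
import torch
import torch.nn.functional as F
from collections import OrderedDict


def meta_update(model, optimiser, meta_batch, inner_train_steps, inner_lr, order):
    """One outer-loop step. meta_batch is a list of (x_support, y_support,
    x_query, y_query) tasks of length meta-batch-size."""
    task_losses = []
    for x_s, y_s, x_q, y_q in meta_batch:
        # Start the inner loop from the current meta-parameters
        fast_weights = OrderedDict(model.named_parameters())

        for _ in range(inner_train_steps):
            logits = model.functional_forward(x_s, fast_weights)  # assumed helper
            loss = F.cross_entropy(logits, y_s)
            # create_graph=True retains the graph through the gradients, which is
            # what makes the update 2nd order; 1st-order MAML treats them as constants
            grads = torch.autograd.grad(loss, list(fast_weights.values()),
                                        create_graph=(order == 2))
            fast_weights = OrderedDict(
                (name, w - inner_lr * g)
                for (name, w), g in zip(fast_weights.items(), grads)
            )

        # Evaluate the adapted weights on the task's query set
        logits = model.functional_forward(x_q, fast_weights)
        task_losses.append(F.cross_entropy(logits, y_q))

    # Outer (meta) update, averaged over the meta-batch
    meta_loss = torch.stack(task_losses).mean()
    optimiser.zero_grad()
    meta_loss.backward()
    optimiser.step()
```

The extra memory cost of 2nd order MAML comes from `create_graph=True`: the computation graph of every inner-loop update must be kept alive until the outer backward pass, which is why meta-batch-size may need to be reduced.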
|               | Omniglot |      |      |      |
|---------------|----------|------|------|------|
| k-way         | 5        | 5    | 20   | 20   |
| n-shot        | 1        | 5    | 1    | 5    |
| Published     | 98.7     | 99.9 | 95.8 | 98.9 |
| This Repo (1) | 95.5     | 99.5 | 92.2 | 97.7 |
| This Repo (2) | 98.1     | 99.8 | 91.6 | 95.9 |
|               | miniImageNet |      |
|---------------|--------------|------|
| k-way         | 5            | 5    |
| n-shot        | 1            | 5    |
| Published     | 48.1         | 63.2 |
| This Repo (1) | 46.4         | 63.3 |
| This Repo (2) | 47.5         | 64.7 |
The number in brackets indicates 1st or 2nd order MAML.