Bottom line is that I get this error, and I've seen a few posts about it but nothing seems to be on point.
ValueError: The last dimension of the inputs to `Dense` should be defined. Found `None`.
I'm using a tf.data.Dataset as my input, and it yields variable-length outputs in a dictionary. I'm using .padded_batch, and things get padded as expected.
Each sample looks something like {'seq_feature1': [1, 2, 3], 'seq_feature2': [5, 6], 'feature': 10} (before batching and padding). Inside my network, each seq_feature gets embedded using tf.keras.layers.Embedding, the embeddings are concatenated and passed through an LSTM, the LSTM output is concatenated with another embedded feature (no time dimension), and the result goes into a dense layer, where it all goes wrong.
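For reference, a minimal sketch of how such a dataset could be built and padded (toy data and hypothetical values, not my actual pipeline):

```python
import tensorflow as tf

# Toy samples mirroring the structure described above (hypothetical data).
def gen():
    yield {'seq_feature1': [1, 2, 3], 'seq_feature2': [5, 6], 'feature': 10}
    yield {'seq_feature1': [4, 5], 'seq_feature2': [7, 8, 9], 'feature': 11}

ds = tf.data.Dataset.from_generator(
    gen,
    output_signature={
        'seq_feature1': tf.TensorSpec([None], tf.int32),
        'seq_feature2': tf.TensorSpec([None], tf.int32),
        'feature': tf.TensorSpec([], tf.int32),
    },
)

# padded_batch pads each variable-length feature to the longest
# sequence in the batch; the scalar 'feature' needs no padding.
batch = next(iter(ds.padded_batch(2)))
print(batch['seq_feature1'].shape)  # (2, 3)
print(batch['seq_feature2'].shape)  # (2, 3)
```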
This is what my forward pass looks like (the model is implemented as a subclass, so this is part of the call method):
# self.embedding1 & 2 & 3 are tf.keras.layers.Embedding
embedded_seq1 = self.embedding1(inputs['seq_feature1']) # (batch_size, seq_len, emb_dim1)
embedded_seq2 = self.embedding2(inputs['seq_feature2']) # (batch_size, seq_len, emb_dim2)
embedded_feat = tf.squeeze(self.embedding3(inputs['feature'])) # (batch_size, emb_dim3)
# project and pass through LSTM
x = tf.concat([embedded_seq1, embedded_seq2], axis=-1)
x = self.dense1(x) # <-- Works as expected, (batch_size, seq_len, dense1_units)
x = self.lstm(x) # (batch_size, lstm_units)
# concat pass through Dense
x = tf.concat([x, embedded_feat], axis=-1) # (batch_size, lstm_units + emb_dim3)
# ERROR GETS THROWN HERE
x = self.dense2(x)
The last line throws the ValueError mentioned above, which doesn't make sense to me. I can calculate the last dimension of the input myself, and I validated it using eager execution, running through the layers one by one (which works without error; the error only occurs when running the whole model at once).
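The eager, layer-by-layer check looks roughly like this (toy dimensions and stand-in layers, not the actual model code):

```python
import tensorflow as tf

lstm_units, emb_dim3 = 32, 8        # hypothetical sizes
dense2 = tf.keras.layers.Dense(16)  # stand-in for self.dense2

# Simulate the LSTM output and the squeezed feature embedding, then
# check the concatenated last dimension before the Dense call.
lstm_out = tf.zeros([2, lstm_units])
embedded_feat = tf.zeros([2, emb_dim3])
x = tf.concat([lstm_out, embedded_feat], axis=-1)
print(x.shape)          # (2, 40): the last dimension is fully defined
print(dense2(x).shape)  # (2, 16): no ValueError in eager mode
```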
What am I missing here?
*The actual code uses masking in the LSTM, but I assume that's irrelevant here.
EDIT
I can avoid the error if I explicitly call self.dense2.build(tf.TensorShape([None, <the_number_i_can_calculate_but_tf_cant>])), but since I can easily calculate that number as lstm_units + emb_dim3, this still looks like a bug, and I can't understand why it happens.
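For completeness, the explicit-build workaround in isolation (hypothetical sizes, and a plain Dense standing in for self.dense2):

```python
import tensorflow as tf

lstm_units, emb_dim3 = 32, 8       # hypothetical sizes
dense2 = tf.keras.layers.Dense(4)  # stand-in for self.dense2

# Building explicitly with the known last dimension means Dense no
# longer has to infer it from the (unknown) input shape.
dense2.build(tf.TensorShape([None, lstm_units + emb_dim3]))
out = dense2(tf.zeros([2, lstm_units + emb_dim3]))
print(out.shape)  # (2, 4)
```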
question from:
https://stackoverflow.com/questions/65646965/dense-layer-not-recognizing-input-shape-even-though-it-should