recurrent neural network - Difference between bidirectional_dynamic_rnn and stack_bidirectional_dynamic_rnn in Tensorflow

Question

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I am building a dynamic RNN by stacking multiple LSTMs. I see there are two options:

# cells_fw and cells_bw are lists of cells, e.g. LSTM cells
stacked_cell_fw = tf.contrib.rnn.MultiRNNCell(cells_fw)
stacked_cell_bw = tf.contrib.rnn.MultiRNNCell(cells_bw)

output = tf.nn.bidirectional_dynamic_rnn(
          stacked_cell_fw, stacked_cell_bw, INPUT,
          sequence_length=LENGTHS, dtype=tf.float32)

vs

output = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
          cells_fw, cells_bw, INPUT,
          sequence_length=LENGTHS, dtype=tf.float32)

What is the difference between the two approaches, and is one better than the other?

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:58:55+0000

If you want multiple layers that pass information backward or forward in time, there are two ways to design this. Assume the forward part consists of two layers F1, F2 and the backward part consists of two layers B1, B2.

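If you use tf.nn.bidirectional_dynamic_rnn with the stacked cells, the model looks like this (time flows from left to right in the original figure, which is not reproduced here): the input feeds F1 and B1 separately, F1 feeds only F2, B1 feeds only B2, and the forward and backward outputs meet only after the top layer.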

If you use tf.contrib.rnn.stack_bidirectional_dynamic_rnn, the model looks like this (original figure not reproduced here):

Here the black dot between the first and second layer represents a concatenation. That is, the outputs of the forward and backward cells are concatenated and fed to both the forward and the backward cells of the next layer up. This means F2 and B2 receive exactly the same input, and there is an explicit connection between the backward and forward layers. In "Speech Recognition with Deep Recurrent Neural Networks", Graves et al. summarize this as follows:

... every hidden layer receives input from both the forward and backward layers at the level below.
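To make the quoted sentence concrete, the two wirings can be written as recurrences. This is only a sketch in my own notation, not copied from the paper or the TensorFlow docs: $\mathcal{H}$ stands for one recurrent cell (e.g. an LSTM), $n$ is the layer index, $t$ the time step, $[a ; b]$ is concatenation, and layer 1 reads the input $x_t$ in both variants. With two layers, $\overrightarrow{h}^{(1)}, \overrightarrow{h}^{(2)}$ correspond to F1, F2 and $\overleftarrow{h}^{(1)}, \overleftarrow{h}^{(2)}$ to B1, B2.

```latex
% Unstacked: two independent stacks (MultiRNNCell inside bidirectional_dynamic_rnn)
\overrightarrow{h}^{(n)}_t = \mathcal{H}\big(\overrightarrow{h}^{(n-1)}_t,\ \overrightarrow{h}^{(n)}_{t-1}\big),
\qquad
\overleftarrow{h}^{(n)}_t = \mathcal{H}\big(\overleftarrow{h}^{(n-1)}_t,\ \overleftarrow{h}^{(n)}_{t+1}\big)

% Stacked: every layer above the first reads both directions of the layer below
\overrightarrow{h}^{(n)}_t = \mathcal{H}\big([\overrightarrow{h}^{(n-1)}_t ; \overleftarrow{h}^{(n-1)}_t],\ \overrightarrow{h}^{(n)}_{t-1}\big),
\qquad
\overleftarrow{h}^{(n)}_t = \mathcal{H}\big([\overrightarrow{h}^{(n-1)}_t ; \overleftarrow{h}^{(n-1)}_t],\ \overleftarrow{h}^{(n)}_{t+1}\big)
```

The only difference is the first argument of $\mathcal{H}$: in the stacked variant it is the concatenation of both directions from the layer below.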

This connection only happens implicitly in the unstacked BiRNN (first image), namely when mapping back to the output. The stacked BiRNN usually performed better for my purposes, but I guess that depends on your problem setting. It is certainly worthwhile to try it out!
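The difference in wiring can be checked with a small pure-Python sketch. It does not use TensorFlow at all: make_cell, run_layer, concat, unstacked and stacked are hypothetical helpers written for this illustration, and the toy tanh cell is only a stand-in for an LSTM. The idea is to change the last time step of the input and see whether the top forward layer's output at the first time step notices.

```python
import math


def make_cell(w, u):
    """Toy recurrent cell (stand-in for an LSTM): h_t = tanh(w * sum(x_t) + u * h_prev)."""
    def cell(x, h_prev):
        return math.tanh(w * sum(x) + u * h_prev)
    return cell


def run_layer(cell, inputs, reverse=False):
    """Run one recurrent layer over a sequence of feature vectors.

    With reverse=True the sequence is read from the last step to the first,
    but every output is stored at its original time index.
    """
    order = range(len(inputs))
    if reverse:
        order = reversed(order)
    h = 0.0
    outputs = [None] * len(inputs)
    for t in order:
        h = cell(inputs[t], h)
        outputs[t] = [h]
    return outputs


def concat(a, b):
    """Concatenate two sequences feature-wise at every time step."""
    return [x + y for x, y in zip(a, b)]


def unstacked(x, f1, f2, b1, b2):
    """Two independent stacks: F1 -> F2 and B1 -> B2 never exchange information."""
    fw = run_layer(f2, run_layer(f1, x))
    bw = run_layer(b2, run_layer(b1, x, reverse=True), reverse=True)
    return fw, bw


def stacked(x, f1, f2, b1, b2):
    """F1 and B1 outputs are concatenated and fed to both F2 and B2."""
    layer1 = concat(run_layer(f1, x), run_layer(b1, x, reverse=True))
    fw = run_layer(f2, layer1)
    bw = run_layer(b2, layer1, reverse=True)
    return fw, bw


cells = [make_cell(0.5, 0.5) for _ in range(4)]  # f1, f2, b1, b2
seq_a = [[1.0], [2.0], [3.0]]
seq_b = [[1.0], [2.0], [-3.0]]  # identical to seq_a except for the last time step

fw_a, _ = unstacked(seq_a, *cells)
fw_b, _ = unstacked(seq_b, *cells)
print(fw_a[0] == fw_b[0])    # True: F2 at t=0 cannot see the future

sfw_a, _ = stacked(seq_a, *cells)
sfw_b, _ = stacked(seq_b, *cells)
print(sfw_a[0] == sfw_b[0])  # False: F2 at t=0 sees the future through B1
```

In the unstacked variant the forward stack only ever sees the past, so forward and backward information is combined only after the top layer. In the stacked variant F2 already receives B1's summary of the future at every step, which is the explicit connection described above.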

EDIT

In response to your comment: I base my answer on the documentation of the function tf.contrib.rnn.stack_bidirectional_dynamic_rnn which says:

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as input of the next layer. tf.bidirectional_rnn does not allow to share forward and backward information between layers.

Also, I looked at the implementation available under this link.
