TL;DR: you need to reset the random generator state in the loop that creates the model, by calling tf.random.set_seed.
Explanation:
A neural network consists of a series of mathematical operations. Most frequently, an operation in a neural network can be viewed as the linear equation y = W*x + b, where:

x is the input
y is the output
W is the weight of the node
b is the bias of the node

(All these linear equations are then separated by a non-linearity, to avoid the network collapsing into a single linear function.)
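As a minimal sketch of that equation (with made-up values for W, b and x, not taken from any real model), a single node computes:

import numpy as np

W = np.array([[2.0]])  # weight of the node
b = np.array([0.5])    # bias of the node
x = np.array([3.0])    # input

y = W @ x + b          # y = 2.0 * 3.0 + 0.5
print(y)               # [6.5]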
Most of the time, neural networks have their weights (W) initialized randomly, which means that every time you run the training of a neural network, you will get slightly different results. If your model is robust enough, those results will be statistically similar.
Now, if one wants to reproduce exact results, that random initialization of the weights can be a problem. To reproduce exact numbers, one can set a seed on the random generator: this guarantees that, during the execution of the program, the generated random numbers will always come from the same series, in the same order.
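For instance, resetting the seed replays the same series of numbers, which we can illustrate without any model (the concrete values depend on your TensorFlow version, so they are not shown here):

import tensorflow as tf

tf.random.set_seed(0)
print(tf.random.uniform((3,)))  # three numbers from the seeded series

tf.random.set_seed(0)
print(tf.random.uniform((3,)))  # the exact same three numbers: the state was reset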
Running the same program twice in the terminal
Consider the following program, named network_with_seed.py:
import tensorflow as tf

tf.random.set_seed(0)  # seed the global random generator
# one fully connected layer with a single node
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
print(model.trainable_weights)  # the kernel W and the bias b
That program is a simple network with one fully connected layer containing a single node (also known as a perceptron). That network has one weight W, which the program prints.
If we execute that program for the first time, we get:
$ python network_with_seed.py
[<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-0.7206192]], dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]
Now, if we execute that code again:
$ python network_with_seed.py
[<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-0.7206192]], dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]
We get exactly the same value for the weight W, -0.7206192, thanks to the seed.
Creating a model in a loop
Now, let's imagine that we want to create that network twice and get some numbers, but do everything in a single Python script.
Consider this program, loop_network_with_seed.py:
import tensorflow as tf

tf.random.set_seed(0)  # the seed is set once, before the loop
for idx in range(2):
    print(f"Model {idx+1}")
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    print(model.trainable_weights)
If we run it:
$ python loop_network_with_seed.py
Model 1
[<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-0.7206192]], dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]
Model 2
[<tf.Variable 'dense_1/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[0.19195998]], dtype=float32)>, <tf.Variable 'dense_1/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]
If we just look at the randomly initialized kernel W, we see two different values for the same architecture, -0.7206192 and 0.19195998, and not the same value as when we executed network_with_seed.py twice in a row. If we train those two networks, we won't get exactly the same results, because their initial weights W are different.
Note that if you were to run that program again, you would get exactly the same results: the weight of model 1 would be set to -0.7206192 again, and the weight W of model 2 to 0.19195998, thanks to the seed. The seed makes the randomness reproducible between runs, but within a single run the internal state of the random generator advances with every random operation, which is why the second model gets different initial weights. For more information, you can read the documentation of tf.random.set_seed.
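We can sketch that advancing state with plain random draws instead of models (again, the concrete values are version-dependent):

import tensorflow as tf

tf.random.set_seed(0)
print(tf.random.uniform((1,)))  # first draw from the seeded series
print(tf.random.uniform((1,)))  # a different value: the internal state advanced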
Resetting the random generator state
If you want reproducible results for the same set of parameters, you need to reset the random state before the creation of each model. By setting the seed back, we ensure that the same random values will be used for our weights W.
Consider the final program, network_with_seed_in_loop.py:
import tensorflow as tf

for idx in range(2):
    print(f"Model {idx+1}")
    tf.random.set_seed(0)  # reset the generator state before creating each model
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    print(model.trainable_weights)
Output:
$ python network_with_seed_in_loop.py
Model 1
[<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-0.7206192]], dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]
Model 2
[<tf.Variable 'dense_1/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-0.7206192]], dtype=float32)>, <tf.Variable 'dense_1/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]
Here, the internal state is reset before each creation, and the kernels W of the two networks are equal.
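As an aside, the same reproducibility can be obtained without touching the global generator state, by giving the layer's kernel initializer its own seed (a sketch; GlorotUniform is the default kernel initializer of Dense):

import tensorflow as tf

def build_model():
    # a fresh, seeded initializer makes every built model start from the same W
    init = tf.keras.initializers.GlorotUniform(seed=0)
    return tf.keras.Sequential(
        [tf.keras.layers.Dense(1, input_shape=(1,), kernel_initializer=init)]
    )

for idx in range(2):
    print(f"Model {idx+1}")
    print(build_model().trainable_weights)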