Machine learning and GANs part 2
The first part went through the code that processes the images and imports the packages for the program. This part goes through the shape of the network itself.
The TF-GAN tutorial starts by defining helper functions for the layers.
def _dense(inputs, units, l2_weight):
  return tf.layers.dense(
      inputs, units, None,
      kernel_initializer=tf.keras.initializers.glorot_uniform,
      kernel_regularizer=tf.keras.regularizers.l2(l=l2_weight),
      bias_regularizer=tf.keras.regularizers.l2(l=l2_weight))
As we can see, this creates the dense layer. The arguments for the function are inputs, units and l2_weight. The inputs are the data being passed into the layer. Units is another word for the neurons in the layer. l2_weight is the strength of the L2 regularisation penalty (it does not stand for "layer 2 weights"). Weights themselves measure how strong the connection is from one neuron to another.
kernel_initializer sets how the weights are randomly initialised before training. Here it is Glorot uniform.
kernel_regularizer reduces overfitting by adding a penalty for large weights to the loss.
bias_regularizer applies the same L2 penalty to the bias terms of the layer.
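To make the initialiser and the regulariser a bit more concrete, here is a minimal sketch in plain Python rather than TensorFlow. The function names and the example numbers are made up for illustration. The formulas are the usual ones: Glorot uniform draws weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and the L2 penalty is l2_weight times the sum of the squared weights.

```python
import math


def glorot_uniform_limit(fan_in, fan_out):
    """Glorot uniform draws weights uniformly from [-limit, limit]."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def l2_penalty(weights, l2_weight):
    """L2 penalty added to the loss: l2_weight * sum of squared weights."""
    return l2_weight * sum(w * w for w in weights)


# Hypothetical numbers, chosen only for illustration.
limit = glorot_uniform_limit(fan_in=64, fan_out=1024)
penalty = l2_penalty([0.5, -1.0, 2.0], l2_weight=0.01)
print(limit, penalty)
```

A bigger l2_weight makes large weights more expensive, which pushes the network towards smaller weights. An l2_weight of 0.0 switches the penalty off.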
def _batch_norm(inputs, is_training):
  return tf.layers.batch_normalization(
      inputs, momentum=0.999, epsilon=0.001, training=is_training)
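This is the batch normalisation layer. Batch normalisation is a technique that normalises the output from layers within the neural network, so that each batch has roughly zero mean and unit variance. The training argument tells the layer whether to use the statistics of the current batch (training) or the moving averages it has collected (evaluation).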
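The core of batch normalisation can be sketched in a few lines of plain Python. This is a simplified illustration and not what TensorFlow runs internally: it works on a flat list of numbers and leaves out the learned scale and offset. The function names are made up. The momentum and epsilon defaults are the ones from the tutorial code above.

```python
import math


def batch_norm(values, epsilon=0.001):
    """Normalise one batch of numbers to roughly zero mean, unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # epsilon keeps the division safe when the variance is close to zero.
    return [(v - mean) / math.sqrt(variance + epsilon) for v in values]


def update_moving_average(moving, batch_value, momentum=0.999):
    """Moving statistic kept during training and used at evaluation time."""
    return moving * momentum + batch_value * (1.0 - momentum)


normalised = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalised)
```

With a momentum of 0.999 the moving averages change very slowly, so they only become useful after many training steps.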
def _deconv2d(inputs, filters, kernel_size, stride, l2_weight):
  return tf.layers.conv2d_transpose(
      inputs, filters, [kernel_size, kernel_size], strides=[stride, stride],
      activation=tf.nn.relu, padding='same',
      kernel_initializer=tf.keras.initializers.glorot_uniform,
      kernel_regularizer=tf.keras.regularizers.l2(l=l2_weight),
      bias_regularizer=tf.keras.regularizers.l2(l=l2_weight))
This layer is a deconvolution layer, also called a transposed convolution layer. These layers are part of Convolutional Neural Networks. Networks like those can scan an image and learn different features of it, which can then be used to tell images apart. A transposed convolution goes the other way and upsamples its input, which is how the generator grows a small feature map into a full image.
The arguments of the function are simply settings for how the deconvolutional layer will scan the image. Most of them can be found in the TensorFlow documentation for the layer.
The filters argument is the number of filters that scan through the image, which is also the number of output channels.
kernel_size is the size of the convolution window. The convolution window is also known as the kernel.
strides is how far the kernel moves along the image at each step.
The next line calls tf.layers.conv2d_transpose, the functional version of TensorFlow's Conv2DTranspose layer. It uses the arguments passed into the function above. The kernel size and the strides are each given as a list of two integers.
activation=tf.nn.relu, padding='same',
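This line defines the activation function, which is ReLU. padding='same' pads the input so that the output size depends only on the stride. With a stride of 2, a transposed convolution doubles the height and width.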
kernel_initializer=tf.keras.initializers.glorot_uniform,
kernel_regularizer=tf.keras.regularizers.l2(l=l2_weight),
bias_regularizer=tf.keras.regularizers.l2(l=l2_weight)
These lines are the same as in the dense layer.
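How the stride changes the output size with padding='same' can be sketched as below. These are small helper functions of my own, not part of TensorFlow, and they only cover the 'same' padding case used in this tutorial.

```python
import math


def conv_same_output_size(input_size, stride):
    """Height or width after a convolution with padding='same'."""
    return math.ceil(input_size / stride)


def deconv_same_output_size(input_size, stride):
    """Height or width after a transposed convolution with padding='same'."""
    return input_size * stride


print(deconv_same_output_size(7, 2))   # upsampling step
print(conv_same_output_size(28, 2))    # downsampling step
```

Note that the kernel size does not appear in either formula. With 'same' padding it changes what each output pixel looks at, not how many output pixels there are.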
def _conv2d(inputs, filters, kernel_size, stride, l2_weight):
  return tf.layers.conv2d(
      inputs, filters, [kernel_size, kernel_size], strides=[stride, stride],
      activation=None, padding='same',
      kernel_initializer=tf.keras.initializers.glorot_uniform,
      kernel_regularizer=tf.keras.regularizers.l2(l=l2_weight),
      bias_regularizer=tf.keras.regularizers.l2(l=l2_weight))
The tf.layers.conv2d arguments are the same as in the _deconv2d layer, but there is no activation function.
Using the layers the tutorial has just defined, the next step is to build the generator. The first line
is_training = (mode == tf.estimator.ModeKeys.TRAIN)
sets is_training to True only when the estimator is running in training mode, so that layers such as batch norm know which mode they are in.
net = _dense(noise, 1024, weight_decay)
net = _batch_norm(net, is_training)
net = tf.nn.relu(net)
The net variable stacks layers on top of each other. The first layer is a dense layer with noise as the input, 1024 units, and weight_decay from the generator's arguments as the L2 weight. The next layer is a batch normalisation layer with the net variable as the input and the training argument set to is_training. The third line is the TensorFlow ReLU activation function, with the output of the previous layer passed through it.
net = _dense(net, 7 * 7 * 256, weight_decay)
net = _batch_norm(net, is_training)
net = tf.nn.relu(net)
The next layer added to the neural network is another dense layer, with the previous layers as the input and 7 * 7 * 256 units. The next line is another batch norm like before, followed by another ReLU activation function.
net = tf.reshape(net, [-1, 7, 7, 256])
net = _deconv2d(net, 64, 4, 2, weight_decay)
net = _deconv2d(net, 64, 4, 2, weight_decay)
# Make sure that generator output is in the same range as `inputs`
# ie [-1, 1].
net = _conv2d(net, 1, 4, 1, 0.0)
net = tf.tanh(net)
net = tf.reshape(net, [-1, 7, 7, 256])
Changes the shape of the tensor coming from the previous layer into a batch of 7 by 7 feature maps with 256 channels. The -1 lets TensorFlow work out the batch size.
net = _deconv2d(net, 64, 4, 2, weight_decay)
net = _deconv2d(net, 64, 4, 2, weight_decay)
Two _deconv2d layers are added with the previous layers as inputs. Each has 64 filters, a kernel size of 4 and a stride of 2.
net = _conv2d(net, 1, 4, 1, 0.0)
net = tf.tanh(net)
A convolutional layer with 1 filter, a kernel size of 4, a stride of 1 and an L2 weight of 0.0.
Ending with a tanh activation layer, which keeps the output in the range [-1, 1].
At the end, the function returns the generator output as the net variable.
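To check that the generator really ends up with an image-sized output, we can trace the shapes by hand. The sketch below is plain Python, not TensorFlow, and the function name is made up. It leaves out the batch dimension and assumes noise_dims=64, the value the tutorial sets later on.

```python
def generator_shapes(noise_dims=64):
    """Per-example output shape after each step of the generator."""
    shapes = [("noise", (noise_dims,))]
    shapes.append(("dense 1024", (1024,)))
    shapes.append(("dense 7*7*256", (7 * 7 * 256,)))
    shapes.append(("reshape", (7, 7, 256)))
    size = 7
    for _ in range(2):
        # Stride 2 with padding='same' doubles the height and width.
        size *= 2
        shapes.append(("deconv, 64 filters", (size, size, 64)))
    # Stride 1 with padding='same' keeps the size; 1 filter gives 1 channel.
    shapes.append(("conv, 1 filter", (size, size, 1)))
    return shapes


for name, shape in generator_shapes():
    print(name, shape)
```

The two stride-2 deconvolutions take the 7 by 7 feature maps to 14 by 14 and then 28 by 28, and the final convolution squeezes the 64 channels down to a single greyscale channel.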
For the discriminator, the tutorial first defines a leaky ReLU, another type of activation function.
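The difference between the two activation functions is small enough to show in plain Python. This is only a sketch on single numbers. The alpha value of 0.01 is an assumption for illustration, so check the tutorial code for the value it actually uses.

```python
def relu(x):
    """ReLU: negative inputs become zero."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs are scaled down instead of zeroed.

    alpha=0.01 is an assumed value for this sketch.
    """
    return x if x > 0 else alpha * x


for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), leaky_relu(value))
```

Because the leaky version never goes completely flat for negative inputs, some gradient can still flow back through the neuron.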
In the discriminator function, the first line deletes the unused conditioning argument. The next line sets up the training mode.
Like the generator, it stacks layers using the net variable. The first layer is a convolution layer with the image as the input, 64 filters, a kernel size of 4 and a stride of 2.
After that, a leaky ReLU activation function is added.
The next convolutional layer is the same, except for the number of filters, which is now 128. Another leaky ReLU is added.
The next layer flattens the output coming from the previous layers.
After that, a new dense layer is added with 1024 units. A batch norm layer is added with its training argument set to is_training, and another leaky ReLU is added.
The final layer is a dense layer with 1 unit.
The return statement gives back the output of the stacked layers.
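The discriminator goes the opposite way to the generator, shrinking the image down to a single number. Here is a plain Python sketch of the shapes, based on the description above. The function name is made up, the batch dimension is left out, and it assumes 28 by 28 MNIST images and the padding='same' setting from _conv2d.

```python
import math


def discriminator_shapes(image_size=28):
    """Per-example output shape after each step of the discriminator."""
    shapes = [("image", (image_size, image_size, 1))]
    size = image_size
    filters = 1
    for filters in (64, 128):
        # Stride 2 with padding='same' halves the height and width.
        size = math.ceil(size / 2)
        shapes.append(("conv, %d filters" % filters, (size, size, filters)))
    shapes.append(("flatten", (size * size * filters,)))
    shapes.append(("dense 1024", (1024,)))
    shapes.append(("dense 1", (1,)))
    return shapes


for name, shape in discriminator_shapes():
    print(name, shape)
```

The single output at the end is the discriminator's score for the image.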
Now the tutorial has a section for evaluating the model.
from tensorflow_gan.examples.mnist import util as eval_util
The tutorial imports the utilities for evaluating models.
real_data_logits = tf.reduce_mean(gan_model.discriminator_real_outputs)
gen_data_logits = tf.reduce_mean(gan_model.discriminator_gen_outputs)
These lines take the mean of the discriminator's outputs (logits) on the real data and on the generated data.
real_mnist_score = eval_util.mnist_score(gan_model.real_data)
generated_mnist_score = eval_util.mnist_score(gan_model.generated_data)
frechet_distance = eval_util.mnist_frechet_distance(
gan_model.real_data, gan_model.generated_data)
This works out scores for the MNIST data, using the tensorflow_gan.examples.mnist util package to help evaluate the scores.
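To get a feel for what a Frechet distance measures, here is a one-dimensional version in plain Python. This is a simplification I am using for illustration and not what mnist_frechet_distance does: as far as I understand, the real metric compares multi-dimensional features taken from a classifier. For two one-dimensional Gaussians the distance reduces to the squared gap between the means plus the squared gap between the standard deviations.

```python
import statistics


def frechet_distance_1d(real, generated):
    """Frechet distance between two 1-D samples, treating each as Gaussian."""
    mean_gap = statistics.mean(real) - statistics.mean(generated)
    std_gap = statistics.pstdev(real) - statistics.pstdev(generated)
    return mean_gap ** 2 + std_gap ** 2


# Hypothetical numbers, chosen only for illustration.
real = [0.0, 1.0, 2.0]
print(frechet_distance_1d(real, real))
print(frechet_distance_1d(real, [1.0, 2.0, 3.0]))
```

A lower distance means the generated data looks more like the real data, which is why we want this metric to go down during training.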
return {
    'real_data_logits': tf.metrics.mean(real_data_logits),
    'gen_data_logits': tf.metrics.mean(gen_data_logits),
    'real_mnist_score': tf.metrics.mean(real_mnist_score),
    'mnist_score': tf.metrics.mean(generated_mnist_score),
    'frechet_distance': tf.metrics.mean(frechet_distance),
}
This code block returns the calculated metrics in a dictionary, one entry per score.
train_batch_size = 32 #@param
noise_dimensions = 64 #@param
generator_lr = 0.001 #@param
discriminator_lr = 0.0002 #@param
These lines set up the parameters for the model. The constructor that pieces together the GAN model is called the GANEstimator.
def gen_opt():
  gstep = tf.train.get_or_create_global_step()
  base_lr = generator_lr
  # Halve the learning rate at 1000 steps.
  lr = tf.cond(gstep < 1000, lambda: base_lr, lambda: base_lr / 2.0)
  return tf.train.AdamOptimizer(lr, 0.5)
This function builds the optimiser for the generator and halves the learning rate once training reaches 1000 steps.
gstep = tf.train.get_or_create_global_step()
Gets the global step, a counter of how many training steps the model has taken.
base_lr = generator_lr
Creates a variable so that the base learning rate matches the generator learning rate.
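The learning rate rule inside gen_opt is easier to see without the tf.cond wrapper. Here is the same rule as a plain Python sketch. The function name is made up, and the default base_lr is the generator_lr value from the parameters above.

```python
def generator_learning_rate(step, base_lr=0.001):
    """Full learning rate before step 1000, half of it from then on."""
    return base_lr if step < 1000 else base_lr / 2.0


for step in (0, 999, 1000, 5000):
    print(step, generator_learning_rate(step))
```

The tutorial uses tf.cond instead of a normal if because the global step is a tensor inside the TensorFlow graph, so the choice has to be made when the graph runs.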
gan_estimator = tfgan.estimator.GANEstimator(
    generator_fn=unconditional_generator,
    discriminator_fn=unconditional_discriminator,
    generator_loss_fn=tfgan.losses.wasserstein_generator_loss,
    discriminator_loss_fn=tfgan.losses.wasserstein_discriminator_loss,
    params={'batch_size': train_batch_size, 'noise_dims': noise_dimensions},
    generator_optimizer=gen_opt,
    discriminator_optimizer=tf.train.AdamOptimizer(discriminator_lr, 0.5),
    get_eval_metric_ops_fn=get_eval_metric_ops_fn)
This is the GANEstimator constructor:
generator_fn=unconditional_generator,
discriminator_fn=unconditional_discriminator,
These arguments set the generator and discriminator functions to the ones made earlier.
generator_loss_fn=tfgan.losses.wasserstein_generator_loss,
discriminator_loss_fn=tfgan.losses.wasserstein_discriminator_loss,
These define the loss functions for the generator and the discriminator. Both use the Wasserstein loss function. This loss function tends to make training more stable, as the loss is less likely to fluctuate or get stuck.
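The idea behind the Wasserstein losses fits in a few lines of plain Python. This is a sketch of the basic formulas only, with made-up function names. The TF-GAN versions work on tensors and have extra options that are left out here.

```python
def _mean(values):
    return sum(values) / len(values)


def wasserstein_discriminator_loss_sketch(real_outputs, gen_outputs):
    """The discriminator wants high scores on real data, low on generated."""
    return _mean(gen_outputs) - _mean(real_outputs)


def wasserstein_generator_loss_sketch(gen_outputs):
    """The generator wants the discriminator to score its images highly."""
    return -_mean(gen_outputs)


# Hypothetical discriminator outputs, chosen only for illustration.
real_scores = [2.0, 4.0]
fake_scores = [1.0, 3.0]
print(wasserstein_discriminator_loss_sketch(real_scores, fake_scores))
print(wasserstein_generator_loss_sketch(fake_scores))
```

Both losses are built from the same averages that the evaluation section tracks as real_data_logits and gen_data_logits.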
params={'batch_size': train_batch_size, 'noise_dims': noise_dimensions},
Passes in the parameters set earlier as a dictionary.
generator_optimizer=gen_opt,
discriminator_optimizer=tf.train.AdamOptimizer(discriminator_lr, 0.5),
get_eval_metric_ops_fn=get_eval_metric_ops_fn)
Sets up the optimisers and the function that collects metrics.
The tutorial's comments do a good job of explaining the code.
tf.autograph.set_verbosity cuts down the extra text in the output when training.
import time
steps_per_eval = 500 #@param
max_train_steps = 5000 #@param
batches_for_eval_metrics = 100 #@param
More parameters are set up here. In the notebook we can use the form fields to the right of the code to change them.
steps = []
real_logits, fake_logits = [], []
real_mnist_scores, mnist_scores, frechet_distances = [], [], []
As the comment says, these lists are used to track metrics.
cur_step is the variable for the current step, which is set to 0.
start_time = time.time()
start_time is set to the current time when the program gets to this point.
while cur_step < max_train_steps:
  next_step = min(cur_step + steps_per_eval, max_train_steps)
The while loop keeps going for as long as the current step is smaller than the max training steps.
next_step is the current step plus the steps per evaluation, capped at the max training steps.
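We can see which steps the loop will stop at by running the same logic without any training. This is a plain Python sketch with a made-up function name. The defaults are the max_train_steps and steps_per_eval values from the parameters above.

```python
def training_rounds(max_train_steps=5000, steps_per_eval=500):
    """List the (cur_step, next_step) pair for every pass of the loop."""
    rounds = []
    cur_step = 0
    while cur_step < max_train_steps:
        # min() stops the last round from going past max_train_steps.
        next_step = min(cur_step + steps_per_eval, max_train_steps)
        rounds.append((cur_step, next_step))
        cur_step = next_step
    return rounds


for cur_step, next_step in training_rounds():
    print(cur_step, next_step)
```

The min only matters when max_train_steps is not a multiple of steps_per_eval. For example, with 1200 and 500 the last round runs from step 1000 to step 1200.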
start = time.time()
gan_estimator.train(input_fn, max_steps=next_step)
steps_taken = next_step - cur_step
time_taken = time.time() - start
print('Time since start: %.2f min' % ((time.time() - start_time) / 60.0))
print('Trained from step %i to %i in %.2f steps / sec' % (
cur_step, next_step, steps_taken / time_taken))
cur_step = next_step
As the tutorial says in the text description of this section, it repeatedly calls the train function of the GAN estimator and then shows the generator output. start is set to the time at which that line of code is run.
gan_estimator.train(input_fn, max_steps=next_step)
The input function is used to feed in data, and max_steps is set to the next step.
steps_taken = next_step - cur_step
time_taken = time.time() - start
steps_taken is the difference between next_step and the current step. time_taken is the difference between the current time and start.
print('Time since start: %.2f min' % ((time.time() - start_time) / 60.0))
print('Trained from step %i to %i in %.2f steps / sec' % (
cur_step, next_step, steps_taken / time_taken))
cur_step = next_step
These print statements print out the training time and the training speed, with formatting. The last line moves the current step forward to next_step.
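To see what the formatting does, here is the second print statement wrapped in a small function with made-up inputs. The function name and the 20 seconds are hypothetical. %i prints a whole number and %.2f prints a number with two decimal places.

```python
def throughput_message(cur_step, next_step, time_taken):
    """Build the same message the tutorial prints after each round."""
    steps_taken = next_step - cur_step
    return 'Trained from step %i to %i in %.2f steps / sec' % (
        cur_step, next_step, steps_taken / time_taken)


# Pretend that 500 steps took 20 seconds.
print(throughput_message(0, 500, 20.0))
```

With those inputs the speed works out as 500 / 20 = 25 steps per second, which the format string shows as 25.00.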
metrics = gan_estimator.evaluate(input_fn, steps=batches_for_eval_metrics)
steps.append(cur_step)
real_logits.append(metrics['real_data_logits'])
fake_logits.append(metrics['gen_data_logits'])
real_mnist_scores.append(metrics['real_mnist_score'])
mnist_scores.append(metrics['mnist_score'])
frechet_distances.append(metrics['frechet_distance'])
print('Average discriminator output on Real: %.2f Fake: %.2f' % (
real_logits[-1], fake_logits[-1]))
print('Inception Score: %.2f / %.2f Frechet Distance: %.2f' % (
mnist_scores[-1], real_mnist_scores[-1], frechet_distances[-1]))
This block mainly calculates metrics.
metrics = gan_estimator.evaluate(input_fn, steps=batches_for_eval_metrics)
Runs the evaluation using the input function and the number of batches for evaluation, and returns the metrics as a dictionary.
steps.append(cur_step)
real_logits.append(metrics['real_data_logits'])
fake_logits.append(metrics['gen_data_logits'])
real_mnist_scores.append(metrics['real_mnist_score'])
mnist_scores.append(metrics['mnist_score'])
frechet_distances.append(metrics['frechet_distance'])
These lines of code simply append the metrics to the lists.
print('Average discriminator output on Real: %.2f Fake: %.2f' % (
real_logits[-1], fake_logits[-1]))
print('Inception Score: %.2f / %.2f Frechet Distance: %.2f' % (
mnist_scores[-1], real_mnist_scores[-1], frechet_distances[-1]))
These print statements output the scores from the evaluation.
# Vizualize some images.
iterator = gan_estimator.predict(
    input_fn, hooks=[tf.train.StopAtStepHook(num_steps=21)])
try:
  imgs = np.array([next(iterator) for _ in range(20)])
except StopIteration:
  pass
tiled = tfgan.eval.python_image_grid(imgs, grid_shape=(2, 10))
plt.axis('off')
plt.imshow(np.squeeze(tiled))
plt.show()
This block of code shows some of the generated images.
iterator = gan_estimator.predict(
    input_fn, hooks=[tf.train.StopAtStepHook(num_steps=21)])
try:
  imgs = np.array([next(iterator) for _ in range(20)])
except StopIteration:
  pass
The try/except block creates a variable for the images by calling next on the iterator 20 times. If the iterator runs out early, the StopIteration error is ignored.
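One thing to watch with the try/except version is that if the iterator does run out early, imgs is never assigned. A possible alternative, sketched here with the standard library and a made-up function name, is to take up to 20 items with itertools.islice. This is my own suggestion and not something the tutorial does.

```python
import itertools


def take_images(iterator, count=20):
    """Return up to `count` items, or fewer if the iterator runs out."""
    return list(itertools.islice(iterator, count))


# A stand-in iterator with only 5 items, used instead of predict().
print(take_images(iter(range(5))))
```

With this version a short iterator gives a shorter list instead of an error, so the calling code can check how many images it actually got.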
tiled = tfgan.eval.python_image_grid(imgs, grid_shape=(2, 10))
plt.axis('off')
plt.imshow(np.squeeze(tiled))
plt.show()
Tiles the images into a grid with 2 rows and 10 columns and displays it.
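What tiling means can be shown with a plain Python sketch on tiny images made of nested lists. This is only an illustration of the idea with a made-up function, not how tfgan.eval.python_image_grid is written.

```python
def tile_images(images, rows, cols):
    """Lay out equally sized 2-D images in a rows x cols grid."""
    if len(images) != rows * cols:
        raise ValueError("need exactly rows * cols images")
    grid = []
    for r in range(rows):
        row_images = images[r * cols:(r + 1) * cols]
        for y in range(len(row_images[0])):
            line = []
            # Join pixel row y of every image in this grid row side by side.
            for image in row_images:
                line.extend(image[y])
            grid.append(line)
    return grid


# Twenty tiny 2x2 "images", each filled with its own index.
images = [[[i, i], [i, i]] for i in range(20)]
tiled = tile_images(images, rows=2, cols=10)
print(len(tiled), len(tiled[0]))
```

Twenty 2 by 2 images in a 2 by 10 grid give one picture that is 4 pixels high and 20 pixels wide. In the same way, twenty 28 by 28 images give a picture of 56 by 280 pixels.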
# Plot the metrics vs step.
plt.title('MNIST Frechet distance per step')
plt.plot(steps, frechet_distances)
plt.figure()
plt.title('MNIST Score per step')
plt.plot(steps, mnist_scores)
plt.plot(steps, real_mnist_scores)
plt.show()
Plots the Frechet distance and the MNIST scores against the training steps, below the images.