Machine Learning and GANs Part 1

Today I will be trying to teach myself how to develop GANs, which stands for Generative Adversarial Networks. This is a machine learning model which contains a "generative" model and a "discriminative" model. The generative model generates data, and the discriminative model inspects the data shown to it and tries to tell real from fake.

I have some previous experience playing around with machine learning, so this project should act as a refresher. I will be using Google’s machine learning course.

The structure of a GAN contains the generator, which creates fake data that is later passed to the discriminator, which tries to spot the difference between fake and real data. When the discriminator detects that the data given to it is fake, the generator is penalised. This feedback allows the generator to improve. If the discriminator incorrectly guesses that fake data is real, it gets penalised itself, and it also collects feedback and gets better.

Training

The generator and discriminator have different training methods, so they can't be trained at the same time. The GAN must alternate between training the discriminator and the generator. As the generator improves, the discriminator's performance decreases, because the discriminator struggles to see the difference between fake and real data.
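To make the alternating idea concrete, here is a minimal sketch of the training loop. The two step functions are hypothetical stand-ins I made up, not part of any library; a real implementation would compute losses and apply gradient updates inside them.

import random

def train_discriminator_step():
    # Hypothetical stand-in for one discriminator update on a mix of
    # real and generated data; returns the discriminator's loss.
    return random.random()

def train_generator_step():
    # Hypothetical stand-in for one generator update driven by the
    # discriminator's feedback; returns the generator's loss.
    return random.random()

# The GAN alternates: a discriminator step, then a generator step.
for step in range(1000):
    d_loss = train_discriminator_step()
    g_loss = train_generator_step()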

Coding up the example

I am using Google's TF-GAN package. I read through the example to see how a GAN is developed line by line.


import tensorflow_gan as tfgan

This line of code imports the TensorFlow GAN package. This is a library which makes it easier to train and analyse GANs.

import tensorflow_datasets as tfds

This line imports TensorFlow Datasets. These are normally used for example projects, as in real-life projects you would be using your own data.

import matplotlib.pyplot as plt

This line imports matplotlib, which is used to show images of the data, so we can actually see it.

import numpy as np

This import is used to manipulate the arrays and matrices in the code, to change the shape of an array and so on.

[Screenshot: the input_fn(mode, params) function from the TF-GAN example]
assert 'batch_size' in params
assert 'noise_dims' in params

The assert statements are used to check whether a condition is true. These assert statements test that the keys 'batch_size' and 'noise_dims' are present in the params dictionary.
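As a small made-up example of how these checks behave:

params = {'batch_size': 100, 'noise_dims': 64}
assert 'batch_size' in params   # key exists, so this passes silently
assert 'noise_dims' in params   # same here

bad_params = {'batch_size': 100}
# assert 'noise_dims' in bad_params  # this would raise an AssertionError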

bs = params['batch_size']
nd = params['noise_dims']

These lines assign the values of the batch size and noise dimensions keys to variables. The reason they are keys is because params is set up as a dictionary later on.

split = 'train' if mode == tf.estimator.ModeKeys.TRAIN else 'test'
shuffle = (mode == tf.estimator.ModeKeys.TRAIN)
just_noise = (mode == tf.estimator.ModeKeys.PREDICT)

These variables set up behaviour based on the mode argument of the function. The first line picks which part of the dataset to load and saves it in the split variable: 'train' in training mode, otherwise 'test'. The second line turns shuffling on only in training mode; this variable is used in an if statement later on. The third line sets just_noise to true in prediction mode, where only noise is needed.
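A quick check I would run to see what the three variables evaluate to in each mode (assuming the TF1-style tf.estimator.ModeKeys constants the example uses):

import tensorflow as tf

for mode in (tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.PREDICT):
    split = 'train' if mode == tf.estimator.ModeKeys.TRAIN else 'test'
    shuffle = (mode == tf.estimator.ModeKeys.TRAIN)
    just_noise = (mode == tf.estimator.ModeKeys.PREDICT)
    print(mode, split, shuffle, just_noise)
# TRAIN:   split='train', shuffle=True,  just_noise=False
# PREDICT: split='test',  shuffle=False, just_noise=True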

noise_ds = (tf.data.Dataset.from_tensors(0).repeat()
              .map(lambda _: tf.random_normal([bs, nd])))

This code produces random noise samples. tf.data.Dataset.from_tensors(0) creates a dataset object, the repeat function repeats the dataset an infinite number of times, and the map function replaces each element with a tensor of random numbers in the shape of the batch size and noise dimensions.
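To check my understanding, pulling one element out of a dataset built this way should give a tensor of shape [batch_size, noise_dims]. This is my own test snippet, using the newer tf.random.normal spelling and eager execution:

import tensorflow as tf

bs, nd = 100, 64
noise_ds = (tf.data.Dataset.from_tensors(0).repeat()
            .map(lambda _: tf.random.normal([bs, nd])))

# Pull a single element: it should be one batch of noise vectors.
for noise in noise_ds.take(1):
    print(noise.shape)  # (100, 64)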

[Screenshot: the _preprocess(element) function from the TF-GAN example]

This new function pre-processes images. Its job is to map images from the range [0, 255] to [-1, 1]. The number 255 is chosen because it is the maximum value of an 8-bit pixel, and the values are converted to floats so they can be rescaled into the -1 to 1 range.

images = (tf.cast(element['image'], tf.float32) - 127.5) / 127.5

tf.cast(element['image'], tf.float32)

tf.cast converts a tensor to a different data type. Inside the tf.cast call are the image from the element parameter of the pre-process function and tf.float32, so the image is converted into the float32 datatype.

- 127.5) / 127.5

This normalises the data. Normalisation is a data preparation technique used to bring numeric values onto a common scale. Here the pixels of the images are scaled into the range -1 to 1.
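The arithmetic is easy to verify on the edge cases: a black pixel (0) maps to -1, the midpoint (127.5) maps to 0, and a white pixel (255) maps to 1.

import numpy as np

pixels = np.array([0.0, 127.5, 255.0])
print((pixels - 127.5) / 127.5)  # [-1.  0.  1.]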

images_ds = (tfds.load('mnist', split=split)
               .map(_preprocess)
               .cache()
               .repeat())

The images_ds variable's main job is to load the MNIST dataset. The tfds.load function loads the named dataset, here MNIST, into the program, and takes a split argument which is set to the split variable from earlier. The .map(_preprocess) call then applies the pre-processing function from above to every image.

The .cache() function caches the elements of the dataset, so later passes over the data don't have to reload and re-process them.

The .repeat() call repeats the dataset indefinitely, so training can loop over the data as many times as needed.
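A toy example of what cache and repeat do together (my own, not from the tutorial): the three elements are produced once, cached, and then cycled endlessly.

import tensorflow as tf

ds = tf.data.Dataset.range(3).cache().repeat()
print([x.numpy() for x in ds.take(7)])  # [0, 1, 2, 0, 1, 2, 0]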

if shuffle:
    images_ds = images_ds.shuffle(
        buffer_size=10000, reshuffle_each_iteration=True)
images_ds = (images_ds.batch(bs, drop_remainder=True)
             .prefetch(tf.data.experimental.AUTOTUNE))

The if statement only runs if the shuffle option is picked, i.e. in training mode.

The .shuffle() function randomly shuffles the items in the dataset. The buffer_size argument decides how large the buffer will be. The buffer is a selection of elements from the dataset; the shuffle function picks items at random from this buffer, replacing each chosen element with a new one from the rest of the dataset.

The reshuffle_each_iteration argument means the dataset is reshuffled after each pass over the data.

The .batch(bs, drop_remainder=True) function merges the elements into batches, using the bs variable from earlier to set the batch size. The drop_remainder argument is set to true so that no smaller final batch is produced when the number of elements doesn't divide evenly into full batches.
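A small demonstration of drop_remainder (my own toy example): with 10 elements and a batch size of 3, the incomplete final batch is thrown away.

import tensorflow as tf

ds = tf.data.Dataset.range(10).batch(3, drop_remainder=True)
for batch in ds:
    print(batch.numpy())
# [0 1 2], [3 4 5], [6 7 8] -- the leftover element 9 is dropped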

.prefetch(tf.data.experimental.AUTOTUNE))

This creates a dataset that prefetches elements from the pipeline. This is done so training runs faster, as the next batch is being prepared while the current one is used.

For more information check this link

return tf.data.Dataset.zip((noise_ds, images_ds))

This line of code zips the noise and image datasets together, so each element of the returned dataset is a (noise, images) pair.
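A toy version of zip with plain number ranges shows the pairing (my own example):

import tensorflow as tf

a = tf.data.Dataset.range(3)       # stands in for noise_ds
b = tf.data.Dataset.range(10, 13)  # stands in for images_ds
for noise, image in tf.data.Dataset.zip((a, b)):
    print(noise.numpy(), image.numpy())  # pairs: (0, 10), (1, 11), (2, 12)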

params = {'batch_size': 100, 'noise_dims':64}

The parameters are set up as a dictionary containing the batch size and the noise dims.

with tf.Graph().as_default():

This line creates the computation graph where the TensorFlow operations can run.

ds = input_fn(tf.estimator.ModeKeys.TRAIN, params)

This dataset variable uses the input function from earlier. The mode key is set to train and the parameters are set using the params variable from earlier.

numpy_imgs = next(tfds.as_numpy(ds))[1]

This line converts the images into NumPy arrays. tfds.as_numpy turns the dataset into an iterator that yields NumPy values, and the next() function returns the next item from it. The [1] index picks the images out of the (noise, images) pair.
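A sanity check of the shapes, reusing ds from the snippet above (my own check, assuming batch_size=100 and noise_dims=64 as set in params):

noise_np, imgs_np = next(tfds.as_numpy(ds))
print(noise_np.shape)  # (100, 64) -- the noise batch
print(imgs_np.shape)   # (100, 28, 28, 1) -- one batch of MNIST images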

img_grid = tfgan.eval.python_image_grid(numpy_imgs, grid_shape=(10, 10))

This line of code arranges the batch of images into a 10x10 grid, which is stored in the img_grid variable.
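If I print the shape, I would expect the 100 MNIST images of shape (28, 28, 1) to be tiled into a single (280, 280, 1) array (my expectation, not verified output):

print(img_grid.shape)  # expected: (280, 280, 1) -- 10x10 tiles of 28x28 images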

plt.axis('off')

This removes the matplotlib axis when showing the image.

plt.imshow(np.squeeze(img_grid))

np.squeeze removes the size-1 channel dimension from the grid so matplotlib can display it as a plain 2D greyscale image, and plt.imshow then draws that array.

plt.show()

This displays the image.

Tobi Olabode