Introduction to Generative Adversarial Networks
Generative Adversarial Networks represent one of the most influential innovations in deep learning, introducing an adversarial training paradigm where two neural networks compete against each other to produce increasingly realistic synthetic data.
The Adversarial Paradigm
Traditional generative models learn to approximate a data distribution directly through maximum likelihood estimation or variational inference. GANs take a fundamentally different approach by framing generation as a game between two players. The generator network attempts to produce fake samples that are indistinguishable from real data, while the discriminator network tries to identify which samples are real and which are generated. This adversarial dynamic drives both networks to improve continuously.
The insight behind GANs comes from game theory. In a zero-sum game, one player's gain equals another's loss. The generator wins by fooling the discriminator, while the discriminator wins by correctly classifying real versus fake samples. At equilibrium, the generator produces samples so realistic that the discriminator cannot do better than random guessing.
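Both players can be realized as ordinary multilayer perceptrons. The sketch below shows a minimal PyTorch implementation; the layer widths, LeakyReLU slopes, and dropout rates are common but illustrative choices rather than canonical values. The Tanh output assumes data scaled to [-1, 1], and the Sigmoid output is the discriminator's estimated probability that its input is real.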
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps latent vectors z to synthetic samples."""
    def __init__(self, latent_dim, output_dim):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, output_dim),
            nn.Tanh()  # outputs in [-1, 1]; scale the data to match
        )

    def forward(self, z):
        return self.model(z)

class Discriminator(nn.Module):
    """Scores samples with the probability that they are real."""
    def __init__(self, input_dim):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(1024, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid()  # probability that the input is real
        )

    def forward(self, x):
        return self.model(x)

The Minimax Objective
The GAN training objective formalizes the adversarial game mathematically. The discriminator maximizes its ability to distinguish real from fake samples, while the generator minimizes the discriminator's success. This creates a minimax optimization problem where we simultaneously optimize both networks with opposing goals.
The value function that defines this game involves expectations over the real data distribution and the generator's output distribution. The discriminator receives high reward for assigning high probability to real samples and low probability to generated samples. The generator receives reward when the discriminator assigns high probability to its generated samples.
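Written out, the original GAN value function is

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$

where $p_{\text{data}}$ is the real data distribution and $p_z$ is the latent prior. The code below implements the widely used non-saturating variant: instead of minimizing $\log(1 - D(G(z)))$, the generator maximizes $\log D(G(z))$, which provides stronger gradients early in training when the discriminator easily rejects generated samples.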
import torch.nn.functional as F

def discriminator_loss(real_output, fake_output):
    # Real samples should be classified as 1, generated samples as 0.
    real_loss = F.binary_cross_entropy(real_output, torch.ones_like(real_output))
    fake_loss = F.binary_cross_entropy(fake_output, torch.zeros_like(fake_output))
    return real_loss + fake_loss

def generator_loss(fake_output):
    # Non-saturating loss: the generator maximizes log D(G(z)).
    return F.binary_cross_entropy(fake_output, torch.ones_like(fake_output))

def train_step(generator, discriminator, real_data, latent_dim, g_optimizer, d_optimizer):
    batch_size = real_data.size(0)
    device = real_data.device

    # Train discriminator on real data and on detached fakes, so that
    # its gradients do not flow back into the generator.
    d_optimizer.zero_grad()
    real_output = discriminator(real_data)
    z = torch.randn(batch_size, latent_dim, device=device)
    fake_data = generator(z)
    fake_output = discriminator(fake_data.detach())
    d_loss = discriminator_loss(real_output, fake_output)
    d_loss.backward()
    d_optimizer.step()

    # Train generator on a fresh batch of latent vectors.
    g_optimizer.zero_grad()
    z = torch.randn(batch_size, latent_dim, device=device)
    fake_data = generator(z)
    fake_output = discriminator(fake_data)
    g_loss = generator_loss(fake_output)
    g_loss.backward()
    g_optimizer.step()

    return d_loss.item(), g_loss.item()

Latent Space and Generation
The generator transforms samples from a simple prior distribution, typically a multivariate Gaussian, into samples that approximate the complex data distribution. This latent space serves as the source of variation in generated samples. Each point in the latent space corresponds to a potential output, and the generator learns a mapping that transforms this simple space into the manifold of realistic data.
The dimensionality of the latent space affects generation quality and diversity. Too few dimensions may limit the generator's ability to capture all modes of variation in the data. Too many dimensions can make training more difficult and may not improve quality. Practitioners typically choose latent dimensions between 64 and 512 for image generation tasks.
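The wrapper below bundles both networks with their optimizers and adds two utilities: drawing samples and interpolating between latent points, a common qualitative check that the generator has learned a smooth mapping rather than memorizing training examples. The Adam settings lr=0.0002 and betas=(0.5, 0.999) follow the widely used DCGAN heuristics.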
class GAN:
    def __init__(self, latent_dim, data_dim, device):
        self.latent_dim = latent_dim
        self.device = device
        self.generator = Generator(latent_dim, data_dim).to(device)
        self.discriminator = Discriminator(data_dim).to(device)
        # Adam with a reduced beta1 of 0.5 is the standard choice for GANs.
        self.g_optimizer = torch.optim.Adam(
            self.generator.parameters(), lr=0.0002, betas=(0.5, 0.999)
        )
        self.d_optimizer = torch.optim.Adam(
            self.discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999)
        )

    def sample(self, num_samples):
        """Draw num_samples synthetic samples from the generator."""
        self.generator.eval()
        with torch.no_grad():
            z = torch.randn(num_samples, self.latent_dim, device=self.device)
            samples = self.generator(z)
        self.generator.train()
        return samples

    def interpolate(self, z1, z2, steps=10):
        """Generate samples along a straight line between two latent points."""
        self.generator.eval()
        with torch.no_grad():
            alphas = torch.linspace(0, 1, steps, device=self.device)
            interpolations = []
            for alpha in alphas:
                z = (1 - alpha) * z1 + alpha * z2
                sample = self.generator(z)
                interpolations.append(sample)
        self.generator.train()
        return torch.stack(interpolations)

Understanding the Discriminator
The discriminator functions as a learned loss function for the generator. Rather than using a fixed metric like mean squared error or pixel-wise differences, GANs learn what makes samples realistic through the discriminator's evolving judgment. This adaptive loss function can capture complex, high-level features that distinguish real from fake samples.
The discriminator must balance between being too easy and too hard to fool. If the discriminator is too weak, it provides no useful signal for the generator to improve. If it is too strong, it rejects generated samples with near-total confidence, saturating the loss and leaving the generator almost no gradient to follow. Maintaining this balance is crucial for stable training.
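Two widely used techniques for keeping the discriminator's feedback informative are sketched below: spectral normalization, which constrains each layer's Lipschitz constant so the discriminator cannot become arbitrarily sharp, and the gradient penalty from WGAN-GP, which encourages the discriminator's gradient norm to stay near 1 on points interpolated between real and fake samples. Note that this discriminator outputs raw scores rather than probabilities, since these techniques are typically paired with losses that do not require a sigmoid output.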
class SpectralNormDiscriminator(nn.Module):
    """Discriminator with spectral normalization on every linear layer,
    constraining its Lipschitz constant to stabilize training."""
    def __init__(self, input_dim):
        super().__init__()
        self.model = nn.Sequential(
            nn.utils.spectral_norm(nn.Linear(input_dim, 1024)),
            nn.LeakyReLU(0.2),
            nn.utils.spectral_norm(nn.Linear(1024, 512)),
            nn.LeakyReLU(0.2),
            nn.utils.spectral_norm(nn.Linear(512, 256)),
            nn.LeakyReLU(0.2),
            nn.utils.spectral_norm(nn.Linear(256, 1))  # raw score, no sigmoid
        )

    def forward(self, x):
        return self.model(x)

def compute_gradient_penalty(discriminator, real_samples, fake_samples):
    """Penalize deviations of the discriminator's gradient norm from 1
    at random interpolations between real and fake samples."""
    batch_size = real_samples.size(0)
    device = real_samples.device
    alpha = torch.rand(batch_size, 1, device=device)
    # Detach so the interpolated points are leaf tensors we can
    # differentiate the discriminator's output with respect to.
    interpolates = (alpha * real_samples + (1 - alpha) * fake_samples).detach()
    interpolates.requires_grad_(True)
    d_interpolates = discriminator(interpolates)
    gradients = torch.autograd.grad(
        outputs=d_interpolates,
        inputs=interpolates,
        grad_outputs=torch.ones_like(d_interpolates),
        create_graph=True,
        retain_graph=True
    )[0]
    gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean()
    return gradient_penalty

The Nash Equilibrium Goal
The theoretical goal of GAN training is to reach a Nash equilibrium where neither network can improve by changing its strategy unilaterally. At this point, the generator produces samples from the true data distribution, and the discriminator outputs probability 0.5 for all inputs, unable to distinguish real from fake.
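This follows from the form of the optimal discriminator. For a fixed generator with output distribution $p_g$, the discriminator that maximizes the value function is

$$D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$$

so when the generator matches the data distribution exactly ($p_g = p_{\text{data}}$), the best the discriminator can do is output $D^*(x) = 1/2$ everywhere.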
In practice, reaching this equilibrium is challenging. The optimization landscape is non-convex, and gradient descent on the minimax objective does not guarantee convergence. The training dynamics can exhibit oscillations, mode collapse where the generator produces limited variety, or divergence where one network dominates completely.
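Because the losses alone can be hard to interpret, a simple diagnostic is to track the discriminator's mean output on real and fake batches: values pinned near 1 and 0 suggest the discriminator is dominating, while both hovering near 0.5 is consistent with (though no guarantee of) balanced training. The helper below is a hypothetical addition for illustration, assuming the probability-outputting Discriminator defined earlier.

def discriminator_balance(discriminator, real_data, fake_data):
    # Hypothetical monitoring helper: mean discriminator outputs on a
    # real and a fake batch. Near (1.0, 0.0) suggests the discriminator
    # dominates; near (0.5, 0.5) suggests neither player is winning.
    with torch.no_grad():
        real_score = discriminator(real_data).mean().item()
        fake_score = discriminator(fake_data).mean().item()
    return real_score, fake_score

The training loop below runs train_step over a dataloader and records per-epoch average losses, which can be plotted to spot oscillation or divergence.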
def train_gan(gan, dataloader, epochs):
    history = {"d_loss": [], "g_loss": []}
    for epoch in range(epochs):
        epoch_d_loss = 0
        epoch_g_loss = 0
        num_batches = 0
        for real_data in dataloader:
            # Dataloaders typically yield (data, label) tuples; keep the data.
            real_data = real_data[0].to(gan.device)
            d_loss, g_loss = train_step(
                gan.generator,
                gan.discriminator,
                real_data,
                gan.latent_dim,
                gan.g_optimizer,
                gan.d_optimizer
            )
            epoch_d_loss += d_loss
            epoch_g_loss += g_loss
            num_batches += 1
        history["d_loss"].append(epoch_d_loss / num_batches)
        history["g_loss"].append(epoch_g_loss / num_batches)
        if (epoch + 1) % 10 == 0:
            print(f"Epoch {epoch+1}: D_loss={history['d_loss'][-1]:.4f}, "
                  f"G_loss={history['g_loss'][-1]:.4f}")
    return history

Key Takeaways
Generative Adversarial Networks introduce adversarial training where a generator and discriminator compete in a minimax game. The generator transforms latent vectors into synthetic samples while the discriminator learns to distinguish real from fake. This framework enables learning complex data distributions without explicit likelihood computation. The discriminator serves as a learned loss function that adapts during training. While theoretically elegant, GANs present significant training challenges that subsequent sections will address.