
PyTorch Course: Deconstructing Modern Architectures

Welcome, my aspiring apprentices, to the crucible of creation! You stand at the precipice of a great awakening. Within these hallowed digital halls, we shall not merely learn PyTorch; we shall master it, shaping its very tensors to our will. This is not a course; it is a summons. A call to arms for all who dare to dream in tensors and architect the future! Prepare yourselves, for the path ahead is fraught with peril, caffeine, and the incandescent glow of computational glory! Mwahahaha!

PyTorch Course: Let's get started - Torchenstein

Do you want to hear the origin story of this course? Click here

Course Goal: To imbue you—my fearless apprentices—with the eldritch secrets of PyTorch building blocks, enabling you to conjure, dissect, and ultimately command modern neural network architectures like Transformers and Diffusion models.

Learner level: Beginner to Advanced

Prerequisite Madness: None! Whether you are a fresh-faced initiate or a seasoned GPU warlock, the lab doors stand open. ⚡️🧪

Module 0: Getting Started with PyTorch

Before we unleash neural monstrosities upon the world, we must ignite your development lair. This module guides you through preparing PyTorch on any operating system—so your GPUs purr at your command.

What you will learn:

  1. How to set up a PyTorch environment for different operating systems and test it.

Lessons:

  1. Master Blueprint for the Rebellion - The master blueprint of our curriculum—study it well, rebel!
  2. Setting Up Your PyTorch Environments:
    1. Windows: Assembling the PyTorch Development Environment - Assemble your PyTorch lab on Windows. We'll use pyenv and poetry to perfectly manage your Python setup, preparing it for tensor rebellion.
    2. Linux: Assembling the PyTorch Open-Source Toolkit - Forge your PyTorch toolkit on the powerful and open foundation of Linux for maximum freedom and experimentation.
    3. macOS: Assembling Your PyTorch Setup - Calibrate your macOS system and assemble the ultimate PyTorch setup to awaken the neural engine of your Apple silicon.
    4. Google Colab: Assembling the Cloud Laboratory - Set up your PyTorch laboratory in the cloud with Google Colab. Seize the power of free GPUs for our grand experiments—mwahaha!
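
Once your chosen lair is assembled, a quick sanity check looks something like the sketch below (a minimal example; which device branch it takes depends on your hardware, and the MPS check only applies to Apple silicon):

```python
import torch

# Report the installed PyTorch version.
print(f"PyTorch version: {torch.__version__}")

# Pick the best available device: CUDA GPU, Apple silicon (MPS), or plain CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")

# A tiny matrix multiplication confirms tensors actually compute on that device.
x = torch.randn(3, 3, device=device)
y = torch.randn(3, 3, device=device)
print(x @ y)
```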

Module 1: PyTorch Core - I see tensors everywhere

Here we unveil the truth: the cosmos is a writhing mass of tensors awaiting our manipulation. Grasp them well—for they are the bedrock of every grand scheme to come!

This module dives into the fundamental components of PyTorch, essential for any deep learning task.

1.1 Tensors: The Building Blocks

What you will learn:

  1. The tensor concept: What is a tensor? Tensor vs. matrix. Mathematical vs. PyTorch interpretation. Why tensors are crucial for ML.
  2. PyTorch Basics: Tensor creation and their attributes (dtype, shape, device).
  3. Tensor manipulation: Indexing, Slicing, Joining (torch.cat, torch.stack), Splitting. Manipulating tensor shapes (reshape, view, squeeze, unsqueeze, permute, transpose).

1.1 Lessons:

  1. Summoning Your First Tensors - Conjure tensors from the void, inspect their properties, revel in their latent might (with a bit of help from torch.randn, torch.zeros, torch.ones, torch.arange, torch.linspace, etc.).
  2. Tensor Shape-Shifting & Sorcery - Slice, squeeze, and permute dimensions until reality warps to your whims (with a bit of help from torch.cat, torch.stack, torch.split, torch.reshape, Tensor.view, torch.squeeze, torch.unsqueeze, torch.permute, torch.transpose, etc.).
  3. DTypes & Devices: Choose Your Weapons - Select precision and hardware like a seasoned archmage choosing spell components, with dtypes such as torch.float32, torch.float16, and friends.
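
To make the topics above concrete, here is a minimal sketch of tensor creation, attributes, and shape-shifting (all values are illustrative):

```python
import torch

# Creation: conjure tensors from a Python list and from factory functions.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.randn(2, 2)                     # random normal values
c = torch.arange(6)                       # 0, 1, 2, 3, 4, 5

# Attributes: every tensor carries a dtype, a shape, and a device.
print(a.dtype, a.shape, a.device)         # torch.float32 torch.Size([2, 2]) cpu

# Manipulation: reshape, add/remove dimensions, swap axes, join tensors.
d = c.reshape(2, 3)                       # shape (2, 3)
e = d.unsqueeze(0)                        # shape (1, 2, 3)
f = e.squeeze(0).transpose(0, 1)          # shape (3, 2)
g = torch.cat([a, b], dim=0)              # shape (4, 2)
h = torch.stack([a, b], dim=0)            # shape (2, 2, 2)
print(d.shape, e.shape, f.shape, g.shape, h.shape)
```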

1.2 Tensor Operations: Computation at Scale

What you will learn:

  1. Overview of tensor math. Element-wise operations. Reduction operations (sum, mean, max, min, std). Basic matrix multiplication (torch.mm, torch.matmul, @ operator). Broadcasting: rules and practical examples with verifiable tiny data. In-place operations.

1.2 Lessons:

  1. Elemental Tensor Alchemy - Brew element-wise, reduction, and other operations into potent mathematical elixirs.
  2. Matrix Mayhem: Multiply or Perish - Orchestrate 2-D, batched, and high-dimensional multiplications with lethal elegance.
  3. Broadcasting: When Dimensions Bow to You - Command mismatched shapes to cooperate through the dark art of implicit expansion.
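
A small sketch of these operations on tiny, verifiable tensors:

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = torch.tensor([[10.0, 20.0], [30.0, 40.0]])

# Element-wise operations act entry by entry.
print(x + y)          # tensor([[11., 22.], [33., 44.]])
print(x * y)          # tensor([[10., 40.], [90., 160.]])

# Reductions collapse dimensions.
print(x.sum(), x.mean(), x.max(dim=1).values)

# Matrix multiplication: torch.mm, torch.matmul, or the @ operator.
print(x @ y)          # tensor([[ 70., 100.], [150., 220.]])

# Broadcasting: a (2, 2) tensor plus a (2,) row expands implicitly.
row = torch.tensor([100.0, 200.0])
print(x + row)        # tensor([[101., 202.], [103., 204.]])

# In-place operations mutate the tensor and end with an underscore.
x.add_(1.0)
print(x)              # tensor([[2., 3.], [4., 5.]])
```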

1.3 Einstein Summation: The Power of einsum

What you will learn:

  1. Understanding Einstein notation. Why it's powerful for complex operations (e.g., attention).

1.3 Lessons:

  1. Einstein Summation: Harness the Λ-Power - Invoke einsum to express complex ops with maddening brevity.
  2. Advanced Einsum Incantations - Wield multi-tensor contractions that underpin attention itself.
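
As a preview, a minimal einsum sketch: ordinary matrix multiplication first, then the batched query-key contraction that shows up in attention (tensor names and shapes are just illustrative):

```python
import torch

# Plain matrix multiplication: sum over the shared index k.
a = torch.randn(2, 3)
b = torch.randn(3, 4)
c = torch.einsum("ik,kj->ij", a, b)        # same result as a @ b
print(torch.allclose(c, a @ b))            # True

# Batched attention scores: (batch, heads, seq, dim) queries and keys,
# contracted over the shared feature dimension d.
q = torch.randn(2, 4, 5, 8)
k = torch.randn(2, 4, 5, 8)
scores = torch.einsum("bhqd,bhkd->bhqk", q, k)
print(scores.shape)                        # torch.Size([2, 4, 5, 5])
```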

1.4 Autograd: Automatic Differentiation

What you will learn:

  1. What are gradients? The computational graph. How PyTorch tracks operations.
  2. requires_grad attribute. Performing backward pass with .backward(). Accessing gradients with .grad. torch.no_grad() and tensor.detach().
  3. Gradient accumulation. Potential pitfalls. Visualizing computational graphs (conceptually).

1.4 Lessons:

  1. Autograd: Ghosts in the Machine (Learning) - Meet the spectral gradient trackers haunting every tensor operation.
  2. Gradient Hoarding for Grand Spells - Accumulate gradients like arcane energy before unleashing colossal updates.
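
A minimal sketch of the autograd behaviour described above, including the accumulation pitfall:

```python
import torch

# requires_grad=True tells autograd to record operations on this tensor.
x = torch.tensor([2.0, 3.0], requires_grad=True)

# Build a tiny computational graph: y = sum(x**2).
y = (x ** 2).sum()

# The backward pass populates x.grad with dy/dx = 2x.
y.backward()
print(x.grad)                 # tensor([4., 6.])

# Gradients accumulate across backward calls unless cleared.
z = (x ** 2).sum()
z.backward()
print(x.grad)                 # tensor([8., 12.]) - accumulated, not replaced
x.grad.zero_()                # reset before the next spell

# torch.no_grad() and detach() opt out of tracking.
with torch.no_grad():
    w = x * 2                 # not recorded in the graph
print(w.requires_grad)        # False
```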

Module 2: torch.nn — Building Neural Networks

Witness code coalescing into living, breathing neural contraptions! In this module we bend torch.nn to our will, assembling layers and models worthy of legend.

2.1 The nn.Module Blueprint

What you will learn:

- The role of nn.Module as the base class for layers and models.
- Implementing __init__ and forward.
- Registering parameters and buffers.
- Composing modules with nn.Sequential, nn.ModuleList, and nn.ModuleDict.
- Saving and restoring weights with state_dict.

2.1 Lessons:

  1. Building Brains with nn.Module - Craft custom neural matter by overriding __init__ & forward.
  2. Franken-Stacking Layers - Bolt modules together with Sequential, ModuleList, and ModuleDict.
  3. Preserving Your Monster's Memories - Save and resurrect model weights with state_dict necromancy.
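
A compact sketch of the blueprint: a custom nn.Module with __init__ and forward, an nn.Sequential body, and a state_dict save/restore round trip (the class name, layer sizes, and file name are made up):

```python
import torch
from torch import nn

class TinyBrain(nn.Module):
    """A custom module: parameters are registered in __init__, logic lives in forward."""
    def __init__(self, in_features: int, hidden: int, out_features: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

model = TinyBrain(4, 8, 2)
print(model(torch.randn(3, 4)).shape)      # torch.Size([3, 2])

# Preserve the monster's memories... and resurrect them in a fresh body.
torch.save(model.state_dict(), "tiny_brain.pt")
revived = TinyBrain(4, 8, 2)
revived.load_state_dict(torch.load("tiny_brain.pt"))
```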

2.2 Linear Layer and Activations

What you will learn:

- Linear layers and high-dimensional matrix multiplication.
- The role of linear layers in attention mechanisms (query, key, value).
- Activation functions (ReLU, GELU, SiLU, Tanh, Softmax, etc.).
- Dropout for regularisation.

2.2 Lessons:

  1. Linear Layers: The Vector Guillotine - Slice through dimensions turning inputs into finely-chopped activations.
  2. Activation Elixirs - Re-animate neurons with ReLU, GELU, SiLU, and other zesty potions.
  3. Dropout: Network lobotomy - Make neurons forget just enough to generalise—no lobotomy required.
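
A short sketch of a linear layer feeding an activation and dropout, plus the query/key/value projections mentioned above (all dimensions are placeholders):

```python
import torch
from torch import nn

# A linear layer maps the last dimension: here (batch, seq, 16) -> (batch, seq, 32).
linear = nn.Linear(16, 32)
x = torch.randn(2, 10, 16)
h = linear(x)                      # shape (2, 10, 32)

# Activations add non-linearity; dropout randomly zeroes activations during training.
act = nn.GELU()
drop = nn.Dropout(p=0.1)
out = drop(act(h))
print(out.shape)                   # torch.Size([2, 10, 32])

# In attention, three separate linear layers typically project the same input
# into queries, keys, and values.
q_proj, k_proj, v_proj = nn.Linear(16, 32), nn.Linear(16, 32), nn.Linear(16, 32)
q, k, v = q_proj(x), k_proj(x), v_proj(x)
```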

2.3 Embedding Layers

What you will learn:

- Embedding layers and their purpose in neural networks.
- Embedding layer implementation from scratch, initialisation, and usage.
- Positional encoding and how it is used to inject order into the model.

2.3 Lessons:

  1. Embedding Layers: Secret Identity Chips - Embed discrete meanings within high-dimensional space.
  2. Positional Encoding: Injecting Order into Chaos - Imbue sequences with a sense of place so attention never loses its bearings.
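
A minimal sketch of an embedding lookup combined with a simple learned positional embedding (vocabulary size, sequence length, and dimensions are placeholders; sinusoidal encodings are an equally valid alternative):

```python
import torch
from torch import nn

vocab_size, max_len, dim = 100, 16, 8

# Token embedding: maps integer ids to dense vectors.
tok_emb = nn.Embedding(vocab_size, dim)

# A learned positional embedding: one vector per position in the sequence.
pos_emb = nn.Embedding(max_len, dim)

ids = torch.tensor([[5, 42, 7, 0]])             # (batch=1, seq=4) token ids
positions = torch.arange(ids.shape[1])          # 0, 1, 2, 3

# Inject order by adding positional vectors to token vectors.
x = tok_emb(ids) + pos_emb(positions)
print(x.shape)                                  # torch.Size([1, 4, 8])
```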

2.4 Normalisation Layers

What you will learn:

- BatchNorm vs. LayerNorm and when to use each.
- RMSNorm and other modern alternatives.
- Training vs. evaluation mode caveats.

2.4 Lessons:

  1. Normalization: Calming the Beast - Tame activations with BatchNorm and LayerNorm before they explode.
  2. RMSNorm & Other Exotic Tonics - Sample contemporary concoctions for stable training.
  3. Train vs. Eval: Split Personality Disorders - Toggle modes and avoid awkward identity crises.
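
A sketch contrasting LayerNorm and BatchNorm, plus the train/eval toggle (shapes are illustrative):

```python
import torch
from torch import nn

# LayerNorm normalises each sample over its feature dimension.
x = torch.randn(4, 6, 10)        # (batch, seq, features=10)
layer_norm = nn.LayerNorm(10)
print(layer_norm(x).shape)       # torch.Size([4, 6, 10])

# BatchNorm1d normalises each channel over the whole batch;
# it expects input shaped (batch, channels, length).
y = torch.randn(4, 6, 10)        # (batch, channels=6, length)
batch_norm = nn.BatchNorm1d(6)
print(batch_norm(y).shape)       # torch.Size([4, 6, 10])

# BatchNorm behaves differently in train vs. eval mode:
# training uses batch statistics, evaluation uses running estimates.
batch_norm.eval()
with torch.no_grad():
    _ = batch_norm(y)
batch_norm.train()
```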

2.5 Loss Functions — Guiding Optimisation

What you will learn:

- A recap of loss functions: the main types and when to use each.
- Preparing inputs and targets for loss functions, and interpreting outputs (logits vs. probabilities).
- Interpreting reduction modes and ignore indices.

2.5 Lessons:

  1. Loss Potions: Guiding Pain into Progress - Channel model errors into gradients that sharpen intelligence.
  2. Preparing Sacrificial Inputs & Targets - Align logits and labels for maximum learning agony.
  3. Reduction Rituals & Ignore Indices - Decipher reduction modes and skip unworthy samples without remorse.
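
A sketch of the logits-and-labels plumbing for nn.CrossEntropyLoss, including reduction modes and ignore_index (class counts and the padding value are arbitrary):

```python
import torch
from torch import nn

batch, num_classes = 4, 5

# Models emit raw logits; CrossEntropyLoss applies log-softmax internally,
# so do NOT pass probabilities.
logits = torch.randn(batch, num_classes)
targets = torch.tensor([1, 0, 3, 2])

loss = nn.CrossEntropyLoss()(logits, targets)                  # mean over the batch by default
per_sample = nn.CrossEntropyLoss(reduction="none")(logits, targets)
print(loss, per_sample.shape)                                  # scalar, torch.Size([4])

# ignore_index skips unworthy samples (e.g., padding tokens) entirely.
targets_with_pad = torch.tensor([1, 0, -100, 2])
masked = nn.CrossEntropyLoss(ignore_index=-100)(logits, targets_with_pad)
print(masked)
```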