Matrix Multiplication: Unleashing the Power of Tensors! ⚡¶
"Behold! The sacred art of matrix multiplication - where dimensions dance and vectors bend to my will!" — Professor Victor py Torchenstein
The Attention Formula (Preview of Things to Come)¶
$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$
Where:
- $Q$ is the Query matrix
- $K$ is the Key matrix
- $V$ is the Value matrix
- $d_k$ is the dimension of the key vectors
- $\text{softmax}$ normalizes the attention weights
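As a preview, the formula above translates almost line-for-line into PyTorch. This is a minimal sketch with illustrative shapes (5 queries/keys, $d_k = 8$, value dimension 16), not the full multi-head machinery:

```python
import torch
import torch.nn.functional as F

d_k = 8
Q = torch.randn(5, d_k)   # 5 query vectors
K = torch.randn(5, d_k)   # 5 key vectors
V = torch.randn(5, 16)    # 5 value vectors

scores = Q @ K.T / d_k ** 0.5        # QK^T / sqrt(d_k)
weights = F.softmax(scores, dim=-1)  # each row now sums to 1
output = weights @ V                 # weighted sum of value vectors

print(output.shape)  # torch.Size([5, 16])
```

Note that both matrix products here are exactly the operation this lesson covers, so mastering `matmul` takes you most of the way to attention.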
Basic Matrix Operations¶
Let's start with the fundamentals before we conquer attention mechanisms!
Element-wise multiplication:
$C_{ij} = A_{ij} \times B_{ij}$
Matrix multiplication: $C_{ij} = \sum_{k} A_{ik} \times B_{kj}$
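To see the difference between the two formulas concretely, here is a small worked example with hand-checkable numbers (the 2×2 values are chosen purely for illustration):

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]])
B = torch.tensor([[5., 6.], [7., 8.]])

elementwise = A * B  # C_ij = A_ij * B_ij: multiply matching entries
matmul = A @ B       # C_ij = sum_k A_ik * B_kj: rows of A dot columns of B

print(elementwise)  # tensor([[ 5., 12.], [21., 32.]])
print(matmul)       # tensor([[19., 22.], [43., 50.]])
```

Element-wise multiplication requires the shapes to match; matrix multiplication instead requires the inner dimensions to agree.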
In [2]:
import torch
# Create some matrices for experimentation
A = torch.randn(3, 4)
B = torch.randn(4, 2)
print("Matrix A shape:", A.shape)
print("Matrix B shape:", B.shape)
# Matrix multiplication
C = torch.matmul(A, B)
print("Result C shape:", C.shape)
print("\nMwahahaha! The matrices have been multiplied!")
Matrix A shape: torch.Size([3, 4])
Matrix B shape: torch.Size([4, 2])
Result C shape: torch.Size([3, 2])

Mwahahaha! The matrices have been multiplied!
PyTorch Matrix Multiplication Methods¶
Professor Torchenstein's arsenal includes multiple ways to multiply matrices:
- `torch.matmul()` - The general matrix multiplication function
- `@` operator - Pythonic matrix multiplication (same as `matmul`)
- `torch.mm()` - For 2D matrices only
- `torch.bmm()` - Batch matrix multiplication
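For 2D inputs, all four weapons in the arsenal produce the same result; `torch.bmm()` just needs an extra batch dimension. A quick sketch to confirm they agree:

```python
import torch

A = torch.randn(3, 4)
B = torch.randn(4, 2)

c1 = torch.matmul(A, B)  # general-purpose matmul
c2 = A @ B               # @ operator dispatches to matmul
c3 = torch.mm(A, B)      # strictly 2D
# bmm expects (batch, n, m) x (batch, m, p), so add a batch dim of 1
c4 = torch.bmm(A.unsqueeze(0), B.unsqueeze(0)).squeeze(0)

print(torch.allclose(c1, c2), torch.allclose(c1, c3), torch.allclose(c1, c4))
```

In practice, `matmul` (or `@`) handles broadcasting and batching automatically, while `mm` and `bmm` are stricter about shapes and make intent explicit.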
Mathematical Foundations¶
For matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times p}$:
$$C = AB \quad \text{where} \quad C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$$
This operation is fundamental to:
- Linear transformations
- Neural network forward passes
- Attention mechanisms in Transformers
- And much more! 🧠⚡
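To tie the list above back to the math: a neural network's linear layer is exactly the operation $C = AB$ (plus a bias). A minimal sketch verifying that `torch.nn.Linear` is matmul under the hood, with illustrative sizes (input dimension 4, output dimension 2, batch of 3):

```python
import torch

layer = torch.nn.Linear(4, 2)  # forward pass: x @ W^T + b
x = torch.randn(3, 4)          # batch of 3 input vectors

manual = x @ layer.weight.T + layer.bias  # the same computation, written as matmul
print(torch.allclose(layer(x), manual))
```

Every forward pass you will ever run is built from this one operation, which is why it deserves its own lesson.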