Share

Building a 2D Heat Equation Simulation with CuPy: A Step-by-Step Guide

by nowrelated · June 2, 2025

In this blog post, we’ll use CuPy, a NumPy-compatible library for GPU-accelerated computing, to build a practical project: simulating the 2D heat equation. This isn’t about basic setup or simple examples—instead, we’ll create a numerical simulation from scratch, compare CPU and GPU performance, and visualize the results. By the end, you’ll see how CuPy can supercharge computational tasks and have a blueprint for your own GPU-powered projects.


1. What’s CuPy and Why This Project?

CuPy lets you run NumPy-like operations on an NVIDIA GPU, speeding up array-based computations dramatically. It’s perfect for tasks involving large datasets or intensive math—like scientific simulations.

Our project simulates the 2D heat equation, which models how heat spreads across a surface (think of a hot plate cooling down). This involves solving a partial differential equation numerically, requiring lots of matrix operations—a great fit for GPU acceleration. We’ll implement it on both CPU (with NumPy) and GPU (with CuPy), compare the performance, and visualize the heat diffusion.


2. Project Overview: The 2D Heat Equation

The 2D heat equation describes heat diffusion over time:

[\frac{\partial u}{\partial t} = \alpha \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right)]

  • (u(x, y, t)): Temperature at position ((x, y)) and time (t).
  • (\alpha): Thermal diffusivity (how fast heat spreads).

We’ll solve this using the finite difference method, approximating derivatives by updating a grid of temperatures over small time steps. This requires computing differences across a large 2D grid repeatedly—ideal for showcasing GPU parallelism.


3. Step 1: Setting Up the Grid

We’ll start with a 1000×1000 grid, setting boundaries to 0°C and a hot square (100°C) in the center. Here’s the CuPy code:

import cupy as cp
import numpy as np

# Grid size
N = 1000
# Initial temperature grid on GPU
u = cp.zeros((N, N))
# Boundary conditions
u[0, :] = 0    # Top
u[-1, :] = 0   # Bottom
u[:, 0] = 0    # Left
u[:, -1] = 0   # Right
# Heat source in the center
u[N//4:3*N//4, N//4:3*N//4] = 100

This grid stays on the GPU, minimizing data transfers—a key optimization for GPU computing.


4. Step 2: CPU Implementation with NumPy

To establish a baseline, let’s solve the heat equation on the CPU using NumPy. We’ll use an explicit finite difference scheme:

# Parameters
alpha = 0.01          # Thermal diffusivity
dx = 1.0 / (N - 1)    # Spatial step
dt = dx**2 / (4 * alpha)  # Time step for stability
steps = 1000          # Number of iterations

# Initial grid on CPU
u_np = np.zeros((N, N))
u_np[N//4:3*N//4, N//4:3*N//4] = 100

# Simulation loop
for _ in range(steps):
    u_xx = (np.roll(u_np, -1, axis=0) - 2*u_np + np.roll(u_np, 1, axis=0)) / dx**2
    u_yy = (np.roll(u_np, -1, axis=1) - 2*u_np + np.roll(u_np, 1, axis=1)) / dx**2
    u_np += alpha * dt * (u_xx + u_yy)

Here, np.roll shifts the grid to compute spatial derivatives, and we update the temperature iteratively. For a 1000×1000 grid, this can take a while on a CPU.


5. Step 3: GPU Implementation with CuPy

Now, let’s adapt this to CuPy. The code is nearly identical, but runs on the GPU:

# Initial grid on GPU (already set up as u)
u_cp = u.copy()

# Simulation loop
for _ in range(steps):
    u_xx = (cp.roll(u_cp, -1, axis=0) - 2*u_cp + cp.roll(u_cp, 1, axis=0)) / dx**2
    u_yy = (cp.roll(u_cp, -1, axis=1) - 2*u_cp + cp.roll(u_cp, 1, axis=1)) / dx**2
    u_cp += alpha * dt * (u_xx + u_yy)

The magic is in CuPy’s GPU execution. Operations like cp.roll are parallelized across the GPU’s cores, slashing computation time.


6. Step 4: Performance Comparison

Let’s time both versions (results depend on hardware, but here’s an example):

  • CPU (NumPy): ~120 seconds
  • GPU (CuPy): ~5 seconds

That’s a 20-30x speedup! For small grids, GPU overhead might outweigh benefits, but for large-scale problems like this, CuPy shines. Keeping data on the GPU avoids transfer bottlenecks, amplifying the gain.


7. Step 5: Visualizing the Results

To see the heat diffusion, we’ll plot the final grid using Matplotlib. Since u_cp is on the GPU, we transfer it to the CPU:

import matplotlib.pyplot as plt

# Transfer to CPU
u_final = cp.asnumpy(u_cp)

# Plot heatmap
plt.imshow(u_final, cmap='hot')
plt.colorbar(label='Temperature (°C)')
plt.title('Final Temperature Distribution')
plt.show()

This creates a heatmap showing how the central heat has spread. You could add plots at earlier steps to track the process.


8. What We’ve Built and What’s Next

We’ve built a 2D heat equation simulator that leverages CuPy for GPU acceleration, cutting computation time dramatically. The steps—grid setup, CPU baseline, GPU optimization, and visualization—showcase a real-world workflow.

Want to take it further? Try:

  • Adding dynamic heat sources or sinks.
  • Using insulating boundaries (e.g., Neumann conditions).
  • Scaling to 3D for more complex physics.

CuPy’s potential extends to machine learning, signal processing, or any field with heavy numerical loads. With its NumPy-like simplicity and GPU power, it’s a game-changer for performance-critical projects.


You may also like