Share

Estimating π with a Monte Carlo Simulation Using CuPy

by nowrelated · June 2, 2025

In this project, we’ll use CuPy, a GPU-accelerated computing library, to estimate the value of π through a Monte Carlo simulation. This approach leverages random sampling to approximate π by simulating points in a square and circle, making it a fun and practical demonstration of CuPy’s power for parallel numerical tasks.


What is CuPy and Why This Project?

CuPy is a Python library that mirrors NumPy but runs computations on an NVIDIA GPU, offering significant speedups for array-based operations. It’s ideal for tasks involving large datasets or heavy math.

The Monte Carlo method estimates π by comparing the ratio of points falling inside a quarter circle to those in a square. This involves generating millions of random points—perfect for GPU acceleration due to the parallel nature of the computation.


Project Overview: Monte Carlo π Estimation

Imagine a unit square (side length 1) with a quarter circle inscribed inside it (radius 1). The area of the square is 1, and the quarter circle’s area is π/4. By randomly scattering points in the square and counting how many land inside the quarter circle, we can estimate π as:

[ \pi \approx 4 \times \frac{\text{points inside quarter circle}}{\text{total points}} ]

We’ll implement this on both CPU (NumPy) and GPU (CuPy) to compare performance.


Step 1: CPU Implementation with NumPy

Let’s start with a CPU baseline using NumPy:

import numpy as np
import time

# Number of points
n_points = 10_000_000

# Generate random x and y coordinates
np.random.seed(42)  # For reproducibility
start_time = time.time()
x = np.random.uniform(0, 1, n_points)
y = np.random.uniform(0, 1, n_points)

# Check if points lie inside the quarter circle (x^2 + y^2 <= 1)
distances = x**2 + y**2
inside = distances <= 1

# Estimate pi
pi_estimate = 4 * np.sum(inside) / n_points

cpu_time = time.time() - start_time
print(f"CPU π Estimate: {pi_estimate:.6f}")
print(f"CPU Time: {cpu_time:.2f} seconds")

For 10 million points, this takes a noticeable amount of time on a CPU due to random number generation and distance calculations.


Step 2: GPU Implementation with CuPy

Now, let’s port it to CuPy for GPU acceleration:

import cupy as cp
import time

# Number of points (same as CPU)
n_points = 10_000_000

# Generate random points on GPU
cp.random.seed(42)  # For reproducibility
start_time = time.time()
x = cp.random.uniform(0, 1, n_points)
y = cp.random.uniform(0, 1, n_points)

# Compute distances and count points inside quarter circle
distances = x**2 + y**2
inside = distances <= 1

# Estimate pi
pi_estimate = 4 * cp.sum(inside) / n_points

# Transfer result to CPU
pi_estimate_cpu = cp.asnumpy(pi_estimate)
gpu_time = time.time() - start_time
print(f"GPU π Estimate: {pi_estimate_cpu:.6f}")
print(f"GPU Time: {gpu_time:.2f} seconds")

CuPy handles random number generation and array operations on the GPU, keeping data in GPU memory until the final result.


Step 3: Performance Comparison

Example results (hardware-dependent):

  • CPU (NumPy): ~2.5 seconds
  • GPU (CuPy): ~0.1 seconds

That’s a 20-25x speedup! The GPU excels at generating random numbers and performing parallel element-wise operations on large arrays.


Step 4: Visualizing the Concept (Optional)

To illustrate, we could plot a smaller sample of points (e.g., 1000 points):

import matplotlib.pyplot as plt

# Smaller sample for visualization
n_sample = 1000
x_sample = cp.random.uniform(0, 1, n_sample)
y_sample = cp.random.uniform(0, 1, n_sample)
inside_sample = x_sample**2 + y_sample**2 <= 1

# Transfer to CPU
x_cpu = cp.asnumpy(x_sample)
y_cpu = cp.asnumpy(y_sample)
inside_cpu = cp.asnumpy(inside_sample)

# Plot
plt.scatter(x_cpu[inside_cpu], y_cpu[inside_cpu], color='blue', s=1, label='Inside')
plt.scatter(x_cpu[~inside_cpu], y_cpu[~inside_cpu], color='red', s=1, label='Outside')
plt.gca().set_aspect('equal')
plt.legend()
plt.title('Monte Carlo π Estimation')
plt.show()

This shows points inside (blue) and outside (red) the quarter circle, visually confirming the method.


What We’ve Built and What’s Next

We’ve created a Monte Carlo simulation to estimate π, harnessing CuPy’s GPU power for a significant performance boost. The steps—CPU baseline, GPU optimization, and optional visualization—demonstrate a practical workflow.

Try enhancing it:

  • Increase n_points to improve accuracy.
  • Compare different random number generators in CuPy.
  • Extend to other Monte Carlo problems (e.g., integration).

Project LinkCuPy GitHub Repository

Official DocumentationCuPy Docs

License: MIT License

You may also like