Setting up Docker-based CUDA Environment on a New Windows Machine

Last updated on November 6, 2025

Since v19.03, Docker Engine has supported passing NVIDIA GPUs directly to containers via the --gpus flag, so there is no longer any need to use special NVIDIA images or configure an NVIDIA runtime explicitly.

This guide will walk you through setting up the Docker-based CUDA environment on a new Windows machine.

Step 1: Install NVIDIA Drivers

Ensure you have the latest NVIDIA drivers installed for your GPU. You can download them from the Official NVIDIA Drivers website.

Run nvidia-smi in Command Prompt or PowerShell to verify the installation:

PS E:\> nvidia-smi
Thu Oct 30 15:16:28 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 581.42                 Driver Version: 581.42         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070 ...  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   47C    P5              15W / 130W |     778MiB / 12227MiB  |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

Step 2: Enable and Install WSL 2

Run the following command in PowerShell as Administrator to enable WSL and install WSL 2:

wsl --install

Reboot your machine if prompted.

Here we choose Ubuntu 24.04 as our WSL base distribution:

wsl --install Ubuntu-24.04

Reboot your machine again if prompted.
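
To confirm that the distribution is running under WSL 2, you can list the installed distributions; the VERSION column should read 2 for Ubuntu-24.04:

wsl --list --verbose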

Step 3: Install Docker Desktop

Download and install Docker Desktop from the Install Docker Desktop on Windows page. Remember to choose WSL 2 as the backend during installation. Reboot your machine if prompted.
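
Once Docker Desktop is running, a quick smoke test is to run nvidia-smi in a throwaway container; if GPU pass-through is working, this should print the same GPU table you saw on the host (the image is pulled on first use):

docker run --rm --gpus all ubuntu:24.04 nvidia-smi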

Step 4: Set Up a Docker Container with the CUDA Toolkit

Start a Docker container with the following command:

docker run -dit --name cuda_dev -v "${PWD}:/workspace/cuda" -w /workspace/cuda --gpus all ubuntu:24.04

The flags used are:

  • -dit: Run the container in detached mode with an interactive terminal;
  • --name cuda_dev: Name the container cuda_dev;
  • -v "${PWD}:/workspace/cuda": Mount the current directory to /workspace/cuda in the container;
  • -w /workspace/cuda: Set the container's working directory to /workspace/cuda;
  • --gpus all: Expose all available GPUs to the container (without this flag, the container won't have access to the GPU);
  • ubuntu:24.04: Use the Ubuntu 24.04 image as the base image.

Connect to the container via the Docker CLI (as shown below), SSH, or the VS Code Dev Containers extension (formerly Remote - Containers). For example, to open an interactive shell:
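
docker exec -it cuda_dev bash

Then run the following commands to set up the CUDA environment inside the container: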

apt update && apt install -y wget build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update
rm -f cuda-keyring_1.1-1_all.deb
apt install -y cuda-toolkit-13-0

The .deb URL can be found on https://developer.nvidia.com/cuda-downloads by selecting the appropriate version and platform.

Update the PATH environment variable to include the CUDA binaries. You can add the following line to your ~/.bashrc or ~/.zshrc file:

export PATH="/usr/local/cuda/bin:$PATH"

Run source ~/.bashrc or start a new terminal session to apply the change.
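
To quickly confirm that the CUDA binaries are on your PATH, check where nvcc resolves; it should print /usr/local/cuda/bin/nvcc:

which nvcc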

Step 5: Verify CUDA Installation

nvcc --version checks the CUDA compiler version:

root@ceaa054162f3:/workspace/cuda# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Aug_20_01:58:59_PM_PDT_2025
Cuda compilation tools, release 13.0, V13.0.88
Build cuda_13.0.r13.0/compiler.36424714_0

nvidia-smi inside the container verifies GPU access:

root@ceaa054162f3:/workspace/cuda# nvidia-smi
Thu Oct 30 15:30:59 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.02              Driver Version: 581.42         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5070 ...    On  |   00000000:01:00.0 Off |                  N/A |
| N/A   47C    P5              16W / 130W |     842MiB / 12227MiB  |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

You can also compile and run the following CUDA program to test:

#include <iostream>
#include <cuda_runtime.h>

int main() {
    // Check CUDA runtime version
    int runtime_version = 0;
    cudaRuntimeGetVersion(&runtime_version);
    std::cout << "CUDA Runtime Version: " << runtime_version << std::endl;

    // Check CUDA driver version
    int driver_version = 0;
    cudaDriverGetVersion(&driver_version);
    std::cout << "CUDA Driver Version: " << driver_version << std::endl;

    // Get number of devices
    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);
    if (err != cudaSuccess) {
        std::cerr << "Error: " << cudaGetErrorString(err) << std::endl;
        return 1;
    }

    std::cout << "Number of CUDA devices: " << device_count << std::endl;

    // Display device info
    for (int i = 0; i < device_count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);

        std::cout << "\nDevice " << i << ": " << prop.name << std::endl;
        std::cout << "\tCompute capability: " << prop.major << "." << prop.minor << std::endl;
        std::cout << "\tTotal global memory: " << (prop.totalGlobalMem >> 20) << " MB" << std::endl;
        std::cout << "\tMultiprocessors: " << prop.multiProcessorCount << std::endl;
        std::cout << "\tMax threads per block: " << prop.maxThreadsPerBlock << std::endl;
        std::cout << "\tMax threads dim: ("
                  << prop.maxThreadsDim[0] << ", "
                  << prop.maxThreadsDim[1] << ", "
                  << prop.maxThreadsDim[2] << ")" << std::endl;
        std::cout << "\tMax grid size: ("
                  << prop.maxGridSize[0] << ", "
                  << prop.maxGridSize[1] << ", "
                  << prop.maxGridSize[2] << ")" << std::endl;
    }

    // Simple device memory test: allocate and free one int on the GPU
    int *d_a;
    err = cudaMalloc((void**)&d_a, sizeof(int));
    if (err != cudaSuccess) {
        std::cerr << "cudaMalloc failed: " << cudaGetErrorString(err) << std::endl;
        return 1;
    }
    cudaFree(d_a);
    std::cout << "\nCUDA memory test passed (alloc/free succeeded)." << std::endl;

    return 0;
}

Compile and run it:
nvcc -o verify_cuda verify_cuda.cu
./verify_cuda

The expected output should be similar to:

CUDA Runtime Version: 13000
CUDA Driver Version: 13000
Number of CUDA devices: 1

Device 0: NVIDIA GeForce RTX 5070 Ti Laptop GPU
Compute capability: 12.0
Total global memory: 12226 MB
Multiprocessors: 46
Max threads per block: 1024
Max threads dim: (1024, 1024, 64)
Max grid size: (2147483647, 65535, 65535)

CUDA memory test passed (alloc/free succeeded).
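
Note that the program above only allocates and frees device memory; it never launches a kernel. To also confirm that kernels actually execute on the GPU, you can try this minimal sketch (the file name verify_kernel.cu is arbitrary):

#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel: each thread writes its own index into the output array.
__global__ void write_indices(int *out) {
    out[threadIdx.x] = threadIdx.x;
}

int main() {
    const int n = 8;
    int host[n] = {0};
    int *d_out = nullptr;

    cudaMalloc((void**)&d_out, n * sizeof(int));
    write_indices<<<1, n>>>(d_out);  // launch 1 block of n threads
    cudaMemcpy(host, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    // If the kernel ran on the GPU, this prints: 0 1 2 3 4 5 6 7
    for (int i = 0; i < n; ++i) printf("%d ", host[i]);
    printf("\n");
    return 0;
}

Compile and run it the same way:

nvcc -o verify_kernel verify_kernel.cu
./verify_kernel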

With that, you have successfully set up a Docker-based CUDA environment on your new Windows machine!

