Kubeflow

This guide describes how to leverage NVIDIA GPU resources in your Charmed Kubeflow (CKF) deployment.

Requirements

  • A CKF deployment and access to the Kubeflow dashboard. See Get started for more details.
  • An NVIDIA GPU accessible from the Kubernetes cluster that CKF is deployed on. Depending on your deployment, refer to the GPU setup guide for your Kubernetes distribution. A quick way to verify GPU visibility is shown below.
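
To quickly verify that the cluster advertises the GPU to Kubernetes, list the allocatable nvidia.com/gpu resource per node. This is a minimal check, assuming you have kubectl access to the cluster CKF is deployed on:

kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"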

Spin up a Notebook on a GPU

Kubeflow Notebooks can use any GPU resource available in the Kubernetes cluster. This is configurable during the Notebook’s creation.

When creating a Notebook, under GPUs, select the number of GPUs and NVIDIA as the GPU vendor. The number of GPUs depends both on the cluster setup and your workload's demands.

If your Notebook uses a TensorFlow-based image with CUDA, use the following code to confirm the notebook has access to a GPU:

import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")
print(f"Congratz! The following GPUs are available to the notebook: {gpus}" if gpus else "There's no GPU available to the notebook")

If your cluster setup uses taints, see Leverage PodDefaults for more details.

Run Pipeline steps on a GPU

Kubeflow Pipelines steps can use GPU resources available in your Kubernetes cluster. You can enable this by adding the nvidia.com/gpu: 1 limit to a step during the Pipeline’s definition. See the detailed steps below.

A GPU can be used by only one Pod at a time. Thus, a Pipeline can schedule Pods on a GPU only when one is available. For advanced GPU sharing practices on Kubernetes, see NVIDIA Multi-Instance GPU.

  1. Open a notebook with your Pipeline. If you don’t have one, use the following code as an example. It creates a Pipeline with a single component that checks GPU access:
# Import required objects
from kfp import dsl

@dsl.component(base_image="kubeflownotebookswg/jupyter-tensorflow-cuda:v1.9.0")
def gpu_check() -> str:
    """Get the list of GPUs and print it. If empty, raise a RuntimeError."""
    import tensorflow as tf
    gpus = tf.config.list_physical_devices("GPU")
    print("GPU list:", gpus)
    if not gpus:
        raise RuntimeError("No GPU has been detected.")
    return str(len(gpus) > 0)

@dsl.pipeline
def gpu_check_pipeline() -> str:
    """Create a pipeline that runs code to check access to a GPU."""
    gpu_check_object = gpu_check()
    return gpu_check_object.output

Make sure the KFP SDK is installed in the Notebook’s environment:

!pip install "kfp>=2.4,<3.0"
  2. Ensure the step of the gpu_check component runs on a GPU by creating a function add_gpu_request(task) that uses the SDK’s add_node_selector_constraint() and set_accelerator_limit() methods. This sets the required limit on the step’s Pod:
def add_gpu_request(task: dsl.PipelineTask) -> dsl.PipelineTask:
    """Add a request field for a GPU to the container created by the PipelineTask object."""
    return task.add_node_selector_constraint(accelerator="nvidia.com/gpu").set_accelerator_limit(
        limit=1
    )
  3. Modify the Pipeline definition by calling add_gpu_request() on the component:
@dsl.pipeline
def gpu_check_pipeline() -> str:
    """Create a pipeline that runs code to check access to a GPU."""
    gpu_check_object = add_gpu_request(gpu_check())
    return gpu_check_object.output
  4. Submit and run the Pipeline:
# Submit the pipeline for execution
from kfp.client import Client
client = Client()
run = client.create_run_from_pipeline_func(
    gpu_check_pipeline,
    experiment_name="Check access to GPU",
    enable_caching=False,
)
  5. Navigate to the output Run’s details. In its logs, you can see the GPU devices available to the step.
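
To check the result from the notebook instead of the dashboard, you can also wait for the run to finish using the same KFP client. This is a minimal sketch reusing the client and run objects from the previous step; the timeout value is only an example:

# Block until the run finishes (or the timeout expires), then print its final state
finished_run = client.wait_for_run_completion(run.run_id, timeout=3600)
print(finished_run.state)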

Inference with a KServe ISVC on a GPU

KServe inference services (ISVC) can schedule their Pods on a GPU. To ensure the ISVC Pod is using a GPU, add the nvidia.com/gpu: 1 limit to the ISVC’s definition.

You can do so by using the kubectl Command Line Interface (CLI) or within a notebook.

Using kubectl CLI

Using the kubectl CLI, you can enable GPU usage in your InferenceService Pod by directly modifying its YAML configuration. For example, the InferenceService YAML file from this example would be modified as follows:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
      resources:
        limits:
          nvidia.com/gpu: 1
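
After saving the modified manifest, apply it with kubectl to the namespace where the ISVC should run. The file name below is only an example for this sketch:

kubectl apply -f sklearn-iris-gpu.yaml -n <your-namespace>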

Within a notebook

A GPU can be used by only one Pod at a time. Thus, an ISVC Pod can be scheduled on a GPU only when one is available. For advanced GPU sharing practices on Kubernetes, see NVIDIA Multi-Instance GPU.

  1. Open a notebook with your InferenceService. If you don’t have one, use this one as an example.

Make sure the KServe SDK is installed in the Notebook’s environment:

!pip install kserve
  2. Import V1ResourceRequirements from the kubernetes.client package and add a resources field to the workload you want to run on a GPU. See the example below for reference:
# Imports needed for the InferenceService definition
from kubernetes.client import V1ObjectMeta, V1ResourceRequirements
from kserve import (
    constants,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
)

ISVC_NAME = "sklearn-iris"
isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_V1BETA1,
    kind=constants.KSERVE_KIND,
    metadata=V1ObjectMeta(
        name=ISVC_NAME,
        annotations={"sidecar.istio.io/inject": "false"},
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                resources=V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}
                ),
                storage_uri="gs://kfserving-examples/models/sklearn/1.0/model"
            )
        )
    ),
)
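
To create the InferenceService defined above from the same notebook, you can use the KServe client. This is a minimal sketch, assuming the ISVC should run in the notebook’s namespace:

from kserve import KServeClient

# Create the ISVC and wait until it reports Ready before sending inference requests
kserve_client = KServeClient()
kserve_client.create(isvc)
kserve_client.wait_isvc_ready(ISVC_NAME)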