Kubeflow

  • By Kubeflow Charmers | bundle
  • Cloud
Channel           Revision  Published
latest/stable     414       01 Dec 2023
latest/candidate  294       24 Jan 2022
latest/beta       430       30 Aug 2024
latest/edge       423       26 Jul 2024
1.9/stable        426       31 Jul 2024
1.9/beta          420       19 Jul 2024
1.9/edge          425       31 Jul 2024
1.8/stable        414       22 Nov 2023
1.8/beta          411       22 Nov 2023
1.8/edge          413       22 Nov 2023
1.7/stable        409       27 Oct 2023
1.7/beta          408       27 Oct 2023
1.7/edge          407       27 Oct 2023
1.6/stable        329       07 Sep 2022
1.6/beta          326       23 Aug 2022
1.6/edge          328       07 Sep 2022
1.4/stable        321       30 Jun 2022
1.4/edge          320       30 Jun 2022
juju deploy kubeflow --channel edge

Platform:

NVIDIA DGX systems are purpose-built hardware for enterprise AI use cases. These platforms feature NVIDIA Tensor Core GPUs, which vastly outperform traditional CPUs for machine learning workloads, alongside advanced networking and storage capabilities.

This guide contains setup instructions for running Charmed Kubeflow on NVIDIA DGX-enabled hardware. It covers both single-node and multi-node environments and includes examples of how to use two components: Jupyter Notebooks and Kubeflow Pipelines.

Requirements:

  • NVIDIA DGX-enabled hardware setup with correctly configured and updated BIOS settings, bootloader, OS, drivers, and packages (sample setup instructions are provided below).
  • Familiarity with Python, Docker, and Jupyter Notebooks.
  • Tools: juju and kubectl (see the note below this list for one way to install them).
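
If the client tools are not installed yet, they are commonly available as snaps. The commands below are only one way to get them; channels and confinement flags may differ on your system:

$ sudo snap install juju
$ sudo snap install kubectl --classic
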
Sample Ubuntu and Grub setup

NOTE: The following setup instructions are given only as an example. There is no guarantee that they will be sufficient for all environments. Contact your hardware distributor for more details on your specific system setup. This guide was tested on a vanilla Ubuntu 20.04 installation.

Ensure No Drivers Preinstalled

Make sure no NVIDIA drivers are preinstalled. You can verify this with the following steps:

Check for apt packages:

$ sudo apt list --installed | grep nvidia

If any packages are listed, remove them:

$ sudo apt remove <package-name>
$ sudo apt autoremove

Check for kernel modules (if the output is empty, you are OK):

$ lsmod | grep nvidia

If any modules are listed, remove them:

$ sudo modprobe -r <module-name>

Reboot the machine:

$ sudo reboot

Grub Setup

Edit /etc/default/grub and add the following options to GRUB_CMDLINE_LINUX_DEFAULT:

modprobe.blacklist=nouveau nouveau.modeset=0
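
For reference, the edited line typically ends up looking like the one below (any options already present, such as quiet splash, stay in place). On Ubuntu the change only takes effect after regenerating the grub configuration, so running update-grub before the reboot is assumed here:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash modprobe.blacklist=nouveau nouveau.modeset=0"
$ sudo update-grub
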
$ sudo reboot


Install Kubernetes (MicroK8s)

Install MicroK8s and enable the required add-ons. The DNS forwarder and MetalLB address ranges below are specific to the example environment, so adjust them for yours; the commands also assume a user named ubuntu:

$ sudo snap install microk8s --classic --channel 1.22
 
$ sudo microk8s enable dns:10.229.32.21 storage ingress registry rbac helm3 metallb:10.64.140.43-10.64.140.49,192.168.0.105-192.168.0.111
 
$ sudo usermod -a -G microk8s ubuntu
$ sudo chown -f -R ubuntu ~/.kube
$ newgrp microk8s
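
Before moving on, it can help to confirm that the node and the enabled add-ons are up; the status command below simply blocks until MicroK8s reports itself ready:

$ microk8s status --wait-ready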

Edit /var/snap/microk8s/current/args/containerd-template.toml and add the following, replacing the username and password with your own Docker Hub credentials:

[plugins."io.containerd.grpc.v1.cri".registry.configs]

[plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]
username = "afrikha"
password = "<>"
$ microk8s.stop; microk8s.start

Enable GPU add-on and configure MIG

Install the GPU operator and export the cluster configuration for kubectl:

$ sudo microk8s.enable gpu
$ mkdir -p ~/.kube
$ microk8s config > ~/.kube/config
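
At this point kubectl should be able to reach the cluster, and you can watch the GPU operator pods come up. The namespace name below is an assumption based on what the MicroK8s gpu add-on has used and may differ between releases:

$ kubectl get nodes
$ kubectl get pods -n gpu-operator-resources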

Check the GPU count reported to Kubernetes:

$ kubectl get nodes --show-labels | grep gpu.count

Configure MIG devices (replace blanka with your node's name):

$ kubectl label nodes blanka nvidia.com/mig.config=all-1g.5gb --overwrite
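
The MIG manager applies the new layout asynchronously. One way to follow its progress is the state label it sets on the node; the label name is assumed from the NVIDIA GPU operator's MIG manager:

$ kubectl get nodes --show-labels | grep mig.config.state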

Recheck the GPU count (it should have increased, since each MIG slice is now reported as a separate GPU):

$ kubectl get nodes --show-labels | grep gpu.count
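
On the DGX host itself, the MIG devices can also be listed directly, assuming the NVIDIA driver set up by the GPU operator is active on the host:

$ nvidia-smi -L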

Troubleshooting: If no nodes appear in the kubectl get nodes output, uninstall all GPU drivers from the Kubernetes nodes and reinstall MicroK8s.

Deploy Charmed Kubeflow

Follow the instructions from How to install Charmed Kubeflow to deploy Charmed Kubeflow.
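
As a rough sketch, on a MicroK8s cloud that guide boils down to the steps below; the channel is only an example, and the linked guide remains the authoritative reference:

$ juju bootstrap microk8s
$ juju add-model kubeflow
$ juju deploy kubeflow --channel=1.8/stable --trust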

Try Kubeflow examples

Charmed Kubeflow can run on both single-node and multi-node DGX hardware. Each environment has its own requirements, and several examples are available for each.

Single-node DGX with Charmed Kubeflow examples

There is a GitHub repository that includes all the details about the Single-node DGX with Charmed Kubeflow.

The following examples can be found and tested:

  • Jupyter Notebook example on a single-node DGX, in the file gpu-notebook.ipynb from the repository. It also uses a multi-GPU setup.
  • Kubeflow Pipelines example on a single-node DGX that uses the same classifier as the notebook, available in the file gpu-pipeline.ipynb.
Multi-node DGX with Charmed Kubeflow examples

There is a GitHub repository that includes all the details about the Multi-node DGX with Charmed Kubeflow.

The following examples can be found and tested:

  • Training TensorFlow models with multiple GPUs in a Jupyter Notebook using Charmed Kubeflow, in the folder multi-gpu-in-notebook, which contains the notebook file gpu-notebook.ipynb.
  • Training TensorFlow models with GPUs in a Kubeflow Pipeline, in the folder multi-gpu-in-pipeline.
  • A simulated example of multi-node training in TensorFlow that uses just a single node, in the folder multi-node-gpu-simulated. It contains multiple files describing the workload distribution and how to run it.
  • Multi-node training in TensorFlow using the Kubeflow Training Operator's TFJob, in the folder multi-node-gpu-tfjob.
