Kubeflow

  • Kubeflow Charmers | bundle
Channel           Revision  Published
latest/candidate  294       24 Jan 2022
latest/beta       430       30 Aug 2024
latest/edge       423       26 Jul 2024
1.10/stable       436       07 Apr 2025
1.10/candidate    434       02 Apr 2025
1.10/beta         433       24 Mar 2025
1.9/stable        432       03 Dec 2024
1.9/beta          420       19 Jul 2024
1.9/edge          431       03 Dec 2024
1.8/stable        414       22 Nov 2023
1.8/beta          411       22 Nov 2023
1.8/edge          413       22 Nov 2023
1.7/stable        409       27 Oct 2023
1.7/beta          408       27 Oct 2023
1.7/edge          407       27 Oct 2023
juju deploy kubeflow --channel latest/edge

The Autoscaling model serving solution deploys the KServe, Knative, and Istio charms as a standalone stack for serving Machine Learning (ML) models that can be accessed through ingress.

Requirements

  • Juju 2.9.49 or above.
  • A Kubernetes cluster with a configured LoadBalancer, DNS, and a storage class solution.
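As a quick sanity check of these requirements, you can run the following (a hedged sketch that only assumes juju and kubectl are configured against your controller and cluster):

juju version
# A default storage class must be listed here.
kubectl get storageclass
# At least one service of type LoadBalancer should receive an external IP from your load balancer.
kubectl get svc -A | grep LoadBalancer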

Deploy the solution

You can deploy the solution in the following ways:

  1. Deploy with Terraform.
  2. Deploy with charm bundle.

Regardless of the chosen deployment method, the following charm configuration is required:

juju config knative-serving istio.gateway.namespace="<Istio ingress gateway namespace>"

where the Istio ingress gateway namespace corresponds to the model name where the autoscaling-model-serving bundle is deployed.
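For example, if the bundle is deployed in a model named kserve-serving (a hypothetical model name used only for illustration), the configuration would be:

juju config knative-serving istio.gateway.namespace="kserve-serving"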

Deploy with Terraform

The Autoscaling model serving solution is defined in a Terraform module that facilitates its deployment using the Terraform Juju provider. See the Terraform Juju provider documentation for more details.

In its most basic form, the solution can be deployed as follows:

terraform apply

Refer to the module's documentation for more information about its inputs and outputs.
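A typical workflow, sketched under the assumption that your Terraform configuration references this module from a local working directory, is:

terraform init    # install the Terraform Juju provider and the module
terraform plan    # review the charms and integrations that will be created
terraform apply   # deploy the solution to the Juju controller in use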

Deploy with charm bundle

Charm bundles are now obsolete, but the bundle.yaml file is still available as part of v0.1. To deploy:

  1. Clone the autoscaling-model-serving repository.
  2. Deploy using the bundle.yaml file:
juju deploy ./bundle/bundle.yaml --trust
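Put together, and assuming the repository lives under the canonical organization on GitHub (the URL below is an assumption), the steps look like:

git clone https://github.com/canonical/autoscaling-model-serving.git
cd autoscaling-model-serving
juju deploy ./bundle/bundle.yaml --trust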

Perform inference

  1. Apply an InferenceService.

KServe offers a simple example in steps 1, 2, and 3 of its getting-started guide; a minimal sketch based on that example is shown below.
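The following is a hedged sketch modelled on KServe's scikit-learn quickstart; the name and storageUri come from that upstream example and should be adapted to your own model:

kubectl apply -n <namespace> -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
EOF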

  2. Perform inference by making a request using the URL from the recently created InferenceService.

For example, by running:

kubectl get inferenceservices <name of the inferenceservice> -n <namespace where it is deployed>

You get the following output:

NAME     URL                                                READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION   AGE
<name>   http://<name>.<namespace>.<LoadBalancer IP.DNS>    True           100

The URL http://<name>.<namespace>.<LoadBalancer IP.DNS> can then be used to make inference requests, for example:

$ curl -v -H "Content-Type: application/json" http://<name>.<namespace>.<LoadBalancer IP.DNS>/v1/models/<name>:predict -d @./some-input.json
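For the sklearn-iris example sketched earlier, the request body follows KServe's V1 inference protocol, so some-input.json could look like this (the feature values are taken from the upstream quickstart):

cat <<EOF > ./some-input.json
{
  "instances": [
    [6.8, 2.8, 4.8, 1.4],
    [6.0, 3.4, 4.5, 1.6]
  ]
}
EOF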

Integrate with COS

You can integrate the solution with the Canonical Observability Stack (COS) while deploying with the Terraform module by running:

terraform apply -var cos_configuration=true

If the solution was deployed using the charm bundle, or with the Terraform module without the COS option set, see Integrate with COS for more details.
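For reference, a relation-based integration generally follows the pattern below; the offer and application names are placeholders, and the exact offers and endpoints depend on your COS deployment and on the interfaces each charm supports:

# Consume an offer published by the COS model (names are placeholders).
juju consume <cos-controller>:<cos-model>.<offer-name>
# Relate a charm from this solution to the consumed offer (on Juju 2.9, use juju relate).
juju integrate <application> <offer-name>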