Kubeflow

  • Kubeflow Charmers | bundle
Channel           Revision  Published
latest/candidate  294       24 Jan 2022
latest/beta       430       30 Aug 2024
latest/edge       423       26 Jul 2024
1.10/stable       436       07 Apr 2025
1.10/candidate    434       02 Apr 2025
1.10/beta         433       24 Mar 2025
1.9/stable        432       03 Dec 2024
1.9/beta          420       19 Jul 2024
1.9/edge          431       03 Dec 2024
1.8/stable        414       22 Nov 2023
1.8/beta          411       22 Nov 2023
1.8/edge          413       22 Nov 2023
1.7/stable        409       27 Oct 2023
1.7/beta          408       27 Oct 2023
1.7/edge          407       27 Oct 2023
juju deploy kubeflow --channel latest/edge

The Autoscaling model serving solution deploys the KServe, Knative, and Istio charms as a standalone stack for serving Machine Learning (ML) models that can be accessed through ingress.

Requirements

  • Juju 2.9.49 or above.
  • A Kubernetes cluster with a configured LoadBalancer, DNS, and a storage class solution.
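As a quick sanity check of these requirements, you can run the following (a hedged sketch that only assumes juju and kubectl are configured against your controller and cluster):

juju version
# A default storage class must be listed here.
kubectl get storageclass
# At least one service of type LoadBalancer should receive an external IP from your load balancer.
kubectl get svc -A | grep LoadBalancer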

Deploy the solution

You can deploy the solution in the following ways:

  1. Deploy with Terraform.
  2. Deploy with charm bundle.

Regardless of the chosen deployment method, the following charm configuration is required:

juju config knative-serving istio.gateway.namespace="<Istio ingress gateway namespace>"

where the Istio ingress gateway namespace corresponds to the model name where the autoscaling-model-serving bundle is deployed.
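For example, if the bundle is deployed in a model named kserve-serving (a hypothetical model name used only for illustration), the configuration would be:

juju config knative-serving istio.gateway.namespace="kserve-serving"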

Deploy with Terraform

The Autoscaling model serving solution is defined in a Terraform module that facilitates its deployment using the Terraform Juju provider. See the Terraform Juju provider documentation for more details.

In its most basic form, the solution can be deployed as follows:

terraform apply

Refer to the module's documentation for more information about its inputs and outputs.
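A typical workflow, sketched under the assumption that your Terraform configuration references this module from a local working directory, is:

terraform init    # install the Terraform Juju provider and the module
terraform plan    # review the charms and integrations that will be created
terraform apply   # deploy the solution to the Juju controller in use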

Deploy with charm bundle

Charm bundles are now obsolete, but the bundle.yaml file is still available as part of v0.1. To deploy:

  1. Clone the autoscaling-model-serving repository.
  2. Deploy using the bundle.yaml file:
juju deploy ./bundle/bundle.yaml --trust
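Put together, and assuming the repository lives under the canonical organization on GitHub (the URL below is an assumption), the steps look like:

git clone https://github.com/canonical/autoscaling-model-serving.git
cd autoscaling-model-serving
juju deploy ./bundle/bundle.yaml --trust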

Perform inference

  1. Apply an InferenceService.

KServe offers a simple example in steps 1, 2, and 3 of its getting-started guide; a minimal sketch based on that example is shown below.
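The following is a hedged sketch modelled on KServe's scikit-learn quickstart; the name and storageUri come from that upstream example and should be adapted to your own model:

kubectl apply -n <namespace> -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
EOF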

  2. Perform inference by making a request using the URL from the recently created InferenceService.

For example, by running:

kubectl get inferenceservices <name of the inferenceservice> -n <namespace where it is deployed>

You get the following output:

NAME     URL                                                READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION   AGE
<name>   http://<name>.<namespace>.<LoadBalancer IP.DNS>    True           100

The URL http://<name>.<namespace>.<LoadBalancer IP.DNS> can then be used to make inference requests, for example:

$ curl -v -H "Content-Type: application/json" http://<name>.<namespace>.<LoadBalancer IP.DNS>/v1/models/<name>:predict -d @./some-input.json
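For the sklearn-iris example sketched earlier, the request body follows KServe's V1 inference protocol, so some-input.json could look like this (the feature values are taken from the upstream quickstart):

cat <<EOF > ./some-input.json
{
  "instances": [
    [6.8, 2.8, 4.8, 1.4],
    [6.0, 3.4, 4.5, 1.6]
  ]
}
EOF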

Integrate with COS

You can integrate the solution with the Canonical Observability Stack (COS) while deploying with the Terraform module by running:

terraform apply -var cos_configuration=true

If the solution was deployed using the charm bundle, or with the Terraform module without the COS option set, see Integrate with COS for more details.
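For reference, a relation-based integration generally follows the pattern below; the offer and application names are placeholders, and the exact offers and endpoints depend on your COS deployment and on the interfaces each charm supports:

# Consume an offer published by the COS model (names are placeholders).
juju consume <cos-controller>:<cos-model>.<offer-name>
# Relate a charm from this solution to the consumed offer (on Juju 2.9, use juju relate).
juju integrate <application> <offer-name>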