This guide presents the Loki logs available for Charmed Kubeflow (CKF).
The CKF charms that use the sidecar pattern can provide log information from their workloads. Those that do not use it need to leverage Promtail instead.
Loki does not provide a User Interface (UI). You can use the Grafana UI to check Loki logs. See Visualize log data for more details on navigating Grafana Loki.
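For instance, assuming CKF is integrated with Loki and Grafana through the COS stack, you can combine the label selectors listed in the following sections with LogQL line filters in the Grafana Explore view. The query below is an illustrative sketch that narrows the jupyter-controller workload logs down to lines containing "error":
{charm="jupyter-controller", pebble_service="jupyter-controller"} |= "error"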
Sidecar pattern charms
The following subsections present CKF charms that use the sidecar pattern and can provide log information from their workloads.
- admission-webhook
- argo-controller
- dex-auth
- envoy
- jupyter-controller
- jupyter-ui
- katib-controller
- katib-db-manager
- katib-ui
- kfp-api
- kfp-metadata-writer
- kfp-persistence
- kfp-profile-controller
- kfp-schedwf
- kfp-ui
- kfp-viewer
- kfp-viz
- knative-operator
- kserve-controller
- kubeflow-dashboard
- kubeflow-profiles
- kubeflow-volumes
- mlmd
- oidc-gatekeeper
- pvcviewer-operator
- seldon-core
- tensorboard-controller
- tensorboards-web-app
Admission-webhook
Admission-webhook is a Go application that uses k8s.io/klog for logging.
You can check its logs through the Grafana UI using the query {pebble_service="admission-webhook", charm="admission-webhook"}.
See the logs source for more details.
Argo-controller
Argo-controller is a Go application that uses logrus for logging.
You can check its logs through the Grafana UI using the query {pebble_service="argo-controller", charm="argo-controller"}.
See the logs source for more details.
Dex-auth
Dex-auth is a Go application that uses slog for logging.
You can check its logs through the Grafana UI using the query {pebble_service="dex", charm="dex-auth"}.
See the logs source for more details.
Envoy
Envoy is a C++ application that uses the spdlog library for logging.
You can check its logs through the Grafana UI using the query {pebble_service="envoy", charm="envoy"}.
See the logs source for more details. Since this is a third-party application, you will only see the configuration of envoyproxy.
Jupyter-controller
Jupyter-controller is a Go application that uses controller-runtime/pkg/log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="jupyter-controller", charm="jupyter-controller"}.
See the logs source for more details.
Jupyter-ui
Jupyter-ui is a Python application that uses the logging library for logging.
You can check its logs through the Grafana UI using the query {pebble_service="jupyter-ui", charm="jupyter-ui"}.
See the logs source for more details.
Katib-controller
Katib-controller is a Go application that uses controller-runtime/pkg/log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="katib-controller", charm="katib-controller"}.
See the logs source for more details.
Katib-db-manager
Katib-db-manager is a Go application that uses k8s.io/klog for logging.
You can check its logs through the Grafana UI using the query {pebble_service="katib-db-manager", charm="katib-db-manager"}.
See the logs source for more details.
Katib-ui
Katib-ui is a Go application that uses the log library for logging.
You can check its logs through the Grafana UI using the query {pebble_service="katib-ui", charm="katib-ui"}.
See the logs source for more details.
Kfp-api
Kfp-api is a Go application that uses logrus for logging.
You can check its logs through the Grafana UI using the query {pebble_service="apiserver", charm="kfp-api"}.
See the logs source for more details.
Kfp-metadata-writer
Kfp-metadata-writer is a Python application that uses print for logging.
You can check its logs through the Grafana UI using the query {pebble_service="kfp-metadata-writer", charm="kfp-metadata-writer"}.
See the logs source for more details.
Kfp-persistence
Kfp-persistence is a Go application that uses logrus for logging.
You can check its logs through the Grafana UI using the query {pebble_service="persistenceagent", charm="kfp-persistence"}.
See the logs source for more details.
Kfp-profile-controller
Kfp-profile-controller is a Go application that uses controller-runtime for logging.
You can check its logs through the Grafana UI using the query {pebble_service="kfp-profile-controller", charm="kfp-profile-controller"}.
See the logs source for more details.
Kfp-schedwf
Kfp-schedwf is a Go application that uses logrus for logging.
You can check its logs through the Grafana UI using the query {pebble_service="controller", charm="kfp-schedwf"}.
See the logs source for more details.
Kfp-ui
Kfp-ui is a TypeScript application that uses console.log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="ml-pipeline-ui", charm="kfp-ui"}.
See the logs source for more details.
Kfp-viewer
Kfp-viewer is a Go application that uses glog for logging.
You can check its logs through the Grafana UI using the query {pebble_service="controller", charm="kfp-viewer"}.
See the logs source for more details.
Kfp-viz
Kfp-viz is a Python application that uses the Tornado framework for logging.
You can check its logs through the Grafana UI using the query {pebble_service="vis-server", charm="kfp-viz"}.
See the logs source for more details.
Knative-operator
Knative-operator comes with two workload containers, both Go applications that use go-kit/log for logging.
You can check their logs through the Grafana UI using the queries {pebble_service="knative-operator", charm="knative-operator"} and {pebble_service="knative-operator-webhook", charm="knative-operator"}.
See the logs source for more details.
Kserve-controller
Kserve-controller is a Go application that uses k8s.io/klog for logging. This app also uses kube-rbac-proxy.
You can check its logs through the Grafana UI using the queries {pebble_service="kserve-controller", charm="kserve-controller"} and {pebble_service="kube-rbac-proxy", charm="kserve-controller"}.
See the logs source for more details.
Kubeflow-dashboard
Kubeflow-dashboard is a TypeScript application that uses console.log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="kubeflow-dashboard", charm="kubeflow-dashboard"}.
See the logs source for more details.
Kubeflow-profiles
Kubeflow-profiles is a Go application that uses logrus for logging.
You can check its logs through the Grafana UI using the query {pebble_service="kubeflow-kfam", charm="kubeflow-profiles"}.
See the logs source for more details.
Kubeflow-volumes
Kubeflow-volumes is a Python application that uses the logging library for logging.
You can check its logs through the Grafana UI using the query {pebble_service="kubeflow-volumes", charm="kubeflow-volumes"}.
See the logs source for more details.
Mlmd
Mlmd is a C++ application that uses the google/glog library for logging.
You can check its logs through the Grafana UI using the query {pebble_service="mlmd", charm="mlmd"}.
See the logs source for more details.
Oidc-gatekeeper
Oidc-gatekeeper is a Go application that uses logrus for logging.
You can check its logs through the Grafana UI using the query {pebble_service="oidc-authservice", charm="oidc-gatekeeper"}.
See the logs source for more details.
Pvcviewer-operator
Pvcviewer-operator is a Go application that uses controller-runtime/pkg/log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="pvcviewer-operator", charm="pvcviewer-operator"}.
See the logs source for more details.
Seldon-core
Seldon-core is a Go application that uses controller-runtime/pkg/log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="seldon-core", charm="seldon-core"}.
See the logs source for more details.
Tensorboard-controller
Tensorboard-controller is a Go application that uses controller-runtime/pkg/log for logging.
You can check its logs through the Grafana UI using the query {pebble_service="tensorboard-controller", charm="tensorboard-controller"}.
See the logs source for more details.
Tensorboards-web-app
Tensorboards-web-app is a Python application that uses the logging library for logging.
You can check its logs through the Grafana UI using the query {pebble_service="tensorboards-web-app", charm="tensorboards-web-app"}.
See the logs source for more details.
Non-sidecar pattern charms
The following CKF charms do not use the sidecar pattern, so they cannot provide logs through the same log forwarding mechanism:
- Istio-gateway
- Istio-pilot
- Knative-eventing
- Knative-serving
- Metacontroller
- Training-operator
For monitoring these charms, you can use Promtail.
Promtail configuration
You can configure Promtail for more efficient log collection, avoiding scraping all logs within the cluster and adding the correct Juju topology labels.
To do so, you need to configure the following:
- Clients section
- Scrape jobs configuration
Clients section
The clients section requires adding:
- The <URL> to Loki.
- The Juju model name <JUJU_MODEL>.
- The Juju model UUID <JUJU_MODEL_UUID>.
clients:
- url: <URL>
external_labels:
juju_model: <JUJU_MODEL>
juju_model_uuid: <JUJU_MODEL_UUID>
The URL is required for accessing the Loki server. The external labels are part of the Juju topology required for querying logs.
The Loki URL can be obtained as follows:
juju show-unit grafana-agent-k8s/0 -m cos-controller:cos --endpoint logging-consumer | yq '.[]."relation-info".[]."related-units".[].data.endpoint | fromjson | .url'
The Juju model name is always kubeflow in this use case. You can obtain the Juju model UUID as follows:
juju models --format json | jq '.models.[] | select(."short-name"=="kubeflow") | ."model-uuid"'
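If you prefer, you can capture these values in shell variables and render the clients section from them. This is a minimal shell sketch that reuses the two commands above (adding jq's -r flag to strip the quotes) and assumes the same COS controller and model names:
# Loki push URL from the grafana-agent-k8s logging-consumer relation
LOKI_URL=$(juju show-unit grafana-agent-k8s/0 -m cos-controller:cos --endpoint logging-consumer | yq '.[]."relation-info".[]."related-units".[].data.endpoint | fromjson | .url')
# The Juju model is always kubeflow in this use case
JUJU_MODEL=kubeflow
JUJU_MODEL_UUID=$(juju models --format json | jq -r '.models.[] | select(."short-name"=="kubeflow") | ."model-uuid"')
# Render the clients section of the Promtail configuration
cat <<EOF
clients:
  - url: ${LOKI_URL}
    external_labels:
      juju_model: ${JUJU_MODEL}
      juju_model_uuid: ${JUJU_MODEL_UUID}
EOF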
Scrape jobs configuration
You can configure Promtail to optimize your scrape jobs. To do so, you need to follow these steps:
- Define a namespace for the kubernetes_sd_config.
- Define label selectors to scrape only the required Pods. This is recommended to save resources.
- [Optional] Enable all original labels from Pods via relabel_configs and action labelmap.
- [Optional] Add the rest of the Juju topology to each log via pipeline stages and static_labels.
Here’s an example of scrape jobs for istio-pilot and istio-gateway controllers:
- job_name: istio
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- kubeflow
selectors:
- role: pod
label: "app in (istio-ingressgateway, istiod)"
relabel_configs:
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: workload
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
pipeline_stages:
- labeldrop:
- filename
- match:
selector: '{app="istio-ingressgateway"}'
stages:
- static_labels:
juju_application: istio-ingressgateway
juju_unit: istio-ingressgateway/0
charm: istio-gateway
- match:
selector: '{app="istiod"}'
stages:
- static_labels:
juju_application: istio-pilot
juju_unit: istio-pilot/0
charm: istio-pilot
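With this scrape job in place, the Istio workload logs become queryable from the Grafana UI together with the Juju topology labels added above, for example:
{juju_model="kubeflow", charm=~"istio-pilot|istio-gateway"}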
Full example of Promtail deployment
This section provides an example that monitors all non-sidecar pattern charms. You can check their logs through the Grafana UI using the query:
{juju_model="kubeflow", charm=~"istio-pilot|istio-gateway|knative-serving|knative-eventing|metacontroller-operator|training-operator"}
Use the example code, specifying your own Promtail configuration values.
This Promtail deployment .yaml file can be applied to the kubeflow model as follows:
kubectl apply -f ./CKF_promtail.yaml -n kubeflow
Here’s the code snippet:
--- # Deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: promtail
labels:
app: promtail
spec:
selector:
matchLabels:
app: promtail
template:
metadata:
labels:
app: promtail
spec:
serviceAccount: promtail-serviceaccount
containers:
- name: promtail
image: grafana/promtail
args:
- -config.file=/etc/promtail/promtail.yaml
env:
- name: 'HOSTNAME' # needed when using kubernetes_sd_configs
valueFrom:
fieldRef:
fieldPath: 'spec.nodeName'
volumeMounts:
- name: logs
mountPath: /var/log/pods
- name: promtail-config
mountPath: /etc/promtail
volumes:
- name: logs
hostPath:
path: /var/log/pods
- name: promtail-config
configMap:
name: promtail-config
--- # configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: promtail-config
data:
promtail.yaml: |
server:
http_listen_port: 9080
grpc_listen_port: 0
clients:
- url: http://10.64.140.43/cos-loki-0/loki/api/v1/push
external_labels:
juju_model: kubeflow
juju_model_uuid: f9e6966e-d7bb-4f19-8c4e-276c95880d39
positions:
filename: /tmp/positions.yaml
target_config:
sync_period: 10s
scrape_configs:
- job_name: istio
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- kubeflow
selectors:
- role: pod
label: "app in (istio-ingressgateway, istiod)"
relabel_configs:
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: workload
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
pipeline_stages:
- labeldrop:
- filename
- match:
selector: '{app="istio-ingressgateway"}'
stages:
- static_labels:
juju_application: istio-ingressgateway
juju_unit: istio-ingressgateway/0
charm: istio-gateway
- match:
selector: '{app="istiod"}'
stages:
- static_labels:
juju_application: istio-pilot
juju_unit: istio-pilot/0
charm: istio-pilot
- job_name: knative
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- knative-eventing
- knative-serving
relabel_configs:
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: workload
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
pipeline_stages:
- labeldrop:
- filename
- match:
selector: '{namespace="knative-eventing"}'
stages:
- static_labels:
juju_application: knative-eventing
juju_unit: knative-eventing/0
charm: knative-eventing
- match:
selector: '{namespace="knative-serving"}'
stages:
- static_labels:
juju_application: knative-serving
juju_unit: knative-serving/0
charm: knative-serving
- job_name: metacontroller
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- kubeflow
selectors:
- role: pod
label: "app.kubernetes.io/name=metacontroller-operator"
relabel_configs:
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: workload
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
pipeline_stages:
- labeldrop:
- filename
- static_labels:
juju_application: metacontroller-operator
juju_unit: metacontroller-operator/0
charm: metacontroller-operator
- job_name: training-operator
kubernetes_sd_configs:
- role: pod
namespaces:
names:
- kubeflow
selectors:
- role: pod
label: "control-plane=kubeflow-training-operator"
relabel_configs:
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: workload
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
pipeline_stages:
- labeldrop:
- filename
- static_labels:
juju_application: training-operator
juju_unit: training-operator/0
charm: training-operator
--- # Clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: promtail-clusterrole
rules:
- apiGroups: [""]
resources:
- nodes
- services
- pods
verbs:
- get
- watch
- list
--- # ServiceAccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: promtail-serviceaccount
--- # ClusterRoleBinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: promtail-clusterrolebinding
subjects:
- kind: ServiceAccount
name: promtail-serviceaccount
namespace: kubeflow
roleRef:
kind: ClusterRole
name: promtail-clusterrole
apiGroup: rbac.authorization.k8s.io
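After applying the manifest, you can verify that Promtail is running and shipping logs before querying Grafana. A quick sanity check, based on the names and labels used in the example above:
kubectl get pods -n kubeflow -l app=promtail
kubectl logs -n kubeflow deployment/promtail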