Charmed Spark K8s
- Canonical | bundle
Channel | Revision | Published |
---|---|---|
latest/edge | 4 | 06 Aug 2024 |
3.4/edge | 4 | 06 Aug 2024 |
juju deploy spark-k8s-bundle --channel edge
Enabling GPU acceleration
The Charmed Apache Spark solution offers an OCI image that supports the Apache Spark Rapids plugin, which enables GPU acceleration for Spark jobs.
Setup
After installing the spark-client snap and MicroK8s with the GPU addon enabled, we can launch Spark jobs with GPU acceleration on Kubernetes.
First, we need to create a pod template to limit the number of GPUs per container.
Edit the pod manifest file (we'll refer to it as gpu_executor_template.yaml) by adding the following content:
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: executor
      resources:
        limits:
          nvidia.com/gpu: 1
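If you prefer to create the file from the command line, the following is a minimal sketch that writes the same manifest with a heredoc and then checks that the GPU limit is present. It assumes the file should live in the current working directory, which the guide does not state.

```shell
# Write the executor pod template into the current directory
# (the location is an assumption; adjust the path as needed).
cat > gpu_executor_template.yaml <<'EOF'
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: executor
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

# Show the line carrying the GPU limit to confirm the file was written.
grep -n 'nvidia.com/gpu' gpu_executor_template.yaml
```

The quoted 'EOF' delimiter keeps the shell from expanding anything inside the manifest, so the YAML is written exactly as shown.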
Submitting a Spark job with GPU acceleration
Using the spark-client snap, we can submit the desired Spark job, adding some configuration options to enable GPU acceleration:
spark-client.spark-submit \
    ... \
    --conf spark.executor.resource.gpu.amount=1 \
    --conf spark.task.resource.gpu.amount=1 \
    --conf spark.rapids.memory.pinnedPool.size=1G \
    --conf spark.plugins=com.nvidia.spark.SQLPlugin \
    --conf spark.executor.resource.gpu.discoveryScript=/opt/getGpusResources.sh \
    --conf spark.executor.resource.gpu.vendor=nvidia.com \
    --conf spark.kubernetes.container.image=ghcr.io/canonical/charmed-spark-gpu:3.4-22.04_edge \
    --conf spark.kubernetes.executor.podTemplateFile=gpu_executor_template.yaml \
    ...
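To avoid retyping the GPU options for every job, one possibility is a small shell wrapper that prepends them to whatever other arguments you pass. The sketch below is illustrative only: the function name submit_gpu_job and the SPARK_SUBMIT override variable are hypothetical names introduced here, not part of the spark-client snap. The option values are copied from the command above.

```shell
# Hypothetical helper: prepend the GPU-related options shown above to the
# caller's own spark-submit arguments.
submit_gpu_job() {
  set -- \
    --conf spark.executor.resource.gpu.amount=1 \
    --conf spark.task.resource.gpu.amount=1 \
    --conf spark.rapids.memory.pinnedPool.size=1G \
    --conf spark.plugins=com.nvidia.spark.SQLPlugin \
    --conf spark.executor.resource.gpu.discoveryScript=/opt/getGpusResources.sh \
    --conf spark.executor.resource.gpu.vendor=nvidia.com \
    --conf spark.kubernetes.container.image=ghcr.io/canonical/charmed-spark-gpu:3.4-22.04_edge \
    --conf spark.kubernetes.executor.podTemplateFile=gpu_executor_template.yaml \
    "$@"
  # SPARK_SUBMIT is a hypothetical override; by default the snap command runs.
  "${SPARK_SUBMIT:-spark-client.spark-submit}" "$@"
}

# Dry run: print the assembled arguments instead of submitting anything.
( SPARK_SUBMIT=echo; submit_gpu_job )
```

Calling submit_gpu_job with your remaining options and application arguments would then run spark-client.spark-submit with the GPU settings already in place. The dry run in a subshell lets you review the assembled options without leaving the override set in your session.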
The Apache Spark configuration options can also be set at the service account level using the Spark Client snap, so that they apply to every job. Please refer to the guide on how to manage options at the service account level. For more information on how the Apache Spark Client manages configuration options, please refer to the explanation section.
The options above are the minimum needed to enable the Apache Spark Rapids plugin. For more information on available options, see the full list.