Charmed Spark K8s

  • Canonical Data Platform | bundle
Channel Revision Published
latest/edge 4 06 Aug 2024
3.4/edge 4 06 Aug 2024
juju deploy spark-k8s-bundle --channel edge
Show information

Platform:

Working with Charmed Kubernetes from within a pod

Setup

After installing Juju and Charmed Kubernetes (together with applying setup for the latter ), now we can look into how to launch Spark jobs from within a pod in Charmed Kubernetes.

First we create a pod using Canonical’s Charmed Spark container image.

Edit the pod manifest file (we’ll refer to it as shell-demo.yaml) by adding the following conent:

apiVersion: v1
kind: Pod
metadata:
  name: shell-demo
  namespace: default
spec:
  containers:
  - name: spark-client
    image: ghcr.io/canonical/charmed-spark:3.4-22.04_stable
    command: ["/bin/pebble", "run", "--hold"]
  serviceAccountName: spark
  hostNetwork: true
  dnsPolicy: Default

The pod can be created with the following command

kubectl apply -f shell-demo.yaml

We can log into the pod as below

$ kubectl exec --stdin --tty shell-demo -- /bin/bash 

Now let’s create the Kubernetes configuration on the pod, with contents from the original server’s .kube/config

$ mkdir ~/.kube
$ cat > ~/.kube/config << EOF
<KUBECONFIG_CONTENTS_FROM_CHARMED_KUBERNETES>
EOF

Then we need to set up a service account for Spark job submission. Let’s create a user called spark in default namespace.

$ python3 -m spark8t.cli.service_account_registry create --username spark

Spark Job Submission To Kubernetes Cluster

There is a script called spark-submit packaged within the Charmed Spark container image for Spark job submission. We can sue the Spark Pi job example again, such as

$ python3 -m spak8t.cli.spark_submit --username spark --class org.apache.spark.examples.SparkPi local:///opt/spark/examples/jars/spark-examples_2.12-3.3.2.jar 100

Or using the snap command (referring practically to the same thing):

$ spark-client.spark-submit --username spark --class org.apache.spark.examples.SparkPi local:///opt/spark/examples/jars/spark-examples_2.12-3.3.2.jar 100

Spark Shell

To invoke the Spark shell, you can run the following command within the pod.

$ python3 -m spark8t.cli.spark_shell --username spark

Or

$ spark-client.spark-shell --username spark

PySpark Shell

To launch a pyspark shell, run the following command within the pod.

$ python3 -m spark8t.cli.pyspark --username spark

Or

$ spark-client.pyspark --username spark

Help improve this document in the forum (guidelines). Last updated 4 months ago.