Interactive Scala with Almond

Almond is a Scala kernel for Jupyter. Some features: Ammonite, a Scala REPL implementation, and Coursier, an artefact manager. You can deploy Almond on Kubernetes with the following manifest:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: almond
      labels:
        app: almond
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: almond
      template:
        metadata:
          labels:
            app: almond
        spec:
          containers:
            - name: almond
              image: almondsh/almond:0.13.11
              resources:
                requests:
                  memory: 384Mi
                limits:
                  memory: 384Mi
              ports:
                - containerPort: 8888
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: almond
    spec:
      type: ClusterIP
      selector:
        app: almond
      ports:
        - protocol: TCP
          port: 8888
          targetPort: 8888
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: almond-headless
    spec:
      clusterIP: None
      selector:
        app: almond

Port forward:...

May 7, 2023 · 1 min · 193 words · Joost

Overview of Spark configurations

I find myself looking for an overview too often, so let’s create a rough overview of commonly used Spark configuration. As a start, create a Spark Session with default config:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder \
        .master(SPARK_MASTER) \
        .appName("app name") \
        .getOrCreate()

The Spark Context represents the connection to the cluster; it communicates with lower-level APIs and RDDs. Some resource settings on the driver:

    ...
    .config("spark.driver.memory", "8g")
    ...
    .config("spark.cores.max", "4")
    ....
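To keep such settings in one place, they can be collected in a dictionary and applied to the builder in a loop. The sketch below is only an illustration: the helper names `build_spark_conf` and `create_session` are hypothetical, and the default values simply mirror the two settings mentioned above.

```python
def build_spark_conf(driver_memory="8g", max_cores=4):
    """Collect Spark settings as plain strings (hypothetical helper)."""
    return {
        "spark.driver.memory": driver_memory,
        "spark.cores.max": str(max_cores),
    }


def create_session(master, app_name, conf):
    """Apply every setting to the builder, then create or reuse the session."""
    # Imported lazily so build_spark_conf works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master(master).appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With pyspark installed, `create_session("local[2]", "app name", build_spark_conf())` should return a session carrying these settings; keeping the dictionary separate makes it easy to print or diff the configuration before the session starts.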

November 8, 2021 · 3 min · 431 words · Joost

Provide Spark with cross-account access

Providing Spark with access to resources in a different AWS account turned out to be quite tricky to figure out. Let’s assume you have two AWS accounts: the alpha account, where you run Python with the IAM role alpha-role and have access to the Spark cluster; and the beta account, which holds the S3 bucket you want to access. You could give the alpha-role S3 read access directly, but creating an access-role in the beta account that the alpha-role can assume is more persistent and easier to manage....
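One way this setup could look in Python is sketched below. The full post is truncated here, so this is not necessarily the approach it takes: it assumes boto3 is available, that Spark reads S3 through the S3A connector (hadoop-aws), and the helper names `trust_policy`, `assume_access_role` and `s3a_conf` are hypothetical.

```python
def trust_policy(alpha_account_id, alpha_role_name="alpha-role"):
    """Trust policy for the access-role in beta: lets alpha-role assume it."""
    principal = f"arn:aws:iam::{alpha_account_id}:role/{alpha_role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_access_role(role_arn, session_name="spark-cross-account"):
    """Request temporary credentials for the access-role via STS."""
    # Imported lazily so the other helpers work without boto3 installed.
    import boto3

    response = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName=session_name
    )
    return response["Credentials"]


def s3a_conf(credentials):
    """Map temporary STS credentials onto S3A settings for Spark."""
    return {
        "spark.hadoop.fs.s3a.aws.credentials.provider":
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider",
        "spark.hadoop.fs.s3a.access.key": credentials["AccessKeyId"],
        "spark.hadoop.fs.s3a.secret.key": credentials["SecretAccessKey"],
        "spark.hadoop.fs.s3a.session.token": credentials["SessionToken"],
    }
```

The dictionary returned by `s3a_conf` can be passed to the session builder as `.config(key, value)` pairs. Temporary credentials expire, so a long-running job would need to refresh them; that part is left out of this sketch.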

August 21, 2020 · 2 min · 413 words · Joost