Provide Spark with cross-account access

Providing Spark with resources from a different AWS account can be quite tricky to figure out. Let’s assume you have two AWS accounts: the alpha account, where you run Python with the IAM role alpha-role and have access to the Spark cluster; and the beta account, which holds the S3 bucket you want to access. You could grant S3 read access to the alpha-role directly, but it is more durable and easier to manage to create an access-role in the beta account that the alpha-role is allowed to assume. ...
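The key piece of such a setup is the trust policy on the access-role in the beta account, which is what allows the alpha-role to assume it. The actual configuration is in the full post; a minimal sketch of what that trust policy could look like, with a placeholder account ID (111111111111 standing in for the alpha account), is:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:role/alpha-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

The access-role itself then only needs an attached policy granting read access to the bucket in the beta account.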

August 21, 2020 · 2 min · 413 words · Joost

Upload Gitlab CI artifacts to S3

With GitLab CI it is incredibly easy to build a Hugo website (like mine); you can even host it there. But in my case I use AWS S3 and CloudFront because it is cheap and easy to set up. The CI pipeline to build and upload the static website is also straightforward with the following .gitlab-ci.yml:

```yaml
variables:
  GIT_SUBMODULE_STRATEGY: recursive

stages:
  - build
  - upload

build:
  stage: build
  image: monachus/hugo
  script:
    - hugo version
    - hugo
  only:
    - master
  artifacts:
    paths:
      - ./public

upload:
  stage: upload
  dependencies:
    - build
  image: dobdata/primo-triumvirato:v0.1.7
  script:
    - aws --version
    - aws configure set region $AWS_DEFAULT_REGION
    - aws s3 sync --delete ./public s3://$S3_BUCKET
  only:
    - master
```

The build stage generates the static website, which is shared with subsequent stages as an artifact. The upload stage uses my primo-triumvirato image, but this can be any image that has the AWS CLI installed. The aws s3 sync --delete command recursively copies new and updated files from the source directory to the destination and deletes files that exist in the destination but not in the source. ...
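Since CloudFront sits in front of the bucket, a natural follow-up (not shown in the excerpt above) is invalidating its cache after the sync. A sketch of an extra job, assuming an additional invalidate stage is appended to stages and that CLOUDFRONT_DISTRIBUTION_ID is set as a CI/CD variable, could be:

```yaml
# Sketch only: assumes an extra `invalidate` stage and a CLOUDFRONT_DISTRIBUTION_ID variable.
invalidate:
  stage: invalidate
  image: dobdata/primo-triumvirato:v0.1.7   # any image with the AWS CLI works
  script:
    - aws cloudfront create-invalidation --distribution-id $CLOUDFRONT_DISTRIBUTION_ID --paths "/*"
  only:
    - master
```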

July 5, 2020 · 1 min · 206 words · Joost

Secure deployment to Kubernetes with a service account

Now that I have a number of pipelines running, I would like to deploy them to Kubernetes through a service account. That is quite simple. As an admin user, provide resources such as: the namespaces, optionally with limited resources; an isolated service account with restricted access to one namespace; and an encoded config file to be used by the GitLab pipeline. Service Account with permissions The following file, serviceaccount.yaml, creates the service account and a role, and attaches that role to the account: ...
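The actual serviceaccount.yaml is in the full post; a minimal sketch of such a manifest, assuming a hypothetical namespace pipelines and a service account named gitlab-deployer, could look like:

```yaml
# Sketch only: namespace, names, and the resource list are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitlab-deployer
  namespace: pipelines
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer-role
  namespace: pipelines
rules:
  - apiGroups: ["", "apps"]            # core resources plus deployments
    resources: ["pods", "services", "deployments"]
    verbs: ["get", "list", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployer-binding
  namespace: pipelines
subjects:
  - kind: ServiceAccount
    name: gitlab-deployer
    namespace: pipelines
roleRef:
  kind: Role
  name: deployer-role
  apiGroup: rbac.authorization.k8s.io
```

Because the Role and RoleBinding are namespaced, the service account cannot touch anything outside its own namespace; its token is what ends up, base64-encoded, in the config file used by the pipeline.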

April 28, 2020 · 2 min · 373 words · Joost

Kubernetes for the hobbyist with Kops

Earlier I posted about my hobby cluster on GKE, which I want to keep under an affordable budget. Unfortunately, Google Cloud will start charging a management fee of $0.10 per hour (= $73/month) from June 2020, just like AWS. If they unilaterally change the rules, let’s get out of here! I’m thinking of moving to a self-managed Kubernetes cluster on AWS with spot instances: 1 x 1GiB master node (t2.micro spot instance, $2.920/month) and 2 x 2GiB worker nodes (t3.small spot instance, $5.256/month each), for a total estimated monthly cost of $13.43 (~€15.10 incl. VAT). So, let’s deploy a self-managed Kubernetes cluster on AWS using Kops. ...
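The full walkthrough is in the post; as an illustration of the spot-instance part, a kops instance group for the two workers could look roughly like this (cluster name and subnet are hypothetical; the maxPrice bid of $0.0072/hour corresponds to the $5.256/month figure above):

```yaml
# Sketch only: cluster name and subnet are placeholders.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
  labels:
    kops.k8s.io/cluster: hobby.k8s.local   # hypothetical cluster name
spec:
  role: Node
  machineType: t3.small
  minSize: 2
  maxSize: 2
  maxPrice: "0.0072"   # setting a bid price makes kops request spot instances
  subnets:
    - eu-west-1a       # hypothetical availability zone
```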

March 22, 2020 · 2 min · 350 words · Joost

Terraform Pipelines with GitLab CI

GitLab CI is awesomelishiously simple. Let’s assume you have a Terraform GitLab project with a folder structure like mine:

```
README.md
.gitignore
terraform
│   main.tf
│   outputs.tf
└── variables.tf
```

You can find a .gitignore example here. Since we can provide our credentials via environment variables, the provider can look like:

```hcl
provider "aws" {
  version = ">= 2.28.1"
}
```

In the GitLab project page, go to “Settings” > “CI/CD” > “Variables”, and set the following variables: ...
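The variable list itself is truncated above, so I won’t guess at it; but as an illustration of where those variables end up, a minimal pipeline sketch on top of this layout could look like the following (image tag, stage names, and plan-file name are my own choices, not necessarily the post’s):

```yaml
# Sketch only: assumes AWS credentials are exposed as CI/CD variables and picked up
# by the provider from the environment, and that a remote state backend is configured.
image:
  name: hashicorp/terraform:light
  entrypoint: [""]        # override the image's terraform entrypoint so scripts run in a shell

stages:
  - plan
  - apply

before_script:
  - cd terraform
  - terraform init

plan:
  stage: plan
  script:
    - terraform plan -out=planfile
  artifacts:
    paths:
      - terraform/planfile

apply:
  stage: apply
  dependencies:
    - plan
  script:
    - terraform apply -input=false planfile
  when: manual            # keep the apply step a deliberate, manual action
  only:
    - master
```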

March 16, 2020 · 2 min · 243 words · Joost