Provide Spark with cross-account access

Giving Spark access to resources in a different AWS account is something I found quite tricky to figure out. Let’s assume you have two AWS accounts: the alpha account, where you run Python with the IAM role alpha-role and have access to the Spark cluster; and the beta account, which holds the S3 bucket you want to access. You could give S3 read access to the alpha-role directly, but it is more persistent and easier to manage to create an access-role in the beta account that can be assumed by the alpha-role. ...
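As a rough illustration of the role setup described above, here is a minimal sketch of the two IAM policy documents involved, built as plain Python dictionaries. The account IDs are made up, and the helper names `trust_policy` and `assume_role_policy` are hypothetical; the full post may set this up differently.

```python
import json

# Hypothetical 12-digit account IDs, for illustration only.
ALPHA_ACCOUNT_ID = "111111111111"
BETA_ACCOUNT_ID = "222222222222"


def role_arn(account_id: str, role_name: str) -> str:
    """Build the ARN of an IAM role in the given account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def trust_policy(trusted_account_id: str, trusted_role: str) -> dict:
    """Trust policy for access-role (beta account): who may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn(trusted_account_id, trusted_role)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_role_policy(target_account_id: str, target_role: str) -> dict:
    """Permission policy for alpha-role (alpha account): what it may assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": role_arn(target_account_id, target_role),
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(trust_policy(ALPHA_ACCOUNT_ID, "alpha-role"), indent=2))
    print(json.dumps(assume_role_policy(BETA_ACCOUNT_ID, "access-role"), indent=2))
```

The access-role would additionally need a policy granting read access to the bucket itself; that part is left out here.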

August 21, 2020 · 2 min · 413 words · Joost

Upload GitLab CI artifacts to S3

With GitLab CI it is incredibly easy to build a Hugo website (like mine); you can even host it there. But in my case I use AWS S3 and CloudFront because it is cheap and easy to set up. The CI pipeline to build and upload the static website is also straightforward with the following .gitlab-ci.yml:

    variables:
      GIT_SUBMODULE_STRATEGY: recursive

    stages:
      - build
      - upload

    build:
      stage: build
      image: monachus/hugo
      script:
        - hugo version
        - hugo
      only:
        - master
      artifacts:
        paths:
          - ./public

    upload:
      stage: upload
      dependencies:
        - build
      image: dobdata/primo-triumvirato:v0.1.7
      script:
        - aws --version
        - aws configure set region $AWS_DEFAULT_REGION
        - aws s3 sync --delete ./public s3://$S3_BUCKET
      only:
        - master

The build stage generates the static website, which is shared with successive stages as an artifact. The upload stage uses my primo-triumvirato image, but this can be any image that has the AWS CLI installed. The sync --delete ... command recursively copies new and updated files from the source directory to the destination and deletes files that exist in the destination but not in the source. ...
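To make the sync --delete behaviour concrete, here is a small Python sketch that models the decision on in-memory file listings. It is a simplification and not the AWS CLI's actual code: `FileInfo` and `plan_sync` are hypothetical names, and the comparison assumes the CLI's default rule that a file is copied when it is missing at the destination, differs in size, or is newer at the source.

```python
from typing import Dict, List, NamedTuple, Tuple


class FileInfo(NamedTuple):
    size: int      # size in bytes
    mtime: float   # last-modified time, seconds since the epoch


def plan_sync(
    source: Dict[str, FileInfo],
    dest: Dict[str, FileInfo],
    delete: bool = False,
) -> Tuple[List[str], List[str]]:
    """Return (keys to upload, keys to delete) for a one-way sync."""
    uploads = sorted(
        key
        for key, src in source.items()
        if key not in dest                  # new file
        or src.size != dest[key].size       # size changed
        or src.mtime > dest[key].mtime      # source is newer
    )
    # Without the delete flag, files only present in the destination are kept.
    deletes = sorted(key for key in dest if key not in source) if delete else []
    return uploads, deletes


if __name__ == "__main__":
    public = {
        "index.html": FileInfo(10, 200.0),
        "about.html": FileInfo(5, 100.0),
        "new.html": FileInfo(1, 50.0),
    }
    bucket = {
        "index.html": FileInfo(10, 100.0),
        "about.html": FileInfo(5, 150.0),
        "old.html": FileInfo(3, 10.0),
    }
    print(plan_sync(public, bucket, delete=True))
```

The delete step is what keeps pages removed from the Hugo site from lingering in the bucket.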

July 5, 2020 · 1 min · 206 words · Joost

Kubernetes for the hobbyist with Kops

Earlier I posted about my hobby cluster on GKE, which I want to keep within an affordable budget. Unfortunately, Google Cloud will start charging a management fee of $0.10 per hour (= $73/month) from June 2020, just like AWS. If they unilaterally change the rules, let’s get out of here! I’m thinking of moving to a self-managed Kubernetes cluster on AWS with spot instances:

- 1 x 1GiB master node (t2.micro spot instance, $2.920/month)
- 2 x 2GiB worker nodes (t3.small spot instance, $5.256/month each)

With a total estimated monthly cost of $13.43 (~€15.10 incl. VAT). So, let’s deploy a self-managed Kubernetes cluster on AWS using Kops. ...
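The figures above can be checked with a few lines of Python. This sketch assumes 730 hours per month (365 days x 24 hours / 12) and reads the worker price as per node, which is the reading that reproduces the quoted total; `monthly_total` is a hypothetical helper name.

```python
from typing import List, Tuple

HOURS_PER_MONTH = 730  # assumption: 365 days * 24 hours / 12 months

# (description, node count, spot price per node per month in USD)
NODES: List[Tuple[str, int, float]] = [
    ("master, t2.micro", 1, 2.920),
    ("worker, t3.small", 2, 5.256),
]


def monthly_total(nodes: List[Tuple[str, int, float]]) -> float:
    """Sum count * monthly price over all node groups."""
    return sum(count * price for _, count, price in nodes)


def management_fee(hourly_fee: float = 0.10) -> float:
    """Monthly cost of a flat hourly cluster management fee."""
    return hourly_fee * HOURS_PER_MONTH


if __name__ == "__main__":
    print(f"management fee: ${management_fee():.2f}/month")
    print(f"spot cluster:   ${monthly_total(NODES):.2f}/month")
```

The compute estimate for the whole spot cluster is well under a fifth of the management fee alone.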

March 22, 2020 · 2 min · 350 words · Joost

Deploy to ECS Fargate with Jenkins

In this post I demonstrate a simple container deployment setup: a Jenkins pipeline to Elastic Container Registry (ECR) and Fargate on Elastic Container Service (ECS). I assume you have Jenkins running, with a pipeline and Git repo webhook tied to it. Besides the default Jenkins plugins, you’ll need the Pipeline Utility Steps plugin. I also assume you already have an ECR repository, an ECS Fargate cluster and an AWS service account with credentials. I decided not to use the AWS credentials plugin since it is too implicit. So instead, set regular username & password: ...

February 24, 2020 · 2 min · 394 words · Joost