
Introduction
S3 Deployment is a CDK module from AWS (still marked "experimental" as of June 2021, at the time of writing) that allows populating an S3 bucket with the contents of .zip files from other S3 buckets or from a local disk.
TLDR
- Use S3 Deployment only if you need to upload fewer than 100 files.
- Never use S3 Deployment if you need to upload more than 100 files/pages, which will be the usual case for most people.
Quick Example to see the main issue
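To make the walkthrough below concrete, here is a minimal sketch of the kind of stack it refers to (CDK v1 TypeScript, closely following the aws-s3-deployment README; the stack and construct names are placeholders, only ./website-dist and websiteBucket come from the text below):
import * as cdk from '@aws-cdk/core';
import * as s3 from '@aws-cdk/aws-s3';
import * as s3deploy from '@aws-cdk/aws-s3-deployment';

export class StaticWebsiteStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Destination bucket that will serve the static website
    const websiteBucket = new s3.Bucket(this, 'WebsiteBucket', {
      websiteIndexDocument: 'index.html',
      publicReadAccess: true,
    });

    // Zips ./website-dist, uploads it to an intermediary assets bucket,
    // then a Lambda-backed custom resource copies it into websiteBucket
    new s3deploy.BucketDeployment(this, 'DeployWebsite', {
      sources: [s3deploy.Source.asset('./website-dist')],
      destinationBucket: websiteBucket,
    });
  }
}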
This is what happens under the hood:
- When this stack is deployed (either via cdk deploy or via CI/CD), the contents of the local website-dist directory will be archived and uploaded to an intermediary assets bucket. This will be the main issue if you have websites with hundreds of pages.
- The BucketDeployment construct synthesizes a custom CloudFormation resource of type Custom::CDKBucketDeployment into the template. The source bucket/key is set to point to the assets bucket.
- The custom resource downloads the .zip archive, extracts it and issues aws s3 sync --delete against the destination bucket (in this case websiteBucket).
The problem
The problem is that the intermediary bucket is not really needed: it mostly results in copying the files twice!
With 5 files you won't notice, but with websites of hundreds or thousands of pages you will be stuck in a long deployment that gets cut off automatically after 15 minutes and that basically uploads everything twice.
This is because, under the hood, the module uses a Lambda function to move the files from the intermediary bucket to the destination bucket, and a Lambda execution is capped at 15 minutes.
Alternatives or solutions?
1. The best solution: just use the simplest command to upload files to S3, aws s3 cp, like this:
aws s3 cp DIRECTORY_OR_FILE s3://BUCKET_NAME --recursive --quiet
2. Second best choice: use a GitHub Action to upload the files without having to remember the previous S3 command: https://github.com/shallwefootball/upload-s3-action
name: Upload to S3
on: [pull_request]
jobs:
  upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - uses: shallwefootball/s3-upload-action@master
        with:
          aws_key_id: ${{ secrets.AWS_KEY_ID }}
          aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws_bucket: BUCKET_NAME
          source_dir: 'DIRECTORY_OR_FILE'
3. Not suggested: increase the memory of the deployment Lambda with the memoryLimit parameter, from the default 128 MB to 1024 MB, as sketched below; this might help but won't make a difference with more than 200 files.
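For reference, option 3 boils down to a one-line change on the BucketDeployment from the earlier sketch (again just a sketch, reusing the same websiteBucket and ./website-dist names):
new s3deploy.BucketDeployment(this, 'DeployWebsite', {
  sources: [s3deploy.Source.asset('./website-dist')],
  destinationBucket: websiteBucket,
  // Give the deployment Lambda more memory (and with it more CPU/network);
  // the default is 128 MB, but the 15-minute Lambda timeout still applies
  memoryLimit: 1024,
});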
Resources
- docs.aws.amazon.com/cdk/api/latest/docs/aws-s3-deployment-readme.html
- gist.github.com/riccardogiorato/428d015dee054d3cb9b06941851ab018
- docs.aws.amazon.com/cli/latest/reference/s3/cp.html
- github.com/shallwefootball/upload-s3-action
If you want to help me out you can:
- Join Medium Membership to get my future articles: https://riccardogiorato.medium.com/membership
- Follow me on Twitter and suggest other topics to cover: https://twitter.com/riccardogiorato