
Introduction
S3 Deployment is a CDK module from AWS (still marked "experimental" as of June 2021, at the time of writing) that allows populating an S3 bucket with the contents of .zip files from other S3 buckets or from a local disk.
TLDR
- Use S3 Deployment only if you need to upload fewer than 100 files.
- Never use S3 Deployment if you need to upload more than 100 files/pages, which will be the usual case for most people.
Quick Example to see the main issue
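To make the walkthrough below concrete, here is a minimal sketch of the kind of stack it refers to (CDK v1 TypeScript, closely following the aws-s3-deployment README; the stack and construct names are placeholders, only ./website-dist and websiteBucket come from the text below):
import * as cdk from '@aws-cdk/core';
import * as s3 from '@aws-cdk/aws-s3';
import * as s3deploy from '@aws-cdk/aws-s3-deployment';

export class StaticWebsiteStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Destination bucket that will serve the static website
    const websiteBucket = new s3.Bucket(this, 'WebsiteBucket', {
      websiteIndexDocument: 'index.html',
      publicReadAccess: true,
    });

    // Zips ./website-dist, uploads it to an intermediary assets bucket,
    // then a Lambda-backed custom resource copies it into websiteBucket
    new s3deploy.BucketDeployment(this, 'DeployWebsite', {
      sources: [s3deploy.Source.asset('./website-dist')],
      destinationBucket: websiteBucket,
    });
  }
}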
This is what happens under the hood:
- When this stack is deployed (either via cdk deploy or via CI/CD), the contents of the local website-dist directory will be archived and uploaded to an intermediary assets bucket. This will be the main issue if you have websites with hundreds of pages.
- The BucketDeployment construct synthesizes a custom CloudFormation resource of type Custom::CDKBucketDeployment into the template. The source bucket/key is set to point to the assets bucket.
- The custom resource downloads the .zip archive, extracts it and issues aws s3 sync --delete against the destination bucket (in this case websiteBucket).
The problem
The problem is that the intermediary bucket is not really needed: it mostly results in copying the files twice!
With 5 files you won't notice, but with websites of hundreds or thousands of pages you will be stuck in a long deployment that gets cut off automatically after 15 minutes and that basically uploads everything twice.
This is because, under the hood, the module uses a Lambda function to move the files from the intermediary bucket to the destination bucket, and a Lambda execution is capped at 15 minutes.
Alternatives or solutions?
1. The best solution: just use the simplest command to upload files to S3, aws s3 cp, like this:
aws s3 cp DIRECTORY_OR_FILE s3://BUCKET_NAME --recursive --quiet
2. Second best choice: use a GitHub Action to upload the files without having to remember the previous S3 command: https://github.com/shallwefootball/upload-s3-action
name: Upload to S3
on: [pull_request]
jobs:
  upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - uses: shallwefootball/s3-upload-action@master
        with:
          aws_key_id: ${{ secrets.AWS_KEY_ID }}
          aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws_bucket: BUCKET_NAME
          source_dir: 'DIRECTORY_OR_FILE'
3. Not suggested: increase the memory of the deployment Lambda with the memoryLimit parameter, from the default 128 MB to 1024 MB, as sketched below; this might help but won't make a difference with more than 200 files.
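For reference, option 3 boils down to a one-line change on the BucketDeployment from the earlier sketch (again just a sketch, reusing the same websiteBucket and ./website-dist names):
new s3deploy.BucketDeployment(this, 'DeployWebsite', {
  sources: [s3deploy.Source.asset('./website-dist')],
  destinationBucket: websiteBucket,
  // Give the deployment Lambda more memory (and with it more CPU/network);
  // the default is 128 MB, but the 15-minute Lambda timeout still applies
  memoryLimit: 1024,
});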
Resources
- docs.aws.amazon.com/cdk/api/latest/docs/aws-s3-deployment-readme.html
- gist.github.com/riccardogiorato/428d015dee054d3cb9b06941851ab018
- docs.aws.amazon.com/cli/latest/reference/s3/cp.html
- github.com/shallwefootball/upload-s3-action
If you want to help me out you can:
- Join Medium Membership to get my future articles: https://riccardogiorato.medium.com/membership
- Follow me on Twitter and suggest other topics to cover: https://twitter.com/riccardogiorato