The AWS CDK makes it really easy to create a VPC and subnets, and to use those instantiated objects in your CDK project when creating VPC-bound assets like EC2 and RDS instances. Most people, when they first start using the CDK, are blown away by how little Python code is required to do this:
vpc = ec2.Vpc(self, "MyVPC")
That’s it! 🎉
But, what if you need to use existing VPCs and subnets?
Often this will be the case where an administrator is using a product like Control Tower or Stax to manage accounts and networking. Or maybe you just want to deploy something new into an existing environment where the VPCs were created through the console?
This is one of those situations where I went into the task with CDK feeling pretty confident this would be a well-trodden path with easy and convenient ways to get things done. Turns out I was wrong 😂, so I thought I would share what I encountered.
Reading from existing resources
If you are familiar with CDK you will know that Level 2 (L2) constructs often have methods that let you instantiate the construct with properties from a resource that already exists in your account. These methods are typically prefixed with “from” and take resource-identifying data that CDK uses to find the resource in your account.
This leads us to our first challenge — how do we find or deduce the identifying data we will need to use these methods?
Identifying existing resources
There are a number of ways we can retrieve identifying information about existing resources in an account:
- Look in the console
- Use CloudFormation exported values from the stacks that created the resources, using cdk.Fn.import_value()
- AWS CLI calls, e.g. aws ec2 describe-vpcs and aws ec2 describe-subnets
- A data store of values, e.g. in parameter store
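To make the CLI option concrete, here is a minimal sketch of pulling the identifying fields out of the JSON that aws ec2 describe-subnets prints. The sample output is trimmed and made up, and summarise_subnets is a hypothetical helper name, not part of any AWS tooling:

```python
import json

# Trimmed, made-up sample of what `aws ec2 describe-subnets` prints.
sample_output = """
{
  "Subnets": [
    {"SubnetId": "subnet-0aaa", "VpcId": "vpc-0123", "CidrBlock": "10.0.0.0/24",
     "AvailabilityZone": "ap-southeast-2a"},
    {"SubnetId": "subnet-0bbb", "VpcId": "vpc-0123", "CidrBlock": "10.0.1.0/24",
     "AvailabilityZone": "ap-southeast-2b"}
  ]
}
"""


def summarise_subnets(describe_subnets_json):
    """Return (subnet_id, cidr_block, availability_zone) tuples for each subnet."""
    subnets = json.loads(describe_subnets_json)["Subnets"]
    return [(s["SubnetId"], s["CidrBlock"], s["AvailabilityZone"]) for s in subnets]


for row in summarise_subnets(sample_output):
    print(row)
```

The same fields (subnet ID, CIDR block and AZ) are the ones that matter later when importing subnets into CDK.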
In my case, the data I need to uniquely identify my VPC is its vpc_id or vpc_cidr_block, and for the subnets, I’ll need ipv4_cidr_block or subnet_id.
As a general rule, we don’t want to hard-code values like the vpc_id or a stack_name in our code, so copying the values from the console into our code is out. The CloudFormation stacks in my case did not have guessable names and I didn’t want to wrangle CLI calls in my CDK project, so I decided it was easiest to just create parameter store values. This lets me store the data I need under known key names that are the same in every account, so my code does not need to change when deploying the same stack to other accounts.
There was another reason for going this way. For some operations the CDK tries to guess the type of each subnet in your VPC (isolated, private or public), and in my case it did not derive these correctly: it could not tell the difference between isolated and private subnets. My solution was to make explicit parameter store lists of the subnets of each type.
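Here is a minimal sketch of that convention, assuming the subnet IDs are stored as one comma-separated string per subnet type. The key pattern follows the "/iso-subnets" name used in the gist further down; the other key names and the helpers subnet_param_name and parse_subnet_ids are hypothetical, not part of the CDK:

```python
# Hypothetical naming convention: one parameter per subnet type,
# e.g. "/iso-subnets", "/private-subnets", "/public-subnets".
SUBNET_TYPES = ("iso", "private", "public")


def subnet_param_name(subnet_type):
    """Return the parameter store key holding the IDs for one subnet type."""
    if subnet_type not in SUBNET_TYPES:
        raise ValueError(f"unknown subnet type: {subnet_type!r}")
    return f"/{subnet_type}-subnets"


def parse_subnet_ids(raw_value):
    """Split a comma-separated parameter value into clean subnet IDs."""
    return [part.strip() for part in raw_value.split(",") if part.strip()]


print(subnet_param_name("iso"))
print(parse_subnet_ids("subnet-0aaa, subnet-0bbb"))
```

Stripping whitespace and dropping empty entries makes the list tolerant of values typed by hand, such as a trailing comma.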
Instantiating the VPC object
Now that I have my vpc_id and subnet_id lists available, I can start constructing my Vpc and Subnet objects.
To populate the Vpc object there are two methods:
Vpc.from_lookup(scope, id, options)
Vpc.from_vpc_attributes(scope, id, attrs)
Both of these will import an existing VPC into your Vpc object, but in materially different ways. Vpc.from_vpc_attributes will give you a Vpc object that contains unresolved tokens, which may not do what you expect when you try to use it in your CDK project, e.g. to create an EC2 instance. See the notes in the documentation.
Vpc.from_lookup, on the other hand, is resolved while your app is being synthesized, so your Vpc object holds concrete values and is usable as a synth-time argument for other CDK objects.
To use from_lookup, make sure you fetch your parameter store values with ssm.StringParameter.value_from_lookup(), so that you pass it resolved values and not tokens.
Instantiating the Subnets
Similar to the Vpc class, the CDK Subnet class has two methods for creating Subnet objects from metadata about your existing resources:
Subnet.from_subnet_id(scope, id, subnet_id)
Subnet.from_subnet_attributes(scope, id, attrs)
I bet you’re thinking “I know what happens next, we’ll use the simpler from_subnet_id method to create the subnets”, but no! Here we hit another wrinkle…
The Subnet object has a property availability_zone. When you create a Subnet using only a subnet_id, it does not get an availability_zone set. Some things you might want to do with a Subnet have no issue with that, but others, like ec2.Instance, absolutely do and will complain if you pass them a Subnet that has no availability_zone. You will get a message saying:
You cannot reference a Subnet’s availability zone if it was not supplied. Add the availabilityZone when importing using Subnet.fromSubnetAttributes()
To remedy this, you need to explicitly set the availability_zone on each Subnet when you create it, which you can only do if:
a) you know the AZ for each Subnet, and
b) you create the Subnet using Subnet.from_subnet_attributes, which lets you set the availability_zone property.
Fortunately, rather than having to create yet another parameter store value to hold the AZ, we can get this information from the Vpc object. To do this we call vpc.select_subnets() with the availability_zones we want to find subnets for, then match the results by subnet_id.
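The matching step can be sketched as a small helper. It assumes only that each subnet object exposes subnet_id and availability_zone, as the subnets on a looked-up Vpc do. The name build_az_lookup is hypothetical, and the stand-in objects and IDs below are made up so the sketch runs without an AWS account:

```python
from types import SimpleNamespace


def build_az_lookup(*subnet_groups):
    """Map subnet_id -> availability_zone across any number of subnet lists."""
    az_lookup = {}
    for group in subnet_groups:
        for subnet in group:
            az_lookup[subnet.subnet_id] = subnet.availability_zone
    return az_lookup


# Stand-ins for the subnets a looked-up Vpc would return (IDs are made up).
public = [SimpleNamespace(subnet_id="subnet-pub-a", availability_zone="ap-southeast-2a")]
private = [SimpleNamespace(subnet_id="subnet-iso-b", availability_zone="ap-southeast-2b")]

az_lookup = build_az_lookup(public, private)
print(az_lookup["subnet-iso-b"])  # ap-southeast-2b
```

In a stack you would pass the real lists instead, for example build_az_lookup(vpc.public_subnets, vpc.private_subnets, vpc.isolated_subnets). Because the lookup is keyed by subnet_id, it should not matter which of those lists a subnet ends up in.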
Putting it all together
Here’s an abbreviated gist of the whole process as described above:
from aws_cdk import aws_ec2 as ec2, aws_ssm as ssm

# everything below runs inside your Stack's __init__, where self is the stack

# get the vpc-id from parameter store
vpc_id = ssm.StringParameter.value_from_lookup(self, "/vpc-id")

# get the Vpc from the id
vpc = ec2.Vpc.from_lookup(self, "vpc", vpc_id=vpc_id)

# get the subnets in AZ a from the vpc
subnets_in_az_a = vpc.select_subnets(availability_zones=["ap-southeast-2a"])

# create AZ lookup: subnet_id -> availability_zone
az_lookup = {}

# iterate over the AZ a subnets
for subnet in subnets_in_az_a.subnets:
    az_lookup[subnet.subnet_id] = subnet.availability_zone

# iterate over the public subnets
for subnet in vpc.public_subnets:
    az_lookup[subnet.subnet_id] = subnet.availability_zone

# fetch the isolated subnets list from parameter store
iso_subnet_ids = ssm.StringParameter.value_from_lookup(self, "/iso-subnets").split(",")

# create a list to store the subnets in
iso_subnets = []

# iterate over the subnet ids and create the full Subnet object including AZ
for iso_count, subnet_id in enumerate(iso_subnet_ids, start=1):
    iso_subnets.append(
        ec2.Subnet.from_subnet_attributes(
            self,
            f"IsoSub{iso_count}",
            subnet_id=subnet_id,
            availability_zone=az_lookup[subnet_id],
        )
    )
NB You will need to duplicate bits of the above code to include the other AZs in your region and the other subnet types you need.
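One way to cut down on that duplication, sketched under the assumption that you already have a dictionary mapping each subnet ID to its AZ: prepare the keyword arguments for a whole list of subnets in one pass, and fail with a readable message when an ID has no known AZ. subnet_attrs is a hypothetical helper, not a CDK function:

```python
def subnet_attrs(subnet_ids, az_lookup):
    """Return one kwargs dict per subnet, ready for Subnet.from_subnet_attributes."""
    missing = [sid for sid in subnet_ids if sid not in az_lookup]
    if missing:
        raise KeyError(f"no availability zone known for: {', '.join(missing)}")
    return [
        {"subnet_id": sid, "availability_zone": az_lookup[sid]}
        for sid in subnet_ids
    ]


# Inside a stack, the result could be used like this (needs aws_cdk, so it is
# left as a comment here):
#
# for index, attrs in enumerate(subnet_attrs(iso_subnet_ids, az_lookup), start=1):
#     iso_subnets.append(
#         ec2.Subnet.from_subnet_attributes(self, f"IsoSub{index}", **attrs)
#     )
```

The same helper can be called once per subnet type, with a different ID list and construct ID prefix each time.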
Making it reusable
If you find this useful, you might want to do what I did and turn the above code into a small Python module that you can import into multiple CDK stacks. I did this by putting the code into a module file, then importing and calling it from my stack class like so:
from network_info import get_network_info
# then inside the class
network_info = get_network_info(self, env_tag)
Where env_tag is set to useful things like dev, staging or prod.
In the module we receive these as arguments:
def get_network_info(calling_stack, env_tag):
and wherever the gist uses self, we pass in calling_stack instead.
I hope you find this helpful!
Let me know with a clap or a comment.