Import Existing VPCs and Subnets into a CDK Python Project

The AWS CDK makes it really easy to create a VPC and subnets and to use those objects in your CDK project when creating VPC-bound resources like EC2 and RDS instances. Most people, when they first start using the CDK, are blown away by how little Python code this takes:

vpc = ec2.Vpc(self, "MyVPC")

That’s it! 🎉

But, what if you need to use existing VPCs and subnets?

Often this will be the case where an administrator is using a product like Control Tower or Stax to manage accounts and networking. Or maybe you just want to deploy something new into an existing environment where the VPCs were created through the console?

This is one of those situations where I went into the task with CDK feeling pretty confident this would be a well-trodden path with easy and convenient ways to get things done. Turns out I was wrong 😂, so I thought I would share what I encountered.

Reading from existing resources

If you are familiar with the CDK, you will know that Level 2 (L2) constructs often have static methods that let you instantiate the construct from a resource that already exists in your account. These methods are typically prefixed with “from” and take resource-identifying data that the CDK uses to reference or find the resource.

This leads us to our first challenge — how do we find or deduce the identifying data we will need to use these methods?

Identifying existing resources

There are a number of ways we can retrieve identifying information about existing resources in an account:

  • Look in the console
  • Use CloudFormation exported values from the stacks that created the resources, using cdk.Fn.import_value()
  • AWS CLI calls, e.g. aws ec2 describe-vpcs and aws ec2 describe-subnets
  • A data store of values, e.g. SSM Parameter Store

In my case, the data I need to uniquely identify my VPC is its vpc_id or vpc_cidr_block, and for the subnets, I’ll need ipv4_cidr_block or subnet_id.

As a general rule, we want to avoid hard-coding values like the vpc_id or a stack_name, so copying values from the console into our code is not what we want. The CloudFormation stacks in my case did not have guessable names and I didn’t want to wrangle CLI calls in my CDK project, so I decided it was easiest to just create Parameter Store values. This let me store the data I needed under known key names that are the same in every account, so my code would not need to change when deploying the same stack to other accounts.

There was another reason for going this way. For some operations the CDK tries to infer the type of each subnet in your VPC (isolated, private or public), and in my case it did not derive these correctly: it could not tell the difference between isolated and private subnets. My solution was to create explicit Parameter Store lists of the subnets of each type.
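As a rough sketch of that idea, the helper below turns a comma-separated Parameter Store value into a clean list of subnet IDs. Only the /iso-subnets name appears later in this article; the other parameter names and both function names are hypothetical. The CDK import is deferred so the parsing helper can be tried without the CDK installed.

```python
from typing import Dict, List

# Hypothetical Parameter Store keys, one comma-separated list per subnet type.
# Only "/iso-subnets" appears in this article; the other names are assumptions.
SUBNET_PARAMETERS: Dict[str, str] = {
    "isolated": "/iso-subnets",
    "private": "/private-subnets",
    "public": "/public-subnets",
}


def parse_subnet_ids(raw: str) -> List[str]:
    """Split a comma-separated value into subnet IDs, dropping blanks and stray spaces."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def read_subnet_ids(scope, subnet_type: str) -> List[str]:
    """Fetch and parse the subnet ID list for one subnet type at synth time."""
    # Imported here so parse_subnet_ids can be used without the CDK installed.
    from aws_cdk import aws_ssm as ssm

    raw = ssm.StringParameter.value_from_lookup(scope, SUBNET_PARAMETERS[subnet_type])
    return parse_subnet_ids(raw)
```

Inside a stack, read_subnet_ids(self, "isolated") would then return the isolated subnet IDs for whichever account the stack is being synthesized against.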

Instantiating the VPC object

Now that I have my vpc_id and subnet_id lists available, I can start constructing my VPC and Subnet objects.

To populate the Vpc object there are two methods:

  • Vpc.from_lookup(scope, id, options)
  • Vpc.from_vpc_attributes(scope, id, attrs)

Both of these will import an existing VPC into a Vpc object, but in materially different ways. Vpc.from_vpc_attributes simply trusts the attributes you hand it, and if those are unresolved tokens the resulting Vpc object may not do what you expect when you use it elsewhere in your CDK project, e.g. to create an EC2 instance. See the notes in the documentation.

Vpc.from_lookup, on the other hand, queries your account when the app is synthesized and caches the result in cdk.context.json, so your Vpc object holds concrete values and is usable as a synth-time argument for other CDK objects.

from_lookup only accepts concrete values, not tokens. If the values you pass to it come from Parameter Store, fetch them with ssm.StringParameter.value_from_lookup() so that they are resolved at synth time too.
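A minimal sketch of that lookup, assuming CDK v2 import paths and the /vpc-id parameter name used later in this article. Both lookups need the stack to have an explicit account and region, so the sketch reads them from the CDK_DEFAULT_ACCOUNT and CDK_DEFAULT_REGION variables that the CDK CLI sets. The function names lookup_env and build_network_stack are made up for illustration.

```python
import os
from typing import Dict


def lookup_env() -> Dict[str, str]:
    """Account and region for context lookups, read from variables the CDK CLI sets."""
    return {
        "account": os.environ["CDK_DEFAULT_ACCOUNT"],
        "region": os.environ["CDK_DEFAULT_REGION"],
    }


def build_network_stack(app, stack_id: str = "NetworkLookupStack"):
    """Create a stack and import the existing VPC into it by looked-up ID."""
    # Imported here so lookup_env can be used without the CDK installed.
    from aws_cdk import Environment, Stack
    from aws_cdk import aws_ec2 as ec2, aws_ssm as ssm

    # Lookups fail on environment-agnostic stacks, so pin account and region.
    stack = Stack(app, stack_id, env=Environment(**lookup_env()))

    # value_from_lookup hands back a plain string at synth time, not a token.
    vpc_id = ssm.StringParameter.value_from_lookup(stack, "/vpc-id")
    vpc = ec2.Vpc.from_lookup(stack, "vpc", vpc_id=vpc_id)
    return stack, vpc
```

In app.py this could be called as build_network_stack(cdk.App()) before app.synth(). In a real project you would more likely subclass Stack and do the lookups in its __init__, as the gist further down does.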

Instantiating the Subnets

Similar to the Vpc class, the CDK Subnet class has two methods for creating Subnet objects from metadata about your existing resources.

  • Subnet.from_subnet_id(scope, id, subnet_id)
  • Subnet.from_subnet_attributes(scope, id, attrs)

I bet you’re thinking “I know what happens next: we’ll use the simpler from_subnet_id method to create the subnets”. But no! Here we hit another wrinkle…

The Subnet object has a property availability_zone. When you create a Subnet using only a subnet_id, it does not get an availability_zone set. Some things you might want to do with a Subnet have no issue with that, but others, like ec2.Instance, absolutely do and will complain if you pass them a Subnet without an availability_zone. You will get a message saying:

You cannot reference a Subnet’s availability zone if it was not supplied. Add the availabilityZone when importing using Subnet.fromSubnetAttributes()

To remedy this, you need to explicitly set the availability_zone on each Subnet when you create it, which you can only do if:

a) you know the AZ for each Subnet, and
b) you create the Subnet using Subnet.from_subnet_attributes, which lets you set the availability_zone property

Fortunately, rather than having to create yet another Parameter Store value to hold the AZ, we can get this information from the Vpc object. To do this we call vpc.select_subnets() with the availability_zones we want to find subnets for (a SubnetSelection), then match the results by subnet_id.

Putting it all together

Here’s an abbreviated gist of the whole process as described above:

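# imports needed by this snippet (the rest runs inside your Stack's __init__)
from aws_cdk import aws_ec2 as ec2, aws_ssm as ssm

# get the vpc-id from parameter store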
vpc_id = ssm.StringParameter.value_from_lookup(self, "/vpc-id")

# get the Vpc from the id
vpc = ec2.Vpc.from_lookup(self, "vpc", vpc_id=vpc_id)

# get the subnets in AZ a from the vpc
subnets_in_az_a = vpc.select_subnets(availability_zones=["ap-southeast-2a"])

# create AZ lookup
az_lookup = {}

# iterate over the AZ a subnets
for subnet in subnets_in_az_a.subnets:
    az_lookup[subnet.subnet_id] = subnet.availability_zone

# iterate over the public subnets
for subnet in vpc.public_subnets:
    az_lookup[subnet.subnet_id] = subnet.availability_zone

# fetch the isolated subnets list from parameter store
iso_subnet_ids = ssm.StringParameter.value_from_lookup(self, "/iso-subnets").split(",")

# create a list to store the subnets in
iso_subnets = []

# iterate over the subnet ids and create the full Subnet object including AZ
for iso_count, subnet_id in enumerate(iso_subnet_ids, start=1):
    iso_subnets.append(
        ec2.Subnet.from_subnet_attributes(
            self,
            "IsoSub" + str(iso_count),
            subnet_id=subnet_id,
            availability_zone=az_lookup[subnet_id],
        )
    )

NB You will need to duplicate bits of the above code to include the other AZs in your region and the other subnet types you need.
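One possible way to cut down on that duplication is sketched below. It assumes the looked-up Vpc object lists every subnet you care about under public_subnets, private_subnets or isolated_subnets, so the AZ lookup can be built in one pass instead of per AZ. The helper names build_az_lookup and import_subnets are hypothetical.

```python
from typing import Dict, Iterable, List


def build_az_lookup(vpc) -> Dict[str, str]:
    """Map subnet_id -> availability_zone for every subnet the Vpc object knows about."""
    groups = (vpc.public_subnets, vpc.private_subnets, vpc.isolated_subnets)
    return {
        subnet.subnet_id: subnet.availability_zone
        for group in groups
        for subnet in group
    }


def import_subnets(scope, id_prefix: str, subnet_ids: Iterable[str], az_lookup: Dict[str, str]) -> List:
    """Import each subnet by ID, attaching the AZ found in az_lookup."""
    # Imported here so build_az_lookup can be used without the CDK installed.
    from aws_cdk import aws_ec2 as ec2

    return [
        ec2.Subnet.from_subnet_attributes(
            scope,
            f"{id_prefix}{count}",
            subnet_id=subnet_id,
            # Raises KeyError if the ID is not one of the VPC's known subnets.
            availability_zone=az_lookup[subnet_id],
        )
        for count, subnet_id in enumerate(subnet_ids, start=1)
    ]
```

With these two helpers, the isolated subnets from the gist could be built with import_subnets(self, "IsoSub", iso_subnet_ids, build_az_lookup(vpc)), and the same call would work for any other list of subnet IDs.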

Making it reusable

If you find this useful, you might want to do what I did and turn the above code into a small Python module that you can import into multiple CDK stacks. I did this by putting the code into a module file, then importing and calling it from my stack class like so:

from network_info import get_network_info
# then inside the class
network_info = get_network_info(self, env_tag)

Where env_tag is set to useful things like dev, staging or prod.

In the module, the function receives these as its arguments:

def get_network_info(calling_stack, env_tag):

and wherever the gist uses self, the module code uses calling_stack instead.
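A bare-bones outline of what such a module could look like follows. The NetworkInfo container and the way env_tag is folded into the parameter name are illustrative assumptions, not the layout of the original module.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class NetworkInfo:
    """Hypothetical container for what the module hands back to a stack."""
    vpc: Any
    isolated_subnets: List[Any] = field(default_factory=list)


def parameter_name(env_tag: str, key: str) -> str:
    """Build a Parameter Store name such as /dev/vpc-id (naming scheme is an assumption)."""
    return f"/{env_tag}/{key}"


def get_network_info(calling_stack, env_tag: str) -> NetworkInfo:
    """Look up the VPC for env_tag, using calling_stack wherever the gist uses self."""
    # Imported here so the helpers above can be used without the CDK installed.
    from aws_cdk import aws_ec2 as ec2, aws_ssm as ssm

    vpc_id = ssm.StringParameter.value_from_lookup(
        calling_stack, parameter_name(env_tag, "vpc-id")
    )
    vpc = ec2.Vpc.from_lookup(calling_stack, "vpc", vpc_id=vpc_id)
    # The subnet imports from the gist would go here, filling isolated_subnets.
    return NetworkInfo(vpc=vpc)
```

Returning a single object keeps the calling stack tidy: network_info.vpc and network_info.isolated_subnets can be passed straight to the constructs that need them.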

I hope you find this helpful!

Let me know with a clap or a comment.
