Automate Custom EC2 AMIsKarl CardenasBlockedUnblockFollowFollowingMar 22Photo by Daniel Cheung on UnsplashIf you work for an organization/company that leverages the services of a public cloud provider such as AWS, chances are there is a customized image available in your environment.
Most companies today offer some sort of customized default image or images that contain baked in security tools, proxy variables, repository URL overrides, SSL certificates and so on.
This customized image is usually sourced from common images provided by the public cloud provider.
Today, we’re going to look at how we can completely automate a customized image sourced from the AWS Linux2 AMI and deploy it to all accounts inside an organization, while maintaining a minimal infrastructure footprint.
Code can be found athttps://github.
com/karl-cardenas-coding/automate-ami-demoAssumptionsAccounts are under an AWS Organizations.
All accounts require the customized AMI.
VPC ACLs and Security Groups allow Port 22 into to the VPC (Packer)CI/CD has proper credentials to query AWS Services (Organizations, VPC, EC2).
Gitlab and Gitlab Runner available.
Tools UtilizedTerraformPackerAWS SNSAWS LambdaAWS CLIGitlabGitlab CIDockerArchitectureA simplistic view of the automationIt all starts with the SNS topic provided by AWS for the Linux 2 AMI.
This SNS topic is the anchor to the automation that occurs every time Amazon updates the AMI.
The subscription to this SNS topic invokes a Lambda function, the function makes a REST call to a Gitlab repository that is configured with a Gitlab Runner.
The REST call is what triggers the CI/CD pipeline.
The pipeline delivers a newly minted AMI that is sourced from the Linux 2 AMI but includes baked in tools and configurations per our discretion.
The customized AMI is made available to all the AWS accounts.
I will break down the automation into three sections; pre-pipeline, terraform, packer.
Pre-pipelineSNSAWS provides customers with SNS topics for both of its managed AMIs (Linux & Linux 2).
The automation starts by subscribing to the following ARN(see below) and assigning it a lambda function as an endpoint.
When the AMI is updated, AWS publishes a message to the SNS topic and because the endpoint is a Lambda function, the function assigned will be triggered by the SNS publication.
arn:aws:sns:us-east-1:137112412989:amazon-linux-2-ami-updatesSide note: For other SNS topics of interest visit this Github repoSubscribe to the Linux2 SNS Topic ARNLambdaThe python code (see below) is what powers the Lambda function.
Its true purpose is to start a Gitlab CI runner.
The Gitlab pipeline is started by using a Gitlab trigger deploy token.
The deploy token can be created by going to a Gitlab repository’s settings; Settings -> CI/CD -> Pipeline triggers.
Further documentation can be found here.
The deploy token is added as an encrypted environment variable which is later decrypted during the lambda invocation.
The Gitlab repository's ID is also added as an environment variable.
The project ID can be found under a project’s settings; Settings -> General-> General Project.
Lambda code (python3.
7)Gitlab Token is encryptedNote: Ensure the Lambda role has proper permissions for KMS related actions.
IMPORTANT: If the customized AMI needs to be updated at a more frequent cadence, then a CloudWatch event rule can be attached to the Lambda so that the automation may be done at a more frequent cadence.
Gitlab CI YML.
gitlab-ci.
yml is the file that controls the Gitlab CI Runner.
This is where the pipeline is defined; images, stages, steps, scripts and so forth.
The configuration file for the pipelineThe yml file is pretty straightforward.
It involves two stages.
The first stage is for Terraform to create the JSON file that Packer will read.
The second stage is where Packer is executed.
The AWS credentials are also being set by leveraging Gitlab Secrets in the global before_script.
Because this is all occurring at Gitlab.
com the shared runners available at Gitlab.
com are being utilized tags: [gitlab-org] .
DockerThe pipeline is using a Docker image created from the Docker file below, sourced from Hashicorp’s Terraform image.
The image also includes python3, bash, the AWS CLI, and Packer.
The image is created using Docker and then uploaded to the Gitlab repository’s registry so that it may be used by the CI runner.
Instructions on how to complete this action can be found here.
Dockerfile for the pipelineTerraformTerraform is what is generating the JSON file that Packer will read and use to create the AMI per our directions.
Let’s breakdown the Terraform code.
The first thing done is to create a template file that will hold a list of all the accounts in the organization data “template_file".
The template file that contains all of the account IDs is then read in using data "local_file" , however this resource is dependent on null_resource.
getOrganizationAccts and the reason for that is because we only want to read in the file after it has been populated.
The data "template_file" is populated by a null_resource that is leveraging the local-exec provisioner.
This allows us to issue commands or to execute scripts.
In this instance, a bash script is being called.
Generating files and invoking the AWS CLIThe aws-cli.
sh script contains the following:echo -n $(aws organizations list-accounts –profile $PROFILE –output json –query Accounts[*].
Id | sed 's/[][]//g') >> $FILEThe script is simply issuing an aws cli command that queries the organization for all accounts.
The output is the stripped of brackets[] and redirect into the data "template_file" .
For a better understanding of how to use the AWS CLI with Terraform check out the blog article Invoking the AWS CLI with TerraformThe reason for why a script file was used rather than passing in the command is due to "" .
The hardest part of this automation is getting the double quotes right in order to have a working JSON file that Packer can interpret.
A lot of trial and error led down to the path of a script file rather than combating Terraform and escaping ticks inside the command attribute.
Also, echo -n simplifies this challenge by removing trailing white space, including
which the terraform function trimspace is unable to do.
The real challenge is handling JSON attributes that are expected in array syntax [".
", ".
"] , and ensuring the content is passed in as a string without double quotes.
More string manipulation will occur as the content is being passed into other downstream resources.
In order to create a JSON file that contains all of our required Packer configurations, a data template_file is once again being used.
The neat thing about this trick is that now dynamic JSON content can be generated by using Terraform variables (see below).
The other benefit to this is that Packer can share an AMI to all the specified accounts at once rather than manually adding permissions to an AMI for each account, one at a time.
The ami-template.
json is the template file that Terraform will use to create the JSON file we need for Packer.
All the required parameters are using template_file variables.
The Packer JSON template that will be used to generate the actual Packer fileIf you take a closer look a the code below you can see that the accounts list is being passed in by referencing data.
local_file.
getOrganizationsAccts.
content , you can also see some more string manipulation being done.
The first " and the last " is being removed from the string as Terraform will automatically add double ticks to our string.
If this step is not done then our string would look like [""15485.
""] and Packer would break at run time ???? .
Generating the JSON file for PackerA script file is being passed into the data "template_file .
Use this script to customize the AMI as needed (installing programs, environment configurations, and so forth).
This is the step where the AMI becomes a customized per your organization’s standards.
In the demo code this script file is named amazon.
sh .
#!/bin/bashsudo yum -y updateecho "This is where you would add your customization!"In the code base there is a module named ami-latest.
The module pulling down the latest version of the AWS Linux2 AMI and output the AMI id.
The AMI id is what Packer will use to source the newly created AMI from.
data "aws_ami" "ami-attrs" { most_recent = true owners = ["amazon"]filter { name = "architecture" values = ["x86_64"] }filter { name = "image-type" values = ["machine"] }filter { name = "is-public" values = ["true"] }filter { name = "state" values = ["available"] }name_regex = "^amzn2-ami-hvm-2.
0*"}Note: This could alternatively be done in Packer as well using source_ami_filter: [].
PackerThe second stage in the pipeline is where Packer is being invoked to build out an AMI.
Packer is using the generated file ami.
json from the previous pipeline step to create the customized AMI.
Packer operates in the following order:Confirm specified source AMI is availableVerify no other AMI with same name existsSpinning up an EC2 instance on our behalfGenerate a temporary key pairSSH into the EC2 instanceExecute specified scripts/user_dataStopping the EC2 instanceRegister the AMITerminate the EC2 instanceDelete temporary key pairPacker is creating an EC2 instance and customizing the image per directions.
ConclusionThe automation architecture has a minimal footprint and is serverless for the majority of the time.
The Linux2 AMI was used in the demo code but this can be adjusted to use other AMIs.
Remember, if this needs to occur at more frequent cadence to attach a CloudWatch rule to the Lambda function.
Feel free to fork the code and make adjustments as needed.
Just remember, string manipulation is the trickiest part.
Hopefully this solution can help you and your team from no longer manually creating AMIs and sharing them with other AWS accounts.
.. More details