An introduction to Infrastructure as Code with Terraform on AWS - Part 1

Discover Terraform through guided examples on AWS

Spz
BeTomorrow

--

As emphasised in a recent article (in French) on this blog, Infrastructure as Code is crucial to agile software development in the cloud. Products grow and evolve, and so do the software behind them and the infrastructure enabling them.

Several tools can help manage infrastructure; here we will focus on one of the most popular, Terraform, and will work with AWS as our cloud provider.

This article is mainly aimed at developers, ops or managers wanting to discover infrastructure as code and/or explore Terraform. You should be able to follow along even without any prior experience in AWS or another cloud provider.

You can follow along and write everything yourself, and a repository is available with help for AWS setup and code as reference.

Introducing Terraform

Describing infrastructure

Infrastructure as code, as the name suggests, is essentially expressing infrastructure in a programming language, as opposed to having infrastructure result from hard-to-reproduce, difficult-to-document manual interactions with a UI. You could simply script which cloud provider APIs to call, but most tools adopt a declarative approach: you express what you would like the infrastructure to be, and the tool handles how to get there. Terraform lets you describe resources in a specific language called HCL (short for HashiCorp Configuration Language, named after the company behind Terraform). It can then use this description to create and manage actual resources in your cloud infrastructure.

Terraform supports many providers, AWS being only one of them, and a Terraform project can have multiple providers. However, it does not really abstract each provider, and this will be obvious with our AWS examples. Providers map provider-specific features to Terraform resources expressed in the same HCL language, but they do not offer some sort of universal virtual machine resource that could be created on any cloud provider out there by just changing the provider name.

So it’s not a magic layer abstracting away your dependency on AWS, GCP or Azure, but it does bring a unified language to configure different providers.

Managing resources

Terraform is an infrastructure configuration manager: its job is to bring a set of infrastructure resources to a given state, configured as you described it in a configuration. This configuration consists of files written in a specific language, where you detail those resources and their properties: say, for example, a VM instance with a specified OS image, an instance size and a name.

Terraform operates in a two-step process. First it establishes a plan, based on what it knows about the current state of the resources and what is described in the configuration files. The plan is the list of actions it will take to reach the described state, and you will be able to review it. The second step is the apply step, where actions from the plan are actually performed through the different provider APIs.

Once changes have been applied, Terraform stores a state object, which contains the identities and properties of all those resources, keeping track of the resources it manages to help produce the next plan.

This two-step approach, plan and apply, can be quite handy. You can review the plan, check it on partial iterations, or store it for a later apply, all without touching your infrastructure.

Let’s see how this workflow translates with our first Terraform project.

Preparation

Let’s start by installing Terraform. It comes as a self-contained zip bundle from the official website, or can usually be found in your favorite package manager. On macOS, with Homebrew:

$ brew install terraform
$ terraform version

You will also need a text editor or IDE of your choice; most have Terraform support through plugins. IntelliJ for example has a really nice one, and so does Visual Studio Code.

The last step to follow along is to configure an AWS environment to work in: you will need an IAM user with essentially admin permissions. For a first try, you may want to use a new, separate AWS account. Also note that Terraform commands will actually create resources, which will incur charges (though they should be really small if you don’t leave them running for long).

You can find a short guide to set up an AWS user and configure your local environment in the repository.

Hello, World!

As our first exploration of Terraform, we will set up a simple web server on AWS, so that we can point our browser at it and see the default welcome page.

We will first simply create an EC2 instance. Terraform structures projects by directory: all *.tf files under the project directory will be part of our managed infrastructure description. So create a hello-world folder for this first project, and an HCL file named main.tf in it:
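A main.tf along these lines fits the description that follows (the instance type, the exact user_data script and the provider version pin are illustrative assumptions):

```hcl
provider "aws" {
  region  = "eu-west-1"
  version = "~> 2.16"
}

variable "key_name" {
  type    = string
  default = ""
}

resource "aws_instance" "hello_world" {
  ami           = "ami-08d658f84a6d84a80"
  instance_type = "t3.micro"
  key_name      = var.key_name

  # Install Nginx on first boot so we can browse to its welcome page
  user_data = <<-EOF
    #!/bin/bash
    apt-get update
    apt-get install -y nginx
    EOF

  tags = {
    Name = "hello-world"
  }
}

output "instance_dns" {
  value = aws_instance.hello_world.public_dns
}
```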

This first block declares our AWS provider, which Terraform will automatically download when we initialize the project, and specifies which region it operates in (keep this region, or if you are used to AWS, replace the AMI id with an equivalent in your region and keep your change in mind). The version is not mandatory, but pinning versions is a good habit if you want to avoid surprises when new releases are rolled out.

The second block declares a variable. Variables can have different types (string, map, list, boolean…), a default value, a description and can be provided values at run time in different fashions. They can then be used to set attributes of other elements. Here we simply use a default blank value for now.

The third block describes our only resource, an AWS EC2 instance with a bare-bones configuration. Resource types like aws_instance are provider specific. The resource name hello_world after the resource type is an identifier specific to Terraform; it has no meaning to the provider. The convention is usually to keep Terraform-specific symbols with lowercase and underscores, while using dashes in names, uppercase or otherwise, for the provider’s actual resources (like the tag on our instance). We mainly set the virtual machine image (AMI) with which to run the instance and its type. The key_name attribute specifies which key pair to set up the instance with so we can connect via SSH, and uses HCL’s expression syntax to reference our variable defined above. The user_data attribute here is set using a heredoc style, and simply asks for the Nginx package to be installed on first boot so we can browse to its welcome page. The tags block holds a Name tag, which AWS uses for many kinds of resources.

Lastly, the output block declares result attributes for our stack, which get displayed after an apply operation. It will be useful to grab the public domain name of our instance. It references the corresponding attribute of our instance resource.

We can now initialize our project; Terraform will download external dependencies like the provider we declared.

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "aws" (terraform-providers/aws) 2.16.0...

Terraform has been successfully initialized! You may now begin working with Terraform.
...

As it reminds you, you will need to perform init again whenever an external dependency changes, like with a new or updated provider, or updates to other constructs we will see later on like modules and backends.

Oh, we do have a plan

Now we are ready to run our first plan.

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be persisted to local or remote state storage.
------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_instance.hello_world will be created
  + resource "aws_instance" "hello_world" {
      + ami                         = "ami-08d658f84a6d84a80"
      + arn                         = (known after apply)
      + associate_public_ip_address = (known after apply)
...
Plan: 1 to add, 0 to change, 0 to destroy.
...

Terraform is relatively explicit about what it’s doing by default, so you should get a good idea of what is going on. A plan does not change the state; only an apply does (or other lower-level state-modifying commands). The resulting plan is displayed, with deltas showing what will change.

Going for it

To actually create our instance, run an apply. It will start with a plan again, then prompt you to proceed with the changes. Type in “yes” to go on. It should complete in a few seconds.

$ terraform apply
...
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
  Enter a value: yes

aws_instance.hello_world: Creating...
aws_instance.hello_world: Still creating... [10s elapsed]
aws_instance.hello_world: Creation complete after 12s [id=i-xxxxxxxxx]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Outputs:

instance_dns = ec2-xxx-xxx-xxx-xxx.eu-west-1.compute.amazonaws.com

You should now have an EC2 instance starting up in AWS; you can check it out in the web console and spot our instance named “hello-world”. It can take a couple of minutes for it to reach a running state.

If you run a new plan, it will tell you that no changes are required.

Now if we point a browser to our instance domain, which you can get at the end of the apply output… it will time out. If you know AWS a bit, you were probably expecting this: we created an instance with no Security Group setting, so it uses the default one, which does not let HTTP connections through.

Updating our infrastructure

Let’s fix our configuration by creating a Security Group for our instance with the right port open. Update main.tf and add an aws_security_group resource:
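A sketch of such a Security Group (the resource name string and descriptions are illustrative assumptions):

```hcl
resource "aws_security_group" "hello_world" {
  name = "hello-world"

  # Allow SSH from anywhere
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow HTTP from anywhere
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```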

We set the Security Group to accept SSH (22) and HTTP (80) inbound connections from anywhere and let outbound connections go anywhere, then feed its id into the vpc_security_group_ids list attribute of the instance resource.

Run another terraform apply and review the plan.

$ terraform apply
aws_instance.hello_world: Refreshing state... (id=xxx)
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
~ update in-place
Terraform will perform the following actions:

  # aws_instance.hello_world will be updated in-place
  ~ resource "aws_instance" "hello_world" {
...
~ vpc_security_group_ids = [] -> (known after apply)
...
# aws_security_group.hello_world will be created
+ resource "aws_security_group" "hello_world" {
+ arn = (known after apply)
...
Plan: 1 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value:

This time our plan comprises a create operation and an update in-place operation. Enter “yes” again to apply the plan. What update actually means depends on the provider, the resource and the change itself. In this case, adding a security group to an instance can be done in place without stopping the instance. But if, for example, you were to change the instance type to t2.micro, you would see the instance in the web console go through a stop and restart: it remains the same resource, but the change involves downtime.

Once the apply completes, you should be able to copy the instance_dns from the output and paste it into your browser. This time you should get the default Nginx welcome page. Congrats!

Cleaning up

We have created and updated a tiny stack; let’s see how to clean it up before we explore our configuration options a bit more. Like a plan for an update, you can see what Terraform would do with the destroy option:

$ terraform plan --destroy
aws_security_group.hello_world: Refreshing state... (id=sg-xxxxxxxx)
aws_instance.hello_world: Refreshing state... (id=i-xxxxxxxx)
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:

  # aws_instance.hello_world will be destroyed
...
# aws_security_group.hello_world will be destroyed
...
Plan: 0 to add, 0 to change, 2 to destroy.
...

Let’s issue a destroy command to actually clean it up.

$ terraform destroy

Like an apply, Terraform will first show the destroy plan, then prompt for confirmation before proceeding. Simply type in “yes” to go on. Once done, you can check in the web console that the instance switches to a terminated state.

Building upon it

We will build upon our configuration to explore tools Terraform provides and make our description more flexible. You can plan and apply along the way.

File organisation in the project folder

Terraform structures configurations in a flat directory: it looks for all *.tf files directly inside the directory, and will not look in subdirectories. It is customary to group variables in a variables.tf file, and outputs in an outputs.tf file. You can extract the variable and the output into their own files to follow this pattern.

You will also want to add some patterns to your .gitignore to avoid committing local work files and the state or other secrets to your repository.
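A minimal .gitignore for a Terraform project commonly includes patterns like these (a common convention, adjust to your needs):

```text
.terraform/
*.tfstate
*.tfstate.backup
*.tfvars
crash.log
```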

Overriding variables

Variables are inputs to your configuration, and can be overridden in different ways. If a variable has no default value and no specified value, Terraform will prompt you for one. You can set variables in a local file; terraform.tfvars is a default one and will be picked up if it exists. As an example, if you have a key pair set up and named hello, you can set it in terraform.tfvars this way:

key_name = "hello"

You can also provide variable values at run time through parameters (e.g. “-var foo=bar”), files having the same format as above (e.g. “-var-file=foo.tfvars”) or environment variables (e.g. “TF_VAR_foo=bar terraform apply”).

Expressions and dependencies

Expressions are the links between the different elements in your configuration. Going back to the main.tf file, our instance resource now looks like this:
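Here is a sketch of the updated resource (attributes other than the new one are illustrative):

```hcl
resource "aws_instance" "hello_world" {
  ami           = "ami-08d658f84a6d84a80"
  instance_type = "t3.micro"
  key_name      = var.key_name

  # The new attribute, feeding the Security Group id to the instance
  vpc_security_group_ids = [aws_security_group.hello_world.id]

  # user_data and tags unchanged
}
```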

We added a vpc_security_group_ids attribute expecting a list of SG ids. The expression syntax is resolved by Terraform, with the “aws_security_group.hello_world.id” reference pointing to the “id” attribute of the new “aws_security_group” resource we named “hello_world”. This reference in an expression defines a dependency between those two resources, as the security group id is not known before the SG gets created on AWS. Terraform handles these dependencies automatically: when planning, it builds the whole dependency graph to order its operations, and it processes resources in parallel whenever it can.

Expressions support references to variables, resources and other elements like data sources, but also handle some math and provide built-in functions. We can for example extract the user_data attribute value to a templates/user_data.tpl file and insert it like so:
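Using Terraform’s built-in file function, the attribute could become:

```hcl
  # Read the startup script from a file in the module directory
  user_data = file("${path.module}/templates/user_data.tpl")
```

If the script needed variables interpolated into it, the templatefile function would be the alternative.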

References are prefixed depending on the element type (except resources, which have no prefix), then follow the element hierarchy. So resource attributes are referenced as “type.name.attribute”. Variables are simply referenced by the var prefix then their name, as we have seen with “var.key_name”.

Flexibility with data sources

Let’s introduce some flexibility and replace the hard-coded AMI id in the instance with something dynamic. To do so we will introduce data sources, which are essentially inputs or views of external objects, instead of descriptions of what we want to manage like resources are. Similar to resources, they are provider specific and their attributes vary with their type.

Add the following block above the instance resource:
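A data source along these lines would match the hard-coded Ubuntu AMI (the Terraform name “ubuntu” and the exact filter values are illustrative assumptions):

```hcl
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account id

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```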

This data source of type aws_ami provides information about an AMI filtered on several properties; in this case we essentially select the equivalent of the hard-coded AMI id. We can then use it to replace the previous value, using the same dotted notation to reference it (note the “data.” prefix):
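Assuming the data source was given the Terraform name “ubuntu”, the instance’s ami attribute becomes:

```hcl
  # Resolved at plan time from the data source, instead of a hard-coded id
  ami = data.aws_ami.ubuntu.id
```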

If you run a plan, you will see the data source being refreshed first, and the creation planned for our instance will have a resolved AMI id. One bonus of using a data source here is that AMI ids are region dependent on AWS: our data source will find ids in the region our provider is configured in, so there is no id to replace when changing region, and no obscure comment trying to describe what that hard-coded id refers to.

Reminder

Do not forget to clean up resources if you applied the changes above.

A word about state

We mentioned Terraform’s state a bit, and the natural questions are how it is maintained and where it is stored. The state storage mechanism is configurable, the default being to store it locally in a terraform.tfstate file, accompanied by a backup in terraform.tfstate.backup.

First, obviously, do not delete those files, or Terraform would lose track of what it is managing, and its plan would be to create everything again. This would most probably create duplicate resources, and may also fail, as existing resources on your cloud provider might have unique properties that prevent creating duplicates (as if you tried to create a second S3 bucket with the same name).

A second important point to keep in mind is that the Terraform state is likely to contain sensitive content, so do not commit it either. Essentially all resource attributes are kept in the state, so you may quickly end up with secrets in it.

The local nature of the default state storage is an important aspect: because of it, this sample project could not really be worked on by several people, or by additional automated systems. We will look at a way to handle this in the next article.

Next steps

That’s it for our first contact with Terraform. You should have a better idea of the overall workflow and the basics of a Terraform configuration. Providers are pretty rich, and going over all the resources they offer can take a while and is essentially more about the provider itself. In an upcoming article we will look at other essentials, like modules that enable reusing code across stacks, workspaces which can help set up different environments for your stack configuration, and, as we have alluded to, a way to handle a distributed state when using AWS, so as to enable collaborators or automated systems.

If you have questions, feedback, or other Terraform or infrastructure as code matters you’d like to talk about, feel free to leave a comment or contact us at BeTomorrow.

BeTomorrow is a consulting, design and software agency. We provide a tech accelerator program, combining the agility of startups with robust digital delivery. Click here to have a look at what we do!
