Declarative infrastructure. Immutable deployments.
Plan, apply, destroy.
Essential Terraform commands at a glance. The core workflow and every tool you need for day-to-day infrastructure management.
terraform init
Download providers, initialize backend, install modules. Run first in any new configuration directory.
terraform init -upgrade
Re-initialize and upgrade all providers and modules to the latest allowed versions.
terraform plan
Preview changes Terraform will make. Shows additions, modifications, and destructions without applying.
terraform plan -out=tfplan
Save the plan to a binary file. Guarantees that apply executes exactly what was reviewed.
terraform apply
Execute the planned changes. Prompts for confirmation unless -auto-approve is passed.
terraform apply tfplan
Apply a previously saved plan file. No confirmation prompt needed since the plan was already reviewed.
terraform apply -target=aws_instance.web
Apply changes to a specific resource only. Useful for debugging but not recommended for production.
terraform destroy
Tear down all managed infrastructure. Prompts for confirmation. Use -target for selective destruction.
terraform fmt
Rewrite configuration files to canonical HCL style. Add -recursive for all subdirectories.
terraform fmt -check
Check if files are formatted without modifying them. Returns non-zero exit code if changes needed. Ideal for CI.
terraform validate
Check configuration for syntax errors and internal consistency. Does not access remote state or providers.
terraform state list
Show all resources tracked in the current state file. Filter with address patterns.
terraform state show aws_instance.web
Display detailed attributes of a single resource in state. Includes all computed values.
terraform state mv old_name new_name
Rename a resource in state without destroying and recreating. Works across modules too.
terraform state rm aws_instance.web
Remove a resource from state without destroying the real infrastructure. Use to stop managing a resource.
terraform import aws_instance.web i-abc123
Bring existing infrastructure under Terraform management. You must write the corresponding resource block first.
terraform plan -generate-config-out=gen.tf
Generate HCL configuration for import blocks. Requires Terraform 1.5+ with import {} blocks defined.
terraform output
Display all output values. Add -json for machine-readable format or -raw for unquoted strings.
terraform output db_endpoint
Retrieve a single output value. Combine with -raw for use in shell scripts.
terraform workspace list
Show all workspaces. Current workspace is marked with an asterisk.
terraform workspace new staging
Create and switch to a new workspace. Each workspace has its own state file.
terraform workspace select production
Switch to an existing workspace. Subsequent commands operate against that workspace's state.
terraform workspace show
Print the name of the current workspace. Useful in scripts and CI pipelines.
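Inside configuration, the current workspace name is available as the built-in terraform.workspace value. A sketch of per-workspace sizing (resource and variable names are illustrative):

```hcl
locals {
  # Pick an instance size per workspace; unknown workspaces get the smallest
  instance_type = lookup({
    production = "t3.large"
    staging    = "t3.small"
  }, terraform.workspace, "t3.micro")
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = local.instance_type

  tags = {
    Workspace = terraform.workspace
  }
}
```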
Since Terraform 0.15, terraform taint is superseded by terraform apply -replace=RESOURCE_ADDRESS, and terraform refresh is superseded by terraform apply -refresh-only. Both legacy commands still work but are deprecated.
Getting Terraform installed, configuring providers, and setting up remote backends for team collaboration.
Terraform ships as a single binary. Download it for your platform or use a package manager.
| Platform | Method | Command |
|---|---|---|
| macOS | Homebrew | brew install hashicorp/tap/terraform |
| Ubuntu / Debian | APT | sudo apt-get install terraform |
| Windows | Chocolatey | choco install terraform |
| Any | Manual | Download from releases.hashicorp.com, unzip, add to PATH |
| Any | tfenv | tfenv install 1.9.0 && tfenv use 1.9.0 |
For the APT method, you must first add the HashiCorp GPG key and repository:
# Add HashiCorp GPG key
wget -O- https://apt.releases.hashicorp.com/gpg | \
sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
# Add the repository
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] \
https://apt.releases.hashicorp.com $(lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/hashicorp.list
# Update and install
sudo apt-get update && sudo apt-get install terraform
# Verify
terraform version
Use tfenv to manage multiple Terraform versions. Add a .terraform-version file to your project root to pin the version per project: echo "1.9.0" > .terraform-version
Every Terraform project declares the providers it requires in a required_providers block. This tells Terraform where to download the provider plugin and which versions are acceptable.
# terraform.tf or versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
random = {
source = "hashicorp/random"
version = "~> 3.6"
}
}
}
# Provider configuration with region and default tags
provider "aws" {
region = "us-east-1"
default_tags {
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
}
~> 5.0 means any version >= 5.0 and < 6.0 (pessimistic constraint). Use = 5.31.0 for an exact pin, or >= 5.0, < 5.50 for a custom range. The ~> operator is the most common choice for production.
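The common constraint operators, shown side by side in a single required_providers block (the provider versions here are illustrative):

```hcl
terraform {
  required_providers {
    # Pessimistic: >= 5.0.0, < 6.0.0 -- allows minor and patch upgrades
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }

    # Pessimistic at patch level: >= 3.6.0, < 3.7.0
    random = {
      source  = "hashicorp/random"
      version = "~> 3.6.0"
    }

    # Exact pin -- no upgrades until the constraint is edited
    null = {
      source  = "hashicorp/null"
      version = "= 3.2.2"
    }

    # Explicit range
    tls = {
      source  = "hashicorp/tls"
      version = ">= 4.0, < 4.1"
    }
  }
}
```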
Remote backends store state outside your local filesystem, enabling team collaboration and state locking. The most common setups:
# AWS S3 backend with DynamoDB locking
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "prod/network/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}
# Azure Blob Storage backend
terraform {
backend "azurerm" {
resource_group_name = "tfstate-rg"
storage_account_name = "tfstateaccount"
container_name = "tfstate"
key = "prod.terraform.tfstate"
}
}
# Google Cloud Storage backend
terraform {
backend "gcs" {
bucket = "my-terraform-state"
prefix = "terraform/state"
}
}
Never store state files in version control. They may contain sensitive values like passwords and API keys in plaintext. Always use a remote backend with encryption enabled for anything beyond local experimentation.
The terraform init command is the first command you run in any Terraform configuration directory. It performs three main tasks:
1. Backend initialization: configures the backend for state storage, creates the .terraform/ directory, and prompts if migrating between backends.
2. Provider installation: downloads provider plugins matching version constraints into .terraform/providers/.
3. Module installation: downloads referenced modules from registries or Git into .terraform/modules/.
After terraform init, Terraform creates a .terraform.lock.hcl file that records the exact provider versions and checksums selected. Commit this file to version control.
# .terraform.lock.hcl (auto-generated, commit to VCS)
provider "registry.terraform.io/hashicorp/aws" {
version = "5.31.0"
constraints = "~> 5.0"
hashes = [
"h1:abc123...",
"zh:def456...",
]
}
# .gitignore for Terraform projects
.terraform/
*.tfstate
*.tfstate.*
*.tfplan
crash.log
override.tf
override.tf.json
*_override.tf
*_override.tf.json
Run terraform init -upgrade to update the lock file to the latest allowed versions. Always review the diff in the lock file before committing — it shows exactly which versions changed.
HashiCorp Configuration Language — the declarative syntax at the heart of every Terraform configuration. Types, variables, locals, outputs, functions, and expressions.
HCL uses a block-based syntax. Every configuration element is either an argument (key-value assignment) or a block (a labeled container for other arguments and blocks).
# Arguments: key = value
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
count = 3
enabled = true
# Blocks: type "label" "label" { ... }
resource "aws_instance" "web" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
tags = {
Name = "web-server"
}
}
# Comments
# Single-line comment (hash)
// Single-line comment (double-slash)
/* Multi-line
comment block */
Terraform supports primitive types and complex collection types. Every variable should declare its type for validation and documentation.
| Type | Description | Example |
|---|---|---|
| string | Unicode text | "hello" |
| number | Numeric value (int or float) | 42, 3.14 |
| bool | Boolean | true, false |
| list(type) | Ordered sequence | ["a", "b", "c"] |
| set(type) | Unordered unique values | toset(["a", "b"]) |
| map(type) | Key-value pairs (string keys) | { key = "val" } |
| object({...}) | Structured type with named attributes | { name = string, port = number } |
| tuple([...]) | Fixed-length sequence with per-element types | [string, number, bool] |
Input variables parameterize your configuration. Declare them with type constraints, defaults, descriptions, and validation rules.
# Basic variable with default
variable "region" {
description = "AWS region for all resources"
type = string
default = "us-east-1"
}
# Required variable (no default)
variable "environment" {
description = "Deployment environment name"
type = string
}
# Complex type with validation
variable "instance_config" {
description = "EC2 instance configuration"
type = object({
instance_type = string
ami_id = string
volume_size = number
public = bool
})
default = {
instance_type = "t3.micro"
ami_id = "ami-0c55b159cbfafe1f0"
volume_size = 20
public = false
}
validation {
condition = contains(["t3.micro", "t3.small", "t3.medium"], var.instance_config.instance_type)
error_message = "Instance type must be t3.micro, t3.small, or t3.medium."
}
}
# List variable
variable "availability_zones" {
description = "AZs to deploy across"
type = list(string)
default = ["us-east-1a", "us-east-1b", "us-east-1c"]
}
# Map variable
variable "instance_sizes" {
description = "Instance type per environment"
type = map(string)
default = {
dev = "t3.micro"
staging = "t3.small"
prod = "t3.medium"
}
}
# Sensitive variable
variable "db_password" {
description = "Database master password"
type = string
sensitive = true
}
Set variable values via files, environment variables, or command-line flags:
# terraform.tfvars (auto-loaded)
region = "us-west-2"
environment = "production"
# Named .tfvars file (loaded with -var-file)
# terraform apply -var-file="prod.tfvars"
# Environment variable (prefix with TF_VAR_)
# export TF_VAR_region="us-west-2"
# Command-line flag
# terraform apply -var="environment=staging"
Terraform applies variable values in this order, with later sources overriding earlier ones: default value → TF_VAR_* environment variables → terraform.tfvars → terraform.tfvars.json → *.auto.tfvars (alphabetical) → -var and -var-file flags (in command-line order). The last value set wins.
Local values are named expressions computed within the module. Use them to reduce repetition and clarify intent. Unlike variables, locals are not configurable by the caller.
locals {
# Computed name prefix
name_prefix = "${var.project}-${var.environment}"
# Merged tags
common_tags = {
Project = var.project
Environment = var.environment
ManagedBy = "terraform"
UpdatedAt = timestamp()
}
# Conditional logic
is_production = var.environment == "production"
instance_type = local.is_production ? "t3.large" : "t3.micro"
# Complex computation
subnet_cidrs = [
for i in range(3) : cidrsubnet(var.vpc_cidr, 8, i)
]
}
# Reference locals with local.name
resource "aws_instance" "web" {
instance_type = local.instance_type
tags = local.common_tags
}
Output values expose data from your module, making it available to parent modules, the CLI, and remote state data sources.
# Simple output
output "instance_id" {
description = "EC2 instance ID"
value = aws_instance.web.id
}
# Output with a complex value
output "instance_info" {
description = "Instance details"
value = {
id = aws_instance.web.id
public_ip = aws_instance.web.public_ip
private_ip = aws_instance.web.private_ip
az = aws_instance.web.availability_zone
}
}
# Sensitive output (hidden in CLI, still in state)
output "db_connection_string" {
description = "Database connection URL"
value = "postgresql://${var.db_user}:${var.db_password}@${aws_db_instance.main.endpoint}/app"
sensitive = true
}
# Output that depends on a condition
output "lb_dns" {
description = "Load balancer DNS name"
value = var.create_lb ? aws_lb.main[0].dns_name : null
}
HCL supports string interpolation with ${...} and multiline strings with heredoc syntax.
# String interpolation
name = "${var.project}-${var.environment}-instance"
# Directive interpolation (conditionals and loops in strings)
greeting = "Hello, %{ if var.name != "" }${var.name}%{ else }stranger%{ endif }!"
# Heredoc (indented)
user_data = <<-EOF
#!/bin/bash
apt-get update
apt-get install -y nginx
systemctl start nginx
echo "Hello from ${var.environment}" > /var/www/html/index.html
EOF
# Heredocs always interpolate; escape a literal ${ as $${ (and %{ as %%{)
policy = <<-EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:GetObject",
"Resource": "*"
}
]
}
EOF
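Terraform heredocs always process interpolation; there is no literal-heredoc variant. To emit a literal ${ or %{ (common in IAM policies or shell scripts that use their own ${VAR} syntax), escape it by doubling the first character. A minimal sketch:

```hcl
user_data = <<-EOF
  #!/bin/bash
  # $${HOSTNAME} reaches the instance as the literal shell variable ${HOSTNAME};
  # ${var.environment} is interpolated by Terraform at plan time
  echo "env=${var.environment} host=$${HOSTNAME}" >> /etc/motd
EOF
```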
Terraform includes a rich library of built-in functions. Here are the most essential ones, grouped by category.
| Function | Category | Description | Example |
|---|---|---|---|
| file(path) | Filesystem | Read file contents as string | file("scripts/init.sh") |
| templatefile(path, vars) | Filesystem | Render a template with variables | templatefile("tpl.sh", { port = 8080 }) |
| lookup(map, key, default) | Collection | Map lookup with fallback | lookup(var.sizes, "dev", "t3.micro") |
| merge(maps...) | Collection | Merge multiple maps | merge(local.tags, { Name = "web" }) |
| concat(lists...) | Collection | Combine multiple lists | concat(var.public, var.private) |
| element(list, idx) | Collection | Get element by index (wraps) | element(var.azs, count.index) |
| length(val) | Collection | Length of list, map, or string | length(var.subnets) |
| format(spec, vals...) | String | Printf-style formatting | format("ip-%s", var.name) |
| join(sep, list) | String | Join list elements | join(",", var.cidrs) |
| split(sep, string) | String | Split string into list | split(",", "a,b,c") |
| try(exprs...) | Error | Return first non-error result | try(var.config.port, 8080) |
| can(expr) | Error | Test if expression is valid | can(regex("^ami-", var.ami)) |
| tostring(val) | Conversion | Convert to string | tostring(42) → "42" |
| tonumber(val) | Conversion | Convert to number | tonumber("42") → 42 |
| tolist(val) | Conversion | Convert set to list | tolist(toset(["b","a"])) |
| tomap(val) | Conversion | Convert to map | tomap({ a = 1 }) |
Additional commonly used functions:
# Numeric
min(5, 3, 9) # 3
max(5, 3, 9) # 9
ceil(4.3) # 5
floor(4.9) # 4
# String manipulation
lower("HELLO") # "hello"
upper("hello") # "HELLO"
replace("hi-there", "-", "_") # "hi_there"
trimspace(" hi ") # "hi"
substr("hello", 0, 3) # "hel"
regex("^ami-(.*)", "ami-abc123") # ["abc123"]
# Collection operations
flatten([["a"], ["b", "c"]]) # ["a", "b", "c"]
zipmap(["a","b"], [1,2]) # { a=1, b=2 }
keys({ a = 1, b = 2 }) # ["a", "b"]
values({ a = 1, b = 2 }) # [1, 2]
contains(["a","b"], "a") # true
distinct(["a","b","a"]) # ["a", "b"]
sort(["c","a","b"]) # ["a", "b", "c"]
# Encoding
jsonencode({ name = "app" }) # '{"name":"app"}'
jsondecode("{\"name\":\"app\"}") # { name = "app" }
base64encode("hello") # "aGVsbG8="
base64decode("aGVsbG8=") # "hello"
# Cryptographic
sha256("content") # hex-encoded SHA-256 hash
md5("content") # hex-encoded MD5 hash
# Date/Time
timestamp() # "2025-01-15T10:30:00Z"
formatdate("YYYY-MM-DD", timestamp())
Use terraform console to interactively test expressions, functions, and variable references against the current state and configuration.
# Launch the console
$ terraform console
# Test functions
> length(["a", "b", "c"])
3
> upper("terraform")
"TERRAFORM"
> cidrsubnet("10.0.0.0/16", 8, 1)
"10.0.1.0/24"
# Reference variables
> var.region
"us-east-1"
# Reference resources (if state exists)
> aws_instance.web.public_ip
"54.123.45.67"
# Test complex expressions
> { for k, v in var.instance_sizes : k => upper(v) }
{
"dev" = "T3.MICRO"
"prod" = "T3.MEDIUM"
"staging" = "T3.SMALL"
}
# Test try/can
> try(var.missing_var, "fallback")
"fallback"
> can(tonumber("hello"))
false
# Exit
> exit
Pipe expressions directly: echo 'cidrsubnet("10.0.0.0/16", 8, 3)' | terraform console. This is particularly useful in CI scripts or when you need a quick calculation without entering interactive mode.
The building blocks of every Terraform configuration. Resources create, update, and destroy infrastructure. Data sources query what already exists.
A resource block declares a piece of infrastructure. Terraform manages the full lifecycle: creation, in-place updates, replacement, and destruction. Each resource has a type (determined by the provider) and a local name used for references within the configuration.
# Syntax: resource "TYPE" "LOCAL_NAME" { ... }
resource "aws_instance" "web" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
subnet_id = aws_subnet.public[0].id
root_block_device {
volume_size = 20
volume_type = "gp3"
encrypted = true
}
tags = {
Name = "${local.name_prefix}-web"
}
}
# Reference attributes: RESOURCE_TYPE.LOCAL_NAME.ATTRIBUTE
output "web_ip" {
value = aws_instance.web.public_ip
}
When you run terraform apply, Terraform compares the desired state (your HCL) against the current state (the state file). It then calculates the minimum set of API calls needed to reconcile the two. Resources are created if new, updated in-place when possible, or destroyed and recreated when a change forces replacement.
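Whether a given change is an in-place update or a forced replacement is defined by each provider's schema, and the plan output marks the difference. A sketch of both cases on one resource, following the AWS provider's documented behavior:

```hcl
resource "aws_instance" "web" {
  # Changing instance_type is an in-place update
  # (shown with a ~ prefix in plan output)
  instance_type = "t3.small"

  # Changing ami destroys and recreates the instance
  # (shown as -/+ and annotated "forces replacement")
  ami = "ami-0c55b159cbfafe1f0"
}
```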
Data sources are read-only queries to existing infrastructure or external systems. They do not create or modify anything. Use them to look up AMIs, fetch existing VPC IDs, read secrets, or reference resources managed outside your configuration.
# Look up the latest Ubuntu AMI
data "aws_ami" "ubuntu" {
most_recent = true
owners = ["099720109477"] # Canonical
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
# Use the data source result in a resource
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
}
# Fetch an existing VPC by tag
data "aws_vpc" "main" {
filter {
name = "tag:Name"
values = ["production-vpc"]
}
}
# Read remote state from another project
data "terraform_remote_state" "network" {
backend = "s3"
config = {
bucket = "my-terraform-state"
key = "network/terraform.tfstate"
region = "us-east-1"
}
}
# Reference: data.terraform_remote_state.network.outputs.vpc_id
Resources use resource "type" "name" and are referenced as type.name. Data sources use data "type" "name" and are referenced as data.type.name. Resources manage lifecycle; data sources only read.
Meta-arguments are special arguments available on every resource block, regardless of provider. They control how Terraform manages the resource rather than configuring its attributes.
| Feature | count | for_each |
|---|---|---|
| Input type | Integer | Map or set of strings |
| Access index/key | count.index | each.key / each.value |
| Resource address | type.name[0], type.name[1] | type.name["key"] |
| Reordering behavior | Removing item 0 shifts all indexes, causing replacement | Removing a key only affects that key |
| Best for | Identical copies, simple conditionals | Unique resources from a collection |
# count — create multiple identical instances
resource "aws_instance" "worker" {
count = 3
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
subnet_id = element(var.subnet_ids, count.index)
tags = {
Name = "worker-${count.index}"
}
}
# count as a conditional (0 or 1)
resource "aws_lb" "main" {
count = var.create_lb ? 1 : 0
name = "${local.name_prefix}-lb"
# ...
}
# Reference: aws_lb.main[0].dns_name (must check length)
# for_each with a map — create subnets per AZ
resource "aws_subnet" "private" {
for_each = {
"us-east-1a" = "10.0.1.0/24"
"us-east-1b" = "10.0.2.0/24"
"us-east-1c" = "10.0.3.0/24"
}
vpc_id = aws_vpc.main.id
availability_zone = each.key
cidr_block = each.value
tags = {
Name = "private-${each.key}"
}
}
# for_each with a set of strings
resource "aws_iam_user" "devs" {
for_each = toset(["alice", "bob", "carol"])
name = each.value
}
# Reference: aws_subnet.private["us-east-1a"].id
# depends_on — explicit dependency when Terraform can't infer it
resource "aws_instance" "app" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
# Wait for the IAM role policy to be attached before launching
depends_on = [aws_iam_role_policy_attachment.app_policy]
}
# provider — select a specific provider configuration
provider "aws" {
alias = "west"
region = "us-west-2"
}
resource "aws_instance" "west_server" {
provider = aws.west
ami = "ami-0abcdef1234567890"
instance_type = "t3.micro"
}
Use for_each when each instance is logically distinct. With count, removing an item in the middle shifts all subsequent indexes and forces replacement of those resources. With for_each, each resource is keyed by name, so removals are surgical.
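When refactoring an existing resource from count to for_each, moved blocks (Terraform 1.1+) tell Terraform to rename the state entries instead of destroying and recreating them. A sketch with illustrative addresses:

```hcl
# Before: aws_iam_user.devs[0], [1] created via count
# After:  aws_iam_user.devs["alice"], ["bob"] created via for_each

moved {
  from = aws_iam_user.devs[0]
  to   = aws_iam_user.devs["alice"]
}

moved {
  from = aws_iam_user.devs[1]
  to   = aws_iam_user.devs["bob"]
}
```

After the refactor is applied everywhere, the moved blocks can be deleted.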
The lifecycle block customizes how Terraform handles resource creation, update, and destruction. Every rule is declared inside a lifecycle { } block within the resource.
# create_before_destroy — zero-downtime replacements
resource "aws_instance" "web" {
ami = var.ami_id
instance_type = "t3.micro"
lifecycle {
create_before_destroy = true
}
}
# prevent_destroy — protect critical resources
resource "aws_db_instance" "primary" {
identifier = "production-db"
engine = "postgres"
instance_class = "db.r6g.xlarge"
lifecycle {
prevent_destroy = true
}
}
# ignore_changes — ignore external modifications
resource "aws_instance" "managed" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
lifecycle {
# ASG or external process may update these tags
ignore_changes = [tags, instance_type]
}
}
# replace_triggered_by — force replacement when another resource changes
resource "aws_instance" "app" {
ami = var.ami_id
instance_type = "t3.micro"
lifecycle {
replace_triggered_by = [
null_resource.config_hash.id
]
}
}
# Combining multiple lifecycle rules
resource "aws_launch_template" "web" {
name_prefix = "web-"
image_id = var.ami_id
instance_type = "t3.micro"
lifecycle {
create_before_destroy = true
ignore_changes = [description]
}
}
lifecycle { create_before_destroy = true }
Create the replacement resource before destroying the old one. Essential for zero-downtime deployments.
lifecycle { prevent_destroy = true }
Terraform will error if a plan would destroy this resource. Protects databases, S3 buckets, and other critical infrastructure.
lifecycle { ignore_changes = [tags] }
Ignore specific attribute changes made outside Terraform. Use all to ignore every attribute after initial creation.
lifecycle { replace_triggered_by = [res.id] }
Force resource replacement when a referenced resource or attribute changes. Added in Terraform 1.2.
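Terraform 1.2 also added custom conditions inside lifecycle: precondition blocks validate assumptions before an operation, and postcondition blocks verify the result afterward. A sketch:

```hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type

  lifecycle {
    precondition {
      condition     = data.aws_ami.ubuntu.architecture == "x86_64"
      error_message = "The selected AMI must be for the x86_64 architecture."
    }

    postcondition {
      # self refers to this resource's resulting attributes
      condition     = self.public_ip != ""
      error_message = "The instance must receive a public IP address."
    }
  }
}
```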
Provisioners execute scripts on a local or remote machine as part of resource creation or destruction. They are a last resort for bootstrapping that cannot be done through provider-native features.
# local-exec — run a command on the machine running Terraform
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
provisioner "local-exec" {
command = "echo ${self.private_ip} >> inventory.txt"
}
}
# remote-exec — run commands on the created resource via SSH
resource "aws_instance" "app" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
key_name = aws_key_pair.deploy.key_name
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/deploy.pem")
host = self.public_ip
}
provisioner "remote-exec" {
inline = [
"sudo apt-get update",
"sudo apt-get install -y nginx",
"sudo systemctl start nginx",
]
}
}
Provisioners break the declarative model: they are not reflected in state, cannot be planned, and run only on create (not update). Prefer user_data for EC2 bootstrapping, configuration management tools (Ansible, Chef), or Packer for pre-baked AMIs. Use provisioners only when no provider-native alternative exists.
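As a concrete alternative, the remote-exec bootstrap above can usually be replaced with user data rendered by templatefile, which is declarative and visible in the plan. A sketch; the template path and its variables are illustrative:

```hcl
resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  # Rendered at plan time; editing the template or its variables
  # shows up as a diff on user_data
  user_data = templatefile("${path.module}/templates/bootstrap.sh.tftpl", {
    nginx_port  = 80
    environment = var.environment
  })

  # Recreate the instance when user_data changes (AWS provider 4.x+)
  user_data_replace_on_change = true
}
```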
The terraform_data resource (Terraform 1.4+) is the modern replacement for null_resource. Both serve as generic containers for provisioners and triggers, but terraform_data is built into Terraform core and does not require the hashicorp/null provider.
# Legacy approach — null_resource (requires null provider)
resource "null_resource" "config_update" {
triggers = {
config_hash = sha256(file("config.yaml"))
}
provisioner "local-exec" {
command = "./deploy-config.sh"
}
}
# Modern approach — terraform_data (TF 1.4+, no extra provider)
resource "terraform_data" "config_update" {
input = sha256(file("config.yaml"))
provisioner "local-exec" {
command = "./deploy-config.sh"
}
}
# terraform_data with replacement triggers
resource "terraform_data" "bootstrap" {
triggers_replace = [
aws_instance.web.id,
var.app_version,
]
provisioner "local-exec" {
command = "ansible-playbook -i '${aws_instance.web.public_ip},' deploy.yml"
}
}
To migrate from null_resource to terraform_data, replace triggers with triggers_replace (which forces recreation on change) or input (whose value is stored in state and exposed via the resource's output attribute). Then run terraform state mv null_resource.name terraform_data.name so the existing object is renamed in state rather than recreated.
Terraform state is the source of truth that maps your configuration to real-world resources. Understanding state is essential for safe, collaborative infrastructure management.
Terraform records every resource it manages in a state file (terraform.tfstate). This JSON file maps resource addresses in your configuration to real infrastructure IDs in the cloud. Without state, Terraform would have no way to know which resources it created or what their current attributes are.
The state file contains:
# Simplified structure of terraform.tfstate
{
"version": 4,
"terraform_version": "1.9.0",
"serial": 42,
"lineage": "abc-123-def",
"outputs": {
"vpc_id": {
"value": "vpc-0abc123",
"type": "string"
}
},
"resources": [
{
"mode": "managed",
"type": "aws_instance",
"name": "web",
"instances": [{
"attributes": {
"id": "i-0abc123def456",
"ami": "ami-0c55b159cbfafe1f0",
"public_ip": "54.123.45.67"
}
}]
}
]
}
The state file contains every attribute of every managed resource, including sensitive values like database passwords, API keys, and TLS private keys — all in plaintext. Never commit terraform.tfstate to version control. Always use a remote backend with encryption enabled. Restrict access to the state storage bucket/container using IAM policies.
Remote backends store state in a shared, encrypted location. This enables team collaboration and prevents state file conflicts.
# AWS S3 + DynamoDB (most common)
terraform {
backend "s3" {
bucket = "company-terraform-state"
key = "projects/webapp/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-state-locks"
encrypt = true
}
}
# Azure Blob Storage
terraform {
backend "azurerm" {
resource_group_name = "terraform-state-rg"
storage_account_name = "tfstatesa"
container_name = "tfstate"
key = "webapp.terraform.tfstate"
}
}
# Google Cloud Storage
terraform {
backend "gcs" {
bucket = "company-terraform-state"
prefix = "webapp"
}
}
# Terraform Cloud / HCP Terraform
terraform {
cloud {
organization = "my-company"
workspaces {
name = "webapp-production"
}
}
}
The S3 bucket and DynamoDB table must exist before you can use them as a backend. Bootstrap them with a separate Terraform project that uses a local backend, or create them manually. Some teams use a bootstrap/ directory for this purpose.
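Such a bootstrap project might look like the following sketch. The bucket and table names are illustrative; the lock table's partition key must be a string attribute named LockID:

```hcl
# bootstrap/main.tf -- uses a local backend; run once before any
# project points its "s3" backend at these resources
resource "aws_s3_bucket" "tf_state" {
  bucket = "company-terraform-state"
}

# Versioning lets you recover earlier state revisions
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "tf_locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```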
State locking prevents concurrent operations from corrupting the state file. When one user runs terraform apply, the backend acquires a lock. Any other operations against the same state will wait or fail until the lock is released.
| Backend | Locking Mechanism | Automatic |
|---|---|---|
| S3 | DynamoDB table (required separately) | Yes, if dynamodb_table is set |
| Azure Blob | Native blob lease | Yes |
| GCS | Native object locking | Yes |
| Terraform Cloud | Built-in run queue | Yes |
| Consul | KV lock sessions | Yes |
| Local | File system lock | Yes (single machine only) |
# Force-unlock a stuck lock (use with extreme caution)
terraform force-unlock LOCK_ID
# The lock ID is shown in the error message when a lock conflict occurs
# Example error:
# Error: Error locking state: Error acquiring the state lock
# Lock Info:
# ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Path: s3://my-bucket/terraform.tfstate
# Operation: OperationTypeApply
# Who: user@hostname
# Created: 2025-01-15 10:30:00.000000 +0000 UTC
Terraform provides CLI commands for inspecting and manipulating state directly. These are essential for refactoring, debugging, and disaster recovery.
terraform state list
Show every resource address tracked in the state file. Accepts an optional address filter.
terraform state list module.network
List only resources within a specific module. Useful for large configurations.
terraform state show aws_instance.web
Display all attributes for a single resource as stored in state, including computed values.
terraform state mv aws_instance.old aws_instance.new
Rename a resource in state without destroying and recreating it. Also moves between modules.
terraform state rm aws_instance.web
Stop managing a resource without destroying the real infrastructure. Terraform "forgets" it.
terraform state pull
Download the current remote state and print it to stdout as JSON. Useful for inspection and backup.
terraform state push terraform.tfstate
Upload a local state file to the remote backend. Dangerous — use only for disaster recovery.
terraform state replace-provider old new
Update provider references in state. Used when providers are forked or change their registry address.
# Move a resource into a module
terraform state mv \
aws_security_group.web \
module.network.aws_security_group.web
# Move a resource between for_each keys
terraform state mv \
'aws_subnet.private["us-east-1a"]' \
'aws_subnet.private["use1-az1"]'
# Backup state before risky operations
terraform state pull > state-backup-$(date +%Y%m%d).json
Import brings existing infrastructure under Terraform management. There are two approaches: the legacy CLI import and the modern import block (Terraform 1.5+).
# Legacy CLI import (still supported)
# Step 1: Write the resource block in your configuration
resource "aws_instance" "legacy_server" {
# Configuration will be filled in after import
}
# Step 2: Run the import command
terraform import aws_instance.legacy_server i-0abc123def456789
# Step 3: Run terraform plan and fill in attributes until plan is clean
# Modern import blocks (TF 1.5+) — declarative and plannable
import {
to = aws_instance.legacy_server
id = "i-0abc123def456789"
}
import {
to = aws_s3_bucket.assets
id = "my-company-assets-bucket"
}
# Generate HCL configuration automatically
# (run this after writing the import blocks above)
terraform plan -generate-config-out=generated.tf
# Review generated.tf, clean it up, then:
terraform plan # should show no changes
terraform apply # imports into state
Import blocks are declarative: they can be code-reviewed, they appear in terraform plan output, and they can auto-generate configuration with -generate-config-out. The legacy terraform import command is imperative and requires you to manually write the resource configuration.
When switching between backends (for example, from local to S3, or from S3 to Terraform Cloud), use the -migrate-state flag during init.
# Step 1: Update the backend block in your configuration
terraform {
backend "s3" {
bucket = "new-state-bucket"
key = "app/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}
# Step 2: Reinitialize with migration
terraform init -migrate-state
# Terraform will prompt:
# Do you want to copy existing state to the new backend?
# Enter "yes" to copy, "no" to start fresh.
# Step 3: Verify state was migrated
terraform state list
terraform plan # should show no changes
Always enable encryption on your remote backend. For S3, set encrypt = true and consider using a KMS key (kms_key_id). For Azure, the storage account should have encryption at rest. Mark sensitive outputs with sensitive = true to prevent them from appearing in CLI output, though they are still stored in the state file in plaintext.
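An S3 backend block using a customer-managed KMS key might look like this sketch (the key ARN is a placeholder):

```hcl
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "app/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID"
    dynamodb_table = "terraform-locks"
  }
}
```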
Reusable, composable packages of Terraform configuration. Modules are the primary mechanism for organizing, encapsulating, and sharing infrastructure code.
A well-structured module follows a standard directory layout. Every Terraform directory is implicitly a module — the root directory is the root module, and any module called via a module block is a child module.
# Standard module directory layout
modules/
  vpc/
    main.tf       # Primary resource definitions
    variables.tf  # Input variable declarations
    outputs.tf    # Output value declarations
    versions.tf   # Required providers and Terraform version
    README.md     # Documentation (used by registry)
    locals.tf     # Local value computations (optional)
    data.tf       # Data source lookups (optional)
# versions.tf — pin provider requirements in child modules
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 5.0"
}
}
}
# variables.tf — define the module's inputs
variable "vpc_cidr" {
description = "CIDR block for the VPC"
type = string
default = "10.0.0.0/16"
}
variable "environment" {
description = "Environment name (dev, staging, prod)"
type = string
}
# outputs.tf — expose values to the caller
output "vpc_id" {
description = "ID of the created VPC"
value = aws_vpc.main.id
}
output "private_subnet_ids" {
description = "List of private subnet IDs"
value = aws_subnet.private[*].id
}
The source argument in a module block tells Terraform where to find the module code. Terraform supports many source types.
# Local path — relative to the calling module
module "vpc" {
source = "./modules/vpc"
vpc_cidr = "10.0.0.0/16"
environment = "production"
}
# Terraform Registry (public)
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = "production-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
}
# GitHub (HTTPS)
module "app" {
source = "github.com/my-org/terraform-modules//modules/app?ref=v2.1.0"
}
# GitHub (SSH)
module "app" {
source = "git@github.com:my-org/terraform-modules.git//modules/app?ref=v2.1.0"
}
# Generic Git repository
module "networking" {
source = "git::https://git.example.com/infra/modules.git//networking?ref=main"
}
# S3 bucket (versioned zip archive)
module "legacy" {
source = "s3::https://s3-us-east-1.amazonaws.com/my-modules/vpc/v1.2.0.zip"
}
The // in Git and GitHub URLs separates the repository root from the subdirectory path. Everything after // is a path within the repo. The ?ref= parameter specifies a Git tag, branch, or commit SHA.
Version constraints control which versions of a module are acceptable. Versioning is supported for Terraform Registry modules and can be simulated for Git sources using ref.
| Constraint | Meaning | Example Matches |
|---|---|---|
| = 3.2.0 | Exact version only | 3.2.0 |
| ~> 3.2 | >= 3.2.0 and < 4.0.0 | 3.2.0, 3.9.5 |
| ~> 3.2.0 | >= 3.2.0 and < 3.3.0 | 3.2.0, 3.2.7 |
| >= 3.0, < 4.0 | Custom range | 3.0.0, 3.5.2 |
| >= 3.0 | Minimum version, no upper bound | 3.0.0, 5.0.0 |
# Registry module with pessimistic version constraint
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.0" # any 20.x.x
# ...
}
# Git source with tag
module "network" {
source = "git@github.com:my-org/modules.git//network?ref=v3.2.0"
}
# Git source with branch
module "network" {
source = "git@github.com:my-org/modules.git//network?ref=main"
}
# Git source with commit SHA (most reproducible)
module "network" {
source = "git@github.com:my-org/modules.git//network?ref=a1b2c3d"
}
For production, use ~> MAJOR.0 (e.g., ~> 5.0) to allow minor and patch updates while blocking the next major version, or ~> MAJOR.MINOR.0 (e.g., ~> 5.31.0) to allow patch updates only. For maximum stability, pin to an exact version. Always run terraform init -upgrade and review changes before updating module versions.
Large infrastructure configurations should be composed from small, focused modules rather than written as a single monolith. Two common patterns emerge: the composition pattern and the facade pattern.
# Composition pattern — assembling small modules in the root
# Each module handles one concern
module "network" {
source = "./modules/network"
vpc_cidr = "10.0.0.0/16"
environment = var.environment
}
module "database" {
source = "./modules/database"
subnet_ids = module.network.private_subnet_ids
vpc_id = module.network.vpc_id
}
module "application" {
source = "./modules/application"
subnet_ids = module.network.public_subnet_ids
db_endpoint = module.database.endpoint
db_port = module.database.port
security_groups = [module.network.app_sg_id]
}
# Facade pattern — a high-level module that wraps lower-level modules
# modules/platform/main.tf
module "network" {
source = "../network"
vpc_cidr = var.vpc_cidr
environment = var.environment
}
module "database" {
source = "../database"
subnet_ids = module.network.private_subnet_ids
vpc_id = module.network.vpc_id
}
module "application" {
source = "../application"
subnet_ids = module.network.public_subnet_ids
db_endpoint = module.database.endpoint
}
# Caller uses the facade with minimal configuration
# root main.tf
module "platform" {
source = "./modules/platform"
environment = "production"
vpc_cidr = "10.0.0.0/16"
}
Every Terraform run operates on a root module — the directory where you invoke terraform plan/apply. Any module called from the root (or from other child modules) is a child module. Understanding their different responsibilities is important.
| Aspect | Root Module | Child Module |
|---|---|---|
| Purpose | Orchestrates the deployment | Encapsulates a reusable component |
| Provider config | Declares and configures providers | Inherits providers from parent (should not configure) |
| Backend config | Declares the backend | Cannot declare a backend |
| Variables | Set via .tfvars, CLI, or env vars | Set via module block arguments |
| Outputs | Displayed to the CLI user | Available to the calling module via module.name.output |
| State | Owns the state file | Resources stored in the same state as root |
Modules communicate through inputs (variables) and outputs. The output of one module becomes the input to another. This creates an explicit dependency graph that Terraform uses for ordering.
# Module A (network) — outputs.tf
output "vpc_id" {
description = "The VPC ID"
value = aws_vpc.main.id
}
output "private_subnet_ids" {
description = "Private subnet IDs"
value = aws_subnet.private[*].id
}
# Module B (compute) — variables.tf
variable "vpc_id" {
description = "VPC to deploy into"
type = string
}
variable "subnet_ids" {
description = "Subnets for the instances"
type = list(string)
}
# Root module — wiring modules together
module "network" {
source = "./modules/network"
environment = var.environment
}
module "compute" {
source = "./modules/compute"
vpc_id = module.network.vpc_id # output -> input
subnet_ids = module.network.private_subnet_ids # output -> input
}
# Terraform automatically knows to create network before compute
The Terraform Registry hosts public modules and providers. Public modules follow a strict naming convention and are versioned with semantic versioning.
terraform-<PROVIDER>-<NAME>
Registry modules must follow this naming pattern. Example: terraform-aws-vpc, terraform-google-kubernetes-engine.
<NAMESPACE>/<NAME>/<PROVIDER>
Registry modules are referenced as "terraform-aws-modules/vpc/aws" in the source argument.
app.terraform.io/<ORG>/<NAME>/<PROVIDER>
Terraform Cloud and Enterprise support private module registries for internal modules.
source = "terraform-aws-modules/vpc/aws//modules/vpc-endpoints"
Registry modules may expose submodules via the // path separator.
# Using popular registry modules
# AWS VPC module (most downloaded Terraform module)
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = "my-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
tags = local.common_tags
}
# AWS EKS module
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.0"
cluster_name = "production"
cluster_version = "1.30"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
eks_managed_node_groups = {
default = {
min_size = 2
max_size = 10
desired_size = 3
instance_types = ["t3.medium"]
}
}
}
# Using a private registry module
module "internal_app" {
source = "app.terraform.io/my-company/app-template/aws"
version = "~> 2.0"
app_name = "billing-service"
environment = "production"
}
Keep modules focused on a single concern. Expose only what callers need through outputs. Always declare required_providers in child modules but do not configure providers there — let the root module handle provider configuration. Document every variable and output with description attributes.
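When a child module needs an aliased provider (for example, a replica region), it declares that requirement with configuration_aliases in required_providers rather than configuring a provider itself. A sketch, with a hypothetical module path:

```hcl
# modules/replicated_bucket/versions.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
      # The caller must pass a provider for the "replica" alias
      configuration_aliases = [aws.replica]
    }
  }
}

# The root module supplies both configured providers:
# module "replicated_bucket" {
#   source    = "./modules/replicated_bucket"
#   providers = {
#     aws         = aws
#     aws.replica = aws.west
#   }
# }
```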
Providers are the bridge between Terraform and the outside world. Each provider is a plugin that translates HCL into API calls for a specific platform or service.
Providers are standalone Go binaries distributed as plugins. When you run terraform init, Terraform downloads the provider binaries matching your version constraints and stores them in .terraform/providers/. Each provider manages a set of resource types and data sources for its target API.
Providers are identified by a source address of up to three parts: hostname/namespace/type. The hostname defaults to registry.terraform.io, so hashicorp/aws resolves to registry.terraform.io/hashicorp/aws.
Set TF_PLUGIN_CACHE_DIR to a shared directory (e.g., ~/.terraform.d/plugin-cache) to avoid re-downloading providers across projects. Terraform creates symlinks instead of copies, saving both time and disk space.
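A minimal setup sketch; the path shown is the conventional one, but any writable directory works:

```shell
# Create the shared cache and point Terraform at it for this shell
mkdir -p "$HOME/.terraform.d/plugin-cache"
export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"

# To persist across shells, the CLI config file (~/.terraformrc) accepts:
#   plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"
echo "cache dir: $TF_PLUGIN_CACHE_DIR"
```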
Declare every provider your configuration depends on in the required_providers block inside a terraform block. This is the single source of truth for provider versions.
terraform {
# Pin the Terraform CLI version
required_version = ">= 1.5.0, < 2.0.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0" # >= 5.0, < 6.0
}
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.80" # >= 3.80, < 4.0
}
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.24, < 3.0"
}
datadog = {
source = "DataDog/datadog"
version = "~> 3.30"
}
}
}
| Constraint | Meaning | Example Range |
|---|---|---|
| = 5.31.0 | Exact version only | 5.31.0 |
| ~> 5.0 | Pessimistic (rightmost component may increment) | >= 5.0, < 6.0 |
| ~> 5.31.0 | Pessimistic (patch level) | >= 5.31.0, < 5.32.0 |
| >= 5.0, < 5.50 | Custom range | 5.0.0 through 5.49.x |
| != 5.25.0 | Exclude a specific version | Any version except 5.25.0 |
The provider block configures a specific provider instance. Provider configuration typically includes the region, authentication method, and default behaviors.
# Basic AWS provider configuration
provider "aws" {
region = "us-east-1"
default_tags {
tags = {
Environment = var.environment
ManagedBy = "terraform"
Team = "platform"
CostCenter = "eng-12345"
}
}
}
# Azure provider
provider "azurerm" {
features {}
subscription_id = var.subscription_id
}
# GCP provider
provider "google" {
project = var.project_id
region = "us-central1"
}
Do not put access keys, secrets, or tokens in provider blocks or .tf files. Use environment variables (AWS_ACCESS_KEY_ID, ARM_CLIENT_ID), OIDC federation, IAM roles for service accounts, or a credentials helper. In CI/CD, use GOOGLE_CREDENTIALS with a service account JSON or workload identity federation.
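A sketch of the environment-variable approach; all values below are placeholders (the AWS pair is the documentation example key), and real credentials should be injected by the CI system or a secrets manager, never committed:

```shell
# AWS (static keys shown for illustration; prefer IAM roles or OIDC)
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# Azure service principal (client-secret flow)
export ARM_CLIENT_ID="00000000-0000-0000-0000-000000000000"
export ARM_CLIENT_SECRET="placeholder-secret"
export ARM_TENANT_ID="00000000-0000-0000-0000-000000000000"
export ARM_SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"

# GCP: GOOGLE_CREDENTIALS accepts a key-file path or the raw JSON itself
export GOOGLE_CREDENTIALS="/path/to/service-account.json"
```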
Aliases let you configure multiple instances of the same provider — for multi-region deployments, cross-account access, or different authentication contexts.
# Default AWS provider (us-east-1)
provider "aws" {
region = "us-east-1"
}
# Aliased provider for us-west-2
provider "aws" {
alias = "west"
region = "us-west-2"
}
# Aliased provider for a different AWS account
provider "aws" {
alias = "shared_services"
region = "us-east-1"
assume_role {
role_arn = "arn:aws:iam::123456789012:role/TerraformRole"
}
}
# Use aliased provider in a resource
resource "aws_s3_bucket" "replica" {
provider = aws.west
bucket = "my-replica-bucket"
}
# Pass aliased provider to a module
module "west_vpc" {
source = "./modules/vpc"
providers = {
aws = aws.west
}
cidr_block = "10.1.0.0/16"
}
For multi-region deployments, define one default provider and one aliased provider per additional region. Resources default to the un-aliased provider unless you explicitly set provider = aws.alias_name. Modules must receive aliased providers through their providers argument.
The Terraform Registry hosts thousands of providers. The average enterprise uses 8–12 providers. Here are the most widely adopted:
| Provider | Source | Description | Typical Use Cases |
|---|---|---|---|
| AWS | hashicorp/aws | Amazon Web Services (4B+ downloads, the most downloaded Terraform provider) | EC2, S3, RDS, Lambda, VPC, IAM |
| Azure | hashicorp/azurerm | Microsoft Azure Resource Manager | VMs, AKS, Storage, SQL, Networking |
| GCP | hashicorp/google | Google Cloud Platform | GKE, Cloud Run, BigQuery, VPC |
| Kubernetes | hashicorp/kubernetes | Manage K8s resources declaratively | Deployments, Services, ConfigMaps |
| Docker | kreuzwerker/docker | Manage Docker containers and images | Local dev environments, container orchestration |
| GitHub | integrations/github | Manage GitHub repos, teams, and settings | Repo creation, branch protection, team membership |
| Cloudflare | cloudflare/cloudflare | DNS, WAF, Workers, and edge services | DNS records, page rules, access policies |
| Datadog | DataDog/datadog | Monitoring, dashboards, and alerting | Monitors, dashboards, SLOs, downtimes |
Beyond official HashiCorp providers, the Terraform Registry hosts thousands of community-maintained providers for services ranging from PagerDuty and Snowflake to 1Password and Spotify. Custom in-house providers can be distributed via a private registry or local filesystem mirrors.
# Community provider example
terraform {
required_providers {
snowflake = {
source = "Snowflake-Labs/snowflake"
version = "~> 0.76"
}
pagerduty = {
source = "PagerDuty/pagerduty"
version = "~> 3.6"
}
}
}
# Filesystem mirror for air-gapped environments
provider_installation {
filesystem_mirror {
path = "/opt/terraform/providers"
include = ["registry.terraform.io/hashicorp/*"]
}
direct {
exclude = ["registry.terraform.io/hashicorp/*"]
}
}
Strategies for managing multiple environments — dev, staging, production — with the same Terraform configuration. CLI workspaces, directory structures, and when to use each.
Terraform CLI workspaces let you maintain separate state instances from a single configuration directory. Each workspace has its own terraform.tfstate file, but all share the same code and backend configuration.
terraform workspace new staging
Create a new workspace and switch to it. The state file is created in the backend under a workspace-specific path.
terraform workspace list
Show all available workspaces. Current workspace is marked with an asterisk (*).
terraform workspace select production
Switch to an existing workspace. All subsequent commands use that workspace's state.
terraform workspace show
Print the name of the currently selected workspace. Useful in scripts and CI/CD pipelines.
terraform workspace delete staging
Remove a workspace and its state. The workspace must have an empty state (all resources destroyed) first.
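Because applying to the wrong workspace is the classic failure mode, pipelines often guard on terraform workspace show before running apply. A minimal sketch (the check_workspace helper is hypothetical; it takes the expected name and the command's output so the logic is testable in isolation):

```shell
#!/bin/sh
# Abort unless the active workspace matches the expected one.
check_workspace() {
  expected="$1"
  actual="$2"
  if [ "$actual" != "$expected" ]; then
    echo "refusing to apply: active workspace is '$actual', expected '$expected'" >&2
    return 1
  fi
}

# In a real pipeline:
#   check_workspace production "$(terraform workspace show)" || exit 1
check_workspace production production && echo "workspace ok"
```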
Use terraform.workspace to vary configuration based on the active workspace. A common pattern uses a locals map to define environment-specific settings:
# locals.tf — environment-specific configuration via workspace
locals {
env_config = {
dev = {
instance_type = "t3.micro"
instance_count = 1
db_class = "db.t3.micro"
multi_az = false
}
staging = {
instance_type = "t3.small"
instance_count = 2
db_class = "db.t3.small"
multi_az = false
}
production = {
instance_type = "t3.large"
instance_count = 3
db_class = "db.r6g.large"
multi_az = true
}
}
# Select config for the active workspace
config = local.env_config[terraform.workspace]
}
# main.tf — use workspace-driven values
resource "aws_instance" "app" {
count = local.config.instance_count
instance_type = local.config.instance_type
ami = data.aws_ami.ubuntu.id
tags = {
Name = "app-${terraform.workspace}-${count.index}"
Environment = terraform.workspace
}
}
resource "aws_db_instance" "main" {
instance_class = local.config.db_class
multi_az = local.config.multi_az
identifier = "app-db-${terraform.workspace}"
}
The recommended approach for production environments is a directory-based layout. Each environment gets its own directory with its own backend configuration and state file, providing full isolation.
# Recommended directory layout
infrastructure/
  modules/
    vpc/
      main.tf
      variables.tf
      outputs.tf
    app/
      main.tf
      variables.tf
      outputs.tf
  environments/
    dev/
      main.tf           # Calls modules with dev values
      backend.tf        # Dev-specific state backend
      terraform.tfvars  # Dev variable values
    staging/
      main.tf
      backend.tf
      terraform.tfvars
    production/
      main.tf
      backend.tf
      terraform.tfvars
Use directory-based structure for production workloads where blast radius and access control matter. Reserve CLI workspaces for local development and testing where simplicity outweighs isolation. HCP Terraform workspaces offer a managed middle ground with RBAC, policy enforcement, and run history.
| Criteria | CLI Workspaces | Directory-Based | Terragrunt |
|---|---|---|---|
| State Isolation | Shared backend, separate state files | Fully separate backends | Fully separate backends |
| Code Duplication | None — single config directory | Some — root modules per env | Minimal — DRY via terragrunt.hcl |
| Access Control | Limited (same backend IAM) | Full (per-env IAM policies) | Full (per-env IAM policies) |
| Blast Radius | Risk of wrong workspace | Isolated by design | Isolated by design |
| Complexity | Low — built-in to Terraform | Medium — more files to manage | Medium — extra tool dependency |
| CI/CD Integration | Select workspace before plan/apply | Target directory per pipeline | terragrunt run-all for orchestration |
| Best For | Dev/testing, small teams | Production, regulated environments | Large-scale multi-account setups |
| Benefits | Drawbacks |
|---|---|
| Simple — built into Terraform, no extra tools | Shared backend — same IAM for all environments |
| Zero code duplication across environments | Risk of applying to the wrong workspace |
| Quick to set up for prototyping | Limited isolation — a misconfigured backend affects all envs |
| Workspace name available in config via terraform.workspace | No per-workspace variable files by default |
| Single codebase to maintain and review | Cannot use different Terraform or provider versions per env |
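The missing per-workspace variable files are commonly worked around by deriving the -var-file path from the workspace name. A sketch (the tfvars/ layout is hypothetical, and the workspace name would normally come from terraform workspace show):

```shell
#!/bin/sh
# Expected layout: tfvars/dev.tfvars, tfvars/staging.tfvars, tfvars/production.tfvars
workspace="staging"   # stand-in for "$(terraform workspace show)"
var_file="tfvars/${workspace}.tfvars"
echo "terraform plan -var-file=${var_file}"
```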
Power-user techniques for writing expressive, maintainable Terraform configurations. Dynamic blocks, conditional resources, validation, refactoring, and cross-configuration data sharing.
Dynamic blocks generate repeated nested blocks within a resource. Instead of copying and pasting ingress or setting blocks, use dynamic with for_each to iterate over a collection.
variable "ingress_rules" {
type = list(object({
port = number
protocol = string
cidr_blocks = list(string)
description = string
}))
default = [
{ port = 80, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"], description = "HTTP" },
{ port = 443, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"], description = "HTTPS" },
{ port = 22, protocol = "tcp", cidr_blocks = ["10.0.0.0/8"], description = "SSH internal" },
]
}
resource "aws_security_group" "app" {
name = "app-sg"
description = "Application security group"
vpc_id = var.vpc_id
# Dynamic block generates one ingress block per rule
dynamic "ingress" {
for_each = var.ingress_rules
content {
from_port = ingress.value.port
to_port = ingress.value.port
protocol = ingress.value.protocol
cidr_blocks = ingress.value.cidr_blocks
description = ingress.value.description
}
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
Conditional dynamic blocks — use an empty list to suppress the block entirely:
# Only create the logging block if logging is enabled
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket"
dynamic "logging" {
for_each = var.enable_logging ? [1] : []
content {
target_bucket = var.log_bucket_id
target_prefix = "logs/"
}
}
}
# Nested dynamic blocks
resource "aws_lb_listener" "https" {
load_balancer_arn = aws_lb.main.arn
port = 443
protocol = "HTTPS"
dynamic "default_action" {
for_each = var.listener_rules
content {
type = default_action.value.type
target_group_arn = default_action.value.target_group_arn
dynamic "redirect" {
for_each = default_action.value.type == "redirect" ? [1] : []
content {
status_code = "HTTP_301"
protocol = "HTTPS"
}
}
}
}
}
Using for_each with a map produces resources keyed by the map keys, giving stable resource addresses that survive reordering. This is strongly preferred over count with lists.
variable "buckets" {
type = map(object({
versioning = bool
acl = string
}))
default = {
logs = { versioning = true, acl = "log-delivery-write" }
data = { versioning = true, acl = "private" }
tmp = { versioning = false, acl = "private" }
}
}
resource "aws_s3_bucket" "this" {
for_each = var.buckets
bucket = "${var.project}-${each.key}"
# each.key = "logs", "data", "tmp"
# each.value = { versioning = true, acl = "..." }
tags = {
Name = each.key
}
}
# Resources are addressed by key, not index:
# aws_s3_bucket.this["logs"]
# aws_s3_bucket.this["data"]
# aws_s3_bucket.this["tmp"]
With count, resources are addressed by index ([0], [1]). Removing an item from the middle shifts all subsequent indices, causing unnecessary destroy/recreate operations. With for_each on a map, each resource has a stable string key. Removing "tmp" from the map only destroys that one bucket — "logs" and "data" are untouched.
The count = var.enabled ? 1 : 0 pattern is the standard way to conditionally create a resource. When the condition is false, the resource is not created at all.
variable "create_cloudwatch_alarm" {
type = bool
default = true
}
resource "aws_cloudwatch_metric_alarm" "cpu" {
count = var.create_cloudwatch_alarm ? 1 : 0
alarm_name = "high-cpu"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 2
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = 300
statistic = "Average"
threshold = 80
}
# Reference a conditional resource with [0] index
output "alarm_arn" {
value = var.create_cloudwatch_alarm ? aws_cloudwatch_metric_alarm.cpu[0].arn : ""
}
# Or use one() to get the single element or null
output "alarm_arn_v2" {
value = one(aws_cloudwatch_metric_alarm.cpu[*].arn)
}
The moved block tells Terraform that a resource has been renamed or refactored, preventing a destroy/recreate cycle. This is essential when reorganizing code.
# Rename a resource
moved {
from = aws_instance.web_server
to = aws_instance.app
}
# Move a resource into a module
moved {
from = aws_s3_bucket.logs
to = module.logging.aws_s3_bucket.main
}
# Migrate from count to for_each
moved {
from = aws_subnet.private[0]
to = aws_subnet.private["us-east-1a"]
}
moved {
from = aws_subnet.private[1]
to = aws_subnet.private["us-east-1b"]
}
moved {
from = aws_subnet.private[2]
to = aws_subnet.private["us-east-1c"]
}
# Rename a module
moved {
from = module.web
to = module.frontend
}
Keep moved blocks in your configuration long enough for all state files (across all workspaces and environments) to have been updated. After that, you can safely remove them. A common practice is to keep them for one or two release cycles.
Lifecycle conditions let you assert invariants that Terraform checks during plan and apply. Preconditions are checked before a resource action; postconditions are checked after.
resource "aws_instance" "app" {
ami = var.ami_id
instance_type = var.instance_type
subnet_id = var.subnet_id
lifecycle {
# Precondition: checked BEFORE create/update
precondition {
condition = data.aws_ami.selected.architecture == "x86_64"
error_message = "AMI must be x86_64 architecture for this instance type."
}
# Postcondition: checked AFTER create/update
postcondition {
condition = self.public_ip != ""
error_message = "Instance must receive a public IP address."
}
}
}
# Output-level precondition (output blocks support precondition only)
output "api_url" {
value = "https://${aws_lb.main.dns_name}/api"
precondition {
condition = aws_lb.main.dns_name != ""
error_message = "Load balancer DNS name must not be empty."
}
}
Variable-level validation blocks run during terraform plan before any resource is created. Use them to catch invalid input early with clear error messages.
# CIDR validation
variable "vpc_cidr" {
type = string
validation {
condition = can(cidrhost(var.vpc_cidr, 0))
error_message = "vpc_cidr must be a valid CIDR block (e.g., 10.0.0.0/16)."
}
validation {
condition = tonumber(split("/", var.vpc_cidr)[1]) <= 24
error_message = "VPC CIDR prefix must be /24 or larger."
}
}
# Environment constraint
variable "environment" {
type = string
validation {
condition = contains(["dev", "staging", "production"], var.environment)
error_message = "Environment must be dev, staging, or production."
}
}
# Tag requirements
variable "tags" {
type = map(string)
validation {
condition = contains(keys(var.tags), "CostCenter")
error_message = "Tags must include a CostCenter key for billing."
}
validation {
condition = contains(keys(var.tags), "Owner")
error_message = "Tags must include an Owner key for accountability."
}
}
# Regex-based validation
variable "project_name" {
type = string
validation {
condition = can(regex("^[a-z][a-z0-9-]{2,28}[a-z0-9]$", var.project_name))
error_message = "Project name must be 4-30 chars, lowercase alphanumeric with hyphens, starting with a letter."
}
}
Object type constraints can declare attributes as optional, with an optional default value. This simplifies module interfaces by reducing the number of required fields callers must specify.
variable "database" {
type = object({
engine = string
engine_version = string
instance_class = string
# Optional with defaults
storage_gb = optional(number, 20)
multi_az = optional(bool, false)
backup_days = optional(number, 7)
port = optional(number) # Defaults to null
})
}
# Callers only need to specify required fields
# database = {
# engine = "postgres"
# engine_version = "15.4"
# instance_class = "db.t3.medium"
# }
The splat operator [*] extracts a single attribute from a list of objects. It works with count-based resources but not with for_each, which requires a for expression instead.
# count-based resources: use splat
resource "aws_instance" "web" {
count = 3
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
}
output "instance_ids" {
value = aws_instance.web[*].id
# Returns: ["i-abc", "i-def", "i-ghi"]
}
output "private_ips" {
value = aws_instance.web[*].private_ip
}
# for_each-based resources: use for expression
resource "aws_s3_bucket" "this" {
for_each = var.bucket_names
bucket = each.value
}
output "bucket_arns" {
value = [for b in aws_s3_bucket.this : b.arn]
}
output "bucket_map" {
value = { for k, b in aws_s3_bucket.this : k => b.arn }
}
The terraform_remote_state data source reads output values from another Terraform configuration's state. This enables cross-configuration data sharing without hardcoding values.
# In the networking configuration, expose the VPC ID
# (networking/outputs.tf)
output "vpc_id" {
value = aws_vpc.main.id
}
output "private_subnet_ids" {
value = aws_subnet.private[*].id
}
# In the application configuration, read networking outputs
# (application/data.tf)
data "terraform_remote_state" "network" {
backend = "s3"
config = {
bucket = "my-terraform-state"
key = "prod/network/terraform.tfstate"
region = "us-east-1"
}
}
# Use the remote state outputs
resource "aws_instance" "app" {
subnet_id = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
vpc_security_group_ids = [aws_security_group.app.id]
}
resource "aws_security_group" "app" {
vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
Any configuration that reads another state via terraform_remote_state gains access to all outputs from that state, including any marked sensitive. Consider using purpose-built data sharing mechanisms (SSM Parameter Store, Consul KV, or a dedicated outputs module) if you need finer-grained access control.
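A sketch of the Parameter Store alternative (parameter names are illustrative): the producing configuration publishes only the values it wants to share, and consumers read those parameters instead of the entire remote state.

```hcl
# Producer (networking configuration) publishes a single value
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/prod/network/vpc_id"
  type  = "String"
  value = aws_vpc.main.id
}

# Consumer (application configuration) reads only that parameter
data "aws_ssm_parameter" "vpc_id" {
  name = "/prod/network/vpc_id"
}

resource "aws_security_group" "app" {
  vpc_id = data.aws_ssm_parameter.vpc_id.value
}
```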
From quick syntax checks to full integration tests. Terraform's native testing framework, provider mocking, and the ecosystem of third-party validation tools.
The fastest feedback loop. terraform validate checks configuration syntax and internal consistency without accessing any remote services or state. It catches typos, missing required arguments, and invalid references — no credentials needed.
# Basic validation (requires terraform init first)
terraform validate
# JSON output for CI/CD pipelines
terraform validate -json
# Example JSON output
{
"valid": false,
"error_count": 1,
"warning_count": 0,
"diagnostics": [
{
"severity": "error",
"summary": "Missing required argument",
"detail": "The argument \"region\" is required."
}
]
}
Run terraform init -backend=false followed by terraform validate -json in your CI pipeline for sub-second syntax validation without configuring any backend or credentials.
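A sketch of gating CI on the JSON output (the sample payloads are inlined for illustration; a real pipeline would capture the command's stdout, and a robust one would parse it with jq rather than grep):

```shell
#!/bin/sh
# Fail the job when validate reports the configuration as invalid.
check_valid() {
  if echo "$1" | grep -q '"valid": true'; then
    echo "configuration valid"
  else
    echo "validation failed"
    return 1
  fi
}

check_valid '{"valid": true, "error_count": 0}'
check_valid '{"valid": false, "error_count": 1}' || true
```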
Terraform 1.6 introduced a built-in testing framework using .tftest.hcl files. Tests live alongside your configuration and use run blocks with assertions to verify plan and apply behavior. No external tools or languages required.
# tests/basic.tftest.hcl
# Global variables for all run blocks
variables {
project_name = "test-project"
environment = "test"
}
# Plan-only test — fast, no real infrastructure
run "validates_instance_type" {
command = plan
assert {
condition = aws_instance.web.instance_type == "t3.micro"
error_message = "Instance type must be t3.micro"
}
}
# Apply test — creates real resources, then destroys
run "creates_s3_bucket" {
command = apply
assert {
condition = aws_s3_bucket.data.bucket_regional_domain_name != ""
error_message = "Bucket should have a regional domain name after creation"
}
assert {
condition = aws_s3_bucket.data.tags["Environment"] == "test"
error_message = "Bucket must be tagged with the correct environment"
}
}
# Override variables in a specific run block
run "validates_production_sizing" {
command = plan
variables {
environment = "production"
instance_type = "m5.xlarge"
}
assert {
condition = aws_instance.web.instance_type == "m5.xlarge"
error_message = "Production should use m5.xlarge"
}
}
# Run all tests
terraform test
# Run tests with verbose output
terraform test -verbose
# Run a specific test file
terraform test -filter=tests/basic.tftest.hcl
# JSON output for CI
terraform test -json
Terraform 1.7 added mock providers for fast unit tests that never touch real infrastructure. Mock providers simulate provider behavior, returning placeholder values so you can test module logic, variable validation, and output expressions without any API calls.
# tests/unit.tftest.hcl — mock provider tests
# Mock the AWS provider entirely
mock_provider "aws" {}
# All resources get synthetic values; no real API calls
run "test_naming_convention" {
command = plan
variables {
project_name = "payments"
environment = "staging"
}
assert {
condition = aws_s3_bucket.data.bucket == "payments-staging-data"
error_message = "Bucket name should follow {project}-{env}-data pattern"
}
}
# Mock with overridden data for specific resources
mock_provider "aws" {
alias = "with_data"
mock_data "aws_caller_identity" {
defaults = {
account_id = "123456789012"
arn = "arn:aws:iam::123456789012:root"
}
}
}
run "test_account_id_usage" {
command = plan
providers = {
aws = aws.with_data
}
assert {
condition = strcontains(aws_iam_policy.deploy.policy, "123456789012")
error_message = "Policy should reference the correct account ID"
}
}
| Tool | Type | Language | Description |
|---|---|---|---|
| terraform test | Native | HCL | Built-in testing framework with plan/apply assertions and mock providers |
| Terratest | E2E | Go | Full integration testing: deploys real infrastructure, validates, destroys |
| tflint | Linter | Config | Pluggable linter that catches provider-specific errors (e.g., invalid instance types) |
| tfsec | Security | Config | Static analysis for security misconfigurations (now part of Trivy) |
| checkov | Compliance | Python | Policy-as-code scanner with 1000+ built-in checks for cloud security best practices |
| Sentinel | Policy | Sentinel | HashiCorp's enterprise policy framework, enforced in Terraform Cloud/Enterprise |
| OPA / Conftest | Policy | Rego | Open-source policy engine; evaluates terraform plan JSON against Rego rules |
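As a sketch of the OPA/Conftest flow: a Rego policy evaluates the JSON form of a saved plan. The package name, file path, and tag key below are illustrative, not fixed by Conftest:

```rego
# policy/deny_untagged.rego — hypothetical policy file
package main

# Deny any resource being created without an Environment tag
deny[msg] {
  rc := input.resource_changes[_]
  rc.change.actions[_] == "create"
  not rc.change.after.tags.Environment
  msg := sprintf("%s is missing the Environment tag", [rc.address])
}
```

Generate the input with terraform plan -out=tfplan and terraform show -json tfplan > tfplan.json, then run conftest test tfplan.json.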
Integrate testing into your pipeline with a layered approach: fast checks first, then progressively more expensive validations.
# Example: GitHub Actions workflow for Terraform CI
name: Terraform CI
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      # Layer 1: Format check (instant)
      - name: Check formatting
        run: terraform fmt -check -recursive
      # Layer 2: Validate syntax (fast, no creds)
      - name: Init & validate
        run: |
          terraform init -backend=false
          terraform validate -json
      # Layer 3: Lint (seconds)
      - name: Setup TFLint
        uses: terraform-linters/setup-tflint@v4
      - name: TFLint
        run: tflint --recursive
      # Layer 4: Security scan (seconds)
      - name: Trivy / tfsec
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: config
      # Layer 5: Native tests (seconds for mocked, minutes for apply)
      - name: Terraform test
        run: terraform test
      # Layer 6: Plan (requires credentials)
      - name: Plan
        run: terraform plan -out=tfplan -no-color
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
Even without a formal testing framework, terraform plan -detailed-exitcode is a powerful test: exit code 0 means no changes, 1 means error, and 2 means changes detected. Use this in CI to catch drift or unexpected modifications.
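That exit-code contract is easy to wrap in a small script. check_drift below is a hypothetical helper; it is exercised here against a stub command so the branching is visible without real infrastructure:

```shell
# Hypothetical CI helper around plan's -detailed-exitcode contract.
# In a real pipeline you would invoke:
#   check_drift terraform plan -detailed-exitcode
check_drift() {
  "$@" >/dev/null 2>&1
  case $? in
    0) echo "in-sync" ;;          # exit 0: no changes
    2) echo "drift" ;;            # exit 2: changes detected
    *) echo "error"; return 1 ;;  # exit 1 (or other): plan failed
  esac
}

# Stub standing in for terraform: pretend the plan detected changes
check_drift sh -c 'exit 2'
```

Running the stub prints "drift"; in CI you would fail the job (or open an alert) on that branch.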
Diagnosing and resolving common Terraform errors. Debug logging, state recovery, drift detection, and systematic approaches to the issues you will inevitably encounter.
terraform validate
Syntax errors, invalid references, type mismatches, missing required arguments. Caught before any API calls.
terraform state list
State lock conflicts, corrupt state files, state/reality drift, missing resources in state.
terraform providers
Authentication failures, API rate limits, permission denied, version incompatibilities.
terraform version
Dependency cycles, provider crashes, memory issues, Terraform version incompatibilities.
| Error | Cause | Solution |
|---|---|---|
| Resource already exists | Resource exists in the cloud but not in state | Import with terraform import or rename the resource |
| Error acquiring the state lock | Another process holds the lock, or a previous run crashed | terraform force-unlock <LOCK_ID> |
| Provider version constraints | Lock file conflicts with version constraints in config | Align constraints in required_providers, then run terraform init -upgrade |
| Cycle detected | Two or more resources reference each other | Refactor to break the cycle; use depends_on sparingly or introduce an intermediate resource |
| (known after apply) | Value depends on a resource that hasn't been created yet | Normal behavior, not an error; the value is resolved during apply |
| Error: No valid credential sources | Provider cannot find authentication credentials | Check env vars, shared credentials file, IAM role, or SSO configuration |
| AccessDeniedException | Authenticated but lacking required IAM permissions | Review and expand the IAM policy attached to the Terraform execution role |
terraform force-unlock is a dangerous operation. Only use it when you are certain no other process is running Terraform against this state. If a colleague's apply is genuinely in progress, force-unlocking can corrupt your state. Always verify that the lock holder process has actually terminated before unlocking.
Terraform uses the TF_LOG environment variable to control log verbosity. Logs go to stderr by default. Set TF_LOG_PATH to write logs to a file for easier analysis.
# Set log level (most verbose to least)
export TF_LOG=TRACE # Everything — extremely verbose
export TF_LOG=DEBUG # Detailed internal operations
export TF_LOG=INFO # General operational messages
export TF_LOG=WARN # Warnings only
export TF_LOG=ERROR # Errors only
# Write logs to a file instead of stderr
export TF_LOG_PATH="./terraform-debug.log"
# Provider-specific logging
export TF_LOG_CORE=WARN # Core Terraform at WARN level
export TF_LOG_PROVIDER=TRACE # Provider plugins at TRACE level
# Run with debug logging for a single command
TF_LOG=DEBUG terraform plan
# Disable logging
unset TF_LOG
unset TF_LOG_PATH
When troubleshooting API issues, set TF_LOG_PROVIDER=TRACE while keeping TF_LOG_CORE=WARN. This shows the raw HTTP requests and responses from the provider without flooding your logs with Terraform's internal graph operations.
If your state file becomes corrupted or lost, you have several recovery options depending on your backend configuration.
# Pull the current state from the backend to a local file
terraform state pull > terraform.tfstate.backup
# Push a local state file to the backend (use with extreme caution)
terraform state push terraform.tfstate.backup
# S3 backend: enable versioning for automatic state backups
resource "aws_s3_bucket_versioning" "state" {
bucket = aws_s3_bucket.terraform_state.id
versioning_configuration {
status = "Enabled"
}
}
# Recover a previous state version from S3
aws s3api list-object-versions \
--bucket my-terraform-state \
--prefix prod/terraform.tfstate
aws s3api get-object \
--bucket my-terraform-state \
--key prod/terraform.tfstate \
--version-id "VERSION_ID" \
recovered.tfstate
Terraform automatically creates a terraform.tfstate.backup file before writing a new state locally. With remote backends like S3, enable bucket versioning to maintain a full history of state changes that you can restore from.
Drift occurs when real infrastructure diverges from the Terraform state — caused by manual changes, other tools, or external processes. Terraform detects drift during the refresh phase of every plan and apply.
# Detect drift: plan compares state with reality
terraform plan
# Sync state with reality without changing infrastructure
# (updates state to match what actually exists)
terraform apply -refresh-only
# Import an unmanaged resource into state
terraform import aws_instance.web i-0abc123def456789
# Import block (TF 1.5+) — declarative import
import {
to = aws_instance.web
id = "i-0abc123def456789"
}
# Generate configuration for imported resources (TF 1.5+)
terraform plan -generate-config-out=generated.tf
terraform plan
Shows differences between desired configuration, state, and actual infrastructure. Any drift appears as planned changes.
terraform apply -refresh-only
Updates state to match reality without applying config changes. Use when you accept the manual change.
terraform import TYPE.NAME ID
Bring an existing resource under Terraform management. You must also write the corresponding resource block.
terraform apply -replace=TYPE.NAME
Destroy and recreate a specific resource. Useful when a resource is in a bad state that in-place updates cannot fix.
Battle-tested conventions for organizing, securing, documenting, and scaling Terraform projects. The patterns that separate quick experiments from production-grade infrastructure code.
Simple layout — for small projects or single environments:
project/
  main.tf           # Resources
  variables.tf      # Input variable declarations
  outputs.tf        # Output declarations
  terraform.tf      # Provider & backend config
  terraform.tfvars  # Variable values (not committed)
Modular layout — for reusable components:
project/
  modules/
    networking/
      main.tf
      variables.tf
      outputs.tf
    compute/
      main.tf
      variables.tf
      outputs.tf
  environments/
    dev/
      main.tf       # Calls modules with dev settings
      terraform.tf
      dev.tfvars
    staging/
      main.tf
      terraform.tf
      staging.tfvars
    production/
      main.tf
      terraform.tf
      prod.tfvars
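Under this layout, each environment directory is its own working directory; a typical invocation (paths assumed from the tree above) is:

```shell
cd environments/dev
terraform init
terraform plan -var-file=dev.tfvars
```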
Component-based layout — for large organizations with independent teams:
infrastructure/
  components/
    networking/     # Separate state per component
      main.tf
      backend.tf
    database/
      main.tf
      backend.tf
    application/
      main.tf
      backend.tf    # References networking/database via remote state
  modules/          # Shared modules used by components
    vpc/
    rds/
    ecs/
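The cross-component reference noted above is typically done with the terraform_remote_state data source. A sketch, where the bucket, key, and output names are assumptions for illustration:

```hcl
# In application/main.tf — read outputs from the networking component's state
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"           # assumed state bucket
    key    = "networking/terraform.tfstate" # assumed state key
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"
  # Consumes an output the networking component must export
  subnet_id     = data.terraform_remote_state.networking.outputs.private_subnet_id
}
```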
| Element | Convention | Example |
|---|---|---|
| Resources | Use underscores, descriptive nouns | aws_instance.web_server |
| Variables | Descriptive, snake_case | var.instance_count |
| Outputs | Describe the value being exported | output "vpc_id" |
| Modules | Short, noun-based names | module "networking" |
| Locals | Computed names, prefixed composites | local.name_prefix |
| Data sources | Describe what you're looking up | data.aws_ami.ubuntu_latest |
| Files | Functional grouping | networking.tf, iam.tf, outputs.tf |
When a module has only one resource of a given type, name it "this" or "main" rather than repeating the type name. For example: aws_vpc.this instead of aws_vpc.vpc. This is the convention used by the official Terraform AWS modules.
Consistent tagging is essential for cost tracking, access control, and operational management. Define standard tags in the provider's default_tags block and supplement per-resource.
# Provider-level default tags (applied to all resources)
provider "aws" {
region = var.region
default_tags {
tags = {
Environment = var.environment
Project = var.project_name
ManagedBy = "terraform"
Owner = var.team_name
CostCenter = var.cost_center
}
}
}
# Per-resource tags (merged with default_tags)
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = var.instance_type
tags = {
Name = "${var.project_name}-web-${var.environment}"
Role = "webserver"
}
}
# Terraform .gitignore
# Local .terraform directories
**/.terraform/*
# .tfstate files (state should live in remote backend)
*.tfstate
*.tfstate.*
# Crash log files
crash.log
crash.*.log
# Variable files that may contain secrets
*.tfvars
*.tfvars.json
# Override files (used for local dev overrides)
override.tf
override.tf.json
*_override.tf
*_override.tf.json
# CLI configuration files
.terraformrc
terraform.rc
# KEEP the lock file committed
# !.terraform.lock.hcl
The .terraform.lock.hcl file records the exact provider versions and hashes used. Always commit this file to version control. It ensures every team member and CI pipeline uses identical provider binaries, preventing subtle inconsistencies.
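If teammates or CI runners use different operating systems, record provider hashes for each platform so the same lock file verifies everywhere:

```shell
# Record provider hashes for every platform that runs this configuration
terraform providers lock \
  -platform=linux_amd64 \
  -platform=darwin_arm64 \
  -platform=windows_amd64
```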
# Format all files in current directory (recursive)
terraform fmt -recursive
# Check formatting without modifying (for CI)
terraform fmt -check -recursive -diff
# Pre-commit hook configuration (.pre-commit-config.yaml)
repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
rev: v1.96.1
hooks:
- id: terraform_fmt
- id: terraform_validate
- id: terraform_tflint
- id: terraform_docs
- id: terraform_trivy
variable "db_password" { sensitive = true }
Use Vault, AWS SSM Parameter Store, environment variables, or .tfvars files excluded from version control. Never put secrets in .tf files.
output "password" { sensitive = true }
Prevents values from appearing in CLI output or logs. Downstream modules still receive the value but it is redacted in plan output.
encrypt = true # in backend config
State files contain sensitive data (passwords, keys). Use encrypted backends (S3 with SSE, GCS with CMEK) and restrict access with IAM.
iam_role_arn = "arn:aws:iam::..."
Give Terraform only the permissions it needs. Use separate roles for plan (read-only) and apply (write). Audit with CloudTrail.
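One way to wire separate plan and apply roles is an assume_role block whose ARN is supplied per pipeline stage. The variable and session name below are illustrative:

```hcl
# Hypothetical: CI passes a different role per stage
# (read-only role for plan, write role for apply)
variable "execution_role_arn" {
  type = string
}

provider "aws" {
  region = var.region
  assume_role {
    role_arn     = var.execution_role_arn
    session_name = "terraform-ci"
  }
}
```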
# Infracost — cost breakdown from plan
infracost breakdown --path=.
# Infracost in CI — comment on PR with cost diff
infracost diff --path=. \
--compare-to=infracost-base.json \
--format=json > infracost-diff.json
infracost comment github \
--path=infracost-diff.json \
--repo=myorg/myrepo \
--pull-request=$PR_NUMBER
# terraform-docs — auto-generate module README
terraform-docs markdown table --output-file README.md .
# terraform-docs in pre-commit
repos:
- repo: https://github.com/terraform-docs/terraform-docs
rev: v0.19.0
hooks:
- id: terraform-docs-go
args: ["markdown", "table", "--output-file", "README.md"]
terraform {
# Pin Terraform itself to a minor version range
required_version = ">= 1.7.0, < 2.0.0"
required_providers {
aws = {
source = "hashicorp/aws"
# Pin provider to a minor version range
version = "~> 5.40" # Allows 5.40.x, 5.41.x, etc.
}
random = {
source = "hashicorp/random"
version = "~> 3.6"
}
}
}
# Version constraint operators:
# = 1.0.0 Exact version
# != 1.0.0 Exclude version
# > 1.0.0 Greater than
# >= 1.0.0 Greater than or equal
# < 2.0.0 Less than
# ~> 1.0 Pessimistic (allows 1.x, not 2.0)
# ~> 1.0.0 Pessimistic (allows 1.0.x, not 1.1.0)
| # | Practice | Why |
|---|---|---|
| 1 | Use remote state with locking | Prevents concurrent modifications and state corruption in teams |
| 2 | Pin provider and Terraform versions | Ensures reproducible builds; prevents surprise breaking changes |
| 3 | Commit the .terraform.lock.hcl file | Guarantees identical provider binaries across all environments |
| 4 | Use modules for reusable components | DRY principle; consistent infrastructure patterns across projects |
| 5 | Never hardcode secrets | Secrets in code end up in state files, logs, and version control |
| 6 | Use for_each over count | Map-keyed resources survive reordering; count causes index-shift cascades |
| 7 | Always review the plan before apply | The plan is your safety net — never skip it, especially in production |
| 8 | Tag everything | Essential for cost allocation, ownership tracking, and automated operations |
| 9 | Keep state blast radius small | Split large monoliths into components; one bad apply shouldn't risk everything |
| 10 | Automate formatting and validation in CI | Catch issues early; enforce consistency without manual review burden |
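Practice 6 is easiest to see side by side. The bucket names below are illustrative:

```hcl
variable "buckets" {
  type    = list(string)
  default = ["alpha", "beta", "gamma"]
}

# count: addresses are positional (count_based[0], [1], [2]).
# Removing "beta" shifts "gamma" from index 2 to index 1, forcing a
# destroy/recreate even though "gamma" itself did not change.
resource "aws_s3_bucket" "count_based" {
  count  = length(var.buckets)
  bucket = "example-${var.buckets[count.index]}"
}

# for_each: addresses are keyed by value (keyed["alpha"], ...).
# Removing "beta" destroys only keyed["beta"]; the rest are untouched.
resource "aws_s3_bucket" "keyed" {
  for_each = toset(var.buckets)
  bucket   = "example-${each.key}"
}
```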
Don't over-engineer your Terraform setup from day one. Start with a flat file structure, add modules when you see duplication, split state when the blast radius grows too large, and adopt policy tools when your team and compliance requirements demand it. Every layer of abstraction has a cost.