Terraform on AWS: Infrastructure as Code Guide
Terraform manages AWS infrastructure as code: reproducible, version-controlled, and reviewable. It is the infrastructure layer that makes solo engineering of multiple products tractable. Without it, I'd be clicking through the AWS console, forgetting what I configured, and unable to reproduce environments. Here's the practical Terraform setup I use across six products on AWS.
Why Terraform over AWS CDK / Pulumi / CloudFormation
| Tool | Language | State | My take |
|------|----------|-------|---------|
| Terraform | HCL | Remote (S3) | Declarative, provider-agnostic, largest ecosystem |
| AWS CDK | TypeScript/Python | CloudFormation | Good for AWS-only, more code than config |
| Pulumi | TypeScript/Python | Remote | Full programming language, more power than needed |
| CloudFormation | YAML/JSON | AWS | Verbose, AWS-only, no good local development |
I use Terraform because HCL is readable, the ecosystem is massive, and it works across AWS + Cloudflare + GitHub from one tool.
Remote state with S3 + DynamoDB
# backend.tf
terraform {
  backend "s3" {
    bucket         = "shahriar-terraform-state"
    key            = "letx/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
# One-time setup: create state bucket and lock table
aws s3 mb s3://shahriar-terraform-state --region ap-south-1

aws s3api put-bucket-versioning \
  --bucket shahriar-terraform-state \
  --versioning-configuration Status=Enabled

aws dynamodb create-table \
  --table-name terraform-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region ap-south-1
S3 backend: state is stored remotely (shareable, not on local disk). DynamoDB: prevents concurrent applies via lock table. Encryption: state may contain sensitive values.
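The same one-time bootstrap can also be written in HCL instead of CLI calls. The sketch below is an assumption about how that might look, not the setup described above: it would live in its own small configuration with local state (the bucket cannot hold its own state before it exists), and the resource labels `state` and `locks` are hypothetical. It also adds a public access block, which the CLI commands above do not set.

```hcl
# bootstrap/main.tf (hypothetical location, applied once with local state)
provider "aws" {
  region = "ap-south-1"
}

resource "aws_s3_bucket" "state" {
  bucket = "shahriar-terraform-state"
}

# Versioning keeps earlier copies of the state file for recovery.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Assumption: block all public access to the state bucket.
resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Lock table; the S3 backend expects a string hash key named LockID.
resource "aws_dynamodb_table" "locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```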
Module structure
infrastructure/
├── modules/
│   ├── ecs-service/     # Reusable ECS Fargate service
│   ├── rds-postgres/    # PostgreSQL instance
│   ├── s3-bucket/       # S3 + CloudFront
│   └── vpc/             # VPC, subnets, security groups
├── environments/
│   ├── prod/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   └── staging/
│       ├── main.tf
│       └── terraform.tfvars
└── shared/
    ├── ecr.tf           # ECR repositories
    └── iam.tf           # IAM roles
Modules encapsulate reusable infrastructure. Each environment (prod, staging) uses the same modules with different variable values.
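As a rough illustration of "same modules, different values", an environment's variables.tf declares the inputs and terraform.tfvars supplies the non-secret ones. The variable names `image_tag` and `jwt_secret` match the ones referenced in the prod configuration later in this guide; the descriptions and tag values are placeholders, not the real settings.

```hcl
# environments/prod/variables.tf
variable "image_tag" {
  type        = string
  description = "Container image tag to deploy"
}

variable "jwt_secret" {
  type      = string
  sensitive = true # redacted in plan output; supplied via TF_VAR_jwt_secret
}

# environments/prod/terraform.tfvars would then contain (placeholder value):
#   image_tag = "v1.4.2"
#
# environments/staging/terraform.tfvars would contain (placeholder value):
#   image_tag = "latest"
```

Only non-secret values belong in the tfvars files; the secret comes from the environment at plan and apply time.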
The ECS service module
The most-used module — every service deploys via it:
# modules/ecs-service/main.tf
# The IAM roles, security group, target group, and data sources referenced
# below are defined elsewhere in the module.
variable "name" { type = string }
variable "image" { type = string }

# HCL does not allow ";" between arguments, so blocks with more than one
# argument are written across multiple lines.
variable "cpu" {
  type    = number
  default = 256
}
variable "memory" {
  type    = number
  default = 512
}
variable "port" {
  type    = number
  default = 8080
}
variable "desired_count" {
  type    = number
  default = 2
}
variable "environment" {
  type    = map(string)
  default = {}
}
variable "health_check_path" {
  type    = string
  default = "/health"
}

resource "aws_ecs_task_definition" "this" {
  family                   = var.name
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.cpu
  memory                   = var.memory
  execution_role_arn       = aws_iam_role.execution.arn
  task_role_arn            = aws_iam_role.task.arn

  container_definitions = jsonencode([{
    name         = var.name
    image        = var.image
    portMappings = [{ containerPort = var.port, protocol = "tcp" }]
    # Convert the map into the list of { name, value } objects ECS expects
    environment = [for k, v in var.environment : { name = k, value = v }]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = "/ecs/${var.name}"
        "awslogs-region"        = data.aws_region.current.name
        "awslogs-stream-prefix" = "ecs"
      }
    }
  }])
}

resource "aws_ecs_service" "this" {
  name            = var.name
  cluster         = data.aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = data.aws_subnets.private.ids
    security_groups  = [aws_security_group.service.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.this.arn
    container_name   = var.name
    container_port   = var.port
  }

  # Rolling deploy: start new tasks before stopping old ones
  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent         = 200
}
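The module above references a CloudWatch log group path and a load balancer target group that are not shown. A minimal sketch of those two pieces might look like the following, assuming a `data.aws_vpc.main` lookup exists alongside the module's other data sources (that name, the HTTP protocol, and the 30-day retention are assumptions). It also shows where `var.health_check_path` would plausibly be consumed.

```hcl
# Log group matching the "awslogs-group" option in the task definition.
# The awslogs driver does not create the group unless explicitly told to.
resource "aws_cloudwatch_log_group" "this" {
  name              = "/ecs/${var.name}"
  retention_in_days = 30 # assumption
}

# Fargate tasks in awsvpc mode register by IP, so target_type must be "ip".
resource "aws_lb_target_group" "this" {
  name        = var.name
  port        = var.port
  protocol    = "HTTP"
  target_type = "ip"
  vpc_id      = data.aws_vpc.main.id # hypothetical data source

  health_check {
    path    = var.health_check_path
    matcher = "200"
  }
}
```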
Using this module:
# environments/prod/main.tf
module "letx_api" {
  source = "../../modules/ecs-service"

  name          = "letx-api"
  image         = "${module.ecr.letx_api_url}:${var.image_tag}"
  cpu           = 512
  memory        = 1024
  desired_count = 2

  environment = {
    DATABASE_URL = module.rds.connection_string
    REDIS_URL    = module.elasticache.connection_string
    # Plaintext here for brevity; the Secrets management section shows the safer route
    JWT_SECRET = var.jwt_secret
  }
}
Adding a new service to production takes about a dozen lines of HCL.
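For comparison, a staging counterpart could reuse the same module with smaller values. This is a sketch only: the `-staging` name suffix and the task count are assumptions, and `cpu`, `memory`, and `port` are left out so they fall back to the module defaults (256, 512, 8080).

```hcl
# environments/staging/main.tf (illustrative)
module "letx_api" {
  source = "../../modules/ecs-service"

  name          = "letx-api-staging" # hypothetical name
  image         = "${module.ecr.letx_api_url}:${var.image_tag}"
  desired_count = 1 # overrides the module default of 2

  environment = {
    DATABASE_URL = module.rds.connection_string
    REDIS_URL    = module.elasticache.connection_string
  }
}
```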
Secrets management
Never put secrets in terraform.tfvars or variables.tf as plaintext. Use AWS Secrets Manager:
# Store secret
resource "aws_secretsmanager_secret" "jwt_secret" {
  name = "letx/prod/jwt_secret"
}

resource "aws_secretsmanager_secret_version" "jwt_secret" {
  secret_id = aws_secretsmanager_secret.jwt_secret.id
  # From the TF_VAR_jwt_secret env var. The value is still written to
  # Terraform state, which is why the backend is encrypted.
  secret_string = var.jwt_secret
}

# Pass secret to ECS task via secrets (not environment variables).
# Fragment of the task definition; other container fields omitted.
container_definitions = jsonencode([{
  secrets = [{
    name      = "JWT_SECRET"
    valueFrom = aws_secretsmanager_secret.jwt_secret.arn
  }]
}])
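ECS Fargate injects secrets from Secrets Manager at task start, so the task definition holds only the secret's ARN and never the plaintext value. For the injection to work, the task execution role needs secretsmanager:GetSecretValue permission on that secret.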
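ECS fetches the secret using the task execution role, so that role has to be allowed to read it. A minimal sketch of the grant, assuming the execution role is the `aws_iam_role.execution` referenced in the ECS module (the policy name and resource labels here are hypothetical):

```hcl
# Allow reading only this one secret, not every secret in the account.
data "aws_iam_policy_document" "read_jwt_secret" {
  statement {
    effect    = "Allow"
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [aws_secretsmanager_secret.jwt_secret.arn]
  }
}

# Inline policy attached to the execution role used at task start.
resource "aws_iam_role_policy" "execution_read_jwt_secret" {
  name   = "read-jwt-secret" # hypothetical name
  role   = aws_iam_role.execution.id
  policy = data.aws_iam_policy_document.read_jwt_secret.json
}
```

Scoping `resources` to the single ARN keeps the execution role from reading unrelated secrets.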
CI/CD: apply on merge
# .github/workflows/terraform.yml
on:
  push:
    branches: [main]
    paths: ["infrastructure/**"]

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.9.0"
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ap-south-1
      - working-directory: infrastructure/environments/prod
        run: |
          terraform init
          terraform plan -out=tfplan
          terraform apply tfplan
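The workflow runs terraform plan before apply and applies the saved plan file, so what gets applied is exactly what was planned. Even in CI, the plan is printed to the logs for review. Note that this workflow applies automatically on merge; for production infrastructure changes, I prefer a manual apply after reviewing the plan.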
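Since the workflow pins Terraform 1.9.0, it can help to state the same constraint in the configuration so that a local run with a different binary fails fast instead of silently diverging from CI. A sketch, where the file name and the AWS provider constraint are assumptions rather than the versions used here:

```hcl
# environments/prod/versions.tf (hypothetical file name)
terraform {
  required_version = "~> 1.9.0" # allows 1.9.x patch releases only

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumption; pin to whatever you have tested
    }
  }
}
```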
FAQ
Why use Terraform for a solo project?
Because you will forget how you set up your infrastructure. Terraform is documentation that also enforces itself. terraform plan shows exactly what will change before anything touches production.
Should I use workspaces or separate directories for environments?
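Separate directories (prod/, staging/) over workspaces. Workspaces keep separate state files but share one configuration directory and one backend, and the selected workspace is not visible in the code, so a terraform apply in the wrong workspace is an easy and dangerous mistake. Separate directories make the separation explicit.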
How do you handle Terraform state for multiple products?
Separate state keys in S3 per product/service: letx/terraform.tfstate, quantumsketch/terraform.tfstate. Same S3 bucket, same DynamoDB lock table, separate state files. This way, applying changes to LetX doesn't lock QuantumSketch's state.
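Concretely, the second product's backend block differs from the LetX one only in its `key`. The lock entry in DynamoDB is derived from the bucket and key path, which is why the two states lock independently. A sketch using the QuantumSketch key named above:

```hcl
# backend.tf for QuantumSketch: same bucket and lock table, different key
terraform {
  backend "s3" {
    bucket         = "shahriar-terraform-state"
    key            = "quantumsketch/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
```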
What's the difference between cpu in ECS task vs container definition?
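The task-level cpu and memory allocate the Fargate vCPU and RAM for the entire task, and they are what you pay for. Container-level values are optional on Fargate and only divide that allocation among the containers in the task: container cpu is a relative share under contention, and container memory is a hard limit for that one container. For a single-container task, set the values at the task level and leave the container-level ones out.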
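A two-container task makes the distinction easier to see. The sketch below is illustrative only: the family, container names, images, and the split between containers are all placeholders, not part of the setup described in this guide.

```hcl
resource "aws_ecs_task_definition" "example" {
  family                   = "example"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]

  # Task level: what Fargate provisions for the whole task (0.5 vCPU, 1 GB).
  cpu    = 512
  memory = 1024

  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "example/app:latest" # placeholder image
      essential = true
      cpu       = 384 # share of the task's 512 CPU units
      memory    = 768 # hard limit in MiB for this container
    },
    {
      name              = "sidecar"
      image             = "example/sidecar:latest" # placeholder image
      essential         = false
      cpu               = 128
      memoryReservation = 128 # soft limit in MiB
    }
  ])
}
```

The container-level values must fit inside the task-level totals; with a single container they can simply be omitted.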
Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. See also: Microservices as One Engineer · Deploy Always-On AI Agents on AWS for ~$17/mo.