Multi-Region Active-Active Architecture

Advanced

Design and deploy a highly available, multi-region active-active application using Route 53, Aurora Global Database, and DynamoDB Global Tables.

Overview

This project covers one of the most demanding architectural challenges in AWS: running a production workload that is fully active in two or more AWS regions simultaneously. Both regions accept live traffic, and data is replicated bi-directionally with sub-second latency.

What You Will Learn

  - Latency-based routing and health-check-driven failover with Route 53
  - Bi-directional replication and conflict handling with DynamoDB Global Tables
  - Primary/secondary topology and read routing with Aurora Global Database
  - Running an identical application stack in two regions, with cross-region observability

Prerequisites

  - An AWS account with permissions to create the services listed above
  - Terraform and the AWS CLI installed and configured (both are used in the steps)
  - Working knowledge of VPCs, DNS, and container deployments

Architecture

                        ┌─────────────────────────┐
                        │       Route 53           │
                        │  Latency-based routing   │
                        │  + Health checks         │
                        └────────┬────────┬────────┘
                                 │        │
               ┌─────────────────▼─┐   ┌──▼─────────────────┐
               │   us-east-1       │   │   eu-west-1        │
               │  ┌─────────────┐  │   │  ┌─────────────┐   │
               │  │     ALB     │  │   │  │     ALB     │   │
               │  └──────┬──────┘  │   │  └──────┬──────┘   │
               │  ┌──────▼──────┐  │   │  ┌──────▼──────┐   │
               │  │  ECS/EKS    │  │   │  │  ECS/EKS    │   │
               │  │  App Layer  │  │   │  │  App Layer  │   │
               │  └──────┬──────┘  │   │  └──────┬──────┘   │
               │  ┌──────▼──────┐  │   │  ┌──────▼──────┐   │
               │  │  DynamoDB   │◄─┼───┼─►│  DynamoDB   │   │
               │  │  Global     │  │   │  │  Global     │   │
               │  │  Table      │  │   │  │  Table      │   │
               └──┴─────────────┴──┘   └──┴─────────────┴───┘

Steps

1. Set Up the VPCs in Each Region

Use the same subnet layout in both regions to keep things predictable. The example reuses 10.0.0.0/16 in both regions; choose non-overlapping ranges (for example 10.0.0.0/16 and 10.1.0.0/16) if you plan to peer the VPCs:

# Terraform example - repeat for eu-west-1
resource "aws_vpc" "main" {
  provider   = aws.us_east_1
  cidr_block = "10.0.0.0/16"
  tags = { Name = "main-us-east-1" }
}

resource "aws_subnet" "private" {
  for_each          = toset(["us-east-1a", "us-east-1b", "us-east-1c"])
  provider          = aws.us_east_1
  vpc_id            = aws_vpc.main.id
  availability_zone = each.value
  cidr_block        = cidrsubnet("10.0.0.0/16", 8, index(["us-east-1a", "us-east-1b", "us-east-1c"], each.value))
}
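
To sanity-check what cidrsubnet("10.0.0.0/16", 8, i) produces, the same carve-up can be reproduced with Python's standard ipaddress module (carve_subnets is an illustrative helper, not part of the deployment):

```python
import ipaddress

def carve_subnets(vpc_cidr, newbits, count):
    """Mimic Terraform's cidrsubnet(vpc_cidr, newbits, i) for i = 0..count-1."""
    gen = ipaddress.ip_network(vpc_cidr).subnets(prefixlen_diff=newbits)
    return [str(next(gen)) for _ in range(count)]

azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
cidrs = carve_subnets("10.0.0.0/16", 8, len(azs))
print(dict(zip(azs, cidrs)))
# → {'us-east-1a': '10.0.0.0/24', 'us-east-1b': '10.0.1.0/24', 'us-east-1c': '10.0.2.0/24'}
```

Each AZ gets its own /24, so the layout is identical in both regions apart from the AZ names.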

2. Deploy DynamoDB Global Tables

  1. Create the table in us-east-1:
aws dynamodb create-table \
  --table-name Orders \
  --attribute-definitions AttributeName=orderId,AttributeType=S \
  --key-schema AttributeName=orderId,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region us-east-1
  2. Add eu-west-1 as a replica:
aws dynamodb update-table \
  --table-name Orders \
  --replica-updates '[{"Create": {"RegionName": "eu-west-1"}}]' \
  --region us-east-1

Writes in either region replicate asynchronously to the other, typically within one second.

Conflict resolution: DynamoDB uses last-writer-wins (LWW) based on wall-clock time. Design your application to avoid conflicting concurrent writes to the same item, or use a version attribute with conditional expressions.
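
The version-attribute pattern mentioned above is plain optimistic locking. Here is a minimal sketch of the idea using an in-memory dict in place of the table; in real code the same check would be a boto3 put_item with a ConditionExpression such as "version = :expected":

```python
class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""

table = {}  # in-memory stand-in for the Orders table: orderId -> item

def put_versioned(order_id, data, expected_version):
    """Write only if the stored version matches what the caller last read."""
    current = table.get(order_id, {"version": 0})
    if current["version"] != expected_version:
        raise ConditionalCheckFailed(
            f"expected v{expected_version}, found v{current['version']}")
    table[order_id] = {**data, "version": expected_version + 1}

put_versioned("o-1", {"status": "placed"}, expected_version=0)   # first write
put_versioned("o-1", {"status": "shipped"}, expected_version=1)  # normal update
try:
    put_versioned("o-1", {"status": "cancelled"}, expected_version=1)  # stale read
except ConditionalCheckFailed:
    print("stale write rejected")
```

Note that a condition expression is evaluated only in the region that receives the write; during the replication window two regions can still accept conflicting writes, so where possible route all writes for a given item to one home region.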

3. Set Up Aurora Global Database

  1. Create the global cluster and its primary DB cluster in us-east-1:
aws rds create-global-cluster \
  --global-cluster-identifier my-global-db \
  --engine aurora-mysql \
  --engine-version 8.0.mysql_aurora.3.04.0 \
  --deletion-protection

aws rds create-db-cluster \
  --db-cluster-identifier my-db-us-east-1 \
  --engine aurora-mysql \
  --engine-version 8.0.mysql_aurora.3.04.0 \
  --master-username admin \
  --master-user-password 'CHANGE_ME' \
  --global-cluster-identifier my-global-db \
  --region us-east-1
  2. Add a read-only secondary cluster in eu-west-1 (no master credentials; it inherits them from the primary):
aws rds create-db-cluster \
  --db-cluster-identifier my-db-eu-west-1 \
  --engine aurora-mysql \
  --engine-version 8.0.mysql_aurora.3.04.0 \
  --global-cluster-identifier my-global-db \
  --region eu-west-1

Each cluster also needs at least one DB instance (aws rds create-db-instance) before it can serve traffic.

Aurora Global Database provides a single write endpoint (primary) with read replicas in secondary regions. Use it for workloads requiring strong consistency — route all writes to the primary and reads to the nearest region.
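
That split (writes to the primary, reads local) can live in a small endpoint selector in the application layer. A sketch with hypothetical endpoint names; the real values come from your cluster outputs:

```python
# Hypothetical endpoints; substitute the actual Aurora cluster endpoints.
WRITER_ENDPOINT = "my-global-db.cluster-abc123.us-east-1.rds.amazonaws.com"
READER_ENDPOINTS = {
    "us-east-1": "my-db-us-east-1.cluster-ro-abc123.us-east-1.rds.amazonaws.com",
    "eu-west-1": "my-db-eu-west-1.cluster-ro-def456.eu-west-1.rds.amazonaws.com",
}

def pick_endpoint(local_region, is_write):
    """All writes cross to the primary; reads stay in the caller's region."""
    if is_write:
        return WRITER_ENDPOINT
    return READER_ENDPOINTS.get(local_region, WRITER_ENDPOINT)

print(pick_endpoint("eu-west-1", is_write=False))  # local read replica
print(pick_endpoint("eu-west-1", is_write=True))   # cross-region write to primary
```

EU writes pay one transatlantic round trip; that is the price of keeping strongly consistent data behind a single writer.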

4. Deploy Application Layer in Both Regions

Use ECS Fargate or EKS with the same container image:

# ECS Task Definition (simplified)
containerDefinitions:
  - name: api
    image: 123456789.dkr.ecr.us-east-1.amazonaws.com/api:latest
    environment:
      - name: DYNAMO_TABLE
        value: Orders
      - name: AWS_REGION
        value: us-east-1 # Override per region at deploy time
    portMappings:
      - containerPort: 3000
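
To avoid drift between regions, one option is to keep a single task-definition template and stamp in the region at deploy time. A sketch (the image URL and table name mirror the example above; render_task is illustrative, not a real ECS tool):

```python
import json

# Template mirroring the task definition above, with a {region} placeholder.
TEMPLATE = {
    "name": "api",
    "image": "123456789.dkr.ecr.{region}.amazonaws.com/api:latest",
    "environment": [
        {"name": "DYNAMO_TABLE", "value": "Orders"},
        {"name": "AWS_REGION", "value": "{region}"},
    ],
    "portMappings": [{"containerPort": 3000}],
}

def render_task(region):
    """Substitute the deploy region into every string field of the template."""
    return json.loads(json.dumps(TEMPLATE).replace("{region}", region))

print(render_task("eu-west-1")["image"])
# → 123456789.dkr.ecr.eu-west-1.amazonaws.com/api:latest
```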

5. Configure Route 53 for Global Traffic Routing

# Create health checks for each regional ALB
aws route53 create-health-check \
  --caller-reference "us-east-1-$(date +%s)" \
  --health-check-config '{
    "Type": "HTTPS",
    "FullyQualifiedDomainName": "alb.us-east-1.example.com",
    "ResourcePath": "/health",
    "RequestInterval": 10,
    "FailureThreshold": 2
  }'

# Create latency records pointing to each ALB
# (Repeat with SetIdentifier "eu-west-1" for the European endpoint)
aws route53 change-resource-record-sets \
  --hosted-zone-id YOUR_ZONE_ID \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.example.com",
        "Type": "A",
        "SetIdentifier": "us-east-1",
        "Region": "us-east-1",
        "HealthCheckId": "HEALTH_CHECK_ID",
        "AliasTarget": {
          "HostedZoneId": "ALB_HOSTED_ZONE_ID",
          "DNSName": "alb.us-east-1.example.com",
          "EvaluateTargetHealth": true
        }
      }
    }]
  }'
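
Conceptually, latency-based routing plus health checks means: answer with the lowest-latency record among the healthy ones. A toy model of that decision (illustrative only; Route 53's latency measurements are internal to AWS):

```python
def resolve(records):
    """records: list of {'region', 'latency_ms', 'healthy'} dicts."""
    healthy = [r for r in records if r["healthy"]]
    if not healthy:
        return None  # simplification: real Route 53 "fails open" and serves all records
    return min(healthy, key=lambda r: r["latency_ms"])["region"]

records = [
    {"region": "us-east-1", "latency_ms": 12, "healthy": True},
    {"region": "eu-west-1", "latency_ms": 85, "healthy": True},
]
print(resolve(records))        # → us-east-1 (closest healthy region)
records[0]["healthy"] = False  # us-east-1 health check goes red
print(resolve(records))        # → eu-west-1 (traffic shifts automatically)
```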

6. Observability and Failover Testing

Set up CloudWatch cross-region dashboards and alarms:

# Push a cross-region metric alarm to SNS
aws cloudwatch put-metric-alarm \
  --alarm-name "us-east-1-error-rate" \
  --metric-name 5XXError \
  --namespace AWS/ApplicationELB \
  --statistic Sum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 50 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:ACCOUNT:ops-alert

To test failover: deliberately fail the health check endpoint in one region and verify that Route 53 shifts traffic to the healthy region. Expect the cutover to take roughly the health check detection time (request interval times failure threshold) plus the record TTL.
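
A rough upper bound on client-observed failover time follows from the health-check settings in step 5 plus your record TTL (the 60-second TTL here is an assumption; use your own value):

```python
def worst_case_failover_s(request_interval_s, failure_threshold, dns_ttl_s):
    """Detection time (interval x consecutive failures) plus resolver cache expiry.
    A simplification: Route 53 checkers actually probe from many locations in parallel."""
    return request_interval_s * failure_threshold + dns_ttl_s

# RequestInterval=10 and FailureThreshold=2 match the health check created in step 5.
print(worst_case_failover_s(10, 2, 60))  # → 80 (seconds, worst case)
```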

Cost Considerations

Multi-region active-active is expensive. Key cost drivers:

  - Cross-region data transfer for DynamoDB and Aurora replication
  - DynamoDB replicated write capacity in every replica region
  - A full duplicate compute stack (ALB, ECS/EKS) per region
  - Route 53 health checks and DNS queries

Use AWS Pricing Calculator to model costs before committing.
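
As a back-of-envelope example for the replication traffic alone (both numbers are assumptions; the per-GB inter-region rate varies by region pair, so verify it in the calculator):

```python
gb_per_month = 500      # assumed volume of replicated data
rate_usd_per_gb = 0.02  # assumed inter-region transfer rate; verify current pricing

monthly_transfer_cost = gb_per_month * rate_usd_per_gb
print(f"${monthly_transfer_cost:.2f}/month")  # → $10.00/month
```

This models only the data-transfer line item; the duplicated compute stack and replicated writes come on top.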

Clean Up

  1. Remove Route 53 records.
  2. Delete ECS services and clusters in both regions.
  3. Remove Aurora Global Database secondary, then primary.
  4. Delete DynamoDB Global Table replicas, then the table.
  5. Delete VPCs and associated resources.

Going Further