The Multi-Region road - Amazon API Gateway

In this post, I will look at how the Amazon API Gateway HTTP API fits into a multi-region design.

This post is part of the series The Multi-Region road. You can check out the other parts:

  • Part 1 - a reflection on what to consider before starting a multi-region architecture.
  • Part 2 - CloudFront failover configuration.

Amazon API Gateway

For my use cases, Amazon API Gateway comes in two flavours:

  • REST
  • HTTP

The main differences are reported here. Also worth mentioning: the HTTP API is cheaper and, according to AWS, up to 60% faster than the REST API.

I have opted for the HTTP API. It is worth noting that it has no caching support; caching is only available for the REST API.

I do not think this is a problem, because I can cache the response in CloudFront, and if I need some cache key manipulation, I can always use CloudFront Functions.
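
As a rough illustration of caching at the CloudFront layer, a cache policy along these lines could be attached to the behaviour that fronts the API. This is only a sketch: the resource name, the policy name, the TTL values and the `page` query string are assumptions made for the example, not values taken from this series.

```yaml
AWSTemplateFormatVersion: 2010-09-09

Resources:
  # Hypothetical cache policy for the CloudFront behaviour in front of the HTTP API
  ApiCachePolicy:
    Type: AWS::CloudFront::CachePolicy
    Properties:
      CachePolicyConfig:
        Name: api-cache-policy      # assumed name
        MinTTL: 0                   # sec
        DefaultTTL: 60              # sec, assumed value
        MaxTTL: 300                 # sec, assumed value
        ParametersInCacheKeyAndForwardedToOrigin:
          EnableAcceptEncodingGzip: true
          EnableAcceptEncodingBrotli: true
          CookiesConfig:
            CookieBehavior: none    # cookies are not part of the cache key
          HeadersConfig:
            HeaderBehavior: none    # headers are not part of the cache key
          QueryStringsConfig:
            QueryStringBehavior: whitelist
            QueryStrings:
              - page                # assumed query string included in the cache key
```

The policy would then be referenced through `CachePolicyId` in the cache behaviour of the distribution.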

Multi-Region

The requirements for the multi-region are:

  • The application should have low latency.
  • The application should be able to withstand a region failure.
  • The application should scale.

All the above requirements need to be applied to this architecture:

multi-region1.jpeg

To achieve such a complex configuration, I must apply multiple Route 53 routing policies. AWS describes them as follows:

When you create a record, you choose a routing policy, which determines how Amazon Route 53 responds to queries:

  • Simple routing policy – Use a single resource that performs a given function for your domain, for example, a web server that serves content for the example.com website.
  • Failover routing policy – Use when you want to configure active-passive failover.
  • Geolocation routing policy – Use when you want to route traffic based on the location of your users.
  • Geoproximity routing policy – Use when you want to route traffic based on the location of your resources and, optionally, shift traffic from resources in one location to resources in another.
  • Latency routing policy – Use when you have resources in multiple AWS Regions, and you want to route traffic to the Region that provides the best latency with less round-trip time.
  • Multivalue answer routing policy – Use when you want Route 53 to respond to DNS queries with up to eight healthy records selected at random.
  • Weighted routing policy – Use to route traffic to multiple resources in proportions you specify.

I want to leverage:

  • Failover routing policy
  • Latency routing policy
  • Weighted routing policy (too much for a multi-account setup in the same region)
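
The rest of this post only shows templates for the failover and latency policies. For completeness, a minimal sketch of weighted records could look like the following. The `ApiFirst*` and `ApiSecond*` parameters and the 90/10 split are hypothetical and only illustrate the shape of the configuration.

```yaml
AWSTemplateFormatVersion: 2010-09-09

Parameters:
  Route53HostedZoneId:
    Type: String
  Route53DomainName:
    Type: String
  # Hypothetical parameters: two deployments of the same API
  ApiFirstHostedZoneId:
    Type: String
  ApiFirstDomainName:
    Type: String
  ApiSecondHostedZoneId:
    Type: String
  ApiSecondDomainName:
    Type: String

Resources:
  WeightedGroup:
    Type: AWS::Route53::RecordSetGroup
    Properties:
      HostedZoneId: !Ref Route53HostedZoneId
      RecordSets:
        - Name: !Ref Route53DomainName
          Type: A
          Weight: 90                # assumed share: 90 out of 100
          SetIdentifier: First
          AliasTarget:
            HostedZoneId: !Ref ApiFirstHostedZoneId
            DNSName: !Ref ApiFirstDomainName
            EvaluateTargetHealth: true
        - Name: !Ref Route53DomainName
          Type: A
          Weight: 10                # assumed share: 10 out of 100
          SetIdentifier: Second
          AliasTarget:
            HostedZoneId: !Ref ApiSecondHostedZoneId
            DNSName: !Ref ApiSecondDomainName
            EvaluateTargetHealth: true
```

Route 53 answers with each record in proportion to its weight relative to the sum of all weights for the same name and type.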

Failover routing policy

Use when you want to configure active-passive failover.

AWSTemplateFormatVersion: 2010-09-09

##########################################################################
#  Parameters                                                            #
##########################################################################
Parameters:
  Route53HostedZoneId:
    Type: String
  Route53DomainName:
    Type: String
  ApiPrimaryHostedZoneId:
    Type: String
  ApiPrimaryDomainName:
    Type: String
  ApiPrimaryEndpoint:
    Type: String
  ApiSecondaryHostedZoneId:
    Type: String
  ApiSecondaryDomainName:
    Type: String

Resources:
##########################################################################
#  Health Check                                                          #
##########################################################################
  HealthCheck:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        FullyQualifiedDomainName: !Ref ApiPrimaryEndpoint
        Port: 443
        RequestInterval: 10
        FailureThreshold: 3
        ResourcePath: /api
        Type: HTTPS
        Regions:
          - eu-west-1 #Ireland
          - ap-southeast-2 #Sydney
          - us-west-1 #California
      HealthCheckTags:
      - Key: Name
        Value: !Ref ApiPrimaryEndpoint

##########################################################################
#  Route53                                                               #
##########################################################################
  FailOverGroup:    
    Type: AWS::Route53::RecordSetGroup
    Properties:
      HostedZoneId: !Ref Route53HostedZoneId
      RecordSets: 
        - Name:
            !Ref Route53DomainName
          Type: A
          Failover: PRIMARY
          SetIdentifier: Primary
          HealthCheckId: !Ref HealthCheck
          AliasTarget:
            HostedZoneId: !Ref ApiPrimaryHostedZoneId
            DNSName: !Ref ApiPrimaryDomainName
            EvaluateTargetHealth: True
        - Name:
            !Ref Route53DomainName
          Type: A
          Failover: SECONDARY
          SetIdentifier: Secondary
          AliasTarget:
            HostedZoneId: !Ref ApiSecondaryHostedZoneId
            DNSName: !Ref ApiSecondaryDomainName
            EvaluateTargetHealth: False

In this example, I use two regions, but if I want to create a cascade failover like

DE -> IRL -> UK -> IT

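I must repeat the code and pass more parameters as input to the template. A failover configuration holds only one primary and one secondary record per name, so each extra hop needs its own pair of records, chained together through alias records.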
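
An untested sketch of the first hops of such a cascade (DE -> IRL -> UK) could look like this. It assumes a helper record name, passed as the hypothetical `FallbackDomainName` parameter, and health checks that already exist and are passed in by ID. None of these names come from the template above. Adding IT would mean one more helper name and one more primary/secondary pair.

```yaml
AWSTemplateFormatVersion: 2010-09-09

Parameters:
  Route53HostedZoneId:
    Type: String
  Route53DomainName:
    Type: String
  FallbackDomainName:        # hypothetical helper record in the same hosted zone
    Type: String
  ApiGermanyHostedZoneId:
    Type: String
  ApiGermanyDomainName:
    Type: String
  HealthCheckGermanyId:      # hypothetical: ID of an existing health check
    Type: String
  ApiIrelandHostedZoneId:
    Type: String
  ApiIrelandDomainName:
    Type: String
  HealthCheckIrelandId:      # hypothetical: ID of an existing health check
    Type: String
  ApiUkHostedZoneId:
    Type: String
  ApiUkDomainName:
    Type: String

Resources:
  CascadeFailOverGroup:
    Type: AWS::Route53::RecordSetGroup
    Properties:
      HostedZoneId: !Ref Route53HostedZoneId
      RecordSets:
        # Second hop: IRL -> UK, published under the helper name
        - Name: !Ref FallbackDomainName
          Type: A
          Failover: PRIMARY
          SetIdentifier: Ireland
          HealthCheckId: !Ref HealthCheckIrelandId
          AliasTarget:
            HostedZoneId: !Ref ApiIrelandHostedZoneId
            DNSName: !Ref ApiIrelandDomainName
            EvaluateTargetHealth: true
        - Name: !Ref FallbackDomainName
          Type: A
          Failover: SECONDARY
          SetIdentifier: UnitedKingdom
          AliasTarget:
            HostedZoneId: !Ref ApiUkHostedZoneId
            DNSName: !Ref ApiUkDomainName
            EvaluateTargetHealth: false
        # First hop: DE -> helper name
        - Name: !Ref Route53DomainName
          Type: A
          Failover: PRIMARY
          SetIdentifier: Germany
          HealthCheckId: !Ref HealthCheckGermanyId
          AliasTarget:
            HostedZoneId: !Ref ApiGermanyHostedZoneId
            DNSName: !Ref ApiGermanyDomainName
            EvaluateTargetHealth: true
        - Name: !Ref Route53DomainName
          Type: A
          Failover: SECONDARY
          SetIdentifier: Fallback
          AliasTarget:
            HostedZoneId: !Ref Route53HostedZoneId   # alias to a record in the same hosted zone
            DNSName: !Ref FallbackDomainName
            EvaluateTargetHealth: true
```

The helper name is only used for DNS resolution: clients still call `Route53DomainName`, so the custom domain in every region has to be configured for that public name.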

Latency routing policy

Use when you have resources in multiple AWS Regions, and you want to route traffic to the Region that provides the best latency with less round-trip time.

AWSTemplateFormatVersion: 2010-09-09

##########################################################################
#  Parameters                                                            #
##########################################################################
Parameters:
  Route53HostedZoneId:
    Type: String
  Route53DomainName:
    Type: String

  ApiGermanyHostedZoneId:
    Type: String
  ApiGermanyDomainName:
    Type: String
  ApiGermanyEndpoint:
    Type: String

  ApiIrelandHostedZoneId:
    Type: String
  ApiIrelandDomainName:
    Type: String
  ApiIrelandEndpoint:
    Type: String

Resources:
##########################################################################
#  Health Check                                                          #
##########################################################################
  HealthCheckGermany:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        FullyQualifiedDomainName: !Ref ApiGermanyEndpoint
        Port: 443
        RequestInterval: 10 #sec
        FailureThreshold: 3
        ResourcePath: /api
        Type: HTTPS
        Regions:
          - eu-west-1 #Ireland
          - ap-southeast-2 #Sydney
          - us-west-1 #California
      HealthCheckTags:
      - Key: Name
        Value: !Ref ApiGermanyEndpoint

  HealthCheckIreland:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        FullyQualifiedDomainName: !Ref ApiIrelandEndpoint
        Port: 443
        RequestInterval: 10 #sec
        FailureThreshold: 3
        ResourcePath: /api
        Type: HTTPS
        Regions:
          - eu-west-1 #Ireland
          - ap-southeast-2 #Sydney
          - us-west-1 #California
      HealthCheckTags:
      - Key: Name
        Value: !Ref ApiIrelandEndpoint


##########################################################################
#  Route53                                                               #
##########################################################################
  DNSARecordGroup:    
    Type: AWS::Route53::RecordSetGroup
    Properties:
      HostedZoneId: !Ref Route53HostedZoneId
      RecordSets: 
        - Name:
            !Ref Route53DomainName
          Type: A
          Region: eu-central-1
          SetIdentifier: Germany
          HealthCheckId: !Ref HealthCheckGermany
          AliasTarget:
            HostedZoneId: !Ref ApiGermanyHostedZoneId
            DNSName: !Ref ApiGermanyDomainName
            EvaluateTargetHealth: True
        - Name:
            !Ref Route53DomainName
          Type: A
          Region: eu-west-1
          SetIdentifier: Ireland
          HealthCheckId: !Ref HealthCheckIreland
          AliasTarget:
            HostedZoneId: !Ref ApiIrelandHostedZoneId
            DNSName: !Ref ApiIrelandDomainName
            EvaluateTargetHealth: True

I use two regions in this example, but if I want to use all the regions, I must repeat the record definition in the RecordSetGroup and create a health check for each endpoint.

Curiosity

I am forced to use a custom domain at the API level to make both policies work.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

##########################################################################
#  Parameters                                                            #
##########################################################################
Parameters:
  StageName:
    Description: The name of the stage is the first path segment in the Uniform Resource Identifier (URI) of a call to API Gateway
    Type: String
    Default: dev
  Domain:
    Type: String
  CertId:
    Type: String
  Stage:
    Type: String
    Default: $default

Resources:
##########################################################################
#  API Gateway Custom Domain                                             #
##########################################################################
  CustomDomainName:
    Type: AWS::ApiGatewayV2::DomainName
    Properties:
      DomainName: !Ref Domain
      DomainNameConfigurations:
        - EndpointType: REGIONAL
          CertificateArn: !Sub "arn:aws:acm:${AWS::Region}:${AWS::AccountId}:certificate/${CertId}"
          CertificateName: gatewayCertificate

  ApiMapping:
    Type: AWS::ApiGatewayV2::ApiMapping
    Properties:
      ApiId: 
        Fn::ImportValue: 
          !Sub ApiId-${AWS::Region}
      DomainName: !Ref CustomDomainName
      Stage: !Ref Stage

Outputs:
  RegionalHostedZoneId:
    Description: The regional hosted zone id of the custom domain
    Value: !GetAtt CustomDomainName.RegionalHostedZoneId
    Export:
      Name: !Sub RegionalHostedZoneId-${AWS::Region}

  RegionalDomainName:
    Description: The regional domain name of the custom domain
    Value: !GetAtt CustomDomainName.RegionalDomainName
    Export:
      Name: !Sub RegionalDomainName-${AWS::Region}

Custom domain names are more intuitive URLs to provide to users than a random combination of letters. In this setup, Route 53 uses the custom domain for the routing policy, while the health check probes the API endpoint. This can be misleading if you try to simulate an error, to trigger the failover or the next-best-latency region, by removing the custom domain or its API mapping: the health check targets the endpoint rather than the custom domain, so it keeps reporting healthy.

Conclusion

As you know, it is not possible to have multiple routing policies on the same record. I tried it anyway, in case my memory was playing tricks on me, and the error is:

(InvalidChangeBatch 400: RRSet with DNS name api.xxxxx.com., type A, SetIdentifier Italy, and Region Name=eu-south-1 cannot be created because a non-latency RRSet with the same name and type already exists.)

The health checks are the key for both routing policies, because Route 53 routes requests to the next endpoint only when the health check status is unhealthy. The AWS documentation page How health checks work in complex Amazon Route 53 configurations covers a few cases, but its examples use instances.

Although my setup is serverless, the concept is the same. To simulate the failure of one endpoint, I need to make the Lambda function return a 500 error status rather than, for example, removing the API mapping from the custom domain.
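
One way to sketch this is a small function behind the `/api` path probed by the health checks, with a switch that forces a 500 response. The `ForceFailure` parameter, the `HealthFunction` resource, the inline handler and the runtime version are all assumptions for illustration. They are not part of the templates above, and the sketch lets SAM create its own implicit HTTP API instead of importing the existing one.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Parameters:
  ForceFailure:              # hypothetical switch to simulate an unhealthy region
    Type: String
    AllowedValues: ['true', 'false']
    Default: 'false'

Resources:
  HealthFunction:            # hypothetical function behind the health check path
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      InlineCode: |
        import os

        def handler(event, context):
            # Return 500 while the switch is on, so the Route 53 health check fails
            if os.environ.get("FORCE_FAILURE") == "true":
                return {"statusCode": 500, "body": "simulated failure"}
            return {"statusCode": 200, "body": "ok"}
      Environment:
        Variables:
          FORCE_FAILURE: !Ref ForceFailure
      Events:
        Health:
          Type: HttpApi
          Properties:
            Path: /api       # same path the health checks probe
            Method: GET
```

Deploying the stack with `ForceFailure` set to `true` in one region makes that endpoint answer with a 500. With `RequestInterval: 10` and `FailureThreshold: 3`, the health check should then turn unhealthy after roughly 30 seconds of failed probes.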

I have now completed the second step of moving this architecture to a multi-region setup.

sample.png

Next, I will look at DynamoDB and the complexity of sharing data in a multi-region scenario.