DynamoDB Global Secondary Index: Detailed Guide


In this post, we are going to learn about the DynamoDB Global Secondary Index (GSI): its features, use cases, examples, and how to create one using the AWS Serverless framework. Let’s get started. To know more about DynamoDB GSI, check out the official AWS documentation.

Also, check out how AWS DynamoDB Pricing gets calculated if you want to learn about it before using GSI.

What is DynamoDB Global Secondary Index?

When using AWS DynamoDB, we often want to query a table by attributes that are not part of its primary key. For this purpose we create a GSI, which has its own key schema, and then query that index by those attributes to get the data.

If we don’t create a DynamoDB Global Secondary Index, then querying by non-key attributes means scanning the whole table, which can become very costly very quickly. So it’s better to work with indexes whenever we want to query by non-key attributes.
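To make the difference concrete, here is a minimal Python sketch of the two request shapes. It only builds the parameter dictionaries (the same keys the AWS CLI and SDKs accept) and never talks to AWS. The table name `UserTable`, the index name `AgeIndex`, and the `age` attribute are illustrative assumptions.

```python
def scan_params(table_name, age):
    """Scan request: DynamoDB reads every item, then applies the filter.

    The filter only trims the response; read capacity is still consumed
    for every item that was scanned.
    """
    return {
        "TableName": table_name,
        "FilterExpression": "age = :age",
        "ExpressionAttributeValues": {":age": {"N": str(age)}},
    }


def index_query_params(table_name, index_name, age):
    """Query request on a GSI: only items whose index partition key matches are read."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "age = :age",
        "ExpressionAttributeValues": {":age": {"N": str(age)}},
    }
```

The only structural differences are `IndexName` and the move from `FilterExpression` to `KeyConditionExpression`, but the second request lets DynamoDB go straight to the matching items instead of reading the whole table.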

Now let’s see some of the key points regarding GSI.

Key points

There are many key points to take away about GSI; I will try to summarize the most important things you need to know.

  • We are free to choose which attributes from the base table are projected into the DynamoDB Global Secondary Index.
  • An index can have different key attributes than the base table.
  • The base table’s primary key attributes are always projected into the index.
  • Every index must have a partition key; the sort key is optional.
  • Index key attributes must be of type String, Number, or Binary.
  • When querying the index, we can only get the attributes that are projected into it, not the other attributes of the base table.
  • Query and Scan operations can be run on the index; as of now AWS doesn’t support GetItem and BatchGetItem operations on an index.
  • When data is written or updated in the base table, those changes are automatically propagated to the indexes as well.
  • Data in an index is updated in an eventually consistent manner.
  • If an item written to the base table doesn’t contain the index key attribute, DynamoDB doesn’t add that item to the index.
  • Key attribute values in the index don’t have to be unique.
  • Read and write capacity units for each index are independent of its base table.
  • Read operations on the index can be throttled if there are not enough RCUs.
  • AWS charges for the storage of items in the index separately from the base table.
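Three of the points above (projection, items without the index key being skipped, and non-unique index keys) can be illustrated with a toy in-memory model. This is only a sketch of the observable behaviour, not how DynamoDB is implemented. The function name `build_index` and the sample items, including the `city` attribute, are made up for illustration.

```python
from collections import defaultdict


def build_index(items, key_attr, table_keys=("id", "name"), projected=None):
    """Group items by the index partition key, like a GSI with no sort key.

    projected=None models ProjectionType ALL. Otherwise only the listed
    attributes are copied, plus the base table keys and the index key,
    which are always present in the index.
    """
    index = defaultdict(list)
    for item in items:
        if key_attr not in item:
            continue  # no index key attribute: the item never reaches the index
        if projected is None:
            entry = dict(item)
        else:
            keep = {*table_keys, key_attr, *projected}
            entry = {k: v for k, v in item.items() if k in keep}
        index[item[key_attr]].append(entry)  # duplicate key values are allowed
    return index


# Hypothetical sample data.
users = [
    {"id": "1", "name": "John", "age": 19, "city": "Pune"},
    {"id": "2", "name": "Harry", "age": 18, "city": "Delhi"},
    {"id": "3", "name": "Gary", "age": 18},
    {"id": "4", "name": "Maya"},  # no age, so absent from the index
]

age_index = build_index(users, "age", projected=[])  # models KEYS_ONLY
```

Here `age_index[18]` holds two entries (Harry and Gary), neither carries `city` because it was not projected, and the item with id `4` does not appear anywhere in the index.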

These are some of the key points I found in the official AWS documentation for DynamoDB Global Secondary Index. Do check out the documentation for more detailed information.

AWS DynamoDB Global Secondary Index CLI Example

We can perform all kinds of operations on a DynamoDB table using AWS CLI commands. Let’s use the AWS CLI to demonstrate how indexes work.

Create Table and Global Secondary Index

Create a JSON file named createTable.json and add this object to it.

{
  "TableName": "UserTable",
  "KeySchema": [
    {
      "AttributeName": "id",
      "KeyType": "HASH"
    },
    {
      "AttributeName": "name",
      "KeyType": "RANGE"
    }
  ],
  "AttributeDefinitions": [
    {
      "AttributeName": "id",
      "AttributeType": "S"
    },
    {
      "AttributeName": "name",
      "AttributeType": "S"
    },
    {
      "AttributeName": "age",
      "AttributeType": "N"
    }
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 1,
    "WriteCapacityUnits": 1
  },
  "GlobalSecondaryIndexes": [
    {
      "IndexName": "AgeIndex",
      "KeySchema": [
        {
          "AttributeName": "age",
          "KeyType": "HASH"
        }
      ],
      "Projection": {
        "ProjectionType": "ALL"
      },
      "ProvisionedThroughput": {
        "ReadCapacityUnits": 1,
        "WriteCapacityUnits": 1
      }
    }
  ]
}

This object contains all the parameters needed to create a table and a DynamoDB Global Secondary Index.
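One detail of this object is easy to get wrong: DynamoDB expects AttributeDefinitions to describe exactly the attributes used as keys by the table and its indexes, no more and no fewer. That is why `age` is defined here even though it is not part of the table’s own key. The following Python sketch checks a create-table object for that rule before it is sent; the function names are hypothetical helpers, not part of any AWS tool.

```python
def key_attributes(spec):
    """Collect every attribute name used in a table or index key schema."""
    names = {key["AttributeName"] for key in spec["KeySchema"]}
    indexes = spec.get("GlobalSecondaryIndexes", []) + spec.get("LocalSecondaryIndexes", [])
    for index in indexes:
        names.update(key["AttributeName"] for key in index["KeySchema"])
    return names


def check_definitions(spec):
    """Return (missing, unused) attribute names; two empty lists mean consistent."""
    defined = {attr["AttributeName"] for attr in spec["AttributeDefinitions"]}
    used = key_attributes(spec)
    return sorted(used - defined), sorted(defined - used)
```

Loading createTable.json with `json.load` and passing the result to `check_definitions` should return two empty lists for the object above.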

Command

After saving the above file, cd into the file location and execute this command from the terminal.

 aws dynamodb create-table --cli-input-json file://createTable.json 

Response

After you execute the command, you should see a response similar to this. If you receive this kind of response, the command executed successfully.

{
  "TableDescription": {
    "AttributeDefinitions": [
      {
        "AttributeName": "age",
        "AttributeType": "N"
      },
      {
        "AttributeName": "id",
        "AttributeType": "S"
      },
      {
        "AttributeName": "name",
        "AttributeType": "S"
      }
    ],
    "TableName": "UserTable",
    "KeySchema": [
      {
        "AttributeName": "id",
        "KeyType": "HASH"
      },
      {
        "AttributeName": "name",
        "KeyType": "RANGE"
      }
    ],
    "TableStatus": "CREATING",
    "CreationDateTime": "2022-01-08T21:45:11.039000+05:30",
    "ProvisionedThroughput": {
      "NumberOfDecreasesToday": 0,
      "ReadCapacityUnits": 1,
      "WriteCapacityUnits": 1
    },
    "TableSizeBytes": 0,
    "ItemCount": 0,
    "TableArn": "arn:aws:dynamodb:us-east-2:[YOUR_AWS_ACC_ID]:table/UserTable",
    "TableId": "306900fc-4673-4409-86e7-f36a011a6c73",
    "GlobalSecondaryIndexes": [
      {
        "IndexName": "AgeIndex",
        "KeySchema": [
          {
            "AttributeName": "age",
            "KeyType": "HASH"
          }
        ],
        "Projection": {
          "ProjectionType": "ALL"
        },
        "IndexStatus": "CREATING",
        "ProvisionedThroughput": {
          "NumberOfDecreasesToday": 0,
          "ReadCapacityUnits": 1,
          "WriteCapacityUnits": 1
        },
        "IndexSizeBytes": 0,
        "ItemCount": 0,
        "IndexArn": "arn:aws:dynamodb:us-east-2:[YOUR_AWS_ACC_ID]:table/UserTable/index/AgeIndex"
      }
    ]
  }
}
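Note that the response reports CREATING for both TableStatus and IndexStatus, so the table and index are not usable immediately. Below is a minimal Python sketch of a readiness check. `is_ready` works on a table description like the one above, and `wait_until_ready` is a hypothetical polling helper that assumes boto3 is installed and AWS credentials are configured; the delay and attempt counts are arbitrary.

```python
import time


def is_ready(table_description):
    """True when the table and every one of its GSIs report ACTIVE."""
    if table_description.get("TableStatus") != "ACTIVE":
        return False
    indexes = table_description.get("GlobalSecondaryIndexes", [])
    return all(index.get("IndexStatus") == "ACTIVE" for index in indexes)


def wait_until_ready(table_name, delay_seconds=5, max_attempts=24):
    """Poll DescribeTable until the table and its indexes are ACTIVE."""
    import boto3  # assumption: boto3 is available; imported lazily on purpose

    client = boto3.client("dynamodb")
    for _ in range(max_attempts):
        description = client.describe_table(TableName=table_name)["Table"]
        if is_ready(description):
            return True
        time.sleep(delay_seconds)
    return False
```

From the terminal, `aws dynamodb wait table-exists --table-name UserTable` blocks until the table itself is ACTIVE; checking IndexStatus as above additionally covers the index.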

Batch Write Items

After creating our table, we need to add some items to it. As we know, whenever we add items to the base table, those items are propagated to its indexes as well.

Create a new JSON file and name it batchWrite.json. This file will contain some test data to put into the table.

{
  "RequestItems": {
    "UserTable": [
      {
        "PutRequest": {
          "Item": {
            "id": { "S": "1" },
            "name": { "S": "John" },
            "age": { "N": "19" }
          }
        }
      },
      {
        "PutRequest": {
          "Item": {
            "id": { "S": "2" },
            "name": { "S": "Harry" },
            "age": { "N": "18" }
          }
        }
      },
      {
        "PutRequest": {
          "Item": {
            "id": { "S": "3" },
            "name": { "S": "Gary" },
            "age": { "N": "18" }
          }
        }
      }
    ]
  }
}

Command

aws dynamodb batch-write-item --cli-input-json file://batchWrite.json

Response

If you see this kind of response then it means command execution was successful.

{
  "UnprocessedItems": {}
}
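An empty UnprocessedItems means every put request was written. When a batch is throttled, DynamoDB can instead hand some of the requests back in UnprocessedItems, and the caller is expected to send them again. Here is a minimal Python sketch of that retry loop. `write_all` is a hypothetical helper, `batch_write` stands for any callable that behaves like the BatchWriteItem operation (for example boto3’s `client.batch_write_item`), and the attempt count and delays are arbitrary.

```python
import time


def write_all(batch_write, request_items, max_attempts=5, base_delay=0.5):
    """Resend unprocessed requests until none remain or attempts run out.

    Returns whatever is still unprocessed (an empty dict on full success).
    """
    pending = request_items
    for attempt in range(max_attempts):
        response = batch_write(RequestItems=pending)
        pending = response.get("UnprocessedItems") or {}
        if not pending:
            break
        if attempt < max_attempts - 1:
            # Exponential backoff before retrying the leftovers.
            time.sleep(base_delay * (2 ** attempt))
    return pending
```

With boto3 this could be called as `write_all(boto3.client("dynamodb").batch_write_item, request_items)`, where `request_items` is the RequestItems object from batchWrite.json. Keep in mind that a single BatchWriteItem call accepts at most 25 put or delete requests.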

Scan Table

Let’s verify that our write operation was successful by scanning the table.

Create a file named scanTable.json, which contains just the table name.

{
  "TableName": "UserTable"
}

Command

aws dynamodb scan --cli-input-json file://scanTable.json

Response

You should see all the items saved in the table.

{
  "Items": [
    {
      "id": {
        "S": "2"
      },
      "name": {
        "S": "Harry"
      },
      "age": {
        "N": "18"
      }
    },
    {
      "id": {
        "S": "1"
      },
      "name": {
        "S": "John"
      },
      "age": {
        "N": "19"
      }
    },
    {
      "id": {
        "S": "3"
      },
      "name": {
        "S": "Gary"
      },
      "age": {
        "N": "18"
      }
    }
  ],
  "Count": 3,
  "ScannedCount": 3,
  "ConsumedCapacity": null
}

As we can see, there are two entries with age set to 18. Suppose we want to get all the items where age is 18. We cannot run this as a Query directly on the table, because age is neither the partition key nor the sort key.

But we do have AgeIndex with age set as the partition key, so we can query our index with age and get the data.

Query the index

We are going to query the index now. Create a file named queryIndex.json; it will contain the parameters needed to query the data with our condition.

{
  "TableName": "UserTable",
  "IndexName": "AgeIndex",
  "KeyConditionExpression": "age = :age",
  "ExpressionAttributeValues": { ":age": { "N": "18" } }
}

Command

aws dynamodb query --cli-input-json file://queryIndex.json

Response

Only two items should get returned from this query, where the age is 18.

{
  "Items": [
    {
      "id": {
        "S": "3"
      },
      "name": {
        "S": "Gary"
      },
      "age": {
        "N": "18"
      }
    },
    {
      "id": {
        "S": "2"
      },
      "name": {
        "S": "Harry"
      },
      "age": {
        "N": "18"
      }
    }
  ],
  "Count": 2,
  "ScannedCount": 2,
  "ConsumedCapacity": null
}

If you see this response, then congratulations! Our query works, and we successfully queried our DynamoDB Global Secondary Index by the age attribute.
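Application code usually wants plain values rather than the typed `{"S": ...}` / `{"N": ...}` wrappers shown in the response. The sketch below, in Python, unwraps a query response like the one above. It handles only the String and Number types used in this post. `query_age_index` is a hypothetical wrapper that assumes boto3 is installed and credentials are configured, and it reads only the first page of results.

```python
def unwrap(attribute_value):
    """Convert one typed value such as {"N": "18"} into a Python value."""
    type_tag, raw = next(iter(attribute_value.items()))
    if type_tag == "S":
        return raw
    if type_tag == "N":
        # Numbers travel as strings; keep whole numbers as int.
        return int(raw) if raw.lstrip("-").isdigit() else float(raw)
    raise ValueError(f"unsupported type tag: {type_tag}")


def plain_items(response):
    """Turn the Items of a Query or Scan response into plain dictionaries."""
    return [{name: unwrap(value) for name, value in item.items()}
            for item in response["Items"]]


def query_age_index(age, table_name="UserTable", index_name="AgeIndex"):
    """Run the same query as queryIndex.json and return plain dictionaries."""
    import boto3  # assumption: boto3 is available; imported lazily on purpose

    client = boto3.client("dynamodb")
    response = client.query(
        TableName=table_name,
        IndexName=index_name,
        KeyConditionExpression="age = :age",
        ExpressionAttributeValues={":age": {"N": str(age)}},
    )
    return plain_items(response)
```

Applied to the response above, `plain_items` yields two dictionaries, one for Gary and one for Harry, each with `age` as the integer 18.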

To know more about AWS DynamoDB CLI commands, check out the official AWS Documentation here.

DynamoDB Global Secondary Index Vs Local Secondary Index

There are a couple of differences between these two types of indexes; let’s list a few of them.

  • The partition key of a GSI can be different from the base table’s, but the partition key of an LSI is always the same as the base table’s.
  • A GSI has no size limitations, but for an LSI the total size of all items that share one partition key value (base table items plus their index entries) must be less than or equal to 10 GB.
  • A GSI can be added at table creation and to existing tables as well, but an LSI can only be created at table creation.
  • A GSI only supports eventually consistent reads, but an LSI supports both eventually consistent and strongly consistent reads.
  • Each GSI has its own provisioned throughput capacity, but an LSI uses the RCUs and WCUs of the base table itself.
  • When reading from a GSI, attributes that are not projected into the index cannot be selected, but with an LSI attributes from the base table can be selected even if they are not projected (DynamoDB fetches them from the base table, which consumes additional read capacity).

Check out the official documentation, which is the source of this information.
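Since a GSI can be added to an existing table, here is a minimal Python sketch of what such a request looks like. It only builds the parameter object for the UpdateTable operation. The helper name `add_gsi_params` and the index name `NameIndex` are hypothetical, and the capacity values are placeholders.

```python
def add_gsi_params(table_name, index_name, key_attr, key_type, rcu=1, wcu=1):
    """Build UpdateTable parameters that create one GSI on an existing table."""
    return {
        "TableName": table_name,
        # The key attribute of the new index must be described here.
        "AttributeDefinitions": [
            {"AttributeName": key_attr, "AttributeType": key_type}
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": key_attr, "KeyType": "HASH"}
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": rcu,
                        "WriteCapacityUnits": wcu,
                    },
                }
            }
        ],
    }
```

Dumping `add_gsi_params("UserTable", "NameIndex", "name", "S")` with `json.dumps` gives an object that could be saved to a file and passed to `aws dynamodb update-table --cli-input-json`, in the same way createTable.json is used above. There is no equivalent request for an LSI, because those can only be defined at table creation.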

How To Create GSI and DynamoDB Table Using AWS Serverless Template?

It is very easy to create a DynamoDB table and a Global Secondary Index as a resource in the AWS Serverless yml template file. Let’s see how it is done.

resources:
  Resources:
    UserTable:
      Type: AWS::DynamoDB::Table
      DeletionPolicy: Retain
      Properties:
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
          - AttributeName: name
            AttributeType: S
          - AttributeName: age
            AttributeType: N
        KeySchema:
          - AttributeName: id
            KeyType: HASH
          - AttributeName: name
            KeyType: RANGE
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: UserTable

        GlobalSecondaryIndexes:
          - IndexName: AgeIndex
            KeySchema:
              - AttributeName: age
                KeyType: HASH

            Projection:
              ProjectionType: ALL
            ProvisionedThroughput:
              ReadCapacityUnits: 1
              WriteCapacityUnits: 1

To know about more options for AWS CloudFormation templates, check out this official documentation.

Conclusion

In this post, we discussed some of the key points about DynamoDB Global Secondary Index, some of its usage with AWS CLI, and how to create a DynamoDB table with Global Secondary Index.
