Using Python/BOTO3 code to create a DynamoDB table

This week I have been experimenting with the interface between DynamoDB and Python/Boto3. In this example, I create a DynamoDB table along with a local secondary index (LSI) and a global secondary index (GSI). The arguments to “dynamodb.create_table” are keyword arguments, so boto3 accepts them in any order, but I find the table specification easiest to follow when it is laid out as follows:

a. Specify the key schema (the primary key in the RDBMS world): a partition key (HASH) and, optionally, a sort key (RANGE).

b. Next, define the attributes that are used in the key schemas of the table and its indexes. Unlike columns in the RDBMS world, only key attributes are declared here; all other attributes are schemaless and are simply supplied when items are written. Note that I have specified one extra attribute (alt_sort_key), which is used as the sort key of the LSI and as the partition key of the GSI.

c. In the next chunk of code, I create an LSI with the same partition key as the table and an alternate sort key, alt_sort_key. The LSI specification also includes the projection clause, which is the set of attributes copied from the table into the index. DynamoDB provides three options for this:

KEYS_ONLY – Each item in the index consists only of the table partition key and sort key values, plus the index key values.

INCLUDE – In addition to the attributes described in KEYS_ONLY, the secondary index will include other non-key attributes that you specify.

ALL – The secondary index includes all of the attributes from the source table.

d. The last structure is the GSI. Note that the LSI uses capacity from the table, while the GSI requires that you specify the capacity separately.
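The three projection options from step c can be sketched as a small helper that builds the 'Projection' dictionary passed in an index definition. This is only an illustration: build_projection is a hypothetical name of my own, and customer_name and order_total are made-up attribute names. The 'NonKeyAttributes' key is the one the DynamoDB API uses with the INCLUDE option.

```python
def build_projection(projection_type, non_key_attributes=None):
    """Return a Projection dict for a DynamoDB secondary index definition.

    Hypothetical helper for illustration; it only builds a dictionary and
    does not call AWS.
    """
    if projection_type not in ("KEYS_ONLY", "INCLUDE", "ALL"):
        raise ValueError(f"Unknown projection type: {projection_type}")

    if projection_type == "INCLUDE":
        # INCLUDE copies the keys plus the listed non-key attributes
        if not non_key_attributes:
            raise ValueError("INCLUDE needs at least one non-key attribute")
        return {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(non_key_attributes),
        }

    # KEYS_ONLY and ALL take no attribute list
    if non_key_attributes:
        raise ValueError(f"{projection_type} does not take non-key attributes")
    return {"ProjectionType": projection_type}


# Made-up attribute names, purely for demonstration
print(build_projection("KEYS_ONLY"))
print(build_projection("INCLUDE", ["customer_name", "order_total"]))
print(build_projection("ALL"))
```

The returned dictionary can be used as the value of 'Projection' in either a LocalSecondaryIndexes or a GlobalSecondaryIndexes entry.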

The entire create_table call is wrapped in a try/except block to handle errors.

#! /usr/bin/env python
#--------------------------------------------------------------------
#
# Author      : Dean Capps 
# Description : Create a DynamoDB table
#
#--------------------------------------------------------------------
#
import os
os.system('clear')  # clear the terminal first (Unix-like shells only)

print("Starting")

#
## import Python SDK for AWS
#
import boto3

#
## create a boto3 client for DynamoDB operations
#
dynamodb = boto3.client("dynamodb")

#
## Create the table
##
## The arguments are laid out in this order for readability
## (boto3 accepts keyword arguments in any order):
##   a. key schema
##   b. attributes
##   c. LSI
##   d. GSI
#
try:
    response = dynamodb.create_table(
        TableName="dean_test_table",  
        KeySchema=[
            {
                'AttributeName': 'part_key',
                'KeyType': 'HASH'
            },
            {
                "AttributeName": "sort_key",                
                'KeyType': 'RANGE'                
            }
        ],
        AttributeDefinitions=[
            {
                "AttributeName": "part_key",
                "AttributeType": "S"
            },
            {
                "AttributeName": "sort_key",
                "AttributeType": "S"
            },
            {
                "AttributeName": "alt_sort_key",
                "AttributeType": "S"
            }
        ],
        LocalSecondaryIndexes=[
            {
                'IndexName': 'dean_test_table_lsi',
                'KeySchema': [
                    {
                        'AttributeName': 'part_key',
                        'KeyType': 'HASH'
                    },
                    {
                        'AttributeName': 'alt_sort_key',
                        'KeyType': 'RANGE'
                    }
                ],
                'Projection': {
                    'ProjectionType': 'ALL'
                },
            }
        ],      
        GlobalSecondaryIndexes=[
            {
                'IndexName': 'dean_table_gsi',
                'KeySchema': [
                    {
                        'AttributeName': 'alt_sort_key',
                        'KeyType': 'HASH'
                    },
                ],
                'Projection': {
                    'ProjectionType': 'ALL'
                },
                'ProvisionedThroughput' :{
                    'ReadCapacityUnits': 1,
                    'WriteCapacityUnits': 1,
                }
            }
        ],        
        ProvisionedThroughput={
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5
        }
    )
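    # create_table returns once the request is accepted; the table may still be in CREATING status
    print("Table creation requested, status:", response["TableDescription"]["TableStatus"])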
except Exception as e:
    print("Error creating table:")
    print(e)
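
The create_table call returns as soon as the request is accepted, and the table usually starts out in CREATING status. A minimal sketch of waiting until the table is usable is shown below. The helper name wait_until_active is my own, and the client is passed in as a parameter; the get_waiter("table_exists") and describe_table calls are part of the boto3 DynamoDB client.

```python
def wait_until_active(client, table_name):
    """Block until the table exists, then return its reported status.

    `client` is expected to be a boto3 DynamoDB client, for example the
    `dynamodb` object created in the script above.
    """
    # "table_exists" is a built-in waiter; it polls DescribeTable until
    # the table reports ACTIVE or the waiter gives up and raises an error
    waiter = client.get_waiter("table_exists")
    waiter.wait(TableName=table_name)

    # Read the status back so the caller can print or log it
    description = client.describe_table(TableName=table_name)
    return description["Table"]["TableStatus"]


# Usage, assuming AWS credentials and a region are configured:
#
#   import boto3
#   dynamodb = boto3.client("dynamodb")
#   print(wait_until_active(dynamodb, "dean_test_table"))
```

Passing the client in keeps the helper easy to try out with a stand-in object before pointing it at a real account.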

Manipulating CSV files with Python

I had a CSV file with over 50 columns, of which I needed only 11, in a slightly different order. I had been editing the file manually, but after the second round of this repetitive task I got frustrated and turned to Python to see if I could write some quick and dirty code. As with most things in Python, it was relatively quick and easy to accomplish:

import csv
# newline="" lets the csv module handle line endings itself
with open("file_with_many_columns.csv", "r", newline="") as source:
    rdr = csv.reader(source)
    with open("file_with_columns_needed.csv", "w", newline="") as result:
        wtr = csv.writer(result)
        for r in rdr:
            #
            ## Write only the needed columns, in the new order
            #
            wtr.writerow((r[2], r[1], r[0], r[3], r[4], r[5], r[6], r[7], r[27], r[44], r[45]))

The numbers in the square brackets are the zero-based positions of the columns in the original file, so [0] is the first or “A” column. The order of the columns in the writerow call is the order in which they will appear in the output file.
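
If the file has a header row, the same reordering can be done by column name instead of position, which saves counting columns. The sketch below is one possible variation, not what I used above: select_columns is a hypothetical helper name, and the sample data and its column names (id, name, city, notes) are made up for the demonstration.

```python
import csv
import io


def select_columns(source, result, wanted):
    """Copy only the `wanted` columns, in that order, from source to result.

    `source` and `result` are open text files (or file-like objects), and
    the first row of `source` must be a header row.
    """
    reader = csv.DictReader(source)
    # extrasaction="ignore" silently drops the columns that are not wanted
    writer = csv.DictWriter(result, fieldnames=wanted, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Made-up sample data held in memory instead of a real file
sample = io.StringIO("id,name,city,notes\n1,Ann,Oslo,x\n2,Bob,Rome,y\n")
output = io.StringIO()
select_columns(sample, output, ["city", "id"])
print(output.getvalue())
```

With real files, open both with newline="" and pass the file objects to the helper in place of the StringIO objects.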

Hope this helps with your use case.