jitsecurity/dynamo-data-transform

Dynamo Data Transform is an easy-to-use data transformation tool for DynamoDB.

It lets you perform powerful data transformations using simple JavaScript commands, without the risk of breaking your database. Available as a Serverless plugin, an npm package, and an interactive CLI, Dynamo Data Transform saves you time and keeps you safe with features like dry-running a data transformation and rolling back your last transformation if needed.

Features

  • Seamless management of data transformations.
  • Support for multiple stages.
  • History of executed data transformations.
  • Dry run option for each command (when you supply the --dry flag, the data is printed instead of stored).
  • Safe and secure preparation data.
  • Preparation data is stored in a private S3 bucket. See: Prepare data for your data transformation.

Quick Start

⚡ Serverless plugin

  • Install
npm install dynamo-data-transform --save-dev
  • Add the tool to your serverless.yml. Run:
npx serverless plugin install -n dynamo-data-transform

Or add manually to your serverless.yml:

plugins:
  - dynamo-data-transform
  • Run
sls dynamodt --help

Standalone npm package

  • Install the tool
npm install -g dynamo-data-transform -s
  • Run the tool
dynamodt help

Or with the shortcut

ddt help

💻 Interactive CLI

After installing the npm package, run:

dynamodt -i

Creating your first data transformation

  1. Initialize the data-transformations folder.

Serverless (the plugin reads the table names from the serverless.yml file):
sls dynamodt init --stage <stage>

Standalone:

ddt init --tableNames <table_names>

Open the generated data transformation file 'v1_script-name.js' and implement the following functions:

  • transformUp: Executed when running dynamodt up
  • transformDown: Executed when running dynamodt down -t <table>
  • prepare (optional): Executed when running dynamodt prepare -t <table> --tNumber <transformation_number>

The function parameters:

  • ddb: The DynamoDB Document client object (see DynamoDB Client).
  • isDryRun: Boolean indicating whether the --dry flag was supplied. You can use it to print/log the data instead of storing it.
  • preparationData: If you stored preparation data using dynamodt prepare, you can use it here.
  2. Run the data transformation
dynamodt up
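
As a minimal sketch of how a hand-written transform function might honor isDryRun: the helper below prints the items on a dry run and only calls the writer otherwise. Both applyOrPreview and its writeItems callback are hypothetical names introduced for illustration; writeItems stands in for whatever write you perform through ddb, and neither is part of dynamo-data-transform.

```javascript
// Hypothetical helper: not part of dynamo-data-transform.
// `writeItems` stands in for your own DynamoDB write logic (for example, calls made through `ddb`).
const applyOrPreview = async (items, isDryRun, writeItems) => {
  if (isDryRun) {
    // Dry run: print the items instead of storing them.
    console.info('Dry run, items that would be written:', JSON.stringify(items, null, 2));
  } else {
    await writeItems(items);
  }
  // Return shape follows the `{ transformed: <count> }` convention used in this README.
  return { transformed: items.length };
};
```

Keeping the dry-run branch in one place means the same transformUp body can be exercised safely with --dry before it touches real data.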

Data Transformation Script Format

Make sure your script name contains the transformation number, for example: v1_transformation_script

const { utils } = require('dynamo-data-transform')

const TABLE_NAME = 'UsersExample'

const transformUp = async ({ ddb, isDryRun, preparationData }) => {
  // your code here... 
  // return { transformed: 50 } // return the number of transformed items
}

const transformDown = async ({ ddb, isDryRun, preparationData }) => {
  // your code here...
  // return { transformed: 50 } // return the number of transformed items
}

const prepare = async ({ ddb, isDryRun }) => {
  // your code here...
  // return { transformed: 50 } // return the number of transformed items
}

module.exports = {
  transformUp,
  transformDown,
  prepare, // optional
  transformationNumber: 1,
}
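
The naming rule above can be checked mechanically. The following sketch assumes the `v<number>_` prefix shown in the example name and compares it with the exported transformationNumber. Both function names are hypothetical, and whether the tool itself enforces this match is not stated here.

```javascript
// Hypothetical check: not part of dynamo-data-transform.
// Extracts the number from a file name such as "v1_transformation_script.js".
const parseTransformationNumber = (fileName) => {
  const match = /^v(\d+)_/.exec(fileName);
  return match ? Number(match[1]) : null;
};

// Returns true when the file name prefix agrees with the exported transformationNumber.
const isConsistent = (fileName, script) =>
  parseTransformationNumber(fileName) === script.transformationNumber;
```

For example, isConsistent('v1_transformation_script.js', { transformationNumber: 1 }) evaluates to true, while a file named without the prefix yields false.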

Usage and command-line options

List available commands:

Serverless plugin:

sls dynamodt --help

Standalone npm package:

dynamodt help

To list all of the options for a specific command, run:

Serverless plugin:

sls dynamodt <command> --help

Standalone npm package:

dynamodt <command> --help

What happens behind the scenes

  • When a data transformation runs for the first time, a record in your table is created. This record is for tracking the executed transformations on a specific table.
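
To illustrate the idea of a per-table tracking record, here is a sketch that uses a plain object in place of the real record. The field name executedTransformations and both helper functions are hypothetical; the tool's actual record layout and selection logic are not documented in this section.

```javascript
// Illustrative only: a plain object stands in for the tracking record stored in the table.
// `executedTransformations` is an assumed field name, not the tool's real schema.
const recordExecution = (trackingRecord, transformationNumber) => ({
  ...trackingRecord,
  executedTransformations: [
    ...(trackingRecord.executedTransformations || []),
    { transformationNumber, executedAt: new Date().toISOString() },
  ],
});

// Returns the numbers of scripts that have not been recorded yet, in ascending order.
const pendingTransformations = (trackingRecord, availableNumbers) => {
  const done = new Set(
    (trackingRecord.executedTransformations || []).map((e) => e.transformationNumber),
  );
  return availableNumbers.filter((n) => !done.has(n)).sort((a, b) => a - b);
};
```

Under these assumptions, a record that has logged transformation 1 leaves transformations 2 and 3 pending when scripts 1 through 3 exist.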

Examples

Examples of data transformation code

Insert records

// Seed users data transformation
const { utils } = require('dynamo-data-transform');
const { USERS_DATA } = require('../../usersData');

const TABLE_NAME = 'UsersExample';

/**
 * @param {DynamoDBDocumentClient} ddb - dynamo db document client https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-dynamodb
 * @param {boolean} isDryRun - true if this is a dry run
 */
const transformUp = async ({ ddb, isDryRun }) => {
  return utils.insertItems(ddb, TABLE_NAME, USERS_DATA, isDryRun);
};

const transformDown = async ({ ddb, isDryRun }) => {
  return utils.deleteItems(ddb, TABLE_NAME, USERS_DATA, isDryRun);
};

module.exports = {
  transformUp,
  transformDown,
  transformationNumber: 1,
};
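
The usersData module imported above is not shown in this README. A hypothetical version might look like the sketch below; the attribute names (PK, SK, id, name), the buildUser helper, and the sample values are all placeholders and must be adapted to your table's key schema.

```javascript
// Hypothetical usersData.js: attribute names are assumptions and must match your table's key schema.
const buildUser = (id, name) => ({
  PK: `USER#${id}`,
  SK: `NAME#${name}`,
  id,
  name,
});

// Placeholder seed items, for illustration only.
const USERS_DATA = [
  buildUser(1, 'Alice'),
  buildUser(2, 'Bob'),
];

module.exports = { USERS_DATA, buildUser };
```

Because transformDown deletes the same USERS_DATA items that transformUp inserted, keeping the seed data in one module keeps the two directions in sync.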

Add a new field to each record

// Adding a "randomNumber" field to each item
const { utils } = require('dynamo-data-transform');

const TABLE_NAME = 'UsersExample';

const transformUp = async ({ ddb, isDryRun }) => {
  const addRandomNumberField = (item) => {
    const updatedItem = { ...item, randomNumber: Math.random() };
    return updatedItem;
  };
  return utils.transformItems(ddb, TABLE_NAME, addRandomNumberField, isDryRun);
};

const transformDown = async ({ ddb, isDryRun }) => {
  const removeRandomNumberField = (item) => {
    const { randomNumber, ...oldItem } = item;
    return oldItem;
  };
  return utils.transformItems(ddb, TABLE_NAME, removeRandomNumberField, isDryRun);
};

module.exports = {
  transformUp,
  transformDown,
  transformationNumber: 2,
};
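
Since the item-level functions in this example are plain JavaScript, they can be checked without touching DynamoDB. The sketch below repeats them outside the script and runs a round trip on a hypothetical sample item (its attribute names are placeholders).

```javascript
// Item-level functions from the example above, exercised without DynamoDB.
const addRandomNumberField = (item) => ({ ...item, randomNumber: Math.random() });

const removeRandomNumberField = (item) => {
  const { randomNumber, ...oldItem } = item;
  return oldItem;
};

// Hypothetical sample item; attribute names are placeholders.
const original = { PK: 'USER#1', name: 'Alice' };
const roundTripped = removeRandomNumberField(addRandomNumberField(original));
```

Here roundTripped is deeply equal to original, which is the property a rollback relies on. Note that this holds only for items that had no randomNumber attribute before the transformation ran.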

For more examples of data transformation code, see the examples folder in the repository.

Read how to automate the data transformation process here: https://www.infoq.com/articles/dynamoDB-data-transformation-safety/