AWS Database Migration Service

AWS Database Migration Service (AWS DMS) is a mature managed service for moving on-premises data to the AWS Cloud, including to an Amazon S3 data lake. Firms are advised not to build their own Python libraries and utilities to do the same. Use DMS for migration and for hydrating the data lake.

When using DMS, consider the following:

  1. Premigration assessments – A premigration assessment evaluates specified components of a database migration task to help identify any problems that might prevent a migration task from running as expected. By using this assessment, you can identify potential problems before you run a new or modified task. See Enabling and working with premigration assessments for a task.
  2. To connect source and target databases to an AWS DMS replication instance, configure a network. A VPN or AWS Direct Connect can be used. For example, an on-premises database can connect to an Amazon RDS DB instance over a virtual private network (VPN). See Network configurations for database migration.
  3. Source and target endpoints – Understand what information and tables in the source database need to be migrated to the target database. AWS DMS supports basic schema migration, including the creation of tables and primary keys.
  4. AWS DMS doesn’t automatically create secondary indexes, foreign keys, user accounts, and so on, in the target database. Depending on your source and target database engine, you might need to set up supplemental logging or modify other settings for a source or target database. For more information, see Sources for data migration and Targets for data migration.
  5. AWS DMS doesn’t perform schema or code conversion. If you want to convert an existing schema to a different database engine, you can use the AWS Schema Conversion Tool (AWS SCT). AWS SCT converts your source objects, such as tables, indexes, views, triggers, and other system objects, into the target data definition language (DDL) format. You can also use AWS SCT to convert most of your application code, like PL/SQL or T-SQL, to the equivalent target language. AWS SCT is available as a free download from AWS. See the AWS SCT User Guide.
  6. Schema and code migration – AWS DMS doesn’t perform schema or code conversion. You can use tools such as Oracle SQL Developer, MySQL Workbench, and pgAdmin III to move your schema if your source and target use the same database engine. To convert an existing schema to a different database engine, you can use the AWS Schema Conversion Tool (AWS SCT). It can create a target schema and can generate and create an entire schema: tables, indexes, views, and so on. You can also use the tool to convert PL/SQL or T-SQL to PgSQL and other formats. For more information on the AWS SCT, see the AWS SCT User Guide.
  7. Unsupported data types – Make sure that you can convert source data types into the equivalent data types for the target database. For more information on supported data types, see the source or target section for your data store.
  8. Diagnostic support script results – When you plan your migration, run the diagnostic support scripts. The results from these scripts give you advance information about potential migration failures.
  9. If a support script is available for your database, run it according to the procedure described in the script topic in your local environment. When the script run is complete, you can review the results. Run these scripts as a first step of any troubleshooting effort. The results can be useful while working with an AWS Support team. For more information, see Working with diagnostic support scripts in AWS DMS.
  10. AWS DMS provides ongoing replication of data, keeping the source and target databases in sync. It replicates only a limited number of data definition language (DDL) statements. AWS DMS doesn’t propagate items such as indexes, users, privileges, stored procedures, and other database changes not directly related to table data.
  11. In general, AWS DMS migrates LOB data in two phases. First, AWS DMS creates a new row in the target table and populates the row with all data except the associated LOB value. Second, AWS DMS updates the row in the target table with the LOB data.
  12. AWS DMS uses some resources on your source database. During a full load task, AWS DMS performs a full table scan of the source table for each table processed in parallel. Also, each task that you create as part of a migration queries the source for changes as part of the CDC process. For AWS DMS to perform CDC for some sources, such as Oracle, you might need to increase the amount of data written to your database’s change log.
  13. To ensure that your data was migrated accurately from the source to the target, we highly recommend that you use data validation. If you turn on data validation for a task, AWS DMS begins comparing the source and target data immediately after a full load is performed for a table.
  14. Observability with CloudWatch – Monitor host (replication instance), replication task, and table metrics.
  15. Some warnings and error messages appear only in the task log. In particular, data truncation issues or row rejections due to foreign key violations are written only to the task log. Therefore, be sure to review the task log when migrating a database. To view the task log, configure Amazon CloudWatch as part of task creation. For more information, see Monitoring replication tasks using Amazon CloudWatch.
  16. AWS DMS uses Amazon SNS to provide notifications when an AWS DMS event occurs, for example the creation or deletion of a replication instance. You can work with these notifications in any form supported by Amazon SNS for an AWS Region.
  17. To troubleshoot AWS DMS migration issues, you can work with Time Travel. For more information about Time Travel, see Time Travel task settings.
  18. Optimise performance – Consider resource availability on the source, available network throughput, resource capacity of the replication server, the ability of the target to ingest changes, the type and distribution of source data, and the number of objects to be migrated.
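
As an illustration of the point about deciding which tables need to be migrated, the sketch below builds a DMS table-mapping document made of selection rules. The schema name "sales" and the table names "orders" and "customers" are hypothetical, and the helper build_selection_rules is not part of any AWS library; only the JSON layout (rule-type, rule-id, rule-name, object-locator, rule-action) follows the DMS table-mapping format.

```python
import json


def build_selection_rules(schema, tables):
    """Return table-mapping JSON that includes only the listed tables.

    Hypothetical helper for illustration; it only assembles JSON text.
    """
    rules = []
    for i, table in enumerate(tables, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(i),      # must be unique within the mapping
            "rule-name": str(i),    # must be unique within the mapping
            "object-locator": {"schema-name": schema, "table-name": table},
            "rule-action": "include",
        })
    return json.dumps({"rules": rules}, indent=2)


# Hypothetical schema and table names.
mappings_json = build_selection_rules("sales", ["orders", "customers"])
print(mappings_json)
```

A string like this is what a task expects as its table mappings, for example the TableMappings argument of boto3's create_replication_task. A wildcard such as "%" in table-name would select every table in the schema instead.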

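Several of the points above (LOB handling, data validation, task logging, parallel full load) are controlled through the task settings JSON. The following is a minimal sketch of such a document, not a recommended configuration. The helper build_task_settings and its default values are assumptions chosen for illustration, and limited-size LOB mode is picked only as an example; in that mode, LOBs larger than LobMaxSize (in kilobytes) are truncated.

```python
import json


def build_task_settings(lob_max_size_kb=32, full_load_subtasks=8):
    """Return task-settings JSON enabling validation, logging, and limited LOB mode.

    Hypothetical helper for illustration; the values are assumptions.
    """
    settings = {
        "TargetMetadata": {
            "SupportLobs": True,
            "FullLobMode": False,
            "LimitedSizeLobMode": True,
            "LobMaxSize": lob_max_size_kb,  # kilobytes; larger LOBs are truncated
        },
        "FullLoadSettings": {
            # Number of tables loaded in parallel; each one scans its source table.
            "MaxFullLoadSubTasks": full_load_subtasks,
        },
        "Logging": {"EnableLogging": True},  # task log is written to CloudWatch
        "ValidationSettings": {"EnableValidation": True},
    }
    return json.dumps(settings, indent=2)


settings_json = build_task_settings(lob_max_size_kb=64)
print(settings_json)
```

The resulting string could be supplied as the ReplicationTaskSettings argument of boto3's create_replication_task. Settings left out of the document fall back to the service defaults.
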
See https://docs.aws.amazon.com/dms/latest/userguide/CHAP_BestPractices.html