
Migrating from AWS to GCP for a Retail E-Commerce Platform


January 3, 2025

Problem Statement

A retail e-commerce company, ShopEase, hosts its database on Amazon RDS for MySQL in AWS. The database contains critical data such as customer information, product catalogs, orders, and payment details. To reduce costs and integrate more tightly with GCP’s data analytics tools (BigQuery, Looker), the company decided to migrate its database to Google Cloud SQL for MySQL.

The objectives of the migration were to:

  • Minimize downtime to avoid disruption to online sales.
  • Ensure data integrity and consistency.
  • Preserve compatibility of database objects (tables, stored procedures, functions, etc.).
  • Scale to handle future growth.
  • Optimize costs to reduce operational expenses.

Agenthum, as the migration partner, did the following:

  • Analyzed the sample data and identified in-scope and out-of-scope objects.
  • Identified the best-suited migration approach based on the sample data.
  • Created a robust implementation plan and divided the migration into six phases/milestones.
  • Applied cost optimization strategies during and after the migration.
  • Overcame the challenges and hurdles encountered during the migration.

 

Let’s Go Deeper and See the Whole Migration Plan:


Sample Data

The database contains the following tables:

1. Customers
   • customer_id (Primary Key)
   • name
   • email
   • address
   • created_at

2. Products
   • product_id (Primary Key)
   • name
   • price
   • category
   • stock_quantity

3. Orders
   • order_id (Primary Key)
   • customer_id (Foreign Key)
   • order_date
   • total_amount

4. Order_Items
   • order_item_id (Primary Key)
   • order_id (Foreign Key)
   • product_id (Foreign Key)
   • quantity
   • price

5. Payments
   • payment_id (Primary Key)
   • order_id (Foreign Key)
   • payment_method
   • amount
   • payment_date
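
For reference, here is a minimal DDL sketch of two of these tables. The column types, the shopease database name, and the connection details are illustrative assumptions; the actual schema was not published.

```bash
# Illustrative DDL for customers and orders; types are assumptions.
mysql --host=SOURCE_HOST --user=admin -p shopease <<'SQL'
CREATE TABLE customers (
  customer_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(255)    NOT NULL,
  email       VARCHAR(255)    NOT NULL,
  address     TEXT,
  created_at  DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB;

CREATE TABLE orders (
  order_id     BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id  BIGINT UNSIGNED NOT NULL,
  order_date   DATETIME        NOT NULL,
  total_amount DECIMAL(12,2)   NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
) ENGINE=InnoDB;
SQL
```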

 

Data Size

The total database size is 4 TB, including indexes and logs.

Breakdown (table data; indexes and logs account for the remaining ~2 TB):

    • Customers: 500 GB (25 million records).
    • Products: 100 GB (1 million records).
    • Orders: 800 GB (50 million records).
    • Order_Items: 500 GB (200 million records).
    • Payments: 100 GB (50 million records).

 

Migration Approach


1. Assessment and Planning

  • Inventory:
    Identify all tables, stored procedures, and functions in the AWS RDS MySQL database (see the query sketch after this list).

  • Compatibility Check:
    Verify that all MySQL features used in AWS RDS are supported in GCP Cloud SQL.

  • Migration Strategy:
    Use AWS Database Migration Service (DMS) for continuous replication to minimize downtime.
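
The inventory step might look like the following sketch, assuming the database is named shopease and a standard mysql client connection (both are placeholder assumptions):

```bash
# List tables with approximate row counts and data sizes, then routines.
# SOURCE_HOST, the admin user, and "shopease" are placeholder assumptions.
mysql --host=SOURCE_HOST --user=admin -p --execute "
  SELECT table_name, table_rows,
         ROUND(data_length / 1024 / 1024 / 1024, 1) AS data_gb
  FROM information_schema.tables
  WHERE table_schema = 'shopease';
  SELECT routine_name, routine_type
  FROM information_schema.routines
  WHERE routine_schema = 'shopease';"
```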

 

2. Schema Migration

  • Export the schema from AWS RDS using mysqldump or AWS Schema Conversion Tool (SCT).
  • Adjust the schema for GCP compatibility (e.g., remove AWS-specific extensions).
  • Import the schema into GCP Cloud SQL.
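
A sketch of the export/import steps, with placeholder endpoints (routines are handled separately in phase 4 below):

```bash
# Dump table definitions only; --set-gtid-purged=OFF keeps GTID
# statements out of the dump, which simplifies the Cloud SQL import.
mysqldump --host=RDS_ENDPOINT --user=admin -p \
  --no-data --set-gtid-purged=OFF shopease > schema.sql

# Review schema.sql for AWS-specific adjustments, then import:
mysql --host=CLOUDSQL_IP --user=admin -p shopease < schema.sql
```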

 

3. Data Migration

  • Use AWS DMS to replicate data from AWS RDS to GCP Cloud SQL.
  • Perform an initial bulk load followed by continuous replication for real-time sync.
  • Validate data integrity using row counts and checksums.
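
The validation step can be as simple as the sketch below; the endpoints, the PW variable, and the shopease database name are placeholder assumptions:

```bash
# Compare per-table row counts between source and target.
for TABLE in customers products orders order_items payments; do
  SRC=$(mysql --host=RDS_ENDPOINT --user=admin -p"$PW" -N -e \
        "SELECT COUNT(*) FROM shopease.$TABLE")
  DST=$(mysql --host=CLOUDSQL_IP --user=admin -p"$PW" -N -e \
        "SELECT COUNT(*) FROM shopease.$TABLE")
  echo "$TABLE: source=$SRC target=$DST"
done

# CHECKSUM TABLE (run on both sides and compared) catches value
# drift that row counts alone would miss.
mysql --host=RDS_ENDPOINT --user=admin -p"$PW" -e \
  "CHECKSUM TABLE shopease.orders, shopease.order_items"
```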

 

4. Stored Procedures and Functions

  • Export stored procedures and functions from AWS RDS.
  • Refactor any AWS-specific SQL code to be compatible with GCP Cloud SQL.
  • Import the refactored code into GCP.
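
The export/refactor/import cycle for routines might look like this, with placeholder endpoints. Note that dumped routines carry DEFINER clauses, which commonly need stripping before a Cloud SQL import:

```bash
# Dump stored procedures and functions only: no tables, no rows.
mysqldump --host=RDS_ENDPOINT --user=admin -p \
  --routines --no-create-info --no-data --skip-triggers \
  shopease > routines.sql

# Refactor AWS-specific SQL (and DEFINER clauses) in routines.sql, then:
mysql --host=CLOUDSQL_IP --user=admin -p shopease < routines.sql
```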

 

5. Cutover and Go-Live

  • Stop writes to the AWS database during the final sync.
  • Redirect the application to the GCP Cloud SQL database.
  • Monitor performance and resolve any issues.
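
One way to freeze writes on the source, sketched with placeholder names (RDS restricts SET GLOBAL for the master user, so the parameter group is the usual lever):

```bash
# Flip the source to read-only via its RDS parameter group.
aws rds modify-db-parameter-group \
  --db-parameter-group-name shopease-mysql-params \
  --parameters "ParameterName=read_only,ParameterValue=1,ApplyMethod=immediate"

# After DMS reports zero replication lag, stop the task and
# repoint the application at Cloud SQL.
aws dms stop-replication-task --replication-task-arn "$TASK_ARN"
```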

 

Challenges and Fixes

  • Large data volume: Use chunked migration and parallel processing.
  • Network latency: Use AWS Direct Connect and GCP Cloud Interconnect for high-speed data transfer.
  • Downtime: Use AWS DMS for continuous replication to minimize downtime.
  • Compatibility issues: Refactor AWS-specific SQL code and test thoroughly.
  • Data integrity: Perform pre- and post-migration validation using checksums and sample queries.

 

Implementation Plan

Phase 1: Pre-Migration

    • Inventory database objects.
    • Analyze compatibility and plan migration strategy.
    • Set up GCP Cloud SQL instance.
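
Provisioning the target instance could look like the sketch below. The instance name, region, and MySQL version are assumptions; the tier and the 500 GB SSD figure follow the sizing discussed under cost optimization later in this post.

```bash
gcloud sql instances create shopease-mysql \
  --database-version=MYSQL_8_0 \
  --tier=db-n1-standard-4 \
  --region=asia-south1 \
  --storage-size=500GB \
  --storage-auto-increase   # grows storage on demand instead of over-provisioning
```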

Phase 2: Schema Migration

    • Export schema from AWS RDS.
    • Adjust schema for GCP compatibility.
    • Import schema into GCP Cloud SQL.

Phase 3: Data Migration

    • Use AWS DMS for initial bulk load and continuous replication.
    • Validate data integrity.

Phase 4: Code Migration

    • Export stored procedures and functions from AWS RDS.
    • Refactor and import into GCP Cloud SQL.

Phase 5: Cutover

    • Stop writes to AWS RDS during the final sync.
    • Redirect application to GCP Cloud SQL.

Phase 6: Post-Migration

    • Monitor performance and resolve issues.
    • Optimize the database for GCP.

 

Tools and Applications Involved

AWS Tools
  • Amazon RDS: Hosts the MySQL database.
  • AWS Database Migration Service (DMS): For continuous data replication.
  • AWS Schema Conversion Tool (SCT): For schema and code conversion.
  • AWS Direct Connect: For high-speed, secure connectivity between AWS and GCP.

GCP Tools
  • Google Cloud SQL: Target MySQL database.
  • BigQuery: For analytics and reporting post-migration.
  • Cloud Monitoring: For performance monitoring and troubleshooting.
  • Cloud Interconnect: For high-speed connectivity between AWS and GCP.

Third-Party Tools
  • Striim: For real-time data integration and migration.
  • Fivetran: For automated data pipeline creation.
  • Apache Airflow: For orchestrating ETL workflows.
  • dbt (Data Build Tool): For data validation and testing.

Open-Source Tools
  • mysqldump: For exporting schema and data from AWS RDS.
  • mysql: For importing schema and data into GCP Cloud SQL.
  • pgloader: For data migration (if PostgreSQL is involved).

 

Cost Optimization Strategies

  1. Right-Sizing Resources

    • Analyze current AWS RDS instance usage (CPU, memory, storage).
    • Choose a GCP Cloud SQL instance type that matches the workload (e.g., db-n1-standard-4 instead of over-provisioned resources).
    • Use GCP’s Committed Use Discounts for long-term cost savings.

  2. Storage Optimization

    • Use GCP’s Standard or SSD storage based on performance needs.
    • Enable automatic storage scaling to avoid over-provisioning.

  3. Network Cost Reduction

    • Use GCP’s Premium Tier Network for predictable pricing and low latency.
    • Leverage AWS Direct Connect and GCP Cloud Interconnect to reduce data transfer costs between AWS and GCP.

  4. Serverless and Managed Services

    • Use GCP BigQuery for analytics instead of running expensive ETL pipelines on AWS.
    • Leverage Cloud SQL’s automated backups and maintenance to reduce operational overhead.

  5. Monitoring and Cost Management

    • Use GCP’s Cost Management Tools to monitor and optimize spending.
    • Set up budget alerts to avoid unexpected costs.

  6. Data Lifecycle Management

    • Archive historical data to GCP Coldline Storage for cost-effective long-term storage.
    • Use BigQuery partitioning to reduce query costs for large datasets.
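
For the partitioning point, a hypothetical BigQuery example (the dataset and table names are assumptions):

```bash
# Rebuild the analytics copy of orders partitioned by day, so queries
# filtered on order_date scan only the partitions they touch.
bq query --use_legacy_sql=false '
  CREATE TABLE shopease_analytics.orders_partitioned
  PARTITION BY DATE(order_date) AS
  SELECT * FROM shopease_analytics.orders'
```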

 

Cost Benefits Post-Migration

Cost Component     AWS Cost (Monthly)       GCP Cost (Monthly)          Savings
Database Instance  $1,500 (db.m5.xlarge)    $1,000 (db-n1-standard-4)   33%
Storage            $300 (500 GB SSD)        $200 (500 GB SSD)           33%
Backups            $100                     $50 (automated)             50%
Data Transfer      $200 (AWS to Internet)   $100 (GCP Premium Tier)     50%
Analytics          $500 (AWS Redshift)      $300 (BigQuery)             40%
Total              $2,600                   $1,650                      36.5%

 

Conclusion

By following a structured approach and leveraging tools such as AWS DMS, GCP Cloud SQL, and third-party applications like Striim and Fivetran, ShopEase migrated its database from AWS to GCP with minimal downtime while preserving data integrity. The migration enabled the company to integrate seamlessly with GCP's analytics tools, paving the way for data-driven decision-making. In addition, cost optimization strategies such as right-sizing resources, storage optimization, and managed services helped ShopEase reduce operational expenses significantly, achieving 36.5% monthly savings.