Full Scale Pre-Prod Environment

Problem Statement
The company's existing test environments were ineffective for validating new features and bug fixes because they gave QA personnel no reliable way to:
- Test code changes against a full-scale production dataset
- Test UI changes against the full spectrum of production data: missing values, malformed input, long/short strings of text, boundary conditions
- Identify potential query bottlenecks caused by poor database schema designs
- Perform realistic "dry runs" of production releases to help achieve zero-downtime deployments
- Proactively address issues linked to data growth, e.g. data drift and query degradation
- Spot scalability issues with data-dependent features like search, pagination and caching
- Reliably reproduce real customer issues
As a result, the team needed a test environment that simulated production more realistically, to improve release confidence and reduce the number of defects reaching customers.
Solution
The solution was to create a test environment connected to a full-scale, anonymised clone of the production databases. Anonymisation was critical to achieving ISO compliance. Because the full architecture is fairly complex, the list below summarises the key components, which are also pictured in the next section:
- An AWS Step Functions workflow to orchestrate all the components
- Amazon Macie to identify PII and other sensitive data
- AWS Glue to catalog the Macie findings in S3
- Amazon Athena to structure and query the Macie findings
- Amazon S3 to store production data in Apache Parquet format for Macie to scan
- AWS DMS replication instances to perform extraction, loading and masking of production data
- AWS KMS to manage the keys used for encryption at rest and in transit
- AWS Secrets Manager to store database credentials
- Amazon ECS on AWS Fargate to run schema replication tasks for PostgreSQL and MySQL
- AWS Lambda functions to perform additional masking and enrichment
- Terraform to provision the infrastructure
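To make the "additional masking" step more concrete, here is a minimal sketch of what a Lambda-style masking function could look like. It is an illustration under stated assumptions, not the project's actual code: the field names in PII_FIELDS, the event shape, the MASKING_KEY environment variable and the function names are all hypothetical. It uses keyed hashing (HMAC-SHA256) so that the same input always maps to the same token, which keeps joins between tables intact, and it leaves missing values untouched so the clone still reflects the gaps found in production data.

```python
import hashlib
import hmac
import os

# Hypothetical column names; in a real pipeline these would be derived
# from the sensitive-data findings rather than hard-coded.
PII_FIELDS = frozenset({"email", "full_name", "phone"})


def mask_value(value: str, secret: bytes) -> str:
    """Return a deterministic 16-character token for a sensitive value."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_record(record: dict, secret: bytes, pii_fields=PII_FIELDS) -> dict:
    """Mask the PII fields of one row, leaving all other fields unchanged."""
    masked = {}
    for key, value in record.items():
        if key in pii_fields and value is not None:
            token = mask_value(str(value), secret)
            # Keep emails shaped like emails so UI validation still passes.
            masked[key] = f"{token}@example.invalid" if key == "email" else token
        else:
            # Non-PII fields and missing values pass through as-is.
            masked[key] = value
    return masked


def handler(event, context):
    """Lambda-style entry point; the event shape here is an assumption."""
    # Assumption: the key is injected via an environment variable. The
    # fallback is for local experiments only.
    secret = os.environ.get("MASKING_KEY", "dev-only-key").encode("utf-8")
    return {"records": [mask_record(r, secret) for r in event["records"]]}
```

Calling handler({"records": [{"id": 1, "email": "jane@corp.test"}]}, None) returns the record with its id unchanged and the email replaced by a token ending in @example.invalid. Truncating the digest to 16 characters is a readability choice for this sketch; a longer token reduces the chance of collisions on very large tables.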
Architecture Diagrams
Iteration One
Iteration Two