MN Implements Resiliency for Windows Workloads

Related services
Cloud, Amazon Web Services (AWS)

Industry
Finance

Company
Xebia

MN Pensioen is a Dutch financial service provider specializing in corporate and industry-wide pension funds and investments. The company manages over €175 billion in pension assets and serves two million people in the metal and technology sector and the motor vehicle sector. Responsible investing is at MN’s core, with a business model built on environmental, social, and governance (ESG) principles. Fiduciary analysis, management, and communication make up a large part of MN’s services. MN also administers income insurance and works with research and development funds and social funds in these industries. MN is headquartered in Den Haag and has five business units.

Why

Improve resilience and disaster recovery to ensure business continuity

What

Implementation of automatic failover services on AWS

How

Migrating applications to AWS and leveraging cloud native services

The Challenge

MN Pensioen asked Xebia to support the migration of its Windows-based workloads from its outsourcer's datacenter to AWS. Because the migration had to be completed before the outsourcing contract expired, and for technical reasons as well, the applications were largely migrated as-is, with limited adaptation for running on AWS. This approach shortens the lead time of the migration, but it does not leverage the full capabilities and value of AWS. MN Pensioen asked Xebia to prioritize business continuity and resiliency for these workloads, without being able to rely on the manual processes that the previous outsourcing company provided.

The Solution

Most datacenter-hosted applications were not able to run active-active across two or more Availability Zones. Refactoring them to make this possible was not an option because of the cost, the limited timeframe available and, in some cases, the fact that MN did not possess the source code.

Design
Xebia optimized the architecture by implementing resilience capabilities within the given limitations. The result is a solution with layered mitigations, where each type of failure is handled by the least disruptive response. For example, recovering from a single EC2 host failure does not require a full failover to another datacenter. Relaunching the failed EC2 instance on a different EC2 host in the same Region has less impact (shorter time to recover, fewer changes to moving parts, and so on). A restart in the same Availability Zone (AZ) often completes within minutes and requires no further changes or manual intervention.
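The case study does not state which mechanism performs the relaunch. One common option on AWS is a CloudWatch alarm on the `StatusCheckFailed_System` metric with the EC2 `recover` action, which moves the instance to healthy hardware in the same Availability Zone. The following is a minimal sketch under that assumption; the alarm name, period, and evaluation settings are illustrative choices, not values from MN's environment.

```python
def build_recovery_alarm(instance_id: str, region: str) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when the system status check (host-level health)
    fails, and triggers the built-in EC2 recover action.
    """
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per datapoint (assumed value)
        "EvaluationPeriods": 2,  # two failing minutes in a row (assumed value)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def apply_recovery_alarm(instance_id: str, region: str) -> None:
    """Create the alarm in AWS. Requires boto3 and valid credentials."""
    import boto3  # imported here so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_recovery_alarm(instance_id, region))
```

Keeping the alarm definition in a plain function makes it easy to review and test without touching an AWS account; only `apply_recovery_alarm` makes a network call.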

Operations
This new solution design also requires new capabilities, processes, and skills to operate the environment. Xebia worked together with the customer to adjust and improve their way of working with AWS. The updated processes and procedures bring a higher degree of automation and less manual intervention. Xebia also trained MN staff to run the daily operations and to operate the environment during a failover.

Recovery
To recover from a (rare) failure of an entire AZ, Xebia implemented AWS Backup for non-critical applications, where a longer recovery time in a second AZ is acceptable. For critical applications, Xebia combined daily backups with continuous replication using AWS Elastic Disaster Recovery (AWS DRS). The technical controls are aligned with business needs, balancing cost against recovery requirements.
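The tiering described above can be sketched as a small mapping from application criticality to recovery controls, together with a daily AWS Backup plan definition. This is an illustrative sketch only: the tier labels, plan and vault names, schedule, and retention period are assumptions, not details from MN's environment. The dictionary follows the shape expected by the AWS Backup `create_backup_plan` API.

```python
from typing import List

CRITICAL = "critical"
NON_CRITICAL = "non-critical"


def recovery_controls(tier: str) -> List[str]:
    """Return the recovery controls for a tier, as described in the text."""
    if tier == CRITICAL:
        # Daily backups plus continuous replication with AWS DRS.
        return ["daily-backup", "aws-drs-continuous-replication"]
    if tier == NON_CRITICAL:
        # Restore from AWS Backup; a longer recovery time is acceptable.
        return ["aws-backup"]
    raise ValueError(f"unknown tier: {tier!r}")


def build_daily_backup_plan(plan_name: str, vault_name: str) -> dict:
    """Build the BackupPlan argument for AWS Backup create_backup_plan."""
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": "daily",
                "TargetBackupVaultName": vault_name,
                # 03:00 UTC every day (assumed schedule)
                "ScheduleExpression": "cron(0 3 * * ? *)",
                # Keep recovery points for 35 days (assumed retention)
                "Lifecycle": {"DeleteAfterDays": 35},
            }
        ],
    }


def create_plan(plan_name: str, vault_name: str, region: str) -> str:
    """Create the plan in AWS. Requires boto3 and valid credentials."""
    import boto3  # imported here so the builders above work without boto3

    backup = boto3.client("backup", region_name=region)
    response = backup.create_backup_plan(
        BackupPlan=build_daily_backup_plan(plan_name, vault_name)
    )
    return response["BackupPlanId"]
```

Expressing the tiers explicitly keeps the cost trade-off visible: only applications marked critical pay for continuous replication.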

"This Cloud Foundation is exactly what we needed. It is scalable, secure and cost-efficient. The best part is that we came from zero to hero in less than three months, surpassing every expectation. I am proud of what our team achieved by working with Xebia!"

Jan-Paul Lottering, MN Cloud and Infrastructure Manager

A Truly Resilient Architecture

In the past, a failover after a datacenter outage required a lot of manual intervention. MN now has a fully automated solution for its critical workloads. This improves the chances of success in a real disaster scenario and significantly reduces the time to restore service availability, allowing IT to offer the business a better SLA for these applications. Disaster recovery testing has improved as well. Previously it was a stressful exercise, because both the failover to the outsourcer’s DR datacenters and the failback to the main production site had to be completed within a single weekend. Given the replication times required, this was always a challenge and left little room for error.

With the new solution, failover goes to an alternate AWS Availability Zone. Once the failover succeeds, that zone is declared the new production site, so no failback is required.
