Posted on March 13, 2018
Client introduction
Our Client is one of the world’s leading biotechnology companies. It is a value-based company, deeply rooted in science and innovation to transform new ideas and discoveries into medicines for patients with serious illnesses.
The Opportunity
The biotech company’s Enterprise Data Lake consisted of multiple Hadoop clusters hosted on-premise in its network. These clusters contained over 300 terabytes of processed and unprocessed datasets critical to business function.
The company needed a clean and efficient disaster recovery solution which would do the following:
 Back up datasets which are regularly updated on a daily/weekly basis to AWS S3
 Back up cluster configurations regularly to AWS S3
 Back up all cluster metadata and logs stored in Oracle tables to AWS S3
 Restore the on-premise cluster from the backed up configurations
 Restore all data backed up in AWS S3
 Restore all metadata and logs backed up in AWS S3
Solution
MLens from Knowledge Lens was the chosen solution, as it satisfied all the backup and recovery requirements.
 Back up datasets which are regularly updated on a daily/weekly basis to AWS S3
MLens Data Migration HDFS backup synchronized HDFS directories to AWS S3 buckets and scheduled incremental backups, which detected changes in the datasets and transferred only the updated files.
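The idea behind an incremental backup can be illustrated with a minimal sketch. This is not MLens code and does not reflect how MLens detects changes internally; it assumes, purely for illustration, that each file is described by its size and modification time and that a manifest of the previous backup is available. The names `FileStat` and `files_to_transfer` and all paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStat:
    """Hypothetical per-file metadata used to detect changes."""
    size: int
    mtime: float


def files_to_transfer(source, last_backup):
    """Return paths that are new or changed since the last backup manifest."""
    changed = []
    for path, stat in source.items():
        # A file is transferred if it is missing from the manifest
        # or its size/modification time differs from the recorded one.
        if last_backup.get(path) != stat:
            changed.append(path)
    return sorted(changed)


# Illustrative data only: current state of the source directories.
source = {
    "/data/sales/part-0000": FileStat(1024, 1520900000.0),
    "/data/sales/part-0001": FileStat(2048, 1520986400.0),
    "/data/logs/app.log": FileStat(512, 1520986400.0),
}

# Illustrative data only: what the previous backup recorded.
last_backup = {
    "/data/sales/part-0000": FileStat(1024, 1520900000.0),
    "/data/sales/part-0001": FileStat(2000, 1520900000.0),
}

print(files_to_transfer(source, last_backup))
```

Here only the new log file and the modified `part-0001` file would be selected for transfer; the unchanged `part-0000` file is skipped.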
 Back up cluster configurations regularly to AWS S3
MLens Platform Migration Backup connected to Cloudera Manager and backed up the latest configurations at the defined frequency.
 Back up all cluster metadata and logs stored in Oracle tables to AWS S3
MLens Data Migration RDBMS backup feature backed up tables from Oracle Database using connection details, table names and specific queries.
 Restore the on-premise cluster from the backed up configurations
MLens Platform Migration Restore created a new Hadoop cluster from the configurations backed up in AWS S3, mapping new hosts to the old ones for a disaster recovery scenario.
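Mapping new hosts to old ones can be pictured with a small sketch. This is an assumption-laden illustration, not the MLens or Cloudera Manager configuration format: it assumes a backed up configuration is a simple mapping of role names to host lists, and the function `remap_hosts` and all hostnames are hypothetical.

```python
def remap_hosts(config, host_map):
    """Return a copy of a role-to-hosts config with old hostnames replaced.

    Hosts without an entry in host_map are kept unchanged.
    """
    return {
        role: [host_map.get(host, host) for host in hosts]
        for role, hosts in config.items()
    }


# Illustrative data only: roles as recorded in the backed up configuration.
old_config = {
    "namenode": ["old-nn1.example.com"],
    "datanode": ["old-dn1.example.com", "old-dn2.example.com"],
}

# Illustrative data only: replacement hosts for the recovery cluster.
host_map = {
    "old-nn1.example.com": "new-nn1.example.com",
    "old-dn1.example.com": "new-dn1.example.com",
    "old-dn2.example.com": "new-dn2.example.com",
}

new_config = remap_hosts(old_config, host_map)
print(new_config)
```

The original configuration is left untouched, so the same backup can be replayed against a different set of replacement hosts.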
 Fast recovery of all data backed up in AWS S3
MLens Data Migration Restore feature recovered the directories which were backed up with the MLens Data Migration job. Using its distributed framework, it ensured that recovery was fast and business impact was minimal.
 Restore all metadata and logs backed up in AWS S3
MLens Data Migration Restore restored the RDBMS tables and records that had been scheduled for backup using the MLens Data Migration RDBMS job.
Additional benefits
 Zero downtime was achieved through MLens’s unique feature of query support directly on live backups.
 MLens software delivered high speed parallel data processing without landing zones.
 It supported compression and format conversion during backup.
Key takeaway
While there are many data migration, backup and recovery tools available, no other solution provides such a wide range of disaster recovery capabilities, which not only ensure data migration but also recover the cluster and the data.