March 13, 2018 | Written by: Cariappa Monaiah
Tags: Big Data
Our client is one of the world’s leading biotechnology companies. It is a value-based company, deeply rooted in science and innovation, that transforms new ideas and discoveries into medicines for patients with serious illnesses.
The biotechnology company’s Enterprise Data Lake consisted of multiple Hadoop clusters hosted on premises in its own network. These clusters contained over 300 terabytes of processed and unprocessed datasets critical to business functions.
The company needed a clean and efficient disaster recovery solution covering both backup and recovery.
MLens from Knowledge Lens was the chosen solution, as it satisfied all of the backup and recovery conditions.
MLens Data Migration HDFS backup synchronized HDFS directories to AWS S3 buckets and scheduled incremental backups that detected changes in the datasets and transferred only the updated files.
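The idea behind an incremental backup can be illustrated with a minimal sketch. It assumes each backup run records a manifest of file paths with their size and modification time, and that a file counts as changed when either value differs. The `Manifest` type and the `files_to_transfer` function are hypothetical names chosen for illustration, not part of MLens, and MLens may detect changes differently.

```python
from typing import Dict, List, Tuple

# Hypothetical manifest: file path -> (size in bytes, modification time).
Manifest = Dict[str, Tuple[int, int]]


def files_to_transfer(previous: Manifest, current: Manifest) -> List[str]:
    """Return paths that are new, or whose size or mtime changed, since the last run."""
    changed = [
        path
        for path, fingerprint in current.items()
        # A path missing from the previous manifest yields None, so it counts as changed.
        if previous.get(path) != fingerprint
    ]
    return sorted(changed)


if __name__ == "__main__":
    last_run = {"/data/a.parquet": (100, 1000), "/data/b.parquet": (200, 1000)}
    this_run = {
        "/data/a.parquet": (100, 1000),  # unchanged, skipped
        "/data/b.parquet": (250, 2000),  # modified, transferred
        "/data/c.parquet": (300, 2000),  # new, transferred
    }
    print(files_to_transfer(last_run, this_run))
```

Only the modified and new files are selected, which is what keeps repeated backups of a large data lake cheap compared with a full copy.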
MLens Platform Migration Backup connected to Cloudera Manager and backed up the latest cluster configurations at the defined frequency.
The MLens Data Migration RDBMS backup feature backed up tables from the Oracle database using connection details, table names and specific queries.
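A query-driven table backup can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module as a stand-in for Oracle so that it runs anywhere, and the `export_query_to_csv` function, the `patients` table and the CSV output format are all assumptions, not MLens behaviour.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn: sqlite3.Connection, query: str) -> str:
    """Run a query and return its result set as CSV text with a header row."""
    cursor = conn.execute(query)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # cursor.description holds one entry per result column; item 0 is the column name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


if __name__ == "__main__":
    # In-memory database standing in for the real source system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO patients VALUES (?, ?)", [(1, "a"), (2, "b")])
    print(export_query_to_csv(conn, "SELECT id, name FROM patients ORDER BY id"))
```

Passing a query rather than a bare table name lets a backup job select a subset of rows or columns, which matches the "specific queries" option described above.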
In a disaster recovery scenario, MLens Platform Migration Restore created a new Hadoop cluster from the configurations backed up to AWS S3, mapping the new hosts to the old ones.
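Mapping new hosts to old ones amounts to rewriting every hostname that appears in the backed-up configuration. The sketch below shows one way to do that, assuming the configuration is a flat dictionary of string values. The `remap_hosts` function and the example hostnames are hypothetical and do not reflect how MLens stores or restores configurations.

```python
import re
from typing import Dict


def remap_hosts(config: Dict[str, str], host_map: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of config with each old hostname replaced by its new one."""
    if not host_map:
        return dict(config)
    # Longest names first so a longer hostname wins over a shorter one it contains.
    alternatives = "|".join(
        re.escape(host) for host in sorted(host_map, key=len, reverse=True)
    )
    # The lookarounds stop a match inside a longer hostname-like token.
    pattern = re.compile(r"(?<![\w.-])(?:" + alternatives + r")(?![\w.-])")
    # A single pass per value means a replacement is never itself replaced again.
    return {
        key: pattern.sub(lambda match: host_map[match.group(0)], value)
        for key, value in config.items()
    }


if __name__ == "__main__":
    backed_up = {"fs.defaultFS": "hdfs://old-nn.example.com:8020"}
    mapping = {"old-nn.example.com": "new-nn.example.com"}
    print(remap_hosts(backed_up, mapping))
```

Replacing all hostnames in one pass avoids the chained-substitution problem that arises when a new hostname happens to match another old one.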
The MLens Data Migration Restore feature recovered the directories that had been backed up by the MLens Data Migration job. Using its distributed framework, it ensured that recovery was fast and business impact was minimal.
MLens Data Migration Restore also restored the RDBMS tables and records that had been scheduled for backup using the MLens Data Migration RDBMS job.
Key takeaway: While there are many data ingestion, backup and recovery tools available, no other solution provides such a wide range of features as MLens, which not only ensures accurate data migration but also recovers the cluster and the data.