Case Study Overview:
A Fortune 100 firm set out to migrate its Enterprise Data Lake from a Cloudera platform to a new data platform, primarily to (a) reduce cost, (b) adopt modern data architectures, and (c) importantly, make the data lake easy to govern and consume.
Business Challenges:
The customer's data lake had been in development for many years. Through iterative development and on-the-fly data governance, the lake had over time become very difficult to consume, lacking proper metadata management, access provisioning, and cataloging. The lake was built on a Hadoop cluster, and with costs growing, the customer needed to re-evaluate its options for new technologies and platforms.
Solutions Delivered:
Quadratic Systems was the primary partner for designing and implementing the data lake solution on a new platform comprising:
(1) On-prem S3 object store (Scality) for data storage (replacing HDFS),
(2) Spark/Scala on Kubernetes containers (CaaS platform),
(3) Dremio as a query tool (replacing Hive/Impala).
Scality was chosen for data storage and a CaaS platform (Kubernetes) for compute, replacing the existing Hadoop environment.
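To make the storage and compute split more concrete, the following is a minimal sketch of the Spark properties that typically point a Spark job at a Kubernetes master and an S3-compatible on-prem object store through the Hadoop S3A connector. The case study does not describe the actual configuration used, so the `LakeSparkConf` object, the `build` helper, and every endpoint, image name, and credential below are hypothetical placeholders. Only the property keys are standard Spark and Hadoop S3A settings.

```scala
// Hypothetical helper (not from the case study): assembles Spark properties
// for running on Kubernetes against an S3-compatible on-prem object store.
object LakeSparkConf {
  def build(
      k8sApiServer: String,   // placeholder, e.g. "https://k8s.example.internal:6443"
      containerImage: String, // placeholder Spark image name
      s3Endpoint: String,     // placeholder object store endpoint
      accessKey: String,
      secretKey: String
  ): Map[String, String] = Map(
    // Spark uses the k8s:// prefix to select the Kubernetes scheduler backend.
    "spark.master" -> s"k8s://$k8sApiServer",
    "spark.kubernetes.container.image" -> containerImage,
    // S3A settings; the spark.hadoop. prefix forwards them to the Hadoop config.
    "spark.hadoop.fs.s3a.endpoint" -> s3Endpoint,
    // On-prem S3-compatible stores commonly need path-style bucket addressing.
    "spark.hadoop.fs.s3a.path.style.access" -> "true",
    "spark.hadoop.fs.s3a.access.key" -> accessKey,
    "spark.hadoop.fs.s3a.secret.key" -> secretKey
  )

  def main(args: Array[String]): Unit = {
    val conf = build(
      "https://k8s.example.internal:6443",
      "example-registry/spark:latest",
      "https://s3.example.internal",
      sys.env.getOrElse("S3_ACCESS_KEY", "changeme"),
      sys.env.getOrElse("S3_SECRET_KEY", "changeme")
    )
    // Print the property names only, so credentials never reach the console.
    conf.keys.toSeq.sorted.foreach(println)
  }
}
```

In a real job, each pair could be passed to `SparkSession.builder().config(key, value)`, after which data in the object store is addressed with `s3a://` paths in place of the `hdfs://` paths used on the Hadoop cluster.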
Data Governance:
Data Ingestion Framework:
Data Consumption:
Copyright © 2022 Quadratic Systems, Inc. - All Rights Reserved.