The client faced persistent performance issues across approximately 1800 ETL scripts built on the Spark framework. These issues caused frequent breaches of service level agreements (SLAs) during batch execution.
Delays in data availability had a detrimental effect on the client’s ability to track business performance, make strategic and tactical decisions, and feed downstream systems in a timely manner.
To tackle this challenge, Ingrity devised an automated framework to identify inefficiencies in Spark ETL scripts. By analysing the scripts, we pinpointed areas for improvement such as script sequencing and code optimization. The comprehensive review of around 1800 scripts, along with improvement suggestions, was completed in less than three weeks, well short of the initial estimate of three to four months.
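As a rough illustration of what an automated review of this kind might involve, the sketch below scans script text for a few patterns that are often costly in PySpark and derives a run order from the tables each script reads and writes. It is not Ingrity's framework, whose internals are not described here. The rule list, the function names `scan_script` and `run_order`, and the assumption that scripts read with `spark.table(...)` and write with `.saveAsTable(...)` are all hypothetical.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules: (label, regex) pairs for patterns that are often costly.
RULES = [
    ("collect() pulls all rows to the driver", re.compile(r"\.collect\(\)")),
    ("crossJoin can explode row counts", re.compile(r"\.crossJoin\(")),
    ("repartition(1) forces a single partition", re.compile(r"\.repartition\(\s*1\s*\)")),
]

# Assumed conventions for how scripts read and write tables.
READS = re.compile(r"spark\.table\(\s*[\"']([\w.]+)[\"']\s*\)")
WRITES = re.compile(r"\.saveAsTable\(\s*[\"']([\w.]+)[\"']\s*\)")


def scan_script(source):
    """Return (line_number, label) for each rule match in one script."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in RULES:
            if pattern.search(line):
                findings.append((number, label))
    return findings


def run_order(scripts):
    """Order scripts so each runs after the scripts that write its inputs."""
    # Map each table to the script that writes it.
    writers = {}
    for name, source in scripts.items():
        for table in WRITES.findall(source):
            writers[table] = name
    # A script depends on the writers of every table it reads.
    graph = {}
    for name, source in scripts.items():
        deps = {writers[t] for t in READS.findall(source) if t in writers}
        deps.discard(name)  # ignore self-dependencies
        graph[name] = deps
    # Raises graphlib.CycleError if the scripts depend on each other in a loop.
    return list(TopologicalSorter(graph).static_order())
```

Given a mapping of script names to source text, `scan_script` could be run over each entry to build a findings report, and `run_order` would suggest a batch sequence. A text-based scan like this only catches surface patterns; a real review would likely also need execution plans and runtime metrics.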
We provided the offshore maintenance team with a detailed report outlining gaps in the client’s ETL framework and recommending enhancements. Implementing these recommendations substantially reduced ETL failure rates. The timely evaluation also surfaced inefficiencies in data routines that could otherwise have led to the failure of the entire data infrastructure.
As a result of our intervention, the client eliminated SLA breaches and achieved timely information delivery to all stakeholders, including risk, operations, and management teams. The data operations support teams spent significantly less time supporting production data delivery runs.
With improved data availability, the client gained operational efficiency and the ability to make informed decisions based on up-to-date information. Timely data also removed disruptions to their business processes and prevented knock-on delays in downstream systems.