top of page
Abstract Background

Retail Data Analytics Case Study – Computing Revenue Leakage


A Fortune 500 retailer

Business Challenge

Our client, a Fortune 500 retailer had around 80 storefront locations in the US.

The challenge faced by the retailer was that these locations had a legacy system wherein the product ordered was recorded in a free-form text field. Hence, it was not clear as to what product was delivered and if the correct price was charged for the item.

Secondly, once the identification of the product was done, they also wanted to do further product correlations to assess top performing products and which customers were giving them the most business through these purchases.


In this case, the client was expecting a 5% hit ratio, but we were able to achieve a 40% hit ratio that enabled the client to:

  • Compute revenue leakage that justified the migration of the legacy system to a more modern and effective system, and

  • Prioritize marketing decisions for their various products.


Menerva used natural language processing algorithms from Python libraries to parse the legacy system data and identify the product matching the order. Apache Spark was used to process all the legacy data from all the locations in a timely manner.

download (2).png


Hit Ratio

From 5% to 40%

bottom of page