Skip to content

Big Data and Procurement. Interview with Shobhit Chugh

by Roberto V. Zicari on May 19, 2015

“The future of procurement lies in optimising cost and managing risk across the entire supplier base; not just the larger suppliers. Easy access to a complete view of supplier relationships across the enterprise will help those responsible for procurement to make favorable decisions, eliminate waste, increase negotiating leverage and manage risk better. “–Shobhit Chugh.

Data Curation, Big Data and the challenges and the future of Procurement/Supply Chain Management are among the topics of the interview with Shobhit Chugh, Product Marketing Lead at Tamr, Inc.


Q1. In your opinion, what is the future of Procurement/Supply Chain Management?

Shobhit Chugh: Procurement spend is one of the largest spend items for most companies; and supplier risk is one of the items that keeps CEOs of manufacturing companies up at night. Just recently, for example, an issue with a haptic device supplier created a shortage of Apple Watches just after the product’s launch.

At the same time, the world is changing: more data sources are available with increasing variety, and that keeps changing with frequent mergers and acquisitions. The future of procurement lies in optimizing cost and managing risk across the entire supplier base; not just the larger suppliers. Easy access to a complete view of supplier relationships across the enterprise will help those responsible for procurement to make favorable decisions, eliminate waste, increase negotiating leverage and manage risk better.

Q2. What are the current key challenges for Procurement/Supply Chain Management?

Shobhit Chugh: Companies looking for efficiency in their supply chains are limited by the siloed nature of procurement. The domain knowledge needed to properly evaluate suppliers typically resides deep in business units and suppliers are managed at ground level, preventing organizations from taking a global view of suppliers across the enterprise. Those people selecting and managing vendors want to drive terms that favor their company, but don’t have reliable cross-enterprise information on suppliers to make those decisions, and the cost of organizing and analyzing the data has been prohibitive.

Q3. What is the impact of Big Data on the Procurement/Supply Chain?

Shobhit Chugh: A brute force, manual effort to get a single view of suppliers on items such as terms, prices, risk metrics, quality, performance, etc. has traditionally been nearly impossible to do cost effectively. Even if the data exists within the organization, data challenges make it hard to consolidate information into a single view across business units. Rule-based approaches for unifying this data have scale limitations and are difficult to enforce given the distributed nature of procurement. And this does not even include the variety of external data sources that companies can take advantage of, which further increases the potential impact of big data.

Big data changes the situation by providing the ability to evaluate supplier contracts and performance in real time, and puts that intelligence in the hands of people working with suppliers so they can make better decisions. Big data holds significant promise, but only when data unification brings the decentralized data and expertise together to serve the greater good.

Q4. Why does this challenge call for data unification?

Shobhit Chugh: The quality of analysis coming out of procurement optimization is directly related to the volume and quality of data going in. Bringing that data together is no minor feat. In our experience, any individual in an organization can effectively use no more than ten percent of the organization’s data even under very good conditions. Given the distributed nature of procurement, that figure is likely dramatically lower in this situation. Cataloging the hundreds or thousands of internal and external data sources related to procurement provides the foundation for improved decision making.

Similarly, the ability to compare data is directly correlated to the ability to match data points in the same category or related to the same supplier. This is where top-down approaches often get bogged down. Part names, supplier names, site IDs and other data attributes need to be normalized and organized. The efficiency of big data is severely limited if like data sets in various formats aren’t brought together for meaningful comparison.

Q5. How is data unification related to Procurement/Supply Chain Management?

Shobhit Chugh: There are several ways for highly trained data scientists to combine a handful of sources for analysis. Procurement optimization across all suppliers is a markedly different challenge. Procurement data for a company could reside in dozens to thousands of places with very little similarity with regard to how the data is organized. Not only is this data hard for a centralized resource to find and collect, it is hard for non-experts to properly organize and prepare for analysis.
This data must be curated so that analysis returns meaningful results.

One thing I want to emphasize is that data unification is an ongoing activity rather than a one-time integration task. Companies that recognize this continue to extract the maximum value out of data, and are also able to adapt to opportunities to bring in more internal and external data sources when the opportunity presents itself.

Q6. Can you put that in the context of a real world example?

Shobhit Chugh: A highly diversified manufacturer we work with wanted a single view of suppliers across numerous information silos spanning multiple business units. A supplier master list would ultimately contain over a hundred thousand supplier records from many ERP systems. Just one business unit was maintaining over a dozen ERP systems, with new ERP systems regularly coming on line or being added through acquisitions. The list of suppliers also changed rapidly, making functions like deduplication nearly impossible to maintain. Additionally, the company wanted to integrate external data to enrich internal data with information on each supplier’s fiscal strength and structure.

A “bottom-up,” probabilistic approach to data integration proved to be more scalable than a traditional “top-down” manual approach, due to the sheer volume and variety of data sources. Specifically, the company leveraged our machine learning algorithms to continuously re-evaluate and remove potential duplicate entries, driving automation supported by expert guidance into a previously manual process performed by non-experts. The initial result was elimination of 33 percent of suppliers from the master list, just through deduplication.

The company then looked across multiple businesses’ governance systems for suppliers that were related through a corporate structure and identified a significant overlap. Using the same core master list, operational teams were able to treat supplier subsidiaries as different entities for payment purposes, while analytics teams got a global view of a supplier to ensure consistent payment terms. From hundreds of single-use sources, the company created a single view of suppliers with multiple important uses.

Q7. When you talk about data curation, who is doing the curation and for whom? Is it centralized?

Shobhit Chugh: Everyone responsible for a supplier relationship, and the corresponding data, has an interest in the completeness of the data pool, and an interest in the most complete analysis possible. They don’t have an interest in committing the time required to unify the data manually. Our approach is to use ever-improving machine learning to handle the bulk of data matching and rely on subject matter experts only when needed. Further, the system learns which experts to ask each time help is needed, depending on the situation. Once the data is unified, it is available for use by all, including data scientists and corporate leaders far removed from the front lines.

Q8. Do all data-enabled organizations need to hire the best data scientists they can find?

Shobhit Chugh: Yes, data-driven companies should create data-driven innovation, and non-obvious insights often take good data scientists who are tasked with looking beyond the next supplier for ways data can impact other areas of the business. Here, too the decentralized model of data unification has dramatic benefits.

The current scarcity of qualified data scientists will only deepen as the growth in demand is expected to far outpace the rate of qualified professionals entering the field. Everyone is looking to hire the best and brightest data scientists to get insights from their data, but relentless hiring is the wrong way to solve the problem. Data scientists spend 80 percent of their time finding and preparing data, and only twenty percent actually finding answers to critical business questions. Therefore, the better path to scaling data scientists is enabling the ones you have to spend more time on analysis rather than data preparation.

Q9. What is the ROI a company could expect from using data unification for procurement?

Shobhit Chugh: Procurement is an exciting area for data unification precisely because once data is unified, value can be derived using existing best practices, now with a much larger percentage of the supplier base.
Value includes better payment terms, cost savings, higher raw material and part quality and lower supplier risk.
Seventy-five to 80 percent of the value of procurement optimization strategies will come from smaller suppliers and contracts, and data unification unlocks this value.

Q10. What do you predict will be the top five challenges for procurement to tackle in the next two years?

Shobhit Chugh: Using data unification and powerful analysis tools, companies will begin to see immediate value from:
• Achieving “most favored” status from suppliers and eliminating poorly structured contracts where suppliers have multiple customers in your organization
• Build holistic relationships with supplier parent organizations based on the full scope of their subsidiaries’ commitments
• Eliminate rules-based approaches to supplier sourcing and other top-down strategies in favor of data-driven, bottom-up strategies that make use of expertise and data spread throughout the organization
• Embrace the variety of pressure points in procurement – price, delivery, quality, minimums, payment terms, risk, etc. – as ways to customize vendor relationships to suit each need rather than a fog that obscures the value of each contract
• Identify the internal procurement “rock stars” and winning strategies that drive the most value for your organization and replicate those ideas enterprise-wide

Qx. Anything else you wish to add?

Shobhit Chugh: The final component we haven’t discussed is the timing associated with these gains.
We’ve seen procurement optimization projects performed in days or weeks that unleash the vast untapped majority of data locked in previously unknown sources. Not long ago, similar projects focused on just the top suppliers took months and quarters. Addressing the full spectrum of suppliers in this way was not feasible. The combination of data unification and big data is perfectly suited to bringing value quickly and sustaining that value by staying on top of the continual tide of new data.

Shobhit Chugh leads product marketing for Tamr, which empowers organizations to leverage all of their data for analytics by automating the cataloging, connection and curation of “hard-to-reach” data with human-guided machine learning. He has spent his career in tech startups including High Start Group, Lattice Engines,Adaptly and Manhattan Associates. He has also worked as a consultant at McKinsey & Company’s Boston and New York offices, where he advised high tech and financial services clients on technology and sales and marketing strategy.
Shobhit holds an MBA from Kellogg School of Management, a Master’s of Engineering Management in Design from McCormick School of Engineering at Northwestern University, and a Bachelor of Technology in Computer Science from Indian Institute of Technology, Delhi.


Procurement: Fueling optimization through a simplified, unified view, White Paper Tamr (Link to Download , Registration required)

Data Curation at Scale: The Data Tamer System (LINK to .PDF)

Can We Trust Probabilistic Machines to Prepare Our Data? By Daniel Bruckner,

Smooth Sailing for Data Lakes: Hadoop/Hive + Data Curation, Tamr, FEATURED CONTENT, INSIGHTS,


Related Posts

On Data Curation. Interview with Andy Palmer, ODBMS Indutry Watch, January 14, 2015

On Data Mining and Data Science. Interview with Charu Aggarwal. ODBMS Industry Watch Published on 2015-05-12 Experts Notes
Selected contributions from experts panel:
Big data, big trouble.
Data Acceleration Architecture/ Agile Analytics.
Critical Success Factors for Analytical Models.
Some Recent Research Insights Operations Research as a Data Science Problem.
Data Wisdom for Data Science.

Follow on Twitter: @odbmsorg


From → Uncategorized

No comments yet

Leave a Reply

Note: HTML is allowed. Your email address will not be published.

Subscribe to this comment feed via RSS