From SQL to NoSQL. Interview with Carlos Fernández
“We like to say that we have the biggest database on companies and sole proprietors in Spain. We handle 7 million national economic agents, and the database undergoes more than 150,000 daily information updates. We have been active since 1992, so our historic file is massive. The database as a whole exceeds 40 Terabytes.” –Carlos Fernández
Q1. Could you describe in a few words what Informa Dun & Bradstreet is and what its figures are?
Carlos Fernández: Informa D&B is the leading business information services company for customer and supplier acquisition, analysis and management. We maintain this leadership in the three markets in which we compete: Spain, Portugal and Colombia.
We like to say that we have the biggest database on companies and sole proprietors in Spain. We handle 7 million national economic agents, and the database undergoes more than 150,000 daily information updates. We have been active since 1992, so our historic file is massive. The database as a whole exceeds 40 Terabytes.
To maintain and update this massive database, we invest 12 million euros every year in data and data handling procedures and systems, and we have 130 data specialists that take care of every single piece of information that we load into the database. Data quality, accuracy and timeliness as well as the coherence between different sources are essential for us.
Q2. I understand that Informa D&B has begun a profound update of its data architecture in order to continue being a market leader for another 10 years. What does the update consist of?
Carlos Fernández: We really began updating when gigabytes were insufficient for our needs. Now we see that terabytes will follow the same path. Petabytes are the future, and we need to be prepared for it. We usually say that when you need to travel to another continent, you need an airplane, not a car.
What does this mean in practical terms? Our customers are used to online responses to their needs. However, these needs have become more complex and require greater data depth.
If you are able to store hundreds of terabytes, use them very quickly and use complex analytic models to easily find the answer to your question, then you are in good shape.
Q3. You mentioned that you have found a new database manager, LeanXcale, to address the challenges for your data platform. What kind of database manager were you using before and why are you replacing it?
Carlos Fernández: INFORMA was, and still is, an “Oracle” company. Having said that, the more we began to move into a Data Lake design, the more new solutions and new names came into play. Mongo, Cassandra, Spark …
So, having come from an SQL-oriented environment featuring many lines of code, we wondered if we could fulfill our new requirements with the old technology. The answer to that query is a clear NO. Can we rewrite INFORMA as a whole? The answer is again NO. Can we meet our new requirements by increasing our computing capacity? Once more, the answer is NO.
We needed to be smart and find a solution that could bring positive outcomes in an affordable technical environment.
Q4. According to you, one of the main improvements has been the acceleration of the process through leveraging the interfaces of LeanXcale with NoSQL and SQL. Can you elaborate on how it helped you?
Carlos Fernández: As I mentioned before, we have quite challenging business and product performance requirements. On the other hand, business rules are also complex and difficult to rewrite for different environments.
Can we solve our issues without a huge investment in expensive servers? Can we also accommodate these requirements in a scalable fashion?
LeanXcale and its NoSQL and SQL interfaces were the perfect match for our needs.
Q5. What are the technical and business benefits of having a linear scaling database such as LeanXcale?
Carlos Fernández: We have many customers. They range from the biggest Spanish companies to small businesses and sole proprietors. They have completely different needs, but, at the same time, they share many requirements, with the main one being immediate response time.
Of course, the amount of data and model complexity involved in generating a response can vary quite a lot, depending on the size of the company and its portfolio.
Only by being able to accommodate such demands with a scalable solution can we provide the required services under a valid cost structure.
Q6. How was your experience with LeanXcale as a provider?
Carlos Fernández: For us, this has been quite an experience. From the very beginning, the LeanXcale team acted as though they worked for INFORMA.
We started with a POC, and it was not an easy one. We had the feeling that we had the best parts of the company involved in the project. Well, not really the feeling since that really was the case.
The key factor, however, was the team’s knowledge, that is, the depth of their technical approach, the extent to which they understood our needs and their ability to reshape many aspects to make our requirements a reality.
Q7. You said that LeanXcale has a high impact on reducing total cost of ownership. Could you provide us with figures comparing it to the previous scenario?
Carlos Fernández: LeanXcale has reduced our processing time by a factor of more than 72. The standard LeanXcale licensing and support price means savings of around 85%. In our case, we have maximized these savings by signing an unlimited License Agreement for the next five years.
Additionally, this improved performance reduces the infrastructure used in our hybrid cloud by the same factor: 72.
However, these savings are less crucial than the operational risk reduction and the enablement of new services. Being ready to react to any unexpected event quickly makes our business more reliable. New services will allow us to maintain our market leadership for the next decade.
Q8. How will this new technology affect the services offered to the customer?
Carlos Fernández: I think that we can consider two periods of time in the answer.
Right now, we are capable of improving the features of our current product range. We can deliver updated external databases faster and more frequently and offer a better customer experience in many areas. We can provide more data and more complex solutions to a wider range of customers.
For the future, we are discovering new ways to design new products and services. When you break down barriers, new ideas come up quite easily. Our marketing team is really excited about the new capabilities we will have. I am sure that we will shortly see many new things coming from us.
QX. Anything else you wish to add?
Carlos Fernández: INFORMA D&B is a company that has put innovation at the top of its strategy. We never stop and will find new opportunities by using LeanXcale. We are very pleased and very sure that we will be a market leader for many years to come!
Carlos Fernández holds a Superior Degree in Physics and an MBA from the “Instituto de Empresa” in Madrid. His professional career has included stints at companies such as Saint Gobain, Indra, Reuters and Fedea.
At the present time, he is Deputy General Manager at INFORMA and a member of the board of the XBRL Spanish Jurisdiction. In addition, he is a member of the Alcobendas City Council’s Open Data Advisory Board. This entity is firmly committed to continuing to advance and publish information in a reusable format to generate social and economic value.
Furthermore, he is a former member of various boards, including the boards of ASNEF Logalty, ASFAC Logalty and CTI.
He is a former member of GREFIS (Financial Information Services Group of Experts) and a current member of XBRL CRAS (Credit Risk Services), for which he is Vice President of the Technical Working Group. He is also a former member of the Information Technologies Advisory Council (CATI) and the AMETIC Association (Multi-Sector Partnership of Electronics, Communications Technology, Telecommunications and Digital Content Companies).