Challenges and Opportunities for Big Data. Interview with Mike Hoskins
“We are facing an imminent torrent of machine generated data, creating volumes that will break the back of conventional hardware and software architectures. It is no longer be feasible to move the data to the compute process – the compute process has to be moved to the data” –Mike Hoskins.
On the topic, Challenges and Opportunities for Big Data, I have interviewed Mike Hoskins, Actian Chief Technology Officer.
Q1. What are in your opinion the most interesting opportunities in Big Data?
Mike Hoskins: Until recently, most data projects were solely focused on preparation. Seminal developments in the big data landscape, including Hortonworks Data Platform (HDP) 2.0 and the arrival of YARN (Yet Another Resource Negotiator) – which takes Hadoop’s capabilities in data processing beyond the limitations of the highly regimented and restrictive MapReduce programming model – provides an opportunity to move beyond the initial hype of big data and instead towards the more high-value work of predictive analytics.
As more big data applications are built on the Hadoop platform customized by industry and business needs, we’ll really begin to see organizations leveraging predictive analytics across the enterprise – not just in a sandbox or in the domain of the data scientists, but in the hands of the business users. At that point, more immediate action can be taken on insights.
Q2. What are the most interesting challenges in Big Data?
Mike Hoskins: We are facing an imminent torrent of machine generated data, creating volumes that will break the back of conventional hardware and software architectures. It is no longer be feasible to move the data to the compute process – the compute process has to be moved to the data. Companies need to rethink their static and rigid business intelligence and analytic software architectures in order to continue working at the speed of business. It’s clear that time has become the new gold standard – you can’t produce more of it; you can only increase the speed at which things happen.
Software vendors with the capacity to survive and thrive in this environment will keep pace with the competition by offering a unified platform, underpinned by engineering innovation, completeness of solution and the service integrity and customer support that is essential to market staying power.
Q3. Steve Shine, CEO and President, Actian Corporation, said in a recent interview (*) that “the synergies in data management come not from how the systems connect but how the data is used to derive business value”. Actian has completed a number of acquisitions this year. So, what is your strategy for Big Data at Actian?
Mike Hoskins: Actian has placed its bets on a completely modern unified platform that is designed to deliver on the opportunities presented by the Age of Data. Our technology assets bring a level of maturity and innovation to the space that no other technology vendor can provide – with 30+ years of expertise in ‘all things data’ and over $1M investment in innovation. Our mission is to arm organizations with solutions that irreversibly shift the price/performance curve beyond the reach of traditional legacy stack players, allowing them to get a leg up on the competition, retain customers, detect fraud, predict business trends and effectively use data as their most important asset.
Q4. What are the products synergies related to such a strategy?
Mike Hoskins: Through the acquisition of Pervasive Software (a provider of big data analytics and cloud-based and on-premises data management and integration), Versant (an industry leader in specialized data management), and ParAccel (a leader in high-performance analytics), Actian has compiled a unified end-to-end platform with capabilities to connect, prep, optimize and analyze data natively on Hadoop, and then offer it to the necessary reporting and analytics environments to meet virtually any business need. All the while, operating on commodity hardware at a much lower cost than legacy software can ever evolve to.
Q5. What else still need to be done at Actian to fully deploy this strategy?
Mike Hoskins: There are definitely opportunities to continue integrating the platform experience and improve the user experience overall. Our world-class database technology can be brought closer to Hadoop, and we will continue innovating on analytic techniques to grow our stack upward.
Our development team is working diligently to create a common user interface across all of our platforms, as we bring out technology together. We have the opportunity to create a true first-class SQL engine running natively Hadoop, and to more fully exploit market leading cooperative computing with our On-Demand Integration (ODI) capabilities. I would also like to raise the awareness of the power and speed of our offerings as a general paradigm for analytic applications.
We don’t know what new challenges the Age of Data will bring, but we will continue to look to the future and build out a technology infrastructure to help organizations deal with the only constant – change.
Q6. What about elastic computing in the Cloud? How does it relate to Big Data Analytics?
Mike Hoskins: Elastic cloud computing is a convulsive game changer in the marketplace. It’s positive; if not where you do full production, at the very least it allows people to test, adopt and experiment with their data in a way that they couldn’t before. For cases where data is born in the cloud, using a 100% cloud model makes sense. However, much data is highly distributed in cloud and on-premises systems and applications, so it’s vital to have technology that can run and connect to either environments via a hybrid model.
We will soon see more organizations utilizing cloud platforms to run analytic processes, if that is where their data is born and lives.
Q7. How is your Cloud technology helping Amazon`s Redshift?
Mike Hoskins: Amazon Redshift leverages our high-performance analytics database technology to help users get the most out of their cloud investment. Amazon selected our technology over all other database and data warehouse technologies available in the marketplace because of the incredible performance, extreme scalability, and flexibility.
Q8. Hadoop is still quite new for many enterprises, and different enterprises are at different stages in their Hadoop journey.
When you speak with your customers what are the typical use cases and requirements they have?
Mike Hoskins: A recent survey of data architects and CIOs by Sand Hill Group revealed that the top challenge of Hadoop adoption was knowledge and experience with the Hadoop platform, followed by the availability of Hadoop and big data skills, and finally the amount of technology development required to implement a Hadoop-based solution. This just goes to show how little we have actually begun to fully leverage the capabilities of Hadoop. Businesses are really only just starting to dip their toe in the analytic water. Although it’s still very early, the majority of use cases that we have seen are centered around data prep and ETL.
Q9. What do you think is still needed for big data analytics to be really useful for the enterprise?
Mike Hoskins: If we look at the complete end-to-end data pipeline, there are several things that are still needed for enterprises to take advantage of the opportunities. This includes high productivity, performant integration layers, and analytics that move beyond the sphere of data science and into mainstream business usage, with discovery analytics through a simple UI studio or an analytics-as-a-service offering. Analytics need to be made more available in the critical discovery phase, to bring out the outcomes, patterns, models, discoveries, etc. and begin applying them to business processes.
Qx. Anything else you wish to add?
Mike Hoskins: These kinds of highly disruptive periods are, frankly, unnerving for the marketplace and businesses. Organizations cannot rely on traditional big stack vendors, who are unprepared for the tectonic shift caused by big data, and therefore are not agile enough to rapidly adjust their platforms to deliver on the opportunities. Organizations are forced to embark on new paths and become their own System Integrators (SIs).
On the other hand, organizations cannot tie their future to the vast number of startups, throwing darts to find the one vendor that will prevail. Instead, they need a technology partner somewhere in the middle that understands data in-and-out, and has invested completely and wholly as a dedicated stack to help solve the challenge.
Although it’s uncomfortable, it is urgent that organizations look at modern architectures, next-generation vendors and innovative technology that will allow them to succeed and stay competitive in the Age of Data.
Mike Hoskins, Actian Chief Technology Officer
Actian CTO Michael Hoskins directs Actian’s technology innovation strategies and evangelizes accelerating trends in big data, and cloud-based and on-premises data management and integration. Mike, a Distinguished and Centennial Alumnus of Ohio’s Bowling Green State University, is a respected technology thought leader who has been featured in TechCrunch, Forbes.com, Datanami, The Register and Scobleizer. Mike has been a featured speaker at events worldwide, including Strata NY + Hadoop World 2013, the keynoter at DeployCon 2012, the “Open Standards and Cloud Computing” panel at the Annual Conference on Knowledge Discovery and Data Mining, the “Scaling the Database in the Cloud” panel at Structure 2010, and the “Many Faces of Map Reduce – Hadoop and Beyond” panel at Structure Big Data 2011. Mike received the AITP Austin chapter’s 2007 Information Technologist of the Year Award for his leadership in developing Actian DataRush, a highly parallelized framework to leverage multicore. Follow Mike on Twitter: @MikeHSays.
–Big Data Analytics at Thomson Reuters. Interview with Jochen L. Leidner. November 15, 2013
– On Big Data. Interview with Adam Kocoloski. November 5, 2013
– Data Analytics at NBCUniversal. Interview with Matthew Eric Bassett. September 23, 2013
(*) Acquiring Versant –Interview with Steve Shine. March 6, 2013
– “￼Do You Hadoop? A Survey of Big Data Practitioners”, Bradley Graham M. R. Rangaswami, SandHill Group, October 29, 2013 (.PDF)
–ActianVectorwise 3.0: Fast Analytics and Answers from Hadoop. Actian Corporation
Paper | Technical | English | DOWNLOAD(PDF)| May 2013|
Roberto, nice perspective on Big Data. With the explosion of big data, companies are faced with data challenges in three different areas. First, you know the type of results you want from your data but it’s computationally difficult to obtain. Second, you know the questions to ask but struggle with the answers and need to do data mining to help find those answers. And third is in the area of data exploration where you need to reveal the unknowns and look through the data for patterns and hidden relationships. The open source HPCC Systems big data processing platform can help companies with these challenges by deriving insights from massive data sets quick and simple. Designed by data scientists, it is a complete integrated solution from data ingestion and data processing to data delivery. Their built-in Machine Learning Library and Matrix processing algorithms can assist with business intelligence and predictive analytics. More at http://hpccsystems.com