Apache Flink Offers a Challenge to Spark

by Nick Heudecker  |  September 13, 2015

While Apache Spark has been hogging most of the data processing and analytics (DPA) spotlight over the last year, Apache Flink has managed to turn a few heads for real-time use cases. If you’re unfamiliar with Flink, it’s a memory-centric stream processing engine that can also do batch processing. Spark, on the other hand, is a memory-centric batch processing engine that simulates stream processing through micro-batches. (I’m oversimplifying, but I feel the rough comparison holds up.)
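To make that rough comparison a little more concrete, here is a toy Python sketch of the two models. It uses neither the Flink nor the Spark API; the function names, the arrival times and the 500 ms batch interval are all illustrative assumptions, and processing time is ignored. It only shows how long an event waits before the engine starts working on it.

```python
from typing import List


def per_event_waits(arrivals_ms: List[int]) -> List[int]:
    """Event-at-a-time model: each event is picked up as soon as it arrives."""
    return [0 for _ in arrivals_ms]


def micro_batch_waits(arrivals_ms: List[int], interval_ms: int) -> List[int]:
    """Micro-batch model: each event waits until its batch interval closes."""
    waits = []
    for arrival in arrivals_ms:
        # Assumption: an event landing exactly on a boundary joins the next batch.
        batch_close = (arrival // interval_ms + 1) * interval_ms
        waits.append(batch_close - arrival)
    return waits


arrivals = [100, 450, 900, 1200]  # hypothetical arrival times in milliseconds
print(per_event_waits(arrivals))
print(micro_batch_waits(arrivals, 500))
```

In this simplified model the wait under micro-batching is bounded by the batch interval, which is why the interval tends to set a floor on end-to-end latency for micro-batch streaming.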

Based on a few user reports, Flink’s advantages relative to Spark are:

  • True event-based processing
  • Better memory management
  • Less configuration

Despite these potential advantages, information management and operational intelligence vendors haven’t embraced Flink the way they have Spark. (Flink is commercially supported only by data Artisans, the startup formed to offer support and training for Flink.) There are a few possible reasons for the lack of support: geography, attention and incumbents.

First, Flink originated at the Technical University of Berlin, not in Silicon Valley. It’s possible a culture of “not invented here” is keeping attention off Flink. Frankly, I think this is the least probable reason, but it merits a mention. More probable is the finite amount of attention available in the space. Flink became a top-level project in January of 2015, after Spark had already captured a substantial amount of developer and vendor attention. Finally, there are several incumbent open source stream processing frameworks already available. Apache Storm and Samza are probably the best known, but there’s also the newly incubating Apache Apex. And if you want your stream processing in the cloud, Google, Microsoft and others have increasingly robust stories to tell.

I don’t expect the lack of broad commercial support to last. Early adopter customers will demand it from their information management and operational intelligence vendors, and the vendors will have little choice but to react. Whether Flink makes a difference to mainstream customers, or is simply another windmill to tilt at, remains to be seen.

 

Nick Heudecker 
Research Director
2 years at Gartner
16 years IT Industry

Nick Heudecker is an Analyst in Gartner Intelligence’s Information Management group. He is responsible for coverage of big data and NoSQL technologies.

Originally posted at Gartner’s blog.

Related Post

On Apache Flink. Interview with Volker Markl. ODBMS Industry Watch June 24, 2015
