Q&A with Greg Luck, CTO of Hazelcast Inc
Q1. What are your main lessons learned in deploying cloud-native applications in Kubernetes?
Hazelcast is a performance-optimized technology. It required extra work to configure Kubernetes networking to work with Hazelcast without sacrificing performance. Kubernetes is a powerful technology, so you can customize it depending on your architecture and needs. It has a steep learning curve. However, once you know what to do, development for Kubernetes becomes smooth and straightforward.
Kubernetes is also cloud extreme. We had to extensively validate our enterprise product, which has high availability features. You cannot rely on IP addresses at all in Kubernetes, so this had to be reworked, which we did in Hazelcast 3.12.
Kubernetes is also perfect for Microservices – we would go so far as to say that pretty much every Kubernetes application is also going to be a Microservices application. Hazelcast is embedded in each node of the Microservice for perfect isolation of the operational store and caching. So we think Kubernetes and Microservices reinforce each other.
We have eaten our own dog food on this, using Kubernetes and Docker for our own managed service, Hazelcast Cloud, which went live on 26 March.
The result for Hazelcast is simplicity for the end user with the power and speed of Hazelcast, all wrapped in a Helm Chart.
Q2. What is Hazelcast IMDG?
Hazelcast is an in-memory data grid (IMDG), which is a set of distributed processes that share memory and compute to form a cluster. A cluster can run over many machines and can store data in different formats; it can also be used to provide parallel computation. IMDGs have been around for over a decade now and started life by solving problems such as storing user sessions for web server farms in a distributed and HA manner. Since then they’ve been applied to many different use cases where low latency is a primary concern; for example trading systems, e-commerce and fraud detection. Architects often turn to an IMDG when their existing NoSQL or RDBMS systems are unable to scale effectively or to provide low latency and reliable, consistent throughput. In this regard, IMDGs are inserted into an existing architecture and sit above a persistent data store via the use of various integration APIs.
The APIs for working with data mirror those of most popular languages. For example, the Hazelcast Key Value API is based upon the Java Map API. Developers get to choose from many data structures such as Maps, Lists, Sets, Queues, Ringbuffers, etc.
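Because the Key Value API follows the Java Map API, code written against the standard `ConcurrentMap` interface should carry over largely unchanged. The following is a minimal sketch using only the JDK, not Hazelcast code: `SessionStore` and its method names are hypothetical, and a local `ConcurrentHashMap` stands in for the distributed map. With Hazelcast on the classpath, the `IMap` returned by `HazelcastInstance.getMap(...)` also implements `ConcurrentMap` and could be passed to the constructor instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical session store written only against the standard ConcurrentMap interface.
class SessionStore {
    private final ConcurrentMap<String, String> sessions;

    SessionStore(ConcurrentMap<String, String> sessions) {
        this.sessions = sessions;
    }

    // Registers a session only if the id is not already taken; returns true on success.
    boolean register(String sessionId, String user) {
        return sessions.putIfAbsent(sessionId, user) == null;
    }

    // Returns the user for a session id, or null if the session is unknown.
    String userFor(String sessionId) {
        return sessions.get(sessionId);
    }

    // Removes a session; returns true if it existed.
    boolean expire(String sessionId) {
        return sessions.remove(sessionId) != null;
    }

    public static void main(String[] args) {
        // A local ConcurrentHashMap stands in for a distributed map in this sketch.
        SessionStore store = new SessionStore(new ConcurrentHashMap<>());
        System.out.println(store.register("s1", "alice"));
        System.out.println(store.register("s1", "bob"));
        System.out.println(store.userFor("s1"));
    }
}
```

The store depends only on the interface, so moving from a local map to a distributed one is a change at the call site that constructs it.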
Hazelcast is much more than just a distributed in-memory store. Developers often use the Concurrency APIs to facilitate the creation of their own fault-tolerant distributed services. Hazelcast is also a good fit for deployment of Microservices.
The primary advantage of an IMDG is speed, which has become critical in an environment with billions of mobile and IoT devices and other sources continuously streaming data. With all relevant information held in RAM in an IMDG, there is no need to traverse a network to remote storage for transaction processing. The difference in speed is significant – minutes vs. sub-millisecond response times for complex transactions done millions of times per second.
Q3. How does the integration of Hazelcast IMDG with the Kubernetes environment work?
I think Hazelcast and Kubernetes are a perfect fit because Hazelcast requires zero manual work when you want to scale Hazelcast clusters up or down. The only challenge arises if you want to connect to the Hazelcast cluster from a client that is outside the Kubernetes cluster. In that case, you need to configure the Kubernetes network carefully for the best performance on the client side. If you are already deploying your client applications to the same Kubernetes cluster, then you can start using Hazelcast inside Kubernetes without any extra configuration.
See our Kubernetes documentation for step-by-step instructions on how to use Hazelcast in a Kubernetes environment.
Q4. How do you deploy a fully functional Hazelcast cluster?
For both IMDG and Jet there are two ways: embedded and client-server. Let me explain using Hazelcast IMDG as an example.
With embedded, you add the Hazelcast jar to your application. This is often a Spring Boot application. You include a hazelcast.yaml configuration file or configure it programmatically. Here it runs in the lifecycle of your application.
With client-server, you install a Hazelcast tarball, or use Docker or the Hazelcast Helm Chart for Kubernetes. Once again you need to configure it. You then access it using a client, which you can download from https://hazelcast.org/clients. The client is also configured either declaratively in a configuration file or programmatically.
Q5. What about scalability?
Hazelcast IMDG and Jet are both elastically scalable in and out, without causing interruption or downtime. In Hazelcast Cloud, we use this to allow you to slide in and out on demand or to do it automatically when data drops below 40% or rises above 80% of capacity, as shown in the screenshot below.
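The automatic behaviour described above can be pictured as a simple watermark check. The following is a hypothetical sketch, not Hazelcast Cloud's actual logic: it assumes that the 40% figure is the scale-in threshold and the 80% figure is the scale-out threshold, and the class, enum and method names are invented for illustration.

```java
// Hypothetical threshold-based autoscaling decision; not Hazelcast Cloud's actual code.
class AutoScalePolicy {
    enum Action { SCALE_IN, HOLD, SCALE_OUT }

    // Assumed reading of the 40% and 80% figures quoted in the text.
    static final double LOW_WATERMARK = 0.40;
    static final double HIGH_WATERMARK = 0.80;

    // Decides what to do given the memory currently used and the total capacity.
    static Action decide(long usedBytes, long capacityBytes) {
        if (capacityBytes <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        double utilization = (double) usedBytes / capacityBytes;
        if (utilization > HIGH_WATERMARK) {
            return Action.SCALE_OUT; // running hot: add a server
        }
        if (utilization < LOW_WATERMARK) {
            return Action.SCALE_IN;  // mostly idle: remove a server
        }
        return Action.HOLD;          // inside the band: leave the cluster alone
    }

    public static void main(String[] args) {
        System.out.println(decide(85, 100));
        System.out.println(decide(60, 100));
        System.out.println(decide(30, 100));
    }
}
```

Keeping a gap between the two watermarks avoids flapping, where a cluster that has just scaled out immediately qualifies to scale back in.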
The way this works is that in Hazelcast, a new server can be added to the cluster. It finds the cluster, negotiates entry to it and is given a share of data and processing. Other than starting up a new server, no DevOps work is required. In Hazelcast Cloud, we automate the server start process.
Hazelcast can scale to terabytes of memory storage and thousands of CPU cores. Scaling is a more straightforward proposition when compared to RDBMS and NoSQL solutions.
Q6. Can scaling down a Hazelcast cluster result in data loss? If yes, how do you mitigate this?
No. Hazelcast keeps at least one backup copy of data on another server. When a server is shut down there is always another copy. So shutting down a single node will never result in data loss.
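The reasoning above can be checked with a toy model. The sketch below is not Hazelcast's implementation: it assumes a simple round-robin placement with one primary and one backup per partition, and all class and method names are hypothetical. It shows why stopping one server leaves every partition with a surviving copy, and why stopping two servers at the same moment can leave some partitions with none.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of partitions with one primary and one backup copy; not Hazelcast's implementation.
class BackupModel {
    // Returns, for each partition, the two distinct nodes that hold a copy of it.
    static List<Set<String>> assign(List<String> nodes, int partitionCount) {
        if (nodes.size() < 2) {
            throw new IllegalArgumentException("need at least two nodes to hold a backup");
        }
        List<Set<String>> replicas = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            String primary = nodes.get(p % nodes.size());
            String backup = nodes.get((p + 1) % nodes.size()); // always a different node
            replicas.add(new HashSet<>(Arrays.asList(primary, backup)));
        }
        return replicas;
    }

    // Counts partitions with no surviving copy when the given nodes stop at the same time.
    static int lostPartitions(List<Set<String>> replicas, Set<String> stoppedNodes) {
        int lost = 0;
        for (Set<String> holders : replicas) {
            if (stoppedNodes.containsAll(holders)) {
                lost++;
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        List<Set<String>> replicas = assign(Arrays.asList("a", "b", "c"), 6);
        System.out.println(lostPartitions(replicas, new HashSet<>(Arrays.asList("a"))));
        System.out.println(lostPartitions(replicas, new HashSet<>(Arrays.asList("a", "b"))));
    }
}
```

In this model the only way to lose a partition is to stop both of its holders before the data has been copied elsewhere, which is the case that migrating data ahead of shutdown is meant to cover.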
To support scale-in with more than one node at a time, enable Graceful Shutdown, which will proactively migrate data from servers shutting down, using the following environment variable:
See Graceful Shutdown in https://hub.docker.com/r/hazelcast/hazelcast/ for more information. In Hazelcast Cloud, we couple that with a call to isClusterSafe() to do a final check before terminating a process.
Q7. What is Hazelcast Jet, and what is it useful for?
Hazelcast Jet is an engine for in-memory streaming and fast batch processing.
Jet is used for latency-sensitive applications at scale such as near real-time analytics, prediction, fraud detection, event-driven applications or for continuous ETL. The massively parallel continuous streaming core of Jet is designed to keep the latency constantly low with growing workloads. Jet is elastic; the cluster can scale up or down to adapt to a load spike or tolerate machine failures, without affecting the correctness of the computation.
Hazelcast Jet embeds Hazelcast IMDG, so processing jobs can take full advantage of the distributed in-memory data structures provided by Hazelcast IMDG for data ingestion, caching, messaging and data distribution.
Jet is a lightweight, embeddable library with no dependencies. The straightforward deployment makes Jet relevant for containerized applications, IoT (edge) deployments, OEMing, and for microservice architectures.
Q8. What is Hazelcast's roadmap ahead?
Hazelcast Cloud launched in March this year. We will add further features to it for multi-zone, multi-region, multi-cloud and hybrid cloud. These capabilities exist inside Hazelcast Jet and IMDG today and will be made available to our managed service. For larger enterprises, we will also be creating Hazelcast Cloud Dedicated, which will use dedicated, isolated resources in their own VPCs. We will also be adding security certifications.
For IMDG, we are moving into multi-model. We are launching document-oriented features for the first time in 3.12 with high-speed native JSON support. Also in 3.12, we are moving our atomic structures to a CP subsystem. This will enable our users to create robust, correct distributed systems on top of Hazelcast. Finally, we have been making a substantial investment in all things cloud. Part of that is the completion of our Kubernetes feature set in 3.12, with all major public cloud Kubernetes services certified to run Hazelcast.
Greg Luck, CTO of Hazelcast Inc, is a leading technology entrepreneur with more than 15 years of experience in high-performance in-memory computing. He is the founder and inventor of Ehcache, a widely used open source Java distributed cache, which was acquired by Software AG (Terracotta) in 2009, where he served as CTO. Prior to that, Greg was the Chief Architect at Australian startup Wotif.com, which went public on the Australian Stock Exchange (ASX:WTF) in 2006.
Greg is a current member of the JCP (Java Community Process) Executive Committee and since 2007, has been the Specification Lead for JSR 107 (Java Specification Requests) JCACHE.
Greg has a Master’s degree in Information Technology from QUT and a Bachelor of Commerce from the University of Queensland.