Using Couchbase at CenterEdge. Q&A with Brant Burnett
Q1. What are the main technical challenges you face when developing software solutions for the entertainment industry?
When developing solutions for the entertainment industry, one of the biggest challenges we face at CenterEdge Software is dealing with the mix of cloud and on-premise systems. Modern visitors to entertainment facilities expect to be able to make purchases online, especially for capacity-limited venues like movie theaters and trampoline parks. However, the ability to make walk-up purchases quickly and easily is also very important to facilities and their customers.
The rise of broadband internet connections over the last decade has made this kind of business model possible. Unfortunately, highly available and redundant internet connections are still out of reach for the majority of entertainment facilities due to location, cost, or technical skill. Therefore, at CenterEdge we’ve invested a lot of effort into synchronization and conflict resolution paradigms between the on-premise and cloud systems, with the goal of enabling as much functionality as possible in both areas when connectivity is unavailable.
The next major challenge is correlating the data we’ve synchronized into our cloud platforms to build shared views for franchises or other groups of related facilities. For example, while each facility maintains its own customer database, franchise groups often want to share some portion of the customer data as the customer visits other facilities within the group. Maintaining and synchronizing these semi-separate data sets can be quite a challenge, especially at scale.
Q2. Do you work with .NET Couchbase SDKs and especially the LINQ provider?
Yes, we work extensively at CenterEdge with both the .NET Couchbase SDK and the LINQ provider. I think we were also one of the first companies to use the .NET Core flavor of the SDK as we began transitioning to a containerized microservice architecture. Couchbase began working on a .NET Core flavor of the SDK while .NET Core was still in beta, which meant that we were able to take advantage of .NET Core and the associated reductions in operating costs very quickly.
The LINQ provider has been very helpful in easing the transition from traditional RDBMS platforms like SQL Server for our developers. While Couchbase’s N1QL query language is very SQL-like and easy to use, the ability to write LINQ queries in C# provides an even more familiar way to write queries. Using LINQ has also simplified unit testing by providing a straightforward route to in-memory mocks.
At CenterEdge we’re particularly fond of the open source nature of the SDK and LINQ provider. It has given us the opportunity to directly apply improvements to the SDK, which in turn allowed us to move forward quickly with addressing our business needs. Partnering with the Couchbase SDK team to implement these improvements has been very easy and productive. I am also a major contributor to the LINQ provider, which I’ve found very educational and, frankly, a lot of fun.
Q3. How do you use Couchbase at CenterEdge?
At CenterEdge we’ve been using Couchbase in one form or another since 2012. Our first experience was using Couchbase as a managed query cache, as it provided operational advantages above and beyond a simple memcached implementation. The use of Couchbase at CenterEdge quickly escalated, as we found the memory-first JSON architecture was highly effective for managing online shopping carts. We saw very significant performance gains persisting our carts to Couchbase using simple get/set operations versus our previous SQL Server approach.
Over time, as the functionality of Couchbase has continued to grow, so has our use of it. At this point, Couchbase is our go-to database solution in the cloud environment. Being able to persist data with the scale and availability of NoSQL while retaining accessibility via N1QL queries has been a game changer for us.
We have also been able to use Couchbase’s built-in full text search system to power more advanced queries, such as multi-field customer database queries. This offered significant time and cost savings over implementing a separate solution like Elasticsearch.
Q4. How do you guarantee the scalability for your cloud applications?
Designing software that’s guaranteed to be scalable is always an interesting challenge, but it really starts with picking the correct architecture and tools. If any given component of your architecture isn’t scalable, there’s always the risk that you’ll suddenly hit a wall and find you need to scale it and can’t. At CenterEdge we have an interesting story, because we moved gradually from an application that was not at all scalable (and had a major Black Friday outage several years ago) to one that is highly scalable.
Our first step was transitioning from data center colocation to a cloud provider. This enabled us to scale vertically in minutes rather than days or weeks. However, it still required manual intervention and downtime.
Our second step involved some application redesign surrounding state management. In-process state management becomes problematic once you try to scale the application tier horizontally. By transitioning temporary state (such as session state) from in-process to Couchbase, we were able to make the application tier scalable using load balancers and enable auto scaling based upon demand.
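As a rough illustration of this idea (not CenterEdge's actual code), the sketch below moves session state out of the application process and into a shared key-value store, so any instance behind a load balancer can serve any request. The `InMemoryKeyValueStore` class is a hypothetical stand-in for a shared store such as a Couchbase bucket, and the `session::` key prefix, the class names, and the cart items are all assumptions made for the example.

```python
import json
import uuid


class InMemoryKeyValueStore:
    """Hypothetical stand-in for a shared key-value store (e.g. a Couchbase bucket)."""

    def __init__(self):
        self._docs = {}

    def upsert(self, key, value):
        # Serialize to JSON to mimic a document crossing the network.
        self._docs[key] = json.dumps(value)

    def get(self, key):
        raw = self._docs.get(key)
        return None if raw is None else json.loads(raw)


class AppServer:
    """One application instance that keeps no session state in its own memory."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        key = f"session::{session_id}"  # assumed key convention
        session = self.store.get(key) or {"type": "session", "cart": []}
        session["cart"].append(item)
        self.store.upsert(key, session)
        return session

    def view_cart(self, session_id):
        session = self.store.get(f"session::{session_id}")
        return [] if session is None else session["cart"]


# Two instances share one store, as they would behind a load balancer.
store = InMemoryKeyValueStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

session_id = str(uuid.uuid4())
server_a.add_to_cart(session_id, "jump-pass")
server_b.add_to_cart(session_id, "grip-socks")
cart_seen_by_a = server_a.view_cart(session_id)
```

Because neither instance holds the cart in memory, instances can be added or removed without losing sessions. The read-modify-write in `add_to_cart` is not safe under concurrent updates as written; a real implementation would likely rely on optimistic locking, such as the compare-and-swap values Couchbase exposes.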
Finally, we began making our data persistence tier scalable using Couchbase instead of RDBMS. While RDBMS solutions can be made scalable, the effort and cost involved were too high for our taste. Couchbase’s approach to autosharding data as we scale the cluster out with zero downtime has since proven very effective. This also gives us the advantage that all state, both temporary and persisted, can be run through the same platform, further reducing both time and financial costs.
Q5. Do you think it is a good idea to store very large data sets for multiple customers in a single cluster, and why?
In the past at CenterEdge we avoided storing multiple customers’ data sets in the same RDBMS database. There were two major motivating factors for this:
- Keeping data securely isolated, avoiding the potential for accidental cross talk between our customers’ data.
- Maintaining separate data schemas for each customer. As we released new versions of our cloud applications we’d beta test them on a limited subset of customers. However, the release and beta versions of the application would often have different schemas.
Unfortunately, the isolated database approach had some downsides:
- It made the solution more difficult to scale, as it inherently creates a data sharding approach that is per customer. This means that if one customer generates more load on the system than others, that load is artificially constrained to the servers where that customer’s data resides.
- Poor scaling also had a secondary impact, which was high cost of operations. It was necessary to size each server based on peak load for the clients using that server, leaving computing power on the table during off hours. This was exacerbated by the international nature of our business, where sharing computing power is particularly advantageous due to time zone differences.
- We found that separating the customers’ data created additional difficulty getting a consolidated view of the data for our internal purposes, such as analyzing application use patterns and generating bills.
Since we’ve transitioned to using Couchbase for our new cloud products, we’ve also switched to keeping multiple customers’ data sets in a single cluster.
- We’ve addressed our data isolation concerns by including a consistent attribute in every document to identify the associated customer, combined with careful code review and testing.
- The schema concern became a non-factor with Couchbase, since we now manage schema in our data access tier within the application rather than directly in the database. This allows each version to have its own schema without a separate database.
- Data is no longer sharded by customer, and is instead sharded evenly using Couchbase’s autosharding mechanism.
- Load is evenly shared across all data nodes, allowing scaling based on overall peak load rather than per customer peak load.
- Data is easily accessible from one database for combined internal reporting using N1QL queries (or soon via the Analytics system).
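The points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not CenterEdge's implementation: a plain dictionary stands in for a single shared bucket, and the `customerId` attribute name, the key format, and the `TenantScopedRepository` class are all hypothetical. The idea is that the data access tier, rather than each caller, stamps and checks the customer attribute, while internal reporting can still read across every customer.

```python
class TenantScopedRepository:
    """Data access layer bound to one customer; all customers share one store."""

    def __init__(self, documents, customer_id):
        self._documents = documents  # shared dict standing in for one bucket
        self._customer_id = customer_id

    def _key(self, doc_type, doc_id):
        # Assumed key convention: customer, document type, then document ID.
        return f"{self._customer_id}::{doc_type}::{doc_id}"

    def save(self, doc_type, doc_id, body):
        doc = dict(body)
        doc["type"] = doc_type
        # The customer attribute is stamped here, never supplied by callers.
        doc["customerId"] = self._customer_id
        self._documents[self._key(doc_type, doc_id)] = doc

    def get(self, doc_type, doc_id):
        doc = self._documents.get(self._key(doc_type, doc_id))
        # Defensive check in addition to the customer-scoped key.
        if doc is None or doc.get("customerId") != self._customer_id:
            return None
        return dict(doc)

    def find(self, doc_type):
        # Every query is filtered by the bound customer.
        return [
            dict(doc)
            for doc in self._documents.values()
            if doc.get("customerId") == self._customer_id
            and doc.get("type") == doc_type
        ]


def documents_per_customer(documents):
    """Internal reporting: one pass over the shared store covers all customers."""
    counts = {}
    for doc in documents.values():
        counts[doc["customerId"]] = counts.get(doc["customerId"], 0) + 1
    return counts


shared = {}
park_a = TenantScopedRepository(shared, "park-a")
park_b = TenantScopedRepository(shared, "park-b")

park_a.save("guest", "1", {"name": "Ada"})
park_b.save("guest", "1", {"name": "Grace"})
park_b.save("guest", "2", {"name": "Linus"})

usage_report = documents_per_customer(shared)
```

Keeping the filter in one class gives code review a single place to verify isolation, which fits the combination of a consistent attribute, review, and testing described above. In a real deployment the reporting function would more plausibly be a N1QL query grouped by the customer attribute.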
Q6. What is your experience in using the Query Workbench for N1QL?
At CenterEdge we began using N1QL queries when the feature was still in beta, and before there was a Query Workbench. At that time we developed our own very simplistic tools to assist with development and running ad-hoc queries in production. Since the advent of Query Workbench, working with queries in Couchbase has become much more streamlined. I use the Workbench regularly, and find that it almost always meets my needs.
As new versions of Couchbase have been released I have seen the evolution in the Query Workbench tool, and each new version adds exciting new features. The query plan visualizations, both estimated and actual, added in Couchbase Server 5.0 make understanding and optimizing queries a breeze. The Bucket Insights panel is very useful, helping users determine which JSON attributes are available based on schema inference, and recent versions even warn if you try to use attributes that may not be present.
About Brant Burnett