Circle Scales Up Digital Parenting Services with DataStax Astra DB
by Arlene Go. September 29, 2021 · 10 minute read
“Astra DB is built on Cassandra and takes care of the heavy lifting for you. When you’re starting off a new project, you don’t want to spend a week ordering hardware and building something up from scratch. You just want to click a button and go.”
— Nathan Bak, principal engineer, Circle
Circle, based in Portland, Oregon, is a leader in content filtering and screen time management solutions for families. Its Circle Home Plus device can manage all of a family’s internet-connected devices. The Circle Parental Controls App provides content filtering and blocking, time limits, Pause the Internet®, bedtimes, and mobile device management.
For the newest installment of our Q&A series, “Behind the Innovator,” we’re featuring a presentation given at a recent DataStax event by Nathan Bak, Circle’s principal engineer. Nathan’s experience includes more than 13 years at IBM in addition to work at Rational Software, Novell, and Quark. Prior to Circle, Bak was a senior software engineer at DataStax.
Read on to learn how Nathan is using DataStax Astra DB to help Circle enhance its services with the power of a Cassandra database.
1. Can you tell us a little about your environment at Circle?
Although we offer a physical product, Circle is a cloud company. We use a lot of different technologies, but most of our infrastructure is currently hosted on AWS. We’re using Kubernetes, Docker, and Compose. Our microservices are written mostly in Go. At Circle we use relational databases, NoSQL databases, and of course DataStax Astra DB. We started using Astra DB earlier this year and our use has been growing.
We have lots of data, mainly of three types. First, there’s the user account and config-type data. Second is website and app categorization data, related to how appropriate those sites and apps might be for children. Third is history and usage data, detailing which sites or apps family members are using and for how long.
We log history and usage every minute by account, by profile, and by device. We have a lot of customers and each might have one profile per family member. Each profile can have many devices. That becomes a lot of data very quickly, which creates various difficulties.
2. What are the biggest data challenges you’re facing?
Growth is a great problem to have, but it still is a problem. When you’re a fast-growing company, some of the biggest challenges are in right-sizing your database or instance. It’s tricky because when starting out you usually don’t know exactly how big your database needs to be. Whether you’re using your own hardware or spinning up virtual machines, it’s hard to know what you need. You often have to resize or tweak things.
Another problem that you can run into is hard limits. There are a lot of different ways to store data and if you don’t have a lot of data, you don’t have to give it too much thought, because it’ll work fine. But once you get large numbers of records, you can very quickly hit physical limits. For example, if you’re storing your information on hard drives, they’re very fast, but there’s a physical limit to how much you can read and write in one second.
Even if you do have your database figured out and right-sized initially, inevitably one day you’re doing something different and you’ve got an unplanned load. If you’re suddenly hitting your database with a bunch of new requests, that can affect the functionality of your core product. You don’t want customers to have a bad experience.
3. What makes Astra DB so great to use?
What I really like about Astra DB is its elastic scaling. Astra DB has Cassandra backing it, which means it can scale up immensely, without needing to plan around that. Predicting and balancing the workloads and storage sizing can be difficult. Sometimes you have a database with a ton of data that is only infrequently accessed. Other times you have the opposite, where it’s just a small set of data, but it has to work really hard to handle all of the requests.
A good thing about Astra DB is that compute and storage are separated, so you can easily handle either of those extreme scenarios or anywhere in between. And finally, with growth, sometimes you have a setup that works today, but then you go over a limit and it breaks tomorrow. This can cause people to have to wake up in the middle of the night, bleary-eyed, trying to figure out what caused the database to crash.
With Astra DB, you don’t have to be a database expert. You just have to watch the introductory video, do the five clicks, and bam, you’ve got an extremely scalable database with high performance and availability.
4. How are you using Astra DB at Circle?
We’ve been using Astra DB for proofs of concept with new use cases. One new feature that we recently rolled out, and that is now in production for all of our Circle customers, is usage emails. Every week, we’re sending our customers an email showing them the usage for all of the different profiles in their account.
This has been a very interesting use case to develop because it started off with just me, sending a single email to myself about my own usage. I was running all the queries from my laptop. From click to email sent it took about three seconds, which is pretty fast.
When you think about ten emails, that’s 30 seconds, still no problem. 100 emails, five minutes. But a thousand emails, then you’re up to nearly an hour. And then ten thousand, a hundred thousand, or a million … suddenly by the time we’d be done sending all of the emails, it would already be more than a week later and we’d have to start sending out the next batch.
Obviously, what I had running on my laptop wasn’t sufficient and we had to come up with strategies to scale up. It’s been fun to put the capabilities of Astra DB to the test, putting the data in and seeing that we can pump out emails very quickly.
5. Why is Astra DB a good choice even for quick or small projects, like a Proof of Concept?
Moving fast is a hard thing to do. It can be a challenge because creating databases takes a lot of time. With Astra DB, you can create a new database with just a few clicks. Plus, there’s the issue of anticipating the sizing, but with Astra DB you can start small and then grow.
One thing about POCs is that sometimes you start something up and end up throwing it away. Other times, it grows. With Astra DB, it’s really easy to create new databases and throw away the ones that you don’t want.
So if I’m creating a new service, I don’t have to stress about whether I want to connect to this existing database or create a new one. It’s a no-brainer. Just create a new database.
As far as managing costs, if you’re doing something small, that’s very cheap with Astra DB. And with most POC-type projects, you’re probably fine remaining even on the free tier. Then, when you are ready to go from proof of concept to GA or production, all you have to do is turn it up. And by turn it up, I mean just start sending it more data and it will automatically scale.
6. If you had to pick one attribute of Astra DB that stands out for you most at Circle, which one would that be?
The best attribute of Astra DB is definitely Cassandra itself. Cassandra is awesome. It can just keep adding nodes and it’ll keep scaling. When we want to deal with a lot of data, Cassandra is great for doing that.
7. How did you come to the conclusion at Circle that with Astra DB, you really don’t have to worry about storage or scalability?
I’ve observed how it expands dynamically from my own usage. I push a little bit of data and it’s fine. Then sometimes if I push a lot of data, I might see latency pick up a little bit and then suddenly it drops back down to normal. So far it’s handled everything I’ve thrown at it.
8. Why not just use open source Cassandra? Why use Astra DB?
Cassandra is awesome. But Cassandra can be hard to configure. You’ve got to set up all of the nodes and take care of the replication factor, the health dashboard, backups, and things like that.
Fortunately, Astra DB is built on Cassandra and takes care of the heavy lifting for you. When you’re starting off a new project, you don’t want to spend a week ordering hardware and building something up from scratch. You just want to click a button and go.
I’m sure many developers have been in the situation where they throw together a proof of concept, show it to the powers that be, and some executive says “that’s great, let’s ship it”. If you don’t have a robust database backing you, you can be in trouble. But with Astra DB, you know you’ve got lots of flexibility, with all the power of Cassandra. You don’t have to worry about it.
9. For a beginner who doesn’t have much experience with Cassandra yet, what are your best quick tips for using Astra DB? Also, what do you think beginners should try with Astra DB first, before they build something with it?
First off, just create a database. It’s easy to do and it’s not going to be the last Astra database you create. It doesn’t have to be perfect. You don’t have to get the name just right. Just create a database and get started.
And then, if you’re like me, you learn best by doing and by having a project to work on. So don’t look at it as, ‘What can I do with Astra?’ That almost makes it sound like a problem. Astra DB is really a solution, and it’s a great solution for data.
For any project idea that you have, there is a very high chance that you’re going to have to deal with some sort of data. Any time you think about what you’re going to do with the data, think Astra DB.
10. Can you share an example of a simple, beginner-style project where you used Astra DB?
Last year, my daughter got a fish in a fish tank and it was important to her to treat the fish right, such as making sure that the water was the right temperature. We started with an ESP8266 development platform (basically an Arduino with on-board Wi-Fi) and connected it to a temperature sensor. Then we wondered what we were going to do with the data; we needed to put it somewhere. Astra DB was our solution.
Astra DB has a REST interface, so you can push data into it via RESTful APIs. That made it easy to write some code with the Arduino library to start pushing that information into our Astra DB database. Once it was there, we could track the time series data. With the data in Astra, we had some other code that could retrieve the data and display it on a screen in our kitchen so we could monitor the temperature.
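As a rough idea of what such a push looks like, here is a minimal sketch in Go (the original ran as Arduino code on the ESP8266, which is not shown in the interview). Everything specific here is an assumption: the `Reading` type, the column names, the `aquarium` keyspace, the `readings` table, and the placeholder base URL are hypothetical, and the `/v2/keyspaces/{keyspace}/{table}` path and `X-Cassandra-Token` header reflect one understanding of the REST API that should be checked against the current Astra DB documentation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// Reading is one temperature sample. The field and column names are hypothetical.
type Reading struct {
	SensorID string  `json:"sensor_id"`
	TakenAt  string  `json:"taken_at"` // RFC 3339 timestamp
	TempC    float64 `json:"temp_c"`
}

// newReadingRequest builds the POST request that would add one row.
// Assumption: rows are added by POSTing JSON to
// {baseURL}/v2/keyspaces/{keyspace}/{table} with the application token
// in the X-Cassandra-Token header.
func newReadingRequest(baseURL, keyspace, table, token string, r Reading) (*http.Request, error) {
	body, err := json.Marshal(r)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/v2/keyspaces/%s/%s", baseURL, keyspace, table)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Cassandra-Token", token)
	return req, nil
}

func main() {
	r := Reading{
		SensorID: "fish-tank",
		TakenAt:  time.Now().UTC().Format(time.RFC3339),
		TempC:    25.5,
	}
	// The base URL and token are placeholders, not real endpoints or credentials.
	req, err := newReadingRequest("https://example.invalid/api/rest", "aquarium", "readings", "placeholder-token", r)
	if err != nil {
		panic(err)
	}
	// Print instead of sending so the sketch runs without credentials.
	// A real client would call http.DefaultClient.Do(req) and check the status code.
	fmt.Println(req.Method, req.URL.String())
}
```

Sending one such request per minute from the sensor is all the "ingest pipeline" a project like this needs.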
In this case, we’re storing information for only one input device, every minute. There are 10,080 minutes a week, so that’s not a lot of data. It’s certainly nothing compared to what Cassandra can handle, but it got us started and it was a lot of fun. So you can have a very small and simple use case: just figure out what data you’ve got, start putting it into Astra DB, and try playing around with it.
11. We heard you have a helpful code library that you’re willing to share with friends of DataStax who are reading this?
The Astra Dashboard has a Connect tab that has instructions and examples that make it easy to connect to Astra in a variety of ways such as REST, Python, and Java, but unfortunately there aren’t instructions specific to Golang. I put a project up on my personal GitHub account that helps me connect to Astra from Go code quickly: https://github.com/NathanBak/easy-cass-go
If people are interested in connecting up to Astra DB quickly, they can check that out. GoCQL is kind of the de facto Golang library. With Astra DB you get the secure connect bundle, which has all of the certs you need. It can be tricky to figure out how to connect everything, so the code in my repo lets you just pass your client id, your client secret, and your secure connect bundle and it will create and configure the GoCQL session for you. So it’s an easy way to get your Astra DB database started and tap into all of the Cassandra goodness.
Nathan Bak is Principal Engineer at Circle (https://meetcircle.com) where his focus is using data in the cloud to support Circle’s mission “to make families’ lives better online and off.” Before joining Circle, Nathan was a member of the team that built DataStax Astra DB. He also spent several years as an IBM Master Inventor and authored patents and publications in numerous areas including internationalization, dependency management, and build optimization. Nathan holds a BS in Computer Science from the University of Maryland Global Campus and an MA in East Asian Languages and Literatures from the University of Colorado in Boulder.
Sponsored by DataStax