On Security and Time Series Databases. Q&A with Darin Fisher
Q1. What is security event monitoring?
In the most basic sense, security event monitoring is the process of observing and reacting to known or derived security events.
Q2. Why are time series databases a solution for managing security events and anomalies?
One efficient way to monitor security is to model user behavior using time series data and watch it for anomalies over time. Depending on the individual SaaS product or service, there could be five or more metrics to collect for creating a mathematical model that describes “normal” user behavior.
For example, for a developer platform, you could model commands such as “commit” or “clone” to get a sense of a typical level of activity. Over time, you will start to see how often these commands are used per day, week, and month on average, as well as where they originate from geographically. Let’s say you have 80 engineers and almost all of them are based in the US and Western Europe, but you suddenly see a connection delivering commands from Ukraine. That would be an obvious red flag that something might be — and likely is — up.
Similarly, most organizations perform only a few clone operations each day or week; employing time series data to model activity over the course of a few months reveals your organization’s typical use. If your graph suddenly spikes to 100 or more where you usually see three, you know you’ve got a problem.
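The spike detection described above can be sketched as a simple baseline comparison. This is an illustrative sketch, not InfluxData's actual model; the sample counts and the three-sigma threshold are invented for the example:

```python
import statistics

def is_spike(history, current, threshold=3.0):
    """Flag `current` as anomalous if it sits more than `threshold`
    standard deviations above the historical mean."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # guard against zero variance
    return (current - mean) / stdev > threshold

# Hypothetical daily "clone" counts over two weeks: usually about three per day.
daily_clones = [3, 2, 4, 3, 3, 2, 5, 3, 4, 3, 2, 3, 4, 3]

print(is_spike(daily_clones, 3))    # False: a typical day
print(is_spike(daily_clones, 100))  # True: the sudden jump to 100
```

In production you would compute the baseline continuously over a sliding window rather than a fixed list, but the comparison itself is this simple.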
Q3. How can you be quickly informed of alerts and incidents without having to dig into log files?
The best way to accomplish this is to set up automated processes. For example, you can use a time series database, like InfluxDB, to collect and analyze log data. Next, you’d create models from that data; InfluxDB can visualize these models and establish alerting thresholds. There is a lot of flexibility when configuring alerts, so you can pick from a range of notification services or create your own. The nice thing is that InfluxDB can handle most of the process; it’s on us as individuals to follow up on the alerts it generates.
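As a rough stand-in for the threshold-and-notify step (which InfluxDB tasks handle natively), the logic reduces to comparing metrics against configured limits and dispatching a message when one is crossed. The metric names and thresholds below are invented for illustration:

```python
def check_thresholds(metrics, thresholds, notify):
    """Compare each metric against its alert threshold and call
    `notify` for any that exceed it; return the alerting metric names."""
    alerts = []
    for name, value in metrics.items():
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            alerts.append(name)
            notify(f"ALERT: {name}={value} exceeds threshold {limit}")
    return alerts

# Hypothetical metrics derived from log data.
metrics = {"failed_logins_per_hour": 42, "clone_ops_per_day": 3}
thresholds = {"failed_logins_per_hour": 20, "clone_ops_per_day": 50}

check_thresholds(metrics, thresholds, print)  # fires one alert
```

Passing `notify` in as a parameter mirrors the flexibility mentioned above: the same check can post to a chat channel, a pager, or email without changing the monitoring logic.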
Q4. Why is log data on its own not an efficient way of finding the anomalies?
Log data is great for identifying anomalies and when they occur. However, log data tends to be both verbose and granular, so it can be a challenge to find the information you want. When it comes to security monitoring, you need to think in terms of behavior patterns, and logs don’t provide those. Similarly, an anomalous event may occur, but because it looks like a bunch of other normal events, you can’t identify it as an anomaly. To get a full range of security anomalies, log data needs to be placed in the context of time. Establishing a time window enables you to convert event data into metrics. Once you do that, you can see how regularly the anomalies occur and determine whether they even are anomalies. At the same time, you can correlate log metrics with other events and other data sources to put together a more comprehensive, and accurate, picture of security threats.
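The time-window idea (converting individual log events into a countable metric) can be sketched in a few lines, using hypothetical login timestamps pulled from a log file:

```python
from collections import Counter
from datetime import datetime

def events_per_hour(timestamps):
    """Bucket raw ISO-format event timestamps into hourly counts,
    turning individual log events into a time series metric."""
    counts = Counter()
    for ts in timestamps:
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[hour.isoformat()] += 1
    return dict(counts)

# Hypothetical login events extracted from a log file.
logins = [
    "2024-03-01T09:05:00",
    "2024-03-01T09:40:00",
    "2024-03-01T10:02:00",
]
print(events_per_hour(logins))
# {'2024-03-01T09:00:00': 2, '2024-03-01T10:00:00': 1}
```

Once events are counts per window, they can be graphed, baselined, and correlated with other metrics like any other time series.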
Q5. What does it take to successfully implement a security monitoring infrastructure?
There are a number of items to consider when building a security monitoring system. First, you want to make sure that you have the broadest possible observability. That means knowing your sources, bringing that data in, and being able to process it in a way that makes sense. Next, you need to understand your risks. What are your attack vectors? Are some systems more susceptible to attack than others? How do you know, and what does anomalous behavior for that vector look like? Being able to answer questions like these will go a long way toward protecting your entire ecosystem. Concomitant with that is understanding your supply chain. How do systems integrate, and how will a breach in one system affect other systems upstream or downstream? Finally, and this may seem obvious but can be challenging for some organizations, you need to create as much transparency as possible with all stakeholders in your organization. You want to work to actually solve problems and avoid putting up gates or blocking stakeholders from getting access to the information they need.
Q6. Can you describe, in a nutshell, the functional architecture of the InfluxData security monitoring platform?
The architecture is very similar to most modern monitoring solutions with some extra collection and observability models.
In a nutshell, however, the process looks like this: Collect appropriate events and metrics, translate those events into measurable metrics, store those metrics, model the data, observe and analyze the data in the models, generate notifications for data that crosses the established thresholds, and evaluate those alerts. This entire process constantly repeats.
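Making no assumptions about InfluxData's internals, that repeating loop can be sketched as a function wiring toy versions of each stage together (every stage implementation below is hypothetical):

```python
def run_monitoring_cycle(collect, to_metrics, store, model, alert):
    """One pass of the collect -> translate -> store -> model -> alert loop."""
    events = collect()            # 1. collect appropriate events
    metrics = to_metrics(events)  # 2. translate events into measurable metrics
    store(metrics)                # 3. store the metrics
    anomalies = model(metrics)    # 4. model, observe, and analyze the data
    for a in anomalies:           # 5. notify on threshold crossings
        alert(a)
    return anomalies

# Toy stages wired together; in practice a time series database and its
# tasks take over storage, modeling, and alert delivery.
stored = []
anoms = run_monitoring_cycle(
    collect=lambda: ["login_fail"] * 25,
    to_metrics=lambda ev: {"login_failures": len(ev)},
    store=stored.append,
    model=lambda m: [k for k, v in m.items() if v > 20],
    alert=lambda a: print(f"alert: {a}"),
)
```

In a real deployment this cycle runs continuously on a schedule rather than as a single call, but the data flow between stages is the same.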
Q7. Security monitoring is about anomaly detection — what are the deviations from normal? What patterns should one look for?
This can vary greatly, but I’ve found that most anomalies fall into a few categories: rate of change, sudden changes, and general deviations from normal. It’s important to understand what the “normal” patterns are for each category. This helps to maintain a more accurate baseline and mitigate the frequency of false positives.
For instance, if we’re looking at authentication data, different groups within an organization will have different access patterns. Heavy travelers (e.g., sales, marketing, executives, etc.) frequently change source addresses. In this case, the lack of consistency in source location actually becomes a pattern. Contrast this with individuals and teams within the same organization who don’t travel and have a very limited source pattern. We use time series data to create historical models for each group and individual, establishing “norms” that make anomalies easier to detect.
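The per-group baselines described above can be illustrated with a standard score computed against each group's own history. The groups, the "distinct source countries per week" metric, and the sample values are all invented for the example:

```python
import statistics

def anomaly_score(history, value):
    """Standard score of `value` against a group's own historical baseline."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # guard against zero variance
    return abs(value - mean) / stdev

# Hypothetical distinct source countries seen per week, per group.
sales_history = [6, 12, 3, 10, 4]      # heavy travelers: high variance is normal
engineering_history = [1, 1, 2, 1, 1]  # office-bound team: very stable

# The same observation (4 countries in one week) scores very differently:
print(anomaly_score(sales_history, 4))        # low score: within normal variation
print(anomaly_score(engineering_history, 4))  # high score: a clear deviation
```

Because each group carries its own mean and variance, behavior that is routine for travelers is still flagged for the office-bound team, which is exactly why per-group baselines cut down on false positives.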
Q8. If you store the event data in InfluxDB, how do you view the anomaly information across any time period?
InfluxDB includes a web-based user interface to build and test the logic behind your security monitoring solution, and to then visualize that data. InfluxDB allows you to create automated queries and tasks to process and transform that data. You can also use tasks to set alert delivery thresholds and destinations. Many customers have very specific needs for time series data, and InfluxDB supports developers with a very robust API.
Q9. Many cloud services and SaaS applications don’t provide access to security events such as logins. Is this a problem?
It can be, yes. Observability is difficult without automated visibility. But that’s one of the reasons why it’s so important to proliferate the idea that this type of data is security data. It’s our hope that SaaS vendors will recognize this and start to provide greater access to this type of data.
Q10. What are some use cases for using InfluxDB for security monitoring?
We have seen InfluxDB used for several different security monitoring approaches. Some examples include simple certificate state monitoring, authentication verification, integration with other security tools (see the Fail2Ban integration), distributed transactional services monitoring, network traffic analysis, and user authentication tracking.
Q11. You have announced that InfluxDB Cloud is SOC 2® Type 1 and Type 2 certified. What does that mean in practice?
This demonstrates to our customers and partners our commitment to their compliance requirements and the transparency of our security practices. In other words, this shows the world that we actually do the security practices we claim to do.
Darin Fisher, Sr. Manager Security and Compliance, InfluxData
Darin has over 30 years of technical experience covering technical support, network engineering, software development, and large-scale data processing systems. Having spent a good deal of his career working with banks and tech companies, he now gets to spend his time merging operations, engineering, and security, along with trying to lower his golf scores.