Why Time Series Data Matters
By Susannah Brodnitz
What is time series data?
As the world becomes more automated, sensors and applications need to make smarter decisions in real time, and time series data is becoming increasingly important. Time series data includes any kind of measured variable with a time stamp, from temperature to internet traffic. Common sources of time series data include sensors, applications and systems.
Our world is built around time, and time series data is a part of everything. It’s used to identify patterns in data, make real-time decisions, and forecast future events. Two kinds of time series data are metrics and events. Metrics are taken at regular intervals and events are taken at irregular intervals, as external events occur or users choose to take a measurement. You can average events over regular intervals to transform them into metrics.
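To make the last point concrete, here is a minimal Python sketch of averaging irregular events into regular-interval metrics. The function name `events_to_metrics`, the sample readings, and the 60-second interval are illustrative assumptions, not part of any particular product.

```python
from collections import defaultdict
from statistics import mean


def events_to_metrics(events, interval_s):
    """Average irregular (timestamp, value) events into fixed-width buckets.

    Returns a list of (bucket_start, mean_value) sorted by time.
    Intervals that contain no events are simply absent from the result.
    """
    buckets = defaultdict(list)
    for ts, value in events:
        # Align each event to the start of the interval it falls in.
        bucket_start = (ts // interval_s) * interval_s
        buckets[bucket_start].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical events: (seconds since start, reading), arriving irregularly.
events = [(3, 10.0), (17, 20.0), (58, 30.0), (61, 40.0), (130, 50.0)]
metrics = events_to_metrics(events, 60)
```

Each output row now sits on a regular 60-second grid, which is what makes the result behave like a metric rather than a stream of events.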
Challenges of working with time series data
As more time series data is collected, it’s important to understand how to work with it effectively. Because time series data comes in a linear order, the newest data should almost always be appended to existing data. Some aspects of time series data that require special handling include lifecycle management, summarizing data over time, and scanning over large ranges of time.
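As a rough illustration of why append-ordered data helps with range scans, the sketch below keeps timestamps sorted by only ever appending, then uses binary search to find a time window. The class `AppendOnlySeries` is a hypothetical in-memory stand-in, not how any specific database stores data.

```python
from bisect import bisect_left


class AppendOnlySeries:
    """In-memory series that only accepts points in non-decreasing time order."""

    def __init__(self):
        self.timestamps = []
        self.values = []

    def append(self, ts, value):
        # Rejecting out-of-order points keeps the timestamp list sorted.
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("points must be appended in time order")
        self.timestamps.append(ts)
        self.values.append(value)

    def scan(self, start, end):
        """Return all points with start <= ts < end using binary search."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_left(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))


series = AppendOnlySeries()
for ts in range(0, 50, 10):
    series.append(ts, ts * 1.5)

window = series.scan(10, 30)
```

Because the list is always sorted, locating the edges of a window takes logarithmic time no matter how much history has accumulated.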
A common lifecycle for time series data starts with an initial phase of collecting finely detailed data and storing it for a brief period of time. Later phases may summarize and downsample data, which then gets stored long term. One of the most common queries for time series data is a summary over a large time period. Queries like this can take a long time to run if the data isn’t stored in a database that is built to handle time series data.
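The lifecycle described above can be sketched in a few lines of Python: keep recent points at full detail and replace older points with per-interval averages. The names `apply_lifecycle` and `downsample`, and the retention and interval values, are assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, interval_s):
    """Summarize (timestamp, value) points as one mean per fixed interval."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[(ts // interval_s) * interval_s].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


def apply_lifecycle(points, now, raw_retention_s, interval_s):
    """Split points by age: old ones are downsampled, recent ones stay raw.

    Returns (summarized_old_points, raw_recent_points).
    """
    cutoff = now - raw_retention_s
    old = [(ts, v) for ts, v in points if ts < cutoff]
    recent = [(ts, v) for ts, v in points if ts >= cutoff]
    return downsample(old, interval_s), recent


# Hypothetical readings every 30 seconds for 5 minutes.
points = [(t, float(t)) for t in range(0, 300, 30)]
summarized, recent = apply_lifecycle(
    points, now=300, raw_retention_s=120, interval_s=60
)
```

Here six old raw points collapse into three summaries while the four most recent points keep their full resolution. A query over a long time range then reads far fewer rows.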
Using a time series database
Time series databases differ from the more common relational databases. Rather than organizing data into rows and columns to quickly find relationships between data points, they are designed to handle the unique workloads of time series data. They help developers build applications for IoT, monitoring, and analytics more efficiently, for quicker and more valuable results. InfluxDB is a leading time series database, and it is the one this article highlights. InfluxDB is a platform built specifically for time series data that is optimized for both metrics and events. It can handle large volumes of time series data, with precision down to nanoseconds.
Each data point in InfluxDB is assigned a measurement name, key/value pair tags for metadata, key/value pair fields for data values, and a time stamp. Users can query data across time by measurement, tag, or field. This database structure lets a single server handle over 2 million writes per second.
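To show how these four parts fit together, the sketch below formats a point as text in the style of InfluxDB's line protocol (measurement and tags, then fields, then a nanosecond time stamp). It is a deliberately simplified assumption-laden helper: `to_line_protocol` is a hypothetical function, not part of an InfluxDB client library, and it handles only float fields and performs no escaping of special characters.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as 'measurement,tags fields timestamp'.

    Simplified sketch: assumes float-valued fields and names without
    spaces, commas, or equals signs (no escaping is applied).
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    # Tags are optional; fields and the time stamp follow after spaces.
    head = f"{measurement},{tag_part}" if tag_part else measurement
    return f"{head} {field_part} {timestamp_ns}"


# Hypothetical sensor reading; all names and values are illustrative.
line = to_line_protocol(
    measurement="temperature",
    tags={"location": "lab", "sensor": "s1"},
    fields={"celsius": 21.5},
    timestamp_ns=1700000000000000000,
)
```

Tags carry the metadata you filter and group by, while fields carry the measured values. In practice a client library would handle the formatting and escaping.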
InfluxDB isn’t just a database — it’s a complete time series platform that includes a UI and dashboarding tools, the collection agent Telegraf, and the query language Flux. InfluxDB also has client libraries so you can work with it in the programming language of your choice. Common time series queries, such as data summaries, return results from months of data points in milliseconds. InfluxDB also has built-in methods to manage data lifecycles so developers don’t have to write and implement deletion schemes on top of their applications.
InfluxDB use cases typically fall into three primary categories: DevOps monitoring, real-time analytics, and IoT monitoring. In DevOps, application performance monitoring gives companies insights into the health of an application. Because InfluxDB is built to handle real-time analytics and large volumes of time series data in a central platform, it works well for application performance monitoring. All metrics, events, logs, and tracing data involved in application performance monitoring can be integrated and monitored in one platform.
Some businesses offer metrics and analytics as a service. InfluxDB allows them to keep all data in one platform with built-in visualizations and querying capabilities. Real-time data analysis helps companies make smart decisions. This type of time series work is important for all kinds of companies. Some users create applications that use InfluxDB as the database and backbone, and others use InfluxDB for internal business or development metrics.
Sensors in IoT devices collect time series data that is used for automation and predictive maintenance, and to inform future engineering decisions. InfluxDB has a lot of users working in factories, energy plants, smart homes, and more.
Time series data is everywhere, and it’s driving the most important decisions companies make. Paul Dix, the creator of InfluxDB, said in an interview, “The context of time is critical to understanding both historical and current performance.” As more of the world becomes automated, more data will be collected, and it’s important to handle it properly to get actionable insights. Using a database built for time series data lets users focus on the applications they’re building, secure in the knowledge that their storage and queries will be optimized.
Susannah Brodnitz is a Technical Writer at InfluxData. Before this, she studied physics at Oberlin College and worked as a physical oceanography researcher at the University of North Carolina Wilmington.
Sponsored by InfluxData