Welcome to the series on Analyzing Time Series Data 👋 Check out the video & learning materials from the author-led workshop.
Many of us work with data that is about time, tracking fluctuations by hour, day, week, month, etc. We need to know what caused a spike or dip, learn why the actuals diverged from the forecast, characterize typical daily, weekly, and seasonal patterns, or see how the correlation between two metrics depends on time.
In this collection of stories we share how we've looked at a “mountain of data” about energy production in the United States and the signals we've found.
These are stories about analyzing data that changes over time. While most of us don't dig into data about energy day-to-day, we hope the feel of this data and these questions will be familiar to anyone who regularly faces questions like “what changed?”, “what happened?”, “was that normal?”, “what is typical?”, and “did things go as expected?” We hope that this will spark an idea about how to look at your own data in a new way.
If you've done this type of analysis work, you know that it's hard and the signals are often subtle. In the crux moment of discovery you’re more likely to say “hmm... that looks funny” than “Eureka!”
These stories are about creating those "hmm..." moments: looking at our data in a way that gives us a chance to notice, and a chance to ask the next question, and the next, until the story is so clear that we forget we had never seen it this way before.
It’s not enough for an important feature of the data to be visible. It needs to be noticeable.
When you compare two metrics over time, standard practice is to draw both lines and look for the gaps. But if the difference is what matters most, visualize the difference.
In this notebook, we show how changing the visual marks exposes a subtle signal of the 2021 Texas Energy crisis.
Read the story here.
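As a rough illustration of "visualize the difference", here is a minimal Python sketch that turns two series into a single difference series, which can then be charted on its own. The values, dates, and the `difference_series` helper are made up for illustration; they are not taken from the notebook or from the energy dataset.

```python
from datetime import date

# Hypothetical daily values for two metrics (illustrative only).
metric_a = {date(2024, 1, 1): 98.0, date(2024, 1, 2): 101.0, date(2024, 1, 3): 80.0}
metric_b = {date(2024, 1, 1): 100.0, date(2024, 1, 2): 105.0, date(2024, 1, 3): 110.0}


def difference_series(a, b):
    """Return {timestamp: a - b} for timestamps present in both series, in time order."""
    shared = sorted(a.keys() & b.keys())
    return {t: a[t] - b[t] for t in shared}


gap = difference_series(metric_a, metric_b)
for day, value in gap.items():
    # A signed value makes over- and under-shoots easy to tell apart.
    print(day.isoformat(), f"{value:+.1f}")
```

Plotting `gap` directly puts the quantity of interest on the axis, instead of asking the reader to estimate the distance between two lines.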
Seeing contributions over time with simple charts
When diving deeper into a dataset, being able to quickly generate different perspectives with visualization can make a big difference. How does visualization fit into your analysis workflow?
In this notebook we take you through the thought process of an analyst using Observable Plot to uncover hidden patterns behind an unusual spike in energy generation in Texas.
Read the story here.
If the only feature you have is the date, how can you uncover meaningful patterns?
Encapsulated in each timestamp is rich information about when something happened: the day of the week, the month of the year, and so on. Yet, when that information is buried in our data, it's easy to overlook obvious questions, like “did this happen on a weekend?”
In this notebook, we augment our dataset with relevant time features to explore fluctuations in energy demand in California.
Read the story here.
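To make the idea of time features concrete, here is a small Python sketch that derives a few calendar fields from a timestamp and attaches them to each row. The rows, the `demand` field, and the `time_features` helper are hypothetical; the notebook may derive different features in a different way.

```python
from datetime import datetime

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")


def time_features(ts):
    """Derive simple calendar features from one timestamp."""
    return {
        "hour": ts.hour,
        "day_of_week": DAY_NAMES[ts.weekday()],  # weekday(): Monday is 0
        "month": ts.month,
        "is_weekend": ts.weekday() >= 5,  # 5 and 6 are Saturday and Sunday
    }


# Hypothetical rows (illustrative values only).
rows = [
    {"timestamp": datetime(2024, 7, 5, 18), "demand": 41.2},
    {"timestamp": datetime(2024, 7, 6, 18), "demand": 37.9},
]

# Attach the derived features to each row without mutating the originals.
augmented = [{**row, **time_features(row["timestamp"])} for row in rows]
for row in augmented:
    print(row["timestamp"].isoformat(), row["day_of_week"], row["is_weekend"])
```

Once fields like `is_weekend` exist as columns, they can be used to color, group, or facet a chart.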
Which chart is “best” depends on the data.
Everybody knows that line charts are the best way to look at one metric over time. But maybe that's not always the case? In this notebook, we look at 6 simple charts that all show exactly the same data in almost the same way. We notice how each chart's effectiveness is influenced by the characteristics of the data itself, and how a series of small tweaks transforms a basic chart into one that clearly reveals the most important aspects of our data.
Read the story here.
How can you assess the relationship between two variables over time?
There are many approaches for analyzing correlation and for analyzing change over time. But, what about when you need to understand both time and correlation at once: to see how the correlation between two metrics varies over different time periods?
Or, how often have you stared at a sea of colorful dots on a scatterplot and felt like it was just an overwhelming mess?
In this notebook, we switch from using only colors to mark the categories in a scatterplot to faceting by category. The resulting mini-charts, one for each time period, reveal nuanced time-based distribution patterns that were literally impossible to see in the standard approach.
Read the story here.
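A numeric companion to faceting is to split the points by time period and compute a correlation within each group. The sketch below does this with a hand-written Pearson coefficient; the records, the field names (`period`, `x`, `y`), and both helpers are assumptions for illustration, and the notebook itself works with charts rather than this calculation.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_period(records):
    """Group (x, y) points by their 'period' field and correlate each group."""
    groups = defaultdict(list)
    for r in records:
        groups[r["period"]].append((r["x"], r["y"]))
    return {
        period: pearson([x for x, _ in pts], [y for _, y in pts])
        for period, pts in groups.items()
    }


# Hypothetical records: the relationship flips sign between periods.
records = [
    {"period": "winter", "x": 1.0, "y": 2.0},
    {"period": "winter", "x": 2.0, "y": 4.0},
    {"period": "winter", "x": 3.0, "y": 6.0},
    {"period": "summer", "x": 1.0, "y": 6.0},
    {"period": "summer", "x": 2.0, "y": 4.0},
    {"period": "summer", "x": 3.0, "y": 2.0},
]

by_period = correlation_by_period(records)
for period, r in by_period.items():
    print(period, round(r, 3))
```

In this made-up example the two periods have opposite relationships, so a single scatterplot of all six points would hide a pattern that the per-period view makes plain.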
The above stories are driven by two rich public datasets that showcase real-world patterns. In the process of developing the stories we built the following two interfaces for making it easier to access the data in visualization-ready form.
An interface for exploring a large, complex and powerful public dataset.
An interface for quickly downloading historical weather data from NOAA.
Watch a video workshop and/or work through the exercises to learn how to create charts like the ones used in this series of stories.
More in the Analyzing Time Series Data collection: What’s different? Analyzing Time-Series Forecast Performance