Mike Bostock has been busy. Our CTO and co-founder is the author of D3.js, an open-source library for data visualization that celebrated its 10th anniversary this year, and Observable Plot, a new library for exploratory visualization that builds on D3. He also designs and develops features for the Observable platform.
For this week’s Future of Data Work series, we sat down with Mike for a Q&A about his motivations and how he hopes people work with data in the future.
Between your time as a graphics editor for The New York Times, starting Observable, developing D3, and now Plot, you’ve dedicated a significant portion of your life to visualization. What is it about visualization that has kept your commitment going?
I like the human-centric nature of visualization: you’re creating something whose only goal is to help people understand or communicate. A tool for thought, rather than a better mousetrap. Visualization revolves around human perception and cognition, and while there’s technical wizardry and science happening under the hood, it can be appreciated by anyone (at least, anyone with vision). The reader doesn’t need a PhD to see the value.
There’s also a visceral gratification in producing something you can see. It’s less abstract; it’s visual, spatial, tangible. Occasionally it’s beautiful. I suspect that’s why a lot of new programmers are drawn to visualization.
But most important is just the belief that visualization helps people. As Don Norman says, it makes us smart. It sounds idealistic, but it keeps me going.
How would you describe your approach to building technology that helps people understand their data? To me it reads as a systemic approach of removing barriers and obstacles.
Right, it’s a progression. You don’t set out to climb the mountain in one bound. You break it down into smaller steps that are achievable individually, yet which work together to address the larger goal.
I started Observable thinking about how we construct displays of data. What makes coding user interfaces so hard? How can we make it easier to incorporate dynamic or interactive elements, like parameterized models, randomized simulations, and animations? I wanted an environment for that sort of work. I half-jokingly referred to it as building a new IDE, but an integrated discovery environment rather than development environment.
At the same time, I didn’t want to introduce limitations on the displays themselves. Constraints would make the problem easier, but uninteresting. I wanted to preserve creativity and expressiveness. As a kid and as an adult, I never want to do what someone tells me to do; I want freedom to make up my own mind. This necessitated a low-level approach at the beginning. Code for everything, even paragraphs of text! And the depth of modern web technologies is staggering. But I was also making interaction more intrinsic to the medium—it should be taken for granted, like air—and thereby making the work easier and more enjoyable.
Is your hope that people won’t have to learn web technologies in the future?
I always want people to learn and I don’t want to discourage learning. It’s more, what is the right thing to learn at the right time? Do I think people need to memorize the SVG specification to construct a meaningful visualization? No. Do I think people should learn D3? It shouldn’t be your top priority. It’s a thing you learn progressively when it’s useful to you. Should you study technologies or data? Both constitute learning. We’re not trying to eliminate learning; we’re optimizing what we spend our time on, asking how best to spend our finite cognitive resources.
Because D3 is so low-level, I worry about the effort expended on technical aspects that don’t directly contribute to understanding. To make a chart in D3, you think, “OK, first I need scales to translate abstract data into pixels; then I need rect elements with the appropriate attributes for each bar; then g elements for each axis, translated for margins,” and so on. Would this time be better spent thinking about the data? Ask, what is interesting in the data? Should I be looking at other data? What is the most effective form to convey my insights? The struggle to instruct computers takes away from human needs.
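To make those low-level steps concrete, here is a minimal sketch in plain JavaScript. It is not D3's API: linearScale, bandScale, and barChartSvg are hypothetical hand-rolled helpers, and the name and value fields are assumed for illustration. It only shows the bookkeeping the answer above describes: scales, one rect per bar, and g elements translated for margins.

```javascript
// Hand-rolled stand-ins, NOT D3's API; all names here are hypothetical.

// Step 1: scales translate abstract data into pixels.
function linearScale([d0, d1], [r0, r1]) {
  return (v) => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
}

function bandScale(names, [r0, r1], padding = 0.1) {
  const step = (r1 - r0) / names.length;
  const scale = (name) => r0 + names.indexOf(name) * step + (step * padding) / 2;
  scale.bandwidth = step * (1 - padding);
  return scale;
}

function barChartSvg(
  data,
  {width = 300, height = 150, margin = {top: 10, right: 10, bottom: 20, left: 30}} = {}
) {
  const innerWidth = width - margin.left - margin.right;
  const innerHeight = height - margin.top - margin.bottom;
  const maxValue = Math.max(...data.map((d) => d.value));

  const x = bandScale(data.map((d) => d.name), [0, innerWidth]);
  // SVG's y axis grows downward, so the output range is flipped.
  const y = linearScale([0, maxValue], [innerHeight, 0]);

  // Step 2: one rect element per bar, with attributes computed from the scales.
  const bars = data
    .map((d) => `<rect x="${x(d.name)}" y="${y(d.value)}" width="${x.bandwidth}" height="${innerHeight - y(d.value)}"/>`)
    .join("");

  // Step 3: a g element per axis, positioned relative to the chart area.
  const xAxis =
    `<g transform="translate(0,${innerHeight})">` +
    data.map((d) => `<text x="${x(d.name) + x.bandwidth / 2}" y="15" text-anchor="middle">${d.name}</text>`).join("") +
    `</g>`;
  const yAxis =
    `<g>` +
    [0, maxValue].map((v) => `<text x="-5" y="${y(v)}" text-anchor="end">${v}</text>`).join("") +
    `</g>`;

  // The outer g shifts everything by the left and top margins.
  const svg =
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    `<g transform="translate(${margin.left},${margin.top})">${bars}${xAxis}${yAxis}</g></svg>`;
  return {svg, x, y};
}
```

Even this toy version spends most of its lines on pixels and margins rather than on the data, which is the cost being described.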
Plot, as a higher-level tool, alleviates some of these burdens. It understands that visualizations need scales (encodings), that the right scale type depends on the data, that some scales are positional, and that positional scales should have axes. You don’t have to do all that yourself. But if you want to, you can learn more and refine your plot. Maybe you want a square-root transform or a log transform. Or maybe you want to customize the ticks or add grid lines. Plot gives you something meaningful as quickly as possible, and you can then tailor as needed.
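As a rough illustration of that progression, the sketch below builds a scatterplot with Observable Plot's Plot.plot and Plot.dot. The scatter helper, its option names, and the weight and price fields are assumptions made for this example. The Plot object is passed in as a parameter so the sketch does not assume how the library is loaded.

```javascript
// `Plot` is the Observable Plot module, however you load it
// (for example: import * as Plot from "@observablehq/plot").
// The helper name, its options, and the field names are hypothetical.
function scatter(Plot, data, {yType, grid = false, ticks} = {}) {
  // With no y options, Plot picks the scale type and draws axes on its own.
  const y = {};
  if (yType !== undefined) y.type = yType; // e.g. "sqrt" or "log"
  if (grid) y.grid = true;                 // add grid lines
  if (ticks !== undefined) y.ticks = ticks; // approximate tick count

  return Plot.plot({
    y,
    marks: [Plot.dot(data, {x: "weight", y: "price"})],
  });
}
```

Calling scatter(Plot, data) gives the quick default; scatter(Plot, data, {yType: "log", grid: true, ticks: 5}) refines the same chart without restructuring it. Plot.plot returns an SVG or HTML element, so this is meant for a browser or an Observable notebook.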
What do you hope people do with Plot?
I hope they make more meaningful visualizations, more often. Even if they’re not as technically impressive as D3 visuals. I hope people learn from their data and are able to communicate insights effectively. I want Plot (and more generally Observable) to augment human intellect.
Something tells me that Plot is not the endpoint for helping people achieve that goal. When you look at what people can do with Plot today and what you hope people can do in the future, what does that look like?
In my dream, there’s a ladder of abstraction. On the lowest rung, low-level code for constructing bespoke views of data—WebGL fragment shaders, say. At the highest rung, visual interfaces for data with overviews, zoom and filter, details on demand (Shneiderman’s mantra). We start exploring on the highest rung, and as we discover insights, we descend the ladder as needed.
Most people in this space seem to be building ladders from the top down. Say we build a point-and-click visualization tool. Sounds great, right? The problem is once someone reaches the limits of that interface, which will happen very quickly, there’s no recourse. You have to switch to a different tool and start all over again.
Observable is building a ladder from the bottom up. We start with JavaScript, Canvas, SVG—the full power of the web. One rung up is D3. One more, Plot. But there’s no reason to stop there. We’re looking beyond programming interfaces to graphical interfaces. Wouldn’t it be nice to get a beautiful summary table without code? Then click on columns to sort and filter, or to map columns to a scatterplot?
We’re building from the bottom up to retain expressiveness, creativity, and flexibility. You should be able to ascend or descend the ladder without having to start over from scratch. Imagine a graphical interface that can “eject” to Plot code when needed. And where you can use whatever library or interface feels right for the job. The Observable substrate is capable of nearly anything, so when you run into the limits of one approach, you can adapt without starting over.
So in the end, if the tool melts away, and folks can focus on their core work, is the idea that the work just becomes easier? Faster?
When we say a particular task is hard, we typically mean two things. The first is the preparation: the requisite skills or knowledge. The second is the performance: the actual work entailed, and how tedious or time-consuming it is. We can reduce effort on both fronts, by lowering the skills required and the number of actions involved. Less work to prepare, and less to perform.
Reducing effort is incredibly exciting because it doesn’t just save time, it changes who engages in a task, and when! When we perceive a high amount of effort relative to the return, we prejudge a task as not worth the effort. Conversely, when a task is perceived as easy, we perform the task in more contexts.
By making visualization easier, then, not only will more people visualize data but these people will visualize data more often. If that helps us understand our world and make better decisions, I’m all for it.