Observable hosted our second Student Data Jam this spring, and the competing teams did not disappoint. The winner of this competition is Data Dynamically, from the University of North Carolina, Asheville. Eric Dunbar, Jacob Mello, and Asher Nejezchleb explored national college graduation rates and those of individual states by combining the contest datasets with additional data from the Integrated Postsecondary Education Data System (IPEDS). They used Bayes' Theorem to examine the data more closely by demographics such as sex and race.
Observable judges Robert Harris and Aaron Dennis selected Data Dynamically as the winning team because of their thorough data analysis and striking hero visualization. They awarded the team extra points for storytelling, as the team provided context for their conclusions and detailed future questions. The team also submitted early for bonus points, which is no easy task during a busy semester.
Data Dynamically’s winning hero visualization is an interactive choropleth map that displays enrollment and graduation data broken down by sex and race.
What skills did your team bring to the project and how did they come together when creating your notebook?
Eric Dunbar (ED): Everyone on our team had started learning Observable in our Dynamic Data Visualization class. The timing for the competition lined up with the final project, so our professor gave us the option to build a team and compete in the Data Jam. We decided that it would be a great opportunity to show off what we've learned in class and also get a good grade on the final.
How was building with Observable helpful to your team?
Jacob Mello (JM): I have a lot of experience in R and a little bit in Python when it comes to data science. I thought maybe I would have to do all the data processing and then throw my results into a notebook, but it turns out we were able to do it all in Observable. Even without a programming background, I could see how I could have done this in Observable - it just would have taken much longer.
ED: We made our hero visualization using the D3 library because we had already learned D3’s map capabilities in class. The approach we learned enabled us to make it very customizable; we could do a bunch of different things at once and we didn’t have to do it completely from scratch because we had this fantastic library.
Asher Nejezchleb (AN): My philosophy for a project like this is to only go as deep as you need to. We needed the visualization to do what we needed it to do, and if there's already a tool that does what you need it to do, don't look further than necessary.
What was the most surprising part of your experience? Did you have any “a-ha” moments with the data?
AN: One of the a-ha moments was definitely, “Oh wait, I can put functions as plot parameters instead of just constants?” We had been taught how to use functions in our database class and that's where I got the idea that maybe I can just put a function in here - and it worked.
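For readers who haven't run into this pattern, here is a minimal sketch of what "a function instead of a constant" can look like. The `resolveOption` helper and the sample rows are hypothetical and written only for illustration; this is not the team's code, and it is not how any particular plotting library is implemented.

```javascript
// Hypothetical helper: imitates the way many charting libraries let an
// option be either a constant or a function evaluated for each data row.
function resolveOption(option, rows) {
  return rows.map((row, i) =>
    typeof option === "function" ? option(row, i) : option
  );
}

// Made-up rows for illustration only; not the team's data.
const rows = [
  { state: "A", rate: 0.62 },
  { state: "B", rate: 0.48 },
];

// A constant: every mark gets the same color.
const constantFill = resolveOption("steelblue", rows);

// A function: each mark's color is computed from its own row.
const computedFill = resolveOption(
  (d) => (d.rate >= 0.5 ? "seagreen" : "tomato"),
  rows
);
```

With a constant, every row resolves to the same value; with a function, each row gets a value derived from its own data, which is what makes data-driven colors and sizes possible.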
JM: Two big moments leap out to me. The first involved our IPEDS data. We had imported it separately and split it into different tables; thankfully Asher and Eric handled combining and presenting that in Observable in an awesome way.
Select a state to highlight its T-score for graduation rate - a measure of how far each state is from the national average.
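The exact formula the team used for the T-score is not given here. One common reading is a one-sample t statistic, which compares a state's mean graduation rate to the national average in units of standard error. The sketch below assumes that reading; the function name `tScore` and the sample values are hypothetical.

```javascript
// A sketch assuming "T-score" means a one-sample t statistic:
// (sample mean - reference mean) / (sample standard deviation / sqrt(n)).
// This is an assumption, not the notebook's confirmed calculation.
function tScore(sample, referenceMean) {
  const n = sample.length;
  if (n < 2) {
    throw new Error("need at least two observations");
  }
  const mean = sample.reduce((sum, x) => sum + x, 0) / n;
  // Sample variance with the n - 1 (Bessel) correction.
  const variance =
    sample.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1);
  const standardError = Math.sqrt(variance / n);
  return (mean - referenceMean) / standardError;
}

// Made-up institution-level graduation rates for one hypothetical state,
// compared against a made-up national average of 0.5.
const exampleScore = tScore([0.5, 0.6, 0.7], 0.5);
```

Under this reading, a positive score means the state's mean sits above the national average and a negative score means it sits below, with larger magnitudes indicating a bigger gap relative to the variation within the state.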
ED: I had some late-night realizations that something in the data was not what we thought it was, and I had to come up with new calculations to explore that new idea. I think the platform worked well in that way. Jacob and I had some long conversations; we spent a good couple hours talking about just different ways we could approach the data.
JM: That led us to the second realization: our findings and conclusion. We only had statistics from the national level, and then we had the idea to look at the state level to see if there were any significant differences. We considered listing the ten statistically best states and the ten statistically worst states and thought from there maybe we can infer what's causing the difference. But our conclusion was that they're all from different populations. There is no unified United States; no average that is meaningful. And that was definitely a surprise.
How did working in Observable compare to other tools you’ve used?
ED: I definitely enjoyed Observable a lot more than some of the other notebook-based platforms. In the middle of this competition, I had another project that required me to do some data analysis without using Observable. It was definitely interesting having to do things manually, knowing that Observable wasn't gonna be there for that.
JM: Observable is amazing, especially for data calculations and visualization. There are a lot of quality-of-life tools that are really nice. Many of them are small things that you don't even think about until you don't have them. For example, I can just attach a .csv in Observable, but it gets tricky in R because if I move the data or alter it in any way, then everything breaks. It doesn’t break with Observable, and not only that but you can easily copy usage code to your clipboard. You can just control-v and paste it and you don’t have to think about it, which is awesome.
Check out Data Dynamically’s winning notebook, The Relationship Between National Level Graduation Rates and State Level Graduation Rates by Race and Sex, and learn about the benefits of using Observable for education.