Examining a visualization of safety at Ivy League Universities to find the balance between design and analysis
The golden ratio is originally a mathematical term, but art, architecture, and design are inconceivable without it. Everyone aspires to golden proportions as a beautiful and unattainable perfection. When we visualize data, we challenge ourselves to strike a balance between design and analysis, looking for the same harmony in the visual perception of a graph that the golden ratio brings to the world around us. How do we bring numbers and dry logical conclusions to life? How do we get them to tell a fascinating story without losing their meaning? Perhaps we can consider this the "new" Golden Ratio.
In search of answers to these questions, I looked through the "Greatest" category of the Tableau Public gallery for the best examples of modern data visualization. The work "How Safe Are Ivy League Schools?" by Alex Dixon & Tarannum Ansari caught my eye. The authors asked a non-standard question: How safe is it to study at Ivy League universities?
For those unfamiliar, the Ivy League is an association of eight long-established private universities in the northeastern United States, often used as shorthand for prestigious higher education in the country. To answer their question, the authors used open data from the website of the US Department of Education.
Despite the age of the visualization (it was released in 2017 and features data from 2001–2014), I decided to choose this piece for my investigation of the balance between analysis and visual design. I believe that paying attention to nuances that can confuse users is the best way to learn how to avoid misleading context in my own work, as well as to discuss differences in perception with other specialists.
At first glance, it is pleasing that the visualization follows the dashboard principle: everything fits on one screen without scrolling. Three vertical logical blocks stand out:
• for administrative violations,
• for administrative penalties,
• for criminal offenses.
The "How Safe Are Ivy League Schools?" dashboard
Vertical blocks are united by a horizontal strip at the top: eight logos and the short names of the universities. The logos make it possible to reference the universities throughout the dashboard without having to rely on the written name. The logos also work as filters: by clicking, you can select the desired university and view only its information.
Under the logos there is a time slicer by year: you can select the period of interest and the data will update accordingly. The charts themselves are traditional and easy to read:
• bar chart,
• filter with a drop-down list,
• heat map,
• area chart.
Fonts and Color Scheme
The authors use fonts very skillfully: two catchy bold fonts are chosen for the headers and a concise, highly readable sans-serif font for the descriptions and numbers.
The color scheme is a harmonious combination of milky cream and green (perhaps as a reference to the color of ivy leaves). While from a design standpoint this palette is quite pleasing, it raises a few questions for our inner Analyst. Many people intuitively perceive green as positive: the greener, the better. Accordingly, a saturated green should indicate a more positive situation (a low level of danger). In the dashboard, however, darker green marks more offenses, so some users may be confused by this palette.
In addition, the heat map visualizations have a somewhat confusing color logic, which leads to color and data conflicts. For example, the data for Cornell University is distributed as follows:
• 100–200 offenses: standard green,
• 200–500 offenses: light green,
• over 500 offenses: dark green.
Selection of the heat map
What could make the Analyst feel a little more comfortable? Perhaps adding an alternative color from the red spectrum to the report, to create a transition between negative and positive. And this color exists. The Ivy League has several signature colors, and one of them, Harvard crimson, is just the shade we need.
Please note: This solution is not the best for everyone because of color vision deficiency (CVD). It is just one suggestion for making the visualization clearer, and one should always check that a viz uses a colorblind-friendly palette to avoid misunderstandings due to color perception (for example, with Color Blindness Simulator).
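For readers who want to experiment with this idea, here is a minimal Python sketch of a diverging scale that runs from green through cream to crimson, so that a higher value always looks more alarming. The hex codes are my own assumptions (the crimson is the value commonly cited for Harvard; the green and cream are placeholders, not the dashboard's actual palette), and `rate_to_hex` is a hypothetical helper name.

```python
# Diverging scale sketch: low values map to green, high values to crimson.
# All three colors are illustrative assumptions, not the dashboard's palette.
GREEN = (0x2E, 0x7D, 0x32)    # assumed "safe" end of the scale
CREAM = (0xF5, 0xF0, 0xE1)    # assumed neutral midpoint
CRIMSON = (0xA5, 0x1C, 0x30)  # commonly cited Harvard crimson


def _lerp(a, b, t):
    """Linearly interpolate between two RGB tuples, with t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def rate_to_hex(value, lo, hi):
    """Map a value in [lo, hi] to a hex color on the green-cream-crimson scale."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)  # clamp to [0, 1]
    if t < 0.5:
        rgb = _lerp(GREEN, CREAM, t * 2)
    else:
        rgb = _lerp(CREAM, CRIMSON, (t - 0.5) * 2)
    return "#{:02X}{:02X}{:02X}".format(*rgb)
```

With a scale like this, the color order matches the data order, unlike the 100–200, 200–500, over-500 bands described above. Red and green are a difficult pair for many people with CVD, so the result would still need to go through a simulator before publishing.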
Visualization: The Analyst is Confused
As already mentioned, simple and intuitive graphs have been chosen for the visualizations. The authors brilliantly managed to reflect a large amount of data without excessive fragmentation and visual weight. Pop-up tips played a significant role — there are lots of them and they are quite informative.
However, the choice of measurement is questionable, and for the Analyst it undermines all that ease of visual perception. The authors chose absolute values as the unit of measurement: the total number of violations or offenses. It seems to me that absolute values are not very informative in studies like this.
For example, 691 law violations were committed at Cornell University in 2014. Is this a lot or a little? It seems to be a lot. That same year there were only 386 violations at Dartmouth College. It looks like things are really going badly at Cornell University, doesn't it? The current logic of the heat map tells us the same thing: Cornell University is shown in a darker green than Dartmouth College.
Selection of the heat map
But this conclusion may be wrong. To fully understand the situation, I would recommend taking the total number of students into account. The authors have this data: when you hover over the bar charts (left visual block), informative tooltips with the enrollment of each university appear.
So, 691 law violations occurred at Cornell University, which has 21,679 students, and 386 law violations occurred at Dartmouth College with 6,298 students. That's 31.9 law violations per 1,000 students at Cornell University and 61.3 law violations per 1,000 students at Dartmouth College — with a convincing advantage… Cornell University wins!
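The arithmetic behind these two rates is easy to reproduce. The short Python sketch below uses only the 2014 figures quoted above; the function name `per_thousand` is my own, not something taken from the dashboard.

```python
def per_thousand(violations, students):
    """Return the number of violations per 1,000 enrolled students."""
    if students <= 0:
        raise ValueError("students must be positive")
    return violations / students * 1000


# 2014 figures quoted in the text: (law violations, enrolled students)
schools = {
    "Cornell University": (691, 21_679),
    "Dartmouth College": (386, 6_298),
}

# Rates rounded to one decimal place, as in the text
rates = {name: round(per_thousand(v, s), 1) for name, (v, s) in schools.items()}

for name, rate in rates.items():
    print(f"{name}: {rate} law violations per 1,000 students")
```

Dividing by enrollment reverses the ranking that the absolute counts suggest: Dartmouth College's rate comes out at nearly double Cornell University's.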
The authors do attempt to compensate for this difference in volume in the bar graphs on the left, which use thickness to designate the number of students. However, this makes the visual less readily understandable at a glance and adds more cognitive load to the reader.
The number of students by column thickness
Our inner Explorers also missed clear color legends, at least one for each block. Finally, the Analyst was bothered by the horizontal scale of the heat maps: only from the tooltips could he understand that they show aggregate year-by-year data for each of the universities.
You Criticize — You Offer!
This report has deservedly entered the "Greatest" category in Tableau: the authors set themselves a fairly complex and voluminous task and carried it out effectively. Were it not for the remarks above on the measurement logic and color choice, the report would be almost flawless from the user's point of view.
At the same time, we don't necessarily get an answer to the basic question: how safe is it to study in the Ivy League? Trying to answer it, our Analyst suggested supplementing the report with a consolidated safety indicator for the entire Ivy League, normalized to show the data as a rate of events per 1,000 students. Our Designer enthusiastically brought all of that to life. The result of their collaborative search for the new golden ratio is this visualization:
A new variant of the same dashboard
So what do we see? Liquor law violations lead among all types of law violations. Among the criminal offenses, burglary and robbery sadly take the lead.
The already mentioned Dartmouth College, along with Princeton University and Brown University, are the most "red" participants in our table.
In contrast, Columbia University in the City of New York, Cornell University, Harvard University, and the University of Pennsylvania are mostly in the safe "green" zone.
Harvard University, however, proved to be an unexpected leader in burglary and robbery. But if you look closely at the Criminal Offenses heat map for 2001–2014, it is clear that Harvard University experienced these problems in the past: in 2009 the university moved to the "green" team and has stayed there ever since.
Modern data is called Big Data for a reason: the more data there is, the more difficult it is to visualize. Voluminous data requires detail, and this affects the visual component of the analysis. The original report successfully avoided visualization problems by proposing a constructive graphical solution for a massive block of data.
But the completeness of the data presented has replaced the analysis itself.
Without claiming to hold the final truth, and armed with Occam's razor (yes, analysts have a lot of surprising techniques in their arsenal), I grouped the data on crimes into the category "criminal offenses" and dropped the details of the sanctions in "law violations".
This allowed me to calculate aggregated security indicators, compare Ivy League universities, and find some answers. Maybe it wasn't as elegant as the original version — that's something for my inner Designer to think about.
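One way to picture such an aggregated indicator is a pooled rate: add up the violations of all schools and divide by their combined enrollment. The Python sketch below is only an illustration under that assumption, not the exact calculation behind the redesigned dashboard, and it uses just the two schools whose 2014 figures are quoted earlier, since the other enrollments are not listed in this article. The names `pooled_rate` and `mean_of_rates` are hypothetical.

```python
def pooled_rate(records):
    """Total violations of all schools per 1,000 of their combined students."""
    total_violations = sum(v for v, _ in records)
    total_students = sum(s for _, s in records)
    if total_students <= 0:
        raise ValueError("combined enrollment must be positive")
    return total_violations / total_students * 1000


def mean_of_rates(records):
    """Unweighted average of each school's own rate per 1,000 students."""
    if not records:
        raise ValueError("records must not be empty")
    return sum(v / s * 1000 for v, s in records) / len(records)


# (law violations, enrolled students) for Cornell and Dartmouth, 2014
records = [(691, 21_679), (386, 6_298)]

print(round(pooled_rate(records), 1))    # weights each school by its enrollment
print(round(mean_of_rates(records), 1))  # treats every school equally
```

The two numbers differ because the pooled rate lets a large school dominate, while the mean of rates gives a small school the same weight as a large one. Which of them counts as "the" safety indicator is a design decision worth stating next to the chart.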
My final word of advice: when working on a data visualization, switch internally from Analyst to Designer and back again. (Don't forget to involve the User, however, since they are the one who will evaluate the final result.) By switching between these two modes of thinking, we work towards that perfect balance between design and analysis, our "new" Golden Ratio.