Exploring How Color Choices Affect Data Understanding

Unpacking Color Choices in Data Visualization: A Fresh Look at Rainbows

Imagine you’re looking at a colorful map or chart of data, whether weather patterns or sales figures, and you need to understand what it’s telling you. The colors aren’t there just to look pretty; they’re supposed to help you make sense of the data in front of you. Scientists and designers have long tried to figure out the best way to use color in these visuals so that everyday readers can comprehend them, and that work has produced guidelines that most practitioners follow. But what if the guidelines for picking these colors are missing something? Something that could change data visualization in a big way? In a 2020 study from IEEE Transactions on Visualization and Computer Graphics, titled “Rainbows Revisited: Modeling Effective Colormap Design for Graphical Inference”, the authors ask whether the colors predominantly used in data visuals really help the average person understand what they’re seeing, beyond spotting individual numbers. They also ask whether bright, flashy rainbow spectrums, which current guidelines usually frown upon, could be better for some visualization tasks.

Why These Questions Matter

These questions aren’t just for the people who design these visualizations. They’re a big deal for anyone who relies on visuals to understand complicated data sets, like the weather or sales information mentioned at the start of this article. For years, experts have said that good color schemes for data should be simple and smooth: colors should flow in a clear order (such as light to dark), avoid confusing jumps, and look evenly spaced to our eyes. Think of a grayscale or a gentle fade from red to blue. These rules work great when you’re just trying to read a single value off a map, like how warm an area is in one spot, but real-world analysis often isn’t that simple. Scientists might need to compare multiple visuals at once, spotting patterns and then figuring out what those patterns mean. If the old color rules don’t help with that kind of thinking, researchers might be missing out on better ways to reach their conclusions. On top of this, people are more drawn to visuals that offer a wider range of color, even if the experts consider them messy. Why is that? This paper digs into these questions, asking if there’s a hidden benefit we’ve overlooked.

What They Wanted to Figure Out

The researchers laid out two main questions to work through. First, they wondered if the usual color rules, which focus on how people perceive colors, are less important when those people are reasoning about data, like figuring out which visual is the odd one out. They proposed that colors we can easily name (like “red”, “blue”, or “yellow”) might help our brains process complex visuals better than smooth, uniform scales. They called this idea “color name variation” and predicted that maps with more nameable colors would help people spot differences more easily. This was the first hypothesis to be tested. The second hypothesis came from a concern: these colorful maps might trick people into seeing differences that weren’t really there, such as mistaking a color change for a data change.

How They Tackled It

With these two hypotheses, research on the topic could begin. Two experiments were conducted online with regular people, using a task called “graphical inference”. Participants saw four color-coded images side by side. Three showed data from one model, and one was from a different model. Participants were asked to pick the odd one out of the four. Twelve different color schemes were tested, ranging from rather plain ones (like all shades of blue) to full rainbows (red to purple to blue), and the results were measured.
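To make the lineup idea concrete, here is a minimal sketch in Python of how such a set of four images could be put together. It is only an illustration and not the authors' stimulus generator: the function names, the grid size, the bump width, and the jitter are all assumptions made for this example. Each "model" is just a list of bump centers, and every image is a slightly different random draw from its model.

```python
import math
import random

def gaussian_mixture_grid(centers, size=32, sigma=0.15):
    """Sum of 2D Gaussian bumps sampled on a size x size grid over the unit square."""
    grid = []
    for row in range(size):
        y = row / (size - 1)
        line = []
        for col in range(size):
            x = col / (size - 1)
            value = sum(
                math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                for cx, cy in centers
            )
            line.append(value)
        grid.append(line)
    return grid

def sample_centers(model_centers, jitter, rng):
    """Draw one noisy realization of a model (hypothetical noise scheme)."""
    return [(cx + rng.gauss(0, jitter), cy + rng.gauss(0, jitter))
            for cx, cy in model_centers]

def make_lineup(majority_model, odd_model, rng, jitter=0.03):
    """Return four images in random order plus the index of the odd one out."""
    images = [gaussian_mixture_grid(sample_centers(majority_model, jitter, rng))
              for _ in range(3)]
    images.append(gaussian_mixture_grid(sample_centers(odd_model, jitter, rng)))
    order = list(range(4))
    rng.shuffle(order)
    shuffled = [images[i] for i in order]
    odd_position = order.index(3)  # where the image from the odd model ended up
    return shuffled, odd_position

# Example: three images with a bump near the lower-left, one near the upper-right.
rng = random.Random(0)
lineup, odd_position = make_lineup([(0.3, 0.3)], [(0.7, 0.7)], rng)
```

Each grid of values would then be drawn with the color scheme under test, and a participant's answer counts as correct when it matches `odd_position`.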

In the first experiment, 180 people looked at these lineups, with each person using four different color schemes. The images were made from fake data, built as mixtures of different bell-shaped curves, to mimic real data visualizations. The authors checked how accurate people were and compared these results to their measure of “color name variation”, which reflected how many distinct nameable colors a scheme had. The authors also tested an easier-to-calculate stand-in called “log-LAB length”, which measured how far a color scheme stretched across a standard color space. In the second experiment, the authors tweaked the task, instead having a group of 60 people decide whether all four images were from the same model or not. This test was done to see whether colorful maps caused more false alarms, following their second hypothesis that people would report a difference when there truly wasn’t one. Four key color schemes were used in this experiment: a plain blue, a multi-hue “viridis”, a two-tone “cool-warm”, and a full rainbow.
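The “log-LAB length” idea can be sketched in a few lines of Python. This is a rough illustration under an assumption: that the measure is the logarithm of the total distance a color scheme travels through the CIELAB color space. The helper names and the coarse sampling of each scheme are choices made for this example, and the paper's exact procedure may differ. The sRGB-to-CIELAB conversion uses the standard D65 formulas.

```python
import math

def srgb_to_lab(r, g, b):
    """Convert sRGB components in [0, 1] to CIELAB (D65 white point)."""
    def to_linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def log_lab_length(colors):
    """Log of the summed CIELAB distance between consecutive colors in a scheme."""
    labs = [srgb_to_lab(*c) for c in colors]
    length = sum(math.dist(p, q) for p, q in zip(labs, labs[1:]))
    return math.log(length)

# A gray ramp (black to white) versus a coarse rainbow built from six key colors.
gray_ramp = [(i / 10, i / 10, i / 10) for i in range(11)]
rainbow = [(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1), (1, 0, 1)]

print(log_lab_length(gray_ramp), log_lab_length(rainbow))
```

The gray ramp only moves along the lightness axis, so its path is short, while the rainbow swings widely through hue and covers far more distance. Sampling a scheme more finely would give a closer estimate of its true path length.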

The Results

The results were surprising. In the first experiment, rainbow color schemes won hands down. People were about 69% accurate with rainbows, compared to 63% with plain single-color scales. The more nameable colors a scheme had, the better people did, backing up the first hypothesis. Even within similar schemes (such as “viridis” vs “plasma”), the one with more nameable colors came out ahead. The log-LAB length measure worked almost as well, suggesting there are simpler ways to predict which schemes work. In the second experiment, colorful rainbows didn’t lead to more mistakes. In fact, they had the opposite effect: they helped people spot real differences without inventing fake ones, putting the concerns of the second hypothesis to rest. The rainbow colors weren’t just pretty; they were useful for understanding the data in a deeper way.

Limits to Watch For

While the research conducted by the authors provided interesting insights into how we use color for visualizations, it was not perfect. As mentioned earlier, the fake data was based on Gaussian curves, which can resemble some data sets (such as temperature maps) but not everything. Complications could arise with data that is a lot more complex and messy. Building on that, the task given to participants was specific. Picking an oddball in a set of four isn’t the same as estimating trends or testing theories across data sets. Rainbow colors could also confuse people with color blindness, a problem that was not fully tackled. Plus, the “color name variation” idea isn’t the only possible explanation for the results, as the success of log-LAB length suggests raw color variety might matter as well. These gaps should not distract from the great work the authors completed, but they should inform future studies that test whether the findings hold up everywhere.

My Takeaways and Thoughts

After reading through this research, I’ve found myself incredibly interested in how it flips the script on the traditional color guidelines I’ve learned from data visualization courses. I’ve always been told that smooth, simple colors were the best, since they cause less clutter and confusion, but this paper suggests that isn’t always the case. Rainbows, with their bold colors, seem to give our brains a better grasp of the patterns and differences we encounter. This could do a lot to help people better understand the data they’re reading, which could help scientists tweak models or doctors compare scans. If rainbows help, maybe we’ve been too quick to ditch them in favor of something “cleaner”.

Despite this, I am still thinking a lot about the gaps in this research. Rainbows may work well for finding simple differences, but can their colors overwhelm much more complex data sets? What about those who are color blind? The authors suggest that middle-ground options, such as “plasma”, which mixes variety with accessibility, could help with this. Even if rainbows can help some people understand data better, that means nothing to those who cannot see the colors at all.

This reading has given me a much more critical view of how I use color in visualizations, and of how I can use more colors to point toward important information. Our visualizations shouldn’t just focus on being “clean”; they should match how our minds work and process color. It’s a good nudge to rethink the rules and test them more. Good visuals shouldn’t focus on being “right” by following guidelines. They should instead focus on being as readable as possible.

Questions to Ponder

Color is a topic we’ve discussed continuously throughout this course. How has this summary and its accompanying research paper changed your views on how you use color in your visualizations?
Think about the issue of using rainbow colors for data readability and how they could be inaccessible to those who have color blindness. What is one idea you have for addressing that issue?
Consider the gaps in the research and your own visualizations so far. Do you think your work would benefit from the use of more recognizable colors? Or does the concern that the research only used simple visuals apply to your more complex work?
With so many established guidelines on how to use color for data visualization, do you think it’s practical to go against the experts and attempt this new, more radical view on color use? Or should those creating visuals stick to what’s practiced and well known to avoid creating charts that aren’t “clean”?

