Which Data Visualization works Best?

Author: Thomas Gonzalez

1) Heat Index Grid (click for full view)


2) Micro Histogram (click for interactive version)

3) Punch Card Grid


Recently I was asked by Universal Mind if I would be interested in consulting for them on the User Experience and Data Visualization aspects of their ground-breaking geo-spatial product SpatialKey. I felt it was a great compliment and privilege to have such a well recognized and respected company in the RIA world ask for my input, and I was excited by the prospect of working with such people as Mike Connor (VP Business Development), Tom Link (CTO) and Doug McCune (Flex Rockstar) and the many other talented people at UM. 

While many of you in the Flex community have probably already heard about SpatialKey, those of you who have not should definitely go check out their technology preview. Basically, SpatialKey represents some really innovative work on showing high-volume data sets as they relate to geo-coded data, with visual interpolation techniques that far eclipse the standard pin-based metaphor found in most geo-spatial visualization tools.

In that vein, one of the tactical areas I was asked to look at is the effectiveness of certain types of visualizations for specific types of analysis. One of those areas was plotting crime incident data by hour and day of week for a given 7-day period. From this discussion we came up with three alternative ways to visualize this data: one already existed within the system, one I created based on the visualization problem as I saw it, and one was found by surfing the web. One of the challenges in doing Data Viz work is that the more work you do in the field and the more unique visualizations you create, the stronger your preference for certain visual patterns becomes, which biases your sense of what is most effective.

While I believe I have scientific reasons to explain why I think one of the visualizations above is more effective than the others, it comes down to what casual users find most intuitive. To that end, I would greatly appreciate it if readers would vote for the visualization they feel is most compelling for the analysis problem of spotting trends and subtle differences in numeric counts (arrests) plotted hour-by-hour over a seven-day period. While we are interviewing real users in the SpatialKey target market, getting feedback from the Flex community is just as valuable. Just use the comments section to vote for 1 (heat index grid), 2 (micro histogram), or 3 (punch card). (Pics 1 & 2 show the same data; Pic 3 is a different data set, which I realize makes the comparison a little harder, but hopefully still valid.)
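For readers who want to try the comparison on their own data, all three pictures start from the same table: a count for each hour of each day of the week. Here is a minimal sketch of that aggregation, assuming the incidents arrive as plain Python datetime values. The function name and the sample timestamps are made up for illustration and are not taken from SpatialKey.

```python
from collections import Counter
from datetime import datetime


def hour_by_day_counts(timestamps):
    """Return a 7x24 grid of counts: rows are weekdays (Monday=0), columns are hours."""
    tally = Counter((ts.weekday(), ts.hour) for ts in timestamps)
    return [[tally[(day, hour)] for hour in range(24)] for day in range(7)]


# Hypothetical sample incidents, invented for this example only.
sample = [
    datetime(2009, 3, 2, 23, 15),  # a Monday, 11 pm hour
    datetime(2009, 3, 2, 23, 40),  # same Monday, same hour
    datetime(2009, 3, 7, 1, 5),    # a Saturday, 1 am hour
]

grid = hour_by_day_counts(sample)
for day_row in grid:
    print(" ".join(f"{count:2d}" for count in day_row))
```

Each row of the printed grid is one day and each column is one hour, so the same grid could feed a color scale, a row of bars, or a set of circles.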

 

19 Responses to “Which Data Visualization works Best?”

  1. Unknown

    I think #1 is significantly better than the others. The color shifts make it easy to get a quick impression of where the concentration lies, and the numbers allow you to see exact figures easily and without interaction. The other 2 really give no sense of scale, other than relative to the rest of the set. For instance, is the largest black circle representing 30 or 5000? There is no way to tell without rolling over or something like that.


  2. Jonathan

    #1 is the hardest to interpret. I can't map the changes in color accurately to values. I can see general trends just fine, but detailed analysis is very hard.

    Discerning between #2 and #3 is harder because they don't show the same data. I would say #2 is the easiest to compare relative sizes, however #3 is easier to compare along vertical axes.

    In #2 my eye can scan left to right and make a good comparison of height, but scanning top to bottom is much harder. It is tough to say if a value is bigger or smaller across the days, but easy across hours in a single day.

    #3 seems to make it easier to spot 2d trends rather than 1d trends. If I look from a distance my brain can make a shape from the outline created by the darker regions. The larger circles grouped together form an outline that my brain can process.

    However, for a close examination I think it is easier to compare detail by looking at simply the height difference in #2 rather than area difference in #3 (with the above caveat about directionality).


  3. Unknown

    Number 1 the Heat Index Grid gives the most information in a glance.


  4. Thomas Gonzalez

    Great feedback, keep it coming.

    Some other factors to consider:

    1) Form factor - depending on how much screen real estate you have to work with, some of these may work better or worse.

    2) We are completely ignoring interaction at the moment - so @jonathan makes a great point. With rollover interaction you could line up hours across days, without it, it is hard to visually correlate.


  5. Unknown

    I'm sort of at an impasse between 2 and 3, but I think I personally would get the most out of #3.

    All three are fairly easy to read, but these were my gut reactions:

    #1 - The heat map is a little disorienting to me, I'm not sure exactly what I'm looking at at first and once I do it takes a while to evaluate it. I've always had trouble with color gradients representing value ranges like that, unless it is a true heat map overlaid on top of something.

    #2 - Bar charts are obviously the most immediately recognizable option, but I guess I'm not used to comparing them with each other so I didn't immediately look for trends across the days. It took a little concentration to realize what patterns I was looking for.

    #3 - I was able to recognize patterns across days and hours immediately with the punch card. I like that one a lot, it's unlike any chart I've encountered in an app and yet I found it very easy to understand exactly what I was seeing.

    The only other off-hand note I'd make is that I may have understood #2 faster if there were some vertical cue, such as a background color every few hours on the grid, to guide my eye down the values.


  6. Thomas Gonzalez

    Just added an interactive version of Number 2 that clears up some of @jonathan's questions

    http://www.brightpointinc.com/flexdemos/ummicrochart.swf


  7. Unknown

    While #1 might provoke an emotional response with color, there are some things lacking. The same thing might be accomplished, maybe even more effectively, using values of a single color or black/white, especially if you're color blind. Also, this approach can be deceiving.

    In #2, seeing minute differences is much easier with form than it is variations in color or value. It's kind of a similar thought as to what makes a good logo great, it's the form that makes things more defining. I think you could bring #1 and #2 together using color and form. This would allow a quick glance understanding, but allow for more detailed analysis.

    #3 is somewhat disorienting because the circle is introducing additional dimensions that aren't adding any value. Make the bars in #2 squares starting from a center point and it's the same thing.

    My vote is for #2 with influences of #1. Just my initial thoughts.


  8. Anonymous

    # 1 Big, clunky hard to easily interpret
    # 2 instantly understood, would benefit from vertical lines to demark blocks of time
    # 3 have to work hard to interpret, not easy to correlate size


  9. Rob McKeown

    I think they all have some strong points but also all have some weaknesses.

    #1 - This one I think suffers from the problem of requiring the reader to map a color to a value themselves. It is possible for readers to do, but requires the initial overhead of looking at each cell to figure out that light blue is greater than dark blue. Also, colorblind readers may have difficulty with this approach.

    What #1 does give you is a consistent way of comparing horizontally and vertically.

    #2 The only problem with this one is the difficulty in comparing across days. Because the bars are not next to each other, it is hard to really see the difference in height.

    I do like the fact that you can see the "curve" over the course of the day.

    #3 Humans have a hard time understanding the difference in size of circles because something that looks twice as "big" may actually be 3 times the area.
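    To put rough numbers on the circle problem: I don't know which scaling the punch card in Pic 3 actually uses, so this is only a sketch comparing two common choices. The function names, the 12 pixel maximum radius, and the counts of 10 and 20 are all made up for illustration.

```python
import math


def radius_linear(value, max_value, max_radius):
    """Radius proportional to the value, so area grows with the value squared."""
    return max_radius * value / max_value


def radius_sqrt(value, max_value, max_radius):
    """Radius proportional to sqrt(value), so area is proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


def area(radius):
    return math.pi * radius ** 2


# Compare a count of 20 against a count of 10 (hypothetical numbers).
ratio_linear = area(radius_linear(20, 20, 12)) / area(radius_linear(10, 20, 12))
ratio_sqrt = area(radius_sqrt(20, 20, 12)) / area(radius_sqrt(10, 20, 12))

print(f"linear radius: area ratio {ratio_linear:.1f}")
print(f"sqrt radius:   area ratio {ratio_sqrt:.1f}")
```

    With the radius scaled linearly, doubling the count quadruples the ink on screen. With the square-root scaling the area only doubles, but the radius grows by about 1.4x, so either way the reader has to guess which of the two the chart intends.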

    Again #3 makes vertical comparisons fairly easy and consistent.

    If you can make option #1 use a color scale that is easier to interpret that would help. Or if in #3 you changed the brightness of the circles to indicate the value along with the size of the circle that might also help.

    Some mechanism where you select multiple entries across days and then compare them side by side would make option 2 work.

    All in all if my goal was to pick out which hours throughout the week were "the worst" I would go with #1 but I would try to make the blues better indicate the level. Days with no incidents seem like they should be white to me.
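    A rough sketch of what I mean by a scale where zero is white: blend each cell from white toward one dark blue in proportion to its count. The function name and the particular dark blue are arbitrary choices for illustration, not the colors used in Pic 1.

```python
def count_to_rgb(count, max_count, dark=(8, 48, 107)):
    """Linearly blend from white (count 0) to `dark` (count equal to max_count)."""
    if max_count <= 0:
        return (255, 255, 255)
    # Clamp the fraction so out-of-range counts still give a valid color.
    t = max(0.0, min(1.0, count / max_count))
    return tuple(round(255 + (channel - 255) * t) for channel in dark)


# Hypothetical counts for a few cells, with 40 as the busiest hour.
for count in (0, 10, 20, 40):
    print(count, count_to_rgb(count, 40))
```

    A straight blend in RGB is not perceptually even, so equal steps in count will not look like exactly equal steps in darkness, but the ordering is unambiguous: lighter always means fewer.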


  10. Jonathan

    Good comments all around.

    On #2:

    With a standard bar chart/sparkline the upper limit is not delineated because it is the relationship between the values that is important and the eye doesn't need an upper bound to compare the heights horizontally. However, in #2 the comparison can also be done vertically. I would suggest that an upper bound line on each row would help the eye determine relative heights between values in different rows.

    With an upper bound line the eye can see how much negative space above the bar is actual data and how much is design. The eye is better at making distinctions when they are proportional to size. So, when the bar is 2 pixels vs. 5 pixels it is easy to see it is about 2x as big. However, when the bar is 22 pixels vs. 25 pixels the difference may be invisible (especially comparing vertically). If you add an upper bound line, then the eye can compare the negative spaces at the top and may be able to see, e.g., that 22 pixels is 8 pixels from the top and 25 pixels is 5 pixels from the top.
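    The arithmetic behind that example, as a small sketch. The 30 pixel upper bound is not stated outright; it is what the "8 pixels from the top" and "5 pixels from the top" figures imply. The helper name is made up.

```python
def ratio(a, b):
    """How many times larger the bigger of the two values is than the smaller."""
    return max(a, b) / min(a, b)


UPPER_BOUND = 30  # pixels; implied by 22 being 8 from the top and 25 being 5

short_pair = ratio(2, 5)    # short bars: the difference is obvious
tall_pair = ratio(22, 25)   # tall bars: same 3 pixel difference, much smaller ratio
gap_pair = ratio(UPPER_BOUND - 22, UPPER_BOUND - 25)  # negative space: 8 vs 5

print(f"2 vs 5 px bars:   {short_pair:.2f}x")
print(f"22 vs 25 px bars: {tall_pair:.2f}x")
print(f"8 vs 5 px gaps:   {gap_pair:.2f}x")
```

    The same 3 pixel difference is a 2.5x ratio between the short bars but only about 1.14x between the tall ones. Measured as the gap under the bound line it becomes 1.6x again, which is why the line helps.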

    But, again, clutter can become your enemy quickly...


  11. Anonymous

    First, we're real glad to welcome Tom G to the SpatialKey team. Tom brings a great deal of experience to bear in this space!

    A comment on use of color:
    It's potentially of benefit to not "have to" use color in this component, which allows it to be coupled with a component that visualizes a different set of information using color. One thing we certainly want to avoid is use of colors to indicate different meaning inside of the same view.

    However, using bar height (or circle radius) to indicate quantity can be more easily reused.


  12. Unknown

    Number 2 was the only one that I understood at a glance without trying to figure it out. The others I just ignored (mentally). In the technology preview with the heat thing, I still didn't really spend the mental time to understand the graph.


  13. Harris Reynolds

    #2 is easily the best data visualization with best defined as providing meaning of the data in the shortest period of time (i.e. your brain does some amount of discerning without even having to think)

    Personally I like the form factor of #2 as well.


  14. Unknown

    I would say #2 was the easiest to understand at my first quick glance. #3 gives me a more general view, because it is harder to differentiate a big circle from a slightly bigger circle, whereas the bars in #2 are a better representation for me.


  15. Thomas Gonzalez

    Okay, so the tally thus far

    Number 1: 2 votes
    Number 2: 6 votes
    Number 3: 1 vote

    Other posts were either undecided or just discussion comments.


  16. jugglebird

    #3 communicates best for me. I appreciate being able to compare data at the same time across days as easily as comparing it to the previous or next hour (which is what #2 makes easier).

    Correlating circle size is somewhat less precise than the bars in #2 but I don't think the primary task is to determine an exact value but to identify times/days of interest for further followup.

    #2 is the next best.

    And as Tom mentioned, I appreciate that color is freed up for both #2 and #3 to present another dimension of information.


  17. Bill Mill

    If I click on an hour in #2, perhaps I'd like it if the X and Y axes of the graph swapped so that I could compare the values for that hour all lined up next to each other?


  18. Bill Mill

    Oh, and I'm taking 2... 1 is essentially unscaled and probably close to useless for the ~10% colorblind men in the world. 3 has all the issues people have already mentioned about perceiving circle sizes.


  19. Thomas Gonzalez

    As an FYI, #2 is the custom component I designed personally, the rest have been taken from different applications.

    - Tom

