This is an illustration of using transparency to represent point count in a graphic. This is easy to do in ggplot2 if you use one of the bar-chart-type geoms. However, I think there are other situations where applying aesthetics based on point count would be useful.
Since Hadley built many of his canonical examples on the diamonds data, I thought using it here would help with comparing and contrasting.
This chart shows the distribution of price per carat of diamonds, segmented by carat quartile and clarity. The transparency shows how many diamonds each bar represents. This makes it easy to see where the action is.
    library(ggplot2)

    # create copy of diamonds
    df <- diamonds

    # compute the quartiles of carat
    df$carat.qtiles <- cut(df$carat, unlist(quantile(df$carat)), include.lowest = TRUE)

    # plot the probability distribution of price/carat, faceted by clarity and carat quartile
    # key point: using the count per bar to set the alpha level. This lets you see how much
    # data is represented by each bar (it would be nice to be able to do this
    # anytime an aggregate is done...boxplots, bins, etc.)
    p <- ggplot(data = df, aes(x = price / carat, y = ..count.. / sum(..count..)))
    p <- p + geom_histogram(aes(alpha = ..count..), binwidth = 1000) +
      facet_grid(clarity ~ carat.qtiles)
    p
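More recent ggplot2 releases offer `after_stat()` as a replacement for the `..count..` notation. The following is a minimal sketch of the same chart in that style, assuming ggplot2 3.3.0 or later is installed; it is meant to mirror the code above, not to change what is plotted.

```r
library(ggplot2)

# Same preparation as above: copy diamonds and bin carat into quartiles
df <- diamonds
df$carat.qtiles <- cut(df$carat, quantile(df$carat), include.lowest = TRUE)

# after_stat(count) refers to the bin count computed by the histogram stat,
# the same quantity that ..count.. refers to in the older notation
p <- ggplot(df, aes(x = price / carat, y = after_stat(count / sum(count)))) +
  geom_histogram(aes(alpha = after_stat(count)), binwidth = 1000) +
  facet_grid(clarity ~ carat.qtiles)
p
```

The alpha mapping is still the key point: each bar's transparency is driven by the number of diamonds it represents.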
Currently this method only works in ggplot2 where a stat makes a count-related ..output.. variable (such as ..count..) available. A number of other areas could benefit from this capability, and it should be easy to add more output variables to the elements of ggplot for which this behavior would be natural:
- geom_boxplot: Geoms that aggregate multiple points are good candidates for this.
- facet_*: It would be interesting to be able to add a visual cue to each facet to show how many points are in each.
  - The most appealing idea on this so far is to enable scaling of the facet area by point count (or other things).
  - Ordering of the facets by point count would also be extremely useful.
  - Thresholding by count. This would be a great way to chop low-signal facets easily and keep the visualization clean.
  - Other half-baked ideas include background color, alpha box border…
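Until something like this is built in, some of these ideas can be approximated by computing the counts by hand before plotting. The sketch below is a workaround, not a ggplot2 feature: the column `n`, the threshold `min_n`, and the object names are all hypothetical choices made for illustration.

```r
library(ggplot2)

df <- diamonds
df$carat.qtiles <- cut(df$carat, quantile(df$carat), include.lowest = TRUE)

# (1) Boxplots: number of diamonds in each (clarity, carat quartile) cell,
# repeated on every row so it can be mapped to alpha like any other column
df$n <- ave(rep(1, nrow(df)), df$clarity, df$carat.qtiles, FUN = sum)

p_box <- ggplot(df, aes(x = clarity, y = price / carat)) +
  geom_boxplot(aes(alpha = n), fill = "steelblue") +
  facet_grid(. ~ carat.qtiles)

# (2) Facets: drop low-count facets, then order the rest by point count
min_n <- 1000                       # hypothetical threshold
n_by_clarity <- table(df$clarity)   # diamonds per clarity level
keep <- names(n_by_clarity)[n_by_clarity >= min_n]
kept <- df[df$clarity %in% keep, ]

# Re-level clarity so facets appear from most to fewest points
ordered_levels <- names(sort(n_by_clarity[keep], decreasing = TRUE))
kept$clarity <- factor(as.character(kept$clarity), levels = ordered_levels)

p_facets <- ggplot(kept, aes(x = price / carat)) +
  geom_histogram(binwidth = 1000) +
  facet_wrap(~ clarity)

p_box
p_facets
```

Doing the counting outside the plot keeps the example simple, but it also shows why built-in support would be nicer: the grouping has to be repeated by hand and kept in sync with the facets.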

Nice, lovely simple code as well. I’ll remember this next time I’ve got a dataset like this to summarise.