Skip to content

Hexbin transform ^0.5.0

The hexbin transform groups two-dimensional quantitative or temporal data — continuous measurements such as heights, weights, or temperatures — into discrete hexagonal bins. You can then compute summary statistics for each bin, such as a count, sum, or proportion. The hexbin transform is most often used to make heatmaps with the dot mark.

For example, the heatmap below shows the weights and heights of Olympic athletes. The color of each hexagon represents the number (count) of athletes with similar weight and height.

Fork
js
Plot
  .dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))
  .plot({color: {scheme: "YlGnBu"}})

Whereas the bin transform produces rectangular bins and operates on abstract data, the hexbin transform produces hexagonal bins and operates in “screen space” (i.e., pixel coordinates) after the x and y scales have been applied to the data. And whereas the bin transform produces x1, y1, x2, y2 representing rectangular extents, the hexbin transform produces x and y representing hexagon centers.

To produce an areal encoding as in a bubble map, output r. In this case, the default range of the r scale is set such that the hexagons do not overlap. The binWidth option, which defaults to 20, specifies the distance between centers of neighboring hexagons in pixels.

Fork
js
Plot
  .dot(olympians, Plot.hexbin({r: "count"}, {x: "weight", y: "height", binWidth}))
  .plot()

If desired, you can output both fill and r for a redundant encoding.

Fork
js
Plot
  .dot(olympians, Plot.hexbin({fill: "count", r: "count"}, {x: "weight", y: "height", stroke: "currentColor"}))
  .plot({color: {scheme: "YlGnBu"}})

TIP

Setting a stroke ensures that the smallest hexagons are visible.

Alternatively, the fill and r channels can encode independent (or “bivariate”) dimensions of data. Below, the r channel uses count as before, while the fill channel uses mode to show the most frequent sex of athletes in each hexagon. The larger athletes are more likely to be male, while the smaller athletes are more likely to be female.

Fork
js
Plot
  .dot(olympians, Plot.hexbin({fill: "mode", r: "count"}, {x: "weight", y: "height", fill: "sex"}))
  .plot()

Using z, the hexbin transform will partition hexagons by ordinal value. If z is not specified, it defaults to fill (if there is no fill output channel) or stroke (if there is no stroke output channel). Setting z to sex in the chart above, and switching to stroke instead of fill, produces separate overlapping hexagons for each sex.

Fork
js
Plot
  .dot(olympians, Plot.hexbin({stroke: "mode", r: "count"}, {x: "weight", y: "height", z: "sex", stroke: "sex"}))
  .plot()

The hexbin transform can be paired with any mark that supports x and y channels (which is almost all of them). The text mark is useful for labelling. By setting the text output channel, you can derive the text from the binned contents.

Fork
js
Plot
  .text(olympians, Plot.hexbin({text: "count"}, {x: "weight", y: "height"}))
  .plot()

The hexbin transform also works with Plot’s projection system. Below, hexagon size represents the number of nearby Walmart stores, while color represents the date the first nearby Walmart store opened. (The first Walmart opened in Rogers, Arkansas.)

Fork
js
Plot.plot({
  projection: "albers",
  r: {range: [0, 16]},
  color: {scheme: "spectral", label: "First year opened", legend: true},
  marks: [
    Plot.geo(statemesh, {strokeOpacity: 0.5}),
    Plot.geo(nation),
    Plot.dot(walmarts, Plot.hexbin({r: "count", fill: "min"}, {x: "longitude", y: "latitude", fill: "date"}))
  ]
})

CAUTION

Beware the modifiable areal unit problem. On a small scale map, this is compounded by the Earth’s curvature, which makes it impossible to create an accurate and regular grid. Use an equal-area projection when binning.

The hexgrid mark draws the base hexagonal grid as a mesh. This is useful for showing the empty hexagons, since the hexbin transform does not output empty bins (and unlike the bin transform, the hexbin transform does not currently support the filter option).

Fork
js
Plot.plot({
  marks: [
    Plot.hexgrid(),
    Plot.dot(olympians, Plot.hexbin({r: "count"}, {x: "weight", y: "height", fill: "currentColor"}))
  ]
})

The hexbin transform defaults the symbol option to hexagon, but you can override it. The circle constructor changes it to circle.

Fork
js
Plot.circle(olympians, Plot.hexbin({r: "count"}, {x: "weight", y: "height"})).plot()

Hexbins work best when there is an interesting density of dots in the center of the chart, but sometimes hexagons “escape” the edge of the frame and cover the axes. To prevent this, you can use the inset scale option to reserve space on the edges of the frame.

Fork
js
Plot
  .dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))
  .plot({inset: 10, color: {scheme: "YlGnBu"}})

TIP

You can also set the dot’s clip option to true to prevent the hexagons from escaping.

Alternatively, use the axis mark to draw axes on top of the hexagons.

Fork
js
Plot.plot({
  color: {scheme: "YlGnBu"},
  marks: [
    Plot.dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"})),
    Plot.axisX(),
    Plot.axisY()
  ]
})

Hexbin options

The options must specify the x and y channels. The binWidth option (default 20) defines the distance between centers of neighboring hexagons in pixels. If any of z, fill, or stroke is a channel, the first of these channels will be used to subdivide bins.

The outputs options are similar to the bin transform; for each hexagon, an output channel value is derived by reducing the corresponding binned input channel values. The outputs object specifies the reducer for each output channel.

The following named reducers are supported:

  • first - the first value, in input order
  • last - the last value, in input order
  • count - the number of elements (frequency)
  • distinct - the number of distinct values
  • sum - the sum of values
  • proportion - the sum proportional to the overall total (weighted frequency)
  • proportion-facet - the sum proportional to the facet total
  • min - the minimum value
  • min-index - the zero-based index of the minimum value
  • max - the maximum value
  • max-index - the zero-based index of the maximum value
  • mean - the mean value (average)
  • median - the median value
  • deviation - the standard deviation
  • variance - the variance per Welford’s algorithm
  • mode - the value with the most occurrences
  • identity - the array of values
  • x ^0.6.12 - the hexagon’s x center
  • y ^0.6.12 - the hexagon’s y center

In addition, a reducer may be specified as:

  • a function to be passed the array of values for each bin and the center of the bin
  • an object with a reduceIndex method

In the last case, the reduceIndex method is repeatedly passed three arguments: the index for each bin (an array of integers), the input channel’s array of values, and the center of the bin (an object {data, x, y}); it must then return the corresponding aggregate value for the bin.

Most reducers require binding the output channel to an input channel; for example, if you want the y output channel to be a sum (not merely a count), there should be a corresponding y input channel specifying which values to sum. If there is not, sum will be equivalent to count.

hexbin(outputs, options)

js
Plot.dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))

Bins hexagonally on x and y. Also groups on the first channel of z, fill, or stroke, if any.