# Hexbin transform

The **hexbin transform** groups two-dimensional quantitative or temporal data—continuous measurements such as heights, weights, or temperatures—into discrete hexagonal bins. You can then compute summary statistics for each bin, such as a count, sum, or proportion. The hexbin transform is most often used to make heatmaps with the dot mark.

For example, the heatmap below shows the weights and heights of Olympic athletes. The color of each hexagon represents the number (*count*) of athletes with similar weight and height.

```
Plot
  .dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))
  .plot({color: {scheme: "YlGnBu"}})
```

Whereas the bin transform produces rectangular bins and operates on abstract data, the hexbin transform produces hexagonal bins and operates in “screen space” (*i.e.*, pixel coordinates) after the *x* and *y* scales have been applied to the data. And whereas the bin transform produces **x1**, **y1**, **x2**, **y2** representing rectangular extents, the hexbin transform produces **x** and **y** representing hexagon centers.
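
To make the screen-space idea concrete, here is a minimal sketch of hexagonal binning on pixel coordinates. It is not Plot's implementation: `hexCenter` and `hexbinCounts` are hypothetical helpers, the grid origin is assumed to sit at pixel (0, 0), and the points are assumed to have already been passed through the *x* and *y* scales.

```js
// Hypothetical helper: snap a pixel position to the center of the nearest
// hexagon in a pointy-topped hexagonal grid. binWidth is the horizontal
// distance between neighboring centers; rows are binWidth * sqrt(3) / 2 apart,
// and odd rows are shifted right by half a bin.
function hexCenter(px, py, binWidth = 20) {
  const dx = binWidth;
  const dy = (binWidth * Math.sqrt(3)) / 2;
  const j0 = Math.floor(py / dy);
  let best = null;
  // The nearest center always lies in one of the two rows around py.
  for (const j of [j0, j0 + 1]) {
    const shift = (j & 1) / 2; // odd rows are offset by half a bin
    const i = Math.round(px / dx - shift);
    const x = (i + shift) * dx;
    const y = j * dy;
    const d2 = (px - x) ** 2 + (py - y) ** 2;
    if (best === null || d2 < best.d2) best = {x, y, d2};
  }
  return {x: best.x, y: best.y};
}

// Hypothetical helper: count the points that fall in each hexagon.
// Only non-empty bins are returned, keyed by their center.
function hexbinCounts(points, binWidth = 20) {
  const bins = new Map();
  for (const [px, py] of points) {
    const {x, y} = hexCenter(px, py, binWidth);
    const key = `${x},${y}`;
    const bin = bins.get(key) ?? {x, y, count: 0};
    bin.count += 1;
    bins.set(key, bin);
  }
  return [...bins.values()];
}
```

Because the grid is laid out in pixels, changing the chart size or the scale domains changes which data points share a hexagon, even though the data itself is unchanged.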

To produce an areal encoding as in a bubble map, output **r**. In this case, the default range of the *r* scale is set such that the hexagons do not overlap. The **binWidth** option, which defaults to 20, specifies the distance between centers of neighboring hexagons in pixels.

```
Plot
  .dot(olympians, Plot.hexbin({r: "count"}, {x: "weight", y: "height", binWidth: 20}))
  .plot()
```
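
The arithmetic behind a non-overlapping radius can be worked out from the grid geometry. The sketch below assumes pointy-topped regular hexagons whose centers are **binWidth** pixels apart, and that a hexagon's radius is measured as its half-width; `hexGeometry` is a hypothetical helper that illustrates the geometry rather than Plot's internal code.

```js
// Hypothetical helper: basic dimensions of a pointy-topped hexagonal grid
// whose neighboring centers are binWidth pixels apart.
function hexGeometry(binWidth = 20) {
  return {
    // Center to edge midpoint; at this half-width, horizontal neighbors touch.
    inradius: binWidth / 2,
    // Center to vertex.
    circumradius: binWidth / Math.sqrt(3),
    // Vertical distance between consecutive rows of centers.
    rowSpacing: (binWidth * Math.sqrt(3)) / 2
  };
}
```

Under these assumptions, a hexagon with a half-width larger than `binWidth / 2` would overlap its horizontal neighbors, so half the bin width is the largest radius that keeps the hexagons separate.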

If desired, you can output both **fill** and **r** for a redundant encoding.

```
Plot
  .dot(olympians, Plot.hexbin({fill: "count", r: "count"}, {x: "weight", y: "height", stroke: "currentColor"}))
  .plot({color: {scheme: "YlGnBu"}})
```

TIP

Setting a **stroke** ensures that the smallest hexagons are visible.

Alternatively, the **fill** and **r** channels can encode independent (or “bivariate”) dimensions of data. Below, the **r** channel uses *count* as before, while the **fill** channel uses *mode* to show the most frequent sex of athletes in each hexagon. The larger athletes are more likely to be male, while the smaller athletes are more likely to be female.

```
Plot
  .dot(olympians, Plot.hexbin({fill: "mode", r: "count"}, {x: "weight", y: "height", fill: "sex"}))
  .plot()
```
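
For intuition about what *mode* computes for each hexagon, here is a minimal sketch of such a reducer in plain JavaScript. The `mode` function is a hypothetical stand-in, not Plot's own code, and its tie-breaking rule is an assumption.

```js
// Hypothetical helper: return the most frequent value in an array.
// Ties go to the value that reaches the top count first.
// Returns undefined for an empty array.
function mode(values) {
  const counts = new Map();
  let best;
  let bestCount = 0;
  for (const value of values) {
    const count = (counts.get(value) ?? 0) + 1;
    counts.set(value, count);
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  }
  return best;
}
```

Applied to the values of one hexagon, `mode(["male", "female", "male"])` returns `"male"`, which is the value the **fill** channel would then encode.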

Using **z**, the hexbin transform will partition hexagons by ordinal value. If **z** is not specified, it defaults to **fill** (if there is no **fill** output channel) or **stroke** (if there is no **stroke** output channel). Setting **z** to *sex* in the chart above, and switching to **stroke** instead of **fill**, produces separate overlapping hexagons for each sex.

```
Plot
  .dot(olympians, Plot.hexbin({stroke: "mode", r: "count"}, {x: "weight", y: "height", z: "sex", stroke: "sex"}))
  .plot()
```
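
Partitioning by **z** can be thought of as grouping on the pair (hexagon, *z* value) rather than on the hexagon alone. The sketch below assumes each point has already been assigned to a hexagon; the `bin` field, the `groupByBinAndZ` helper, and the sample data are hypothetical and used only for illustration.

```js
// Hypothetical helper: count points per (bin, z) pair.
function groupByBinAndZ(points) {
  const groups = new Map();
  for (const point of points) {
    const key = JSON.stringify([point.bin, point.z]);
    let group = groups.get(key);
    if (!group) {
      group = {bin: point.bin, z: point.z, count: 0};
      groups.set(key, group);
    }
    group.count += 1;
  }
  return [...groups.values()];
}

// Made-up sample: hexagon A yields two groups (one per sex), hexagon B one.
const zGroups = groupByBinAndZ([
  {bin: "A", z: "female"},
  {bin: "A", z: "male"},
  {bin: "A", z: "male"},
  {bin: "B", z: "female"}
]);
```

Each group then becomes its own hexagon at the same center, which is why hexagons for different *z* values can overlap.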

The hexbin transform can be paired with any mark that supports **x** and **y** channels (which is almost all of them). The text mark is useful for labelling. By setting the **text** output channel, you can derive the text from the binned contents.

```
Plot
  .text(olympians, Plot.hexbin({text: "count"}, {x: "weight", y: "height"}))
  .plot()
```

The hexbin transform also works with Plot’s projection system. Below, hexagon size represents the number of nearby Walmart stores, while color represents the date the first nearby Walmart store opened. (The first Walmart opened in Rogers, Arkansas.)

```
Plot.plot({
  projection: "albers",
  r: {range: [0, 16]},
  color: {scheme: "spectral", label: "First year opened", legend: true},
  marks: [
    Plot.geo(statemesh, {strokeOpacity: 0.5}),
    Plot.geo(nation),
    Plot.dot(walmarts, Plot.hexbin({r: "count", fill: "min"}, {x: "longitude", y: "latitude", fill: "date"}))
  ]
})
```

CAUTION

Beware the modifiable areal unit problem. On a small-scale map, this is compounded by the Earth’s curvature, which makes it impossible to create an accurate and regular grid. Use an equal-area projection when binning.

The hexgrid mark draws the base hexagonal grid as a mesh. This is useful for showing the empty hexagons, since the hexbin transform does not output empty bins (and unlike the bin transform, the hexbin transform does not currently support the **filter** option).

```
Plot.plot({
  marks: [
    Plot.hexgrid(),
    Plot.dot(olympians, Plot.hexbin({r: "count"}, {x: "weight", y: "height", fill: "currentColor"}))
  ]
})
```

The hexbin transform defaults the **symbol** option to *hexagon*, but you can override it. The circle constructor changes it to *circle*.

`Plot.circle(olympians, Plot.hexbin({r: "count"}, {x: "weight", y: "height"})).plot()`

Hexbins work best when there is an interesting density of dots in the center of the chart, but sometimes hexagons “escape” the edge of the frame and cover the axes. To prevent this, you can use the **inset** scale option to reserve space on the edges of the frame.

```
Plot
  .dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))
  .plot({inset: 10, color: {scheme: "YlGnBu"}})
```

TIP

You can also set the dot’s **clip** option to true to prevent the hexagons from escaping.

Alternatively, use the axis mark to draw axes on top of the hexagons.

```
Plot.plot({
  color: {scheme: "YlGnBu"},
  marks: [
    Plot.dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"})),
    Plot.axisX(),
    Plot.axisY()
  ]
})
```

## Hexbin options

The *options* must specify the **x** and **y** channels. The **binWidth** option (default 20) defines the distance between centers of neighboring hexagons in pixels. If any of **z**, **fill**, or **stroke** is a channel, the first of these channels will be used to subdivide bins.

The *outputs* options are similar to those of the bin transform: for each hexagon, each output channel receives as input the subset of the data that has been matched to the hexagon’s center. The outputs object specifies the aggregation method for each output channel.

The following aggregation methods are supported:

- *first* - the first value, in input order
- *last* - the last value, in input order
- *count* - the number of elements (frequency)
- *distinct* - the number of distinct values
- *sum* - the sum of values
- *proportion* - the sum proportional to the overall total (weighted frequency)
- *proportion-facet* - the sum proportional to the facet total
- *min* - the minimum value
- *min-index* - the zero-based index of the minimum value
- *max* - the maximum value
- *max-index* - the zero-based index of the maximum value
- *mean* - the mean value (average)
- *median* - the median value
- *deviation* - the standard deviation
- *variance* - the variance per Welford’s algorithm
- *mode* - the value with the most occurrences
- *identity* - the array of values
- a function to be passed the array of values for each bin and the extent of the bin
- an object with a *reduceIndex* method
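
As an illustration of the last two forms, here is a sketch of a custom reducer written both as a plain function and as an object with a *reduceIndex* method. It computes the sample variance with Welford's algorithm. The names `welfordVariance` and `varianceReducer` are hypothetical, and the argument order of *reduceIndex* (the bin's index array, then the full array of channel values) is an assumption that the text above does not spell out.

```js
// Sample variance via Welford's single-pass algorithm.
// Returns undefined when there are fewer than two values.
function welfordVariance(values) {
  let count = 0;
  let mean = 0;
  let m2 = 0; // running sum of squared deviations from the mean
  for (const value of values) {
    count += 1;
    const delta = value - mean;
    mean += delta / count;
    m2 += delta * (value - mean);
  }
  return count > 1 ? m2 / (count - 1) : undefined;
}

// The same reducer as an object with a reduceIndex method.
// Assumption: index lists the positions of this bin's elements in values.
const varianceReducer = {
  reduceIndex(index, values) {
    return welfordVariance(Array.from(index, (i) => values[i]));
  }
};
```

The function form could then be passed as an output, for example `Plot.hexbin({fill: welfordVariance}, {x: "weight", y: "height", fill: "weight"})`, assuming the **fill** input channel supplies numeric values.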

## hexbin(*outputs*, *options*)

`Plot.dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))`

Bins (hexagonally) on **x** and **y**. Also groups on the first channel of **z**, **fill**, or **stroke**, if any.