raster mark (observablehq#1196)

* image data mark
* prettier
* handle invalid data; stride, offset
* handle flipped images
* archive test failure artifacts
* skip image data tests, for now
* prettier
* only ignore generated images in CI
* only ignore large generated images
* fillOpacity
* tweak
* fix formula
* prettier
* volcano
* more idiomatic heatmap
* fill as f(x, y)
* pixel midpoints
* prettier
* not pixelated, again
* prettier
* raster
* pixelRatio
* fix aria-label; comments
* Goldstein–Price
* tentative documentation for Plot.raster
* fix partial coverage of sample fill
* raster fillOpacity
* require x1, y1, x2, y2
* validate width, height
* fix for sparse samples
* better error on missing scales
* document
* floor rounded (or floored?)
* exploration for a "nearest" raster interpolate method
* barycentric interpolation; see https://observablehq.com/@visionscarto/igrf-90
* raster tuple shorthand
* barycentric interpolate and extrapolate
* only maybeTuple if isTuples
* allow marks to apply scales selectively (like we do with projections)
* interpolate on values
* 3 interpolation methods for the nearest neighbor: voronoi renderCell, quadtree.find, delaunay.find. This is completely gratuitous since they all run in less than 1ms… It's even hard to know which one is the fastest, because if I loop on 100s of them the browser starts to thrash (allocating so much memory for images it immediately discards, I guess…)
* barycentric walmart
* fold mark.project into mark.scale
* fix barycentric extrapolation
* materialize fewer arrays
* use channel names
* don’t pass {r, g, b, a}
* don’t overload x & y channels
* fix inverted x or y; simplify example
* simpler
* fix grid orientation
* only stroke if opaque
* optional x1, y1, x2, y2
* shorten
* fix order
* const
* rasterize
* The performance measurements I had done were just rubbish (I forgot to await on the promises!). Measuring the three methods on the ca55 dataset I see this order: voronoi cellRender (180ms), delaunay find (220ms), quadtree (500ms).
* rasterize
* tolerance for points that are on a triangle's edge
* use a symbol for values that need extrapolation, simplify and fix a few issues, use a mixing function for categorical interpolation
* rasterize with walk on spheres
* document rasterize
* pixelSize
* default to full frame
* remove ignored options
* reformat options
* fix the ca55 tests (the coordinates represent a planar projection)
* caveat about webkit/safari
* remove console.log
* more built-in rasterizers
* fix walk-on-spheres implementation; remove blur
* port fixes to wos
* adaptive extrapolation
* fillOpacity fixes
* renames walk-on-spheres to random-walk; documents the rasterize option. Rationale for the renaming: "random-walk" is more commonly known, and expresses well enough what's happening. Walk on spheres converges much faster than a basic random walk would, and makes it feasible, but it is a question of implementation.
* a constant fillOpacity informs the opacity property on the g element, not the opacity of each pixel
* fix bug with projection clip in indirectStyles
* performance optimizations for random-walk:
  1. use rasterizeNull to boot; if we have more samples (and a costlier delaunay), at least we have less pixels to impute.
  2. cache more aggressively the result of delaunay.find: at the beginning of each line, for each pixel, and for each step of the walk.

  On actual tests it can be up to 2x faster.
* sample pixel centroids
* fix handling of undefined values
* use transform for equirectangular coordinates
* don’t delete
* stroke if constant fillOpacity
* fix test snapshots
* fix typo in test name
* note potential bias caused by stroke
* rename tests
* don’t bootstrap random-walk with none
* terminate walk when minimum distance is reached
* comment re. opacity
* comment re. none order bias
* contour mark
* dense grid contours
* consolidate code
* more code consolidation
* cleaner
* cleaner deferred channels
* interpolate, not rasterize
* blur
* cleaner
* use typed array when possible
* optimize barycentric interpolation
* nicer contours for ca55 with barycentric+blur 3; support raster blur. Contour blurring is unchanged, and blurs the abstract data (with a linear interpolation). Raster blurring is made with d3.blurImage. Two consequences:
  * we can now blur “categorical” colors, if we want to smooth out the image and give it a polished look in the higher variance regions. (This works very well when we have two colors, but with more categories there is a risk of hiding the components of a color, making the image more difficult to understand. Anyway, it’s available as an option to play with.)
  * for quantitative data, and with a color scale with continuous scheme and linear transform, this is very close to linear interpolation; but if the underlying data is better rendered with a log color scale, the color interpolation takes this into account (which IMO is better).
* ignore negative blur
* cleaner tests
* for contours, filter points with missing X and Y before calling the interpolate function, and ignore x and y filters on geometries
* fix barycentric interpolate for filtered points. Note: the penguins dataset is full of surprises since some points are occluded by others of a different species…
* contour shorthands
* fix contour filtering
* filter value, too
* materialize x and y when needed
* default to nearest
* comment
* remove obsolete opacity trick
* better contour thresholds; fix test
* nullish instead of undefined
* renderBounds
* fix circular import
* a hand-written Peters projection seemed more fun than the sqrt scale; tests the same thing
* update raster documentation with interpolate; document contour
* document Plot.identity
* peters axes
* symmetric Peters
* style tweak
* NaN instead of null
* avoid error when empty quantile domain
* faceted sampler raster
* fix test snapshot
* faceted contour; fix dense faceted raster … and fix default contour thresholds
* expose spatial interpolators
* pass x, y, step
* error when data undefined, but not null
* d3 7.8.1

Co-authored-by: Philippe Rivière <[email protected]>
chaichontat · Jan 14, 2024 · c648097
1 parent afcd8b6
Showing 57 changed files with 9,124 additions and 560 deletions.
diff --git a/.github/workflows/node.js.yml b/.github/workflows/node.js.yml
@@ -27,3 +27,9 @@ jobs:
           echo ::add-matcher::.github/eslint.json
           yarn run eslint . --format=compact
       - run: yarn test
+      - name: Test artifacts
+        uses: actions/upload-artifact@v3
+        if: failure()
+        with:
+          name: test-output-changes
+          path: test/output/*-changed.*
diff --git a/README.md b/README.md
@@ -1112,6 +1112,66 @@ Equivalent to [Plot.cell](#plotcelldata-options), except that if the **y** optio
 
 <!-- jsdocEnd cellY -->
 
+
+### Contour
+
+[Source](./src/marks/contour.js) · [Examples](https://observablehq.com/@observablehq/plot-contour) · Renders contour polygons from two-dimensional samples.
+
+#### Plot.contour(*data*, *options*)
+
+<!-- jsdoc contour -->
+
+Returns a new contour mark with the given *data* and *options*. The *data* represents a discrete set of samples in abstract coordinates, bound to the scales *x* and *y*, and a **value** channel.
+
+Most of the options are identical to those of the [raster](#raster) mark, which is used internally to compute a rectangular grid of numeric values. The marching squares algorithm is then applied to derive the contour polygons for each threshold value.
+
+The following options define the value channel and the aesthetics of the contours:
+* **value** - the sample’s value (a channel); as a shorthand notation, it can be defined by setting either fill, fillOpacity or stroke
+* **fill** - the contour’s fill color; if a channel, bound to the *color* scale
+* **fillOpacity** - the contour’s opacity; if a channel, bound to the *opacity* scale
+* **stroke** - the contour’s stroke color; if a channel, bound to the *color* scale; defaults to currentColor
+* **strokeOpacity** - the contour’s stroke opacity, either constant or variable; if a channel, bound to the *opacity* scale; defaults to 1
+* **strokeWidth** - the contour’s stroke width, either constant or variable; defaults to 1
+* **thresholds** - an array of threshold values; if a *count* is specified instead, the extent of the input values is divided uniformly into approximately *count* bins. Defaults to [Sturges’s formula](https://github.com/d3/d3-contour/blob/main/README.md#contours_thresholds).
+* **x** and **y** - the sample’s coordinates.
+* **interpolate** - the interpolate method (see [raster](#raster) for details).
+* **blur** - the blur radius, a non-negative number of pixels; defaults to 0.
+
+Each sample is projected onto the coordinate system of a rectangle with dimensions that may be specified directly with the following options:
+
+* **width** - the number of pixels on each horizontal line
+* **height** - the number of lines; a positive integer
+
+Alternatively, the width and height of the raster can be imputed from the starting and ending positions for x and y, and a pixel size:
+
+* **x1** - the starting horizontal position; bound to the *x* scale
+* **x2** - the ending horizontal position; bound to the *x* scale
+* **y1** - the starting vertical position; bound to the *y* scale
+* **y2** - the ending vertical position; bound to the *y* scale
+* **pixelSize** - the screen size of a raster pixel; defaults to 2
+
+If a width has been specified, x1 defaults to 0 and x2 defaults to width; similarly, if a height has been specified, y1 defaults to 0 and y2 defaults to height. Otherwise, if data has been specified, x1, y1, x2, and y2 default to the frame’s left, top, right, and bottom coordinates, respectively. Lastly, if no data has been specified and the value is a function of x and y, you must specify all of x1, x2, y1, and y2 to define the domain (see below).
+
+The defaults for this mark make it convenient to draw contours from a flat array of values representing a rectangular matrix:
+
+```js
+Plot.contour(volcano.values, {width: volcano.width, height: volcano.height, fill: volcano.values, thresholds: 5})
+```
+
+When *data* is not specified and *value* is a function, a sample is taken for every pixel of the raster, which makes it possible to draw contours from a function and a two-dimensional domain:
+
+```js
+Plot.contour({
+  fill: (x, y) => x * y * Math.sin(x) * Math.sin(y),
+  x1: 0,
+  x2: 2 * Math.PI,
+  y1: 0,
+  y2: 2 * Math.PI
+})
+```
+
+<!-- jsdocEnd contour -->
+
 ### Delaunay
 
 [<img src="./img/voronoi.png" width="320" height="198" alt="a Voronoi diagram of penguin culmens, showing the length and depth of several species">](https://observablehq.com/@observablehq/plot-delaunay)
@@ -1504,6 +1564,63 @@ Returns a new link with the given *data* and *options*.
 
 <!-- jsdocEnd link -->
 
+### Raster
+
+[Source](./src/marks/raster.js) · [Examples](https://observablehq.com/@observablehq/plot-raster) · Fills a raster image with color samples.
+
+#### Plot.raster(*data*, *options*)
+
+<!-- jsdoc raster -->
+
+Returns a new raster mark with the given *data* and *options*. The *data* represents a discrete set of samples in abstract coordinates, bound to the scales *x* and *y*, a **fill** channel bound to the *color* scale, and a **fillOpacity** channel bound to the *opacity* scale.
+
+Each sample is drawn on a rectangular raster image with dimensions that may be specified directly with the following options:
+
+* **width** - the number of pixels on each horizontal line
+* **height** - the number of lines; a positive integer
+
+Alternatively, the width and height of the raster can be imputed from the starting and ending positions for x and y, and a pixel size:
+
+* **x1** - the starting horizontal position; bound to the *x* scale
+* **x2** - the ending horizontal position; bound to the *x* scale
+* **y1** - the starting vertical position; bound to the *y* scale
+* **y2** - the ending vertical position; bound to the *y* scale
+* **pixelSize** - the screen size of a raster pixel; defaults to 1
+
+If a width has been specified, x1 defaults to 0 and x2 defaults to width; similarly, if a height has been specified, y1 defaults to 0 and y2 defaults to height. Otherwise, if data has been specified, x1, y1, x2, and y2 default to the frame’s left, top, right, and bottom coordinates, respectively. Lastly, if no data has been specified and fill is a function of x and y, you must specify all of x1, x2, y1, and y2 to define the domain (see below).
+
+
+The following options are supported:
+
+* **fill** - the sample’s color; if a channel, bound to the *color* scale
+* **fillOpacity** - the sample’s opacity; if a channel, bound to the *opacity* scale
+* **x** and **y** - the sample’s coordinates
+* **imageRendering** - the [image-rendering](https://developer.mozilla.org/en-US/docs/Web/SVG/Attribute/image-rendering) attribute of the image; defaults to auto, which blends neighboring samples with bilinear interpolation. A typical setting is pixelated, which asks the browser to render each pixel as a solid rectangle (unfortunately not supported by WebKit).
+* **interpolate** - the interpolate method.
+* **blur** - the blur radius, a non-negative number of pixels, that defaults to 0.
+
+The interpolate option supports the following settings:
+* none - default if the *x* and *y* options are not null; assigns each sample’s value to the pixel under its floored coordinates, if that pixel is inside the raster
+* dense - default otherwise; assumes that the data describes every pixel on the raster of dimensions width × height, starting from the top left, in row-major order
+* nearest - evaluates each pixel with the closest sample, resulting in Voronoi cells
+* barycentric - does a Delaunay triangulation of the samples, then evaluates each triangle’s interior with a mix of the values of its vertices, weighted by the distance to each of the vertices; points outside the convex hull are extrapolated
+* random-walk - evaluates a pixel by simulating a random walk, and picking the value of the first sample reached
+* a function that receives an array of sample indexes, the width and height of the raster, the *x* and *y* positions of the samples (in the coordinate system of the raster), and an array of (unscaled) values, and must return a dense array of width × height values, organized in row-major order.
+
+The defaults for this mark make it convenient to draw an image from a flat array of values representing a rectangular matrix:
+
+```js
+Plot.raster(volcano.values, {width: volcano.width, height: volcano.height, fill: volcano.values})
+```
+
+When *data* is not specified and *fill* or *fillOpacity* is a function, a sample is taken for every pixel of the raster, which makes it possible to fill an image from a function and a two-dimensional domain:
+
+```js
+Plot.raster({x1: -1, x2: 1, y1: -1, y2: 1, fill: (x, y) => Math.atan2(y, x)})
+```
+
+<!-- jsdocEnd raster -->
+
 ### Rect
 
 [<img src="./img/rect.png" width="320" height="198" alt="a histogram">](https://observablehq.com/@observablehq/plot-rect)
@@ -2771,6 +2888,18 @@ Plot.column is typically used by options transforms to define new channels; the
 
 <!-- jsdocEnd column -->
 
+#### Plot.identity
+
+<!-- jsdoc identity -->
+
+This channel helper returns a source array as-is, avoiding an extra copy when defining a channel as being equal to the data:
+
+```js
+Plot.raster(await readValues(), {width: 300, height: 200, fill: Plot.identity})
+```
+
+<!-- jsdocEnd identity -->
+
 ## Initializers
 
 Initializers can be used to transform and derive new channels prior to rendering. Unlike transforms which operate in abstract data space, initializers can operate in screen space such as pixel coordinates and colors. For example, initializers can modify a marks’ positions to avoid occlusion. Initializers are invoked *after* the initial scales are constructed and can modify the channels or derive new channels; these in turn may (or may not, as desired) be passed to scales.
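
Note on the README changes above: the function form of the raster **interpolate** option can be sketched as below. This is a minimal illustration under stated assumptions, not code from this commit. The name `interpolateFloor` is hypothetical, and the argument order (index, width, height, X, Y, V) is assumed from the call `this.interpolate(index, w, h, IX, IY, V)` in src/marks/contour.js. The sketch mimics the documented behavior of *none*: each sample's value is written to the pixel under its floored coordinates, and samples outside the raster are skipped.

```javascript
// Hypothetical custom interpolator (not part of the library): receives the
// sample indexes, the raster size, the sample positions in raster coordinates,
// and the unscaled values; returns a dense row-major array of width * height.
function interpolateFloor(index, width, height, X, Y, V) {
  const W = new Array(width * height); // pixels with no sample stay undefined
  for (const i of index) {
    const x = Math.floor(X[i]);
    const y = Math.floor(Y[i]);
    if (x < 0 || x >= width || y < 0 || y >= height) continue; // outside the raster
    W[y * width + x] = V[i]; // row-major: row y, column x
  }
  return W;
}

// Three samples on a 3×2 raster; the last one falls outside and is skipped.
const grid = interpolateFloor([0, 1, 2], 3, 2, [0.5, 2.9, 7], [0.2, 1.1, 0], ["a", "b", "c"]);
```

If the assumed signature holds, such a function would be passed as the **interpolate** option of Plot.raster or Plot.contour in place of one of the named methods.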

diff --git a/src/index.js b/src/index.js
@@ -4,6 +4,7 @@ export {Arrow, arrow} from "./marks/arrow.js";
 export {BarX, BarY, barX, barY} from "./marks/bar.js";
 export {boxX, boxY} from "./marks/box.js";
 export {Cell, cell, cellX, cellY} from "./marks/cell.js";
+export {Contour, contour} from "./marks/contour.js";
 export {delaunayLink, delaunayMesh, hull, voronoi, voronoiMesh} from "./marks/delaunay.js";
 export {Density, density} from "./marks/density.js";
 export {Dot, dot, dotX, dotY, circle, hexagon} from "./marks/dot.js";
@@ -14,13 +15,15 @@ export {Image, image} from "./marks/image.js";
 export {Line, line, lineX, lineY} from "./marks/line.js";
 export {linearRegressionX, linearRegressionY} from "./marks/linearRegression.js";
 export {Link, link} from "./marks/link.js";
+export {Raster, raster} from "./marks/raster.js";
+export {interpolateNone, interpolatorBarycentric, interpolateNearest, interpolatorRandomWalk} from "./marks/raster.js";
 export {Rect, rect, rectX, rectY} from "./marks/rect.js";
 export {RuleX, RuleY, ruleX, ruleY} from "./marks/rule.js";
 export {Text, text, textX, textY} from "./marks/text.js";
 export {TickX, TickY, tickX, tickY} from "./marks/tick.js";
 export {tree, cluster} from "./marks/tree.js";
 export {Vector, vector, vectorX, vectorY, spike} from "./marks/vector.js";
-export {valueof, column} from "./options.js";
+export {valueof, column, identity} from "./options.js";
 export {filter, reverse, sort, shuffle, basic as transform, initializer} from "./transforms/basic.js";
 export {bin, binX, binY} from "./transforms/bin.js";
 export {centroid, geoCentroid} from "./transforms/centroid.js";

diff --git a/src/marks/contour.js b/src/marks/contour.js
@@ -0,0 +1,199 @@
+import {blur2, contours, geoPath, map, max, min, range, thresholdSturges} from "d3";
+import {Channels} from "../channel.js";
+import {create} from "../context.js";
+import {labelof, identity} from "../options.js";
+import {Position} from "../projection.js";
+import {applyChannelStyles, applyDirectStyles, applyIndirectStyles, applyTransform, styles} from "../style.js";
+import {initializer} from "../transforms/basic.js";
+import {maybeThresholds} from "../transforms/bin.js";
+import {AbstractRaster, maybeTuples, rasterBounds, sampler} from "./raster.js";
+
+const defaults = {
+  ariaLabel: "contour",
+  fill: "none",
+  stroke: "currentColor",
+  strokeMiterlimit: 1,
+  pixelSize: 2
+};
+
+export class Contour extends AbstractRaster {
+  constructor(data, {value, ...options} = {}) {
+    const channels = styles({}, options, defaults);
+
+    // If value is not specified explicitly, look for a channel to promote. If
+    // more than one channel is present, throw an error. (To disambiguate,
+    // specify the value option explicitly.)
+    if (value === undefined) {
+      for (const key in channels) {
+        if (channels[key].value != null) {
+          if (value !== undefined) throw new Error("ambiguous contour value");
+          value = options[key];
+          options[key] = "value";
+        }
+      }
+    }
+
+    // For any channel specified as the literal (contour threshold) "value"
+    // (maybe because of the promotion above), propagate the label from the
+    // original value definition.
+    if (value != null) {
+      const v = {transform: (D) => D.map((d) => d.value), label: labelof(value)};
+      for (const key in channels) {
+        if (options[key] === "value") {
+          options[key] = v;
+        }
+      }
+    }
+
+    // If the data is null, then we’ll construct the raster grid by evaluating a
+    // function for each point in a dense grid. The value channel is populated
+    // by the sampler initializer, and hence is not passed to super to avoid
+    // computing it before there’s data.
+    if (data == null) {
+      if (typeof value !== "function") throw new Error("invalid contour value");
+      options = sampler("value", {value, ...options});
+      value = null;
+    }
+
+    // Otherwise if data was provided, it represents a discrete set of spatial
+    // samples (often a grid, but not necessarily). If no interpolation method
+    // was specified, default to nearest.
+    else {
+      let {interpolate} = options;
+      if (value === undefined) value = identity;
+      if (interpolate === undefined) options.interpolate = "nearest";
+    }
+
+    // Wrap the options in our initializer that computes the contour geometries;
+    // this runs after any other initializers (and transforms).
+    super(data, {value: {value, optional: true}}, contourGeometry(options), defaults);
+
+    // With the exception of the x, y, x1, y1, x2, y2, and value channels, this
+    // mark’s channels are not evaluated on the initial data but rather on the
+    // contour multipolygons generated in the initializer.
+    const contourChannels = {geometry: {value: identity}};
+    for (const key in this.channels) {
+      const channel = this.channels[key];
+      const {scale} = channel;
+      if (scale === "x" || scale === "y" || key === "value") continue;
+      contourChannels[key] = channel;
+      delete this.channels[key];
+    }
+    this.contourChannels = contourChannels;
+  }
+  filter(index, {x, y, value, ...channels}, values) {
+    // Only filter channels constructed by the contourGeometry initializer; the
+    // x, y, and value channels must be filtered by the initializer itself.
+    return super.filter(index, channels, values);
+  }
+  render(index, scales, channels, dimensions, context) {
+    const {geometry: G} = channels;
+    const path = geoPath();
+    return create("svg:g", context)
+      .call(applyIndirectStyles, this, dimensions, context)
+      .call(applyTransform, this, scales)
+      .call((g) => {
+        g.selectAll()
+          .data(index)
+          .enter()
+          .append("path")
+          .call(applyDirectStyles, this)
+          .attr("d", (i) => path(G[i]))
+          .call(applyChannelStyles, this, channels);
+      })
+      .node();
+  }
+}
+
+function contourGeometry({thresholds, interval, ...options}) {
+  thresholds = maybeThresholds(thresholds, interval, thresholdSturges);
+  return initializer(options, function (data, facets, channels, scales, dimensions, context) {
+    const [x1, y1, x2, y2] = rasterBounds(channels, scales, dimensions, context);
+    const dx = x2 - x1;
+    const dy = y2 - y1;
+    const {pixelSize: k, width: w = Math.round(Math.abs(dx) / k), height: h = Math.round(Math.abs(dy) / k)} = this;
+    const kx = w / dx;
+    const ky = h / dy;
+    const V = channels.value.value;
+    const VV = []; // V per facet
+
+    // Interpolate the raster grid, as needed.
+    if (this.interpolate) {
+      const {x: X, y: Y} = Position(channels, scales, context);
+      // Convert scaled (screen) coordinates to grid (canvas) coordinates.
+      const IX = map(X, (x) => (x - x1) * kx, Float64Array);
+      const IY = map(Y, (y) => (y - y1) * ky, Float64Array);
+      // The contour mark normally skips filtering on x, y, and value, so here
+      // we’re careful to use different names (0, 1, 2) when filtering.
+      const ichannels = [channels.x, channels.y, channels.value];
+      const ivalues = [IX, IY, V];
+      for (const facet of facets) {
+        const index = this.filter(facet, ichannels, ivalues);
+        VV.push(this.interpolate(index, w, h, IX, IY, V));
+      }
+    }
+
+    // Otherwise, chop up the existing dense raster grid into facets, if needed.
+    // V must be a dense grid in projected coordinates; if there are multiple
+    // facets, then V must be laid out vertically as facet 0, 1, 2… etc.
+    else if (facets) {
+      const n = w * h;
+      const m = facets.length;
+      for (let i = 0; i < m; ++i) VV.push(V.slice(i * n, i * n + n));
+    } else {
+      VV.push(V);
+    }
+
+    // Blur the raster grid, if desired.
+    if (this.blur > 0) for (const V of VV) blur2({data: V, width: w, height: h}, this.blur);
+
+    // Compute the contour thresholds; d3-contour unlike d3-array doesn’t pass
+    // the min and max automatically, so we do that here to normalize, and also
+    // so we can share consistent thresholds across facets. When an interval is
+    // used, note that the lowest threshold should be below (or equal) to the
+    // lowest value, or else some data will be missing.
+    const T =
+      typeof thresholds?.range === "function"
+        ? thresholds.range(...(([min, max]) => [thresholds.floor(min), max])(finiteExtent(VV)))
+        : typeof thresholds === "function"
+        ? thresholds(V, ...finiteExtent(VV))
+        : thresholds;
+
+    // Compute the (maybe faceted) contours.
+    const contour = contours().thresholds(T).size([w, h]);
+    const contourData = [];
+    const contourFacets = [];
+    for (const V of VV) contourFacets.push(range(contourData.length, contourData.push(...contour(V))));
+
+    // Rescale the contour multipolygon from grid to screen coordinates.
+    for (const {coordinates} of contourData) {
+      for (const rings of coordinates) {
+        for (const ring of rings) {
+          for (const point of ring) {
+            point[0] = point[0] / kx + x1;
+            point[1] = point[1] / ky + y1;
+          }
+        }
+      }
+    }
+
+    // Compute the deferred channels.
+    return {
+      data: contourData,
+      facets: contourFacets,
+      channels: Channels(this.contourChannels, contourData)
+    };
+  });
+}
+
+export function contour() {
+  return new Contour(...maybeTuples(...arguments));
+}
+
+function finiteExtent(VV) {
+  return [min(VV, (V) => min(V, finite)), max(VV, (V) => max(V, finite))];
+}
+
+function finite(x) {
+  return isFinite(x) ? x : NaN;
+}
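
Note on contourGeometry above: the initializer converts sample positions from screen coordinates to grid coordinates before interpolating, then maps the contour points back to screen coordinates. The arithmetic can be sketched in isolation as follows. This is a hedged illustration only: `gridTransform` is a hypothetical helper that does not exist in the source, and the bounds are passed in directly rather than derived from scales by rasterBounds.

```javascript
// Hypothetical helper mirroring the arithmetic in contourGeometry:
// bounds are [x1, y1, x2, y2] in screen pixels, pixelSize is the screen
// size of one grid cell.
function gridTransform([x1, y1, x2, y2], pixelSize) {
  const dx = x2 - x1;
  const dy = y2 - y1;
  const w = Math.round(Math.abs(dx) / pixelSize); // grid columns
  const h = Math.round(Math.abs(dy) / pixelSize); // grid rows
  const kx = w / dx; // signed, so a flipped axis (x2 < x1) still maps correctly
  const ky = h / dy;
  return {
    w,
    h,
    toGrid: ([x, y]) => [(x - x1) * kx, (y - y1) * ky], // screen to grid
    toScreen: ([gx, gy]) => [gx / kx + x1, gy / ky + y1] // grid to screen
  };
}

// A 200×100 frame with its top-left corner at (40, 20) and a pixelSize of 2.
const t = gridTransform([40, 20, 240, 120], 2);
const g = t.toGrid([140, 70]); // center of the frame in grid coordinates
const s = t.toScreen(g); // back to screen coordinates
```

Because kx and ky keep the sign of dx and dy, the same two formulas handle inverted x or y bounds without a special case.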