-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.Rmd
158 lines (106 loc) · 6.06 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# controlledburn
<!-- badges: start -->
[![R-CMD-check](https://github.com/hypertidy/controlledburn/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/hypertidy/controlledburn/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
The goal of controlledburn is to rasterize without materializing any pixel values.
A very fast rasterization algorithm for polygons in [fasterize](https://github.com/ecohealthalliance/fasterize) does the following:
- restructures polygons to edges
- considers only non-horizontal polygon edges and indexes them in y,x order (slope always down)
- determines all affected raster rows and scans these for edge *start* and *end* column
- burns polygon value or identity into every pixel between those starts and ends
controlledburn avoids that last step. With these rasterization functions the return value is not a set of pixels or a side-effect of
a materialized raster file but simply those row and column start and end indexes.
This is an expression of the *cell abstraction* wish item [fasterize/issues/11](https://github.com/ecohealthalliance/fasterize/issues/11).
## TODO
- [ ] move to cpp11
- [ ] rasterize lines [fasterize/issues/30](https://github.com/ecohealthalliance/fasterize/issues/30)
- [ ] formats for import (wk, geos, grd, rct, triangles etc.)
- [ ] streaming with wkb/xy unpack with wk
- [ ] provide output options (see next section)
- [ ] port back into fasterize, with options for efficiently writing out to a tiled and sparse GeoTIFF
- [x] points is too easy, see vaster::cell_from_xy
- [x] name the package
- [x] copy logic from fasterize, and remove Armadillo array handling
- [x] remove use of raster objects, in favour of input extent and dimension
- [x] remove all trace of the raster package
- [x] implement return of the 'yline, xpix' and polygon ID info to user (see below)
- [x] make return of ylin,xpix structure efficient (CollectorList.h ftw)
Lines is working but still has some problems. For polygons it's pretty good, see [fasterize #6](https://github.com/ecohealthalliance/fasterize/issues/6) for a remaining issue.
## Outputs
Internal function `burn_polygon()` has arguments `sf`, `extent`, `dimension`. The first is a sf polygons data frame, extent is `c(xmin, xmax, ymin, ymax)` and dimension is `c(ncol, nrow)`. In fasterize you need an actual non-materialized *RasterLayer* object, but all that was
really used for was the six numbers extent, dimension and as the shell for the in-memory output.
The output of `burn_polygon()` is a list of four-element indexes `start,end,row,poly_id` - these are zero-based atm because they reflect the underlying C++ code. Examples shown here flatten this to a 4-column matrix (and add 1).
These options are still in play for what the interface/s could do:
* record presence of polygon (this is all we have currently) OR ID of polygon
* tools to format this meaningfully, and plot lazily (see example for quick plot)
* tools to materialize as actual raster data
I also found this real world example for which this is relevant, discussed in PROJ for
very fast lookup for large non-materialized (highly compressed) grids by Thomas
Knudsen:
https://github.com/OSGeo/PROJ/issues/1461#issuecomment-491501992
## Installation
You can install the development version of controlledburn like so:
``` r
remotes::install_github("hypertidy/controlledburn")
```
## Example
This is a basic example, this is fast, and shows that it works.
```{r example}
pols <- silicate::inlandwaters
library(vaster)
## define a raster (xmin, xmax, ymin, ymax), (ncol, nrow)
ext <- unlist(lapply(silicate::sc_vertex(pols), range))
dm <- c(50, 40)
r <- controlledburn:::burn_polygon(pols, extent = ext,
dimension = dm)
## our index is triplets of start,end,line where the polygon edge was detected -
## this essentially an rle by scanline of start,end polygon coverage
index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally
## plot just the start and ends of each scanline detected
xy0 <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, index[,c(3, 3)], index[,1:2]))
plot(xy0)
plot(silicate::PATH0(pols), add = TRUE)
## expand out to every cell
cr <- do.call(rbind, apply(index, 1,\(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = 19, cex = .3, col = palr::d_pal(cr[,3]))
plot(silicate::PATH0(pols), add = TRUE)
```
It scales to very large tasks, with small output.
```{r fast}
dm <- c(500000, 400000)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
dimension = dm))
length(r)
## consider a prod(dm) raster of type double (or even bool) obviously would compress again but why churn
pryr::object_size(r)
```
The following is inefficient, but shows that we get the right result.
```{r bad}
dm <- c(500, 400)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
dimension = dm))
index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally
## now go inefficient, this is every column,row index, then converted to cell, converted to xy
cr <- do.call(rbind, apply(index, 1, \(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = ".", col = palr::d_pal(cr[,3]))
rr <- terra::rast(cbind(xy, 0), type = "xyz")
rr[terra::cellFromXY(rr, xy)] <- 1
terra::plot(rr ,col = "firebrick", asp = NA)
plot(silicate::SC0(pols), add = TRUE)
```
## Code of Conduct
Please note that the controlledburn project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.