Add hybrid tiler to Windshaft service #3655

Merged
merged 4 commits into from
Oct 16, 2024

@KlaasH KlaasH commented Oct 10, 2024

Overview

Adds an endpoint to the Tiler service that passes requests along to a TiTiler-MosaicJSON endpoint, then returns the resulting tiles to the user but also caches them to S3 so they won't have to be generated again for subsequent requests.

The endpoint is minimally integrated with Windshaft. It seemed like it should be possible to create an HTTP tile source and use most of the Windshaft machinery for this, but after multiple failed attempts I gave up and implemented it more like the health-check endpoint, i.e. as an endpoint that's run by the same service as Windshaft but pretty much does its own thing, just using the Express.js functionality of the Windshaft server. To take advantage of the S3 caching behavior, however, the function that handles caching had to be pulled out of the Windshaft class and made into a stand-alone utility function. So it (and sendError) now live in src/tiler/http/utils.js. The changes required to break them out were fairly minimal.

The TiTiler service identifies layers by a UUID that gets assigned when they're first created, so we need this endpoint to provide the right UUID to the upstream TiTiler service, one way or another. Rather than feed those UUIDs to the front end and end up with long, inscrutable URLs for the tiler service endpoint, I set it up so that the service translates nice human-readable params (layer name and year) into the right UUID. That means the layer name and year -> UUID mapping needs to be provided to the service, and the answer is different between staging and production. So I added a variable for it, titiler_layer_map, which gets fed into an environment variable, MMW_TITILER_LAYER_MAP, in the tiler VM. To avoid having to try to pass a slightly complex JSON structure into an Ansible variable, I used a strategy that has been used elsewhere on the project, encoding each layer's values into a single string with __ between the terms, and passing in a comma-separated list of those terms.
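
As a rough illustration of the encoding described above, here is a minimal sketch of parsing the comma-separated, `__`-delimited string into a nested lookup object. The function name `parseLayerMap`, the layer name `io-lulc`, and the placeholder UUID strings are assumptions made for the example, not the values or code used in this PR.

```javascript
// Hypothetical parser for a value shaped like MMW_TITILER_LAYER_MAP:
// "layer__year__uuid,layer__year__uuid,..."
// Returns an object of the form { layer: { year: uuid } }.
function parseLayerMap(raw) {
    const layerMap = {};
    if (!raw) {
        return layerMap;
    }
    // Split on commas, tolerate stray whitespace and trailing commas.
    const entries = raw
        .split(',')
        .map((entry) => entry.trim())
        .filter(Boolean);
    for (const entry of entries) {
        const parts = entry.split('__');
        if (parts.length !== 3 || parts.some((part) => part === '')) {
            throw new Error(`Malformed layer map entry: "${entry}"`);
        }
        const [layer, year, uuid] = parts;
        if (!layerMap[layer]) {
            layerMap[layer] = {};
        }
        layerMap[layer][year] = uuid;
    }
    return layerMap;
}

// Placeholder values only; real UUIDs are assigned by TiTiler.
const example = parseLayerMap(
    'io-lulc__2022__uuid-for-2022,io-lulc__2023__uuid-for-2023'
);
console.log(example['io-lulc']['2023']); // uuid-for-2023
```

A lookup that comes back `undefined` for an unknown layer or year is the natural place to return an error response rather than forwarding a request upstream.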

To confirm that the CloudFormation changes work and that the hybrid tiler works as expected in a deployed environment, I got Jenkins to build and deploy this branch. So the current site on staging (as of right now, on Oct 10) is showing the changes here, and caching any tiles it generates to the staging tile cache.

Connects WikiWatershed/mmw-tiler#11

Demo

image

image

image

image

Notes

  • The colormap for the TiTiler endpoint has to be passed as a query parameter, and it's pretty gnarly. In theory it would make sense for it to be provided as a variable, like the layer UUIDs, but for now, when there's only one, hard-coded seems fine. And it might be fine in the long term, too, since in contrast to the UUIDs, it doesn't need to change between environments, so having a hard-coded layer->colormap config object in the code seems like a reasonable approach.
  • Currently there's only one dataset being provided by TiTiler (global land cover), but per @rajadain's suggestion I included a short version of the dataset name in the endpoint URL, so that the variable format and the code that parses it won't need to be changed if more TiTiler-provided datasets are added.
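
For reference, the hard-coded layer->colormap idea from the first note could look something like the sketch below. Everything specific in it is an assumption for illustration: the `COLORMAPS` contents, the `buildUpstreamUrl` helper, the `io-lulc` layer name, and especially the upstream path template, which is not the actual TiTiler-MosaicJSON route.

```javascript
// Hypothetical per-layer colormap config: pixel value -> hex color.
// The values here are illustrative, not the real land cover palette.
const COLORMAPS = {
    'io-lulc': { 1: '#419bdf', 2: '#397d49' },
};

// Builds an upstream tile URL with the colormap JSON-encoded into a
// query parameter. The path template is an assumption for this example.
function buildUpstreamUrl(baseUrl, uuid, layer, { z, x, y }) {
    const colormap = COLORMAPS[layer];
    if (!colormap) {
        throw new Error(`No colormap configured for layer "${layer}"`);
    }
    const url = new URL(`/mosaics/${uuid}/tiles/${z}/${x}/${y}.png`, baseUrl);
    // searchParams.set handles the percent-encoding of the JSON string.
    url.searchParams.set('colormap', JSON.stringify(colormap));
    return url.toString();
}

console.log(
    buildUpstreamUrl('https://titiler.example.com', 'some-uuid', 'io-lulc', {
        z: 10,
        x: 300,
        y: 385,
    })
);
```

Keeping the colormap in a code-level object like this means only the UUID mapping has to vary between staging and production.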

Testing Instructions

  • Confirm it's working on the staging site, and that you get cached tiles from the CloudFront source and new tiles from either green-tiles.staging.modelmywatershed.org or blue-tiles.staging.modelmywatershed.org, depending on which stack is active.
  • Spin up your development instance and load up the site. Go to the "Coverage grid" tab of the "Layers" map widget and you should see the seven (2017-2023) IO LULC layers. Selecting any of them should get you tiles. The local TILER_URL does not pull from the cache, so they'll load somewhat slowly. And the local tiler doesn't have a cache bucket configured, so they won't write the generated tiles to the cache.
  • If you want to test the caching behavior locally, you can get your local tiler to write to cache:
    • set a value (e.g. tiler.staging.modelmywatershed.org) in /etc/mmw.d/env/MMW_TILECACHE_BUCKET within your tiler virtual machine
    • stop the running tiler service (sudo systemctl stop mmw-tiler)
    • set AWS credentials in your environment (e.g. by copying the access key from the ModelMW Staging account at https://stroudcenter.awsapps.com/start/#/)
    • start the tiler manually (cd /opt/tiler && npm run watch)
  • Note: Getting the cache to redirect to your local tiler would require either creating a new cache bucket with a rule that missing tiles redirect to http://33.33.34.35/ or modifying the staging bucket to do that. It doesn't seem worth the trouble, since we can confirm it's working on the staging site.


@rajadain rajadain left a comment

+1 tested on staging, this is working well. Great work! I agree with all the design choices you've made.

As we go through the process of promoting this to production, we should also document how the TiTiler bits work. We may repurpose WikiWatershed/mmw-tiler#6 for this.

utils.cacheTile(req, tile, bucket);
}
}
);

Suggested change
);
);

Adds a custom endpoint to our Windshaft implementation to proxy requests
to TiTiler and cache the resulting tiles on S3.

Some details:
- Moves the `sendError` and `cacheTile` functions out of windshaftServer
  and into a standalone 'utils.js' so they can be used both in the
  existing class and in the new endpoint.
- Adds a ContentType param to the S3 upload so tiles from the cache will
  come back as 'image/png' and get shown in a browser rather than downloaded
  (Leaflet doesn't care but it's nice for debugging).
- Uses a `/titiler/YEAR/` URL format that gets translated in code into the
  proper URL for TiTiler-MosaicJSON, i.e. with the mosaic UUID in the URL
  and the colormap as a query parameter.
  TODO: the year->UUID mapping and the base URL are currently hard-coded, so
  they need to be moved into environment variables.
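
To illustrate the ContentType detail above, here is a minimal sketch of assembling the parameters for an S3 upload. `Bucket`, `Key`, `Body`, and `ContentType` are standard S3 put-object parameter names; the helper name `buildCacheUploadParams` and the way the key is derived from the request path are assumptions for the example, not necessarily how `cacheTile` does it.

```javascript
// Hypothetical helper that assembles S3 upload parameters for a tile.
function buildCacheUploadParams(bucket, requestPath, tileBuffer) {
    return {
        Bucket: bucket,
        // Assumption: the object key mirrors the request path, minus
        // any leading slashes.
        Key: requestPath.replace(/^\/+/, ''),
        Body: tileBuffer,
        // Without this, cached tiles can come back with a generic
        // content type and be downloaded instead of displayed.
        ContentType: 'image/png',
    };
}

const params = buildCacheUploadParams(
    'example-tile-cache-bucket',
    '/titiler/2023/10/300/385.png',
    Buffer.from([])
);
console.log(params.Key, params.ContentType);
```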

Gets rid of the hard-coded values for the TiTiler URL and layer UUID mappings
and puts the values in environment variables instead. The layer mapping is
complex and gets encoded in the variable as a comma-separated list of strings
with the values (layer, year, and UUID) separated by '__'.
The code parses that into a {layer: { year: UUID }} object.

Note that the setup this is replacing didn't have the 'layer' value, it just
assumed there was only one. Currently there's only one, but including it in
the config will make it easier to add more in the future.

Adds the new global land cover layers, from 2017 to 2023, and removes the
one from the previous proof-of-concept hybrid tiler setup.

To be able to make the URL dynamic by environment (but without creating
an import loop between layer_settings.py and base.py), this moves the
MMW_TILER_HOST definition into here and adds it to the list of variables
imported into `base.py`.
@KlaasH KlaasH force-pushed the kjh/hybrid-tiler-new-strategy branch from 28c560c to 20cc0f7 Compare October 16, 2024 02:22
@KlaasH KlaasH merged commit 953fc15 into develop Oct 16, 2024
@KlaasH KlaasH deleted the kjh/hybrid-tiler-new-strategy branch October 16, 2024 02:22