datasets.SEN12MS - dB values are cast to int32 #500

khdlr · 2022-04-06T14:36:06Z

When loading samples from the SEN12MS dataset, the Sentinel-1 dB values (floats ranging from around -30 to 0) are cast to int32. This truncates the fractional part and discards a lot of important information.
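To illustrate the loss, here is a minimal NumPy sketch. The dB values are made up for the example and this is not the actual SEN12MS loading code:

```python
import numpy as np

# Hypothetical Sentinel-1 backscatter values in dB (illustrative, not real SEN12MS pixels)
s1_db = np.array([-23.7, -12.4, -12.9, -0.6], dtype=np.float32)

# Casting float -> int32 truncates toward zero, so the fractional part is lost
s1_int = s1_db.astype(np.int32)

# -12.4 and -12.9 both collapse to -12, and -0.6 becomes 0
print(s1_db.tolist())
print(s1_int.tolist())
```

With only about 30 distinct integer levels left across the whole dB range, differences between neighbouring pixels largely disappear.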

This line seems to be the culprit.

Fixing this would require either breaking the current behaviour, where S1 and S2 imagery are stacked into a single tensor, or casting everything to float32 (not sure if this is okay for S2 data).
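On the question of whether float32 is safe for integer-valued S2 data: float32 has a 24-bit significand, so every 16-bit unsigned integer should be exactly representable. A quick check, assuming the S2 values are plain uint16 as stored on disk:

```python
import numpy as np

# Every value a uint16 can hold (0 .. 65535)
all_uint16 = np.arange(2**16).astype(np.uint16)

# Round trip through float32 and back
as_float = all_uint16.astype(np.float32)
restored = as_float.astype(np.uint16)

# True if no value was altered by the float32 conversion
lossless = bool(np.array_equal(all_uint16, restored))
print(lossless)
```

This only shows that the cast itself preserves the values; whether downstream code expects integer or float input is a separate question.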

adamjstewart · 2022-04-06T15:20:19Z

I vote for casting everything to float32. I think PyTorch will automatically do this for us, so all you have to do is remove the cast to int32. Want to open a PR?

khdlr · 2022-04-06T15:44:51Z

Sure, I'm happy to open a PR 😊

I believe the reason for the cast is that the Sentinel-2 imagery comes as uint16 data, which torch does not support. In general, the GeoTIFFs have the following datatypes:

Sentinel-1: float32
Sentinel-2: uint16
Label:      uint8

My current workaround is to just cast the uint16 to int32 and leave the others as they are. As you said, PyTorch will automatically cast the result to float32 when stacking.
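A minimal sketch of that workaround, using synthetic arrays with the dtypes listed above. The helper name `to_tensor` and the band counts are assumptions for illustration, not the code from the eventual PR, and `torch.cat` is used here as one way of combining the bands:

```python
import numpy as np
import torch

# Synthetic stand-ins for one patch (shapes and values are illustrative)
s1 = np.random.uniform(-30, 0, size=(2, 4, 4)).astype(np.float32)
s2 = np.random.randint(0, 10000, size=(13, 4, 4)).astype(np.uint16)


def to_tensor(array: np.ndarray) -> torch.Tensor:
    """Hypothetical helper: widen uint16 to int32, leave other dtypes alone."""
    if array.dtype == np.uint16:
        array = array.astype(np.int32)
    return torch.from_numpy(array)


# Concatenating a float32 tensor with an int32 tensor promotes to float32
image = torch.cat([to_tensor(s1), to_tensor(s2)], dim=0)
print(image.dtype, tuple(image.shape))
```

The S1 bands keep their original float values, and the S2 bands are converted to float32 by type promotion rather than by an explicit cast.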

Also not sure about the labels – but I don't believe having them as int32 is that useful.

adamjstewart added this to the 0.2.2 milestone Apr 6, 2022

adamjstewart added the datasets Geospatial or benchmark datasets label Apr 6, 2022

khdlr mentioned this issue Apr 6, 2022

Fix data casting for the SEN12MS dataset #502

Merged

adamjstewart closed this as completed in #502 Apr 11, 2022

adamjstewart modified the milestones: 0.2.2, 0.3.0 Jul 2, 2022

adamjstewart mentioned this issue Jul 11, 2022

0.3.0 release #664

Merged
